Agglomerative Clustering For AOI


Name

Hsin-Ling Kang (康鑫玲)

Department

Department of Information Management

Thesis title

Agglomerative Clustering For AOI



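The author has agreed to make this electronic thesis openly available immediately.
Once open, the electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast this work without authorization.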

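Abstract (Chinese)

With the emergence of database technology, the volume of data has multiplied, and extracting the knowledge one needs from large amounts of data has become an important issue. Researchers in different fields have therefore proposed many data mining methods for different problems, and attribute-oriented induction (AOI) was first proposed in the 1990s. AOI is one of the most important data mining methods. It is a set-oriented method used mainly to generalize the attributes of a relational database for knowledge discovery. Attributes are generalized according to concept trees, which are defined from the user's background knowledge, and this reduces the computational complexity of mining the database. Because traditional AOI cannot determine which generalized table is better, this study introduces the concept of cost: the detail lost when attributes are generalized is quantified as a cost, so that the quality of a result can be judged by the size of that cost. This study also proposes an agglomerative hierarchical clustering algorithm (agglomerative clustering) whose concept is similar to that of AOI. Based on the cost concept, the algorithm computes the merging cost between every pair of tuples, finds the two tuples with the minimum merging cost and merges them, and continues merging bottom-up until the termination condition is met, producing better results than traditional AOI. Finally, the proposed algorithm is compared with traditional AOI, analyzing performance for different data volumes and for different target numbers of tuples. The results show that, across these scenarios, the proposed algorithm yields a final generalized table with lower cost and better overall performance.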

Abstract (English)

With the advance of database technology, it has been estimated that the amount of information in the world doubles every 20 months. Mining information and knowledge from large databases has therefore been recognized as an important issue, and researchers in many fields have developed a wide range of data mining methods. One of them, Attribute Oriented Induction (AOI), was first proposed in 1990. AOI is widely recognized as one of the most important data mining methods; it generalizes attributes in relational databases by ascending concept trees for knowledge discovery. A concept tree represents the background knowledge used for generalization. AOI applies well-developed set-oriented database operations, which substantially reduces the computational complexity of the database learning process. However, the traditional AOI method cannot determine which generalized result is better. In this thesis, we introduce the concept of cost to quantify the detail lost when attribute values are generalized. We also develop an algorithm based on agglomerative clustering, whose concept is similar to that of AOI. The proposed algorithm first computes the merging cost for every pair of tuples, then merges the two tuples with the minimum merging cost, and repeats this process until the result meets the termination condition. Performance studies show that the proposed algorithm is superior to traditional AOI.
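The merge loop described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the thesis's actual algorithm: the concept trees, the function names, and in particular the cost definition (levels climbed in the concept tree, weighted by the number of original rows a tuple represents) are all assumptions made for the example, since the abstract does not specify them.

```python
from itertools import combinations

# Hypothetical concept trees, one per attribute: value -> parent (root -> None).
CONCEPT_TREES = [
    {"ANY": None, "Asia": "ANY", "Europe": "ANY",
     "Taipei": "Asia", "Tokyo": "Asia", "Paris": "Europe"},
    {"ANY": None, "Science": "ANY", "Arts": "ANY",
     "Physics": "Science", "Math": "Science", "Music": "Arts"},
]


def path_to_root(tree, value):
    """List the concepts from `value` up to the root, starting with `value`."""
    path = []
    while value is not None:
        path.append(value)
        value = tree[value]
    return path


def generalize(tree, a, b):
    """Return (lowest common concept, levels climbed from a, levels climbed from b)."""
    path_a = path_to_root(tree, a)
    for steps_b, concept in enumerate(path_to_root(tree, b)):
        if concept in path_a:
            return concept, path_a.index(concept), steps_b
    raise ValueError(f"{a!r} and {b!r} share no concept")


def merge(t1, t2):
    """Merge two (values, count) tuples; return the merged tuple and its assumed cost."""
    (values1, count1), (values2, count2) = t1, t2
    merged, cost = [], 0
    for tree, a, b in zip(CONCEPT_TREES, values1, values2):
        concept, up_a, up_b = generalize(tree, a, b)
        merged.append(concept)
        # Assumed cost: levels climbed, weighted by how many original rows climb them.
        cost += up_a * count1 + up_b * count2
    return (tuple(merged), count1 + count2), cost


def agglomerative_aoi(rows, target):
    """Merge the cheapest pair repeatedly until at most `target` tuples remain."""
    tuples = [(tuple(row), 1) for row in rows]
    total_cost = 0
    while len(tuples) > target and len(tuples) > 1:
        # Evaluate every pair and pick the one with the minimum merging cost.
        i, j = min(combinations(range(len(tuples)), 2),
                   key=lambda p: merge(tuples[p[0]], tuples[p[1]])[1])
        merged, cost = merge(tuples[i], tuples[j])
        tuples = [t for k, t in enumerate(tuples) if k not in (i, j)] + [merged]
        total_cost += cost
    return tuples, total_cost
```

Calling `agglomerative_aoi(rows, target)` with a list of attribute-value rows returns the remaining generalized tuples, each paired with the number of original rows it covers, together with the accumulated cost. Here the termination condition is assumed to be a target number of tuples, which matches the experiments mentioned in the abstract; re-evaluating every pair in each round keeps the sketch short but would be slow on large tables.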

Keywords (Chinese)

★ Attribute-oriented induction
★ Agglomerative hierarchical clustering
★ Data mining
★ Knowledge discovery

Keywords (English)

★ Attribute Oriented Induction
★ Agglomerative Clustering
★ Data Mining
★ Knowledge Discovery

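Table of Contents

Contents
Abstract (Chinese)
Abstract
List of Figures
List of Tables
1. Introduction
2. Literature Review
2.1 The AOI Method
2.2 Clustering Methods
3. Research Method
3.1 Problem Definition
3.2 Algorithms
3.2.1 Traditional Agglomerative Hierarchical Clustering
3.2.2 Proposed Method
4. Experiments
4.1 Experimental Design
4.2 Experimental Results
5. Conclusion
6. References
Appendix A
Appendix B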


Advisor

Yen-Liang Chen (陳彥良)

Approval date

2016-8-25
