Using Decision Tree to Summarize Associative Classification Rules

Name

洪子軒 (Tzu-hsuan Hung)

Department

Department of Information Management

Thesis Title

Using Decision Tree to Summarize Associative Classification Rules
(original title: 從關聯規則集中建立分類決策樹)

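Related Theses

★ A Study of Business Intelligence in the Retail Industry
★ Building an Anomaly Detection System for Landline Telephone Calls
★ Applying Data Mining Techniques to Analyze School Grades and Scholastic Ability Test Results: The Case of a Vocational High School Food and Beverage Management Program
★ Using Data Mining Techniques to Improve Wealth Management Performance: A Case Study of a Bank
★ Evaluation and Analysis of Wafer Manufacturing Yield Models: The Case of a Domestic DRAM Manufacturer
★ A Study of Applying Business Intelligence Analysis to Student Grades
★ Using Data Mining Techniques to Build a Prediction Model for the Academic Achievement of Upper-Grade Elementary School Students
★ Applying Data Mining Techniques to Build a Risk Assessment Model for Motorcycle Loans: The Case of Company A
★ Applying Performance Indicator Evaluation to Improve R&D Design Quality Assurance
★ Applying Machine Learning to Text Résumés and Personality Traits to Improve Hiring Quality
★ A General Framework Based on a Relation Genetic Algorithm for Solving Set Partitioning Problems with Constraint Handling
★ Generalized Knowledge Discovery in Relational Databases
★ Decision Tree Construction Considering Delays in Obtaining Attribute Values
★ A Method for Finding Preference Graphs from Sequence Data, Applied to the Group Ranking Problem
★ Using a Partitioning Clustering Algorithm to Find Consensus Groups for Group Decision Problems
★ A Novel Method for Ordered Consensus Groups Applied to Group Decision Problems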

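The author has agreed to make this electronic thesis openly available immediately.
Once open, the electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast the text without authorization, to avoid breaking the law.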

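Abstract (Chinese)

Association rule mining is one of the best-known methods in data mining. It counts how often different items are purchased together in a set of transactions and derives rules from these co-purchase relations. Association rules have also been applied to classification problems for years (associative classification). Once the classification rules have been generated, however, their lack of organization makes them hard to read and understand. To address this, this thesis proposes an idea and concrete methods for summarizing a set of association rules and building a decision tree from it, with the aim of combining the strengths of both in one classification model. As a classification model, the method combines the advantages of associative classification and decision trees: compared with the former, it is more understandable, better organized, more compact, and easier to use; compared with the latter, it classifies more accurately than a decision tree built with traditional C4.5.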

Abstract (English)

Association rule mining is one of the most popular areas of data mining. It discovers items that frequently co-occur within a set of transactions and derives rules from these co-occurrence relations. Association rules have been applied to classification problems for years (associative classification). However, once the rules have been generated, their lack of organization causes a readability problem: it is difficult for users to analyze them and understand the domain. To resolve this weakness, this work presents two algorithms that use a decision tree to summarize associative classification rules. As a classification model, the result combines the advantages of associative classification and decision trees. On the one hand, it is more readable, more compact, better organized, and easier to use than associative classification. On the other hand, it is more accurate than the traditional TDIDT (Top-Down Induction of Decision Trees) classification algorithm.
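
As a rough illustration of the co-occurrence counting the abstract describes, the sketch below computes the two standard association rule measures, support and confidence, on a toy set of transactions. It is not taken from the thesis: the data, the function names `support` and `confidence`, and the constant `TRANSACTIONS` are all hypothetical, and the thesis's own algorithms are not reproduced here.

```python
# Minimal sketch of support and confidence for association rules.
# All names and data below are illustrative assumptions, not from the thesis.

# Hypothetical toy transactions: each one is the set of items bought together.
TRANSACTIONS = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0  # the antecedent never occurs, so the rule has no evidence
    both = frozenset(antecedent) | frozenset(consequent)
    return support(both, transactions) / base


if __name__ == "__main__":
    print(support({"bread", "milk"}, TRANSACTIONS))
    print(confidence({"bread"}, {"milk"}, TRANSACTIONS))
```

In associative classification the consequent of a rule is a class label rather than another item, so the same two measures can be used to rank candidate classification rules.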

Keywords (Chinese)

★ data mining
★ rule induction
★ rule-based classification

Keywords (English)

★ rule summarization
★ rule-based classification
★ data mining

Table of Contents

LIST OF FIGURES
LIST OF TABLES
CHAPTER 1 INTRODUCTION
CHAPTER 2 RELATED WORKS
2.1 DECISION TREE
2.2 ASSOCIATIVE CLASSIFICATION
2.3 RULE SUMMARIZATION
CHAPTER 3 BASIC PRINCIPLES
3.1 ASSOCIATIVE CLASSIFICATION RULE
3.2 ASSOCIATIVE CLASSIFICATION TREE (ACT)
3.3 PROBLEM STATEMENT
CHAPTER 4 ALGORITHM ACT
4.1 SPLITTING ATTRIBUTES
4.1.1 Splitting by the Confidence Gain Criterion
4.1.2 Splitting by the Entropy Gain Criterion
4.2 LABEL ASSIGNMENT
4.3 STOP CRITERIA OF NODE
CHAPTER 5 EXPERIMENTS AND PERFORMANCE EVALUATION
CHAPTER 6 CONCLUSIONS AND FUTURE WORKS
REFERENCES
APPENDIXES
APPENDIX A. EXPERIMENTAL RESULTS IN ALL COMPARISONS

Advisor

陳彥良 (Yen-liang Chen)

Approval Date

2007-7-2
