Decision Tree Induction with a Constrained Number of Leaf Nodes

Name

Xiang-Yu Yang (楊翔宇)

Department

Department of Information Management

Thesis title

Decision tree induction with constrained number of leaf node
(樹葉節點數目限制下的決策樹建構)

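Related theses

★ A Study of Business Intelligence in the Retail Industry
★ Building an Anomaly Detection System for Landline Telephone Calls
★ Applying Data Mining Techniques to the Analysis of School Grades and Scholastic Ability Test Results: The Case of a Vocational High School Food and Beverage Management Program
★ Using Data Mining Techniques to Improve Wealth Management Effectiveness: A Case Study of a Bank
★ Evaluation and Analysis of Wafer Fabrication Yield Models: The Case of a Domestic DRAM Manufacturer
★ A Study of Business Intelligence Analysis Applied to Student Grades
★ Using Data Mining Techniques to Build a Prediction Model of Academic Achievement for Upper-Grade Elementary School Students
★ A Study on Building a Motorcycle Loan Risk Assessment Model with Data Mining Techniques: The Case of Company A
★ A Study of Performance Indicator Evaluation Applied to Improving R&D Design Quality Assurance
★ Applying Machine Learning to Text Résumés and Personality Traits to Improve Hiring Quality
★ A General Framework Based on a Relational Genetic Algorithm for Solving Set Partitioning Problems with Constraint Handling
★ Generalized Knowledge Discovery in Relational Databases
★ Decision Tree Induction Considering Delays in Acquiring Attribute Values
★ A Method for Finding Preference Graphs from Sequence Data, Applied to the Group Ranking Problem
★ Using a Partitional Clustering Algorithm to Find Consensus Groups for Group Decision Problems
★ A Novel Approach to Ordered Consensus Groups Applied to Group Decision Problems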

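This electronic thesis is authorized for immediate open access.
Once open, the electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for academic research purposes.
Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, so as to avoid violating the law.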

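Abstract (Chinese version)

Classification builds a model from data whose class labels are known and uses it to predict the classes of unclassified data; it is a very widely applied data mining technique. Decision trees are among the most commonly used classification techniques because they are easy to understand and computationally efficient, and they are widely used in fields such as signal analysis, expert systems, and medical diagnosis. However, noise and special cases in the training data often produce trees that are large and heavily branched, yielding too many rules to understand and apply easily. This drawback reduces the usability of decision trees.
This study therefore limits the number of leaf nodes of a decision tree to control the number of rules it generates, and seeks the highest accuracy within the range of leaf-node counts specified by the user. We develop a new algorithm that merges the branches of the decision tree using the agglomerative approach of hierarchical clustering and restricts the tree to a binary tree, which makes the number of nodes easy to control. Finally, we run experiments on real data.
The experimental results show that, under the same limit on the number of leaf nodes, the proposed algorithm achieves better accuracy than C4.5.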
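
Restricting the tree to binary splits makes a leaf budget easy to enforce, because every split replaces one leaf with two. The following is a minimal Python sketch of that idea only, not the algorithm developed in the thesis: it assumes 0/1 features, uses size-weighted Gini impurity as the split score, and grows the tree best-first until a hypothetical `max_leaves` limit is reached. The names `impurity`, `best_split`, and `grow` are illustrative.

```python
from collections import Counter


def impurity(labels):
    """Gini impurity of `labels`, scaled by the number of labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return n * (1.0 - sum((c / n) ** 2 for c in Counter(labels).values()))


def best_split(rows, labels):
    """Return (gain, feature, left_indices, right_indices) for the best
    0/1 feature, or None if no feature separates the rows."""
    best = None
    for f in range(len(rows[0])):
        left = [i for i, r in enumerate(rows) if r[f] == 0]
        right = [i for i, r in enumerate(rows) if r[f] != 0]
        if not left or not right:
            continue
        gain = (impurity(labels)
                - impurity([labels[i] for i in left])
                - impurity([labels[i] for i in right]))
        if best is None or gain > best[0]:
            best = (gain, f, left, right)
    return best


def grow(rows, labels, max_leaves):
    """Grow a binary tree best-first; return its leaves as (rows, labels)."""
    leaves = [(rows, labels)]
    while len(leaves) < max_leaves:
        # Score the best available split of every current leaf.
        scored = []
        for pos, (r, l) in enumerate(leaves):
            split = best_split(r, l)
            if split is not None and split[0] > 1e-12:
                scored.append((split[0], pos, split))
        if not scored:
            break  # no leaf can be improved further
        _, pos, (_, _, left, right) = max(scored, key=lambda s: s[0])
        r, l = leaves.pop(pos)
        # One leaf is replaced by two, so the leaf count rises by exactly one.
        leaves.append(([r[i] for i in left], [l[i] for i in left]))
        leaves.append(([r[i] for i in right], [l[i] for i in right]))
    return leaves
```

Calling `grow(rows, labels, max_leaves=k)` never returns more than `k` leaves, and it may return fewer when no remaining split reduces impurity.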

Abstract (English version)

Classification, which builds a model from the attribute values and class labels of existing data, is a very widely used data mining technique. The decision tree is one of the most popular classification techniques because it is easy to understand and computationally efficient. Decision trees are widely applied to signal classification, expert systems, and medical diagnosis. Because of noise and special cases in the training data, however, a decision tree is often very large and contains too many branches and rules, which makes it difficult to understand. This shortcoming reduces the usability of decision trees.
Therefore, we reduce the number of rules in a decision tree by limiting its number of leaf nodes, and we aim for the highest accuracy achievable with the number of leaf nodes given by the user. For this purpose, we propose a new algorithm that uses the agglomerative approach of hierarchical clustering to combine the branches of the decision tree, restricting it to a binary tree.
Experimental results show that, compared with C4.5, the proposed algorithm successfully reduces the number of leaf nodes and achieves better accuracy.
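
As a rough illustration of merging branches agglomeratively, the sketch below starts with one branch per value of a categorical attribute and repeatedly merges the pair of branches whose merger gives the lowest size-weighted entropy, until at most two branches remain. The merge criterion, the data layout (rows as dictionaries), and the names `entropy`, `weighted_entropy`, and `merge_branches` are assumptions made for this example; the abstract does not state which criterion the thesis uses.

```python
from collections import Counter
from itertools import combinations
from math import log2


def entropy(counts):
    """Shannon entropy (in bits) of a class-count mapping."""
    total = sum(counts.values())
    return -sum((n / total) * log2(n / total) for n in counts.values() if n)


def weighted_entropy(count_list):
    """Average entropy of the groups, weighted by group size."""
    total = sum(sum(c.values()) for c in count_list)
    return sum(sum(c.values()) / total * entropy(c) for c in count_list)


def merge_branches(rows, labels, attr):
    """Agglomeratively merge the branches of a multiway split on `attr`
    until at most two remain, so the split becomes binary.

    Returns a list of (value_set, class_counts) pairs.
    """
    # Start with one cluster per distinct attribute value.
    clusters = {}
    for row, label in zip(rows, labels):
        clusters.setdefault(row[attr], Counter())[label] += 1
    groups = [({value}, counts) for value, counts in clusters.items()]

    while len(groups) > 2:
        best = None
        # Try every pair and keep the merger with the lowest weighted entropy.
        for i, j in combinations(range(len(groups)), 2):
            merged = (groups[i][0] | groups[j][0], groups[i][1] + groups[j][1])
            rest = [g for k, g in enumerate(groups) if k not in (i, j)]
            candidate = rest + [merged]
            score = weighted_entropy([c for _, c in candidate])
            if best is None or score < best[0]:
                best = (score, candidate)
        groups = best[1]
    return groups
```

Each pass tries every pair of branches, so the sketch is quadratic in the number of attribute values per merge step; that is acceptable for illustration but would need care on attributes with many distinct values.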

Keywords

★ decision tree
★ clustering
★ constrained tree
★ data mining
★ classification

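Table of contents

Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation and Objectives
1.3 Research Process
1.4 Thesis Organization
Chapter 2 Literature Review
2.1 The C4.5 Algorithm
2.2 Decision Tree Pruning
2.3 Constrained Trees
Chapter 3 Problem Description and Definitions
3.1 Problem Description
3.2 Related Definitions
Chapter 4 The BiTree Algorithm
4.1 Basic Concepts of the Algorithm
4.2 Algorithm Structure
4.3 Illustrative Example
Chapter 5 Experimental Evaluation
5.1 Development Tools and Environment
5.2 Experimental Design and Procedure
5.3 Evaluation Criteria
5.4 Experimental Results and Analysis
Chapter 6 Conclusion
6.1 Conclusions
6.2 Research Contributions
6.3 Directions for Future Research
References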

References

1. Bishop, C.M., Neural Networks for Pattern Recognition, Oxford University Press, New York, 1995.
2. Bohanec, M. and Bratko, I., “Trading accuracy for simplicity in decision trees,” Machine Learning, Vol. 15, pp. 223–250, 1994.
3. Breiman, L., Friedman, J., Olshen, R., and Stone, C., Classification and Regression Trees, Wadsworth Statistics, 1984.
4. Cheeseman, P., Kelly, J., and Self, M., “AutoClass: A Bayesian classification system,” Machine Learning, Vol. 5, 1988.
5. Frawley, W.J., Piatetsky-Shapiro, G., and Matheus, C.J., “Knowledge Discovery in Databases: An Overview,” Knowledge Discovery in Databases, pp. 1–27, 1991.
6. Goldberg, D.E., Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, 1989.
7. Garofalakis, M., Hyun, D., Rastogi, R., and Shim, K., “Building Decision Trees with Constraints,” Data Mining and Knowledge Discovery, Vol. 7, pp. 187–214, 2003.
8. Krichevsky, R. and Trofimov, V., “The performance of universal encoding,” IEEE Transactions on Information Theory, Vol. 27, pp. 199–207, 1981.
9. Mehta, M., Rissanen, J., and Agrawal, R., “MDL-based decision tree pruning,” Knowledge Discovery in Databases and Data Mining, Montreal, Canada, 1995.
10. Mingers, J., “Expert Systems - Rule Induction with Statistical Data,” Journal of the Operational Research Society, Vol. 38, pp. 39–47, 1987.
11. Niblett, T. and Bratko, I., “Learning Decision Rules in Noisy Domains,” Proceedings of Expert Systems 86, Cambridge University Press, pp. 25–34, 1987.
12. Quinlan, J.R. and Rivest, R.L., “Inferring decision trees using the minimum description length principle,” Information and Computation, Vol. 80, pp. 227–248, 1989.
13. Quinlan, J.R., C4.5: Programs for Machine Learning, Morgan Kaufmann Publishers, San Mateo, CA, 1993.
14. Reed, R., “Pruning algorithms - a survey,” IEEE Transactions on Neural Networks, Vol. 4, pp. 740–747, 1993.
15. Safavian, S.R. and Landgrebe, D., “A survey of decision tree classifier methodology,” IEEE Transactions on Systems, Man, and Cybernetics, Vol. 21, pp. 660–674, 1991.

Advisor

Yan-liang Chen (陳彥良)

Approval date

2009-07-20
