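DC Field | Value | Language
DC.contributor | 資訊管理學系 (Department of Information Management) | zh_TW
DC.creator | 胡筱薇 | zh_TW
DC.creator | Hsiao-Wei Hu | en_US
dc.date.accessioned | 2009-11-3T07:39:07Z |
dc.date.available | 2009-11-3T07:39:07Z |
dc.date.issued | 2009 |
dc.identifier.uri | http://ir.lib.ncu.edu.tw:444/thesis/view_etd.asp?URN=944403002 |
dc.contributor.department | 資訊管理學系 (Department of Information Management) | zh_TW
DC.description | 國立中央大學 (National Central University) | zh_TW
DC.description | National Central University | en_US
dc.description.abstract | Among the many data mining techniques, decision trees are a particularly popular classification method, mainly because the rules they produce are highly readable. However, the vast majority of the literature on decision tree classification assumes that the label is a categorical attribute, an assumption that often fails to reflect real-world situations, because the class label itself may be data with a concept hierarchy, continuous numerical data, or continuous numerical data with a concept hierarchy. To narrow this research gap and to handle classification under different label attributes, this study designs and develops a dedicated decision tree classifier for each of the three label types, named (1) HLC (Hierarchical Label Classifier), (2) CLC (Continuous Label Classifier), and (3) HCC (Hierarchical Continuous-label Classifier). The three proposed classifiers differ from traditional decision tree classifiers in their main functions, including how to control the growth of the tree, how to select a suitable test attribute, how to determine the label that best represents a leaf node, and how to predict new data. For HLC, the development strategy is to design a similarity measure between hierarchical labels by considering how the data are distributed across the concept hierarchy. For CLC, this study develops a novel dynamic discretization method to support a similarity measure between numerical labels. Finally, HCC combines the treatments used in HLC and CLC to design its own similarity measure. The experimental results show that HLC, CLC, and HCC can not only mine rules from a wide variety of labeled datasets but also achieve convincing accuracy and precision. | zh_TW
dc.description.abstract | Presently, decision tree classifiers are designed to classify data with categorical or Boolean labels. In many practical situations, however, there are more complex classification scenarios in which the labels to be predicted are not just nominal variables with a flat structure. For example, the predicted labels can be (1) hierarchically related, (2) continuous variables, or (3) hierarchical continuous variables. Unfortunately, existing research has paid little attention to constructing a decision tree (DT) from data with various types of labels.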
To remedy this research gap, this research has developed three innovative label-driven DT algorithms named (1) HLC (Hierarchical Label Classifier), (2) CLC (Continuous Label Classifier), and (3) HCC (Hierarchical Continuous-label Classifier). HLC, CLC, and HCC differ from traditional decision tree classifiers in several major functions, including growing a decision tree, selecting attributes, assigning labels to represent a leaf, and making a prediction for a new data instance. The development strategy of the proposed algorithms is mainly based on measuring similarity among labels, by considering the data distribution over the predefined concept hierarchy and by applying a proposed dynamic discretization to the continuous label at each node during the tree-induction process. The experimental results show that the proposed algorithms can not only mine classification rules from various types of labels but also achieve convincing accuracy and precision of rules. | en_US
DC.subject | 資料離散化 (data discretization) | zh_TW
DC.subject | 決策樹 (decision tree) | zh_TW
DC.subject | 資料探勘 (data mining) | zh_TW
DC.subject | 概念階層 (concept hierarchy) | zh_TW
DC.subject | Decision Tree | en_US
DC.subject | Data Discretization | en_US
DC.subject | Concept Hierarchy | en_US
DC.subject | Data Mining | en_US
DC.title | 不同標籤屬性變化下的決策樹建構系統 (A Decision Tree Construction System under Varying Label Attributes) | zh_TW
dc.language.iso | zh-TW | zh-TW
DC.title | Constructing Decision Trees from Data with Various Label-Driven Inductions | en_US
DC.type | 博碩士論文 (master's and doctoral theses) | zh_TW
DC.type | thesis | en_US
DC.publisher | National Central University | en_US