Constructing Decision Trees from Data with Various Label-Driven Inductions

DC field	Value	Language
DC.contributor	資訊管理學系	zh_TW
DC.creator	胡筱薇	zh_TW
DC.creator	Hsiao-Wei Hu	en_US
dc.date.accessioned	2009-11-03T07:39:07Z
dc.date.available	2009-11-03T07:39:07Z
dc.date.issued	2009
dc.identifier.uri	http://ir.lib.ncu.edu.tw:444/thesis/view_etd.asp?URN=944403002
dc.contributor.department	資訊管理學系	zh_TW
DC.description	國立中央大學	zh_TW
DC.description	National Central University	en_US
dc.description.abstract	在眾多資料探勘技術中，決策樹是相當受歡迎的一種分類方法，主要是因為決策樹分類方法所探勘出來的規則有較佳的可讀性。然而，在絕大部份探討決策樹分類的文獻中，均假設標籤為類別屬性，而這樣的假設往往無法反映所有真實世界的情況，因為分類標籤本身可能是具有概念階層的資料、連續性的數值資料亦或是具有概念階層的連續性數值資料。為了降低此研究構面的落差以及處理在不同標籤屬性變化之下的分類，本研究針對三種不同的標籤屬性，分別設計與發展其專屬的決策樹分類法，並命名為(1)HLC(Hierarchical Label Classifier)，(2)CLC(Continuous Label Classifier)，以及 (3)HCC(Hierarchical Continuous-label Classifier)。本研究所提出的三種創新決策樹分類法並不同於傳統決策樹分類法，其中在主要功能方面即包括:如何控制決策樹的長成、如何選擇適當的測試屬性、如何決定最適合代表葉節點的標籤以及如何預測新的資料。在HLC的發展策略方面，主要是以考量資料在概念階層之間的分布情形，來設計具概念階層之標籤間的相似度測量，而在CLC的發展策略中，本研究透過發展一創新的動態離散化方法來協助開發在數值屬性標籤間的相似度測量，最後，HCC的發展策略則是透過同時考量HLC與CLC的相關處理，進而設計其專屬的相似度測量方法。實驗結果說明 HLC, CLC 和 HCC 不僅能由各式各樣的標籤資料集來挖掘出規則，而且得到具說服性的正確率和精確率。	zh_TW
dc.description.abstract	At present, decision tree (DT) classifiers are designed to classify data with categorical or Boolean labels. In many practical situations, however, classification scenarios are more complex, and the labels to be predicted are not just nominal variables with a flat structure. For example, the predicted labels can be (1) hierarchically related, (2) continuous variables, or (3) hierarchical continuous variables. Unfortunately, existing research has paid little attention to constructing a DT from data with these various types of labels. To close this research gap, this research developed three innovative label-driven DT algorithms named (1) HLC (Hierarchical Label Classifier), (2) CLC (Continuous Label Classifier), and (3) HCC (Hierarchical Continuous-label Classifier). HLC, CLC, and HCC differ from traditional decision tree classifiers in several major functions, including growing a decision tree, selecting the test attribute, assigning a label to represent a leaf, and making a prediction for new data. The development strategy of the proposed algorithms is mainly based on measuring similarity among labels by considering the data distribution over the predefined concept hierarchy and by applying a proposed dynamic discretization to the continuous label at each node during the tree-induction process. The experimental results show that the proposed algorithms not only mine classification rules from a variety of label types but also achieve convincing accuracy and precision.	en_US
DC.subject	資料離散化	zh_TW
DC.subject	決策樹	zh_TW
DC.subject	資料探勘	zh_TW
DC.subject	概念階層	zh_TW
DC.subject	Decision Tree	en_US
DC.subject	Data Discretization	en_US
DC.subject	Concept Hierarchy	en_US
DC.subject	Data Mining	en_US
DC.title	不同標籤屬性變化下的決策樹建構系統	zh_TW
dc.language.iso	zh-TW	zh-TW
DC.title	Constructing Decision Trees from Data with Various Label-Driven Inductions	en_US
DC.type	博碩士論文	zh_TW
DC.type	thesis	en_US
DC.publisher	National Central University	en_US

Thesis 944403002: full metadata record