Thesis/Dissertation 944403002: Full Metadata Record

DC field | Value | Language
DC.contributor | 資訊管理學系 (Department of Information Management) | zh_TW
DC.creator | 胡筱薇 | zh_TW
DC.creator | Hsiao-Wei Hu | en_US
dc.date.accessioned | 2009-11-3T07:39:07Z |
dc.date.available | 2009-11-3T07:39:07Z |
dc.date.issued | 2009 |
dc.identifier.uri | http://ir.lib.ncu.edu.tw:444/thesis/view_etd.asp?URN=944403002 |
dc.contributor.department | 資訊管理學系 (Department of Information Management) | zh_TW
DC.description | 國立中央大學 | zh_TW
DC.description | National Central University | en_US
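dc.description.abstract | Among the many data mining techniques, the decision tree is a very popular classification method, mainly because the rules it mines are highly readable. However, the vast majority of the literature on decision tree classification assumes that the label is a categorical attribute, an assumption that often fails to reflect real-world situations: the class label itself may be data with a concept hierarchy, continuous numerical data, or continuous numerical data with a concept hierarchy. To narrow this research gap and to handle classification under different kinds of label attributes, this research designs and develops a dedicated decision tree classifier for each of three kinds of label attribute, named (1) HLC (Hierarchical Label Classifier), (2) CLC (Continuous Label Classifier), and (3) HCC (Hierarchical Continuous-label Classifier). The three proposed classifiers differ from traditional decision tree classifiers in their main functions, including how to control the growth of the tree, how to select an appropriate test attribute, how to determine the label that best represents a leaf node, and how to predict new data. The development strategy for HLC is to design a similarity measure among labels with a concept hierarchy by considering how the data are distributed across the concept hierarchy. For CLC, this research develops a novel dynamic discretization method to support a similarity measure among numerical labels. Finally, the strategy for HCC is to design its own similarity measure by considering the treatments used in HLC and CLC together. The experimental results show that HLC, CLC, and HCC can not only mine rules from data sets with a wide variety of labels but also achieve convincing accuracy and precision. | zh_TW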
dc.description.abstract | Most existing decision tree (DT) classifiers are designed to classify data with categorical or Boolean labels. In many practical situations, however, classification scenarios are more complex, and the labels to be predicted are not simply nominal variables with a flat structure. For example, the labels may be (1) hierarchically related, (2) continuous, or (3) both hierarchical and continuous. Existing research has paid little attention to constructing a DT from data with these types of labels. To close this gap, this research develops three label-driven DT algorithms: (1) HLC (Hierarchical Label Classifier), (2) CLC (Continuous Label Classifier), and (3) HCC (Hierarchical Continuous-label Classifier). HLC, CLC, and HCC differ from traditional decision tree classifiers in several major functions: growing the tree, selecting attributes, assigning a label to represent a leaf, and predicting the label of a new record. The proposed algorithms rest on measuring the similarity among labels, which is done by considering the distribution of the data over the predefined concept hierarchy and by applying a proposed dynamic discretization to the continuous label at each node during tree induction. The experimental results show that the proposed algorithms not only mine classification rules from data with a variety of label types but also achieve convincing accuracy and precision. | en_US
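DC.subject | 資料離散化 (Data Discretization) | zh_TW
DC.subject | 決策樹 (Decision Tree) | zh_TW
DC.subject | 資料探勘 (Data Mining) | zh_TW
DC.subject | 概念階層 (Concept Hierarchy) | zh_TW
DC.subject | Decision Tree | en_US
DC.subject | Data Discretization | en_US
DC.subject | Concept Hierarchy | en_US
DC.subject | Data Mining | en_US
DC.title | 不同標籤屬性變化下的決策樹建構系統 (A Decision Tree Construction System under Varying Label Attributes) | zh_TW
dc.language.iso | zh-TW | zh-TW
DC.title | Constructing Decision Trees from Data with Various Label-Driven Inductions | en_US
DC.type | 博碩士論文 (Master's and doctoral theses) | zh_TW
DC.type | thesis | en_US
DC.publisher | National Central University | en_US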

For questions about this thesis, contact the Outreach Services Section of the National Central University Library, TEL: (03) 422-7151 ext. 57407, or by e-mail.