NCU Institutional Repository (theses and dissertations, past exam papers, journal articles, and research project reports available for download): Item 987654321/93269


    Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/93269


    Title: The Impact of Ensemble Types on Feature Selection Performance: A Case Study of Microarray Data
    Authors: 鄭淨文;Cheng, Ching-Wen
    Contributors: Department of Information Management
    Keywords: feature selection; stability; microarray datasets; high-dimensional datasets; ensemble feature selection
    Date: 2023-07-24
    Issue Date: 2024-09-19 16:51:16 (UTC+8)
    Publisher: National Central University
    Abstract: This study addresses the instability of feature selection methods in high-dimensional, low-sample-size application domains. Although feature selection plays a critical role in improving prediction performance, minor variations in the data can lead to markedly different sets of selected features, which undermines the credibility of the resulting models. To improve the stability of feature selection, this study investigates the influence of ensemble learning on feature selection and analyzes the optimal parameters and combinations for the homogeneous and heterogeneous ensemble frameworks.
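The stability problem described above can be made concrete with a small sketch. It assumes stability is scored as the mean pairwise Jaccard similarity between the feature subsets chosen on perturbed copies of the training data; the abstract does not say which stability measure the thesis uses, so this metric, the toy mean-gap filter, and all function names here are illustrative assumptions.

```python
import random
from itertools import combinations
from statistics import mean


def jaccard(a, b):
    """Jaccard similarity of two feature-index sets (1.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def selection_stability(subsets):
    """Mean pairwise Jaccard similarity over all selected subsets (assumed metric)."""
    pairs = list(combinations(subsets, 2))
    return mean(jaccard(a, b) for a, b in pairs) if pairs else 1.0


def top_k_by_mean_gap(X, y, k):
    """Toy filter (hypothetical): rank features by |mean(class 1) - mean(class 0)|."""
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(abs(mean(pos) - mean(neg)))
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order[:k]


def subsample_stratified(X, y, fraction, rng):
    """Keep a fraction of each class so both labels stay present."""
    idx = []
    for label in (0, 1):
        members = [i for i, v in enumerate(y) if v == label]
        idx += rng.sample(members, max(1, int(len(members) * fraction)))
    return [X[i] for i in idx], [y[i] for i in idx]


if __name__ == "__main__":
    rng = random.Random(0)
    n_samples, n_features, k = 20, 200, 10
    y = [i % 2 for i in range(n_samples)]
    # Synthetic data: only the first 5 features carry class signal.
    X = [[rng.gauss(1.5 * label if j < 5 else 0.0, 1.0) for j in range(n_features)]
         for label in y]
    subsets = []
    for _ in range(10):
        Xs, ys = subsample_stratified(X, y, 0.8, rng)
        subsets.append(top_k_by_mean_gap(Xs, ys, k))
    print(f"stability = {selection_stability(subsets):.2f}")
```

A score near 1.0 means the same features are chosen regardless of which samples are dropped; with far more features than samples, noise features tend to enter the top k and pull the score down.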
    Ensemble feature selection can be divided into homogeneous, heterogeneous, and hybrid ensembles. A homogeneous ensemble creates data diversity by sampling the training set multiple times and applying the same feature selection method to each sample. A heterogeneous ensemble instead creates method diversity by applying several different feature selection methods. A hybrid ensemble leverages both data diversity and method diversity.
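The three ensemble types can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the two filter scorers, the within-class bootstrap, binary 0/1 labels, and aggregation by mean rank are all assumptions introduced here, and `hybrid_ensemble` is a generic combination rather than either of the hybrid frameworks the study proposes.

```python
import math
import random
from statistics import mean, pvariance


def split_by_class(X, y, j):
    """Values of feature j for class 1 and class 0 (binary labels assumed)."""
    pos = [row[j] for row, label in zip(X, y) if label == 1]
    neg = [row[j] for row, label in zip(X, y) if label == 0]
    return pos, neg


def mean_gap_scores(X, y):
    """Hypothetical filter 1: absolute difference of class means."""
    return [abs(mean(p) - mean(n))
            for p, n in (split_by_class(X, y, j) for j in range(len(X[0])))]


def scaled_gap_scores(X, y):
    """Hypothetical filter 2: class-mean gap divided by within-class spread."""
    scores = []
    for j in range(len(X[0])):
        pos, neg = split_by_class(X, y, j)
        spread = math.sqrt(pvariance(pos) + pvariance(neg) + 1e-12)
        scores.append(abs(mean(pos) - mean(neg)) / spread)
    return scores


def ranks_from_scores(scores):
    """Rank of each feature, 0 being the highest score."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    ranks = [0] * len(scores)
    for position, j in enumerate(order):
        ranks[j] = position
    return ranks


def aggregate_top_k(rankings, k):
    """Combine rankings by mean rank (assumed aggregator) and keep the best k."""
    n_features = len(rankings[0])
    mean_rank = [mean(r[j] for r in rankings) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: mean_rank[j])[:k]


def bootstrap(X, y, rng):
    """Resample with replacement inside each class so both classes survive."""
    idx = []
    for label in sorted(set(y)):
        members = [i for i, v in enumerate(y) if v == label]
        idx += [rng.choice(members) for _ in members]
    return [X[i] for i in idx], [y[i] for i in idx]


def homogeneous_ensemble(X, y, scorer, k, n_rounds=20, seed=0):
    """Data diversity: one scorer applied to many bootstrap samples."""
    rng = random.Random(seed)
    rankings = []
    for _ in range(n_rounds):
        Xb, yb = bootstrap(X, y, rng)
        rankings.append(ranks_from_scores(scorer(Xb, yb)))
    return aggregate_top_k(rankings, k)


def heterogeneous_ensemble(X, y, scorers, k):
    """Method diversity: several scorers applied to the same data."""
    return aggregate_top_k([ranks_from_scores(s(X, y)) for s in scorers], k)


def hybrid_ensemble(X, y, scorers, k, n_rounds=20, seed=0):
    """Generic hybrid: every scorer applied to every bootstrap sample."""
    rng = random.Random(seed)
    rankings = []
    for _ in range(n_rounds):
        Xb, yb = bootstrap(X, y, rng)
        rankings.extend(ranks_from_scores(s(Xb, yb)) for s in scorers)
    return aggregate_top_k(rankings, k)
```

Each function takes `X` as a list of sample rows and `y` as a list of 0/1 labels and returns the indices of the `k` selected features, so the three variants can be compared on the same data with the same `k`.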
    Based on the concept of the hybrid ensemble, this study proposes two hybrid ensemble frameworks: the hierarchical ensemble and the sampling heterogeneous ensemble. The results show that the homogeneous ensemble improves the stability of feature selection but may slightly decrease prediction performance. The heterogeneous ensemble offers only limited improvement in feature selection performance. Among the hybrid ensembles, the hierarchical ensemble outperforms the sampling heterogeneous ensemble: it maintains prediction performance while further improving the stability of feature selection. The study hopes these findings can provide more stable feature selection methods for research on high-dimensional, low-sample-size datasets.
    Appears in Collections:[Graduate Institute of Information Management] Electronic Thesis & Dissertation

    Files in This Item:

    File: index.html; Size: 0Kb; Format: HTML


    All items in NCUIR are protected by copyright, with all rights reserved.

