DC Field | Value | Language |
DC.contributor | 資訊管理學系 | zh_TW |
DC.creator | 鄭淨文 | zh_TW |
DC.creator | Ching-Wen Cheng | en_US |
dc.date.accessioned | 2023-07-24T07:39:07Z | |
dc.date.available | 2023-07-24T07:39:07Z | |
dc.date.issued | 2023 | |
dc.identifier.uri | http://ir.lib.ncu.edu.tw:444/thesis/view_etd.asp?URN=110423033 | |
dc.contributor.department | 資訊管理學系 | zh_TW |
DC.description | 國立中央大學 | zh_TW |
DC.description | National Central University | en_US |
dc.description.abstract | 本研究旨在解決特徵選擇方法在高維度少樣本的應用領域中的穩定性問題。儘管特徵選擇方法在提升模型的預測性能方面發揮了重要作用,但在實驗中,資料的微小變動可能導致選擇的特徵有顯著差異,從而影響模型的可信度。為了提升特徵選擇的穩定性,本研究探討集成學習對於特徵選擇的影響,並進一步分析同質集成與異質集成架構的最佳參數與組合。
集成特徵選擇主要可以分為同質集成、異質集成與混合集成,同質集成透過對訓練集進行多次抽樣來製造資料的多樣性,並使用同一特徵選擇方法進行多次評估。異質集成則是採用多種不同特徵選擇來製造方法的多樣性。混合集成則是同時採用資料多樣性與方法多樣性的特點。
本研究根據混合集成的概念提出兩種混合式的集成架構:階層式集成和抽樣異質集成。研究結果顯示,同質集成能有助於提升特徵選擇的穩定性,但可能會微幅降低預測性能;異質集成對於提升特徵選擇的效能有限;混合集成中以階層式集成表現優於抽樣異質集成,能在保持預測性能的同時,進一步提升特徵選擇的穩定性。本研究期望這些研究成果能為高維度少樣本的研究領域,提供更穩定的特徵選擇方法。 | zh_TW |
dc.description.abstract | This study addresses the stability issues of feature selection methods in high-dimensional, low-sample-size application domains. Despite the critical role of feature selection methods in enhancing prediction performance, minor variations in the data during experiments can lead to significant differences in the selected features, thereby impacting the credibility of the models. To improve the stability of feature selection, this study investigates the influence of ensemble learning on feature selection. Further, it analyzes the optimal parameters and combinations of the homogeneous and the heterogeneous ensemble frameworks.
Ensemble feature selection can be divided into the homogeneous, the heterogeneous, and the hybrid ensembles. The homogeneous ensemble creates diversity in the data by drawing multiple samples from the training set and applying the same feature selection method to each sample. In contrast, the heterogeneous ensemble introduces methodological diversity by employing several distinct feature selection methods. The hybrid ensemble, meanwhile, leverages both data diversity and method diversity.
Based on the concept of the hybrid ensemble, this study proposes two hybrid ensemble frameworks: the hierarchical ensemble and the sampling heterogeneous ensemble. The results show that while the homogeneous ensemble can enhance the stability of feature selection, it may slightly decrease prediction performance. The heterogeneous ensemble has limited effects on improving the overall evaluation of feature selection. Among the hybrid ensembles, the hierarchical ensemble outperforms the sampling heterogeneous ensemble, as it maintains prediction performance and further enhances the stability of feature selection. This study hopes these findings can provide more stable feature selection methods for the research domain of high-dimensional and low-sample-size datasets. | en_US |
DC.subject | 特徵選擇 | zh_TW |
DC.subject | 穩定性 | zh_TW |
DC.subject | 微陣列資料集 | zh_TW |
DC.subject | 高維度資料集 | zh_TW |
DC.subject | 集成特徵選擇 | zh_TW |
DC.subject | feature selection | en_US |
DC.subject | stability | en_US |
DC.subject | microarray datasets | en_US |
DC.subject | high-dimensional datasets | en_US |
DC.subject | Ensemble Feature Selection | en_US |
DC.title | 集成樣態對特徵選擇的效能影響—以微陣列資料為例 | zh_TW |
dc.language.iso | zh-TW | zh-TW |
DC.type | 博碩士論文 | zh_TW |
DC.type | thesis | en_US |
DC.publisher | National Central University | en_US |
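
The homogeneous ensemble described in the abstract (resample the training set several times, apply one feature selection method to each sample, then combine the results) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: the toy filter score (absolute difference of class means), the stratified bootstrap, aggregation by summed rank, binary labels 0/1, and the Jaccard index as a stability measure. The function names are hypothetical.

```python
import random
from statistics import mean


def score_features(X, y):
    """Toy filter score (assumption): |mean of class 1 - mean of class 0| per feature."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    return [
        abs(mean(r[j] for r in pos) - mean(r[j] for r in neg))
        for j in range(len(X[0]))
    ]


def stratified_bootstrap(X, y, rng):
    """Resample with replacement inside each class so no class disappears."""
    Xb, yb = [], []
    for label in sorted(set(y)):
        idx = [i for i, v in enumerate(y) if v == label]
        for i in rng.choices(idx, k=len(idx)):
            Xb.append(X[i])
            yb.append(label)
    return Xb, yb


def homogeneous_ensemble(X, y, k, n_rounds=20, seed=0):
    """Run the same scorer on n_rounds bootstrap samples and keep the k
    features with the lowest summed rank (rank 0 = best)."""
    rng = random.Random(seed)
    n_features = len(X[0])
    rank_sum = [0] * n_features
    for _ in range(n_rounds):
        Xb, yb = stratified_bootstrap(X, y, rng)
        scores = score_features(Xb, yb)
        order = sorted(range(n_features), key=lambda j: -scores[j])
        for rank, j in enumerate(order):
            rank_sum[j] += rank
    best = sorted(range(n_features), key=lambda j: rank_sum[j])[:k]
    return sorted(best)


def jaccard(a, b):
    """Overlap of two selected feature subsets; 1.0 means identical."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)
```

One way to probe stability under these assumptions is to call `homogeneous_ensemble` on two slightly different training subsets and compare the returned index lists with `jaccard`; values closer to 1.0 indicate that the selection changed less when the data changed.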