Homogeneous Ensemble and Two-Phase Feature Selection Based on Relevance and Autoencoders


Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/92622

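Title:	Homogeneous Ensemble and Two-Phase Feature Selection Based on Relevance and Autoencoders (基於相關性與自動編碼器的同質集成與二階段特徵選擇)
Author:	Hsieh, Chen-An (謝鎮安)
Contributor:	Department of Information Management
Keywords:	Feature selection;High-dimensional dataset;Autoencoder;Ensemble learning;Stability
Date:	2023-07-24
Upload time:	2023-10-04 16:06:51 (UTC+8)
Publisher:	National Central University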
Abstract:	This study applies autoencoder feature selection to supervised tasks, compares its prediction performance and stability with those of relevance feature selection, and analyzes how a homogeneous ensemble and the proposed two-phase combination affect feature selection effectiveness, with the goal of establishing a better feature selection method. We constructed an autoencoder feature selection method based on the Gedeon method and compared it with four relevance feature selection methods: Impurity, ANOVA, ReliefF, and Mutual Information. The experimental results showed that autoencoder feature selection performed poorly without architectural improvements. In the homogeneous ensemble experiment, relevance feature selection achieved a better overall evaluation by sacrificing a small amount of prediction performance in exchange for improved stability, while autoencoder feature selection improved in both stability and prediction performance and outperformed relevance feature selection in prediction performance. In the two-phase experiment, using autoencoder feature selection as the first phase was the best combination order. Combining two feature selection methods that evaluate features in different ways, in this order, outperformed all non-ensemble and homogeneous ensemble feature selection methods in prediction performance.
Based on the experimental results, this study suggests choosing between the homogeneous ensemble and the two-phase combination according to the application scenario in order to improve the overall effectiveness of feature selection. The homogeneous ensemble focuses on improving stability. In contrast, the two-phase combination effectively improves prediction performance and maintains good stability by applying a homogeneous ensemble to the feature selection in both phases.
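The two architectures named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the abstract does not state how ensemble members are aggregated or how stability is measured, so mean-rank aggregation over stratified bootstrap samples and a pairwise Jaccard index are assumptions made here. The function names (`anova_f_scores`, `variance_scores`, `homogeneous_ensemble`, `two_phase`, `jaccard_stability`, `make_demo_data`) are hypothetical, and `variance_scores` is only an unsupervised stand-in for the first phase, not the Gedeon-based autoencoder selector.

```python
import random
from statistics import mean, pvariance


def anova_f_scores(X, y):
    """One-way ANOVA F statistic per feature (higher means more relevant)."""
    classes = sorted(set(y))
    n, k = len(y), len(classes)
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        grand = mean(col)
        groups = [[v for v, c in zip(col, y) if c == cls] for cls in classes]
        means = [mean(g) for g in groups]
        between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
        within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
        scores.append((between / (k - 1)) / (within / (n - k) + 1e-12))
    return scores


def variance_scores(X, y):
    """Assumed stand-in for an unsupervised scorer; NOT the autoencoder method."""
    return [pvariance([row[j] for row in X]) for j in range(len(X[0]))]


def homogeneous_ensemble(X, y, score_fn, k, n_rounds=20, seed=0):
    """Run one selector on stratified bootstrap samples; aggregate by mean rank."""
    rng = random.Random(seed)
    by_class = {}
    for i, c in enumerate(y):
        by_class.setdefault(c, []).append(i)
    n_features = len(X[0])
    rank_sum = [0] * n_features
    for _ in range(n_rounds):
        # Resample with replacement inside each class so no class disappears.
        idx = [rng.choice(m) for m in by_class.values() for _ in m]
        scores = score_fn([X[i] for i in idx], [y[i] for i in idx])
        order = sorted(range(n_features), key=lambda j: -scores[j])
        for rank, j in enumerate(order):
            rank_sum[j] += rank
    return sorted(range(n_features), key=lambda j: rank_sum[j])[:k]


def two_phase(X, y, first_fn, second_fn, k1, k2, **kw):
    """Phase 1 keeps k1 features; phase 2 picks k2 of those with another scorer."""
    kept = homogeneous_ensemble(X, y, first_fn, k1, **kw)
    X_reduced = [[row[j] for j in kept] for row in X]
    second = homogeneous_ensemble(X_reduced, y, second_fn, k2, **kw)
    return [kept[j] for j in second]  # map back to original feature indices


def jaccard_stability(subsets):
    """Mean pairwise Jaccard similarity between selected feature subsets."""
    pairs = [(a, b) for i, a in enumerate(subsets) for b in subsets[i + 1:]]
    return mean(len(set(a) & set(b)) / len(set(a) | set(b)) for a, b in pairs)


def make_demo_data(n=40, seed=1):
    """Synthetic data: features 0 and 1 carry the label, features 2-5 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [5 * label + rng.gauss(0, 0.5), -5 * label + rng.gauss(0, 0.5)]
        row += [rng.gauss(0, 1) for _ in range(4)]
        X.append(row)
        y.append(label)
    return X, y


X, y = make_demo_data()
ensemble_pick = homogeneous_ensemble(X, y, anova_f_scores, k=2)
two_phase_pick = two_phase(X, y, variance_scores, anova_f_scores, k1=4, k2=2)
```

In this sketch, `two_phase` runs the ensemble inside both phases, mirroring the abstract's remark that stability is maintained by applying a homogeneous ensemble to the feature selection in both phases. Calling `jaccard_stability` on subsets selected under different seeds gives one possible stability estimate.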
Appears in Collections:	[Graduate Institute of Information Management] Theses and Dissertations

Files in This Item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	74	View/Open

All items in NCUIR are protected by original copyright.
