Thesis 110423033: Detailed Record




Author: Ching-Wen Cheng (鄭淨文)    Department: Information Management
Thesis Title: 集成樣態對特徵選擇的效能影響—以微陣列資料為例 (The Effect of Ensemble Configurations on Feature Selection Performance: A Case Study on Microarray Data)
Full text: available in the system after 2028-06-30 (embargoed)
Abstract (Chinese): This study addresses the stability problem of feature selection methods in high-dimensional, low-sample-size application domains. Although feature selection plays an important role in improving model prediction performance, small perturbations of the data during experiments can lead to markedly different selected features, undermining the credibility of the model. To improve the stability of feature selection, this study examines the effect of ensemble learning on feature selection and further analyzes the best parameters and combinations for homogeneous and heterogeneous ensemble frameworks.
Ensemble feature selection falls into three main types: homogeneous, heterogeneous, and hybrid ensembles. A homogeneous ensemble creates data diversity by sampling the training set multiple times and evaluating each sample with the same feature selection method. A heterogeneous ensemble creates method diversity by applying several different feature selection methods. A hybrid ensemble combines both data diversity and method diversity.
Building on the hybrid ensemble concept, this study proposes two hybrid ensemble frameworks: the hierarchical ensemble and the sampling heterogeneous ensemble. The results show that homogeneous ensembles help improve the stability of feature selection but may slightly reduce prediction performance; heterogeneous ensembles have only a limited effect on feature selection performance; among the hybrid ensembles, the hierarchical ensemble outperforms the sampling heterogeneous ensemble, further improving stability while maintaining prediction performance. These findings are expected to provide more stable feature selection methods for high-dimensional, low-sample-size research domains.
Abstract (English): This study addresses the stability issues of feature selection methods in high-dimensional and low-sample-size application domains. Despite the critical role of feature selection methods in enhancing prediction performance, minor variations in the data during experiments can lead to significant differences in the selected features, thereby impacting the credibility of the models. To improve the stability of feature selection, this study investigates the influence of ensemble learning on feature selection. Further, it analyzes the optimal parameters and combinations of the homogeneous and the heterogeneous ensemble frameworks.
Ensemble feature selection can be divided into the homogeneous, the heterogeneous, and the hybrid ensembles. The homogeneous ensemble creates diversity in the data by performing multiple samplings on the training set and utilizing the same feature selection method for multiple evaluations. In contrast, the heterogeneous ensemble introduces methodological diversity by employing various distinct feature selection methods. The hybrid ensembles, meanwhile, leverage both data diversity and method diversity.
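The homogeneous ensemble described above can be sketched in a few lines. This is a minimal illustration, not the thesis's actual implementation: it assumes scikit-learn's ANOVA F-score (`f_classif`) as the base filter and mean-rank aggregation, both of which are choices made here for the example.

```python
# Hypothetical sketch of a homogeneous ensemble filter: subsample the
# training set several times, score features with the SAME filter each
# time, and aggregate the per-round rankings by mean rank.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import f_classif

def homogeneous_ensemble_ranks(X, y, n_rounds=10, sample_ratio=0.8, seed=0):
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    rank_sum = np.zeros(n_features)
    for _ in range(n_rounds):
        # data diversity: draw a random subsample of the training set
        idx = rng.choice(n_samples, size=int(sample_ratio * n_samples),
                         replace=False)
        scores, _ = f_classif(X[idx], y[idx])
        # rank 0 = best feature in this round
        ranks = np.argsort(np.argsort(-scores))
        rank_sum += ranks
    # aggregate: order features by mean rank across rounds, best first
    return np.argsort(rank_sum)

# Synthetic stand-in for a high-dimensional, low-sample-size dataset.
X, y = make_classification(n_samples=60, n_features=200,
                           n_informative=5, random_state=1)
top10 = homogeneous_ensemble_ranks(X, y)[:10]
```

A heterogeneous ensemble would instead vary the filter across rounds (e.g. mutual information, chi-square, ReliefF) on the full training set, and a hybrid ensemble would vary both the subsample and the filter.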
Based on the concept of the hybrid ensemble, this study proposes two hybrid ensemble frameworks: the hierarchical ensemble and the sampling heterogeneous ensemble. The results show that while the homogeneous ensemble can enhance the stability of feature selection, it may slightly decrease prediction performance. The heterogeneous ensemble has limited effects on improving the overall evaluation of feature selection. Among the hybrid ensembles, the hierarchical ensemble outperforms the sampling heterogeneous ensemble, as it maintains prediction performance and further enhances the stability of feature selection. This study hopes these findings can provide more stable feature selection methods for the research domain of high-dimensional and low-sample-size datasets.
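Stability, the central quantity being improved here, is commonly measured by comparing the feature subsets selected across runs. As an illustration (the thesis's exact stability metric is not stated in this record), the widely used Kuncheva consistency index scores the overlap of two equal-size subsets, correcting for the overlap expected by chance:

```python
# Kuncheva consistency index for two selected subsets of size k drawn
# from d features: 1.0 for identical subsets, ~0 for chance overlap.
from itertools import combinations

def kuncheva_index(s1, s2, d):
    k = len(s1)
    r = len(set(s1) & set(s2))       # features selected in both runs
    expected = k * k / d             # overlap expected by chance
    return (r - expected) / (k - expected)

def average_stability(subsets, d):
    # mean pairwise consistency over all runs of a selection procedure
    pairs = list(combinations(subsets, 2))
    return sum(kuncheva_index(a, b, d) for a, b in pairs) / len(pairs)

print(kuncheva_index([0, 1, 2], [0, 1, 2], d=100))  # 1.0
```

A stable selector keeps this average close to 1 as the training data is perturbed; the trade-off studied in the thesis is between this quantity and prediction performance.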
Keywords (Chinese) ★ 特徵選擇 (feature selection)
★ 穩定性 (stability)
★ 微陣列資料集 (microarray datasets)
★ 高維度資料集 (high-dimensional datasets)
★ 集成特徵選擇 (ensemble feature selection)
Keywords (English) ★ feature selection
★ stability
★ microarray datasets
★ high-dimensional datasets
★ ensemble feature selection
Table of Contents:
Abstract (Chinese)
Abstract (English)
Acknowledgements
Table of Contents
List of Figures
List of Tables
1. Introduction
1-1 Research Background
1-2 Research Motivation
1-3 Research Objectives
2. Literature Review
2-1 Feature Selection Strategies
2-2 Filter-Based Feature Selection Methods
2-2-1 Feature Selection Methods Based on Information Theory
2-2-2 Feature Selection Methods Based on Random Forest Importance
2-2-3 Feature Selection Methods Based on Statistical Measures
2-2-4 Instance-Based Feature Selection Methods
2-3 Ensemble Feature Selection
2-4 Stability of Feature Selection
2-5 Classifiers
3. Research Methodology
3-1 Research Datasets
3-2 Data Preprocessing
3-3 Experimental Parameter Settings, Classifiers, and Feature Selection Methods
3-4 Evaluation Metrics
3-4-1 Stability Metrics
3-4-2 Prediction Performance Metrics
3-4-3 Stability-Performance Trade-off Metric
3-5 Feature Selection Experimental Procedure
3-6 Screening Feature Selection Methods with Better Prediction Performance
3-7 Effects of Homogeneous Ensembles on Feature Selection Methods
3-8 Effects of Heterogeneous Ensembles on Feature Selection Methods
3-9 Effects of Hybrid Ensembles on Feature Selection Methods
3-10 Ranking Stability
4. Experimental Results and Analysis
4-1 Screening Feature Selection Methods with Better Prediction Performance
4-2 Applicability Analysis of the Homogeneous Ensemble Framework
4-2-1 Effects of Ensemble Size and Sampling Ratio on Feature Selection Performance
4-2-2 Performance Analysis of the Homogeneous Ensemble Framework
4-2-3 Trade-off between Prediction Performance and Subset Stability
4-3 Applicability Analysis of the Heterogeneous Ensemble Framework
4-3-1 Effects of Feature Selection Combinations on Feature Selection Performance
4-4 Applicability Analysis of the Hybrid Ensemble Frameworks
4-4-1 Performance Analysis of the Hybrid Ensemble Frameworks
4-4-2 Trade-off between Prediction Performance and Subset Stability
4-5 Overall Comparison of the Ensemble Frameworks
4-5-1 Performance Analysis of the Best Prediction-Performance Methods in Each Framework
4-5-2 Performance Analysis of the Best Subset-Stability Methods in Each Framework
4-5-3 Overall Performance Comparison across Frameworks
4-6 Ranking Stability Analysis
4-6-1 Ranking Stability of Homogeneous Ensembles
4-6-2 Ranking Stability of Heterogeneous Ensembles
4-6-3 Ranking Stability of Hybrid Ensembles
4-6-4 Overall Ranking Stability Analysis
5. Conclusions
5-1 Conclusions and Contributions
5-2 Research Limitations
5-3 Future Work and Suggestions
References
Advisor: Kuen-Liang Su (蘇坤良)    Review Date: 2023-07-24
