Graduate Thesis 110423060: Detailed Record




Author: 吳冠諭 (Kuan-Yu Wu)   Department: Department of Information Management
Thesis Title: 結合過濾法、包裝法及嵌入法之集成式特徵選擇於軟體缺陷預測中之應用
(Integrating Filter, Wrapper, and Embedded Methods for Ensemble Feature Selection in Software Defect Prediction)
Related Theses
★ Exploring the Critical Communication Path in Project Management: A Case Study of an Enterprise Software Project
★ Applying and Exploring How Meeting Flow Facilitates Team Communication and Documentation in Agile Development: A Case Study of System Development at Bank T
★ A Decision Model for Continuous Staff Dispatch in Project-Based Information Services: A Case Study of High-Speed Rail Traffic Control Equipment Maintenance
★ A Study on Intervening in Case-Assignment Decisions from the Perspective of Organizational Justice
★ Applying Coordination Theory to Build a Collaborative Process for Problem Improvement in System Software Testing
★ Applying Case-Based Reasoning to Problem Management Systems: A Case Study of Notebook Computer Products
★ Applying the Theory of Constraints to Human Resource Allocation in Multi-Project Development
★ A Case Study of Applying the Meeting-Flow Method to Software Project Development: The Case of 翰昇科技公司
★ A Decision Model for Accepting Software Development Projects under Multiple Projects and Multi-Period Re-planning: A Case Study of the IT Department of Nanya Technology
★ Meeting-Oriented Agile Software Development and System Design: A Case Study of University Capstone Projects
★ An Object- and Attribute-Oriented Change Impact Analysis Method for Differentiated Product Design
★ An Experimental Study of the Effect of the Meeting-Flow Method on Teamwork Quality in University Capstone Projects
★ An Action Research Study on Implementing Agile Development in Undergraduate Capstone Projects: The Case of the Department of Information Management at National Central University
★ Building an Online Information System for Evaluating the Quality of Natural-Language Requirements
★ Combining Ontology and the Fuzzy Analytic Network Process for Identifying Risks and Risk Relationships in Software Testing
★ An Online Quality Assessment System for UML Structural Model Diagrams in Software Reverse Engineering
Files: Full text available in the repository system after 2028-07-01.
Abstract (Chinese) Software testing is an essential activity in the software development life cycle and consumes a large share of its time. If defect-prone modules can be predicted effectively and repaired in advance, considerable cost can be saved and higher-quality products delivered. Software defect prediction techniques are therefore applied to help developers reduce testing costs. Software metrics are one way to obtain objective descriptions of source code characteristics, and the resulting indicators are often used in software debugging. This study uses the NASA MDP and PROMISE software defect prediction datasets, in which multiple static software metrics extracted from source code serve as input features for machine learning models. Because these datasets are high-dimensional, they tend to increase training complexity and cause overfitting. To address this problem, this study adopts ensemble feature selection to reduce the dimensionality of the datasets before training. Unlike previous studies in software defect prediction, this study combines three types of feature selection techniques, namely the filter, wrapper, and embedded methods, together with three aggregation strategies for producing feature subsets: intersection, union, and multi-intersection. The goal is to overcome the limitations of any single feature selection method and thereby improve software defect prediction performance. The results show that union-based ensemble feature selection achieves higher prediction accuracy than single feature selection methods while maintaining a good feature reduction rate.
Abstract (English) Software testing is an important stage of the software development life cycle and consumes a significant amount of time. If defect-prone modules can be predicted and fixed in advance, considerable cost can be saved and higher-quality products delivered. Software defect prediction techniques are therefore applied to assist developers in reducing testing costs. Software metrics provide objective descriptions of source code and are often used for software debugging. In this study, the NASA MDP and PROMISE datasets were used; these datasets contain multiple static software metrics extracted from source code that serve as input features for machine learning models. However, their high dimensionality can lead to training complexity and overfitting. An ensemble feature selection method was therefore adopted to reduce the dimensionality of the datasets before training. Unlike previous studies in software defect prediction, this research integrates three types of feature selection techniques (filter, wrapper, and embedded methods) and employs three aggregation methods to generate feature subsets: union, intersection, and multi-intersection. This combination aims to overcome the limitations of a single feature selection method and to enhance software defect prediction performance. The results indicate that ensemble feature selection based on the union method provides higher prediction accuracy than single feature selection methods while maintaining a good feature reduction rate.
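The abstracts above describe the core procedure: run a filter, a wrapper, and an embedded feature selector independently on the defect dataset, then aggregate the three selected subsets by union, intersection, or multi-intersection before training a classifier. The Python sketch below illustrates one way this could look; the particular selectors (mutual information, RFE with logistic regression, random-forest importances), the subset size k, the synthetic data, and the reading of "multi-intersection" as "selected by at least two of the three methods" are illustrative assumptions, not the exact configuration used in the thesis.

# Hedged sketch of ensemble feature selection (filter + wrapper + embedded)
# with union / intersection / multi-intersection aggregation.
# Selector choices, k, and the data are assumptions for illustration only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE, SelectFromModel, SelectKBest, mutual_info_classif
from sklearn.linear_model import LogisticRegression

# Stand-in for a NASA MDP / PROMISE module-level metrics table.
X, y = make_classification(n_samples=500, n_features=40, n_informative=8, random_state=0)
k = 10  # target subset size per selector (assumed)

# Filter: rank features by mutual information with the defect label.
filter_idx = set(SelectKBest(mutual_info_classif, k=k).fit(X, y).get_support(indices=True))

# Wrapper: recursive feature elimination around a simple classifier.
wrapper_idx = set(
    RFE(LogisticRegression(max_iter=1000), n_features_to_select=k).fit(X, y).get_support(indices=True)
)

# Embedded: keep the k most important features of a random forest.
embedded_idx = set(
    SelectFromModel(RandomForestClassifier(random_state=0), max_features=k, threshold=-np.inf)
    .fit(X, y)
    .get_support(indices=True)
)

subsets = [filter_idx, wrapper_idx, embedded_idx]
union = set.union(*subsets)                # kept by at least one selector
intersection = set.intersection(*subsets)  # kept by all three selectors
multi_intersection = {                     # kept by at least two selectors (assumed reading)
    f for f in union if sum(f in s for s in subsets) >= 2
}

for name, feats in [("union", union), ("intersection", intersection),
                    ("multi-intersection", multi_intersection)]:
    print(f"{name}: {len(feats)} features -> reduction rate {1 - len(feats) / X.shape[1]:.2f}")

In this sketch the union keeps any feature chosen by at least one selector, which corresponds to the union-based variant the abstract reports as most accurate, while the intersection yields the smallest subset and the most aggressive feature reduction.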
Keywords ★ Software Defect Prediction
★ Machine Learning
★ Feature Selection
★ Ensemble Feature Selection
★ High-dimensional Data
Table of Contents
Abstract (Chinese) i
Abstract (English) ii
Table of Contents iii
List of Figures v
List of Tables vi
1. Introduction 1
1.1 Research Background 1
1.2 Research Motivation 2
1.3 Research Objectives 3
1.4 Thesis Organization 4
2. Literature Review 6
2.1 Software Defect Prediction 6
2.2 Feature Selection 8
2.3 Data Sampling 12
2.4 Ensemble Feature Selection 14
3. Research Methodology 16
3.1 Datasets 16
3.2 Experimental Procedure 19
3.3 Ensemble Feature Selection 22
3.4 Model Evaluation Metrics 24
4. Experimental Results 26
4.1 Experimental Setup 26
4.2 Results on the NASA Datasets 26
4.3 Results on the PROMISE Datasets 35
4.4 Feature Count Statistics 44
5. Experimental Analysis and Validation 48
5.1 Experimental Analysis 48
5.2 Two-Sample Median Difference Test (Wilcoxon Signed-Rank Test) 50
5.3 Validity Analysis 52
6. Conclusions and Future Research Directions 55
6.1 Research Contributions 55
6.2 Research Limitations 56
6.3 Future Research Directions 56
References 58
Advisor: 陳仲儼 (Chung-Yang Chen)   Date of Approval: 2023-07-14