References
1. Alasadi, S.A. and W.S. Bhaya, Review of data preprocessing techniques in data mining. Journal of Engineering and Applied Sciences, 2017. 12(16): p. 4102-4107.
2. Galar, M., et al., A review on ensembles for the class imbalance problem: bagging-, boosting-, and hybrid-based approaches. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 2011. 42(4): p. 463-484.
3. Cao, H., et al., Integrated oversampling for imbalanced time series classification. IEEE Transactions on Knowledge and Data Engineering, 2013. 25(12): p. 2809-2822.
4. Singh, B., N. Kushwaha, and O.P. Vyas, A feature subset selection technique for high dimensional data using symmetric uncertainty. Journal of Data Analysis and Information Processing, 2014. 2(04): p. 95.
5. Rachburee, N. and W. Punlumjeak. A comparison of feature selection approach between greedy, IG-ratio, Chi-square, and mRMR in educational mining. in 2015 7th international conference on information technology and electrical engineering (ICITEE). 2015. IEEE.
6. Omuya, E.O., G.O. Okeyo, and M.W. Kimwele, Feature selection for classification using principal component analysis and information gain. Expert Systems with Applications, 2021. 174: p. 114765.
7. Kovács, G., An empirical comparison and evaluation of minority oversampling techniques on a large number of imbalanced datasets. Applied Soft Computing, 2019. 83: p. 105662.
8. Douzas, G., F. Bacao, and F. Last, Improving imbalanced learning through a heuristic oversampling method based on k-means and SMOTE. Information Sciences, 2018. 465: p. 1-20.
9. Fernández, A., et al., SMOTE for learning from imbalanced data: progress and challenges, marking the 15-year anniversary. Journal of Artificial Intelligence Research, 2018. 61: p. 863-905.
10. Choudhary, R. and S. Shukla, A clustering based ensemble of weighted kernelized extreme learning machine for class imbalance learning. Expert Systems with Applications, 2021. 164: p. 114041.
11. Sagi, O. and L. Rokach, Ensemble learning: A survey. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 2018. 8(4): p. e1249.
12. Ali, U., K.S. Arif, and U. Qamar. A hybrid scheme for feature selection of high dimensional educational data. in 2019 International Conference on Communication Technologies (ComTech). 2019. IEEE.
13. Agrawal, P., et al., Metaheuristic algorithms on feature selection: A survey of one decade of research (2009-2019). IEEE Access, 2021. 9: p. 26766-26791.
14. Gazzah, S. and N.E.B. Amara. New oversampling approaches based on polynomial fitting for imbalanced data sets. in 2008 the eighth IAPR international workshop on document analysis systems. 2008. IEEE.
15. Barua, S., M.M. Islam, and K. Murase. ProWSyn: Proximity weighted synthetic oversampling technique for imbalanced data set learning. in Pacific-Asia Conference on Knowledge Discovery and Data Mining. 2013. Springer.
16. Sáez, J.A., et al., SMOTE–IPF: Addressing the noisy and borderline examples problem in imbalanced classification by a re-sampling method with filtering. Information Sciences, 2015. 291: p. 184-203.
17. Boonamnuay, S., N. Kerdprasop, and K. Kerdprasop, Classification and regression tree with resampling for classifying imbalanced data. International Journal of Machine Learning and Computing, 2018. 8(4): p. 336-340.
18. Claesen, M., et al., Fast prediction with SVM models containing RBF kernels. arXiv preprint arXiv:1403.0736, 2014.
19. Zhao, Z., et al., Imbalance learning for the prediction of N6-Methylation sites in mRNAs. BMC Genomics, 2018. 19(1): p. 1-10.
20. Chandrashekar, G. and F. Sahin, A survey on feature selection methods. Computers & Electrical Engineering, 2014. 40(1): p. 16-28.
21. Liu, H., M. Zhou, and Q. Liu, An embedded feature selection method for imbalanced data classification. IEEE/CAA Journal of Automatica Sinica, 2019. 6(3): p. 703-715.
22. Thabtah, F., et al., Data imbalance in classification: Experimental evaluation. Information Sciences, 2020. 513: p. 429-441.
23. Feng, W., W. Huang, and J. Ren, Class imbalance ensemble learning based on the margin theory. Applied Sciences, 2018. 8(5): p. 815.
24. Buda, M., A. Maki, and M.A. Mazurowski, A systematic study of the class imbalance problem in convolutional neural networks. Neural Networks, 2018. 106: p. 249-259.
25. Chawla, N.V., Data mining for imbalanced datasets: An overview. Data Mining and Knowledge Discovery Handbook, 2009: p. 875-886.
26. Leevy, J.L., et al., A survey on addressing high-class imbalance in big data. Journal of Big Data, 2018. 5(1): p. 1-30.
27. Khushi, M., et al., A comparative performance analysis of data resampling methods on imbalance medical data. IEEE Access, 2021. 9: p. 109960-109975.
28. Chawla, N.V., et al., SMOTE: synthetic minority over-sampling technique. Journal of Artificial Intelligence Research, 2002. 16: p. 321-357.
29. Han, H., W.-Y. Wang, and B.-H. Mao. Borderline-SMOTE: a new over-sampling method in imbalanced data sets learning. in International conference on intelligent computing. 2005. Springer.
30. Bunkhumpornpat, C., K. Sinapiromsaran, and C. Lursinsap. Safe-level-SMOTE: safe-level-synthetic minority over-sampling technique for handling the class imbalanced problem. in Pacific-Asia conference on knowledge discovery and data mining. 2009. Springer.
31. Yin, L., et al., Feature selection for high-dimensional imbalanced data. Neurocomputing, 2013. 105: p. 3-11.
32. Grobelnik, M. Feature selection for unbalanced class distribution and Naive Bayes. in ICML '99: Proceedings of the Sixteenth International Conference on Machine Learning. 1999. Citeseer.
33. Zheng, Z., X. Wu, and R. Srihari, Feature selection for text categorization on imbalanced data. ACM SIGKDD Explorations Newsletter, 2004. 6(1): p. 80-89.
34. He, H., et al. ADASYN: Adaptive synthetic sampling approach for imbalanced learning. in 2008 IEEE international joint conference on neural networks (IEEE world congress on computational intelligence). 2008. IEEE.
35. Barua, S., M. Islam, and K. Murase. A novel synthetic minority oversampling technique for imbalanced data set learning. in International Conference on Neural Information Processing. 2011. Springer.
36. Khoshgoftaar, T.M. and P. Rebours, Improving software quality prediction by noise filtering techniques. Journal of Computer Science and Technology, 2007. 22(3): p. 387-396.
37. Krawczyk, B., Learning from imbalanced data: open challenges and future directions. Progress in Artificial Intelligence, 2016. 5(4): p. 221-232.
38. Johnson, J.M. and T.M. Khoshgoftaar, Survey on deep learning with class imbalance. Journal of Big Data, 2019. 6(1): p. 1-54.
39. Moepya, S.O., S.S. Akhoury, and F.V. Nelwamondo. Applying cost-sensitive classification for financial fraud detection under high class-imbalance. in 2014 IEEE international conference on data mining workshop. 2014. IEEE.
40. Xu, Q., et al., Imbalanced fault diagnosis of rotating machinery via multi-domain feature extraction and cost-sensitive learning. Journal of Intelligent Manufacturing, 2020. 31(6): p. 1467-1481.
41. Fernández, A., et al., Cost-sensitive learning, in Learning from Imbalanced Data Sets. 2018, Springer. p. 63-78.
42. Elkan, C. The foundations of cost-sensitive learning. in International joint conference on artificial intelligence. 2001. Lawrence Erlbaum Associates Ltd.
43. Ribeiro, M.H.D.M. and L. dos Santos Coelho, Ensemble approach based on bagging, boosting and stacking for short-term prediction in agribusiness time series. Applied Soft Computing, 2020. 86: p. 105837.
44. Mosavi, A., et al., Ensemble boosting and bagging based machine learning models for groundwater potential prediction. Water Resources Management, 2021. 35(1): p. 23-37.
45. Dong, X., et al., A survey on ensemble learning. Frontiers of Computer Science, 2020. 14(2): p. 241-258.
46. Schapire, R.E., The strength of weak learnability. Machine Learning, 1990. 5(2): p. 197-227.
47. Freund, Y. and R.E. Schapire, A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 1997. 55(1): p. 119-139.
48. Palit, I. and C.K. Reddy, Scalable and parallel boosting with mapreduce. IEEE Transactions on Knowledge and Data Engineering, 2011. 24(10): p. 1904-1916.
49. Breiman, L., Bagging predictors. Machine Learning, 1996. 24(2): p. 123-140.
50. Oza, N.C. and S.J. Russell. Online bagging and boosting. in International Workshop on Artificial Intelligence and Statistics. 2001. PMLR.
51. Bauer, E. and R. Kohavi, An empirical comparison of voting classification algorithms: Bagging, boosting, and variants. Machine Learning, 1999. 36(1): p. 105-139.
52. Breiman, L., Random forests. Machine Learning, 2001. 45(1): p. 5-32.
53. Ghimire, D. and J. Lee, Extreme learning machine ensemble using bagging for facial expression recognition. Journal of Information Processing Systems, 2014. 10(3): p. 443-458.
54. Nikulin, V., G.J. McLachlan, and S.K. Ng. Ensemble approach for the classification of imbalanced data. in Australasian Joint Conference on Artificial Intelligence. 2009. Springer.
55. Du, H., et al., Online ensemble learning algorithm for imbalanced data stream. Applied Soft Computing, 2021. 107: p. 107378.
56. Lim, P., C.K. Goh, and K.C. Tan, Evolutionary cluster-based synthetic oversampling ensemble (eco-ensemble) for imbalance learning. IEEE Transactions on Cybernetics, 2016. 47(9): p. 2850-2861.
57. Huda, S., et al., An ensemble oversampling model for class imbalance problem in software defect prediction. IEEE Access, 2018. 6: p. 24184-24195.
58. Kira, K. and L.A. Rendell, A practical approach to feature selection, in Machine learning proceedings 1992. 1992, Elsevier. p. 249-256.
59. Xu, Z., et al., Discriminative semi-supervised feature selection via manifold regularization. IEEE Transactions on Neural Networks, 2010. 21(7): p. 1033-1047.
60. Raileanu, L.E. and K. Stoffel, Theoretical comparison between the gini index and information gain criteria. Annals of Mathematics and Artificial Intelligence, 2004. 41(1): p. 77-93.
61. Guyon, I. and A. Elisseeff, An introduction to variable and feature selection. Journal of Machine Learning Research, 2003. 3(Mar): p. 1157-1182.
62. Battiti, R., Using mutual information for selecting features in supervised neural net learning. IEEE Transactions on Neural Networks, 1994. 5(4): p. 537-550.
63. Lazar, C., et al., A survey on filter techniques for feature selection in gene expression microarray analysis. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2012. 9(4): p. 1106-1119.
64. Kohavi, R. and G.H. John, Wrappers for feature subset selection. Artificial Intelligence, 1997. 97(1-2): p. 273-324.
65. Peng, H., F. Long, and C. Ding, Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005. 27(8): p. 1226-1238.
66. Ferreira, A.J. and M.A. Figueiredo, An unsupervised approach to feature discretization and selection. Pattern Recognition, 2012. 45(9): p. 3048-3060.
67. Rostami, M., et al., Review of swarm intelligence-based feature selection methods. Engineering Applications of Artificial Intelligence, 2021. 100: p. 104210.
68. Pudil, P., J. Novovičová, and J. Kittler, Floating search methods in feature selection. Pattern Recognition Letters, 1994. 15(11): p. 1119-1125.
69. Reeves, S.J. and Z. Zhe, Sequential algorithms for observation selection. IEEE Transactions on Signal Processing, 1999. 47(1): p. 123-132.
70. Goldberg, D.E., Genetic algorithms. 2006: Pearson Education India.
71. Kennedy, J. and R. Eberhart. Particle swarm optimization. in Proceedings of ICNN'95 - International Conference on Neural Networks. 1995. IEEE.
72. Blum, A.L. and P. Langley, Selection of relevant features and examples in machine learning. Artificial Intelligence, 1997. 97(1-2): p. 245-271.
73. Jiménez-Cordero, A., J.M. Morales, and S. Pineda, A novel embedded min-max approach for feature selection in nonlinear support vector machine classification. European Journal of Operational Research, 2021. 293(1): p. 24-35.
74. Guyon, I., et al., Gene selection for cancer classification using support vector machines. Machine Learning, 2002. 46(1): p. 389-422.
75. Tibshirani, R., Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 1996. 58(1): p. 267-288.
76. Al-Tashi, Q., et al., Approaches to multi-objective feature selection: A systematic literature review. IEEE Access, 2020. 8: p. 125076-125096.
77. Chawla, N.V., N. Japkowicz, and A. Kotcz, Special issue on learning from imbalanced data sets. ACM SIGKDD Explorations Newsletter, 2004. 6(1): p. 1-6.
78. Chen, H., et al., Feature selection for imbalanced data based on neighborhood rough sets. Information Sciences, 2019. 483: p. 1-20.
79. Chen, X.-w. and M. Wasikowski. FAST: a ROC-based feature selection metric for small samples and imbalanced data classification problems. in Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining. 2008.
80. Alibeigi, M., S. Hashemi, and A. Hamzeh, DBFS: An effective Density Based Feature Selection scheme for small sample size and high dimensional imbalanced data sets. Data & Knowledge Engineering, 2012. 81: p. 67-103.
81. Kamalov, F., F. Thabtah, and H.H. Leung, Feature Selection in Imbalanced Data. Annals of Data Science, 2022: p. 1-15.
82. Deepa, T. and M. Punithavalli. An E-SMOTE technique for feature selection in high-dimensional imbalanced dataset. in 2011 3rd International Conference on Electronics Computer Technology. 2011. IEEE.
83. Liu, Y., et al., A classification method based on feature selection for imbalanced data. IEEE Access, 2019. 7: p. 81794-81807.
84. Van de Geer, J.P., Some aspects of Minkowski distance. 1995: Leiden University, Department of Data Theory.
85. Maldonado, S., J. López, and C. Vairetti, An alternative SMOTE oversampling strategy for high-dimensional datasets. Applied Soft Computing, 2019. 76: p. 380-389.
86. Uyun, S. and E. Sulistyowati, Feature selection for multiple water quality status: integrated bootstrapping and SMOTE approach in imbalance classes. International Journal of Electrical and Computer Engineering, 2020. 10(4): p. 4331.
87. Yin, H. and K. Gai. An empirical study on preprocessing high-dimensional class-imbalanced data for classification. in 2015 IEEE 17th International Conference on High Performance Computing and Communications, 2015 IEEE 7th International Symposium on Cyberspace Safety and Security, and 2015 IEEE 12th International Conference on Embedded Software and Systems. 2015. IEEE.
88. Marutho, D., S.H. Handaka, and E. Wijaya. The determination of cluster number at k-mean using elbow method and purity evaluation on headline news. in 2018 International Seminar on Application for Technology of Information and Communication. 2018. IEEE.
89. Liu, F. and Y. Deng, Determine the number of unknown targets in Open World based on Elbow method. IEEE Transactions on Fuzzy Systems, 2020. 29(5): p. 986-995.
90. Halimu, C., A. Kasem, and S.S. Newaz. Empirical comparison of area under ROC curve (AUC) and Mathew correlation coefficient (MCC) for evaluating machine learning algorithms on imbalanced datasets for binary classification. in Proceedings of the 3rd international conference on machine learning and soft computing. 2019.
91. Wardhani, N.W.S., et al. Cross-validation metrics for evaluating classification performance on imbalanced data. in 2019 international conference on computer, control, informatics and its applications (IC3INA). 2019. IEEE.