References
[1] Nnamoko, N., & Korkontzelos, I. (2020). Efficient treatment of outliers and class imbalance for diabetes prediction. Artificial Intelligence in Medicine, 104, 101815.
[2] Haddad, B. M., Yang, S., Karam, L. J., Ye, J., Patel, N. S., & Braun, M. W. (2016). Multifeature, sparse-based approach for defects detection and classification in semiconductor units. IEEE Transactions on Automation Science and Engineering, 15(1), 145-159.
[3] Pereira, R. M., Bertolini, D., Teixeira, L. O., Silla Jr, C. N., & Costa, Y. M. (2020). COVID-19 identification in chest X-ray images on flat and hierarchical classification scenarios. Computer Methods and Programs in Biomedicine, 194, 105532.
[4] Cui, Z., Xue, F., Cai, X., Cao, Y., Wang, G. G., & Chen, J. (2018). Detection of malicious code variants based on deep learning. IEEE Transactions on Industrial Informatics, 14(7), 3187-3196.
[5] Luque, A., Carrasco, A., Martín, A., & de Las Heras, A. (2019). The impact of class imbalance in classification performance metrics based on the binary confusion matrix. Pattern Recognition, 91, 216-231.
[6] García, S., Luengo, J., & Herrera, F. (2015). Data preprocessing in data mining (pp. 245–283). Springer.
[7] Garcia, S., Luengo, J., Sáez, J. A., Lopez, V., & Herrera, F. (2012). A survey of discretization techniques: Taxonomy and empirical analysis in supervised learning. IEEE Transactions on Knowledge and Data Engineering, 25(4), 734-750.
[8] Liu, H., & Setiono, R. (1997). Feature selection via discretization. IEEE Transactions on Knowledge and Data Engineering, 9(4), 642-645.
[9] He, H., & Garcia, E. A. (2009). Learning from imbalanced data. IEEE Transactions on Knowledge and Data Engineering, 21(9), 1263-1284.
[10] Johnson, J. M., & Khoshgoftaar, T. M. (2019). Survey on deep learning with class imbalance. Journal of Big Data, 6(1), 1-54.
[11] Chawla, N. V., Bowyer, K. W., Hall, L. O., & Kegelmeyer, W. P. (2002). SMOTE: synthetic minority over-sampling technique. Journal of Artificial Intelligence Research, 16, 321-357.
[12] Fernández, A., García, S., del Jesus, M. J., & Herrera, F. (2008). A study of the behaviour of linguistic fuzzy rule based classification systems in the framework of imbalanced data-sets. Fuzzy Sets and Systems, 159(18), 2378-2398.
[13] Krawczyk, B. (2016). Learning from imbalanced data: open challenges and future directions. Progress in Artificial Intelligence, 5(4), 221-232.
[14] Japkowicz, N., & Stephen, S. (2002). The class imbalance problem: A systematic study. Intelligent Data Analysis, 6(5), 429-449.
[15] Das, S., Datta, S., & Chaudhuri, B. B. (2018). Handling data irregularities in classification: Foundations, trends, and future challenges. Pattern Recognition, 81, 674-693.
[16] Santos, M. S., Abreu, P. H., Japkowicz, N., Fernández, A., & Santos, J. (2023). A unifying view of class overlap and imbalance: Key concepts, multi-view panorama, and open avenues for research. Information Fusion, 89, 228-253.
[17] Vuttipittayamongkol, P., & Elyan, E. (2020). Neighbourhood-based undersampling approach for handling imbalanced and overlapped data. Information Sciences, 509, 47-70.
[18] Liu, Y., Liu, Y., Yu, B. X. B., Zhong, S., & Hu, Z. (2023). Noise-robust oversampling for imbalanced data classification. Pattern Recognition, 133, 109008.
[19] Huang, C., Li, Y., Loy, C. C., & Tang, X. (2019). Deep imbalanced learning for face recognition and attribute prediction. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(11), 2781-2794.
[20] Li, T., Xia, Q., Zhao, M., Gui, Z., & Leng, S. (2020). Prospectivity mapping for tungsten polymetallic mineral resources, Nanling metallogenic belt, south China: Use of random forest algorithm from a perspective of data imbalance. Natural Resources Research, 29(1), 203-227.
[21] Bustillo, A., Pimenov, D. Y., Mia, M., & Kapłonek, W. (2021). Machine-learning for automatic prediction of flatness deviation considering the wear of the face mill teeth. Journal of Intelligent Manufacturing, 32(3), 895-912.
[22] Ma, H., Huang, W., Jing, Y., Yang, C., Han, L., Dong, Y., ... & Ruan, C. (2019). Integrating growth and environmental parameters to discriminate powdery mildew and aphid of winter wheat using bi-temporal Landsat-8 imagery. Remote Sensing, 11(7), 846.
[23] Liu, X. Y., Wu, J., & Zhou, Z. H. (2008). Exploratory undersampling for class-imbalance learning. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 39(2), 539-550.
[24] García, V., Sánchez, J. S., & Mollineda, R. A. (2012). On the effectiveness of preprocessing methods when dealing with different levels of class imbalance. Knowledge-Based Systems, 25(1), 13-21.
[25] Burez, J., & Van den Poel, D. (2009). Handling class imbalance in customer churn prediction. Expert Systems with Applications, 36(3), 4626-4636.
[26] Drummond, C., & Holte, R. C. (2003, August). C4.5, class imbalance, and cost sensitivity: Why under-sampling beats over-sampling. In Workshop on Learning from Imbalanced Datasets II (Vol. 11, pp. 1-8).
[27] Tomek, I. (1976). Two modifications of CNN. IEEE Transactions on Systems, Man, and Cybernetics, SMC-6(11), 769-772.
[28] Pereira, R. M., Costa, Y. M., & Silla Jr, C. N. (2020). MLTL: A multi-label approach for the Tomek Link undersampling algorithm. Neurocomputing, 383, 95-105.
[29] Choirunnisa, S., & Lianto, J. (2018, November). Hybrid method of undersampling and oversampling for handling imbalanced data. In 2018 International Seminar on Research of Information Technology and Intelligent Systems (ISRITI) (pp. 276-280). IEEE.
[30] Batista, G. E., Prati, R. C., & Monard, M. C. (2004). A study of the behavior of several methods for balancing machine learning training data. ACM SIGKDD Explorations Newsletter, 6(1), 20-29.
[31] Maslove, D. M., Podchiyska, T., & Lowe, H. J. (2013). Discretization of continuous features in clinical datasets. Journal of the American Medical Informatics Association, 20(3), 544-553.
[32] Tsai, C. F., & Chen, Y. C. (2019). The optimal combination of feature selection and data discretization: An empirical study. Information Sciences, 505, 282-293.
[33] Gómez, I., Ribelles, N., Franco, L., Alba, E., & Jerez, J. M. (2016). Supervised discretization can discover risk groups in cancer survival analysis. Computer Methods and Programs in Biomedicine, 136, 11-19.
[34] Gonzalez-Abril, L., Cuberos, F. J., Velasco, F., & Ortega, J. A. (2009). Ameva: An autonomous discretization algorithm. Expert Systems with Applications, 36(3), 5327-5332.
[35] Wen, L. Y., Min, F., & Wang, S. Y. (2017). A two-stage discretization algorithm based on information entropy. Applied Intelligence, 47, 1169-1185.
[36] Richeldi, M., & Rossotto, M. (1995). Class-driven statistical discretization of continuous attributes. In Machine Learning: ECML-95: 8th European Conference on Machine Learning Heraclion, Crete, Greece, April 25–27, 1995 Proceedings 8 (pp. 335-338). Springer Berlin Heidelberg.
[37] Fayyad, U., & Irani, K. (1993). Multi-interval discretization of continuous-valued attributes for classification learning. In Proceedings of the 13th International Joint Conference on Artificial Intelligence (pp. 1022-1027).
[38] Lavangnananda, K., & Chattanachot, S. (2017, February). Study of discretization methods in classification. In 2017 9th International Conference on Knowledge and Smart Technology (KST) (pp. 50-55). IEEE.
[39] Abraham, R., Simha, J. B., & Iyengar, S. S. (2006, December). A comparative analysis of discretization methods for Medical Datamining with Naïve Bayesian classifier. In 9th International Conference on Information Technology (ICIT'06) (pp. 235-236). IEEE.
[40] Clarke, E. J., & Barton, B. A. (2000). Entropy and MDL discretization of continuous variables for Bayesian belief networks. International Journal of Intelligent Systems, 15(1), 61-92.
[41] Makhalova, T., Kuznetsov, S. O., & Napoli, A. (2022). Mint: MDL-based approach for mining INTeresting numerical pattern sets. Data Mining and Knowledge Discovery, 1-38.
[42] Sun, X., Lin, X., Li, Z., & Wu, H. (2022). A comprehensive comparison of supervised and unsupervised methods for cell type identification in single-cell RNA-seq. Briefings in Bioinformatics, 23(2), bbab567.
[43] Abonizio, H. Q., Paraiso, E. C., & Barbon, S. (2021). Toward text data augmentation for sentiment analysis. IEEE Transactions on Artificial Intelligence, 3(5), 657-668.
[44] Chen, C. H., Patel, V. M., & Chellappa, R. (2017). Learning from ambiguously labeled face images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(7), 1653-1667.
[45] Mahapatra, D., Poellinger, A., & Reyes, M. (2022). Interpretability-guided inductive bias for deep learning based medical image. Medical Image Analysis, 81, 102551.
[46] Mori, T., & Uchihira, N. (2019). Balancing the trade-off between accuracy and interpretability in software defect prediction. Empirical Software Engineering, 24, 779-825.
[47] Kerber, R. (1992). ChiMerge: Discretization of numeric attributes. In Proceedings of the Tenth National Conference on Artificial Intelligence (pp. 123-128).
[50] Xu, D., Zhang, Z., & Shi, J. (2022). A New Multi-Sensor Stream Data Augmentation Method for Imbalanced Learning in Complex Manufacturing Process. Sensors, 22(11), 4042.
[52] Boser, B. E., Guyon, I. M., & Vapnik, V. N. (1992, July). A training algorithm for optimal margin classifiers. In Proceedings of the Fifth Annual Workshop on Computational Learning Theory (pp. 144-152).
[53] Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20, 273-297.
[54] Chang, C. C., & Lin, C. J. (2011). LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology (TIST), 2(3), 1-27.
[55] Salo, F., Nassif, A. B., & Essex, A. (2019). Dimensionality reduction with IG-PCA and ensemble classifier for network intrusion detection. Computer Networks, 148, 164-175.
[56] Burges, C. J. (1998). A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery, 2(2), 121-167.
[57] Rezvani, S., & Wang, X. (2021). Class imbalance learning using fuzzy ART and intuitionistic fuzzy twin support vector machines. Information Sciences, 578, 659-682.
[58] Shafizadeh-Moghadam, H., Tayyebi, A., Ahmadlou, M., Delavar, M. R., & Hasanlou, M. (2017). Integration of genetic algorithm and multiple kernel support vector regression for modeling urban growth. Computers, Environment and Urban Systems, 65, 28-40.
[59] Chen, P., Yuan, L., He, Y., & Luo, S. (2016). An improved SVM classifier based on double chains quantum genetic algorithm and its application in analogue circuit diagnosis. Neurocomputing, 211, 202-211.
[60] Fu, X., & Wang, L. (2003). Data dimensionality reduction with application to simplifying RBF network structure and improving classification performance. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 33(3), 399-409.
[61] Prajapati, G. L., & Patle, A. (2010). On performing classification using SVM with radial basis and polynomial kernel functions. In 2010 3rd International Conference on Emerging Trends in Engineering and Technology (pp. 512-515). IEEE.
[62] Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1, 81-106.
[63] Quinlan, J. R. (1993). C4.5: Programs for machine learning. San Francisco: Morgan Kaufmann.
[64] Singh, S., & Gupta, P. (2014). Comparative study ID3, CART and C4.5 decision tree algorithm: A survey. International Journal of Advanced Information Science and Technology (IJAIST), 27(27), 97-103.
[65] Breiman, L. (2001). Random forests. Machine Learning, 45, 5-32.
[66] Bader-El-Den, M., Teitei, E., & Perry, T. (2018). Biased random forest for dealing with the class imbalance problem. IEEE Transactions on Neural Networks and Learning Systems, 30(7), 2163-2172.
[67] Li, Y. S., Chi, H., Shao, X. Y., Qi, M. L., & Xu, B. G. (2020). A novel random forest approach for imbalance problem in crime linkage. Knowledge-Based Systems, 195, 105738.
[68] Tan, X., Su, S., Huang, Z., Guo, X., Zuo, Z., Sun, X., & Li, L. (2019). Wireless sensor networks intrusion detection based on SMOTE and the random forest algorithm. Sensors, 19(1), 203.
[69] Casa, A., Scrucca, L., & Menardi, G. (2021). Better than the best? Answers via model ensemble in density-based clustering. Advances in Data Analysis and Classification, 15, 599-623.
[70] Rokach, L. (2016). Decision forest: Twenty years of research. Information Fusion, 27, 111-125.
[71] Cano, A., Nguyen, D. T., Ventura, S., & Cios, K. J. (2016). ur-CAIM: improved CAIM discretization for unbalanced and balanced data. Soft Computing, 20, 173-188.
[72] Tahan, M. H., & Asadi, S. (2018). EMDID: Evolutionary multi-objective discretization for imbalanced datasets. Information Sciences, 432, 442-461.
[73] Pal, S. S., & Kar, S. (2019). Time series forecasting for stock market prediction through data discretization by fuzzistics and rule generation by rough set theory. Mathematics and Computers in Simulation, 162, 18-30.
[74] Serengil, S. I. (2021). ChefBoost: A lightweight boosted decision tree framework.
[75] Pineda-Bautista, B. B., Carrasco-Ochoa, J. A., & Martínez-Trinidad, J. F. (2011). General framework for class-specific feature selection. Expert Systems with Applications, 38(8), 10018-10024.