References
[1] V. Mayer-Schönberger and K. Cukier, Big Data : a Revolution That Will Transform How We Live, Work, and Think. London: John Murray, 2013.
[2] J. Zakir, T. Seymour, and K. Berg, “Big Data Analytics,” Int. Assoc. Comput. Inf. Syst., vol. 16, no. 2, pp. 81–90, 2015.
[3] U. Fayyad, G. Piatetsky-Shapiro, and P. Smyth, “From Data Mining to Knowledge Discovery in Databases,” AI Magazine, vol. 17, no. 3, pp. 37–54, 1996.
[4] M. Gera and S. Goel, “Data Mining - Techniques, Methods and Algorithms: A Review on Tools and their Validity,” Int. J. Comput. Appl., vol. 113, no. 18, pp. 22–29, 2015.
[5] G. S. Linoff and M. J. A. Berry, Data Mining Techniques for Marketing, Sales, and Customer Relationship Management, 2nd ed. New York: John Wiley & Sons, Inc., 2004.
[6] C. Kleissner, “Data mining for the enterprise,” Proc. Thirty-First Hawaii Int. Conf. Syst. Sci., vol. 7, pp. 295–304, 1998.
[7] J. Han, M. Kamber, and J. Pei, Data Mining: Concepts and Techniques, 3rd ed. New York: Elsevier Inc, 2012.
[8] G. Kesavaraj and S. Sukumaran, “A study on classification techniques in data mining,” in 2013 Fourth International Conference on Computing, Communications and Networking Technologies (ICCCNT), 2013, pp. 1–7.
[9] Y. Sun, M. S. Kamel, A. K. C. Wong, and Y. Wang, “Cost-sensitive boosting for classification of imbalanced data,” Pattern Recognit., vol. 40, no. 12, pp. 3358–3378, 2007.
[10] N. V. Chawla, N. Japkowicz, and A. Kolcz, “Editorial: Special Issue on Learning from Imbalanced Data Sets,” ACM SIGKDD Explor. Newsl., vol. 6, no. 1, pp. 1–6, 2004.
[11] M. Galar, A. Fernandez, E. Barrenechea, H. Bustince, and F. Herrera, “A Review on Ensembles for the Class Imbalance Problem: Bagging-, Boosting-, and Hybrid-Based Approaches,” IEEE Trans. Syst. Man Cybern. Part C Appl. Rev., vol. 42, no. 4, pp. 463–484, 2012.
[12] A. Orriols-Puig and E. Bernadó-Mansilla, “Evolutionary rule-based systems for imbalanced data sets,” Soft Comput., vol. 13, no. 3, pp. 213–225, 2009.
[13] Z.-B. Zhu and Z.-H. Song, “Fault diagnosis based on imbalance modified kernel Fisher discriminant analysis,” Chem. Eng. Res. Des., vol. 88, no. 8, pp. 936–951, 2010.
[14] W. Khreich, E. Granger, A. Miri, and R. Sabourin, “Iterative Boolean combination of classifiers in the ROC space: An application to anomaly detection with HMMs,” Pattern Recognit., vol. 43, no. 8, pp. 2732–2752, 2010.
[15] M. A. Mazurowski, P. A. Habas, J. M. Zurada, J. Y. Lo, J. A. Baker, and G. D. Tourassi, “Training neural network classifiers for medical decision making: The effects of imbalanced datasets on classification performance,” Neural Networks, vol. 21, no. 2–3, pp. 427–436, 2008.
[16] Y.-H. Liu and Y.-T. Chen, “Total Margin Based Adaptive Fuzzy Support Vector Machines for Multiview Face Recognition,” in 2005 IEEE International Conference on Systems, Man and Cybernetics, 2005, vol. 2, pp. 1704–1711.
[17] M. Kubat, R. C. Holte, and S. Matwin, “Machine learning for the detection of oil spills in satellite radar images,” Mach. Learn., vol. 30, no. 2–3, pp. 195–215, 1998.
[18] L. Yin, Y. Ge, K. Xiao, X. Wang, and X. Quan, “Feature selection for high-dimensional imbalanced data,” Neurocomputing, vol. 105, pp. 3–11, 2013.
[19] X.-Y. Liu and Z.-H. Zhou, Ensemble Methods for Class Imbalance Learning, 1st ed. Hoboken, NJ, USA: John Wiley & Sons, Inc., 2013.
[20] Y. Lin, Y. Lee, and G. Wahba, “Support Vector Machines for Classification in Nonstandard Situations,” Mach. Learn., vol. 46, pp. 191–202, 2002.
[21] J. Stefanowski and S. Wilk, “Selective pre-processing of imbalanced data for improving classification performance,” in Data Warehousing and Knowledge Discovery, LNCS, vol. 5182, pp. 283–292, 2008.
[22] N. V. Chawla, D. A. Cieslak, L. O. Hall, and A. Joshi, “Automatically countering imbalance and its empirical relationship to cost,” Data Min. Knowl. Discov., vol. 17, no. 2, pp. 225–252, 2008.
[23] M. Kubat and S. Matwin, “Addressing the Curse of Imbalanced Training Sets: One Sided Selection,” in Proceedings of the Fourteenth International Conference on Machine Learning, 1997, vol. 97, pp. 179–186.
[24] N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, “SMOTE: Synthetic minority over-sampling technique,” J. Artif. Intell. Res., vol. 16, pp. 321–357, 2002.
[25] D. Dai and S.-W. Hua, “Random Under-Sampling Ensemble Methods for Highly Imbalanced Rare Disease Classification,” in 12th International Conference on Data Mining (DMIN 2016), 2016, pp. 54–59.
[26] W.-C. Lin, C.-F. Tsai, Y.-H. Hu, and J.-S. Jhang, “Clustering-based undersampling in class-imbalanced data,” Inf. Sci., vol. 409–410, pp. 17–26, 2017.
[27] Z.-B. Sun, Q.-B. Song, and X.-Y. Zhu, “Using coding-based ensemble learning to improve software defect prediction,” IEEE Trans. Syst. Man Cybern. Part C Appl. Rev., vol. 42, no. 6, pp. 1806–1817, 2012.
[28] T.-F. Wu, C.-J. Lin, and R. C. Weng, “Probability Estimates for Multi-class Classification by Pairwise Coupling,” J. Mach. Learn. Res., vol. 5, pp. 975–1005, 2004.
[29] B. Das, N. C. Krishnan, and D. J. Cook, “Handling Class Overlap and Imbalance to Detect Prompt Situations in Smart Homes,” in IEEE 13th International Conference on Data Mining Workshops, 2013.
[30] G. E. A. P. A. Batista, R. C. Prati, and M. C. Monard, “A Study of the Behavior of Several Methods for Balancing Machine Learning Training Data,” ACM SIGKDD Explor. Newsl., vol. 6, no. 1, pp. 20–29, 2004.
[31] A. Ali, S. M. Shamsuddin, and A. L. Ralescu, “Classification with class imbalance problem,” Int. J. Adv. Soft Comput. Its Appl., vol. 7, no. 3, pp. 176–204, 2015.
[32] N. Japkowicz and S. Stephen, “The class imbalance problem: A systematic study,” Intell. Data Anal., vol. 6, no. 5, pp. 429–449, 2002.
[33] R. C. Holte, L. E. Acker, and B. W. Porter, “Concept learning and the problem of small disjuncts,” in Proceedings of the Eleventh International Joint Conference on Artificial Intelligence, 1989, pp. 813–818.
[34] G. M. Weiss and F. J. Provost, “Learning When Training Data are Costly: The Effect of Class Distribution on Tree Induction,” J. Artif. Intell. Res., vol. 19, pp. 315–354, 2003.
[35] S. Kotsiantis, D. Kanellopoulos, and P. Pintelas, “Handling imbalanced datasets: A review,” GESTS Int. Trans. Comput. Sci. Eng., vol. 30, no. 1, pp. 25–36, 2006.
[36] C. Drummond and R. C. Holte, “C4.5, class imbalance, and cost sensitivity: why under-sampling beats over-sampling,” in Proceedings of the International Conference on Machine Learning (ICML 2003) Workshop on Learning from Imbalanced Data Sets II, 2003, pp. 1–8.
[37] S. B. Kotsiantis and P. E. Pintelas, “Mixture of expert agents for handling imbalanced data sets,” Ann. Math. Comput. Teleinformatics, vol. 1, no. 1, pp. 46–55, 2003.
[38] I. Tomek, “Two Modifications of CNN,” IEEE Trans. Syst. Man. Cybern., vol. SMC-6, no. 11, pp. 769–772, 1976.
[39] G. Weiss, “Mining with rarity: A unifying framework,” SIGKDD Explor., vol. 6, no. 1, pp. 7–19, 2004.
[40] W. W. Cohen, “Fast effective rule induction,” in Proceedings of the Twelfth International Conference on Machine Learning, 1995, pp. 115–123.
[41] R. Longadge, S. S. Dongre, and L. Malik, “Class imbalance problem in data mining: review,” Int. J. Comput. Sci. Netw., vol. 2, no. 1, pp. 83–87, 2013.
[42] B. J. Frey and D. Dueck, “Clustering by passing messages between data points,” Science, vol. 315, no. 5814, pp. 972–976, 2007.
[43] J. MacQueen, “Some methods for classification and analysis of multivariate observations,” in Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, 1967, vol. 1, no. 233, pp. 281–297.
[44] J. A. Hartigan and M. A. Wong, “Algorithm AS 136: A K-Means Clustering Algorithm,” J. R. Stat. Soc. Ser. C (Appl. Stat.), vol. 28, no. 1, pp. 100–108, 1979.
[45] E. W. Forgy, “Cluster Analysis of Multivariate Data: Efficiency versus Interpretability of Classification,” Biometrics, vol. 21, no. 3, pp. 768–769, 1965.
[46] J. Holland, Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence. Cambridge, MA, USA: MIT Press, 1975.
[47] I. H. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques, 2nd ed. San Francisco: Morgan Kaufmann, 2005.
[48] M. Kantardzic, Data Mining: Concepts, Models, Methods, and Algorithms, 2nd ed. Hoboken: John Wiley & Sons, Inc., 2011.
[49] D. E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning. Boston: Addison-Wesley, 1989.
[50] Y. Freund and R. E. Schapire, “A decision-theoretic generalization of on-line learning and an application to boosting,” J. Comput. Syst. Sci., vol. 55, no. 1, pp. 119–139, 1997.
[51] R. E. Schapire, “The Strength of Weak Learnability,” Mach. Learn., vol. 5, no. 2, pp. 197–227, 1990.
[52] Y. Freund and R. E. Schapire, “Experiments with a New Boosting Algorithm,” in Proceedings of the International Conference on Machine Learning, 1996, pp. 148–156.
[53] L. Breiman, “Bagging Predictors,” Mach. Learn., vol. 24, no. 2, pp. 123–140, 1996.
[54] G. Douzas and F. Bacao, “Self-Organizing Map Oversampling (SOMO) for imbalanced data set learning,” Expert Syst. Appl., vol. 82, pp. 40–52, 2017.
[55] W. A. Rivera, “Noise Reduction A Priori Synthetic Over-Sampling for class imbalanced data sets,” Inf. Sci., vol. 408, pp. 146–161, 2017.
[56] G. G. Sundarkumar and V. Ravi, “A novel hybrid undersampling method for mining unbalanced datasets in banking and insurance,” Eng. Appl. Artif. Intell., vol. 37, pp. 368–377, 2015.
[57] L. Nanni, C. Fantozzi, and N. Lazzarini, “Coupling different methods for overcoming the class imbalance problem,” Neurocomputing, vol. 158, pp. 48–61, 2015.
[58] 陳景祥, R軟體：應用統計方法 [R Software: Applied Statistical Methods]. Taipei: 東華, 2010 (in Chinese).