References
[1]. Berry, M. J. A., and Linoff, G. (1997). Data Mining Techniques: For Marketing, Sales, and Customer Support. New York: John Wiley and Sons Inc.
[2]. Kleissner, C. (1998). Data Mining for the Enterprise. Proceedings of the 31st Annual Hawaii International Conference on System Sciences, 7, 295-304.
[3]. Chawla, N. V., Japkowicz, N., and Kotcz, A. (2004). Special Issue on Learning from Imbalanced Data Sets. SIGKDD Explorations, 6(1), 1-6.
[4]. Galar, M., Fernández, A., Barrenechea, E., Bustince, H., and Herrera, F. (2012). A review on ensembles for the class imbalance problem: bagging-, boosting-, and hybrid-based approaches. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 42(4), 463-484.
[5]. Mazurowski, M. A., Habas, P. A., Zurada, J. M., Lo, J. Y., Baker, J. A., and Tourassi, G. D. (2008). Training neural network classifiers for medical decision making: The effects of imbalanced datasets on classification performance. Neural Networks, 21(2-3), 427-436.
[6]. Zhu, Z. B., and Song, Z. H. (2010). Fault diagnosis based on imbalance modified kernel Fisher discriminant analysis. Chemical Engineering Research and Design, 88(8), 936-951.
[7]. Liu, Y. H., and Chen, Y. T. (2005). Total margin-based adaptive fuzzy support vector machines for multiview face recognition. Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, 2, 1704-1711.
[8]. Guyon, I., and Elisseeff, A. (2003). An introduction to variable and feature selection. Journal of Machine Learning Research, 3, 1157-1182.
[9]. Yen, S.-J., Lee, Y.-S., Lin, C.-H., and Ying, J.-C. (2006). Investigating the Effect of Sampling Methods for Imbalanced Data Distributions. IEEE International Conference on Systems, Man, and Cybernetics, 4163-4168.
[10]. Stefanowski, J., and Wilk, S. (2008). Selective pre-processing of imbalanced data for improving classification performance. Data Warehousing and Knowledge Discovery (Lecture Notes in Computer Science 5182), 283-292.
[11]. Lin, Y., Lee, Y., and Wahba, G. (2002). Support vector machines for classification in nonstandard situations. Machine Learning, 46, 191-202.
[12]. Chawla, N., Cieslak, D., Hall, L., and Joshi, A. (2008). Automatically countering imbalance and its empirical relationship to cost. Data Mining and Knowledge Discovery, 17, 225-252.
[13]. García, V., Mollineda, R. A., and Sánchez, J. S. (2008). On the k-NN performance in a challenging scenario of imbalance and overlapping. Pattern Analysis and Applications, 11, 269-280.
[14]. Yen, S.-J., and Lee, Y.-S. (2009). Cluster-based under-sampling approaches for imbalanced data distributions. Expert Systems with Applications, 36, 5718-5727.
[15]. Chawla, N. V., Bowyer, K. W., Hall, L. O., and Kegelmeyer, W. P. (2002). SMOTE: Synthetic minority over-sampling technique. Journal of Artificial Intelligence Research, 16, 321-357.
[16]. Wu, X. D., et al. (2008). Top 10 Algorithms in Data Mining. Knowledge and Information Systems, 14(1), 1-37.
[17]. Olvera-López, J. A., Carrasco-Ochoa, J. A., Martínez-Trinidad, J. F., and Kittler, J. (2010). A review of instance selection methods. Artificial Intelligence Review, 34, 133-143.
[18]. Frey, B. J., and Dueck, D. (2007). Clustering by Passing Messages Between Data Points. Science, 315(5814), 972-976.
[19]. Jia, S., Qian, Y., and Ji, Z. (2008). Band Selection for Hyperspectral Imagery Using Affinity Propagation. Proceedings of DICTA '08: Digital Image Computing: Techniques and Applications, 137-141.
[20]. Shang, F., Jiao, L., Shi, J., Wang, F., and Gong, M. (2012). Fast affinity propagation clustering: a multilevel approach. Pattern Recognition, 45, 474-486.
[21]. Bradley, A. P. (1997). The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognition, 30(7), 1145-1159.
[22]. Batista, G. E., Prati, R. C., and Monard, M. C. (2004). A study of the behavior of several methods for balancing machine learning training data. ACM SIGKDD Explorations Newsletter, 6(1), 20-29.
[23]. Japkowicz, N., and Stephen, S. (2002). The Class Imbalance Problem: A Systematic Study. Intelligent Data Analysis, 6(5), 429-449.
[24]. Kotsiantis, S., Kanellopoulos, D., and Pintelas, P. (2006). Handling imbalanced datasets: A review. GESTS International Transactions on Computer Science and Engineering, 30(1), 25-36.
[25]. Drummond, C. and Holte, R. C. (2003). C4.5, class imbalance, and cost sensitivity: why under-sampling beats over-sampling. In Workshop on learning from imbalanced datasets II (Vol. 11).
[26]. Weiss, G. (2004). Mining with rarity: A unifying framework. SIGKDD Explorations, 6(1), 7-19.
[27]. Cohen, W. W. (1995). Fast effective rule induction. In Proceedings of the Twelfth International Conference on Machine Learning, 115-123.
[28]. Raskutti, B., and Kowalczyk, A. (2004). Extreme rebalancing for SVMs: a case study. SIGKDD Explorations, 6(1), 60-69.
[29]. Longadge, R., Dongre, S. S., and Malik, L. (2013). Class Imbalance Problem in Data Mining: Review. International Journal of Computer Science and Network, 2(1), 1-6.
[30]. Liu, X. Y., and Zhou, Z. H. (2013). Ensemble Methods for Class Imbalance Learning. Imbalanced Learning: Foundations, Algorithms, and Applications, First Edition, 61-82.
[31]. López, V., Fernández, A., García, S., Palade, V., and Herrera, F. (2013). An insight into classification with imbalanced data: Empirical results and current trends on using data intrinsic characteristics. Information Sciences, 250, 113-141.
[32]. García, S., Derrac, J., Cano, J. R., and Herrera, F. (2012). Prototype Selection for Nearest Neighbor Classification: Taxonomy and Empirical Study. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(3), 417-435.
[33]. Kuncheva, L. I., and Sánchez, J. S. (2008). Nearest Neighbour Classifiers for Streaming Data with Delayed Labelling. Eighth IEEE International Conference on Data Mining, 869-874.
[34]. Cano, J. R., Herrera, F., and Lozano, M. (2003). Using Evolutionary Algorithms as Instance Selection for Data Reduction in KDD: An Experimental Study. IEEE Transactions on Evolutionary Computation, 7(6), 561-575.
[35]. Brighton, H., and Mellish, C. (2002). Advances in Instance Selection for Instance-Based Learning Algorithms. Data Mining and Knowledge Discovery, 6(2), 153-172.
[36]. Wilson, D. R., and Martinez, T. R. (2000). Reduction Techniques for Instance-Based Learning Algorithms. Machine Learning, 38, 257-286.
[37]. Nikolaidis, K., Goulermas, J. Y., and Wu, Q. H. (2011). A class boundary preserving algorithm for data condensation. Pattern Recognition, 44(3), 704-715.
[38]. Holland, J. H. (1975). Adaptation in Natural and Artificial Systems. MIT Press, Cambridge, MA.
[39]. Goldberg, D. E. (1989). Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley.
[40]. Herrera, F., Lozano, M., and Verdegay, J. L. (1998). Tackling Real-Coded Genetic Algorithms: Operators and Tools for Behavioural Analysis. Artificial Intelligence Review, 12, 265-319.
[41]. Baker, J. E. (1987). Reducing bias and inefficiency in the selection algorithm. Proceedings of the Second International Conference on Genetic Algorithms, 14-21.
[42]. Reeves, C. R. (1999). Foundations of Genetic Algorithms. Morgan Kaufmann.
[43]. Sikora, R., and Piramuthu, S. (2007). Framework for efficient feature selection in genetic algorithm based data mining. European Journal of Operational Research, 180 (2), 723-737.
[44]. MacQueen, J. (1967). Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, 1(14), 281-297.
[45]. Hartigan, J. A. and Wong, M. A. (1979). Algorithm AS 136: A k-means clustering algorithm. Journal of the Royal Statistical Society. Series C (Applied Statistics), 28(1), 100-108.
[46]. Forgy, E. W., (1965). Cluster analysis of multivariate data: efficiency versus interpretability of classifications. Biometrics, 21, 768.
[47]. Han, J., and Kamber, M. (2000). Data Mining: Concepts and Techniques (The Morgan Kaufmann Series in Data Management Systems). Morgan Kaufmann.
[48]. Witten, I. H. and Frank, E. (2005). Data Mining: Practical machine learning tools and techniques. Morgan Kaufmann.
[49]. Freund, Y., and Schapire, R. E. (1996). A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting. Journal of Computer and System Sciences, 55, 119-139.
[50]. Schapire, R. E. (1990). The strength of weak learnability. Machine learning, 5(2), 197-227.
[51]. Freund, Y., and Schapire, R. E. (1996). Experiments with a new boosting algorithm. Proceedings of the Thirteenth International Conference on Machine Learning (ICML), 148-156.
[52]. Bauer, E., and Kohavi, R. (1999). An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants. Machine Learning, 36, 105-139.
[53]. Dietterich, T. G. (2000). Ensemble methods in machine learning. First International Workshop on Multiple Classifier Systems (Lecture Notes in Computer Science 1857), 1-15.
[54]. Breiman, L. (1996). Bagging predictors. Machine learning, 24(2), 123-140.
[55]. 陳景祥 (2010). R Software: Applied Statistical Methods (in Chinese). Taipei: 東華.
[56]. 張智星 (2004). MATLAB Programming: An Introduction (in Chinese). 鈦思科技股份有限公司.