References
[1] C. Kleissner, “Data mining for the enterprise,” in Proc. Hawaii Int. Conf. Syst. Sci., vol. 7, pp. 295–304, 1998, doi: 10.1109/hicss.1998.649224.
[2] D. Hand, H. Mannila, and P. Smyth, Principles of Data Mining. Cambridge, MA: MIT Press, 2001.
[3] M. Burri, “Understanding the Implications of Big Data and Big Data Analytics for Competition Law,” in New Developments in Competition Law and Economics, Springer, 2019, pp. 241–263.
[4] A. M. Hormozi and S. Giles, “Data mining: A competitive weapon for banking and retail industries,” Inf. Syst. Manag., vol. 21, no. 2, pp. 62–71, 2004.
[5] P. K. Chan, W. Fan, A. L. Prodromidis, and S. J. Stolfo, “Distributed Data Mining in Credit Card Fraud Detection,” IEEE Intell. Syst. Their Appl., vol. 14, no. 6, pp. 67–74, 1999, doi: 10.1109/5254.809570.
[6] J. Burez and D. Van den Poel, “Handling class imbalance in customer churn prediction,” Expert Syst. Appl., vol. 36, no. 3, pp. 4626–4636, 2009.
[7] U. Fayyad, G. Piatetsky-Shapiro, and P. Smyth, “From data mining to knowledge discovery in databases,” AI Mag., vol. 17, no. 3, p. 37, 1996.
[8] M. Miller, “Visual Analytics of Spatio-Temporal Event Predictions: Investigating Causes for Urban Heat Islands,” 2018.
[9] S. Garcia, J. Luengo, and F. Herrera, Data Preprocessing in Data Mining, vol. 72. Springer, 2015.
[10] C. Phua, D. Alahakoon, and V. Lee, “Minority report in fraud detection,” ACM SIGKDD Explor. Newsl., vol. 6, no. 1, pp. 50–59, 2004, doi: 10.1145/1007730.1007738.
[11] M. A. Mazurowski, P. A. Habas, J. M. Zurada, J. Y. Lo, J. A. Baker, and G. D. Tourassi, “Training neural network classifiers for medical decision making: The effects of imbalanced datasets on classification performance,” Neural Networks, vol. 21, no. 2–3, pp. 427–436, 2008, doi: 10.1016/j.neunet.2007.12.031.
[12] D. D. Lewis and J. Catlett, “Heterogeneous uncertainty sampling for supervised learning,” in Machine Learning Proceedings 1994, Elsevier, 1994, pp. 148–156.
[13] Y. Li, G. Sun, and Y. Zhu, “Data imbalance problem in text classification,” in Proc. 3rd Int. Symp. Inf. Process. (ISIP), 2010, pp. 301–305, doi: 10.1109/ISIP.2010.47.
[14] R. C. Prati, G. E. A. P. A. Batista, and M. C. Monard, “Class imbalances versus class overlapping: An analysis of a learning system behavior,” Lect. Notes Artif. Intell., vol. 2972, pp. 312–321, 2004, doi: 10.1007/978-3-540-24694-7_32.
[15] M. Alibeigi, S. Hashemi, and A. Hamzeh, “DBFS: An effective Density Based Feature Selection scheme for small sample size and high dimensional imbalanced data sets,” Data Knowl. Eng., vol. 81–82, pp. 67–103, 2012, doi: 10.1016/j.datak.2012.08.001.
[16] T. Jo and N. Japkowicz, “Class imbalances versus small disjuncts,” ACM SIGKDD Explor. Newsl., vol. 6, no. 1, pp. 40–49, Jun. 2004, doi: 10.1145/1007730.1007737.
[17] N. V. Chawla, N. Japkowicz, and A. Kolcz, “Editorial: Special Issue on Learning from Imbalanced Data Sets,” ACM SIGKDD Explor. Newsl., vol. 6, no. 1, pp. 1–6, 2004.
[18] M. Galar, A. Fernandez, E. Barrenechea, H. Bustince, and F. Herrera, “A review on ensembles for the class imbalance problem: bagging-, boosting-, and hybrid-based approaches,” IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 42, no. 4, pp. 463–484, 2011.
[19] N. Japkowicz and S. Stephen, “The class imbalance problem: A systematic study,” Intell. Data Anal., vol. 6, no. 5, pp. 429–449, Jan. 2002, doi: 10.3233/ida-2002-6504.
[20] J. Stefanowski and S. Wilk, “Selective pre-processing of imbalanced data for improving classification performance,” in International Conference on Data Warehousing and Knowledge Discovery, 2008, pp. 283–292.
[21] N. V. Chawla, D. A. Cieslak, L. O. Hall, and A. Joshi, “Automatically countering imbalance and its empirical relationship to cost,” Data Min. Knowl. Discov., vol. 17, no. 2, pp. 225–252, 2008, doi: 10.1007/s10618-008-0087-0.
[22] A. Estabrooks, T. Jo, and N. Japkowicz, “A multiple resampling method for learning from imbalanced data sets,” Comput. Intell., vol. 20, no. 1, pp. 18–36, Feb. 2004, doi: 10.1111/j.0824-7935.2004.t01-1-00228.x.
[23] A. Orriols-Puig and E. Bernadó-Mansilla, “Evolutionary rule-based systems for imbalanced data sets,” Soft Comput., vol. 13, no. 3, pp. 213–225, 2009, doi: 10.1007/s00500-008-0319-7.
[24] X.-Y. Liu, J. Wu, and Z.-H. Zhou, “Exploratory undersampling for class-imbalance learning,” IEEE Trans. Syst., Man, Cybern. B, vol. 39, no. 2, pp. 539–550, 2008.
[25] N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, “SMOTE: Synthetic minority over-sampling technique,” J. Artif. Intell. Res., vol. 16, pp. 321–357, Jun. 2002, doi: 10.1613/jair.953.
[26] G. E. A. P. A. Batista, R. C. Prati, and M. C. Monard, “A study of the behavior of several methods for balancing machine learning training data,” ACM SIGKDD Explor. Newsl., vol. 6, no. 1, pp. 20–29, Jun. 2004, doi: 10.1145/1007730.1007735.
[27] E. Ramentol, Y. Caballero, R. Bello, and F. Herrera, “SMOTE-RSB*: A hybrid preprocessing approach based on oversampling and undersampling for high imbalanced data-sets using SMOTE and rough sets theory,” Knowl. Inf. Syst., vol. 33, no. 2, pp. 245–265, 2012, doi: 10.1007/s10115-011-0465-6.
[28] J. A. Olvera-López, J. A. Carrasco-Ochoa, J. F. Martínez-Trinidad, and J. Kittler, “A review of instance selection methods,” Artif. Intell. Rev., vol. 34, no. 2, pp. 133–143, 2010.
[29] W. Liu, Z. Wang, X. Liu, N. Zeng, Y. Liu, and F. E. Alsaadi, “A survey of deep neural network architectures and their applications,” Neurocomputing, vol. 234, pp. 11–26, Apr. 2017, doi: 10.1016/j.neucom.2016.12.038.
[30] C. Drummond and R. C. Holte, “C4.5, class imbalance, and cost sensitivity: Why under-sampling beats over-sampling,” in Workshop on Learning from Imbalanced Datasets II, 2003, vol. 11, pp. 1–8.
[31] S. Kotsiantis, D. Kanellopoulos, and P. Pintelas, “Handling imbalanced datasets: A review,” GESTS Int. Trans. Comput. Sci. Eng., vol. 30, no. 1, pp. 25–36, 2006.
[32] M. Kubat and S. Matwin, “Addressing the curse of imbalanced training sets: One-sided selection,” in Proc. 14th Int. Conf. Mach. Learn., 1997, pp. 179–186.
[33] D. L. Wilson, “Asymptotic properties of nearest neighbor rules using edited data,” IEEE Trans. Syst., Man, Cybern., no. 3, pp. 408–421, 1972.
[34] J. Han, J. Pei, and M. Kamber, Data Mining: Concepts and Techniques. Elsevier, 2011.
[35] L. Breiman, “Random forests,” Mach. Learn., vol. 45, no. 1, pp. 5–32, Oct. 2001, doi: 10.1023/A:1010933404324.
[36] Y. Freund and R. E. Schapire, “Experiments with a new boosting algorithm,” in Proc. 13th Int. Conf. Mach. Learn. (ICML), 1996, vol. 96, pp. 148–156.
[37] N. V. Chawla, A. Lazarevic, L. O. Hall, and K. W. Bowyer, “SMOTEBoost: Improving prediction of the minority class in boosting,” Lect. Notes Artif. Intell., vol. 2838, pp. 107–119, 2003, doi: 10.1007/978-3-540-39804-2_12.
[38] D. A. Cieslak, N. V. Chawla, and A. Striegel, “Combating imbalance in network intrusion datasets,” in GrC, 2006, pp. 732–737.
[39] R. A. Johnson, N. V. Chawla, and J. J. Hellmann, “Species distribution modeling and prediction: A class imbalance problem,” in 2012 Conference on Intelligent Data Understanding, 2012, pp. 9–16.
[40] A. Fallahi and S. Jafari, “An Expert System for Detection of Breast Cancer Using Data Preprocessing and Bayesian Network,” Int. J. Adv. Sci. Technol., vol. 34, pp. 65–70, Oct. 2011.
[41] C. F. Tsai, W. C. Lin, Y. H. Hu, and G. T. Yao, “Under-sampling class imbalanced datasets by combining clustering analysis and instance selection,” Inf. Sci., vol. 477, pp. 47–54, 2019, doi: 10.1016/j.ins.2018.10.029.
[42] M. Blachnik and M. Kordos, “Comparison of instance selection and construction methods with various classifiers,” Appl. Sci., vol. 10, no. 11, pp. 1–19, 2020, doi: 10.3390/app10113933.
[43] R. Longadge and S. Dongre, “Class imbalance problem in data mining review,” arXiv preprint arXiv:1305.1707, 2013.
[44] P. E. Hart, “The condensed nearest neighbor rule (Corresp.),” IEEE Trans. Inf. Theory, vol. 14, no. 3, pp. 515–516, 1968.
[45] D. W. Aha, D. Kibler, and M. K. Albert, “Instance-based learning algorithms,” Mach. Learn., vol. 6, no. 1, pp. 37–66, Jan. 1991, doi: 10.1007/bf00153759.
[46] N. Jankowski and M. Grochowski, “Comparison of instance selection algorithms I: Algorithms survey,” in International Conference on Artificial Intelligence and Soft Computing, 2004, pp. 598–603.
[47] I. Goodfellow et al., “Generative adversarial networks,” Commun. ACM, vol. 63, no. 11, pp. 139–144, Oct. 2020, doi: 10.1145/3422622.
[48] M. Arjovsky, S. Chintala, and L. Bottou, “Wasserstein generative adversarial networks,” in International Conference on Machine Learning, 2017, pp. 214–223.
[49] I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, and A. Courville, “Improved training of Wasserstein GANs,” arXiv preprint arXiv:1704.00028, 2017.
[50] M. Mirza and S. Osindero, “Conditional generative adversarial nets,” arXiv preprint arXiv:1411.1784, 2014.
[51] M. Y. Liu, T. Breuel, and J. Kautz, “Unsupervised image-to-image translation networks,” in Advances in Neural Information Processing Systems, 2017, pp. 701–709.
[52] M. Frid-Adar, E. Klang, M. Amitai, J. Goldberger, and H. Greenspan, “Synthetic data augmentation using GAN for improved liver lesion classification,” in Proc. IEEE Int. Symp. Biomed. Imaging (ISBI), 2018, pp. 289–293, doi: 10.1109/ISBI.2018.8363576.
[53] M. Frid-Adar, I. Diamant, E. Klang, M. Amitai, J. Goldberger, and H. Greenspan, “GAN-based synthetic medical image augmentation for increased CNN performance in liver lesion classification,” Neurocomputing, vol. 321, pp. 321–331, 2018.
[54] L. A. Gatys, A. S. Ecker, and M. Bethge, “A neural algorithm of artistic style,” arXiv preprint arXiv:1508.06576, 2015.
[55] P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros, “Image-to-image translation with conditional adversarial networks,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2017, pp. 1125–1134.
[56] A. Radford, L. Metz, and S. Chintala, “Unsupervised representation learning with deep convolutional generative adversarial networks,” arXiv preprint arXiv:1511.06434, 2015.
[57] C. Ledig et al., “Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2017, pp. 4681–4690.
[58] L. Yu, W. Zhang, J. Wang, and Y. Yu, “SeqGAN: Sequence generative adversarial nets with policy gradient,” in Proc. AAAI Conf. Artif. Intell., 2017, vol. 31, no. 1.
[59] S. Reed, Z. Akata, X. Yan, L. Logeswaran, B. Schiele, and H. Lee, “Generative adversarial text to image synthesis,” in Proc. 33rd Int. Conf. Mach. Learn. (ICML), 2016, vol. 3, pp. 1681–1690.
[60] R. Blagus and L. Lusa, “Evaluation of SMOTE for high-dimensional class-imbalanced microarray data,” in Proc. 11th Int. Conf. Mach. Learn. Appl. (ICMLA), 2012, vol. 2, pp. 89–94.
[61] R. Blagus and L. Lusa, “Class prediction for high-dimensional class-imbalanced data,” BMC Bioinformatics, vol. 11, no. 1, pp. 1–17, 2010.
[62] H. Chen, S. Jajodia, J. Liu, N. Park, V. Sokolov, and V. S. Subrahmanian, “FakeTables: Using GANs to generate functional dependency preserving tables with bounded real data,” in Proc. Int. Joint Conf. Artif. Intell. (IJCAI), 2019, pp. 2074–2080, doi: 10.24963/ijcai.2019/287.
[63] L. Xu, M. Skoularidou, A. Cuesta-Infante, and K. Veeramachaneni, “Modeling tabular data using conditional GAN,” Adv. Neural Inf. Process. Syst., vol. 32, 2019.
[64] L. Xu and K. Veeramachaneni, “Synthesizing Tabular Data using Generative Adversarial Networks,” arXiv preprint arXiv:1811.11264, 2018.
[65] N. Park, M. Mohammadi, K. Gorde, S. Jajodia, H. Park, and Y. Kim, “Data synthesis based on generative adversarial networks,” Proc. VLDB Endow., vol. 11, no. 10, pp. 1071–1083, 2018, doi: 10.14778/3231751.3231757.
[66] M. K. Baowaly, C.-C. Lin, C.-L. Liu, and K.-T. Chen, “Synthesizing electronic health records using improved generative adversarial networks,” J. Am. Med. Inform. Assoc., vol. 26, no. 3, pp. 228–241, 2019.
[67] E. Choi, S. Biswal, B. Malin, J. Duke, W. F. Stewart, and J. Sun, “Generating multi-label discrete patient records using generative adversarial networks,” in Machine Learning for Healthcare Conference, 2017, pp. 286–305.
[68] P. H. Lu, P. C. Wang, and C. M. Yu, “Empirical evaluation on synthetic data generation with generative adversarial network,” in ACM Int. Conf. Proceeding Ser., 2019, pp. 1–6, doi: 10.1145/3326467.3326474.
[69] D. R. Wilson and T. R. Martinez, “Reduction techniques for instance-based learning algorithms,” Mach. Learn., vol. 38, no. 3, pp. 257–286, 2000.
[70] J. J. Grefenstette, “Optimization of Control Parameters for Genetic Algorithms,” IEEE Trans. Syst. Man Cybern., vol. 16, no. 1, pp. 122–128, 1986, doi: 10.1109/TSMC.1986.289288.
[71] U. Tantipongpipat, C. Waites, D. Boob, A. A. Siva, and R. Cummings, “Differentially Private Synthetic Mixed-Type Data Generation For Unsupervised Learning,” arXiv preprint arXiv:1912.03250, 2019.
[72] M. Scott and J. Plested, “GAN-SMOTE: A Generative Adversarial Network approach to Synthetic Minority Oversampling,” Aust. J. Intell. Inf. Process. Syst., vol. 15, no. 2, pp. 29–35, 2019.
[73] S. Ioffe and C. Szegedy, “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift,” in International Conference on Machine Learning, Jun. 2015, pp. 448–456.
[74] X. Wu et al., “Top 10 algorithms in data mining,” Knowl. Inf. Syst., vol. 14, no. 1, pp. 1–37, 2008.