Thesis 107423065: Detailed Record
Author: Wen-Zhen Huang (黄玟榛)    Department: Information Management
Title: Deep Learning for the Class Imbalance Problem (深度學習技術於類別不平衡問題之應用)
Related Theses:
★ Building a Sales Forecasting Model for Commercial Multifunction Printers Using Data Mining Techniques
★ Applying Data Mining Techniques to Resource Allocation Prediction: A Case Study of a Computer OEM Support Unit
★ Applying Data Mining Techniques to Flight Delay Analysis in the Airline Industry: A Case Study of Company C
★ Security Control of New Products in the Global Supply Chain: A Case Study of Company C
★ Data Mining Applications in the Semiconductor Laser Industry: A Case Study of Company A
★ Applying Data Mining Techniques to Predicting Warehouse Dwell Time of Air Export Cargo: A Case Study of Company A
★ Optimizing YouBike Rebalancing Operations Using Data Mining Classification Techniques
★ The Impact of Feature Selection on Different Data Types
★ Data Mining Applied to B2B Corporate Websites: A Case Study of Company T
★ Customer Investment Analysis and Recommendations for Financial Derivatives: Integrating Clustering and Association Rule Techniques
★ Building a Computer-Aided Diagnosis Model for Liver Ultrasound Images Using Convolutional Neural Networks
★ An Identity Recognition System Based on Convolutional Neural Networks
★ Comparative Error Analysis of Energy Imputation Methods in Energy Management Systems
★ Development of an Employee Sentiment Analysis and Management System
★ Data Cleaning for the Class Imbalance Problem: A Machine Learning Perspective
★ Applying Data Mining Techniques to Passenger Self-Service Check-in Analysis: A Case Study of Airline C
Full text: viewable in the system after 2026-01-01.
Abstract (Chinese)
In the field of data mining, how to effectively classify datasets affected by the class imbalance problem has long been an important issue. The class imbalance problem arises when the number of samples in one class of a dataset far exceeds that of another class; the resulting skewed distribution biases the trained model toward misclassifying minority-class samples as the majority class, so the minority class is often ignored. Because class imbalance occurs in many real-world applications, such as fault diagnosis, medical diagnosis, and fraud detection, many researchers over the past decade have devoted themselves to methods for handling it. In the literature, these methods fall roughly into three categories: algorithm-level methods, data-level methods, and cost-sensitive methods. Most prior data-level studies combine data pre-processing with classifiers built using machine learning techniques. The recent rise of deep learning has opened new possibilities for data mining research, yet few studies have applied classifiers built with deep learning techniques to class-imbalanced datasets. This thesis therefore combines deep learning classifiers with SMOTE (synthetic minority over-sampling technique), a data pre-processing method, to handle the class imbalance problem and to examine whether deep learning classifiers can outperform those built with traditional machine learning techniques.
This study uses 44 binary class-imbalanced datasets from the KEEL repository and 8 NASA datasets. The data are first pre-processed and then used to train and test two deep learning models (D-MLP and DBN); the resulting AUC values are compared with those of methods reported in the prior literature.
Overall, the experimental results show that data-level pre-processing combined with the deep learning classifiers D-MLP and DBN outperforms classifiers built with machine learning techniques. When the datasets are divided into high- and low-imbalance groups, DBN performs better on datasets with high class imbalance ratios, whereas D-MLP has the best overall performance when the imbalance ratio is not taken into account.
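To make the data-level step concrete, the sketch below rebalances a binary imbalanced training set with SMOTE before any model is trained. It is a minimal illustration only: the imbalanced-learn and scikit-learn libraries, the synthetic stand-in dataset, and every parameter value are assumptions for exposition, not the configuration actually used in this thesis.

from collections import Counter

from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in for one of the KEEL binary imbalanced datasets
# (assumed here; roughly a 9:1 imbalance ratio between the classes).
X, y = make_classification(n_samples=1000, n_features=10,
                           weights=[0.9, 0.1], random_state=42)

# Hold out the test set BEFORE resampling, so it keeps the original
# imbalanced class distribution for a fair evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.2, random_state=42)
print("Before SMOTE:", Counter(y_train))

# SMOTE synthesizes new minority-class samples by interpolating between
# each minority sample and its k nearest minority-class neighbours.
smote = SMOTE(k_neighbors=5, random_state=42)
X_res, y_res = smote.fit_resample(X_train, y_train)
print("After SMOTE:", Counter(y_res))  # both classes now have equal counts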
Abstract (English)
Effective classification of class-imbalanced datasets has always been an important issue in data mining. The class imbalance problem arises when the number of samples in one class of a dataset greatly outnumbers that of the other classes; because of the skewed class distribution, a learning model tends to misclassify the minority class as the majority class. Since the class imbalance problem occurs in many real-world applications, for example, fault diagnosis, medical diagnosis, and fraud detection, many researchers have devoted themselves to methods for handling class-imbalanced datasets over the past decades. In the literature, the class imbalance problem can be addressed in three different ways, including algorithm-level methods, data-level methods, and cost-sensitive methods. In particular, data-level methods, such as under- and over-sampling techniques, are widely used. In recent years, deep learning techniques have demonstrated superior performance over many machine learning techniques. However, very few studies have examined their applicability to class-imbalanced datasets. Therefore, the objective of this research is to apply SMOTE as the over-sampling method to re-balance class-imbalanced datasets and then to construct deep learning models for performance comparison. In this research, 44 class-imbalanced datasets collected from the KEEL dataset repository and 8 datasets from NASA are used in the experiments. In addition, deep neural networks, including the deep multilayer perceptron (D-MLP) and the deep belief network (DBN), are compared with several representative baseline learning models. The experimental results show that SMOTE combined with the deep learning classifiers performs better than the traditional machine learning classifiers. In particular, the DBN classifier performs best on datasets with high imbalance ratios, whereas the D-MLP classifier has the best overall performance.
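Continuing the sketch above, the following example trains a stand-in "deep" MLP on the SMOTE-rebalanced training set and evaluates it with AUC on the untouched, still-imbalanced test set. The three-hidden-layer architecture and all hyperparameters are illustrative assumptions standing in for the thesis's D-MLP; the actual D-MLP and DBN settings are specified in Section 3-3 of the thesis.

from sklearn.metrics import roc_auc_score
from sklearn.neural_network import MLPClassifier

# X_res, y_res, X_test, y_test come from the SMOTE sketch above.
# Hypothetical stand-in for the D-MLP: an MLP with three hidden layers.
dmlp = MLPClassifier(hidden_layer_sizes=(64, 64, 64), activation="relu",
                     max_iter=500, random_state=42)
dmlp.fit(X_res, y_res)

# AUC is computed from the predicted probability of the minority class,
# so unlike plain accuracy it is not inflated by always predicting the
# majority class on imbalanced test data.
minority_scores = dmlp.predict_proba(X_test)[:, 1]
print("Test AUC:", round(roc_auc_score(y_test, minority_scores), 4))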
Keywords
★ Class Imbalance
★ Data Mining
★ Machine Learning
★ Deep Learning
★ Over-sampling
Table of Contents
Abstract (Chinese)
Abstract (English)
List of Figures
List of Tables
1. Introduction
    1-1 Research Background
    1-2 Research Motivation
    1-3 Research Objectives
    1-4 Thesis Organization
2. Literature Review
    2-1 The Class Imbalance Problem
    2-2 Approaches to Solving the Class Imbalance Problem
        2-2-1 Data-Level Methods
    2-3 Machine Learning Algorithms
        2-3-1 Supervised Learning Algorithms
        2-3-2 Ensemble Learning Methods
    2-4 Deep Learning
        2-4-1 From Machine Learning to Deep Learning
        2-4-2 Deep Multilayer Perceptron (D-MLP)
        2-4-3 Deep Belief Networks (DBN)
    2-5 Comparison of Related Studies
3. Research Methodology
    3-1 Experimental Procedure
    3-2 Experimental Datasets
    3-3 Experimental Environment and Parameter Settings
    3-4 Evaluation Methods
4. Experimental Results
    4-1 Experimental Setup
        4-1-1 Hardware Environment
        4-1-2 Software
    4-2 Results of Experiment 1
    4-3 Results of Experiment 2
    4-4 Summary of Experiments
5. Conclusions
    5-1 Conclusions and Contributions
    5-2 Future Research Directions and Suggestions
References
References
[1] Wu, X., Zhu, X., Wu, G.-Q., & Ding, W. (2014). Data mining with big data. IEEE
Transactions on Knowledge and Data Engineering, 26(1), 97–107.
https://doi.org/10.1109/TKDE.2013.109
[2] Zhang, C., Sun, J. H., & Tan, K. C. (2015). Deep Belief Networks Ensemble with
Multi-objective Optimization for Failure Diagnosis. 2015 IEEE International Conference on
Systems, Man, and Cybernetics, 32–37. https://doi.org/10.1109/SMC.2015.19
[3] Fawcett, T., & Provost, F. (1997). Adaptive Fraud Detection. Data Mining and
Knowledge Discovery, 1(3), 291–316. https://doi.org/10.1023/A:1009700419189
[4] Valdovinos, R. M., & Sanchez, J. S. (2005). Class-dependant resampling for medical
applications. Fourth International Conference on Machine Learning and Applications
(ICMLA'05). https://doi.org/10.1109/ICMLA.2005.15
[5] Ezawa, K. J., Singh, M., & Norton, S. W. (1996). Learning Goal Oriented Bayesian
Networks for Telecommunications Risk Management. In Proceedings of the Thirteenth
International Conference on Machine Learning.
[6] Sun, J., Rahman, M., Wong, Y. S., & Hong, G. S. (2004). Multiclassification of tool
wear with support vector machine by manufacturing loss consideration. International Journal
of Machine Tools and Manufacture, 44(11), 1179–1187.
https://doi.org/10.1016/j.ijmachtools.2004.04.003
[7] Galar, M., Fernández, A., Barrenechea, E., Bustince, H., & Herrera, F. (2012). A
Review on Ensembles for the Class Imbalance Problem: Bagging-, Boosting-, and Hybrid-Based
Approaches. IEEE Transactions on Systems, Man, and Cybernetics, Part C
(Applications and Reviews), 42(4), 463–484. https://doi.org/10.1109/TSMCC.2011.2161285
[8] Lin, Y., Lee, Y., & Wahba, G. (2002). Support Vector Machines for Classification in
Nonstandard Situations. Machine Learning, 46(1), 191–202.
https://doi.org/10.1023/A:1012406528296
[9] Liu, B., Ma, Y., & Wong, C. K. (2000). Improving an Association Rule Based
Classifier. In D. A. Zighed, J. Komorowski, & J. Żytkow (Eds.), Principles of Data Mining
and Knowledge Discovery (pp. 504–509). Springer. https://doi.org/10.1007/3-540-45372-5_58
[10] Barandela, R., Sánchez, J. S., García, V., & Rangel, E. (2003). Strategies for learning
in class imbalance problems. Pattern Recognition, 36, 849–851.
https://doi.org/10.1016/S0031-3203(02)00257-1
[11] Stefanowski, J., & Wilk, S. (2008). Selective Pre-processing of Imbalanced Data for
Improving Classification Performance. In I.-Y. Song, J. Eder, & T. M. Nguyen (Eds.), Data
Warehousing and Knowledge Discovery (pp. 283–292). Springer. https://doi.org/10.1007/978-3-540-85836-2_27
[12] Fernández, A., García, S., del Jesus, M. J., & Herrera, F. (2008). A study of the
behaviour of linguistic fuzzy rule based classification systems in the framework of
imbalanced data-sets. Fuzzy Sets and Systems, 159(18), 2378–2398.
https://doi.org/10.1016/j.fss.2007.12.023
[13] Napierała, K., Stefanowski, J., & Wilk, S. (2010). Learning from Imbalanced Data in
Presence of Noisy and Borderline Examples. In M. Szczuka, M. Kryszkiewicz, S. Ramanna,
R. Jensen, & Q. Hu (Eds.), Rough Sets and Current Trends in Computing (pp. 158–167).
Springer. https://doi.org/10.1007/978-3-642-13529-3_18
[14] Chawla, N. V., Cieslak, D. A., Hall, L. O., & Joshi, A. (2008). Automatically
countering imbalance and its empirical relationship to cost. Data Mining and Knowledge
Discovery, 17(2), 225–252. https://doi.org/10.1007/s10618-008-0087-0
[15] Ling, C. X., Sheng, V. S., & Yang, Q. (2006). Test strategies for cost-sensitive
decision trees. IEEE Transactions on Knowledge and Data Engineering, 18(8), 1055–1067.
https://doi.org/10.1109/TKDE.2006.131
[16] Zhang, S., Liu, L., Zhu, X., & Zhang, C. (2008). A Strategy for Attributes Selection in
Cost-Sensitive Decision Trees Induction. 2008 IEEE 8th International Conference on
Computer and Information Technology Workshops, 8–13.
https://doi.org/10.1109/CIT.2008.Workshops.51
[17] Liu, Y.-Q., Wang, C., & Zhang, L. (2009). Decision Tree Based Predictive Models for
Breast Cancer Survivability on Imbalanced Data. 2009 3rd International Conference on
Bioinformatics and Biomedical Engineering, 1–4.
https://doi.org/10.1109/ICBBE.2009.5162571
[18] Tang, Y., Zhang, Y.-Q., Chawla, N. V., & Krasser, S. (2009). SVMs Modeling for
Highly Imbalanced Classification. IEEE Transactions on Systems, Man, and Cybernetics,
Part B (Cybernetics), 39(1), 281–288. https://doi.org/10.1109/TSMCB.2008.2002909
[19] Najafabadi, M. M., Villanustre, F., Khoshgoftaar, T. M., Seliya, N., Wald, R., &
Muharemagic, E. (2015). Deep learning applications and challenges in big data analytics.
Journal of Big Data, 2(1), 1. https://doi.org/10.1186/s40537-014-0007-7
[20] Wang, S., & Yao, X. (2012). Multiclass Imbalance Problems: Analysis and Potential
Solutions. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 42(4),
1119–1130. https://doi.org/10.1109/TSMCB.2012.2187280
[21] Prati, R. C., Batista, G. E. A. P. A., & Monard, M. C. (2004). Class Imbalances versus
Class Overlapping: An Analysis of a Learning System Behavior. In R. Monroy, G. Arroyo-Figueroa, L. E. Sucar, & H. Sossa (Eds.), MICAI 2004: Advances in Artificial Intelligence
(pp. 312–321). Springer. https://doi.org/10.1007/978-3-540-24694-7_32
[22] Japkowicz, N., & Stephen, S. (2002). The class imbalance problem: A systematic
study. Intelligent Data Analysis, 6(5), 429–449.
https://content.iospress.com/articles/intelligent-data-analysis/ida00103
[23] Jo, T., & Japkowicz, N. (2004). Class imbalances versus small disjuncts. ACM
SIGKDD Explorations Newsletter, 6(1), 40–49. https://doi.org/10.1145/1007730.1007737
[24] Kubat, M., & Matwin, S. (1997). Addressing the Curse of Imbalanced Training Sets:
One-Sided Selection. In Proceedings of the Fourteenth International Conference on Machine
Learning, 179–186.
[25] Mani, I., & Zhang, I. (2003, August). kNN approach to unbalanced data distributions:
a case study involving information extraction. In Proceedings of workshop on learning from
imbalanced datasets (Vol. 126). United States: ICML.
[26] Kotsiantis, S., & Pintelas, P. (2004). Mixture of expert agents for handling imbalanced
data sets. Annals of Mathematics, Computing & Teleinformatics, 1, 46–55.
[27] Chawla, N. V., Bowyer, K. W., Hall, L. O., & Kegelmeyer, W. P. (2002). SMOTE:
Synthetic Minority Over-sampling Technique. Journal of Artificial Intelligence Research, 16,
321–357. https://doi.org/10.1613/jair.953
[28] Safavian, S. R., & Landgrebe, D. (1991). A survey of decision tree classifier
methodology. IEEE Transactions on Systems, Man, and Cybernetics, 21(3), 660–674.
https://doi.org/10.1109/21.97458
[29] Quinlan, J. R. (1993). C4.5: Programs for Machine Learning. Morgan Kaufmann.
[30] Hssina, B., Merbouha, A., Ezzikouri, H., & Erritali, M. (2014). A comparative study
of decision tree ID3 and C4.5. International Journal of Advanced Computer Science and
Applications, 4(2). https://doi.org/10.14569/SpecialIssue.2014.040203
[31] Mingers, J. (1989). An Empirical Comparison of Pruning Methods for Decision Tree
Induction. Machine Learning, 4(2), 227–243. https://doi.org/10.1023/A:1022604100933
[32] Rokach, L. (2010). Ensemble-based classifiers. Artificial Intelligence Review, 33(1),
1–39. https://doi.org/10.1007/s10462-009-9124-7
[33] Dietterich, T. G. (2000). Ensemble Methods in Machine Learning. Multiple Classifier
Systems, 1–15. https://doi.org/10.1007/3-540-45014-9_1
[34] Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2), 123–140.
https://doi.org/10.1007/BF00058655
[35] Schapire, R. E. (1990). The strength of weak learnability. Machine Learning, 5(2),
197–227.
[36] Freund, Y., & Schapire, R. E. (1996). Experiments with a New Boosting Algorithm.
In Proceedings of the Thirteenth International Conference on Machine Learning, 148–156.
[37] LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–
444. https://doi.org/10.1038/nature14539
[38] Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2017). ImageNet classification with
deep convolutional neural networks. Communications of the ACM, 60(6), 84–90.
https://doi.org/10.1145/3065386
[39] Farabet, C., Couprie, C., Najman, L., & LeCun, Y. (2013). Learning Hierarchical
Features for Scene Labeling. IEEE Transactions on Pattern Analysis and Machine
Intelligence, 35(8), 1915–1929. https://doi.org/10.1109/TPAMI.2012.231
[40] Hinton, G., Deng, L., Yu, D., Dahl, G. E., Mohamed, A., Jaitly, N., Senior, A.,
Vanhoucke, V., Nguyen, P., Sainath, T. N., & Kingsbury, B. (2012). Deep Neural Networks
for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups.
IEEE Signal Processing Magazine, 29(6), 82–97. https://doi.org/10.1109/MSP.2012.2205597
[41] Mikolov, T., Deoras, A., Povey, D., Burget, L., & Černocký, J. (2011). Strategies for
training large scale neural network language models. 2011 IEEE Workshop on Automatic
Speech Recognition Understanding, 196–201. https://doi.org/10.1109/ASRU.2011.6163930
[42] Young, T., Hazarika, D., Poria, S., & Cambria, E. (2018). Recent Trends in Deep
Learning Based Natural Language Processing [Review Article]. IEEE Computational
Intelligence Magazine, 13(3), 55–75. https://doi.org/10.1109/MCI.2018.2840738
[43] Leung, M. K. K., Xiong, H. Y., Lee, L. J., & Frey, B. J. (2014). Deep learning of the
tissue-regulated splicing code. Bioinformatics, 30(12), i121–i129.
https://doi.org/10.1093/bioinformatics/btu277
[44] Xiong, H. Y., Alipanahi, B., Lee, L. J., et al. (2015). The human splicing code
reveals new insights into the genetic determinants of disease. Science, 347(6218),
1254806. https://doi.org/10.1126/science.1254806
[45] Gardner, M. W., & Dorling, S. R. (1998). Artificial neural networks (the multilayer
perceptron)—A review of applications in the atmospheric sciences. Atmospheric
Environment, 32(14), 2627–2636. https://doi.org/10.1016/S1352-2310(97)00447-0
[46] Hinton, G. E., Osindero, S., & Teh, Y.-W. (2006). A Fast Learning Algorithm for
Deep Belief Nets. Neural Computation, 18(7), 1527–1554.
https://doi.org/10.1162/neco.2006.18.7.1527
[47] Douzas, G., & Bacao, F. (2017). Self-Organizing Map Oversampling (SOMO) for
imbalanced data set learning. Expert Systems with Applications, 82, 40–52.
https://doi.org/10.1016/j.eswa.2017.03.073
[48] Rivera, W. A. (2017). Noise Reduction A Priori Synthetic Over-Sampling for class
imbalanced data sets. Information Sciences, 408, 146–161.
https://doi.org/10.1016/j.ins.2017.04.046
[49] Nanni, L., Fantozzi, C., & Lazzarini, N. (2015). Coupling different methods for
overcoming the class imbalance problem. Neurocomputing, 158, 48–61.
https://doi.org/10.1016/j.neucom.2015.01.068
[50] Lin, W.-C., Tsai, C.-F., Hu, Y.-H., & Jhang, J.-S. (2017). Clustering-based
undersampling in class-imbalanced data. Information Sciences, 409–410, 17–26.
https://doi.org/10.1016/j.ins.2017.05.00857
[51] Buda, M., Maki, A., & Mazurowski, M. A. (2018). A systematic study of the class
imbalance problem in convolutional neural networks. Neural Networks, 106, 249–259.
https://doi.org/10.1016/j.neunet.2018.07.011
[52] Hensman, P., & Masko, D. (2015). The Impact of Imbalanced Training Data for
Convolutional Neural Networks. Degree project in computer science, KTH Royal Institute
of Technology, Stockholm.
[53] Zhang, C., Tan, K. C., & Ren, R. (2016). Training cost-sensitive Deep Belief
Networks on imbalance data problems. 2016 International Joint Conference on Neural
Networks (IJCNN), 4362–4367. https://doi.org/10.1109/IJCNN.2016.7727769
[54] Wang, S., Liu, W., Wu, J., Cao, L., Meng, Q., & Kennedy, P. J. (2016). Training deep
neural networks on imbalanced data sets. 2016 International Joint Conference on Neural
Networks (IJCNN), 4368–4374. https://doi.org/10.1109/IJCNN.2016.7727770
[55] Maldonado, S., López, J., & Vairetti, C. (2019). An alternative SMOTE oversampling
strategy for high-dimensional datasets. Applied Soft Computing, 76, 380–389.
https://www.sciencedirect.com/science/article/abs/pii/S1568494618307130
[56] Ren, R., Yang, Y., & Sun, L. (2020). Oversampling technique based on fuzzy
representativeness difference for classifying imbalanced data. Applied Intelligence, 50(8),
2465–2487. https://doi.org/10.1007/s10489-020-01644-0
[57] Guan, H., Zhang, Y., Xian, M., Cheng, H. D., & Tang, X. (2021). SMOTE-WENN:
Solving class imbalance and small sample problems by oversampling and distance scaling.
Applied Intelligence, 51(3), 1394–1409. https://doi.org/10.1007/s10489-020-01852-8
[58] Zhang, R., Zhang, Z., & Wang, D. (2021). RFCL: A new under-sampling method of
reducing the degree of imbalance and overlap. Pattern Analysis and Applications, 24(2), 641–
654. https://doi.org/10.1007/s10044-020-00929-x
[59] Dara, S., & Tumma, P. (2018). Feature Extraction By Using Deep Learning: A
Survey. 2018 Second International Conference on Electronics, Communication and
Aerospace Technology (ICECA), 1795–1801. https://doi.org/10.1109/ICECA.2018.8474912
[60] Xiao, Y., Xing, C., Zhang, T., & Zhao, Z. (2019). An Intrusion Detection Model
Based on Feature Reduction and Convolutional Neural Networks. IEEE Access, 7, 42210–
42219. https://doi.org/10.1109/ACCESS.2019.2904620
Advisor: Chih-Fong Tsai (蔡志豐)    Date of Approval: 2021-07-15