Master's/doctoral thesis 107423065: complete metadata record

DC field | Value | Language
DC.contributor | Department of Information Management | zh_TW
DC.creator | 黄玟榛 | zh_TW
DC.creator | Wen-Zhen Huang | en_US
dc.date.accessioned | 2021-7-15T07:39:07Z
dc.date.available | 2021-7-15T07:39:07Z
dc.date.issued | 2021
dc.identifier.uri | http://ir.lib.ncu.edu.tw:88/thesis/view_etd.asp?URN=107423065
dc.contributor.department | Department of Information Management | zh_TW
DC.description | National Central University | zh_TW
DC.description | National Central University | en_US
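dc.description.abstract | In data mining, effectively classifying datasets that suffer from the class imbalance problem has long been an important issue. The class imbalance problem arises when one class in a dataset has far more samples than another; the skewed distribution then biases a model built on the data toward misclassifying minority-class samples as majority-class samples, so the minority class is often ignored. Because the class imbalance problem appears in many practical applications, such as fault diagnosis, medical diagnosis, and fraud detection, many researchers over the past decade have worked on methods for handling it. In the literature, these methods fall broadly into three levels: algorithm-level methods, data-level methods, and cost-sensitive methods. Most earlier data-level studies combine data preprocessing with classifiers built using machine learning techniques. The spread of deep learning in recent years has opened new possibilities for data mining research, yet few studies have applied classifiers built with deep learning techniques to class-imbalanced datasets. This thesis therefore combines deep learning classifiers with the SMOTE (Synthetic minority over-sampling technique) preprocessing method to handle the class imbalance problem, and examines whether deep learning classifiers can outperform classifiers built with traditional machine learning techniques. The study uses 44 binary class-imbalanced datasets from the KEEL website and 8 NASA datasets. The data are first preprocessed, then two deep learning models (D-MLP and DBN) are trained and tested; after the AUC results are computed, they are compared for accuracy with the methods of the previous literature. Overall, the experimental results show that data-level methods combined with the deep learning classifiers D-MLP and DBN perform better than classifiers built with machine learning techniques. When the datasets are divided into high and low imbalance-ratio groups, DBN performs better under high imbalance ratios; when the imbalance ratio is not considered, D-MLP has the better overall performance. | zh_TW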
dc.description.abstract | Effective classification of class-imbalanced datasets has long been an important issue in data mining. The class imbalance problem arises when one class in a dataset has far more samples than the other classes. Because of the skewed class distribution, the learning model tends to misclassify minority-class samples as the majority class. Because the class imbalance problem occurs in many real-world applications, such as fault diagnosis, medical diagnosis, and fraud detection, many researchers have worked on methods for handling class-imbalanced datasets in the past decades. In the literature, the class imbalance problem is addressed in three ways: algorithm-level methods, data-level methods, and cost-sensitive methods. Data-level methods, such as under- and over-sampling techniques, are particularly widely considered. In recent years, deep learning techniques have outperformed many machine learning techniques. However, very few studies examine their applicability to class-imbalanced datasets. Therefore, the research objective is to apply SMOTE as the over-sampling method to re-balance class-imbalanced datasets and then construct deep learning models for performance comparison. In this research, 44 class-imbalanced datasets collected from the KEEL dataset repository and 8 datasets from NASA are used for the experiments. The deep neural networks, namely the deep multilayer perceptron (D-MLP) and the deep belief network (DBN), are compared with some representative baseline learning models. The experimental results show that SMOTE combined with deep learning classifiers performs better than traditional machine learning classifiers. In particular, the DBN classifier performs better than the others on datasets with high imbalance ratios, whereas the D-MLP classifier has better overall performance than the other classifiers. | en_US
DC.subject | Class imbalance | zh_TW
DC.subject | Data mining | zh_TW
DC.subject | Machine learning | zh_TW
DC.subject | Deep learning | zh_TW
DC.subject | Over-sampling | zh_TW
DC.subject | Class Imbalance | en_US
DC.subject | Data Mining | en_US
DC.subject | Machine Learning | en_US
DC.subject | Deep learning | en_US
DC.subject | Over-sampling | en_US
DC.title | Application of Deep Learning Techniques to the Class Imbalance Problem | zh_TW
dc.language.iso | zh-TW | zh-TW
DC.title | Deep Learning for the Class Imbalance Problem | en_US
DC.type | Master's/doctoral thesis | zh_TW
DC.type | thesis | en_US
DC.publisher | National Central University | en_US
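
The abstract names three ideas that a short sketch may make more concrete: the imbalance ratio, SMOTE over-sampling, and AUC. The code below is an illustration only and is not taken from the thesis. The function names (`imbalance_ratio`, `smote`, `auc`), the neighbour count `k`, and the toy data are all assumptions made for this example. SMOTE is shown in its basic form, where a synthetic sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours.

```python
import random

def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count (binary labels)."""
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return max(counts.values()) / min(counts.values())

def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours.
    Assumes at least two distinct minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Other minority samples, sorted by squared Euclidean distance to x.
        others = [p for p in minority if p is not x]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from x to neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, neighbour)))
    return synthetic

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample scores higher; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data (hypothetical): 20 majority samples, 4 minority samples.
majority = [(float(i), float(i % 3)) for i in range(20)]
minority = [(0.0, 10.0), (1.0, 11.0), (2.0, 10.5), (1.5, 12.0)]
labels = [0] * len(majority) + [1] * len(minority)

# Over-sample the minority class until both classes have the same size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=3)
balanced_labels = labels + [1] * len(new_points)

print(imbalance_ratio(labels), imbalance_ratio(balanced_labels))
```

In a real experiment, over-sampling would normally be applied to the training split only, and a maintained implementation would be preferable to this hand-written version.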
