Thesis 104423035: full metadata record

DC field  Value  Language
DC.contributor  資訊管理學系  zh_TW
DC.creator  邱子安  zh_TW
DC.creator  Tzu-An Chiu  en_US
dc.date.accessioned  2018-1-18T07:39:07Z
dc.date.available  2018-1-18T07:39:07Z
dc.date.issued  2018
dc.identifier.uri  http://ir.lib.ncu.edu.tw:88/thesis/view_etd.asp?URN=104423035
dc.contributor.department  資訊管理學系  zh_TW
DC.description  國立中央大學  zh_TW
DC.description  National Central University  en_US
dc.description.abstract  隨著儲存媒體的技術進步,企業在儲存資料時不再像過去需要考慮容量問題,會將所有資料儲存下來以待後續分析,但是這使得資料過於繁雜,因此,在進行資料探勘時,資料前處理就變成一個重要的課題。特徵選取(feature selection)與樣本選取(instance selection)是前處理的兩大重要技術,過去的研究中往往專注討論一項,同時討論二者的研究並不常見,過去同時討論兩者的研究也只有使用基因演算法(genetic algorithm)作為特徵與樣本選取的方式,沒有其他方式的組合與比較,所以我們並不清楚用其他的特徵或樣本選取方式的組合是否會比基因演算法的組合更佳,同時,也不清楚其他方法在同時使用特徵與樣本選取時,先後順序是否會對效能有所影響。因此,本研究的目的是透過組合數種較具代表性的特徵與樣本選取方式,來探討選取方式之間的優劣以及先後順序的影響,以及在信用評估與破產預測兩個領域的資料集是否有差異。兩個領域中各使用了變數數量與類別的比例都不相同的資料集,目的在找出資料集的特性不同時,對於選取方式的選擇是否也會造成影響。實驗中使用了多個具代表性的分類器進行比較,目的是在找出選取方式的先後順序與最佳組合之外,找到最佳的分類器或分類器組合(classifier ensembles),作為後續實驗的參考依據。  zh_TW
dc.description.abstract  With advances in storage media technology, companies no longer need to consider capacity when storing data as they did in the past. They now keep all of their data for later analysis, but this makes the data too complex for practical use. Data pre-processing has therefore become an important issue in data mining. Feature selection and instance selection are two important pre-processing techniques, but past studies have usually focused on only one of them. The few studies that address both use only the genetic algorithm as the feature and instance selection method, without combining or comparing other methods. It is therefore unclear whether combinations of other feature or instance selection methods outperform the genetic algorithm combination, or whether the order in which feature selection and instance selection are applied affects performance. The aim of this research is to combine several representative feature and instance selection methods, applied in different orders, and to examine their classification performance in two different domains, namely bankruptcy prediction and credit scoring. We use datasets with different numbers of features and different class ratios to find out whether the characteristics of a dataset affect the choice of selection method. We also compare several representative classifiers to find out which classifier or classifier ensemble performs best, as a reference for subsequent experiments.  en_US
DC.subject  資料探勘  zh_TW
DC.subject  特徵選取  zh_TW
DC.subject  樣本選取  zh_TW
DC.subject  分類器組合  zh_TW
DC.subject  基因演算法  zh_TW
DC.title  在破產預測與信用評估領域對前處理方式與分類器組合的比較分析  zh_TW
dc.language.iso  zh-TW  zh-TW
DC.title  Comparative analysis of pre-processing methods and classifier ensembles for bankruptcy prediction and credit scoring  en_US
DC.type  博碩士論文  zh_TW
DC.type  thesis  en_US
DC.publisher  National Central University  en_US
