Combining Feature Selection and Resampling Techniques for Credit Risk Prediction

DC Field	Value	Language
DC.contributor	資訊管理學系在職專班	zh_TW
DC.creator	陳奕嫻	zh_TW
DC.creator	Yi-Hsien Chen	en_US
dc.date.accessioned	2024-06-11T07:39:07Z
dc.date.available	2024-06-11T07:39:07Z
dc.date.issued	2024
dc.identifier.uri	http://ir.lib.ncu.edu.tw:444/thesis/view_etd.asp?URN=111453008
dc.contributor.department	資訊管理學系在職專班	zh_TW
DC.description	國立中央大學	zh_TW
DC.description	National Central University	en_US
dc.description.abstract	信用風險管理是銀行的核心議題，精確評估高風險貸款並建立可靠的信用評分模型極為重要。傳統機器學習演算法在處理平衡數據時表現良好，但在面對不平衡的類別分布時，這些模型往往偏向多數類別（即良好信用），而忽略了少數重要的類別（即不良信用）。這種偏差可能導致不良信用被錯誤地分類為良好信用，當這些借款人違約時，金融機構可能面臨巨大的財務損失。為了解決不平衡問題，在本研究中結合了特徵選取和重採樣技術，從公開平台收集了五個信用風險數據集，採用了三種特徵選取與八種重採樣技術，並對六種不同的分類器模型進行了廣泛的實驗。通過系統性的比較分析，本研究評估了單獨與組合前處理技術的性能，並探討了不同前處理技術的應用順序對模型預測結果的影響。此研究為信用風險管理提供了一種有效的前處理組合策略，即先進行重採樣平衡資料集後，再進行特徵選取選出具代表性的特徵，相較於單一技術的應用，能夠有效提升模型的預測效能，特別是在小規模且高度不平衡的數據集中效果更為優秀，該策略有助於改進信用評分模型，從而更精確地識別和處理高風險貸款。	zh_TW
dc.description.abstract	Credit risk management is a core concern for banks, so accurately identifying high-risk loans and building reliable credit scoring models is essential. Traditional machine learning algorithms perform well on balanced data, but when class distributions are imbalanced they tend to favor the majority class (good credit) and neglect the important minority class (poor credit). This bias can cause poor-credit borrowers to be misclassified as good credit, exposing financial institutions to significant losses when those borrowers default. To address the imbalance problem, this study combines feature selection with resampling. Five credit risk datasets were collected from public platforms, and three feature selection methods and eight resampling techniques were evaluated in extensive experiments with six classifiers. Through systematic comparison, the study evaluates the preprocessing techniques individually and in combination, and examines how the order in which they are applied affects prediction results. The study proposes an effective combined preprocessing strategy for credit risk management: first resample to balance the dataset, then apply feature selection to pick representative features. Compared with applying either technique alone, this strategy effectively improves predictive performance, especially on small, highly imbalanced datasets. It helps improve credit scoring models and thereby supports more accurate identification and handling of high-risk loans.	en_US
DC.subject	信用風險	zh_TW
DC.subject	特徵選取	zh_TW
DC.subject	重採樣	zh_TW
DC.subject	不平衡資料	zh_TW
DC.subject	機器學習	zh_TW
DC.subject	資料探勘	zh_TW
DC.subject	Credit Risk	en_US
DC.subject	Feature Selection	en_US
DC.subject	Resampling	en_US
DC.subject	Imbalanced Data	en_US
DC.subject	Machine Learning	en_US
DC.subject	Data Mining	en_US
DC.title	結合特徵選取與重採樣技術應用於信用風險預測	zh_TW
dc.language.iso	zh-TW	zh-TW
DC.title	Combining Feature Selection and Resampling Techniques for Credit Risk Prediction	en_US
DC.type	博碩士論文	zh_TW
DC.type	thesis	en_US
DC.publisher	National Central University	en_US

Thesis 111453008: Complete Metadata Record
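
The ordering described in the abstract (resample first to balance the classes, then select features on the balanced data) can be illustrated with a minimal sketch. This is not the thesis's implementation: the record does not name the three feature selection methods or the eight resampling techniques, so random oversampling and a difference-of-class-means score are used here purely as stand-ins. All function names and the toy data are hypothetical, and the sketch assumes binary labels and numeric features.

```python
import random
from statistics import mean


def random_oversample(X, y, seed=0):
    """Stand-in resampler: duplicate randomly chosen rows of the smaller
    class until every class has as many rows as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for row in rows + extra:
            X_out.append(row)
            y_out.append(label)
    return X_out, y_out


def select_top_k(X, y, k):
    """Stand-in filter: score each feature by the absolute difference
    between its two class means and return the indices of the k best."""
    labels = sorted(set(y))
    if len(labels) != 2:
        raise ValueError("this sketch assumes exactly two classes")
    scores = []
    for j in range(len(X[0])):
        m0 = mean(row[j] for row, lab in zip(X, y) if lab == labels[0])
        m1 = mean(row[j] for row, lab in zip(X, y) if lab == labels[1])
        scores.append(abs(m0 - m1))
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def resample_then_select(X, y, k, seed=0):
    """Apply the two steps in the order the abstract recommends."""
    X_bal, y_bal = random_oversample(X, y, seed)   # step 1: balance classes
    keep = select_top_k(X_bal, y_bal, k)           # step 2: pick features
    X_red = [[row[j] for j in keep] for row in X_bal]
    return X_red, y_bal, keep


# Hypothetical toy data: label 0 = good credit (majority), 1 = poor credit.
# Feature 0 separates the classes; feature 1 does not.
X = [[0.1, 5.0], [0.2, 3.0], [0.0, 4.0], [0.3, 5.0],
     [0.1, 4.0], [0.2, 3.0], [0.9, 4.0], [1.0, 4.0]]
y = [0, 0, 0, 0, 0, 0, 1, 1]

X_red, y_bal, keep = resample_then_select(X, y, k=1)
print(len(y_bal), y_bal.count(0), y_bal.count(1), keep)
```

In a real evaluation both steps would normally be fitted on the training split only, so that duplicated minority rows do not leak into the test data. The mean-difference score is also sensitive to feature scale, so it serves here only to make the ordering concrete.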