Application Study of Self-built Risk Control Models in Cost Reduction and Revenue Enhancement

Name

蕭琮寶 (Chung-Pao Hsiao)

Department

In-service Master Program, Department of Computer Science and Information Engineering

Thesis Title

Application Study of Self-built Risk Control Models in Cost Reduction and Revenue Enhancement
(自建風控模型在降低成本和提高收益方面的應用研究)

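This electronic thesis is authorized for immediate open access.
Once open, the electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China: do not reproduce, distribute, adapt, repost, or broadcast the work without authorization.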

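Abstract (Chinese)

This study examines how a self-built risk control model can reduce costs and increase
revenue. Many companies currently rely on external risk control vendors for risk assessment,
which leads to high costs and opaque models. This study proposes a self-built risk control
model based on stacking that uses internal data to build an accurate and efficient risk
scorecard model, replacing external vendors and improving overall revenue.
The goal of this thesis is to propose a risk control model that uses stacking to combine
several base models (such as logistic regression, decision trees, XGBoost, and LightGBM)
and introduces LIME (Local Interpretable Model-agnostic Explanations) to improve model
interpretability. First, the company's internal loan data is collected and the relevant
information submitted by users is extracted from it. The default probability output by the
model is then mapped to a scorecard score, which is used to adjust the loan amount.
Experimental results show that the self-built risk control model performs well in reducing
the default rate and raising the rate of return. Compared with external risk control models,
it effectively lowers risk control costs and improves both model transparency and the
accuracy of the assessment results. A risk control model built on internal data has
significant advantages in responding to changing market demands and safeguarding data
security.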

Abstract (English)

This study aims to explore the application of self-built risk control models to reduce costs
and increase revenue. Currently, many companies rely on external providers for risk
assessment, leading to high costs and opaque models. This study proposes a self-built risk
control model based on stacking technology, aiming to use internal data to establish an
accurate and efficient risk scoring model to replace external providers and improve
overall revenue.
The goal of this thesis is to propose a risk control model that uses stacking to combine
multiple base models (such as logistic regression, decision trees, XGBoost, and LightGBM).
First, the company's internal loan data is collected,
and user-submitted loan information is extracted. Then, the model's output probability is
mapped to a scorecard, and the method is gradually adjusted and optimized.
Experimental results show that the self-built risk control model performs excellently in
reducing default rates and improving return rates. Compared to external risk control
models, it effectively reduces risk control costs, improves model transparency, and
enhances the accuracy of evaluation results. Risk control models based on internal data
have significant advantages in responding to changing market demands and ensuring data
security.
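
The mapping from model output probability to a scorecard can be illustrated with the log-odds scaling commonly used for credit scorecards. The sketch below is a minimal illustration, not the thesis's implementation: the function name `probability_to_score` and the parameters `base_score`, `base_odds`, and `pdo` (points to double the odds), along with their default values, are assumptions chosen for the example.

```python
import math


def probability_to_score(p_default, base_score=600.0, base_odds=50.0, pdo=20.0):
    """Map a predicted default probability to a scorecard score.

    Assumed convention (not taken from the thesis): a borrower whose
    good-to-bad odds equal `base_odds` receives `base_score`, and every
    doubling of those odds adds `pdo` points.
    """
    if not 0.0 < p_default < 1.0:
        raise ValueError("p_default must be strictly between 0 and 1")

    # Points gained per unit increase in the natural log of the odds.
    factor = pdo / math.log(2)
    # Shift so that odds == base_odds lands exactly on base_score.
    offset = base_score - factor * math.log(base_odds)

    # Good-to-bad odds: lower default probability gives higher odds.
    odds_good = (1.0 - p_default) / p_default
    return offset + factor * math.log(odds_good)


if __name__ == "__main__":
    for p in (0.5, 0.1, 1 / 51, 1 / 101):
        print(f"p_default={p:.4f} -> score={probability_to_score(p):.1f}")
```

Under these assumed defaults, a default probability of 1/51 (good-to-bad odds of 50:1) maps to a score of 600, and each doubling of the odds adds 20 points, so lower predicted default risk always yields a higher score.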

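Keywords (Chinese)

★ 風控評分卡 (risk scorecard)
★ 機器學習 (machine learning)
★ 模型解釋性 (model interpretability)
★ 成本控制 (cost control)
★ 收益率 (rate of return)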

Keywords (English)

★ Risk Scoring System
★ Machine Learning
★ Model Interpretability
★ Cost Control
★ Profitability

Table of Contents

Chinese Abstract
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Motivation and Purpose
1.2 Research Objectives
1.3 Thesis Organization
Chapter 2 Literature Review
2.1 Risk Control Models
2.2 Current State of Risk Assessment Techniques
2.3 Machine Learning Models
2.3.1 Logistic Regression
2.3.2 Random Forest
2.3.3 XGBoost
2.3.4 LightGBM
2.3.5 LIME
Chapter 3 Proposed Solution
3.1 Introduction
3.2 System Architecture Design
3.3 Data Collection and Preprocessing
3.3.1 Data Sources
3.3.2 Data Preprocessing
3.4 Model Selection and Training
3.4.1 Logistic Regression Model Training
3.5 Risk Scorecard Design
3.5.1 FICO Score Conversion
3.5.2 Scorecard Generation
Chapter 4 Experimental Design and Results
4.1 Evaluation Metrics
4.2 Experiment 1: Model Performance Evaluation
4.2.1 Experimental Procedure
4.2.2 Experimental Results
4.3 Experiment 2: Model Interpretability Evaluation
4.3.1 Experimental Procedure
4.3.2 Experimental Results
4.4 Experiment 3: Comparison of Default Rates Between Internal and External Risk Control
4.4.1 Experimental Procedure
4.4.2 Experimental Results
4.5 Experiment 4: Comparison of Rates of Return Between Internal and External Risk Control
4.5.1 Experimental Procedure
4.5.2 Experimental Results
Chapter 5 Conclusion and Future Work
5.1 Conclusion
5.2 Future Work
References

Advisor

梁德容 (Deron Liang)

Approval Date

2024-07-30
