Thesis Title Using Machine Learning Techniques to Build an Underwriting Risk Prediction Model: The Case of Company A
Abstract In Taiwan, the ratio of insurance industry assets to the total assets of financial institutions rose steadily from 17.36% in 2004 to 36.32% in 2020, the highest level in recent years, which shows the importance of Taiwan's insurance industry. Insurance is not only a financial instrument that promotes socio-economic activity and development, but also plays an important role in stabilizing the social system. However, a seminar on preventing insurance crime held in 2020 by the "Institute of Financial Law and Crime Prevention" (財團法人金融法制暨犯罪防制中心) reported that unreported insurance crime (the "dark figure") in 2019 amounted to an estimated loss of NT$10.5 billion, and that, based on overseas experience, such losses account for about 10% of claim payments. This potential risk of insurance fraud and adverse selection is therefore an important risk-control issue for every insurance company.
This study uses Company A's 2018 underwriting cases as its data source and takes non-acute medical claims filed within two or three years after the policy takes effect as the target variable. It analyzes six aspects of the data: basic customer information, insurance information, financial information, physical condition information, agent information, and agent claim settlement rate. Using machine learning techniques, it experiments with eight different classifiers to build underwriting risk prediction models. Across the classifiers' experimental results, the best predictors among the independent variables are the number of medical insurance policies, the type of primary insurance, the primary insurance code, and the number of riders under the policy. In terms of model performance, the gradient boosting machine shows relatively stable and high predictive ability, with an AUC above 0.71. In terms of classification accuracy (CA), logistic regression performs best, with a score above 0.612.
It is hoped that the results of this study can serve as a reference for the case company when it builds a data-driven underwriting risk assessment mechanism for risk classification, and that the underwriting risk prediction model can support more efficient underwriting assessment and risk classification. This in turn should speed up automated operations and improve customer experience and satisfaction, thereby strengthening customers' stickiness and loyalty to the company.
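The abstract reports model performance as AUC and classification accuracy (CA). The following is a minimal sketch of how these two metrics can be computed from predicted risk scores, using only the Python standard library. The function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration; they are not the study's data, tooling, or results.

```python
from typing import Sequence


def classification_accuracy(y_true: Sequence[int],
                            scores: Sequence[float],
                            threshold: float = 0.5) -> float:
    """Share of cases whose thresholded score matches the true label.

    The 0.5 threshold is an assumed default, not a value from the thesis.
    """
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == t for p, t in zip(predictions, y_true))
    return correct / len(y_true)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case.
    Tied scores count as half a win."""
    positives = [s for s, t in zip(scores, y_true) if t == 1]
    negatives = [s for s, t in zip(scores, y_true) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = a claim occurred, 0 = no claim.
y_true = [1, 0, 1, 0, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.6, 0.7, 0.3, 0.2, 0.5, 0.8]

print("AUC:", auc(y_true, scores))                      # 0.75
print("CA:", classification_accuracy(y_true, scores))   # 0.625
```

AUC depends only on how the scores rank the cases, while CA depends on the chosen threshold, which is one reason the two metrics can favor different classifiers, as they do in the abstract.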
Keywords ★ Sustainability
★ Underwriting risk prediction model
★ Predictive underwriting
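Table of Contents Abstract
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation
1.3 Research Objectives
Chapter 2 Literature Review
2.1 Overview of Taiwan's Insurance Market
2.1.1 Insurance Penetration and Coverage Rate in Taiwan's Insurance Market
2.1.2 Key Indicators Affecting Insurance Company Operations
2.1.3 Factors Affecting the Loss Ratio
2.2 Duty of Disclosure in Insurance Contracts and Fraud
2.3 Related Research on Adverse Selection and Insurance Claim Fraud Prediction Models
2.4 Related Research on Underwriting Risk Prediction Models
2.5 Summary
Chapter 3 Research Methods
3.1 Data Sources and Collection
3.2 Data Preprocessing
3.3 Description of Research Variables
3.4 Machine Learning Techniques
3.4.1 Random Forest
3.4.2 Gradient Boosting Machine and Extreme Gradient Boosting
3.4.3 Support Vector Machine
3.4.4 Logistic Regression
3.4.5 Naive Bayes
3.4.6 Adaptive Boosting
3.4.7 Artificial Neural Network
3.5 Analysis Tools
3.6 Experimental Design and Evaluation
3.6.1 Experimental Design
3.6.2 Evaluation Metrics
Chapter 4 Experimental Results and Analysis
4.1 Descriptive Statistics
4.2 Experimental Results
4.3 Variable Importance Ranking
4.4 General Discussion
Chapter 5 Conclusions and Recommendations
5.1 Conclusions
5.2 Research Limitations
5.3 Future Research Directions and Recommendations
References
English-Language References
Chinese-Language References
Appendices
Appendix 1: Statistical Analysis of Primary Policy Product Codes, Experimental Group 1
Appendix 2: Statistical Analysis of Primary Policy Product Codes, Experimental Group 2
Appendix 3: Confusion Matrix Prediction Results for Subgroups 1, 2, and 3 of Experimental Group 1
Appendix 4: Confusion Matrix Prediction Results for Subgroups 1, 2, and 3 of Experimental Group 2
Appendix 5: Underwriting Risk Prediction Model Results for Subgroups 1, 2, and 3 of Experimental Group 1
Appendix 6: Underwriting Risk Prediction Model Results for Subgroups 1, 2, and 3 of Experimental Group 2
Appendix 7: Statistical Analysis of Occupation Codes, Experimental Group 1
Appendix 8: Statistical Analysis of Occupation Codes, Experimental Group 2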
Advisor Ya-Han Hu (胡雅涵) Review Date 2022-9-26
