Building a Bankruptcy Prediction Model Using Data Mining Techniques

Name

鄭茂松 (Mao-Sung Cheng)

Department

In-service Master's Program, Department of Information Management

Thesis Title

Building a Bankruptcy Prediction Model Using Data Mining Techniques
(Build machine learning module of bankrupt prediction)

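This electronic thesis is authorized for immediate open access.
The open-access electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, so as to avoid breaking the law.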

Abstract (Chinese)

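The global financial crisis of 2007-2008 originated in the subprime mortgage crisis that erupted on August 9, 2007. Investors lost confidence in the value of mortgage-backed securities, which triggered liquidity risk. The crisis spiraled out of control, causing several large financial institutions to collapse or be taken over by governments and setting off an economic recession. Because financial institutions and foundations repeatedly used leverage, assets and liabilities became hard to read from financial statements, and traditional review methods could not give early warning of bankruptcy. When a firm as large as Lehman Brothers Holdings Inc. goes bankrupt without warning, it creates risk for the entire financial system, so every financial institution needs new tools for examining investment targets. If bankruptcy can be predicted, an institution can decline to invest or reduce its position, steering clear of hidden reefs in the financial torrent.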
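The main purpose of this study is to use machine learning techniques to construct the best hybrid model for bankruptcy prediction. The source data cover 6819 companies in Taiwan; 220 randomly selected non-bankrupt companies are combined with 220 bankrupt companies to form a balanced dataset. The data contain 95 financial indicators, grouped into eight categories or combinations: Solvency / Capital Structure ratios / Others / Profitability / Turnover ratios / Cash flow ratios / Growth / Solvency + Others. Various training models are built across these combinations, with the aim of finding the best pairing of financial indicators and classifier. The study further examines how reducing dimensionality with Feature Selection affects model performance and model-building time.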
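The balanced-dataset step can be sketched as random undersampling of the majority class. This is a minimal illustration, not the thesis's actual procedure: the function name `build_balanced_sample`, the seeds, and the synthetic label vector are hypothetical, and the sketch assumes the 220 bankrupt companies are all of the bankrupt cases among the 6819.

```python
import random


def build_balanced_sample(labels, n_per_class, seed=0):
    """Return sorted row indices holding n_per_class rows of each class.

    labels: sequence of 0 (non-bankrupt) / 1 (bankrupt) class labels.
    """
    rng = random.Random(seed)
    bankrupt = [i for i, y in enumerate(labels) if y == 1]
    healthy = [i for i, y in enumerate(labels) if y == 0]
    if len(bankrupt) < n_per_class or len(healthy) < n_per_class:
        raise ValueError("not enough rows in one of the classes")
    # Sample without replacement from each class, then merge.
    chosen = rng.sample(bankrupt, n_per_class) + rng.sample(healthy, n_per_class)
    return sorted(chosen)


# Hypothetical label vector using the counts from the abstract:
# 6819 companies, 220 of them assumed bankrupt (label 1).
labels = [1] * 220 + [0] * (6819 - 220)
random.Random(42).shuffle(labels)

balanced_idx = build_balanced_sample(labels, n_per_class=220)
print(len(balanced_idx))                     # 440
print(sum(labels[i] for i in balanced_idx))  # 220
```

Fixing the seed keeps the sampled subset reproducible, which matters when several classifiers are compared on the same 440 rows.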
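The training results show that CART and MLP achieve very similar AUC on the eight datasets, while SVM is unsuitable because its AUC is mostly close to 0.5 and therefore carries little information. The Bagging and Adaboost classifier ensembles both yield a slightly higher AUC than single classifiers. Reducing dimensionality with Feature Selection further improves AUC and shortens model-building time.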
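To see why an AUC near 0.5 is uninformative, the following sketch computes AUC with the pairwise (Mann-Whitney) formulation: the fraction of bankrupt/non-bankrupt pairs in which the bankrupt company receives the higher score. The function `auc` and the toy scores are hypothetical and are not results from the thesis.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive (label 1)
    scores higher than a randomly chosen negative (label 0),
    with ties counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy scores for six companies (1 = bankrupt).
toy_labels = [1, 1, 1, 0, 0, 0]
perfect = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]   # every bankrupt firm ranked first
constant = [0.5] * 6                        # same score for every firm
mixed = [0.9, 0.4, 0.7, 0.6, 0.2, 0.1]      # one bankrupt firm ranked too low

print(auc(toy_labels, perfect))   # 1.0
print(auc(toy_labels, constant))  # 0.5
print(auc(toy_labels, mixed))     # 8 of 9 pairs ordered correctly
```

A model that assigns every company the same score lands at exactly 0.5, the same as random ranking, which is why values near 0.5 give no basis for comparison.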

Abstract (English)

The global financial crisis of 2007 and 2008 was caused by the subprime mortgage crisis that broke out on August 9, 2007. Investors began to lose confidence in the value of mortgage-backed securities, which created liquidity risk. The crisis spun out of control, causing a number of large financial institutions to fail or be taken over by governments and leading to a recession. Because financial institutions and foundations repeatedly used leverage, the assets and liabilities in their financial statements were difficult to interpret, and traditional review methods could hardly warn of bankruptcy. When a firm as large as Lehman Brothers Holdings Inc. goes bankrupt without warning, it triggers risk across the whole financial system, so every financial institution needs new tools to screen investment targets and avoid those at risk of bankruptcy.
The main objective of this study is to use machine learning techniques to construct an optimal bankruptcy prediction model. The source dataset covers 6819 companies in Taiwan, from which a balanced dataset of 220 non-bankrupt and 220 bankrupt companies is formed. There are 95 financial indicators, divided into eight categories or combinations: Solvency / Capital Structure ratios / Others / Profitability / Turnover ratios / Cash flow ratios / Growth / Solvency + Others. By constructing different single classifiers and classifier ensembles, the study aims to find the best combination of financial indicators and classifier. The effect of using Feature Selection for dimensionality reduction on performance is also examined.
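Information Gain is one of the Feature Selection methods named in the thesis's table of contents. The sketch below shows the general idea of ranking indicators by the entropy reduction of a single threshold split. It is an assumption-laden illustration: the indicator names, the toy values, and the fixed threshold of 0.5 are all hypothetical, and the thesis's own discretization is not described in this record.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction from splitting rows on values <= threshold."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing gains nothing
    n = len(labels)
    remainder = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - remainder


# Hypothetical toy data: two made-up indicators for eight companies (1 = bankrupt).
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "debt_ratio": [0.9, 0.8, 0.85, 0.7, 0.3, 0.4, 0.2, 0.35],     # separates the classes
    "noise_ratio": [0.1, 0.9, 0.2, 0.8, 0.15, 0.85, 0.25, 0.75],  # unrelated to the label
}

gains = {name: information_gain(vals, toy_labels, threshold=0.5)
         for name, vals in features.items()}
ranking = sorted(gains, key=gains.get, reverse=True)
print(ranking)  # ['debt_ratio', 'noise_ratio']
```

Keeping only the top-ranked indicators shrinks the input dimension, which is the mechanism by which feature selection can cut modeling time.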
The experimental results show that, across the eight financial-indicator datasets, the CART and MLP classifiers have similar AUC. The SVM classifier is not applicable because its AUC is mostly near 0.5. Classifier ensembles built with the Bagging and Adaboost techniques perform slightly better than single classifiers. Moreover, Feature Selection improves AUC and reduces modeling time.

Keywords (Chinese)

★ Single classifier
★ Multiple classifiers
★ Feature Selection/CART/Bagging/Adaboost

Keywords (English)

★ Single classifier
★ Multiple classification
★ Feature Selection
★ CART

Table of Contents

Abstract (Chinese)
Build machine learning module of bankrupt prediction
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation
1.3 Research Objectives
1.4 Research Process
1.5 Thesis Organization
Chapter 2 Literature Review
2.1 Financial Crisis
2.1.1 Definition of Financial Crisis
2.1.2 Importance of Financial Crisis Prediction for Enterprises
2.2 Data Mining
2.2.1 Definition of Data Mining
2.2.2 Data Mining Steps
2.3 Feature Selection
2.3.1 Genetic Algorithm (GA)
2.3.2 Information Gain
2.4 Classification Algorithms
2.4.1 Classification Decision Tree Induction Algorithm
2.4.2 Artificial Neural Networks
2.4.3 Support Vector Machine (SVM)
2.4.4 Adaptive Boosting (Adaboost)
2.4.5 Bagging
Chapter 3 Research Method
3.1 Research Design and Framework
3.2 Data Source
3.3 Description of Financial Indicators
3.3.1 Solvency
3.3.2 Capital Structure ratios
3.3.3 Others
3.3.4 Profitability
3.3.5 Turnover ratios
3.3.6 Cash flow ratios
3.3.7 Growth
3.4 K-Fold Cross-Validation
3.5 Supervised Learning Techniques
Chapter 4 Results and Analysis
4.1 Data Preprocessing
4.2 Model Evaluation
4.2.1 Confusion Matrix
4.2.2 Receiver Operating Characteristic (ROC) Curve
4.3 Experimental Results and Analysis
4.3.1 Results without Dimensionality Reduction
4.3.2 Dimensionality Reduction with Feature Selection (Information Gain)
4.3.3 Dimensionality Reduction with Feature Selection (GA)
4.3.4 Attributes Selected by Feature Selection
4.3.5 Discussion
Chapter 5 Conclusions and Suggestions
5.1 Conclusions
5.2 Contributions
5.3 Future Research Directions and Suggestions
References

Advisor

蔡志豐 (Chih-Fong Tsai)

Approval Date

2016-6-4
