dc.description.abstract | Many past studies have discussed missing value imputation, but most ran their experiments on UCI datasets or only on training datasets. Few studies address missing values in bankruptcy prediction and credit scoring. Moreover, the use of deep neural networks for imputation or classification has rarely been examined in past research and remains an open question.
The applicability of missing value imputation to bankruptcy prediction and credit scoring is analyzed using five credit datasets (Australia, Japan, Germany, Kaggle and PAKDD) and four bankruptcy datasets (Bankruptcy, Japan Bankruptcy, TEJ Taiwan Bankruptcy and US Bankruptcy). Four imputation methods are compared: Deep Neural Network, K-Nearest Neighbor, Random Forest and Multivariate Imputation by Chained Equations (MICE). The imputed data are then classified with four classifiers, namely Support Vector Machine, Random Forest, Deep Neural Network and Deep Belief Network Stacked Deep Neural Network (DBN-DNN), to assess the effect of the imputation methods on the outcomes. Furthermore, this experiment also explores whether data normalization improves prediction accuracy.
This experiment finds that, on average, imputation improves classification accuracy, and that data normalization combined with imputation enhances the effect of artificial neural networks. In addition, compared with the other machine learning methods, neural networks perform much better after data normalization. In every pair of experiments after normalization, DBN-DNN outperforms the other classifiers. When the missing rate is low, the combination with Random Forest yields the best AUC, while the combination with MICE yields the lowest type II error. When the missing rate is high, the combination with Random Forest achieves both the best AUC and the lowest type II error. | en_US |