dc.description.abstract | Effective classification of class-imbalanced datasets has long been an important issue in data mining. The class imbalance problem arises when the number of samples in one class greatly outnumbers that of the other classes in a dataset. Because of the skewed class distribution, learning models tend to misclassify minority-class samples as the majority class. Since the class imbalance problem occurs in many real-world applications, such as fault diagnosis, medical diagnosis, and fraud detection, many researchers have devoted effort over the past decades to methods for handling class-imbalanced datasets. In the literature, the class imbalance problem can be addressed in three different ways: algorithm-level methods, data-level methods, and cost-sensitive methods. In particular, data-level methods, such as under- and over-sampling techniques, are widely considered. In recent years, deep learning techniques have demonstrated superior performance over many machine learning techniques. However, very few studies have examined their applicability to class-imbalanced datasets. Therefore, the objective of this research is to apply SMOTE as the over-sampling method to re-balance class-imbalanced datasets and then construct deep learning models for performance comparison. In the experiments, 44 class-imbalanced datasets collected from the KEEL dataset repository and 8 datasets from NASA are used. In addition, deep neural networks, including the deep multilayer perceptron (D-MLP) and the deep belief network (DBN), are compared with several representative baseline learning models. The experimental results show that SMOTE combined with deep learning classifiers performs better than traditional machine learning classifiers. In particular, the DBN classifier outperforms the others on datasets with high imbalance ratios, whereas the D-MLP classifier achieves the best overall performance. | en_US
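
The following is a minimal, illustrative sketch of the SMOTE-plus-deep-MLP pipeline described in the abstract, not the authors' actual code. It uses scikit-learn and imbalanced-learn; the synthetic dataset, layer sizes, and other hyperparameters are assumptions made for illustration only, and the DBN model is not shown.

```python
# Illustrative sketch (assumptions, not the authors' implementation):
# re-balance an imbalanced training set with SMOTE, then train a deep MLP.
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import roc_auc_score

# Hypothetical imbalanced dataset standing in for a KEEL/NASA dataset.
X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.95, 0.05], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.3, random_state=42)

# Over-sample only the training split so the test set keeps the original skew.
X_res, y_res = SMOTE(random_state=42).fit_resample(X_train, y_train)

# Deep multilayer perceptron; the layer sizes are an assumption for illustration.
clf = MLPClassifier(hidden_layer_sizes=(64, 64, 32), max_iter=500,
                    random_state=42)
clf.fit(X_res, y_res)

print("AUC:", roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]))
```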