dc.description.abstract | Real-world data often suffers from noise, irrelevant information, and excessive volume, so it must be preprocessed before use. Dimensionality reduction is a common preprocessing method that retains important features while reducing the number of dimensions. Ensemble dimensionality reduction applies multiple dimensionality reduction algorithms and combines the feature subsets they select in different ways, which can improve the robustness of dimensionality reduction and the accuracy of subsequent classification. In recent years, deep learning techniques have received significant attention. However, most related research has focused on unstructured data; few studies apply deep learning to high-dimensional structured data, and comprehensive discussions covering both machine learning and deep learning techniques are lacking. This study therefore investigates dimensionality reduction and classification techniques based on machine learning and deep learning, specifically for high-dimensional structured datasets. It also examines whether deep learning can outperform traditional machine learning methods, and compares single and ensemble dimensionality reduction methods to identify the best combinations of dimensionality reduction techniques.
This study uses twenty high-dimensional structured datasets, ranging from 44 to 22,283 dimensions. Machine learning and deep learning-based dimensionality reduction and classification techniques are applied, and ensemble learning and feature fusion concepts are incorporated into dimensionality reduction. The experiments use five-fold cross-validation and record average accuracy, average area under the curve (AUC), and average CPU time. Finally, the results are analyzed to assess the benefits of dimensionality reduction and to offer recommendations for datasets of different dimensionalities.
According to the experimental results, deep learning methods outperform machine learning methods for both dimensionality reduction and classification. Ensemble dimensionality reduction outperforms single dimensionality reduction, with parallel ensemble dimensionality reduction performing best. The best single dimensionality reduction method is SAE+MLP, the best sequential ensemble approach is IG+SAE+MLP, and the best parallel ensemble approach is AE+SAE(SFC)+MLP. | en_US |