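Most existing feature selection approaches rely on a single (competitive) feature selection method. This study introduces the concept of information fusion: 28 complete datasets are collected from UCI and other public sources, and single (competitive) feature selection is compared with hybrid feature selection. We further examine how data of different dimensionalities and types affect the different feature selection approaches, in order to determine whether hybrid feature selection based on information fusion can help handle various types of datasets and substantially improve the accuracy of the prediction model.
Today we face not only the problem of huge volumes of data (Big Data) but also the need for timely information. With limited resources and time, it is important to know how to perform data mining to find interesting patterns. We first consider feature selection as a data pre-processing step and then use the selected data to construct a classifier, which can improve the classification accuracy of the model and help users make decisions.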
In this thesis, we treat feature selection as a preprocessing step that removes irrelevant and redundant features (attributes of the data) from a given dataset. In other words, a feature selection algorithm is used to identify useful or representative attributes in the full dataset. We assemble these attributes into a new dataset and then train a support vector machine classifier on it to improve the accuracy and efficiency of the model.
Since most related studies focus only on single (competitive) feature selection, this thesis applies the concept of information fusion to combine the results of multiple feature selection methods. The experiments are based on 28 public datasets drawn from UCI and other public sources. By examining datasets of different dimensionalities and data types, we assess whether combining different feature selection results yields better classification performance than any single result.
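As an illustration of what combining feature selection results can look like, the following minimal sketch fuses the feature subsets returned by several selectors. The abstract does not state which fusion rule is used, so the three strategies shown here (union, intersection, and majority voting) are assumptions, and the function name `fuse_feature_subsets` and the three example selector outputs are hypothetical.

```python
from collections import Counter


def fuse_feature_subsets(subsets, strategy="union"):
    """Combine feature-index subsets from several selectors into one subset.

    The strategies are illustrative assumptions, not the thesis's stated method:
      - "union":        keep a feature if any selector chose it
      - "intersection": keep a feature only if every selector chose it
      - "majority":     keep a feature if more than half of the selectors chose it
    """
    sets = [set(s) for s in subsets]
    if not sets:
        return []
    if strategy == "union":
        fused = set().union(*sets)
    elif strategy == "intersection":
        fused = set.intersection(*sets)
    elif strategy == "majority":
        counts = Counter(f for s in sets for f in s)
        fused = {f for f, c in counts.items() if c > len(sets) / 2}
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return sorted(fused)


# Hypothetical feature indices chosen by three different selectors.
selector_a = [0, 2, 5, 7]
selector_b = [2, 3, 5]
selector_c = [1, 2, 7]
all_subsets = [selector_a, selector_b, selector_c]

union_features = fuse_feature_subsets(all_subsets, "union")
intersection_features = fuse_feature_subsets(all_subsets, "intersection")
majority_features = fuse_feature_subsets(all_subsets, "majority")
```

In a pipeline like the one described above, the fused subset would then be used to build the reduced dataset on which the support vector machine classifier is trained, and its accuracy compared against the classifiers trained on each single selector's subset.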