Abstract: | Classification is an important topic in data mining and machine learning, with wide application in finance, medicine, biology, pattern recognition, and other fields. A classification model must be built effectively and must predict the classes of unseen samples accurately. This study proposes an ensemble classifier that combines neuro-fuzzy theory with adaptive boosting (AdaBoost) and applies it to classification problems. The ensemble is composed of a set of neuro-fuzzy system (NFS) component classifiers. Modeling the NFS ensemble classifier comprises a structure-learning phase and a parameter-learning phase. In the structure-learning phase, an FCM-based splitting algorithm (FBSA), built on fuzzy C-means (FCM), automatically determines the optimal structure of each NFS component classifier, that is, its number of If-Then rules. In the parameter-learning phase, a PSO-RLSE hybrid learning method is used: particle swarm optimization (PSO) adjusts the premise parameters of an NFS component classifier, and the recursive least-squares estimator (RLSE) updates its consequent parameters. 
Moreover, to improve modeling efficiency, principal component analysis (PCA) is used to extract important features, which both reduces the classifier's computation time and improves classification accuracy. Six datasets from the University of California, Irvine (UCI) machine learning repository were used to test the proposed approach, and its classification accuracy was compared with that of other well-known approaches. The experimental results show that the proposed approach achieves higher classification accuracy than the compared approaches. |
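The boosting step that ties the component classifiers into an ensemble can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes binary labels in {+1, -1}, and it uses a one-feature decision stump as a stand-in for the NFS component classifier, so FBSA, PSO, RLSE, and PCA are all omitted. The function names (`train_stump`, `adaboost_fit`, `adaboost_predict`) are hypothetical and chosen only for this illustration.

```python
import math

def train_stump(X, y, w):
    """Stand-in component classifier (assumption: a decision stump, not an NFS).
    Returns (weighted_error, feature, threshold, polarity) with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for pol in (1, -1):
                preds = [pol if row[j] <= t else -pol for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, pol)
    return best

def stump_predict(stump, row):
    _, j, t, pol = stump
    return pol if row[j] <= t else -pol

def adaboost_fit(X, y, rounds=5):
    """Discrete AdaBoost for labels in {+1, -1}."""
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform sample weights
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        if stump[0] >= 0.5:                # no better than chance: stop adding members
            break
        err = max(stump[0], 1e-10)         # clamp to avoid division by zero
        alpha = 0.5 * math.log((1.0 - err) / err)   # vote weight of this member
        ensemble.append((alpha, stump))
        # Raise the weight of misclassified samples, lower the rest, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def adaboost_predict(ensemble, row):
    """Weighted vote of all ensemble members."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1
```

As a usage example, `adaboost_fit([[0], [1], [2], [3], [4], [5]], [1, 1, -1, -1, 1, 1], rounds=3)` trains three stumps on a one-dimensional dataset that no single stump can separate, and `adaboost_predict` then combines their weighted votes. In the approach described in the abstract, each stump would be replaced by an NFS component classifier trained on the reweighted samples.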