Master's Thesis 106423053: Detailed Record




Author: Chi-Wei Yeh (葉奇瑋)    Department: Information Management
Thesis Title: Hierarchical Classification and Regression with Feature Selection
Related theses
★ An Empirical Study of Multi-label Text Classification: Comparing Word Embedding and Traditional Techniques
★ Symptom-based Sentiment Analysis of Patient-written Diaries
★ An Attention-based Open-domain Dialogue System
Full text: available in the system after 2022-7-31
Abstract (Chinese) Finding suitable ways to organize messy data and extracting valuable information from it has become the vision of the big data era. As the volume and complexity of data grow, data scientists no longer focus only on how well a model trains; instead, they look for different computation methods or processing architectures to uncover clues in the data, and ultimately hope to turn these findings into effective ways to improve numerical prediction.
For numerical prediction on a dataset, the common regression models are built with linear regression, neural networks, and support vector regression. To pursue better numerical predictions when training a regression model, besides tuning the model's parameters, data preprocessing often applies feature selection to filter out less relevant or redundant features, and clustering to organize the data into coherent groups.
This study takes a hierarchical structure as the experimental prototype and extends it, applying layered clustering and feature selection during preprocessing for datasets with many records and features. To make the experimental results clearly comparable, the study uses multiple datasets that differ in domain, size, and feature count, and combines different clustering, feature selection, and regression algorithms under the hierarchical structure to train models and compute numerical prediction errors. Analysis and comparison of the many algorithm combinations show that the proposed structure lowers root mean square error or mean absolute error by 0 to 1 compared with using regression alone or clustering plus regression. In addition, the experiments identify the hierarchical clustering methods (K-means, C-means), feature selection methods (Mutual Information, Information Gain), and regression method (Multi-layer Perceptron) with the best average performance across the datasets.
Abstract (English) The vision of the big data era is to find suitable ways to organize numerous, messy data and to extract valuable information from them. As the quantity and complexity of data increase, data scientists no longer focus on the strengths and weaknesses of model training but concentrate on using different computation methods or operating architectures to find clues in the data, and ultimately look for ways to improve numerical prediction accuracy from these findings.
For numerical prediction on a dataset, the common regression models are linear regression, neural networks, and support vector regression. To pursue better numerical predictions, adjusting the parameters in the models is necessary; beyond this, we use feature selection to remove less relevant or redundant features and clustering to organize the data into groups.
In this study, a hierarchical structure is used as the experimental prototype and is extended to handle datasets with large numbers of records and features.
To obtain clear comparisons, our study combines different clustering, feature selection, and regression algorithms to train models and computes numerical prediction errors on datasets from different domains with different sizes and feature counts. Analyzing and comparing the experimental results of multiple algorithm combinations shows that hierarchical classification and regression with feature selection improves root mean square error and mean absolute error over using regression alone or hierarchical classification and regression without feature selection. In addition, the hierarchical structure using clustering (K-means and C-means), feature selection (Mutual Information and Information Gain), and regression (Multi-layer Perceptron) achieves the best average performance across the datasets.
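The pipeline the abstract describes (cluster the training data, select features within each cluster, train a per-cluster regressor, route test samples to a cluster, then evaluate with RMSE and MAE) can be sketched with scikit-learn. This is a minimal illustration, not the thesis code: the synthetic dataset, k = 3 clusters, 10 selected features, and K-means assignment standing in for the trained classifier of Section 3.7 are all assumptions; the thesis compares many such combinations.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
from sklearn.feature_selection import SelectKBest, mutual_info_regression
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error, mean_absolute_error

# Synthetic stand-in for the multi-domain datasets used in the thesis.
X, y = make_regression(n_samples=600, n_features=20, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Feature scaling (Section 3.4.2), fitted on the training set only.
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# Step 1: cluster the training data (K-means is one of the methods compared).
k = 3
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_train)

# Steps 2-3: per cluster, select features by mutual information,
# then train that cluster's regressor (MLP here).
models = []
for c in range(k):
    mask = km.labels_ == c
    selector = SelectKBest(mutual_info_regression, k=10)
    selector.fit(X_train[mask], y_train[mask])
    reg = MLPRegressor(hidden_layer_sizes=(50,), max_iter=2000, random_state=0)
    reg.fit(selector.transform(X_train[mask]), y_train[mask])
    models.append((selector, reg))

# Step 4: route each test sample to a cluster (K-means assignment here,
# standing in for the trained classifier) and predict with that cluster's model.
clusters = km.predict(X_test)
y_pred = np.empty_like(y_test, dtype=float)
for c in range(k):
    sel, reg = models[c]
    idx = clusters == c
    if idx.any():
        y_pred[idx] = reg.predict(sel.transform(X_test[idx]))

# Error evaluation (Section 3.9): RMSE and MAE.
rmse = mean_squared_error(y_test, y_pred) ** 0.5
mae = mean_absolute_error(y_test, y_pred)
print(round(rmse, 2), round(mae, 2))
```

Comparing these errors against a single regressor trained on all the data (without the clustering and per-cluster feature selection) reproduces, in miniature, the comparison the thesis makes across its datasets.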
Keywords (Chinese) ★ Hierarchical structure
★ Linear regression
★ Feature selection
★ Classification
★ Clustering
Keywords (English) ★ K-means
★ C-means
★ Expectation Maximization
★ Chi Square
★ Mutual Information
★ Information Gain
★ Support Vector Machine
★ Multilayer Perceptron
★ Support Vector Regression
★ Linear Regression
Table of Contents
Chinese Abstract
English Abstract
Table of Contents
List of Figures
List of Tables
List of Appendix Figures
1. Introduction
1.1 Research Background
1.2 Research Motivation
1.3 Research Objectives
1.4 Thesis Organization
2. Literature Review
2.1 Clustering
2.1.1 K-means
2.1.2 Fuzzy C-means
2.1.3 Expectation Maximization
2.2 Feature Selection
2.2.1 Chi Square
2.2.2 Mutual Information
2.2.3 Information Gain
2.3 Regression Methods
2.3.1 Linear Regression
2.3.2 Support Vector Regression
2.3.3 Multi-layer Perceptron
2.4 Numerical Prediction with a Hierarchical Structure
3. Research Method
3.1 Experimental Architecture
3.2 Experimental Procedure
3.2.1 Model Training Procedure for the Training Set
3.2.2 Model Training Procedure for the Test Set
3.3 Datasets
3.4 Data Preprocessing
3.4.1 Data Split
3.4.2 Feature Scaling
3.5 Dataset Clustering
3.6 Dataset Feature Selection
3.7 Classifier Model Training
3.8 Regression Model Training
3.9 Error Evaluation
3.9.1 Root Mean Square Error (RMSE)
3.9.2 Mean Absolute Error (MAE)
4. Experimental Results
4.1 Comparison of Results Using Different Regression Methods
4.2 RMSE and MAE of the Best-Performing Numbers of Clusters Using MLP for Each Dataset
4.3 Summed RMSE and MAE of the Best-Performing Numbers of Clusters Using MLP over All Datasets
4.4 Comparison of the Hierarchical Structure with Different Numbers of Clusters over All Datasets
4.5 Clustering Results for Each Dataset
4.6 Analysis of Experimental Results
5. Conclusion
5.1 Research Contributions
5.2 Future Work
References
Appendix
Advisor: Shih-Wen Ke (柯士文)    Approval date: 2019-7-23