Thesis 104423011: Detailed Record




Name 張櫻馨 (Ying-Hsin Chang)    Department Information Management
Thesis title 基於單一與混合特徵選取方法之比較 (A Comparison of Single and Combined Feature Selection Methods)
Related theses
★ 利用資料探勘技術建立商用複合機銷售預測模型
★ 應用資料探勘技術於資源配置預測之研究-以某電腦代工支援單位為例
★ 資料探勘技術應用於航空業航班延誤分析-以C公司為例
★ 全球供應鏈下新產品的安全控管-以C公司為例
★ 資料探勘應用於半導體雷射產業-以A公司為例
★ 應用資料探勘技術於空運出口貨物存倉時間預測-以A公司為例
★ 使用資料探勘分類技術優化YouBike運補作業
★ 特徵屬性篩選對於不同資料類型之影響
★ 資料探勘應用於B2B網路型態之企業官網研究-以T公司為例
★ 衍生性金融商品之客戶投資分析與建議-整合分群與關聯法則技術
★ 應用卷積式神經網路建立肝臟超音波影像輔助判別模型
★ 基於卷積神經網路之身分識別系統
★ 能源管理系統電能補值方法誤差率比較分析
★ 企業員工情感分析與管理系統之研發
★ 資料淨化於類別不平衡問題: 機器學習觀點
★ 資料探勘技術應用於旅客自助報到之分析—以C航空公司為例
  1. The electronic full text of this thesis is approved for immediate open access.
  2. The open-access electronic full text is licensed only for academic research purposes: personal, non-profit searching, reading, and printing.
  3. Please comply with the relevant provisions of the Copyright Act of the Republic of China (Taiwan); do not reproduce, distribute, adapt, repost, or broadcast it without authorization.

Abstract (Chinese) In daily life we face the problem of big data, and we must also consider the timeliness of data. To perform data mining under limited resources and time and discover interesting patterns, the first concern is data pre-processing: the data produced by feature selection are fed to a classifier to raise the model's prediction accuracy and thereby support users' decision-making.

This thesis studies feature selection as a data pre-processing step that removes irrelevant and redundant features (attributes of the data). In other words, a feature selection algorithm extracts from the original dataset the useful features, or the feature values sufficient to represent the whole dataset, recombines them into a new dataset, and feeds that dataset into a support vector machine (SVM) classifier, in the hope that feature selection improves both the model's accuracy and its runtime efficiency.

Most current work uses single (competitive) feature selection. This thesis introduces the concept of information fusion: the experiments take 28 complete datasets from the UCI repository and other public sources, compare single (competitive) feature selection with combined feature selection, and further examine how data of different dimensionalities and types affect each approach, so as to determine whether information-fusion-based combined feature selection helps handle datasets of various kinds and can substantially improve the prediction model's accuracy.
Abstract (English) In our current life, we not only face the big data problem, but also need to take into account the immediacy of information. Under limited resources and time, it is important to know how to perform data mining to find interesting patterns. We first consider data pre-processing for feature selection, and apply the selected data to construct the classifier, which can improve the classification accuracy of the model and help users make decisions.

In this thesis, we discuss feature selection as the pre-processing step that removes irrelevant and redundant features (attributes of the data) from a given dataset. In other words, a feature selection algorithm is used to identify useful or representative attributes from the entire dataset. We reassemble these attributes into a new dataset and then use the support vector machine classifier, aiming to improve the accuracy and efficiency of the model.
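The pipeline described above (select a feature subset, then train an SVM on it) can be sketched as follows. This is an illustrative example using scikit-learn on a bundled public dataset, not the thesis's actual code; the selector, the subset size k=10, and the RBF kernel are assumptions for demonstration.

```python
# Illustrative sketch: single feature selection followed by an SVM classifier.
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# Keep the 10 most informative features, scale them, then classify with an SVM.
pipe = make_pipeline(
    SelectKBest(mutual_info_classif, k=10),
    StandardScaler(),
    SVC(kernel="rbf"),
)

# 10-fold cross-validation, as is common for this kind of comparison.
scores = cross_val_score(pipe, X, y, cv=10)
print(f"Mean accuracy with 10 selected features: {scores.mean():.3f}")
```

Replacing the `SelectKBest` step with a different selector (or removing it for a baseline) lets the same pipeline compare selection methods under identical conditions.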

Since most related studies only focus on single (competitive) feature selection, this thesis applies the concept of information fusion to multiple feature selection results. The experiments are based on 28 public datasets, mainly from the UCI repository. The purpose of this thesis is to combine multiple feature selection methods; across data of different dimensionalities and types, we examine whether combining different feature selection results performs better than single results in terms of classification performance.
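One plausible way to fuse multiple feature selection results is to take the union or intersection of the subsets chosen by several single selectors. The sketch below is a hypothetical illustration in scikit-learn: three simple selectors (ANOVA F-score, mutual information, and decision-tree importances) stand in for the thesis's GA / PCA / C4.5 trio, and each fused subset is evaluated with an SVM; none of these specific choices come from the thesis itself.

```python
# Illustrative sketch: fusing three feature-selection results by union and
# intersection, then evaluating each fused subset with an SVM.
import numpy as np
from sklearn.datasets import load_wine
from sklearn.feature_selection import SelectKBest, f_classif, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_wine(return_X_y=True)

# Three single selectors, each picking 6 of the 13 wine features.
sel_a = set(np.flatnonzero(SelectKBest(f_classif, k=6).fit(X, y).get_support()))
sel_b = set(np.flatnonzero(SelectKBest(mutual_info_classif, k=6).fit(X, y).get_support()))
tree = DecisionTreeClassifier(random_state=0).fit(X, y)
sel_c = set(np.argsort(tree.feature_importances_)[-6:])  # top 6 by importance

union = sorted(sel_a | sel_b | sel_c)         # picked by at least one selector
intersection = sorted(sel_a & sel_b & sel_c)  # picked by every selector

for name, idx in [("union", union), ("intersection", intersection)]:
    if idx:  # the intersection can be empty when the selectors disagree
        clf = make_pipeline(StandardScaler(), SVC())
        acc = cross_val_score(clf, X[:, idx], y, cv=5).mean()
        print(f"{name}: {len(idx)} features, accuracy {acc:.3f}")
```

The union keeps every feature any method found useful (larger subsets), while the intersection keeps only the features all methods agree on (smaller, more conservative subsets); comparing the two against the single selectors is the essence of the fusion experiment.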
Keywords (Chinese) ★ Data Mining
★ Machine Learning
★ Information Fusion
★ Feature Selection
★ Support Vector Machines
Keywords (English) ★ KDD
★ Machine Learning
★ Information Fusion
★ Feature Selection
★ Support Vector Machines
Table of Contents
Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1  Introduction
1.1  Research Background
1.2  Research Motivation
1.3  Research Objectives
1.4  Thesis Organization
Chapter 2  Literature Review
2.1  Feature Selection
2.1.1  Genetic Algorithm (GA)
2.1.2  Principal Components Analysis (PCA)
2.1.3  Decision Tree C4.5 (DT)
2.1.4  Feature Selection Based on Information Fusion
2.2  Supervised Learning for Classification
2.2.1  Supervised Learning
2.2.2  Support Vector Machines (SVM)
Chapter 3  Methodology
3.1  Experimental Framework
3.2  Parameter Settings
3.2.1  Genetic Algorithm (Wrapper)
3.2.2  PCA (Filter)
3.2.3  C4.5 Decision Tree (Embedded)
3.2.4  Support Vector Machine
3.3  Experiment 1
3.3.1  Baseline
3.3.2  Single Feature Selection
3.3.3  Combined Feature Selection
3.4  Experiment 2
Chapter 4  Experimental Results
4.1  Experimental Setup
4.1.1  Datasets
4.1.2  Computing Environment
4.1.3  Model Validation Criteria
4.2  Choosing the Retained-Information Ratio for PCA
4.3  Results of Experiment 1
4.3.1  Feature-Set Sizes of Single vs. Combined Selection on Categorical Data
4.3.2  Feature-Set Sizes of Single vs. Combined Selection on Numeric Data
4.3.3  Feature-Set Sizes of Single vs. Combined Selection on Mixed Data
4.3.4  SVM Classification Results on Categorical Data
4.3.5  SVM Classification Results on Numeric Data
4.3.6  SVM Classification Results on Mixed Data
4.3.7  Best-Performing Method per Dataset
4.3.8  Accuracy Comparison with Selection on the Original Data
4.4  Results of Experiment 2
4.4.1  Feature-Set Sizes of Single vs. Combined Selection
4.4.2  SVM Classification Results on High-Dimensional Data
4.4.3  Best-Performing Method per Dataset
4.5  Summary of Experimental Findings
Chapter 5  Conclusion
5.1  Conclusions and Contributions
5.2  Limitations and Future Research Directions
References
Appendix 1  Feature Selection Results
1.1  Single Feature Selection Results of Study 1
1.2  Single Feature Selection Results of Study 2
Advisor 蔡志豐    Date of approval 2017-07-04
