A Performance Comparison of Support Vector Machines Using Different Kernels and Variables in Cross-Validation

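Name

莊譽辰 (Yu-Chen Chuang)

Department

Department of Information Management (In-service Master's Program)

Thesis title

A Performance Comparison of Support Vector Machines Using Different Kernels and Variables in Cross-Validation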

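Related theses

★ Building a sales forecasting model for commercial multifunction printers using data mining techniques
★ A study applying data mining techniques to resource allocation forecasting: the case of a support unit at a computer contract manufacturer
★ Applying data mining techniques to flight delay analysis in the airline industry: the case of Company C
★ Security control of new products in a global supply chain: the case of Company C
★ Data mining applied to the semiconductor laser industry: the case of Company A
★ Applying data mining techniques to predicting the warehouse storage time of air export cargo: the case of Company A
★ Optimizing YouBike rebalancing operations using data mining classification techniques
★ The effect of feature selection on different data types
★ Data mining applied to a study of B2B corporate websites: the case of Company T
★ Customer investment analysis and recommendations for financial derivatives: integrating clustering and association rule techniques
★ Building an assisted classification model for liver ultrasound images using convolutional neural networks
★ An identity recognition system based on convolutional neural networks
★ A comparative analysis of error rates of electrical energy data imputation methods in energy management systems
★ Development of an employee sentiment analysis and management system for enterprises
★ Data cleaning for the class imbalance problem: a machine learning perspective
★ Applying data mining techniques to the analysis of passenger self-service check-in: the case of Airline C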

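This electronic thesis is authorized for immediate open access.
Once open, the electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, so as to avoid breaking the law.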

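Abstract (translated from Chinese)

With advances in technology and the rapid growth of data analysis, more and more large enterprises across industries are trying to use data mining to turn the large volumes of data collected through sales into useful information, in order to reduce company costs or increase profits. Against this background, many data mining tools and related languages have emerged. From among the many data analysis tools available, this study selects two classification tools, LibSVM and LS-SVM, and compares their results on two datasets with many features and a large number of records, HIGGS and covertype. Training data and testing data of different sizes are drawn by random sampling and combined with different kernel functions and different SVM tools for cross-analysis. The experimental data are used to evaluate which combination yields a smaller ratio of analysis time to accuracy and therefore a higher benefit, so that researchers who later want to use SVM tools for data analysis can rely on these results to choose a better combination for the type of dataset they wish to analyze.
The experimental results mainly report analysis time and accuracy, together with the ratio of the increase in time to the increase in accuracy, in order to find the combinations that perform better on time and accuracy and have a lower ratio of time increase to accuracy increase. The study finds that LibSVM with the Linear kernel takes less time and achieves higher accuracy when analyzing the HIGGS and covertype datasets, but its efficiency is lower as the amount of training data increases. LS-SVM achieves higher accuracy on both datasets but takes longer to run, and its benefit as the amount of training data increases is lower than that of LibSVM.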
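
As a point of reference for the kernel comparison described in this abstract, the sketch below writes out the standard textbook forms of the Linear and RBF kernels. It is not taken from the thesis: the function names, the sample vectors, and the `gamma` value are illustrative assumptions, and each SVM tool parameterizes the RBF width in its own way.

```python
import math


def linear_kernel(x, y):
    """Linear kernel: the plain dot product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma):
    """RBF kernel: exp(-gamma * ||x - y||^2); gamma > 0 sets the width."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Hypothetical two-feature samples, for illustration only.
u = [1.0, 2.0]
v = [3.0, 4.0]
print(linear_kernel(u, v))          # dot product of u and v
print(rbf_kernel(u, v, gamma=0.5))  # decays toward 0 as u and v move apart
```

The Linear kernel needs no extra parameter, while the RBF kernel adds a width parameter that has to be chosen, which is one reason the two can differ in both running time and accuracy.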

Abstract (English)

With advances in technology and the rapid growth of data analysis, more and more large enterprises across industries are trying to use data mining to turn the large volumes of data collected through sales into useful information, in order to reduce costs or increase profits. Against this background, many data mining tools and related languages have emerged.

This research selects two classification tools, LibSVM and LS-SVM, from among the many data analysis tools available and compares their results on two datasets, HIGGS and covertype. Training and testing data of different sizes are drawn by random sampling and combined with different kernel functions and the two SVM tools for cross-analysis. The experimental data are then used to assess which combination gives the analyst the greatest benefit, so that later researchers who want to use SVM tools for data analysis can use this study to choose a better combination for the type of dataset they are analyzing.

The experimental results mainly report analysis time and accuracy, together with the ratio of the increase in time to the increase in accuracy. The aim is to find the combinations that perform better on time and accuracy and have a lower ratio of time increase to accuracy increase. The results show that LibSVM with the Linear kernel analyzes the HIGGS and covertype datasets in less time and with higher accuracy, but its efficiency is lower as the amount of training data increases. Furthermore, LS-SVM achieves higher accuracy but takes longer to run, and as the training data increases its efficiency is lower than that of LibSVM under the same conditions.
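
The ratio of time increase to accuracy increase mentioned above can be sketched as follows. The abstract does not give the exact formula, so this assumes the ratio is computed between consecutive training-set sizes as added seconds divided by added accuracy points. The function name and all numbers are made up for illustration and are not results from the thesis.

```python
def time_per_accuracy_gain(runs):
    """Return (from_size, to_size, ratio) for each pair of consecutive runs.

    runs: list of (training_size, seconds, accuracy_percent),
          sorted by training_size.
    ratio: extra seconds spent per extra accuracy point gained
           (assumed definition; infinite if accuracy did not improve).
    """
    ratios = []
    for (n0, t0, a0), (n1, t1, a1) in zip(runs, runs[1:]):
        d_time = t1 - t0
        d_acc = a1 - a0
        ratio = d_time / d_acc if d_acc > 0 else float("inf")
        ratios.append((n0, n1, ratio))
    return ratios


# Hypothetical measurements, NOT taken from the thesis.
example_runs = [
    (1000, 2.0, 60.0),
    (2000, 6.0, 64.0),
    (4000, 20.0, 66.0),
]
for n0, n1, ratio in time_per_accuracy_gain(example_runs):
    print(f"{n0} -> {n1}: {ratio} s per accuracy point")
```

Under this assumed definition, a lower ratio means each additional point of accuracy costs less additional time, which is the sense in which the abstract calls a combination more efficient.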

Keywords (Chinese)

★ LS-SVM

Keywords (English)

★ LS-SVM
★ LibSVM
★ Kernel function
★ RBF
★ Linear

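Table of Contents

Abstract (Chinese)
Abstract (English)
Acknowledgements
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation
1.3 Research Objectives
1.4 Thesis Structure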
Chapter 2 Literature Review
2.1 Introduction to SVM
2.1.1 Introduction to LibSVM (A Library for Support Vector Machines) [7]
2.1.2 Introduction to LS-SVM (Least Squares Support Vector Machine) [8]
2.2 Comparison of LibSVM and LS-SVM
2.2.1 SMO Algorithm [2]
2.2.2 Least Squares Method [16]
2.3 Introduction to SVM Kernel Functions [17,18]
2.3.1 Linear kernel
2.3.2 RBF kernel
2.3.3 Comparison of the Linear kernel and the RBF kernel
2.4 Related Work
Chapter 3 Research Method
3.1 Research Overview
3.2 Experimental Datasets
3.3 Experimental Setup
3.3.1 Data Preprocessing
3.3.2 Data Classification with LibSVM
3.3.3 Data Classification with LS-SVM
3.4 Evaluation Method
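Chapter 4 Experimental Results
4.1 Sampling Analysis by Training Model Count
4.1.1 Sampling Analysis of the HIGGS Dataset Using LibSVM with the RBF kernel
4.1.2 Sampling Analysis of the covtype Dataset Using LibSVM with the RBF kernel
4.1.3 Comparison of Time and Accuracy Ratios for Dataset Sampling Using LibSVM with the RBF kernel
4.2 Sampling Analysis with Different Kernel Functions
4.2.1 Sampling Analysis of the HIGGS Dataset Using LibSVM with the Linear kernel
4.2.2 Sampling Analysis of the covtype Dataset Using LibSVM with the Linear kernel
4.2.3 Comparison of Time and Accuracy Ratios for Dataset Sampling Using LibSVM with the Linear kernel
4.2.4 Comparison of Time and Accuracy Ratios for Dataset Sampling Using LibSVM with the Linear kernel and the RBF kernel
4.3 Sampling Analysis with Different SVM Tools
4.3.1 Sampling Analysis of the HIGGS Dataset Using LS-SVM with the RBF kernel
4.3.2 Sampling Analysis of the covtype Dataset Using LS-SVM with the RBF kernel
4.3.3 Sampling Analysis of the HIGGS Dataset Using LS-SVM with the Linear kernel
4.3.4 Sampling Analysis of the covtype Dataset Using LS-SVM with the Linear kernel
4.3.5 Comparison of Time and Accuracy in Sampling Analysis with LS-SVM and LibSVM
4.4 Experimental Conclusions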
Chapter 5 Discussion
5.1 Research Limitations
5.2 Research Contributions
5.3 Future Research Directions
References
Appendices
Appendix 1 through Appendix 17

References
[1] Cristianini, N., and Shawe-Taylor, J. "An Introduction to Support Vector Machines", Cambridge University Press, Cambridge, UK, 2000.
[2] Chih-Chung Chang and Chih-Jen Lin. "LIBSVM: a Library for support vector machines". Software available at http://www.csie.ntu.edu.tw/~cjlin/LibSVM, 2001.
[3] Suykens, J.A.K.; Vandewalle, J. "Least squares support vector machine classifiers", Neural Processing Letters, 9 (3), 293-300, 1999.
[4] Alpaydin, Ethem. "Introduction to Machine Learning". MIT Press. p. 9. ISBN 978-0-262-01243-0, 2010.
[5] Armstrong, J. Scott. "Illusions in Regression Analysis", International Journal of Forecasting. 28 (3): 689. doi:10.1016/j.ijforecast.2012.02.001, 2012.
[6] L. Zhao and C. E. Thorpe, "Stereo- and neural network-based pedestrian detection", IEEE Trans. on Intelligent Transportation Systems, vol. 1, no. 3, pp. 148-154, 2000.
[7] 何嘉翰, "Simulation Analysis of Anomaly Detection Based on LibSVM" (in Chinese), Master's thesis, Graduate Institute of Computer Science and Information Engineering, Tunghai University, 2008.
[8] 江博昱, "Forecasting Market Trends in the Communications Industry with Least Squares Support Vector Machines Based on Genetic Algorithms and Particle Swarm Optimization" (in Chinese), Master's thesis, Institute of Communications Engineering, National Tsing Hua University, 2012.
[9] Hofmann, Thomas; Scholkopf, Bernhard; Smola, Alexander J. "Kernel Methods in Machine Learning", 2008.
[10] Chang, Yin-Wen; Hsieh, Cho-Jui; Chang, Kai-Wei; Ringgaard, Michael; Lin, Chih-Jen. "Training and testing low-degree polynomial data mappings via linear SVM". J. Machine Learning Research. 11: 1471–1490, 2010.
[11] Chang, Chih-Chung; Lin, Chih-Jen. "LIBSVM: A Library for support vector machines". ACM Transactions on Intelligent Systems and Technology. 2 (3), 2011.
[12] Charnes, A.; Frome, E. L.; Yu, P. L. "The Equivalence of Generalized Least Squares and Maximum Likelihood Estimates in the Exponential Family". Journal of the American Statistical Association. 71 (353): 169–171. doi:10.1080/01621459.1976.10481508, 1976.
[13] Russell, Stuart; Norvig, Peter [1995]. Artificial Intelligence: A Modern Approach (2nd ed.). Prentice Hall. ISBN 978-0137903955, 2003.
[14] Graves, Alex; Schmidhuber, Jürgen. "Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks". In Bengio, Yoshua; Schuurmans, Dale; Lafferty, John; Williams, Chris K. I.; Culotta, Aron (eds.). Neural Information Processing Systems (NIPS) Foundation: 545–552, 2009.
[15] 林宗勳, "Introduction to SVM and Face Recognition" (in Chinese), teaching document, CMLAB, Department of Computer Science and Information Engineering, National Taiwan University.
[16] Wang Guorong; Wei Yimin; Qiao SanZheng. "Equation Solving Generalized Inverses". Generalized Inverses: Theory and Computations. Beijing: Science Press. p. 6. ISBN 7-03-012437-5, 2004.
[17] On-Line Prediction Wiki Contributors. "Kernel Methods." On-Line Prediction Wiki. http://onlineprediction.net/?n=Main.KernelMethods, 2010.
[18] Hofmann, T., B. Schölkopf, and A. J. Smola. "Kernel methods in machine learning." Ann. Statist. Volume 36, Number 3, 1171-1220, 2008.
[19] 賴明志, "A Study of Vegetation Landslide Distribution Using Support Vector Machines: The Case of the Forest Area of Zhushan Township, Nantou County" (in Chinese), Master's thesis, Department of Construction Engineering, Chaoyang University of Technology, 2011.
[20] S. Tavara, H. Sundell and A. Dahlbom, "Empirical Study of Time Efficiency and Accuracy of Support Vector Machines Using an Improved Version of PSVM", Department of Information Technology, University of Borås, Borås, Sweden, and School of Informatics, University of Skövde, Skövde, Sweden, 2015.
[21] Abdiansah Abdiansah, Retantyo Wardoyo, "Time Complexity Analysis of Support Vector Machines (SVM) in LibSVM", Intelligent System Laboratory, Comp. Science Department, Sriwijaya University, and Intelligent System Laboratory, Comp. Science & Electronic Dept., Gadjah Mada University, 2015.
[22] Chih-Wei Hsu and Chih-Jen Lin, "A Comparison of Methods for Multiclass Support Vector Machines", IEEE Transactions on Neural Networks, vol. 13, no. 2, 2002.
[23] Byvatov E, Fechner U, Sadowski J, Schneider G, "Comparison of support vector machine and artificial neural network systems for drug/nondrug classification", J Chem Inf Comput Sci., 2003.
[24] Jieping Ye, Tao Xiong, "SVM versus Least Squares SVM", Department of Computer Science and Engineering, Arizona State University, Tempe, AZ 85287, and Department of Electrical and Computer Engineering, University of Minnesota, Minneapolis, MN 55455, 2007.
[25] Kroese, D. P.; Brereton, T.; Taimre, T.; Botev, Z. I. "Why the Monte Carlo method is so important today." WIREs Comput Stat, 6: 386–392. doi:10.1002/wics.1314, 2014.
[26] Jinkyu Lee and Ivan Tashev, "High-level Feature Representation using Recurrent Neural Network for Speech Emotion Recognition", Department of Electrical and Electronic Engineering, Yonsei University, Seoul, Korea, and Microsoft Research, One Microsoft Way, Redmond, WA 98052, USA.
[27] Campbell, David K. "Nonlinear physics: Fresh breather". Nature. 25 November, 432 (7016): 455–456. ISSN 0028-0836. doi:10.1038/432455a, 2004.
[28] "MathWorld: Dimension". Mathworld.wolfram.com. Archived from the original on 2014-03-25. Retrieved 2014-03-03, 2014.
[29] Yau, S-T and Nadis, S. "The Shape of Inner Space", Basic Books, Chapter 4, 2010.
[30] Liu, H., Motoda, H. "Feature Selection for Knowledge Discovery and Data Mining". Kluwer Academic Publishers, Norwell, MA, USA, 1998.
[31] K. De Brabanter, P. Karsmakers, F. Ojeda, C. Alzate, J. De Brabanter, K. Pelckmans, B. De Moor, J. Vandewalle, J.A.K. Suykens, "LS-SVMlab Toolbox User's Guide version 1.8", Department of Electrical Engineering, ESAT-SCD-SISTA, Kasteelpark Arenberg 10, B-3001 Leuven-Heverlee, 2011.

Advisor

蔡志豐

Approval date

2018-3-28
