dc.description.abstract | With advances in science and technology and the rapid growth of data analysis, more and more large enterprises across industries are trying to use data mining methods to convert the large amounts of data obtained from sales into useful information, in order to reduce costs or increase profits. In this context, many data exploration tools and associated languages have been developed.
This research selects two classification tools, LibSVM and LS-SVM, from the many data analysis tools available and compares them on two data sets, HIGGS and covertype. The experiments cross-analyze different random samples of training and testing data, different kernel functions, and the two SVM tools. The experimental data are then used to assess which combination gives the analyst the greatest benefit. Subsequent researchers who want to use SVM tools for data analysis can draw on this research to choose a better combination for the type of data set they are analyzing.
The results of this experiment mainly list the time and accuracy of data analysis, as well as the ratio of the increase in time/accuracy rate. The goal is to find the combination that performs well in both time and accuracy and for which this ratio is low. The results show that LibSVM with the linear kernel achieves less time and higher accuracy on the HIGGS and covertype data sets, but its efficiency decreases as the number of training data increases. Furthermore, LS-SVM achieves a better accuracy rate but a longer analysis time, and as the training data increases its efficiency is lower than that of LibSVM under the same conditions.
| en_US |