References
[1] 李淑惠 (2014). 運用文字探勘技術於口碑分析之研究 [In Chinese]. Master's thesis, Department of Information Management, Soochow University.
[2] Gomaa, W. H. and Fahmy, A. A. (2013). A Survey of Text Similarity Approaches. International Journal of Computer Applications, 68(13), pp.13-18.
[3] Liu, Q. and Li, S. (2002). Word Similarity Computing Based on How-net. The Association for Computational Linguistics and Chinese Language Processing, [online] 7(2), pp.59-76. Available at: https://aclweb.org/anthology/O/O02/O02-2003.pdf [Accessed 29 Jun. 2017].
[4] Cheng, S. and Liang, T. (2005). 中文句子相似度之計算與應用 (Chinese Sentence Similarity Computing and Appling) [In Chinese]. ROCLING, pp.1-2.
[5] Gan, Z. (2017). A Document Similarity Measure and Its Applications. NSYSU.
[6] Kruse, H. and Mukherjee, A. (n.d.). Preprocessing text to improve compression ratios. Proceedings DCC '98 Data Compression Conference (Cat. No.98TB100225).
[7] Yao, Z. and Ze-wen, C. (2011). Research on the Construction and Filter Method of Stop-word List in Text Preprocessing. 2011 Fourth International Conference on Intelligent Computation Technology and Automation.
[8] Saad, M. K. (2010). The impact of text preprocessing and term weighting on arabic text classification. Gaza: Computer Engineering, the Islamic University.
[9] Wilbur, W. J., & Sirotkin, K. (1992). The automatic identification of stop words. Journal of information science, 18(1), 45-55.
[10] El-Khair, I. A. (2006). Effects of stop words elimination for Arabic information retrieval: a comparative study. International Journal of Computing & Information Sciences, 4(3), 119-133.
[11] Paice, C. D. (1994, August). An evaluation method for stemming algorithms. In Proceedings of the 17th annual international ACM SIGIR conference on Research and development in information retrieval (pp. 42-50). Springer-Verlag New York, Inc.
[12] Hull, D. A. (1996). Stemming algorithms: A case study for detailed evaluation. JASIS, 47(1), 70-84.
[13] Lovins, J. B. (1968). Development of a stemming algorithm. Mech. Translat. & Comp. Linguistics, 11(1-2), 22-31.
[14] Porter, M. F. (1980). An algorithm for suffix stripping. Program, 14(3), 130-137.
[15] Jivani, A. G. (2011). A comparative study of stemming algorithms. Int. J. Comp. Tech. Appl, 2(6), 1930-1938.
[16] Li, H., Cao, Y., Petzold, L. R., & Gillespie, D. T. (2008). Algorithms and software for stochastic simulation of biochemical reacting systems. Biotechnology progress, 24(1), 56-61.
[17] Dijkman, R. M., Dumas, M., & García-Bañuelos, L. (2009, September). Graph Matching Algorithms for Business Process Model Similarity Search. In BPM (Vol. 5701, pp. 48-63).
[18] Yang, Y., & Pedersen, J. O. (1997, July). A comparative study on feature selection in text categorization. In ICML (Vol. 97, pp. 412-420).
[19] Ikonomakis, M., Kotsiantis, S., & Tampakas, V. (2005). Text classification using machine learning techniques. WSEAS transactions on computers, 4(8), 966-974.
[20] Figueroa, Alejandro (2015). Exploring effective features for recognizing the user intent behind web queries. Computers in Industry, 68, 162–169.
[21] Zhang, Y., Wang, S., Phillips, P. and Ji, G. (2014). Binary PSO with mutation operator for feature selection using decision tree applied to spam detection. Knowledge-Based Systems, 64, pp.22-31.
[22] López, F. G., Torres, M. G., Batista, B. M., Pérez, J. A. M., & Moreno-Vega, J. M. (2006). Solving feature subset selection problem by a parallel scatter search. European Journal of Operational Research, 169(2), 477-489.
[23] García-Torres, M., García-López, F., Melián-Batista, B., Moreno-Pérez, J. A., & Moreno-Vega, J. M. (2004). Solving feature subset selection problem by a hybrid metaheuristic. Hybrid Metaheuristics, 59-68.
[24] Ramos, J. (2003, December). Using tf-idf to determine word relevance in document queries. In Proceedings of the first instructional conference on machine learning (Vol. 242, pp. 133-142).
[25] Aizawa, A. (2003). An information-theoretic perspective of tf–idf measures. Information Processing & Management, 39(1), 45-65.
[26] Li, K. C. (1991). Sliced inverse regression for dimension reduction. Journal of the American Statistical Association, 86(414), 316-327.
[27] Fodor, I. K. (2002). A survey of dimension reduction techniques (No. UCRL-ID-148494). Lawrence Livermore National Lab., CA (US).
[28] Wold, S., Esbensen, K., & Geladi, P. (1987). Principal component analysis. Chemometrics and intelligent laboratory systems, 2(1-3), 37-52.
[29] Abdi, H., & Williams, L. J. (2010). Principal component analysis. Wiley interdisciplinary reviews: computational statistics, 2(4), 433-459.
[30] Schölkopf, B., Smola, A., & Müller, K. R. (1997). Kernel principal component analysis. Artificial Neural Networks—ICANN'97, 583-588.
[31] Landauer, T. K., Foltz, P. W., & Laham, D. (1998). An introduction to latent semantic analysis. Discourse processes, 25(2-3), 259-284.
[32] Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer, T. K., & Harshman, R. (1990). Indexing by latent semantic analysis. Journal of the American society for information science, 41(6), 391.
[33] Blog.csdn.net. (2015). [online] Available at: http://blog.csdn.net/zhzhji440 [Accessed 6 Jul. 2017].
[34] T. W. Schoenharl and G. Madey. Evaluation of measurement techniques for the validation of agent-based simulations against streaming data. International Conference on Computational Science, 2008.
[35] J. Han and M. Kamber. Data Mining: Concepts and Techniques. Second Edition, Morgan Kaufmann, Elsevier, 2006.
[36] Li, Y., McLean, D., Bandar, Z. A., O'Shea, J. D., & Crockett, K. (2006). Sentence similarity based on semantic nets and corpus statistics. IEEE transactions on knowledge and data engineering, 18(8), 1138-1150.
[37] Achananuparp, P., Hu, X., & Shen, X. (2008). The evaluation of sentence similarity measures. Data warehousing and knowledge discovery, 305-316.
[38] Islam, A., & Inkpen, D. (2008). Semantic text similarity using corpus-based word similarity and string similarity. ACM Transactions on Knowledge Discovery from Data (TKDD), 2(2), 10.
[39] Gomaa, W. H., & Fahmy, A. A. (2013). A survey of text similarity approaches. International Journal of Computer Applications, 68(13).
[40] Cilibrasi, R. L., & Vitanyi, P. M. (2007). The Google similarity distance. IEEE Transactions on knowledge and data engineering, 19(3).