Using relative weights to build vector model to classify customer reviews

Name

解少帆 (Shao-Fan Xie)

Department

Department of Information Management

Thesis Title

Using relative weights to build vector model to classify customer reviews
(Original title: 對使用者評論利用相對權重建立向量模型進行分類之研究)

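Related Theses

★ A Decision Support System for Creating SMS Rules to Prevent Credit Card Fraud
★ A Comparison of the Effectiveness of Different Retrieval Strategies
★ An Investigation of Factors Influencing the Knowledge Sharing Process
★ Construction and Evaluation of a Retrieval Agent System with Sharing Functions
★ A Study of Computer Attitudes and Learning Self-Efficacy among Juvenile Offenders
★ A Study of Software Metrics Issues Using the AHP Method
★ Optimizing an Intrusion Rule Base
★ A Study on Improving the Efficiency and Quality of Business Information Extraction
★ Measuring the Key Factors in the Banking Industry's Adoption of Enterprise Application Integration (EAI) Systems Using the Analytic Hierarchy Process
★ Applying Genetic Algorithms to Find a Near-Optimal Layout of Forced-Convection Devices in Cluster Computer Rooms
★ The Development of a CASE Tool with Knowledge Management Functions
★ A Fast Search Index Tree Based on the PAT Tree
★ A Method for Building Document Concepts Based on Compound Nouns
★ Using User Interest Profiles to Explore the Importance of Adjective Position for Review Classification
★ Helping Users Filter Web Document Search Results through Semi-Structured Information and User Feedback
★ A Study on Classifying User Reviews by Building a Vector Space Model from Feature-Opinion Pairs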

Files

Browse the thesis in the system (never open to the public)

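Abstract (Chinese)

With the growth of the Internet and Web 2.0, review sites and personal blogs are filled with user reviews of products and services. These reviews have gradually become an important reference for consumers before they make a purchase, so classifying them accurately and efficiently helps consumers make faster and better decisions.
Previous opinion mining work classified user reviews mainly by analyzing the semantic orientation of the text. However, the opinion words used in reviews of different product types may not carry the same meaning, and the strength of their sentiment may also differ. To improve classification accuracy, this study proposes a method that uses a set of already classified reviews as feedback information to compute the relative weight of each opinion word, builds a vector model from these relative weights, and measures accuracy with a machine learning classifier.
The experimental results show that the vector model built from relative weights achieves higher accuracy than the vector model built with semantic orientation as the weight. Therefore, when classifying user reviews, information obtained from a portion of reviews with known classes can be applied to unclassified reviews to obtain better classification accuracy.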

Abstract (English)

As the Internet and Web 2.0 become more popular, the number of customer reviews on websites and blogs is growing rapidly. These reviews have gradually become an important reference for consumers. Therefore, accurate and efficient classification of these reviews will help consumers make decisions quickly and correctly.
Past studies usually applied opinion mining or semantic orientation to classify reviews. However, the meaning of an opinion word may differ across different types of reviews, and the degree of its semantic orientation may differ as well. To improve classification accuracy, this study proposes a method that uses already classified documents as feedback information and then calculates the relative weight of each opinion word to build the vector model. The experimental results show that classification with relative weights is more accurate than classification with semantic weights.
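
The pipeline described in the abstract (labeled reviews as feedback, relative weights for opinion words, a vector model, then a learned classifier) can be illustrated with a minimal sketch. This record does not give the thesis's weighting formula or name its learner, so everything specific here is an assumption made for illustration: the weight is taken to be a smoothed log-ratio of an opinion word's frequency in positive versus negative feedback reviews, a nearest-centroid classifier with cosine similarity stands in for the machine learning step, and the toy reviews and the names `relative_weights`, `to_vector`, and `classify` are hypothetical.

```python
import math
from collections import Counter


def relative_weights(labeled_reviews, opinion_words):
    """ASSUMED weighting (not the thesis formula): add-one smoothed log-ratio
    of a word's relative frequency in positive vs. negative feedback reviews."""
    pos, neg = Counter(), Counter()
    for tokens, label in labeled_reviews:
        target = pos if label == "pos" else neg
        target.update(t for t in tokens if t in opinion_words)
    pos_total, neg_total = sum(pos.values()), sum(neg.values())
    v = len(opinion_words)
    weights = {}
    for w in opinion_words:
        p = (pos[w] + 1) / (pos_total + v)
        n = (neg[w] + 1) / (neg_total + v)
        weights[w] = math.log(p / n)
    return weights


def to_vector(tokens, vocab, weights):
    """One dimension per opinion word: term count times relative weight."""
    counts = Counter(tokens)
    return [counts[w] * weights[w] for w in vocab]


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def classify(tokens, vocab, weights, centroids):
    """Stand-in learner: pick the class whose centroid is most similar."""
    vec = to_vector(tokens, vocab, weights)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Hypothetical feedback set: reviews that are already classified.
feedback = [
    ("great battery , long life".split(), "pos"),
    ("long battery life".split(), "pos"),
    ("poor screen".split(), "neg"),
    ("poor service and long wait".split(), "neg"),
]
opinion_words = {"great", "long", "poor"}
vocab = sorted(opinion_words)

weights = relative_weights(feedback, opinion_words)
centroids = {
    label: centroid([to_vector(t, vocab, weights) for t, l in feedback if l == label])
    for label in ("pos", "neg")
}

print(weights)
print(classify("long battery life".split(), vocab, weights, centroids))
print(classify("poor sound".split(), vocab, weights, centroids))
```

In this toy feedback set, "long" appears more often in positive reviews than in negative ones, so it receives a positive weight. A different product domain could give the same word a negative weight, which is the domain dependence the abstract points to as the weakness of fixed semantic orientation scores.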

Keywords (Chinese)

★ vector model
★ machine learning
★ opinion mining
★ feedback information

Keywords (English)

★ feedback
★ vector model
★ machine learning
★ opinion mining

Table of Contents

Chapter 1 Introduction
Section 1 Research Motivation
Section 2 Research Objectives
Section 3 Research Limitations
Section 4 Research Process
Section 5 Thesis Organization
Chapter 2 Literature Review
Section 1 Opinion Mining
Section 2 Sentiment Classification
Section 3 Feature-Opinion Pair
Section 4 Relevance Feedback
Chapter 3 System Design
Section 1 Research Concept
Section 2 System Architecture
1. Preprocessor
2. Weight Builder
3. Vector Builder
4. Classifier
Chapter 4 Experimental Results and Discussion
Section 1 Experimental Design
Section 2 Resources Used
Section 3 Experimental Results
1. Building the Vector Model from Opinion Words
2. Building the Vector Model from Feature-Opinion Pairs
Section 4 Discussion of Experimental Results
Chapter 5 Conclusion
Section 1 Research Conclusions and Contributions
Section 2 Future Research Directions
1. Refining the Algorithm
2. Term Selection
3. Dataset Selection
4. Adding Grammar Rules
References

Advisor

周世傑(Shin-Chieh Chou)

Approval Date

2010-7-26
