Thesis/Dissertation 964203050 Detailed Record




Name 周嘉宏 (Chia-Hung Chou)   Department Department of Information Management
Thesis title Using Feature-Opinion Pair to Create Vector Space Model for User Review Classification
(利用feature-opinion pair建立向量空間模型以進行使用者評論分類之研究)
Files Full text in the system is never open to the public
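Abstract (Chinese) With the rapid growth of the Internet, the rise of e-commerce, and the widespread adoption of Web 2.0 technologies, more and more people express their opinions about products and services online. Many discussion forums, professional review sites (e.g. epinions.com, Amazon), and personal blogs also give users a place to voice their views. Online reviews are therefore an important source of reference information for both buyers and sellers. However, online reviews usually mix positive and negative opinions, and extracting useful information from them manually takes considerable time and effort. How to aggregate and analyze large volumes of online text, especially user reviews rich in semantic information, through automated opinion mining is thus an important research issue.
  A review of prior opinion mining research shows that a feature representation is used to reflect the characteristics of online review articles, and that feature selection then supplies the input from which a classification model is trained. This study finds that the feature representation most commonly adopted in review classification is the frequency of single words. For a classifier, this type of representation tends to produce excessive dimensionality or added noise, which degrades classification performance. This study therefore improves the feature representation by using feature-opinion pairs as the features of the vector space model, so that the representation carries more semantic information.
  The improved feature representation proposed in this study is based on supervised learning algorithms and classifies articles according to their characteristics. The extracted product and service features (feature) and user opinions (opinion) are combined into feature-opinion pairs, which are used to build the vector space model. A support vector machine (SVM) serves as the classifier and is tested on the dataset we collected. Experimental results show that the proposed method effectively reduces the dimensionality of the vector space model and improves classification accuracy.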
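The idea of replacing single-word features with feature-opinion pairs can be illustrated with a minimal sketch. It is not the extraction procedure used in the thesis, which the abstract does not describe. The word lists `FEATURE_WORDS` and `OPINION_WORDS`, the two-token pairing window, the helper names, and the two sample reviews are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy lexicons, for illustration only; the abstract does not
# say how the thesis identifies feature words or opinion words.
FEATURE_WORDS = {"battery", "screen", "price"}
OPINION_WORDS = {"great", "poor", "sharp", "high"}


def tokenize(sentence):
    """Lowercase a sentence and split it into words, dropping commas and periods."""
    return sentence.lower().replace(",", " ").replace(".", " ").split()


def extract_pairs(sentence, window=2):
    """Pair each feature word with opinion words at most `window` tokens away."""
    tokens = tokenize(sentence)
    pairs = []
    for i, token in enumerate(tokens):
        if token not in FEATURE_WORDS:
            continue
        nearby = tokens[max(0, i - window): i + window + 1]
        pairs.extend((token, word) for word in nearby if word in OPINION_WORDS)
    return pairs


def build_vsm(reviews, extract):
    """Return (vocabulary, count vectors) for the terms produced by `extract`."""
    counts = [Counter(extract(review)) for review in reviews]
    vocabulary = sorted({term for c in counts for term in c})
    vectors = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, vectors


# Made-up sample reviews, not data from the thesis.
reviews = [
    "The battery is great and the screen is sharp.",
    "The price is high, and the battery is poor.",
]

bow_vocab, bow_vectors = build_vsm(reviews, tokenize)
pair_vocab, pair_vectors = build_vsm(reviews, extract_pairs)

print(len(bow_vocab), "bag-of-words dimensions")
print(len(pair_vocab), "feature-opinion pair dimensions")
print(pair_vocab)
```

On these two sentences the single-word vocabulary has 10 dimensions and the pair vocabulary has 4. This only shows the mechanism on toy input and says nothing about the size of the reduction reported in the thesis.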
Abstract (English) The emergence of the Internet has created spaces (e.g. epinions.com, amazon.com) where users can freely express opinions and exchange experiences regarding products, services, and public issues. Nowadays a great amount of referral information can be obtained from a variety of information sources, including product profiles, recommendations, expert opinions, and so forth. However, identifying the semantic orientation of this referral information requires a lot of human effort. Therefore, the study of opinion mining has been extended to this field.
In prior studies of opinion mining, feature representation has been a key component. Bag-of-words is one of the most popular feature representations; it describes review content as a set of single words. However, a bag-of-words model applied to online reviews usually lacks semantic information and significantly increases the vector dimension, which reduces the performance of machine learning classifiers.
This study proposes a modified feature representation method for building the vector space model. Feature-opinion pairs are extracted from product features and user comments at the sentence level. We use a support vector machine as our classification method to test our dataset. The experiments indicate that the proposed method not only increases classification accuracy but also reduces time cost by using fewer dimensions. Finally, we expect that our system could be used to solve the high-dimension problem in review classification.
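The classification step can be sketched as follows, under stated assumptions. The abstract names a support vector machine but not an implementation, kernel, or parameters, so this sketch swaps in a plain linear SVM trained by subgradient descent on the regularised hinge loss, with no bias term. The pair vocabulary, the four training vectors, their labels, and the values of `lam`, `eta`, and `epochs` are hypothetical.

```python
# Hypothetical pair vocabulary (one vector dimension per feature-opinion pair).
PAIR_VOCAB = [
    ("battery", "great"),
    ("battery", "poor"),
    ("price", "high"),
    ("screen", "sharp"),
]

# Made-up pair-count vectors and labels (+1 positive review, -1 negative review).
X = [
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
]
y = [1, -1, 1, -1]


def train_linear_svm(X, y, lam=0.01, eta=0.1, epochs=50):
    """Fit weights by subgradient descent on L2-regularised hinge loss (no bias term)."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for x, label in zip(X, y):
            margin = label * sum(wj * xj for wj, xj in zip(w, x))
            # Gradient step on the L2 penalty shrinks every weight slightly.
            w = [wj * (1.0 - eta * lam) for wj in w]
            # Examples inside the margin also pull the weights toward their label.
            if margin < 1.0:
                w = [wj + eta * label * xj for wj, xj in zip(w, x)]
    return w


def predict(w, x):
    """Return +1 or -1 according to the sign of the linear score."""
    score = sum(wj * xj for wj, xj in zip(w, x))
    return 1 if score >= 0 else -1


weights = train_linear_svm(X, y)
predictions = [predict(weights, x) for x in X]
print(predictions)

# An unseen review containing only the pair ("battery", "poor").
print(predict(weights, [0, 1, 0, 0]))
```

Because each dimension is a feature-opinion pair, every learned weight can be read directly as the polarity the classifier assigns to that pair, which is harder to do with single-word dimensions.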
Keywords (Chinese) ★ 意見探勘 (opinion mining)
★ 評論分類 (review classification)
★ 機器學習 (machine learning)
Keywords (English) ★ opinion mining
★ review classification
★ machine learning
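Table of Contents
Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Research Purpose and Scope
1.3 Research Limitations
1.4 Research Process
1.5 Thesis Structure
Chapter 2 Literature Review
2.1 Opinion Mining
2.2 Opinion Mining Techniques
2.3 Movie Review Classification
Chapter 3 System Design
3.1 Research Concept
3.2 System Architecture
Chapter 4 Experimental Results and Discussion
4.1 Experimental Design
4.2 Experimental Results
4.3 Experimental Discussion
Chapter 5 Conclusion
5.1 Conclusions and Contributions
5.2 Future Research Directions
References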
Advisors 宋鎧 (Kai Sung), 周世傑 (Shih-Chieh Chou)
Approval date 2009-7-3