Thesis 110453036 Detailed Record




Author Wan-Li Chuang (莊婉麗)   Department In-service Master's Program, Department of Information Management
Thesis title A Study on the Consistency between Review Star Ratings and Actual Ratings Using Text Mining Techniques
(以文字探勘技術探討評論星級評分與實際評分之間的一致性研究)
File Full text available for browsing in the system after 2029-7-1
Abstract With the spread of social media, electronic word-of-mouth (eWOM) is generated and disseminated rapidly, and online reviews have become one of the important factors influencing consumers' purchase decisions. In the movie domain, viewers often leave text reviews and star ratings on platforms such as IMDb after watching a film, and these data contain rich information about user preferences. However, most existing studies focus on sentiment analysis of review text and on building recommendation systems; few predict the consistency between review star ratings and actual ratings.

To address this research gap, this study uses movie review data from the IMDb website as an example, collecting movie reviews and movie information with web crawling techniques to form the research dataset. After preprocessing, the dataset contains 21,150 records, and the top three movie genres are selected. Features are divided into text and non-text types, according to whether text mining techniques are used to derive them, for a total of nine feature types. For model construction, the study applies several regression methods to rating prediction: Random Forest, Gradient Boosting Machine, AdaBoost, and Linear Regression. The experiments train the prediction models with 10-fold cross-validation and assess prediction accuracy with regression evaluation metrics.

The research focuses on using different text vectorization techniques to improve the accuracy of movie review star rating prediction and on identifying the key features that influence ratings. Across four experiments, Doc2Vec performs best in review star rating prediction, which highlights its importance as a text feature. The study also finds that rating prediction results for the Action genre are better than those for the other three genres. Further analysis of combinations of text and non-text features shows that composite features can significantly improve prediction accuracy. Experiment 4 compares a full-feature model with a feature-selection model and finds that using all features yields better predictions on this dataset. These results demonstrate the practicality of text mining techniques and offer a new technical approach to movie review analysis.
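The evaluation protocol named in the abstract (10-fold cross-validation scored with regression metrics) can be sketched as follows. This is only an illustrative sketch, not the thesis's implementation: the abstract does not say which metrics were used, so MAE and RMSE are assumed here; the ratings are synthetic; and `fit_mean_baseline` is a hypothetical stand-in for the Random Forest, Gradient Boosting Machine, AdaBoost, and Linear Regression models.

```python
import math
import random
from statistics import mean


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def mae(y_true, y_pred):
    """Mean absolute error (assumed metric; not named in the abstract)."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def rmse(y_true, y_pred):
    """Root mean squared error (assumed metric; not named in the abstract)."""
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def fit_mean_baseline(train_ratings):
    """Hypothetical stand-in model: always predicts the mean training rating."""
    avg = mean(train_ratings)
    return lambda n: [avg] * n


def cross_validate(ratings, k=10, seed=0):
    """Return one (MAE, RMSE) pair per fold, holding each fold out once."""
    folds = k_fold_indices(len(ratings), k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        train = [ratings[j] for f, fold in enumerate(folds) if f != i for j in fold]
        test = [ratings[j] for j in test_idx]
        predict = fit_mean_baseline(train)
        preds = predict(len(test))
        scores.append((mae(test, preds), rmse(test, preds)))
    return scores


# Synthetic star ratings on a 1-10 scale, standing in for real review data.
rng = random.Random(42)
ratings = [rng.randint(1, 10) for _ in range(200)]
scores = cross_validate(ratings)
avg_mae = mean(s[0] for s in scores)
avg_rmse = mean(s[1] for s in scores)
print(f"10-fold MAE={avg_mae:.3f} RMSE={avg_rmse:.3f}")
```

In a fuller version, the baseline would be replaced by a model fitted on each fold's training features (for example, document vectors combined with non-text features), and the per-fold scores averaged in the same way.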
Keywords ★ sentiment analysis
★ text vectorization
★ feature categories
★ review star ratings
★ consistency
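Table of contents Acknowledgments
Abstract (Chinese)
Abstract (English)
Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation
1.3 Research Objectives
Chapter 2 Literature Review
2.1 Review Star Rating Prediction
2.2 Related Literature on Movie Review Star Rating Prediction
Chapter 3 Research Method
3.1 Dataset Source
3.2 Data Preprocessing
3.3 Research Variables
3.4 Experimental Design
3.5 Data Validation and Evaluation Metrics
Chapter 4 Empirical Results and Analysis
4.1 Experimental Results
4.2 General Discussion
Chapter 5 Conclusions and Suggestions
5.1 Research Conclusions and Contributions
5.2 Research Limitations
5.3 Future Research Directions and Suggestions
English References
Chinese References
Appendix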
Advisor Ya-Han Hu (胡雅涵)   Approval date 2024-6-24

For questions about this thesis, contact the Extension Services Section of the National Central University Library, TEL: (03)422-7151 ext. 57407, or by e-mail.