利用文字探勘技術探討暴雷因子對電影評論有用性之影響;Exploring the Impact of Spoiler Factors on the Helpfulness of Movie Reviews Using Text Mining Technology

Please use this permanent URL to cite or link to this item: https://ir.lib.ncu.edu.tw/handle/987654321/98234

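Title:	利用文字探勘技術探討暴雷因子對電影評論有用性之影響;Exploring the Impact of Spoiler Factors on the Helpfulness of Movie Reviews Using Text Mining Technology
Author:	Hsu, Chih-Wen (許智雯)
Contributor:	Department of Information Management, In-Service Master's Program (資訊管理學系在職專班)
Keywords:	IMDb; review spoilers; review helpfulness; text mining; machine learning
Date:	2025-07-03
Upload time:	2025-10-17 12:31:31 (UTC+8)
Publisher:	National Central University (國立中央大學)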
Abstract:	With the rapid growth of digital media and online platforms, movie reviews have become a key reference for audiences choosing films. Large review sites such as IMDb and Rotten Tomatoes offer abundant user reviews and rating mechanisms, but the sheer volume often causes information overload and makes it hard for users to quickly identify reviews worth reading. To help users judge review quality more effectively, this study focuses on predicting and ranking review helpfulness. Whether a review contains spoilers may affect how helpful it is perceived to be, particularly because audience tolerance for plot disclosure differs across movie genres. Using IMDb movie reviews from three genres (Action, Adventure, and Drama), the study investigates how spoiler factors affect the accuracy of helpfulness prediction. For feature extraction, it applies three text representation methods: Term Frequency-Inverse Document Frequency (TF-IDF), FastText, and Sentence-BERT (SBERT). The resulting feature vectors are used as input to five regression models, namely Linear Regression, Random Forest, k-Nearest Neighbors (kNN), Extreme Gradient Boosting (XGBoost), and Adaptive Boosting (AdaBoost), which are trained and compared on prediction performance. The results show that TF-IDF combined with boosting-based models such as XGBoost or AdaBoost performs best in most scenarios. In particular, the combination of TF-IDF and XGBoost performs best on Drama reviews that contain spoilers, suggesting that spoiler content can, in certain contexts, increase the perceived helpfulness of a review. These findings can help review platforms design finer-grained presentation strategies for spoiler information in their ranking and recommendation systems, so that users can quickly filter the reviews most valuable to them according to their preferences, improving user experience and retrieval efficiency.
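The feature-plus-regressor pipeline described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: it uses only the Python standard library, pairs a hand-written smoothed TF-IDF with a cosine-similarity kNN regressor (one of the five model families named above), and runs on four invented toy reviews with made-up helpfulness scores. All function names, the tokenizer, the IDF smoothing, and `k=2` are assumptions chosen for illustration.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase the text and split on any non-alphanumeric character."""
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return cleaned.split()

def fit_idf(docs):
    """Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1."""
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(tokenize(doc)))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}

def tfidf_vector(text, idf):
    """Sparse, L2-normalised TF-IDF vector; terms unseen in training are dropped."""
    counts = Counter(tok for tok in tokenize(text) if tok in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}

def cosine(a, b):
    """Cosine similarity of two L2-normalised sparse vectors (a dot product)."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())

def knn_predict(query_vec, train_vecs, train_targets, k=2):
    """Average the targets of the k training vectors most similar to the query."""
    scored = sorted(
        zip((cosine(query_vec, v) for v in train_vecs), train_targets),
        key=lambda pair: pair[0],
        reverse=True,
    )
    top = scored[:k]
    return sum(target for _, target in top) / len(top)

# Invented toy data: NOT taken from the thesis or from IMDb.
train_reviews = [
    "the twist ending reveals the hero was the villain all along",
    "great action scenes and a fast plot",
    "the ending reveals who dies in the final battle",
    "boring plot and weak acting",
]
train_helpfulness = [0.9, 0.4, 0.8, 0.2]  # hypothetical helpfulness scores

idf = fit_idf(train_reviews)
train_vecs = [tfidf_vector(r, idf) for r in train_reviews]

query = "the ending reveals the twist"
predicted = knn_predict(tfidf_vector(query, idf), train_vecs, train_helpfulness, k=2)
print(f"predicted helpfulness: {predicted:.2f}")
```

In this toy setup the query shares vocabulary only with the two spoiler-style reviews, so the prediction is the mean of their scores. A study on the scale described in the abstract would more plausibly rely on an established library for the vectorizer and the boosting models; the sketch only shows the shape of the text-to-vector-to-regressor flow.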
Appears in Collections:	[In-Service Master's Program, Department of Information Management] Theses and Dissertations

Files in This Item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	65

All items in NCUIR are protected by the original copyright.
