When Words and Scores Diverge: A Machine Learning Study on Wine Review Inconsistencies


Please use this identifier to cite or link to this item: https://ir.lib.ncu.edu.tw/handle/987654321/98279

Title:	When Words and Scores Diverge: A Machine Learning Study on Wine Review Inconsistencies
Authors:	石佩玉;Shih, Pei-Yu
Contributors:	Executive Master Program, Department of Information Management
Keywords:	wine review;review inconsistency;machine learning;sentiment scoring;e-WOM
Date:	2025-07-23
Issue Date:	2025-10-17 12:34:34 (UTC+8)
Publisher:	National Central University
Abstract:	In today's digital consumption environment, online reviews play a critical role in shaping consumer decisions, particularly for experiential products like wine. However, inconsistencies between textual reviews and their accompanying scores are frequently observed, potentially misleading users and undermining trust in review platforms. This study investigates this issue by constructing a binary classification model that detects whether a review's text is consistent with its numerical rating, using data collected from the Vivino platform. Over 720,000 user reviews were collected via web scraping and labeled by the author for consistency based on discrepancies between text content and star ratings. Textual features were extracted using both Term Frequency-Inverse Document Frequency (TF-IDF) and Sentence-BERT (SBERT), and four classification algorithms were applied: Random Forest (RF), Neural Network (NN), Extreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM). To enhance model performance, the study applied several filtering criteria, such as minimum like counts and document frequency (DF) thresholds, and used SMOTE to address class imbalance. Experimental results indicate that LightGBM consistently outperformed the other models across multiple data conditions. While SBERT offered superior semantic representation, TF-IDF exhibited stronger performance in high-document-frequency settings.
This research not only proposes an effective automated framework for detecting review-score inconsistencies but also contributes to broader discussions on the credibility and transparency of user-generated content on digital platforms.
Appears in Collections:	[Executive Master of Information Management] Electronic Thesis & Dissertation

Files in This Item:

File	Description	Size	Format
index.html		0Kb	HTML	172	View/Open
