| Abstract: | As the digital age advances, consumers increasingly look up reviews of products and services. Electronic word-of-mouth (e-WOM) has come to play a major role in consumer decisions, even surpassing the influence of traditional marketing channels. Consumers now often consult the Google Maps review system to gather information about merchants and other users' experiences before deciding whether to visit. However, inaccurate or false reviews harm both consumers and merchants, so an effective consistency detection mechanism is crucial. To address this problem, this study first collects review data from popular travel and dining platforms and uses multiple sentiment analysis methods to label each review, relative to its star rating, as consistent or inconsistent. Next, a bag-of-words analysis based on CKIP (Chinese Knowledge and Information Processing) is applied, with TF-IDF used to compute the importance of each term, and three vector representation techniques, SBERT, FastText, and Doc2Vec, convert the text into a vector space. The dataset is then used to train and test several machine learning algorithms, namely Random Forest (RF), Logistic Regression (LR), K Nearest Neighbor (kNN), Gradient Boosting (GB), and Adaptive Boosting (AdaBoost), in order to evaluate model performance. In addition, to understand the specificity and sensitivity of the models, the study measures the evaluation metrics and presents the results under adjusted classification thresholds. The study thus addresses the inconsistency problem in e-WOM and provides a technical framework, based on multiple word vector representations, to effectively detect inconsistencies in product and service reviews. The results help improve the reliability of e-WOM in business decisions, helping consumers make more informed choices while reducing the unnecessary management burden on merchants. |
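
The abstract mentions adjusting the classification threshold to examine sensitivity and specificity. As a minimal sketch of that idea only, assuming a binary setup in which label 1 means "inconsistent" and a classifier outputs a probability for that class, the trade-off could be computed as below. The function name `sensitivity_specificity`, the scores, and the labels are all hypothetical illustrations and are not taken from the study.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) at a given decision threshold.

    Assumption: label 1 = inconsistent review (positive class),
    label 0 = consistent review (negative class).
    """
    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Hypothetical predicted probabilities and ground-truth labels.
scores = [0.92, 0.81, 0.45, 0.30, 0.65, 0.20, 0.55, 0.10]
labels = [1, 1, 1, 0, 0, 0, 1, 0]

# Sweep a few thresholds to see how the two metrics move against each other.
for t in (0.3, 0.5, 0.7):
    sens, spec = sensitivity_specificity(scores, labels, t)
    print(f"threshold={t:.1f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

Lowering the threshold flags more reviews as inconsistent, which raises sensitivity at the cost of specificity; raising it does the reverse. Which operating point is preferable depends on whether missed inconsistencies or false alarms are more costly in the intended application.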