| Abstract: | With the rapid development of generative AI and online manipulation techniques, fake reviews have become increasingly realistic and increasingly distort consumer decisions. Earlier detection approaches that rely on a single model struggle with the complex characteristics of fake reviews and with class imbalance in the data, which limits overall performance. To address this, this study proposes an improved Stacking ensemble framework that strengthens fake review detection and counters the classification bias caused by data imbalance. The experiments use the Yelp review dataset. In the first layer, the most suitable Base Learner (SVM, Random Forest, or LightGBM) is selected for each feature type. In the second layer, a Meta Learner (XGBoost, LightGBM, CNN, or LR) combines the fake review probabilities predicted by the first-layer Base Learners with the original features to make the final classification. The study first identifies the best model for each feature type: LightGBM for behavioral features, Random Forest for textual features, and SVM for semantic features. The fake review probabilities from these feature-model pairs are then fed, together with the original features, into the second-layer Meta Learner. Experiments confirm that including the original features improves classification performance, and among the Meta Learners, LightGBM achieves the highest accuracy, supporting its use as the integration model. Finally, the proposed method is compared with previous Stacking and non-Stacking approaches. It outperforms the previous studies on all evaluation metrics, indicating that the proposed model design effectively improves fake review detection. |
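
The two-layer design described in the abstract can be illustrated with a minimal sketch. It is not the study's implementation: a small hand-written logistic regression stands in for every learner (SVM, Random Forest, LightGBM, and the Meta Learner), the data are synthetic, and the names `fit_logreg`, `out_of_fold_proba`, and `make_group` are hypothetical. The abstract does not say how the first-layer probabilities are produced; out-of-fold prediction is assumed here because it is a common way to keep label information from leaking into the second layer.

```python
import math
import random

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in math.exp
    return 1.0 / (1.0 + math.exp(-z))

def fit_logreg(X, y, epochs=100, lr=0.1):
    """Logistic regression trained by SGD; a stand-in for any probabilistic learner."""
    w = [0.0] * (len(X[0]) + 1)  # last entry is the bias term
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + w[-1])
            g = p - yi  # gradient of the log loss w.r.t. the logit
            for j, xj in enumerate(xi):
                w[j] -= lr * g * xj
            w[-1] -= lr * g
    return w

def predict_proba(w, X):
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + w[-1]) for xi in X]

def out_of_fold_proba(X, y, k=5):
    """Each row's probability comes from a model that never saw that row."""
    n = len(X)
    oof = [0.0] * n
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        w = fit_logreg([X[i] for i in train_idx], [y[i] for i in train_idx])
        for i, p in zip(test_idx, predict_proba(w, [X[i] for i in test_idx])):
            oof[i] = p
    return oof

# Synthetic, imbalanced data (assumption): about 20% of reviews are fake (label 1).
random.seed(0)
n = 200
y = [1 if random.random() < 0.2 else 0 for _ in range(n)]

def make_group(shift, dim=2):
    # Features of fake reviews are shifted so each group carries some signal.
    return [[random.gauss(shift * yi, 1.0) for _ in range(dim)] for yi in y]

groups = {
    "behavioral": make_group(1.5),
    "textual": make_group(1.0),
    "semantic": make_group(0.8),
}

# Layer 1: one base learner per feature type, each emitting a fake-review probability.
layer1 = {name: out_of_fold_proba(X, y) for name, X in groups.items()}

# Layer 2 input: the three probabilities plus all original features.
meta_X = [
    [layer1[name][i] for name in groups]
    + [v for name in groups for v in groups[name][i]]
    for i in range(n)
]

# Meta learner; probabilities below are in-sample, so a real evaluation
# would score a held-out test set instead.
meta_w = fit_logreg(meta_X, y)
final_proba = predict_proba(meta_w, meta_X)
```

Each row of `meta_X` holds three base-learner probabilities followed by the six original feature values, which mirrors the abstract's point that the Meta Learner sees both the first-layer outputs and the raw features.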