Stacking-Based Fake Review Detection Method Enhanced by Feature-Oriented Modeling and Raw Feature Injection

Please use this permanent URL to cite or link to this item: https://ir.lib.ncu.edu.tw/handle/987654321/98349

Title:	Stacking-Based Fake Review Detection Method Enhanced by Feature-Oriented Modeling and Raw Feature Injection
Author:	Lin, Ying-Jung (林嫈容)
Contributor:	Department of Information Management
Keywords:	Machine Learning; Ensemble Learning; Stacking; Boosting; Fake Review Detection
Date:	2025-07-23
Uploaded:	2025-10-17 12:39:44 (UTC+8)
Publisher:	National Central University
Abstract:	With the rapid advancement of generative AI and online manipulation techniques, fake reviews have become increasingly realistic and increasingly affect consumer decisions. Earlier detection approaches that rely on a single model struggle with the complex features of fake reviews and with class imbalance, which limits overall performance. To address this, this study proposes an improved Stacking ensemble framework that strengthens fake review detection and counters the classification bias caused by data imbalance. The experiments use the Yelp review dataset. In the first layer, the most suitable Base Learner (SVM, Random Forest, or LightGBM) is selected for each feature type. In the second layer, a Meta Learner (XGBoost, LightGBM, CNN, or LR) combines the fake review probabilities predicted by the first-layer Base Learners with the original features to make the final classification. The study first identified the best model for each feature type: LightGBM for behavioral features, Random Forest for textual features, and SVM for semantic features. The fake review probabilities output by these feature-model pairs were then fed, together with the original features, into the second-layer Meta Learner, and the experiments confirm that including the original features improves classification performance. Among the Meta Learners, LightGBM achieved the highest accuracy, supporting its use as the integration model.
Finally, to validate the advantages of the proposed method, its performance was compared with previous Stacking and non-Stacking approaches. The results show that the proposed method outperformed the earlier studies on all evaluation metrics, confirming that the proposed model design effectively improves fake review detection performance.
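The two-layer data flow described in the abstract can be illustrated with a minimal, dependency-free sketch. Everything concrete here is an assumption made for illustration and is not taken from the thesis: the synthetic data, the feature dimensions, the helper names (`fit_logreg`, `out_of_fold_proba`, `build_meta_features`), and the use of out-of-fold predictions, which is common stacking practice but is not stated in the abstract. A small logistic regression stands in for every learner; the thesis itself reports SVM, Random Forest, and LightGBM as Base Learners and XGBoost, LightGBM, CNN, and LR as Meta Learners.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logreg(X, y, epochs=200, lr=0.1):
    """Stand-in learner: logistic regression fitted by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * gj / n for wj, gj in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def predict_proba(model, X):
    w, b = model
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def out_of_fold_proba(X, y, k=5):
    """Score every sample with a model that was not trained on that sample."""
    n = len(X)
    oof = [0.0] * n
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        model = fit_logreg([X[i] for i in train_idx], [y[i] for i in train_idx])
        probs = predict_proba(model, [X[i] for i in test_idx])
        for i, p in zip(test_idx, probs):
            oof[i] = p
    return oof


def build_meta_features(groups, y, inject_raw=True, k=5):
    """One base-learner probability per feature group, optionally followed
    by the raw features of every group (the 'raw feature injection')."""
    oof_by_group = [out_of_fold_proba(X, y, k) for X in groups.values()]
    meta = [[oof[i] for oof in oof_by_group] for i in range(len(y))]
    if inject_raw:
        for i, row in enumerate(meta):
            for X in groups.values():
                row.extend(X[i])
    return meta


# Hypothetical, synthetic data: roughly 30% "fake" labels, three feature groups.
random.seed(0)
n = 200
y = [1 if random.random() < 0.3 else 0 for _ in range(n)]


def make_group(dim, shift):
    return [[random.gauss(shift * yi, 1.0) for _ in range(dim)] for yi in y]


groups = {
    "behavioral": make_group(3, 1.5),
    "textual": make_group(2, 1.0),
    "semantic": make_group(4, 0.8),
}

meta_X = build_meta_features(groups, y)   # 3 probabilities + 9 raw features
meta_model = fit_logreg(meta_X, y)        # second-layer learner (stand-in)
pred = [int(p >= 0.5) for p in predict_proba(meta_model, meta_X)]
# Training-set accuracy only; a real evaluation needs a held-out split.
accuracy = sum(p == t for p, t in zip(pred, y)) / n
print("meta feature width:", len(meta_X[0]), "training accuracy:", accuracy)
```

Calling `build_meta_features(groups, y, inject_raw=False)` yields the plain stacking input (probabilities only), which makes it easy to compare the two variants under the same learners. In a fuller experiment the stand-in learners would be replaced by the per-feature-type models named in the abstract.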
Appears in Collections:	[Graduate Institute of Information Management] Doctoral and Master's Theses

Files in This Item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	61

All items in NCUIR are protected by their original copyright.
