Identify Deceptive Reviews in Cross-domain Content with BERT

DC Field	Value	Language
DC.contributor	企業管理學系	zh_TW
DC.creator	陳莉茿	zh_TW
DC.creator	LI-JU CHEN	en_US
dc.date.accessioned	2023-07-26T07:39:07Z
dc.date.available	2023-07-26T07:39:07Z
dc.date.issued	2023
dc.identifier.uri	http://ir.lib.ncu.edu.tw:444/thesis/view_etd.asp?URN=110421024
dc.contributor.department	企業管理學系	zh_TW
DC.description	國立中央大學	zh_TW
DC.description	National Central University	en_US
dc.description.abstract	線上評論在電子商務中具有重要的影響力，消費者越來越仰賴這些評論來做出購買決策，然而，不道德的企業可能散佈假評論以操縱消費者意見，而Ott et al. (2011) [19] 實驗表明，人類識別假評論的準確率僅有57.3%，且對於跨領域的真假評論分類模型，目前尚缺乏對於在不同領域間共享的文本特徵和規則之研究，由於模型過度依賴相同來源的資料，導致同個模型在其它資料集測試時，準確率急遽下降。因此，本研究提出基於 Bidirectional Encoder Representations from Transformers (BERT) 的模型，利用[MASK]替代評論中出現的該領域特定單詞，克服跨領域之間兩者評論風格差異性過大的問題，在我們的研究中使用來自Ott et al. (2011) [19] 和Li et al. (2014) [33] 在餐廳、旅館、醫生領域之評論，以及本研究額外加入Yelp真實評論做為訓練資料。最後，MASK-BERT於實驗結果中，與Ren & Ji (2017) [25] 為目前研究最佳之結果做比較，在Cross-domain中，F1-score最佳表現為 88.49%；而對於內容差異性較大的醫生領域，在本研究提出遮蔽機制後，Accuracy也提升了15~20%。	zh_TW
dc.description.abstract	Online reviews play a significant role in e-commerce, and consumers increasingly rely on them when making purchasing decisions. However, unethical businesses may spread deceptive reviews to manipulate consumers' opinions. Ott et al. (2011) [19] showed that humans identify fake reviews with an accuracy of only 57.3%. Moreover, cross-domain classification models depend too heavily on data from a single source, which causes a sharp decline in accuracy when the same model is tested on datasets from a different domain, and research on text features and rules shared across domains is still lacking. Hence, our study proposes a model based on Bidirectional Encoder Representations from Transformers (BERT). We replace domain-specific words in reviews with [MASK] to overcome the significant stylistic differences between reviews from different domains. Our research uses reviews from Ott et al. (2011) [19] and Li et al. (2014) [33] in the restaurant, hotel, and doctor domains, supplemented with genuine Yelp reviews as additional training data. Finally, we compare the results of MASK-BERT with the state-of-the-art approach of Ren & Ji (2017) [25]. In the cross-domain setting, the best F1-score is 88.49%; for the doctor domain, whose content differs more markedly from the other domains, the proposed masking mechanism improves accuracy by 15-20%.	en_US
DC.subject	跨領域	zh_TW
DC.subject	BERT	zh_TW
DC.subject	假評論	zh_TW
DC.subject	虛假偵測	zh_TW
DC.subject	遮蔽資訊	zh_TW
DC.subject	cross-domain	en_US
DC.subject	BERT	en_US
DC.subject	fraud reviews	en_US
DC.subject	deception detection	en_US
DC.subject	masking information	en_US
DC.title	跨領域分辨真假評論之研究－以BERT為基礎模型	zh_TW
dc.language.iso	zh-TW	zh-TW
DC.title	Identify Deceptive Reviews in Cross-domain Content with BERT	en_US
DC.type	博碩士論文	zh_TW
DC.type	thesis	en_US
DC.publisher	National Central University	en_US

Thesis/Dissertation 110421024: Full Metadata Record
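
The abstract describes replacing domain-specific words in a review with [MASK] before classification. The sketch below illustrates only that substitution step and is not the thesis's implementation. The record does not say how domain-specific words are selected, so the hand-written `hotel_words` set, the function name `mask_domain_words`, and the simple regex tokenization are all assumptions made for illustration.

```python
import re

MASK_TOKEN = "[MASK]"  # the mask token named in the abstract


def mask_domain_words(review, domain_words, mask_token=MASK_TOKEN):
    """Replace every word found in domain_words with mask_token.

    Matching is case-insensitive; punctuation and all other words are
    left untouched. domain_words is a hypothetical, hand-supplied list.
    """
    vocab = {w.lower() for w in domain_words}

    def _swap(match):
        word = match.group(0)
        return mask_token if word.lower() in vocab else word

    # Match alphabetic words, optionally with an apostrophe suffix ("room's").
    return re.sub(r"[A-Za-z]+(?:'[A-Za-z]+)?", _swap, review)


# Hypothetical domain vocabulary for hotel reviews (an assumption).
hotel_words = {"hotel", "room", "lobby"}

masked = mask_domain_words(
    "The hotel room was clean, and the lobby staff were kind.",
    hotel_words,
)
print(masked)
```

The masked text could then be passed to a BERT tokenizer, which treats [MASK] as a single special token, so a classifier trained on it sees sentiment and style cues rather than domain vocabulary.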