DC field | Value | Language |
DC.contributor | 企業管理學系 | zh_TW |
DC.creator | 陳莉茿 | zh_TW |
DC.creator | LI-JU CHEN | en_US |
dc.date.accessioned | 2023-07-26T07:39:07Z | |
dc.date.available | 2023-07-26T07:39:07Z | |
dc.date.issued | 2023 | |
dc.identifier.uri | http://ir.lib.ncu.edu.tw:444/thesis/view_etd.asp?URN=110421024 | |
dc.contributor.department | 企業管理學系 | zh_TW |
DC.description | 國立中央大學 | zh_TW |
DC.description | National Central University | en_US |
dc.description.abstract | 線上評論在電子商務中具有重要的影響力,消費者越來越仰賴這些評論來做出購買決策,然而,不道德的企業可能散佈假評論以操縱消費者意見,而Ott et al. (2011) [19] 實驗表明,人類識別假評論的準確率僅有57.3%,且對於跨領域的真假評論分類模型,目前尚缺乏對於在不同領域間共享的文本特徵和規則之研究,由於模型過度依賴相同來源的資料,導致同個模型在其它資料集測試時,準確率急遽下降。
因此,本研究提出基於 Bidirectional Encoder Representations from Transformers (BERT) 的模型,利用[MASK]替代評論中出現的該領域特定單詞,克服跨領域之間兩者評論風格差異性過大的問題,在我們的研究中使用來自Ott et al. (2011) [19] 和Li et al. (2014) [33] 在餐廳、旅館、醫生領域之評論,以及本研究額外加入Yelp真實評論做為訓練資料。最後,MASK-BERT於實驗結果中,與Ren & Ji (2017) [25] 為目前研究最佳之結果做比較,在Cross-domain中,F1-score最佳表現為 88.49%;而對於內容差異性較大的醫生領域,在本研究提出遮蔽機制後,Accuracy也提升了15~20%。 | zh_TW |
dc.description.abstract | Online reviews play a significant role in e-commerce, and consumers increasingly rely on them when making purchasing decisions. However, unethical businesses may spread deceptive reviews to manipulate consumer opinion. Ott et al. (2011) [19] showed that humans identify deceptive reviews with an accuracy of only 57.3%. In addition, cross-domain classification models rely too heavily on data from a single source, which causes a sharp decline in accuracy when the same model is tested on datasets from other domains. There is currently a lack of research on text features and rules that can be shared across domains.
Hence, this study proposes a model based on Bidirectional Encoder Representations from Transformers (BERT). We replace domain-specific words in reviews with [MASK] to overcome the large stylistic differences between reviews from different domains. We use reviews from Ott et al. (2011) [19] and Li et al. (2014) [33] in the restaurant, hotel, and doctor domains, supplemented with genuine Yelp reviews as training data. Finally, we compare the results of MASK-BERT with those of Ren & Ji (2017) [25], the current state of the art. In the cross-domain setting, the best F1-score is 88.49%; in the doctor domain, where content differences are larger, the proposed masking mechanism improves accuracy by 15-20%. | en_US |
DC.subject | 跨領域 | zh_TW |
DC.subject | BERT | zh_TW |
DC.subject | 假評論 | zh_TW |
DC.subject | 虛假偵測 | zh_TW |
DC.subject | 遮蔽資訊 | zh_TW |
DC.subject | cross-domain | en_US |
DC.subject | BERT | en_US |
DC.subject | fraud reviews | en_US |
DC.subject | deception detection | en_US |
DC.subject | masking information | en_US |
DC.title | 跨領域分辨真假評論之研究-以BERT為基礎模型 | zh_TW |
dc.language.iso | zh-TW | zh-TW |
DC.title | Identify Deceptive Reviews in Cross-domain Content with BERT | en_US |
DC.type | 博碩士論文 | zh_TW |
DC.type | thesis | en_US |
DC.publisher | National Central University | en_US |