The Study of Prompt Based Learning for Chinese Fact Checking

DC Field	Value	Language
DC.contributor	資訊工程學系	zh_TW
DC.creator	丁于晏	zh_TW
DC.creator	Yu-Yen Ting	en_US
dc.date.accessioned	2023-07-24T07:39:07Z
dc.date.available	2023-07-24T07:39:07Z
dc.date.issued	2023
dc.identifier.uri	http://ir.lib.ncu.edu.tw:88/thesis/view_etd.asp?URN=110522061
dc.contributor.department	資訊工程學系	zh_TW
DC.description	國立中央大學	zh_TW
DC.description	National Central University	en_US
dc.description.abstract	在當今資訊蓬勃發展的時代，網路上充斥各種主張，這些主張的真偽往往難以分辨，而人工方式審核這些主張的真實性並不容易，因此，需要透過自動事實查核解決這個問題。本篇論文研究重點在於中文事實查核任務，過去的研究主要集中在英文或多語言的資料集上，並且著重於傳統的預訓練和微調方法。因此本研究旨在利用新興自然語言範式「預訓練、提示、預測」的提示學習，來提升中文事實查核的效能。事實查核任務包括證據檢索 (Evidence Retrieval) 及宣稱驗證 (Claim Verification) 兩個子任務。在宣稱驗證方面，我們探討多種提示學習策略在宣稱驗證任務上。由於提示學習需要設計一個模板加入到輸入端，我們會分為人工設計的模板和自動生成的模板。對於自動生成方法，我們採用 Automated Prompt Engineer (APE) [1] 來生成的提示模板，研究結果顯示提示學習有助於提升宣稱驗證的 F1 效能 1%-2% (從 78.99% 到 80.70%)。在證據檢索方面，我們使用監督式的 SentenceBERT [2] 和非監督式的 PromptBERT [3] 改善證據檢索效能。非監督式 PromptBERT 可增加 F1 效能 18% (從 12.66% 到 30.61%)，而監督式SentenceBERT 更可大幅提升 F1 效能 88.15%。最後，我們整合宣稱驗證和證據檢索後，在中文事實查核的資料集 CHEF 上，F1 效能可以達到 80.54%，大幅超過基線效能 63.47%，甚至超過使用人工標記的正確證據 (Golden Evidence) 的效能 78.99%。整體而言，提示學習在中文事實查核的效能能夠改善傳統微調的效能。	zh_TW
dc.description.abstract	With the rapid spread of information, the Web is full of claims whose truth is hard to determine, and checking them manually is difficult. Automated fact-checking addresses this problem. Our research focuses on Chinese fact-checking. Previous work has focused mainly on English or multilingual fact-checking datasets and on the conventional pre-train and fine-tune approach. We therefore aim to improve the performance of Chinese fact-checking through prompt-based learning, which follows the "pre-train, prompt, predict" paradigm. The fact-checking task consists of two subtasks: evidence retrieval and claim verification. For claim verification, we explore several prompt-based learning strategies. Since prompt-based learning requires designing a template that is added to the input, we consider both manually designed and automatically generated templates. For the automatic method, we generate templates with Automatic Prompt Engineer (APE) [1]. For evidence retrieval, we use supervised SentenceBERT [2] and unsupervised PromptBERT [3]. We show that prompt-based learning improves the F1 score of claim verification by 1%-2% (from 78.99% to 80.70%). For evidence retrieval, unsupervised PromptBERT improves F1 by about 18 percentage points (from 12.66% to 30.61%), and supervised SentenceBERT reaches an F1 score of 88.15%. Finally, we combine evidence retrieval with claim verification to build a complete fact-checking pipeline. On the Chinese fact-checking dataset CHEF, the pipeline achieves an F1 score of 80.54%, which outperforms the baseline of 63.47% and even exceeds claim verification with gold evidence (78.99%).	en_US
DC.subject	事實查核	zh_TW
DC.subject	提示學習	zh_TW
DC.subject	提示微調	zh_TW
DC.subject	參數高效微調	zh_TW
DC.subject	Fact Checking	en_US
DC.subject	Prompt Based Learning	en_US
DC.subject	Prompt Tuning	en_US
DC.subject	Parameter-Efficient Fine-Tuning	en_US
DC.title	基於提示學習的中文事實查核任務之研究	zh_TW
dc.language.iso	zh-TW	zh-TW
DC.title	The Study of Prompt Based Learning for Chinese Fact Checking	en_US
DC.type	博碩士論文	zh_TW
DC.type	thesis	en_US
DC.publisher	National Central University	en_US

Thesis 110522061 full metadata record