Sentence Semantic Level Evaluation: Leveraging Word Gloss Disambiguation for Interpretability

Please use this permanent URL to cite or link to this item: https://ir.lib.ncu.edu.tw/handle/987654321/98152

Title:	句子語意等級評估：可解釋性之詞彙釋義消歧;Sentence Semantic Level Evaluation: Leveraging Word Gloss Disambiguation for Interpretability
Author:	廖振閔;Liao, Jen-Min
Contributor:	Department of Computer Science and Information Engineering
Keywords:	Word Sense Disambiguation;Gloss-Based Dictionary;Traditional Chinese;Classical Chinese;Dataset Creation;Natural Language Processing;Sentence Simplification;Sentence Difficulty Quantification;Semantic Analysis
Date:	2025-04-22
Upload time:	2025-10-17 12:26:22 (UTC+8)
Publisher:	National Central University
Abstract:	This study uses Word Sense Disambiguation (WSD) to develop a fine-grained sentence difficulty scorer. The aim is to address a limitation of the metrics commonly used for simplification tasks (such as BLEU and SARI), which require multiple reference answers, and to make evaluation more interpretable. WSD remains an important challenge in natural language processing, especially for low-resource languages. To address the scarcity of non-English WSD datasets, we automatically create a gloss-based sense inventory, demonstrating that existing dictionary resources can mitigate data limitations. Specifically, we employ Word Gloss Disambiguation (WGD), a technique related to WSD but focused more narrowly on word glosses. WGD is not an entirely new concept, but previous research has not explicitly distinguished it from WSD. We build WGD models from two Traditional Chinese dictionaries and cast the prompts as multiple-choice questions. With the TAIDE Llama3 8B model, this achieves accuracy of 86.9% on Modern Chinese and 80.8% on Classical Chinese. With the GPT-4o API, the scores rise to 89.8% and 83.2%, respectively. Although licensing constraints prevent us from distributing the final dataset, we provide all processing steps and clear instructions for accessing the original dictionaries to ensure reproducibility.
These models can accurately calculate the gloss difficulty of each word in a sentence and, from that, assess the overall difficulty of the sentence. On the CSS and MCTS simplification datasets, translated with Google and OpenCC, our method agrees with annotators in more than 72% of cases on both. This study demonstrates the potential of WGD in sentence simplification evaluation. Although the method cannot yet assess whether the meaning of a whole sentence is preserved before and after simplification, its detailed analysis of word glosses makes sentence simplification more interpretable.
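The abstract describes two steps that a short sketch may help make concrete: posing gloss disambiguation as a multiple-choice question, and combining per-word gloss difficulty into a sentence-level score. The Python below is only an illustration under stated assumptions, not the thesis implementation. The `Gloss` class, the numeric `difficulty` field, the prompt wording, and the use of a mean as the aggregate are all hypothetical; the abstract does not specify any of them.

```python
from dataclasses import dataclass
from statistics import fmean


@dataclass(frozen=True)
class Gloss:
    text: str        # dictionary definition for one sense of the word
    difficulty: int  # hypothetical difficulty level (assumption, not from the thesis)


def build_multiple_choice_prompt(sentence: str, word: str, glosses: list[Gloss]) -> str:
    """Format a gloss-disambiguation query as a lettered multiple-choice question."""
    if not 1 <= len(glosses) <= 26:
        raise ValueError("expected between 1 and 26 candidate glosses")
    options = [f"{chr(ord('A') + i)}. {g.text}" for i, g in enumerate(glosses)]
    return "\n".join([
        f"Sentence: {sentence}",
        f"Which gloss best matches the word '{word}' in this sentence?",
        *options,
        "Answer with a single letter.",
    ])


def parse_choice(answer: str, glosses: list[Gloss]) -> Gloss:
    """Map a model reply such as 'B' or ' b.' back to the selected gloss."""
    letter = answer.strip()[:1].upper()
    index = ord(letter) - ord("A") if letter else -1
    if not 0 <= index < len(glosses):
        raise ValueError(f"unrecognised answer: {answer!r}")
    return glosses[index]


def sentence_difficulty(selected: list[Gloss]) -> float:
    """Aggregate per-word gloss difficulty; taking the mean is an assumed choice."""
    if not selected:
        raise ValueError("no glosses selected")
    return fmean(g.difficulty for g in selected)
```

In use, the prompt string would be sent to a language model, the reply passed to `parse_choice`, and the selected glosses for every word in the sentence passed to `sentence_difficulty`. One plausible way to compare against annotators, again an assumption, is to check whether the simplified sentence scores lower than the original.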
Appears in collections:	[Graduate Institute of Computer Science and Information Engineering] Theses and Dissertations

Files in this item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	58

All items in NCUIR are protected by the original copyright.
