This study aims to develop a fine-grained sentence difficulty scorer based on Word Sense Disambiguation (WSD) techniques. The scorer addresses a limitation of traditional evaluation metrics for simplification tasks (BLEU, SARI), which require multiple reference answers, and it also improves interpretability. In addition, this research addresses the scarcity of non-English WSD datasets by demonstrating that existing dictionary resources can be used to build a gloss-based sense inventory automatically, which mitigates data limitations.
Specifically, we employ Word Gloss Disambiguation (WGD), a technique related to WSD that focuses on word glosses, to develop a model from two Traditional Chinese dictionaries. We formulated the prompts as multiple-choice questions, achieving accuracies of 86.9% for Modern Chinese and 80.8% for Classical Chinese with the TAIDE Llama3 8B model. Further enhancements with GPT-4o API settings increased these scores to 89.8% and 83.2%, respectively.
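As an illustration of the multiple-choice formulation, the following minimal sketch shows how one disambiguation instance might be turned into a prompt and how a model reply might be mapped back to a gloss. The function names `build_wgd_prompt` and `parse_choice`, the prompt wording, and the example glosses are hypothetical and are not taken from this work; the actual prompts and dictionary entries may differ.

```python
from __future__ import annotations

from string import ascii_uppercase


def build_wgd_prompt(sentence: str, target: str, glosses: list[str]) -> str:
    """Format one disambiguation instance as a multiple-choice question.

    The wording below is an assumed template, not the prompt used in the study.
    """
    if not glosses:
        raise ValueError("at least one candidate gloss is required")
    if len(glosses) > len(ascii_uppercase):
        raise ValueError("too many glosses for single-letter labels")
    # Label each candidate gloss A, B, C, ...
    options = [f"{label}. {gloss}" for label, gloss in zip(ascii_uppercase, glosses)]
    return "\n".join([
        f"Sentence: {sentence}",
        f"Target word: {target}",
        "Which gloss matches the target word in this sentence?",
        *options,
        "Answer with a single letter.",
    ])


def parse_choice(answer: str, glosses: list[str]) -> int | None:
    """Map a reply such as 'B' or 'b. ...' to a gloss index, or None if invalid."""
    reply = answer.strip().upper()
    if not reply:
        return None
    index = ascii_uppercase.find(reply[0])  # -1 when the first character is not A-Z
    return index if 0 <= index < len(glosses) else None


# Hypothetical glosses for illustration only.
example_glosses = ["to hit or strike", "to make (a phone call)", "to fetch (water)"]
example_prompt = build_wgd_prompt("他打了一個電話", "打", example_glosses)
```

Under this framing, accuracy is the fraction of instances for which the parsed index equals the index of the gold gloss.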
Although licensing constraints prevent us from distributing the final dataset, we provide the necessary processing steps and clear instructions for accessing the original dictionaries to ensure reproducibility. These models can accurately calculate the gloss difficulty of each word in a sentence, thereby assessing the overall sentence difficulty. On the CSS and MCTS simplification datasets translated by Google and OpenCC, our method achieved over 72% agreement with annotators.
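To make the evaluation idea concrete, the sketch below aggregates per-word gloss difficulty levels into a sentence score and measures agreement with annotator judgments on original/simplified pairs. The aggregation by arithmetic mean, the numeric difficulty levels, and the names `sentence_difficulty` and `agreement_rate` are assumptions for illustration; this abstract does not specify the aggregation or the agreement protocol.

```python
from __future__ import annotations


def sentence_difficulty(gloss_levels: list[float]) -> float:
    """Aggregate per-word gloss difficulty into one score (assumed: mean)."""
    if not gloss_levels:
        return 0.0
    return sum(gloss_levels) / len(gloss_levels)


def agreement_rate(
    pairs: list[tuple[list[float], list[float]]],
    annotator_says_simpler: list[bool],
) -> float:
    """Fraction of pairs where the scorer and the annotators agree.

    Each pair holds the gloss levels of the original sentence and of its
    simplified version. The scorer predicts "simpler" when the simplified
    sentence has a strictly lower difficulty score than the original.
    """
    if len(pairs) != len(annotator_says_simpler):
        raise ValueError("pairs and labels must have the same length")
    if not pairs:
        raise ValueError("at least one pair is required")
    hits = 0
    for (original, simplified), label in zip(pairs, annotator_says_simpler):
        predicted = sentence_difficulty(simplified) < sentence_difficulty(original)
        hits += predicted == label
    return hits / len(pairs)


# Toy difficulty levels, invented for illustration only.
toy_pairs = [([3, 4], [1, 2]), ([2, 2], [3, 3]), ([5], [1])]
toy_labels = [True, True, True]
toy_agreement = agreement_rate(toy_pairs, toy_labels)
```

Because every word contributes its own gloss level, a change in the sentence score can be traced back to the specific words whose senses became easier or harder.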
This study demonstrates the potential of WGD technology in sentence simplification evaluation. Although it currently cannot assess whether the semantic consistency of entire sentences is maintained before and after simplification, its detailed analysis of word glosses enhances the interpretability of sentence simplification.