With the broad application of natural language processing to education, business, and generation tasks, multi-trait scoring has become an important challenge in language understanding. However, most existing methods score from single samples or sample pairs, lack a global view, and are prone to ranking-transitivity inconsistencies. To address this limitation, this study proposes a cross-task multi-trait scoring model that combines Batch Cross-Attention with a Hybrid Ranking Loss. Batch Cross-Attention allows the model, during training, to treat all texts in the same batch simultaneously as Query, Key, and Value, using the attention mechanism to capture fine-grained differences between samples and the overall distribution, thereby improving ranking stability and comparability. The Hybrid Ranking Loss combines a local pair-wise rank loss with a global list-wise loss, penalising local ranking errors while preserving global consistency and avoiding transitivity contradictions. The proposed model accommodates essays, automatically generated questions, and reviews, delivering consistent evaluation across traits such as content, organisation, and answerability. Experimental results show that, compared with traditional point-wise, pair-wise, and purely list-wise methods, our method significantly improves score agreement (e.g., QWK) and ranking correlation (e.g., Kendall's τ), demonstrating the effectiveness and generality of Batch Cross-Attention and the Hybrid Ranking Loss.

Language education plays a vital role in globalization and cross-cultural communication, and Automated Essay Scoring (AES) has gained increasing attention due to its fast and consistent assessment capabilities. Traditional AES methods typically adopt prompt-specific training, achieving high accuracy on familiar prompts but lacking generalization ability to unseen prompts due to the unavailability of annotated data. To address this, recent cross-prompt approaches train and test models across multiple prompts, yet most rely on point-wise or pair-wise comparisons that learn only relative rankings between pairs of essays, neglecting the positioning of individual essays within the overall distribution. Building upon the MOOSE framework, this study proposes a two-stage batch-aware ranking and regression framework. In the first stage, we introduce Batch Cross-Attention within the MOOSE architecture, allowing all essays in the same mini-batch to attend to each other during forward propagation, thereby jointly considering global semantic differences. Optimization employs a combination of list-wise and pair-wise losses to ensure both global and local ranking consistency. In the second stage, predicted ranking scores are discretized into K bins based on quantiles, and bin position embeddings are concatenated with the original essay features.
A Bin Regressor is then trained with mean squared error combined with a pair-wise loss to fine-tune the continuous scores. Experimental results demonstrate that our method improves ranking transitivity and regression accuracy (as measured by QWK) across multiple prompts, yielding more stable and interpretable scoring by incorporating global ranking information.
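The Batch Cross-Attention idea described above (every essay in the mini-batch serves simultaneously as Query, Key, and Value) can be sketched as a single self-attention pass over the batch dimension. This is a minimal illustrative NumPy sketch, not the thesis's exact MOOSE-integrated implementation; the function name and scaling choice are assumptions.

```python
import numpy as np

def batch_cross_attention(features):
    """Illustrative sketch: each essay vector in the mini-batch attends to
    every other one, so Query = Key = Value = the batch itself and each
    representation is refined by its position relative to the whole batch."""
    d_k = features.shape[-1]
    scores = features @ features.T / np.sqrt(d_k)   # (B, B) pairwise similarities
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the batch axis
    return weights @ features                       # batch-contextualised features

# toy batch of 4 essay embeddings with 8 features each
rng = np.random.default_rng(0)
batch = rng.standard_normal((4, 8))
out = batch_cross_attention(batch)
print(out.shape)  # (4, 8)
```

Because the attention weights are computed over the whole mini-batch, each output row mixes in information about how that essay sits within the batch's distribution, which is what enables the globally consistent ranking described above.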
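The Hybrid Ranking Loss combines a local pair-wise term with a global list-wise term. The following is a hedged sketch of one plausible instantiation, using a pair-wise margin loss and a ListNet-style softmax cross-entropy; the margin, the weighting coefficient `alpha`, and the specific list-wise form are illustrative assumptions, not the thesis's exact formulation.

```python
import numpy as np

def hybrid_ranking_loss(pred, gold, margin=0.1, alpha=0.5):
    """Sketch of a hybrid loss: local pair-wise margin term plus a global
    list-wise (ListNet-style) term. `margin` and `alpha` are illustrative."""
    pred, gold = np.asarray(pred, float), np.asarray(gold, float)

    # pair-wise part: penalise every pair whose predicted order contradicts gold
    diff_pred = pred[:, None] - pred[None, :]
    sign_gold = np.sign(gold[:, None] - gold[None, :])
    hinge = np.maximum(0.0, margin - sign_gold * diff_pred)
    pair_loss = hinge[sign_gold != 0].mean()

    # list-wise part: cross-entropy between softmax distributions over the list
    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()
    p, q = softmax(gold), softmax(pred)
    list_loss = -(p * np.log(q + 1e-12)).sum()

    return alpha * pair_loss + (1 - alpha) * list_loss

# a correctly ordered prediction incurs a lower loss than a reversed one
good = hybrid_ranking_loss([0.1, 0.4, 0.9], [1, 2, 3])
bad = hybrid_ranking_loss([0.9, 0.4, 0.1], [1, 2, 3])
print(good < bad)  # True
```

The pair-wise term catches local inversions while the list-wise term anchors each score within the whole list, which is how the combination avoids the transitivity contradictions that purely pair-wise training can produce.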
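The second stage (discretising stage-1 ranking scores into K quantile bins and concatenating bin-position embeddings onto the essay features) can be sketched as below. This is a simplified assumption-laden illustration: a one-hot vector stands in for what would presumably be a learned embedding table, and K = 4 is arbitrary.

```python
import numpy as np

K = 4  # number of quantile bins (illustrative choice)

def quantile_bins(rank_scores, k=K):
    """Discretise stage-1 ranking scores into k equal-frequency bins."""
    edges = np.quantile(rank_scores, np.linspace(0, 1, k + 1)[1:-1])
    return np.searchsorted(edges, rank_scores, side="right")  # bin id per essay

def with_bin_embedding(features, bins, k=K):
    """Concatenate a one-hot bin-position embedding (a stand-in for a
    learned embedding table) onto the original essay features."""
    onehot = np.eye(k)[bins]
    return np.concatenate([features, onehot], axis=-1)

scores = np.array([0.05, 0.30, 0.55, 0.80, 0.95, 0.10, 0.60, 0.40])
feats = np.random.default_rng(1).standard_normal((8, 16))
bins = quantile_bins(scores)
augmented = with_bin_embedding(feats, bins)
print(bins)             # one bin id in {0..3} per essay
print(augmented.shape)  # (8, 20)
```

The augmented features would then feed the Bin Regressor, whose MSE-plus-pair-wise objective refines the continuous scores while the bin embedding keeps each essay anchored to its position in the global ranking.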