| Abstract: | With the growing demand for medical and health information, automatically processing and understanding patient questions has become a key challenge in the development of intelligent healthcare. The English medical question summarization task aims to transform verbose, loosely structured patient questions into concise, semantically clear queries, thereby improving the performance of medical question-answering systems during both information retrieval and answer generation. However, existing generative summarization models often suffer from information omission and hallucination, which pose potential risks to reliability and safety in clinical applications. This study proposes an improved summarization framework that extends the FaMeSum model and integrates several task-oriented loss functions: Contrastive Loss, Medical Knowledge Loss, and Interrogative-Word Constrained Loss. We also design an automated strategy for medical focus extraction and sample construction that extracts semantic keywords from the original questions to strengthen the model's ability to learn core medical information.
We evaluate the framework on the MeQSum dataset and compare it against several mainstream medical question summarization models. On ROUGE, which measures coverage of key content, the proposed model achieves a ROUGE-1 of 54.84, a ROUGE-2 of 37.81, and a ROUGE-L of 52.65; on BERTScore, which measures semantic consistency, it achieves 89.21. Compared with the original FaMeSum model, this is an improvement of +3.69 in ROUGE-1, +2.98 in ROUGE-2, +3.73 in ROUGE-L, and +1.24 in BERTScore, outperforming the compared methods overall. These findings confirm that combining medical semantic understanding with contrastive learning in a generative summarization model can significantly improve the accuracy and faithfulness of medical question summarization. |
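For readers unfamiliar with the ROUGE-N scores reported above, the following is a minimal sketch of how an n-gram overlap F1 score can be computed. It is a simplified illustration, not the evaluation code used in this study: it assumes lowercase whitespace tokenization with no stemming, and the function names `ngrams` and `rouge_n` as well as the two example questions are hypothetical and not taken from MeQSum.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Simplified ROUGE-N F1 (assumption: whitespace tokens, no stemming)."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count per n-gram (clipped matches).
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical reference summary and model output, for illustration only.
reference = "what are the treatments for type 2 diabetes"
candidate = "what are treatments for diabetes"

print(round(100 * rouge_n(candidate, reference, 1), 2))  # 76.92
print(round(100 * rouge_n(candidate, reference, 2), 2))  # 36.36
```

Scores are scaled by 100 here to match the convention of the figures in the abstract. Standard ROUGE toolkits typically add stemming and other normalization, so their numbers can differ from this sketch.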