Heterogeneous Graph Attention Networks for Extractive Summarization of Chinese Medical Answers

DC field	Value	Language
DC.contributor	電機工程學系	zh_TW
DC.creator	田高源	zh_TW
DC.creator	Kao-Yuan Tien	en_US
dc.date.accessioned	2023-10-13T07:39:07Z
dc.date.available	2023-10-13T07:39:07Z
dc.date.issued	2023
dc.identifier.uri	http://ir.lib.ncu.edu.tw:88/thesis/view_etd.asp?URN=110521083
dc.contributor.department	電機工程學系	zh_TW
DC.description	國立中央大學	zh_TW
DC.description	National Central University	en_US
dc.description.abstract	檢索式醫療問答系統藉由問題與答案的配對排序，回覆使用者的醫療相關問題。然而，返回的相關資訊通常多樣複雜，對於尋找特定資訊的使用者來說，這些答案通常需要花費時間閱讀與理解。本研究專注於中文醫療答案摘要問題，藉由文本摘要技術，將冗長複雜的相關資訊，擷取成簡潔易於理解的答案。我們提出一個基於異質圖注意力網路的擷取式摘要模型 (Heterogeneous Graph Attention Networks for Extractive Summarization, HGATSUM)，用於檢索式中文醫療問答系統。首先，我們將醫療問題和答案對建構成異質圖，圖節點包含問題、答案以及醫療實體，節點間關係做為邊，包含1) 答案句子間基於修辭結構理論的依賴關係; 2) 問題與答案句子間的相似關係; 以及3) 醫療實體和問題或答案句子間的提及關係。然後，經由圖注意力網路來學習異質圖的節點表示。最後，將答案句子的圖節點表示與相關性特徵結合後，進行答案中的句子選擇與組合，形成最終輸出摘要答案。由於缺乏公開的評測資料集，我們建置了一個中文醫療答案擷取式摘要任務的資料集 (Med-AnsSum)，包含469筆醫療問題，以及這些問題藉由檢索系統返回的問答配對共有3,314筆，每筆皆人工標記擷取摘要答案。藉由實驗與效能評估得知，我們提出的模型HGATSUM在資料集Med-AnsSum上的ROUGE (1/2/L) 分數表現 (82.08/78.66/81.60)，皆優於其他相關模型(BERTSUMEXT, MATCHSUM, AREDSUM以及Bert-QSBUM)，人工評估進一步驗證我們提出的HGATSUM模型在中文醫療答案擷取式摘要上有良好的表現。	zh_TW
dc.description.abstract	Information retrieval-based medical question-answering systems usually return relevant answers to a user’s question in a ranked list. However, the retrieved results may contain complex and diverse information, which makes it hard for users to find the specific information they are looking for. Therefore, this study focuses on extractive summarization of Chinese medical answers. We propose a model called HGATSUM (Heterogeneous Graph Attention Networks for Summarization). First, we construct a heterogeneous graph whose nodes are questions, answer sentences, and medical entities, and whose edges capture three kinds of relationships: 1) dependency relationships among answer sentences based on Rhetorical Structure Theory (RST); 2) similarity relationships between questions and answer sentences; and 3) mention relationships between medical entities and question or answer sentences. Then, Graph Attention Networks are used to learn representations of the heterogeneous graph nodes. Finally, we combine the graph representations of answer sentences with features capturing their relevance to the posed question, and select and assemble sentences from the answer into an extracted summary. Due to the lack of publicly released benchmark data for medical answer summarization, we constructed a dataset called Med-AnsSum for the extractive summarization task of Chinese medical answers. This dataset contains 3,314 question-answer pairs across 469 distinct medical questions returned by the medical question-answering system; each pair was manually annotated with an extractive answer summary. In our experiments and performance evaluations, the proposed HGATSUM model outperforms previous models (i.e., BERTSUMEXT, MATCHSUM, AREDSUM, and Bert-QSBUM) on the Med-AnsSum dataset, achieving the best ROUGE-1/2/L scores of 82.08/78.66/81.60. Human evaluation also confirmed that our model is effective for Chinese medical answer summarization.	en_US
DC.subject	擷取式摘要	zh_TW
DC.subject	異質圖	zh_TW
DC.subject	圖注意力網路	zh_TW
DC.subject	extractive summarization	en_US
DC.subject	heterogeneous graph	en_US
DC.subject	graph attention networks	en_US
DC.title	運用異質圖注意力網路於中文醫療答案擷取式摘要	zh_TW
dc.language.iso	zh-TW	zh-TW
DC.title	Heterogeneous Graph Attention Networks for Extractive Summarization of Chinese Medical Answers	en_US
DC.type	博碩士論文	zh_TW
DC.type	thesis	en_US
DC.publisher	National Central University	en_US

Thesis 110521083: full metadata record