DC 欄位 | 值 | 語言
dc.contributor | 電機工程學系 | zh_TW
dc.creator | 曾郁雯 | zh_TW
dc.creator | Yu-Wen Tzeng | en_US
dc.date.accessioned | 2024-07-26T07:39:07Z | |
dc.date.available | 2024-07-26T07:39:07Z | |
dc.date.issued | 2024 | |
dc.identifier.uri | http://ir.lib.ncu.edu.tw:444/thesis/view_etd.asp?URN=111521099 | |
dc.contributor.department | 電機工程學系 | zh_TW |
dc.description | 國立中央大學 | zh_TW
dc.description | National Central University | en_US
dc.description.abstract | 文本多標籤分類 (Multi-Label Text Classification, MLTC) 任務旨在為每一則文字內容預測一個或多個事先給定的分類標籤。由於標籤之間存在隱含關係，難以充分挖掘標籤間的相關性，目前模型效能普遍不佳。本研究探討將圖神經網路 (GNN) 與轉譯器 (Transformer) 模型結合，利用異質圖方式建構文本詞彙及標籤之間的關係，並透過圖神經網路的圖結構更新能力和轉譯器的自注意力機制，提出上下文嵌入增強異質圖注意力網路模型 (Contextual Embeddings Enhanced Heterogeneous Graph Attention Networks, CE-HeterGAT)，旨在加強文本特徵表示，提升多標籤分類的效能。我們將文本詞彙及標籤透過五種不同的邊建立異質圖，圖節點包括文本詞、標籤詞及一個虛擬節點，邊類型包括字詞之間的序列關係、字詞之間的依存句法關係、字詞與標籤詞之間的語義關係、標籤詞之間的共現關係，以及虛擬節點和所有字詞之間的邊，然後經由圖注意力網路學習異質圖的節點表示。同時，文本透過 BERT (Bidirectional Encoder Representations from Transformers) 得到上下文關係，最後將兩種特徵經由我們設計的注意力解碼器得到整個文本的節點表示，並預測最終的標籤分類。
我們建置了兩個中文心理諮詢多標籤文本分類資料集：總共蒐集 4,473 筆線上心理諮詢留言，人工標記內容的主題和事件，最終建置完成包含 11 種主題標籤的 Psycho-MLTopic 資料集，以及 52 種事件標籤的 Psycho-MLEvent 資料集。實驗與效能評估顯示，我們提出的 CE-HeterGAT 模型效能皆優於其他相關模型 (TextCNN、Bi-LSTM、BERT、GCN、GAT、TextGCN、SAT、UGformer、Exphormers)，尤其在 Macro-F1 Score 指標上有顯著提升，證明異質圖結構結合上下文訊息的圖神經網路能有效提升文本多標籤分類的效能。 | zh_TW
dc.description.abstract | Multi-Label Text Classification (MLTC) is the task of assigning one or more pre-defined labels to a given text. Because the implicit relationships among labels are difficult to discover, existing methods struggle to fully exploit the correlations between labels. This study combines Graph Neural Networks (GNNs) with a Transformer model, using a heterogeneous graph to represent the relationships between text words and labels. Leveraging the graph structure-updating capability of GNNs and the self-attention mechanism of Transformers, we propose the Contextual Embeddings Enhanced Heterogeneous Graph Attention Networks (CE-HeterGAT) model, which aims to strengthen text feature representations and improve multi-label classification performance. We construct a heterogeneous graph comprising content nodes, label nodes, and a virtual node, connected by five edge types: 1) sequential relationships between content words; 2) syntactic dependency relationships between content words; 3) semantic relationships between content words and label words; 4) conditional co-occurrence relationships between label words; and 5) edges between the virtual node and all content words. A graph attention network is then used to learn the node representations of the heterogeneous graph. In parallel, BERT (Bidirectional Encoder Representations from Transformers) captures the contextual relationships within the text. Finally, a cross-attention decoder that we design fuses the two kinds of features into document-level node representations and predicts the final labels.
We collected 4,473 online psychological counseling texts and manually annotated them with topic and event labels, resulting in the Psycho-MLTopic dataset with 11 topic labels and the Psycho-MLEvent dataset with 52 event labels. Experimental results and performance evaluations show that the proposed CE-HeterGAT model outperforms related models (TextCNN, Bi-LSTM, BERT, GCN, GAT, TextGCN, SAT, UGformer, Exphormers), with particularly significant improvements in the Macro-F1 score, demonstrating that a heterogeneous graph structure combined with contextual information in graph neural networks effectively enhances multi-label text classification performance. | en_US
dc.subject | 多標籤分類 | zh_TW
dc.subject | 異質圖 | zh_TW
dc.subject | 圖注意力網路 | zh_TW
dc.subject | 上下文嵌入 | zh_TW
dc.subject | 心理諮詢 | zh_TW
dc.subject | multi-label text classification | en_US
dc.subject | heterogeneous graph | en_US
dc.subject | graph attention networks | en_US
dc.subject | contextual embeddings | en_US
dc.subject | psychological counseling | en_US
dc.title | 上下文嵌入增強異質圖注意力網路模型於心理諮詢文本多標籤分類 | zh_TW
dc.language.iso | zh-TW | zh-TW |
dc.title | Contextual Embeddings Enhanced Heterogeneous Graph Attention Networks for Multi-Label Classification of Psychological Counseling Texts | en_US
dc.type | 博碩士論文 | zh_TW
dc.type | thesis | en_US
dc.publisher | National Central University | en_US
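
The abstract above outlines the CE-HeterGAT architecture: BERT contextual embeddings and a graph attention network over a word/label/virtual-node graph, fused by a cross-attention decoder to produce multi-label predictions. Below is a minimal, hypothetical sketch of that fusion idea, assuming PyTorch, PyTorch Geometric, and Hugging Face Transformers. The class name CEHeterGATSketch, the dimensions, and the single merged edge_index (the thesis builds five typed edge sets) are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch: BERT contextual embeddings + GAT over a simplified graph,
# fused by cross-attention and projected to multi-label logits.
import torch
import torch.nn as nn
from torch_geometric.nn import GATConv          # graph attention layer
from transformers import BertModel, BertTokenizerFast


class CEHeterGATSketch(nn.Module):
    def __init__(self, num_labels: int, hidden: int = 768, heads: int = 4):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-chinese")
        # Two GAT layers over the word/label/virtual-node graph.
        # All five edge types are merged into one edge_index here for brevity.
        self.gat1 = GATConv(hidden, hidden // heads, heads=heads)
        self.gat2 = GATConv(hidden, hidden, heads=1)
        # Cross-attention "decoder": graph node states attend to BERT tokens.
        self.cross_attn = nn.MultiheadAttention(hidden, num_heads=heads, batch_first=True)
        self.classifier = nn.Linear(hidden, num_labels)

    def forward(self, input_ids, attention_mask, node_feats, edge_index):
        # Contextual token embeddings from BERT: (1, seq_len, hidden)
        ctx = self.bert(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        # Graph side: node_feats (num_nodes, hidden), edge_index (2, num_edges)
        g = torch.relu(self.gat1(node_feats, edge_index))
        g = torch.relu(self.gat2(g, edge_index)).unsqueeze(0)   # (1, num_nodes, hidden)
        # Fuse: graph nodes query the contextual embeddings.
        fused, _ = self.cross_attn(query=g, key=ctx, value=ctx)
        # Pool fused node states and predict multi-label logits
        # (train with BCEWithLogitsLoss; apply sigmoid at inference).
        return self.classifier(fused.mean(dim=1))


if __name__ == "__main__":
    tok = BertTokenizerFast.from_pretrained("bert-base-chinese")
    enc = tok("我最近壓力很大，常常失眠。", return_tensors="pt")
    num_nodes, hidden = 6, 768
    node_feats = torch.randn(num_nodes, hidden)                     # placeholder node features
    edge_index = torch.tensor([[0, 1, 2, 3, 4], [1, 2, 3, 4, 5]])   # toy edges
    model = CEHeterGATSketch(num_labels=11)                         # e.g. 11 Psycho-MLTopic labels
    logits = model(enc["input_ids"], enc["attention_mask"], node_feats, edge_index)
    print(torch.sigmoid(logits).shape)                              # torch.Size([1, 11])
```

In the actual model, the five edge types would be kept distinct (for example via relation-specific attention) rather than merged into one edge_index, and training would use a binary cross-entropy objective such as torch.nn.BCEWithLogitsLoss over the label logits, as is standard for multi-label classification.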