Master's/Doctoral Thesis 108521106: Detailed Record




Author: Po-Han Chen (陳柏翰)    Department: Department of Electrical Engineering
Thesis Title: Ingraining Domain Knowledge in Language Models for Chinese Medical Question Intent Classification
(強化領域知識語言模型於中文醫療問題意圖分類)
Related Theses
★ Multiple Embeddings Enhanced Gated Graph Sequence Neural Networks for Chinese Healthcare Named Entity Recognition
★ Epileptic Seizure Detection in Stroke Patients Based on EEG Wavelet Analysis
★ Data Augmentation Based on Conditional Generative Adversarial Networks for Automatic Schizophrenia Classification
★ Label Graph Convolution Enhanced Hypergraph Attention Networks for Multi-Class Classification of Chinese Healthcare Texts
★ Improving BERT with Synthesizer Mixed Attention for Scientific Language Editing
★ Pipelined Language Transformers for Chinese Healthcare Open Information Extraction
★ Sentence Embedding Re-rankers for Improving Chinese Medical Question Answering Performance
★ Dual-Annotation Encoders for Chinese Healthcare Entity Linking
★ Joint Part-of-Speech and Local Context for Chinese Healthcare Entity Relation Extraction
★ Heterogeneous Graph Attention Networks for Extractive Summarization of Chinese Medical Answers
★ Learning User Intents for Abstractive Summarization of Chinese Medical Questions
★ Label-Enhanced Hypergraph Attention Networks for Multi-Label Classification of Psychiatric Texts
Files: Full text available in the thesis system after 2026-10-26.
Abstract (Chinese) Multi-class text classification aims to automatically assign an input instance to one of a set of pre-defined categories, and underlies many applications such as sentiment analysis, chatbots, question answering systems, e-commerce product categorization, and content filtering. Our main objective is to classify unstructured Chinese medical questions into the correct category; the category information can be regarded as a medical knowledge feature that helps machines understand the semantics of a question, and can serve as the foundation of an automatic question answering system. In recent deep-learning approaches, the most widely used architecture is the Transformer; such models effectively capture long-range semantics and syntactic structure, achieving strong performance on many natural language processing tasks. We therefore improve three mainstream pre-trained models with a two-stage domain knowledge enhancement mechanism and propose the EKG-Transformers (Encyclopedia enhanced pre-training with Knowledge Graph fine-tuning Transformers) model for Chinese medical question intent classification. We further pre-train the language models on hierarchical data collected from a medical encyclopedia, injecting hierarchical medical domain information such as the symptoms and diagnostic tests of a disease, the precautions and side effects of a treatment, and the usage and dosage of a drug. During fine-tuning, triples from a constructed knowledge graph (Knowledge Graph) endow the named entities in a character sequence with a relation network, converting the sequence into a sentence graph so that the model yields better representations and classifications for knowledge-driven inputs. We use the Chinese Medical Intent Dataset (CMID), which defines 4 categories (diseases, drugs, treatments, and others) with 36 sub-categories beneath them, contains about 12,000 medical questions in total, and is annotated with word segmentation and named entity results. Experimental results and error analysis show that our proposed EKG-MacBERT model achieves the best Micro F1-score of 74.50%, outperforming related models (MacBERT, RoBERTa, BERT, TextCNN, TextRNN, TextGCN, and FastText), and offers an effective solution for Chinese medical question intent classification.
Abstract (English) Our main research objective focuses on classifying unstructured Chinese medical questions into one of several pre-defined categories. Recently, the most widely used model architecture is the Transformer, which effectively captures semantic and syntactic structure to achieve promising results in many natural language processing tasks. We improve three mainstream pre-trained models with a two-stage domain knowledge enhancement mechanism and propose the EKG-Transformers (Encyclopedia enhanced pre-training with Knowledge Graph fine-tuning Transformers) for user intent classification of Chinese medical questions. During the pre-training phase, we ingrain hierarchical healthcare information, such as the symptoms and diagnoses of a disease, the precautions and side effects of a treatment, and the usage and dosage of a drug, into the language model. During the fine-tuning phase, a word sequence is endowed with a relation network and converted into a sentence graph by injecting triples related to its named entities from the knowledge graph. Experimental data came from the Chinese Medical Intent Dataset (CMID), which comprises around 12,000 medical questions with manually annotated user intents (4 categories and 36 sub-categories), along with word segmentation and named entity results. Based on the experiments and error analysis, EKG-MacBERT achieved the best Micro F1-score of 74.50%, outperforming previous models including MacBERT, RoBERTa, BERT, TextCNN, TextRNN, TextGCN, and FastText. In summary, our EKG-Transformers model offers an effective way to solve Chinese medical question intent classification.
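The knowledge-graph fine-tuning step described in both abstracts, where triples anchored to named entities turn a character sequence into a sentence graph whose self-attention is restricted by a visible matrix, follows the K-BERT idea cited in the references. Below is a minimal sketch of that mechanism with a hypothetical toy knowledge graph and token-level granularity; it is an illustration of the technique, not the thesis implementation:

```python
import numpy as np

# Hypothetical toy knowledge graph: entity -> list of (relation, object) triples.
KG = {"阿斯匹林": [("用途", "止痛")]}

def build_sentence_tree(tokens, kg):
    """Flatten a sentence tree: each matched entity token sprouts a branch
    of (relation, object) tokens. Returns the flat token list and, for each
    token, the index of the trunk token it branches from (-1 for trunk)."""
    flat, anchor = [], []
    for tok in tokens:
        trunk_idx = len(flat)
        flat.append(tok)
        anchor.append(-1)                  # trunk token
        for rel, obj in kg.get(tok, []):
            for branch_tok in (rel, obj):
                flat.append(branch_tok)
                anchor.append(trunk_idx)   # branch hangs off its entity

    return flat, anchor

def visible_matrix(anchor):
    """M[i][j] = 1 if token i may attend to token j: trunk tokens see each
    other, branch tokens see their own branch and their anchor entity, and
    injected knowledge stays invisible to the rest of the sentence."""
    n = len(anchor)
    M = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(n):
            both_trunk = anchor[i] == -1 and anchor[j] == -1
            same_branch = anchor[i] != -1 and anchor[i] == anchor[j]
            branch_to_anchor = anchor[i] == j or anchor[j] == i or i == j
            if both_trunk or same_branch or branch_to_anchor:
                M[i, j] = 1
    return M
```

In a full model, this matrix would be added (as a large negative mask on the zero entries) to the self-attention logits of a Transformer encoder, which is what the "Visible-Transformer Encoder" layer in Chapter 3 refers to.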
Keywords ★ domain knowledge extraction
★ pre-trained language models
★ encyclopedia
★ knowledge graph
★ multi-class classification
Table of Contents
Abstract (Chinese)
Abstract (English)
Acknowledgements
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1-1 Research Background
1-2 Motivation and Objectives
1-3 Chapter Overview
Chapter 2 Related Work
2-1 Chinese Medical Domain Classification Datasets
2-2 Chinese Text Classification Models
Chapter 3 Model Architecture
3-1 Encyclopedia Enhanced Pre-training Phase
3-2 Knowledge Graph Fine-tuning Phase
3-2-1 Knowledge Layer
3-2-2 Multiple Embedding Layer
3-2-3 Visible Matrix Layer
3-2-4 Visible-Transformer Encoder
3-2-5 Classification Layer
Chapter 4 Experimental Results
4-1 Medical Encyclopedia Corpus Construction
4-2 Medical Knowledge Graph Construction
4-3 CMID Dataset Statistics
4-4 Experimental Settings
4-5 Performance Evaluation
4-6 Model Comparisons and Performance Analysis
4-7 Error Analysis
Chapter 5 Conclusions and Future Work
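The Micro F1-score used in the performance evaluation (Section 4-5) and reported in the abstracts reduces, for single-label multi-class predictions such as CMID intents, to accuracy: every misclassification counts as one false positive (for the predicted class) and one false negative (for the true class). A minimal sketch with toy labels, not thesis data:

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over single-label multi-class predictions."""
    tp = sum(t == p for t, p in zip(y_true, y_pred))
    # Each wrong prediction is a false positive for the predicted class
    # and a false negative for the true class, so fp == fn here.
    fp = fn = len(y_true) - tp
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy example over the four CMID top-level categories.
labels_true = ["病症", "藥物", "治療", "其他"]
labels_pred = ["病症", "藥物", "病症", "其他"]
print(micro_f1(labels_true, labels_pred))  # 0.75
```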
References
Maron, M. E. (1961). Automatic indexing: An experimental inquiry. Journal of the ACM, 8(3), 404-417.
Cover, T., & Hart, P. , (1967). Nearest neighbor pattern classification. IEEE transactions on information theory, 13(1), 21-27.
Joachims, T., (1998, April). Text categorization with support vector machines: Learning with many relevant features. In European conference on machine learning (pp. 137-142). Springer, Berlin, Heidelberg.
Kim, Y. (2014). Convolutional neural networks for sentence classification. In Proceedings of EMNLP (pp. 1746-1751).
Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning representations by back-propagating errors. Nature, 323(6088), 533-536.
Hochreiter, S., & Schmidhuber, J. , (1997). Long short-term memory. Neural computation, 9(8), 1735-1780.
Qiu, X., Sun, T., Xu, Y., Shao, Y., Dai, N., & Huang, X., (2020). Pre-trained models for natural language processing: A survey. Science China Technological Sciences, 1-26.
Chen, N., Su, X., Liu, T., Hao, Q., & Wei, M., (2020). A benchmark dataset and case study for Chinese medical question intent classification. BMC Medical Informatics and Decision Making, 20(3), 1-7.
Zhang, N., et al. (2021). CBLUE: A Chinese biomedical language understanding evaluation benchmark. arXiv preprint arXiv:2106.08087.
Term frequency by inverse document frequency. (2009). In Encyclopedia of Database Systems (p. 3035).
Mikolov, T., Chen, K., Corrado, G., & Dean, J., (2013). Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.
Pennington, J., Socher, R., & Manning, C. D., (2014, October). Glove: Global vectors for word representation. In Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP) (pp. 1532-1543).
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I., (2017). Attention is all you need. In Advances in neural information processing systems (pp. 5998-6008).
Huang, Z., Xu, W., & Yu, K., (2015). Bidirectional LSTM-CRF models for sequence tagging. arXiv preprint arXiv:1508.01991.
Elfaik, H., (2021). Deep Bidirectional LSTM Network Learning-Based Sentiment Analysis for Arabic Text. Journal of Intelligent Systems, 30(1), 395-412.
Kipf, T. N., & Welling, M., (2016). Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907.
Li, Q., Peng, H., Li, J., Xia, C., Yang, R., Sun, L., ... & He, L., (2020). A survey on text classification: From shallow to deep learning. arXiv preprint arXiv:2008.00364.
Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. , (2018). Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
Sun, Y., Wang, S., Li, Y., Feng, S., Chen, X., Zhang, H., ... & Wu, H. , (2019). Ernie: Enhanced representation through knowledge integration. arXiv preprint arXiv:1904.09223.
Yang, Z., Dai, Z., Yang, Y., Carbonell, J., Salakhutdinov, R. R., & Le, Q. V., (2019). Xlnet: Generalized autoregressive pretraining for language understanding. Advances in neural information processing systems, 32.
Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., ... & Stoyanov, V., (2019). Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692.
Joshi, M., Chen, D., Liu, Y., Weld, D. S., Zettlemoyer, L., & Levy, O., (2020). Spanbert: Improving pre-training by representing and predicting spans. Transactions of the Association for Computational Linguistics, 8, 64-77.
Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., & Soricut, R., (2019). Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942.
Clark, K., Luong, M. T., Le, Q. V., & Manning, C. D., (2020). Electra: Pre-training text encoders as discriminators rather than generators. arXiv preprint arXiv:2003.10555.
Wang, A., Singh, A., Michael, J., Hill, F., Levy, O., & Bowman, S. R., (2018). GLUE: A multi-task benchmark and analysis platform for natural language understanding. arXiv preprint arXiv:1804.07461.
Xu, L., Hu, H., Zhang, X., Li, L., Cao, C., Li, Y., ... & Lan, Z., (2020). Clue: A chinese language understanding evaluation benchmark. arXiv preprint arXiv:2004.05986.
Lai, G., Xie, Q., Liu, H., Yang, Y., & Hovy, E., (2017). Race: Large-scale reading comprehension dataset from examinations. arXiv preprint arXiv:1704.04683.
Rajpurkar, P., Jia, R., & Liang, P. (2018). Know what you don′t know: Unanswerable questions for SQuAD. arXiv preprint arXiv:1806.03822.
Cui, Y., Che, W., Liu, T., Qin, B., Wang, S., & Hu, G., (2020). Revisiting pre-trained models for chinese natural language processing. arXiv preprint arXiv:2004.13922.
Lee, J., Yoon, W., Kim, S., Kim, D., Kim, S., So, C. H., & Kang, J. , (2020). BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics, 36(4), 1234-1240.
He, Y., Zhu, Z., Zhang, Y., Chen, Q., & Caverlee, J., (2020). Infusing disease knowledge into BERT for health question answering, medical inference and disease name recognition. arXiv preprint arXiv:2010.03746.
Wang, X., Gao, T., Zhu, Z., Zhang, Z., Liu, Z., Li, J., & Tang, J., (2021). KEPLER: A unified model for knowledge embedding and pre-trained language representation. Transactions of the Association for Computational Linguistics, 9, 176-194.
Xiong, W., Du, J., Wang, W. Y., & Stoyanov, V., (2019). Pretrained encyclopedia: Weakly supervised knowledge-pretrained language model. arXiv preprint arXiv:1912.09637.
Liu, W., Zhou, P., Zhao, Z., Wang, Z., Ju, Q., Deng, H., & Wang, P., (2020, April). K-bert: Enabling language representation with knowledge graph. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 34, No. 03, pp. 2901-2908).
Lipscomb, C. E. (2000). Medical subject headings (MeSH). Bulletin of the Medical Library Association, 88(3), 265.
Donnelly, K. (2006). SNOMED-CT: The advanced terminology and coding system for eHealth. Studies in Health Technology and Informatics, 121, 279.
Liu, H. QASystemOnMedicalKG [Source code]. https://github.com/liuhuanyong/QASystemOnMedicalKG
Lee, L. H., & Lu, Y., (2021). Multiple Embeddings Enhanced Multi-Graph Neural Networks for Chinese Healthcare Named Entity Recognition. IEEE Journal of Biomedical and Health Informatics.
Joulin, A., Grave, E., Bojanowski, P., & Mikolov, T., (2016). Bag of tricks for efficient text classification. arXiv preprint arXiv:1607.01759.
Dietterich, T. G., (1998). Approximate statistical tests for comparing supervised classification learning algorithms. Neural computation, 10(7), 1895-1923.
Bowker, A. H. (1948). A test for symmetry in contingency tables. Journal of the American Statistical Association, 43(244), 572-574.
Advisor: Lung-Hao Lee (李龍豪)    Approval Date: 2021-10-27
