Multiple Embeddings Enhanced Gated Graph Sequence Neural Networks for Chinese Healthcare Named Entity Recognition

Name

盧毅 (Yi Lu)

Department

Department of Electrical Engineering

Thesis Title

Multiple Embeddings Enhanced Gated Graph Sequence Neural Networks for Chinese Healthcare Named Entity Recognition
(多重嵌入增強式門控圖序列神經網路之中文健康照護命名實體辨識)

Files

Full text viewable in the system (open after 2025-8-20)

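Abstract (Chinese)

The goal of named entity recognition (NER) is to extract named entities of interest from unstructured input text, such as person names, place names, organization names, dates, and times. The extracted entities can serve as the basis for applications such as relation extraction, event detection and tracking, knowledge graph construction, and question answering systems. Machine learning approaches treat NER as a sequence labeling problem: a labeling model is learned from a large-scale corpus and then assigns a label to each character position in a sentence. We propose a Multiple Embeddings Enhanced Gated Graph Sequence Neural Network (ME-GGSNN) model for named entity recognition in the Chinese healthcare domain. We integrate word embedding and radical embedding information to construct a multiple-embedding character vector, adapt a gated graph sequence neural network to incorporate named entity information from known dictionaries, and then connect a bidirectional long short-term memory (BiLSTM) network and a conditional random field (CRF) to label the character sequence of a Chinese sentence.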
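The character-level sequence labeling view described above can be illustrated with a minimal sketch. It assumes a BIO tagging scheme and the hypothetical type labels `DRUG` and `SYMP`; the abstract does not state which tagging scheme or label names the thesis uses, and the example sentence is made up.

```python
def spans_to_bio(sentence, entities):
    """Convert (start, end, type) entity spans into per-character BIO tags.

    `end` is exclusive and spans are assumed not to overlap.
    The BIO scheme itself is an assumption, not taken from the thesis.
    """
    tags = ["O"] * len(sentence)
    for start, end, etype in entities:
        tags[start] = f"B-{etype}"          # first character of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{etype}"          # remaining characters of the entity
    return tags


# Made-up example sentence with hypothetical labels.
sentence = "服用阿斯匹靈可緩解頭痛"
entities = [(2, 6, "DRUG"), (9, 11, "SYMP")]
tags = spans_to_bio(sentence, entities)

for ch, tag in zip(sentence, tags):
    print(ch, tag)
```

Each character receives exactly one tag, so a sequence model such as a BiLSTM-CRF only has to predict one label per input position.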
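We used web crawlers to collect healthcare-related web articles and medical question-answering records, then randomly sampled Chinese sentences for manual word segmentation and named entity annotation. The corpus contains 30,692 sentences (about 1.5 million characters, or 917 thousand words) with 68,460 named entities in 10 types: body, symptom, medical instrument, examination, chemical, disease, drug, supplement, treatment, and time. Experimental results and error analysis show that our proposed model achieves the best F1-score of 75.69%, outperforming related models (BiLSTM-CRF, BERT, Lattice, Gazetteers, and ME-CNER), and that it is a Chinese healthcare NER method that is both effective and efficient.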
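For readers unfamiliar with how an F1-score for NER is typically computed, the sketch below shows entity-level precision, recall, and F1 under exact matching of span boundaries and type. The exact-match criterion, the function name `entity_prf`, the label names, and the toy data are all assumptions for illustration; the abstract does not describe the thesis's evaluation protocol.

```python
def entity_prf(gold, pred):
    """Exact-match precision, recall, and F1 over entity tuples.

    Each entity is a (sentence_id, start, end, type) tuple; a prediction
    counts as correct only if all four fields match a gold entity.
    """
    gold, pred = set(gold), set(pred)
    tp = len(gold & pred)                               # true positives
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data with hypothetical labels: 2 of 4 predictions match 2 of 3 gold entities.
gold = [(0, 2, 6, "DRUG"), (0, 9, 11, "SYMP"), (1, 0, 2, "DISE")]
pred = [(0, 2, 6, "DRUG"), (0, 9, 11, "DISE"), (1, 0, 2, "DISE"), (1, 4, 6, "BODY")]
p, r, f = entity_prf(gold, pred)
print(f"P={p:.4f} R={r:.4f} F1={f:.4f}")
```

Under this criterion, a prediction with the right span but the wrong type counts as both a false positive and a false negative.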

Abstract (English)

Named Entity Recognition (NER) focuses on locating mentions of named entities and classifying their types, usually referring to proper nouns such as persons, places, organizations, dates, and times. NER results can be used as the basis for relation extraction, event detection and tracking, knowledge graph building, and question answering systems. NER studies usually treat the task as a sequence labeling problem and learn the labeling model from a large-scale corpus. We propose ME-GGSNN (Multiple Embeddings Enhanced Gated Graph Sequence Neural Networks), a model for Chinese healthcare NER. We derive a character representation from multiple embeddings at different granularities, from the radical and character levels to the word level. An adapted gated graph sequence neural network incorporates named entity information from dictionaries. A standard BiLSTM-CRF is then used to identify named entities and classify their types in the healthcare domain.
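One simple way to combine embeddings at several granularities is to concatenate, for each character, its character vector, its radical vector, and the vector of the word that contains it. The sketch below illustrates that idea only; the abstract does not say how ME-GGSNN actually fuses the embeddings, and the lookup tables, dimensions, values, and the function name `char_representations` are made up.

```python
# Toy lookup tables; all values and dimensions are invented for illustration.
CHAR_EMB = {"頭": [0.1, 0.2], "痛": [0.3, 0.4]}
RADICAL_OF = {"頭": "頁", "痛": "疒"}
RADICAL_EMB = {"頁": [0.5], "疒": [0.6]}
WORD_EMB = {"頭痛": [0.7, 0.8, 0.9]}


def char_representations(words):
    """Return one concatenated vector per character of a segmented sentence.

    Each vector is [character embedding | radical embedding | word embedding].
    Unknown characters, radicals, and words fall back to zero vectors.
    """
    vectors = []
    for word in words:
        word_vec = WORD_EMB.get(word, [0.0, 0.0, 0.0])
        for ch in word:
            char_vec = CHAR_EMB.get(ch, [0.0, 0.0])
            rad_vec = RADICAL_EMB.get(RADICAL_OF.get(ch, ""), [0.0])
            vectors.append(char_vec + rad_vec + word_vec)  # list concatenation
    return vectors


vecs = char_representations(["頭痛"])
for v in vecs:
    print(v)
```

Every character in the same word shares the word-level part of the vector, while the character and radical parts differ per character.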
We first crawled articles from websites that provide healthcare information, online health-related news, and medical question-answering forums. We then randomly selected a subset of sentences to retain content diversity. The resulting corpus includes 30,692 sentences, with a total of around 1.5 million characters or 917 thousand words. After manual annotation, it contains 68,460 named entities across 10 entity types: body, symptom, instrument, examination, chemical, disease, drug, supplement, treatment, and time. Based on experiments and error analysis, our proposed method achieved the best F1-score of 75.69%, outperforming previous models including BiLSTM-CRF, BERT, Lattice, Gazetteers, and ME-CNER. In summary, our ME-GGSNN model is an effective and efficient solution for the Chinese healthcare NER task.

Keywords (Chinese)

★ embedding vectors
★ graph neural networks
★ named entity recognition
★ information extraction
★ health informatics

Keywords (English)

★ embedding representation
★ graph neural networks
★ named entity recognition
★ information extraction
★ health informatics

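Table of Contents

Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1-1 Research Background
1-2 Research Motivation and Objectives
1-3 Chapter Overview
Chapter 2 Related Work
2-1 Chinese Named Entity Recognition Corpora
2-2 Chinese Named Entity Recognition Models
Chapter 3 Model Architecture
3-1 Multiple Embedding Layer
3-2 Gated Graph Sequence Neural Network Layer
3-3 Bidirectional Long Short-Term Memory Network Layer
3-4 Conditional Random Field Layer
Chapter 4 Experimental Results
4-1 Corpus Construction
4-2 Experimental Settings
4-3 Embeddings
4-4 Performance Evaluation
4-5 Model Comparison
4-6 Performance Analysis
4-7 Error Analysis
Chapter 5 Conclusions and Future Work
References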

References

[1] Lawrence R. Rabiner, A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition. Proceedings of the IEEE, 77 (2), p. 257–286, February 1989.
[2] Toutanova, Kristina; Manning, Christopher D., Enriching the Knowledge Sources Used in a Maximum Entropy Part-of-Speech Tagger. Proc. Joint SIGDAT Conf. on Empirical Methods in NLP and Very Large Corpora (EMNLP/VLC-2000), pp. 63–70.
[3] Lafferty, J., McCallum, A., Pereira, F.: Conditional random fields: probabilistic models for segmenting and labeling sequence data. In: Proceedings of 18th International Conference on Machine Learning, ICML 2001, pp. 282–289 (2001).
[4] Krizhevsky, A., Sutskever, I., & Hinton, G., (2012). ImageNet classification with deep convolutional neural networks. In NIPS.
[5] Rumelhart, David E.; Hinton, Geoffrey E.; Williams, Ronald J., (October 1986). "Learning representations by back-propagating errors". Nature. 323 (6088): 533–536.
[6] Hochreiter, S., Schmidhuber, J., Long short-term memory. Neural Comput. 9, 1735–1780 (1997).
[7] Levow, G.A., The third international Chinese language processing bakeoff: word segmentation and named entity recognition. In: Computational Linguistics, pp.
[8] Nanyun Peng and Mark Dredze, 2015. Named entity recognition for Chinese social media with jointly trained embeddings. In EMNLP, pages 548–554.
[9] Zhang, Y. and Yang, J., (2018). Chinese NER using lattice LSTM. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL’18), Long Papers, pages 1554–1564.

[10] Xianpei Han, Overview of the CCKS 2019 Knowledge Graph Evaluation Track: Entity, Relation, Event and QA (2019). arXiv.
[11] Fu, G., Luke, K.K., Chinese named entity recognition using lexicalized HMMs. ACM SIGKDD Explor. Newsl. 7, 19–25 (2005).
[12] Gideon S. Mann and Andrew McCallum, 2010. Generalized Expectation Criteria for Semi-Supervised Learning with Weakly Labeled Data. J. Mach. Learn. Res. 11 (March 2010), 955–984.
[13] Duan, H., Zheng, Y., A study on features of the CRFs-based Chinese. Int. J. Adv. Intell. 3, 287–294 (2011).
[14] Han, A.L.-F., Wong, D.F., Chao, L.S., Chinese named entity recognition with conditional random fields in the light of Chinese characteristics. In: Kłopotek, M.A., Koronacki, J., Marciniak, M., Mykowiecka, A., Wierzchoń, S.T. (eds.) IIS 2013. LNCS, vol. 7912, pp. 57–68. Springer, Heidelberg (2013).
[15] Huang, Z., Xu, W., Yu, K., Bidirectional LSTM-CRF models for sequence tagging (2015). arXiv.
[16] Guillaume Lample, Miguel Ballesteros, Sandeep Subramanian, Kazuya Kawakami, Chris Dyer (2016). Neural architectures for named entity recognition. In Proceedings of NAACL’16, pp. 260–270.
[17] Chuanhai Dong, Jiajun Zhang, Chengqing Zong, Masanori Hattori, and Hui Di., 2016. Character based LSTM-CRF with radical-level features for Chinese named entity recognition. In International Conference on Computer Processing of Oriental Languages. Springer, pages 239–250.

[18] Canwen Xu, Feiyang Wang, Jialong Han, and Chenliang Li, Exploiting multiple embeddings for Chinese named entity recognition. In CIKM, pages 2269–2272. ACM, 2019.
[19] Ruixue Ding, Pengjun Xie, Xiaoyan Zhang, Wei Lu, Linlin Li, and Luo Si, 2019. A neural multi-digraph model for Chinese NER with gazetteers. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 1462–1467.
[20] Yujia Li, Daniel Tarlow, Marc Brockschmidt, and Richard Zemel, 2016. Gated graph sequence neural networks. In Proc. of ICLR.
[21] Mikolov, T., Chen, K., Corrado, G., & Dean, J., (2013). Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.
[22] Cho, K. et al., Learning phrase representations using RNN encoder-decoder for statistical machine translation. In Proc. Conference on Empirical Methods in Natural Language Processing 1724–1734 (2014).
[23] Cohen, Jacob, (1960). "A coefficient of agreement for nominal scales". Educational and Psychological Measurement. 20 (1): 37–46.
[24] Fleiss, J. L., (1971) "Measuring nominal scale agreement among many raters." Psychological Bulletin, Vol. 76, No. 5 pp. 378–382.
[25] Landis, J. R. and Koch, G. G., (1977) "The measurement of observer agreement for categorical data." Biometrics, Vol. 33, pp. 159–174.
[26] Ma, Wei-Yun and Keh-Jiann Chen, 2003, "Introduction to CKIP Chinese Word Segmentation System for the First International Chinese Word Segmentation Bakeoff", Proceedings of ACL, Second SIGHAN Workshop on Chinese Language Processing, pp. 168–171.

[27] Jeffrey Pennington, Richard Socher, and Christopher D. Manning, 2014. Glove: Global vectors for word representation. In Empirical Methods in Natural Language Processing (EMNLP), pages 1532–1543.
[28] Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov, 2017. Enriching word vectors with subword information. TACL 5:135–146.
[29] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 4171–4186, Minneapolis, Minnesota, June 2019. Association for Computational Linguistics.

Advisor

李龍豪

Approval Date

2020-8-20
