Leveraging Dual Gloss Encoders in Chinese Healthcare Entity Linking

Name

Man-Chen Hung (洪滿珍)

Department

Department of Electrical Engineering

Thesis title

Leveraging Dual Gloss Encoders in Chinese Healthcare Entity Linking
(利用雙重註釋編碼器於中文健康照護實體連結)

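Related theses

★ Multiple Embeddings Enhanced Gated Graph Sequence Neural Networks for Chinese Healthcare Named Entity Recognition
★ Seizure Detection in Stroke Patients Based on EEG Wavelet Analysis
★ Data Augmentation Based on Conditional Generative Adversarial Networks for Automatic Schizophrenia Identification
★ Label Graph Convolution Enhanced Hypergraph Attention Networks for Multi-Class Classification of Chinese Healthcare Texts
★ Improving BERT with Synthesizer Mixture Attention for Scientific Language Editing
★ Domain Knowledge Enhanced Language Models for Chinese Medical Question Intent Classification
★ Pipelined Language Transformers for Chinese Healthcare Open Information Extraction
★ Improving Chinese Medical Question Answering Performance with a Sentence Embedding Re-ranker
★ Combining Part-of-Speech and Local Context for Chinese Healthcare Entity Relation Extraction
★ Heterogeneous Graph Attention Networks for Extractive Summarization of Chinese Medical Answers
★ Learning User Intent for Abstractive Summarization of Chinese Medical Questions
★ Label Enhanced Hypergraph Attention Networks for Multi-Label Classification of Mental Disorder Texts
★ Contextual Embedding Enhanced Heterogeneous Graph Attention Networks for Multi-Label Classification of Psychological Counseling Texts
★ Hierarchical Clustering Attention Based Encoder-Decoder for Multi-Answer Summarization of Medical Questions
★ Exploring Gated Graph Neural Networks for Sentiment Intensity Prediction in Psychological Counseling Texts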

Files

Full text can be browsed in the system (open after 2027-8-24)

Abstract

Word sense disambiguation is an important and difficult task in natural language understanding, especially for words in the healthcare domain that often carry multiple senses. We propose a Dual Gloss Encoder (DGE) model, built on the BERT transformer, that links healthcare named entities in Chinese sentences to the multilingual lexical semantic network BabelNet for context-aware semantic understanding. The target word is encoded together with its context in the original sentence to obtain a contextualized target word embedding. Each candidate gloss of the target word is taken from BabelNet and encoded into a gloss embedding. The target word embedding is paired with each gloss embedding to compute a disambiguation score, and the gloss with the highest score is returned as the predicted gloss for the target word in the given sentence. Because Chinese entity linking data are lacking in the healthcare domain, we collected suitable domain-specific words and manually annotated their glosses in sentences. The resulting dataset contains 10,218 sentences covering 40 distinct target words and 94 distinct glosses. We divided the data into three mutually exclusive sets: a training set (7,109 sentences), a development set (979 sentences), and a test set (2,130 sentences). Experimental results indicate that the proposed DGE model outperforms three entity linking models, namely BERTWSD, GlossBERT, and BEM, achieving the best F1-score of 97.81%.
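
The scoring step described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the abstract does not say how a target word embedding and a gloss embedding are paired, so a dot product is assumed here, and the small hand-written vectors stand in for the outputs of the two BERT-based encoders. The names `link_target_word`, `gloss_medical`, and `gloss_general` are hypothetical.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    """Dot product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


def link_target_word(
    target_vec: Vector, gloss_vecs: Dict[str, Vector]
) -> Tuple[str, Dict[str, float]]:
    """Score each candidate gloss against the target word; return the best gloss and all scores."""
    if not gloss_vecs:
        raise ValueError("at least one candidate gloss is required")
    # Assumption: the pairing score is a dot product between the two embeddings.
    scores = {gloss_id: dot(target_vec, vec) for gloss_id, vec in gloss_vecs.items()}
    # The gloss with the highest score is the prediction.
    best = max(scores, key=scores.get)
    return best, scores


# Toy stand-ins for encoder outputs (hypothetical values, not real BERT embeddings).
target_embedding = [0.9, 0.1, 0.3]  # target word encoded with its sentence context
gloss_embeddings = {
    "gloss_medical": [1.0, 0.0, 0.5],  # hypothetical BabelNet gloss, healthcare sense
    "gloss_general": [0.0, 1.0, 0.2],  # hypothetical BabelNet gloss, everyday sense
}

predicted_gloss, gloss_scores = link_target_word(target_embedding, gloss_embeddings)
print(predicted_gloss)
```

In a full system the two vectors would come from trained encoders and the candidate list from a BabelNet lookup for the target word; only the select-the-highest-score step is taken directly from the abstract.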

Keywords

★ entity linking
★ word sense disambiguation
★ language transformers
★ natural language understanding
★ health informatics

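Table of contents

Abstract
Acknowledgments
Table of Contents
List of Tables
Chapter 1 Introduction
1-1 Research Background
1-2 Motivation and Objectives
1-3 Chapter Overview
Chapter 2 Related Work
2-1 Word Sense Disambiguation Datasets
2-2 Knowledge-Based Methods
2-3 Deep Learning-Based Methods
Chapter 3 Methodology
3-1 System Architecture
3-2 Context-Aware Gloss Encoder
3-3 Lexical Gloss Encoder
Chapter 4 Experimental Results
4-1 Dataset Construction
4-2 Experimental Settings
4-3 Evaluation Metrics
4-4 Model Comparison
4-5 Performance Analysis
4-6 Error Analysis
Chapter 5 Conclusions and Future Work
References
Appendix 1 Target Word Statistics
Appendix 2 Target Word Glosses and Example Sentences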

Advisor

Lung-Hao Lee (李龍豪)

Approval date

2022-8-25
