Disease NER of Medical Records and Analysis of Transfer Learning of Medical Records between Different Hospital Departments

Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/84108

Title:	Disease NER of Medical Records and Analysis of Transfer Learning of Medical Records between Different Hospital Departments (original title: 針對病歷之疾病命名實體標註以及醫院科別病歷轉移學習之分析)
Author:	Huang, Jue-Ni (黃珏倪)
Contributor:	Department of Computer Science and Information Engineering
Keywords:	Biomedical text mining; Machine learning; Natural language processing; Transfer learning; Disease named entity recognition
Date:	2020-08-12
Upload time:	2020-09-02 18:05:24 (UTC+8)
Publisher:	National Central University
Abstract:	With the rapid development of natural language processing (NLP), its cross-domain applications have also advanced considerably. Biomedical text mining is an important goal of biomedical research, and the shift from paper records to electronic records makes more resources available for it. We use hospital medical records as the research data source, and our primary objective is to study transfer learning of medical records between different hospital departments. To achieve this, we use biomedical named entity recognition (NER) to predict the disease names that appear in a medical record, which can help medical staff when consolidating and recording diagnoses. Past studies fall broadly into two approaches: rule-based biomedical NER and dictionary-based NER. Both share a common weakness: they are susceptible to textual ambiguity and cannot reliably resolve semantic distinctions. To address this problem, we use a machine learning approach; BioBERT (Bidirectional Encoder Representations from Transformers for Biomedical Text Mining) is one of the most important techniques in biomedical natural language processing. In our experiments, we perform text mining on medical records department by department, in order to analyze how well a model trained on the records of one department transfers to other departments, and how the text differs between departments.
Appears in collections:	[Graduate Institute of Computer Science and Information Engineering] Theses and Dissertations

Files in this item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	185

All items in NCUIR are protected by the original copyright.
