Abstract (English) |
The diagnostic records of medical cases have very high research and clinical value. They include not only information regarding the patients’ conditions, but also the judgments made by doctors based on their domain knowledge and expertise. If these diagnostic files are well utilized and implemented in computer-aided monitoring systems, the manpower needed for quality management in small clinics and large hospitals can be reduced. The reduced management burden will allow physicians to devote more time and resources to taking care of patients. Processing medical records with traditional rule-based natural language processing (NLP) technologies is difficult. Most existing clinical reports are written in an unstructured format, and they also contain many technical notes and shorthand expressions. The content often includes excluding and suspecting terms, which usually require professional medical background knowledge to understand fully. Medical doctors generally need to read the entire report to make judgments, and the high level of medical knowledge required makes the task extremely difficult for people from non-medical fields. To tackle this challenging task, we propose using newly developed deep NLP models to help non-medical information scientists derive information from annotated clinical records. BERT is the most advanced technology at present for creating embedding vectors that measure free-text document similarity, and Sentence-BERT (SBERT) is a collection of pretrained models that allow users to adapt BERT through transfer learning. In this study, we train SBERT to classify phrases and sentences as normal or abnormal. A separate model was also trained to find cardiomegaly sentences, which can be assigned highlight colors and scores for fast grasping at a glance.
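The document-similarity idea above can be illustrated with a minimal sketch: embedding vectors for two reports are compared with cosine similarity, and a higher score means more similar content. The 4-dimensional vectors here are hypothetical stand-ins; real SBERT embeddings have hundreds of dimensions (e.g. 768 for BERT-base models) and come from a trained encoder.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings for two radiology reports.
report_a = [0.8, 0.1, 0.3, 0.5]
report_b = [0.7, 0.2, 0.4, 0.4]
print(f"similarity: {cosine_similarity(report_a, report_b):.3f}")
```

In an SBERT pipeline the same comparison is applied to encoder outputs; sentences whose embeddings lie close to labeled abnormal examples can then be flagged for review.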
In this study, we used the Medical Subject Headings (MeSH) dataset (3,955 labeled medical record reports) to train the SBERT classifier model. The data was split into 2,768 (70%) records (975 normal, 1,793 abnormal) for training, 783 (19.79%) records (276 normal, 507 abnormal) for validation, and 404 (10.21%) records (142 normal, 262 abnormal) for testing. The resulting document classification accuracy was 97.2% (393/404), with a positive predictive value (PPV) of 95.8% (137/143) and a negative predictive value (NPV) of 98.1% (256/261); the sentence classification accuracy was 96.8% (5,951/6,150). By adding 100 misclassified sentences to the training dataset, we improved the sentence classification accuracy to 98.0% (6,029/6,150) with a small reduction in document classification accuracy to 96.8% (391/404) (PPV 95.7%, 135/141; NPV 97.3%, 256/263), which far exceeded NIH NegBio’s accuracy of 69.1% (PPV 53.8%, 121/225; NPV 88.3%, 158/179). In classifying documents with cardiomegaly, SBERT achieved 100% accuracy on the test dataset (36 cardiomegaly and 226 non-cardiomegaly abnormal cases). NegBio’s test accuracy was 94.1% (380/404) (PPV 60.3%, 35/58; NPV 99.7%, 345/346). |
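The reported split fractions and document-level metrics can be reproduced from the raw counts in the abstract. A minimal arithmetic check (counts taken directly from the abstract; "abnormal" is treated as the positive class, so PPV = TP/(TP+FP) and NPV = TN/(TN+FN)):

```python
# Dataset split: 3,955 labeled reports in total.
total = 3955
train, val, test = 2768, 783, 404
assert train + val + test == total
print(f"train {train/total:.2%}, val {val/total:.2%}, test {test/total:.2%}")

# Document classification on the 404 test reports (initial model).
tp, fp = 137, 143 - 137   # predicted abnormal: 143, of which 137 correct
tn, fn = 256, 261 - 256   # predicted normal:   261, of which 256 correct
assert tp + fp + tn + fn == test

accuracy = (tp + tn) / test   # 393/404
ppv = tp / (tp + fp)          # 137/143
npv = tn / (tn + fn)          # 256/261
print(f"accuracy {accuracy:.1%}, PPV {ppv:.1%}, NPV {npv:.1%}")
```

The computed values match the abstract's figures (differences are only in decimal truncation versus rounding).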
References |
1. Wolinski, F., F. Vichot, and O. Gremont, Producing NLP-based on-line contentware. arXiv preprint cs/9809021, 1998.
2. Annarumma, M., et al., Automated triaging of adult chest radiographs with deep artificial neural networks. Radiology, 2019. 291(1): p. 196.
3. Rish, I. An empirical study of the naive Bayes classifier. in IJCAI 2001 workshop on empirical methods in artificial intelligence. 2001.
4. Cornegruta, S., et al., Modelling radiological language with bidirectional long short-term memory networks. arXiv preprint arXiv:1609.08409, 2016.
5. Devlin, J., et al., BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.
6. Reimers, N. and I. Gurevych, Sentence-BERT: Sentence embeddings using Siamese BERT-networks. arXiv preprint arXiv:1908.10084, 2019.
7. Koch, G., R. Zemel, and R. Salakhutdinov. Siamese neural networks for one-shot image recognition. in ICML deep learning workshop. 2015. Lille.
8. Friedman, C., et al., A general natural-language text processor for clinical radiology. Journal of the American Medical Informatics Association, 1994. 1(2): p. 161-174.
9. Peng, Y., et al., NegBio: a high-performance tool for negation and uncertainty detection in radiology reports. AMIA Summits on Translational Science Proceedings, 2018. 2018: p. 188.
10. Chapman, W.W., et al., A simple algorithm for identifying negated findings and diseases in discharge summaries. Journal of biomedical informatics, 2001. 34(5): p. 301-310.
11. NegBio2 [software]. 2019.
12. Pota, M., et al., An effective BERT-based pipeline for Twitter sentiment analysis: A case study in Italian. Sensors, 2020. 21(1): p. 133.
13. Alaparthi, S. and M. Mishra, BERT: A sentiment analysis odyssey. Journal of Marketing Analytics, 2021. 9(2): p. 118-126.
14. Zhang, T., et al., BERTScore: Evaluating text generation with BERT. arXiv preprint arXiv:1904.09675, 2019.
15. Sun, C., et al. Revisiting unreasonable effectiveness of data in deep learning era. in Proceedings of the IEEE international conference on computer vision. 2017.
16. Raoof, S., et al., Interpretation of plain chest roentgenogram. Chest, 2012. 141(2): p. 545-558.
17. Demner-Fushman, D., et al., Preparing a collection of radiology examinations for distribution and retrieval. Journal of the American Medical Informatics Association, 2016. 23(2): p. 304-310.
18. Hassanpour, S. and C.P. Langlotz, Unsupervised topic modeling in a large free text radiology report repository. Journal of digital imaging, 2016. 29(1): p. 59-62.
19. Mikolov, T., et al., Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781, 2013.
20. Chen, Y., Convolutional neural network for sentence classification. 2015, University of Waterloo.
21. Vaswani, A., et al., Attention is all you need. Advances in neural information processing systems, 2017. 30.
22. Liao, H.-H., Natural language processing for sentiment analysis classification of medical records and sentence similarity computation (in Chinese). 2021, National Central University. |