Abstract (English) |
The diagnostic records of medical cases have very high research and clinical value. They include not only information regarding the patients’ conditions, but also the judgments made by doctors based on their domain knowledge and expertise. If these diagnostic files are well utilized and implemented in computer-aided monitoring systems, the manpower needed for quality management in small clinics and large hospitals can be reduced. The reduced management burden will allow physicians to devote more time and resources to taking care of patients. Processing medical records with traditional rule-based natural language processing (NLP) technologies is difficult. Most existing clinical reports are written in an unstructured format, and they also contain many technical notes and shorthand expressions. The content often includes excluding and suspecting terms, which usually require professional medical background knowledge to understand fully. Medical doctors generally need to read the entire report to make judgments, and the high level of medical knowledge required makes the task extremely difficult for people from non-medical fields. To tackle this challenging task, we propose using newly developed deep NLP models to help non-medical information scientists derive information from annotated clinical records. BERT is the most advanced technology at present for creating embedding vectors that measure free-text document similarity, and Sentence-BERT (SBERT) is a collection of pretrained models that allow users to adapt BERT through transfer learning. In this study, we train SBERT to classify phrases and sentences as normal or abnormal. A separate model was also trained to find cardiomegaly sentences, which can be assigned highlight colors and scores for fast grasping at a glance.
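The document-similarity idea above can be illustrated with a minimal sketch: embedding vectors for two reports are compared with cosine similarity, and a higher score means more similar content. The 4-dimensional vectors here are hypothetical stand-ins; real SBERT embeddings have hundreds of dimensions (e.g. 768 for BERT-base models) and come from a trained encoder.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings for two radiology reports.
report_a = [0.8, 0.1, 0.3, 0.5]
report_b = [0.7, 0.2, 0.4, 0.4]
print(f"similarity: {cosine_similarity(report_a, report_b):.3f}")
```

In an SBERT pipeline the same comparison is applied to encoder outputs; sentences whose embeddings lie close to labeled abnormal examples can then be flagged for review.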
In this study, we used the Medical Subject Headings (MeSH) dataset (3,955 labeled medical record reports) to train the SBERT classifier model. The data was split into 2,768 (70%) records (975 normal, 1,793 abnormal) for training, 783 (19.79%) records (276 normal, 507 abnormal) for validation, and 404 (10.21%) records (142 normal, 262 abnormal) for testing. The resulting document classification accuracy was 97.2% (393/404), with a positive predictive value (PPV) of 95.8% (137/143) and a negative predictive value (NPV) of 98.1% (256/261); the sentence classification accuracy was 96.8% (5,951/6,150). By adding 100 misclassified sentences to the training dataset, we improved the sentence classification accuracy to 98.0% (6,029/6,150) with a small reduction in document classification accuracy to 96.8% (391/404) (PPV 95.7%, 135/141; NPV 97.3%, 256/263), which far exceeded NIH NegBio’s accuracy of 69.1% (PPV 53.8%, 121/225; NPV 88.3%, 158/179). In classifying documents with cardiomegaly, SBERT achieved 100% accuracy on the test dataset (36 cardiomegaly and 226 non-cardiomegaly abnormal cases). NegBio’s test accuracy was 94.1% (380/404) (PPV 60.3%, 35/58; NPV 99.7%, 345/346). |
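The reported split fractions and document-level metrics can be reproduced from the raw counts in the abstract. A minimal arithmetic check (counts taken directly from the abstract; "abnormal" is treated as the positive class, so PPV = TP/(TP+FP) and NPV = TN/(TN+FN)):

```python
# Dataset split: 3,955 labeled reports in total.
total = 3955
train, val, test = 2768, 783, 404
assert train + val + test == total
print(f"train {train/total:.2%}, val {val/total:.2%}, test {test/total:.2%}")

# Document classification on the 404 test reports (initial model).
tp, fp = 137, 143 - 137   # predicted abnormal: 143, of which 137 correct
tn, fn = 256, 261 - 256   # predicted normal:   261, of which 256 correct
assert tp + fp + tn + fn == test

accuracy = (tp + tn) / test   # 393/404
ppv = tp / (tp + fp)          # 137/143
npv = tn / (tn + fn)          # 256/261
print(f"accuracy {accuracy:.1%}, PPV {ppv:.1%}, NPV {npv:.1%}")
```

The computed values match the abstract's figures (differences are only in decimal truncation versus rounding).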
References |
1. Wolinski, F., F. Vichot, and O. Gremont, Producing NLP-based on-line contentware. arXiv preprint cs/9809021, 1998.
2. Annarumma, M., et al., Automated triaging of adult chest radiographs with deep artificial neural networks. Radiology, 2019. 291(1): p. 196.
3. Rish, I. An empirical study of the naive Bayes classifier. in IJCAI 2001 workshop on empirical methods in artificial intelligence. 2001.
4. Cornegruta, S., et al., Modelling radiological language with bidirectional long short-term memory networks. arXiv preprint arXiv:1609.08409, 2016.
5. Devlin, J., et al., BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.
6. Reimers, N. and I. Gurevych, Sentence-BERT: Sentence embeddings using Siamese BERT-networks. arXiv preprint arXiv:1908.10084, 2019.
7. Koch, G., R. Zemel, and R. Salakhutdinov. Siamese neural networks for one-shot image recognition. in ICML deep learning workshop. 2015. Lille.
8. Friedman, C., et al., A general natural-language text processor for clinical radiology. Journal of the American Medical Informatics Association, 1994. 1(2): p. 161-174.
9. Peng, Y., et al., NegBio: a high-performance tool for negation and uncertainty detection in radiology reports. AMIA Summits on Translational Science Proceedings, 2018. 2018: p. 188.
10. Chapman, W.W., et al., A simple algorithm for identifying negated findings and diseases in discharge summaries. Journal of biomedical informatics, 2001. 34(5): p. 301-310.
11. NegBio2 [software]. 2019.
12. Pota, M., et al., An effective BERT-based pipeline for Twitter sentiment analysis: A case study in Italian. Sensors, 2020. 21(1): p. 133.
13. Alaparthi, S. and M. Mishra, BERT: A sentiment analysis odyssey. Journal of Marketing Analytics, 2021. 9(2): p. 118-126.
14. Zhang, T., et al., BERTScore: Evaluating text generation with BERT. arXiv preprint arXiv:1904.09675, 2019.
15. Sun, C., et al. Revisiting unreasonable effectiveness of data in deep learning era. in Proceedings of the IEEE international conference on computer vision. 2017.
16. Raoof, S., et al., Interpretation of plain chest roentgenogram. Chest, 2012. 141(2): p. 545-558.
17. Demner-Fushman, D., et al., Preparing a collection of radiology examinations for distribution and retrieval. Journal of the American Medical Informatics Association, 2016. 23(2): p. 304-310.
18. Hassanpour, S. and C.P. Langlotz, Unsupervised topic modeling in a large free text radiology report repository. Journal of digital imaging, 2016. 29(1): p. 59-62.
19. Mikolov, T., et al., Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781, 2013.
20. Chen, Y., Convolutional neural network for sentence classification. 2015, University of Waterloo.
21. Vaswani, A., et al., Attention is all you need. Advances in neural information processing systems, 2017. 30.
22. Liao, H.-H., Natural language processing for sentiment analysis classification of medical records and sentence similarity computation (in Chinese). 2021, National Central University. |