    Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/89908


    Title: 使用文字探勘與深度學習技術建置中風後肺炎之預測模型; Develop The Predictive Model For Post Stroke Pneumonia By Using Text Mining And Deep Learning
    Author: 徐筱茜; Hsu, Hsiao-Chein
    Contributor: Department of Information Management
    Keywords: stroke; pneumonia; text mining; machine learning; deep learning; electronic medical records (EMR)
    Date: 2022-08-26
    Upload time: 2022-10-04 12:04:26 (UTC+8)
    Publisher: National Central University
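    Abstract: Stroke is one of the world's major health problems and the second leading cause of death worldwide, and its disabling sequelae are one of the main causes of adult disability in Taiwan. Stroke-associated pneumonia (SAP) is an important clinical problem in the prognosis of patients with acute ischemic stroke (AIS). Most stroke patients have some degree of functional impairment; dysphagia, for example, increases the risk of aspiration pneumonia sevenfold. One third of AIS patients develop pneumonia, which is the most common respiratory complication. SAP is closely associated with increased long-term mortality, longer hospital stays, higher medical costs, and worse functional outcomes.
    The main purpose of this study is to incorporate deep learning into text mining in order to extract, from unstructured electronic medical records, new keywords that may affect SAP and use them as variables; to apply machine learning to these variables to build an early prediction model for post-stroke pneumonia; and to compare the predictive performance of models built on unstructured data with that of models built on structured data. The resulting model predicts whether AIS patients are at risk of developing pneumonia during hospitalization, helping physicians make precise judgments.
    The study data are the Chinese EMR nursing notes from the day of admission and the stroke registry database of Chia-Yi Christian Hospital, covering 941 ischemic stroke patients from May 2007 to September 2020. Feature engineering on the unstructured data used six techniques: TFIDF, Doc2Vec, Bidirectional Encoder Representations from Transformers (BERT), BioBERT, Bio_Clinical Bert, and MetaMap combined with the UMLS Metathesaurus and term frequency. The resulting text features and the structured features were modeled with eight machine learning methods (support vector machine, naive Bayes classifier, K-nearest neighbors, logistic regression, decision tree, random forest, extreme gradient boosting, and light gradient boosting), and the results were compared.
    The results show the following. 1. Building the model on unstructured text features combined with structured features improves predictive performance over using structured features alone or unstructured text features alone, raising AUC by 1% and 9%, respectively. 2. Adding deep learning to the feature engineering of unstructured text, in the form of word embeddings, raises AUC by 1% compared with traditional feature engineering methods. 3. Using BioBERT, pre-trained on a biomedical corpus, and Bio_Clinical Bert, pre-trained on clinical records, as the feature engineering technique for unstructured EMR gives better predictive performance in subsequent modeling than word embeddings from BERT pre-trained on a general corpus, raising AUC by 9% and 8%, respectively. 4. Extreme gradient boosting (XGB) is the most suitable machine learning classifier for the post-stroke pneumonia prediction model. These findings help improve the predictive performance of post-stroke pneumonia models and give clinicians more accurate decision support.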
    Stroke is one of the world's major health problems and the second leading cause of death worldwide. The disabling sequelae of stroke are one of the main causes of adult disability in Taiwan. Stroke-associated pneumonia (SAP) is an important clinical problem in the prognosis of patients with acute ischemic stroke (AIS). Most stroke patients suffer from varying degrees of mobility impairment. Dysphagia, for example, increases the risk of aspiration pneumonia sevenfold. Pneumonia is the most common respiratory complication and occurs in one-third of patients with AIS. SAP is closely associated with increased long-term mortality, prolonged hospital stays, increased healthcare costs, and worse functional outcomes.
    The main purpose of this study is to combine deep learning with text mining to extract new risk factors that may affect SAP from unstructured electronic medical records (EMR), and then to use those factors to build a model that predicts the risk of pneumonia in AIS patients during hospitalization. The model is intended to help physicians make accurate judgments.
    This study used Chinese EMR and the stroke registration database of Chia-Yi Christian Hospital from 2007 to 2017. In total, 941 eligible patients with AIS were used to build and evaluate the models. Six techniques were used for feature engineering on the unstructured data: TFIDF, Doc2Vec, MetaMap, Bidirectional Encoder Representations from Transformers (BERT), BERT for Biomedical Text Mining (BioBERT), and BERT for Clinical Text Mining (Bio_Clinical Bert). Eight machine learning methods (support vector machine, naive Bayes classifier, K-nearest neighbors, logistic regression, decision tree, random forest, extreme gradient boosting, and light gradient boosting machine) were implemented to develop the models and compare their predictive results.
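    As an illustration of the simplest of the six feature engineering techniques named above, the following is a minimal TF-IDF sketch in plain Python. It is not the thesis implementation: the function name `tfidf`, the unsmoothed weighting `tf * log(N / df)`, and the toy English tokens are all assumptions made for this example (the study itself used Chinese nursing notes, and its exact TF-IDF variant is not stated in the abstract).

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Weight = (term count / document length) * log(N / document frequency).
    This unsmoothed variant is an assumption; libraries often smooth the idf.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical, pre-tokenized notes (illustrative only, not study data).
notes = [
    ["dysphagia", "cough", "fever"],
    ["alert", "oriented", "cough"],
    ["dysphagia", "nasogastric", "tube"],
]
vectors = tfidf(notes)
```

    In this toy corpus, a term that appears in only one note (such as "fever") receives a higher weight than a term shared by two notes (such as "cough"), which is the property that makes TF-IDF useful for surfacing distinctive keywords. The resulting vectors would then be passed to a classifier.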
    The results show that prediction models built on unstructured text features combined with structured features performed better than models built on structured features alone or on unstructured text features alone. In addition, deep learning feature engineering produced better embeddings than traditional feature engineering methods. Using BioBERT and Bio_Clinical Bert, which are pre-trained on a biomedical corpus and on clinical records respectively, as the feature engineering technique for unstructured EMR enabled subsequent modeling to achieve better performance than using BERT pre-trained on a general corpus. Among all the classification techniques investigated, extreme gradient boosting is the most suitable machine learning classifier for the post-stroke pneumonia prediction model.
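    The model comparisons in this record are reported in terms of AUC. As a reminder of what that metric measures, here is a minimal sketch that computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `roc_auc` and the labels and scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical predictions from two models on the same five patients.
labels = [1, 0, 1, 0, 0]            # 1 = developed pneumonia
scores_a = [0.9, 0.4, 0.6, 0.7, 0.2]
scores_b = [0.9, 0.4, 0.8, 0.7, 0.2]

auc_a = roc_auc(labels, scores_a)
auc_b = roc_auc(labels, scores_b)
```

    Here the second model ranks every positive case above every negative case, so its AUC is higher than the first model's. The pairwise loop is quadratic in the number of cases and is meant only to make the definition concrete; a rank-based computation would be preferable on larger data.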
    Appears in collections: [Graduate Institute of Information Management] Master's and doctoral theses
