
    Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/68964

    Title: A hybrid approach to identifying heart disease risk factors and progression in electronic medical records
    Authors: Chien, Chou-Yang (簡舟陽)
    Contributors: Department of Computer Science and Information Engineering
    Keywords: Biomedical information; Natural language processing; Machine learning
    Date: 2015-08-25
    Issue Date: 2015-09-23 14:47:33 (UTC+8)
    Publisher: National Central University
    Abstract: Electronic medical records provide detailed information about patients' health, and disease risk factors are a major threat to that health, so detecting them is an important goal of medical text mining. Coronary artery disease was the leading cause of death worldwide in 2012 and 2013. Detecting heart disease risk factors in electronic medical records and tracking their progression over sets of longitudinal records can therefore give medical staff a reference for preventing the disease. Risk factors appear in medical records as named entities, tabular entries, parts of sentences, and multi-sentence expressions, which makes them difficult to detect with a single approach.

    In this paper, we present a hybrid approach to this task. We develop three systems based on the conditional random fields (CRF) model, each of which targets one of three major risk factor categories: disease, medication, and smoker. To recognize risk factors not found by our CRF-based systems, we formulate syntactic rules based on physiological indicators and risk factor keywords. To track patient progression longitudinally, we also use a maximum entropy classifier to label the identified risk factor mentions with tags that describe their relation to the document creation time.
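    As an illustration of what a rule based on a physiological indicator might look like, the following is a minimal sketch and not the rules used in this work. The regular expression, the 140/90 blood pressure thresholds, the function name find_high_bp, and the output field names are all assumptions made for the example.

```python
import re

# Hypothetical pattern for mentions such as "BP 150/95" or "blood pressure: 132/84".
BP_PATTERN = re.compile(
    r"\b(?:BP|blood pressure)\s*:?\s*(\d{2,3})\s*/\s*(\d{2,3})\b",
    re.IGNORECASE,
)

# Assumed thresholds for this sketch; the abstract does not state the actual values.
SYSTOLIC_LIMIT = 140
DIASTOLIC_LIMIT = 90


def find_high_bp(text):
    """Return one record per blood pressure reading above the assumed limits."""
    hits = []
    for match in BP_PATTERN.finditer(text):
        systolic = int(match.group(1))
        diastolic = int(match.group(2))
        if systolic > SYSTOLIC_LIMIT or diastolic > DIASTOLIC_LIMIT:
            hits.append(
                {
                    "risk_factor": "hypertension",  # hypothetical label
                    "indicator": "high bp",         # hypothetical label
                    "text": match.group(0),
                    "span": match.span(),
                }
            )
    return hits


if __name__ == "__main__":
    for hit in find_high_bp("Vitals: BP 150/95, pulse 72."):
        print(hit)
```

    A rule of this kind would only supplement a statistical tagger: it fires on numeric readings that a sequence model may not label as a risk factor mention.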

    Our experimental results show that our CRF-based systems achieve an F-score of 88.27% on the i2b2 2014 Track 2 test dataset. Adding the syntactic rules raises the F-score by 1.47 percentage points to 89.74%. Finally, adding post-processing to this configuration yields our best F-score of 91.74%, a further improvement of 2 percentage points.
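    For readers unfamiliar with the metric, the sketch below shows how an F-score is usually computed as the harmonic mean of precision and recall, and checks the gains between the three reported configurations. The abstract does not report precision, recall, or raw counts, so the counts in the usage line are hypothetical, and the function name f_score is introduced only for this example.

```python
def f_score(true_positives, false_positives, false_negatives):
    """Balanced F-score (F1): harmonic mean of precision and recall."""
    if true_positives == 0:
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# F-scores reported in the abstract, in percent.
CRF_ONLY = 88.27
CRF_PLUS_RULES = 89.74
WITH_POST_PROCESSING = 91.74

# Gains in percentage points between successive configurations.
rule_gain = round(CRF_PLUS_RULES - CRF_ONLY, 2)
post_processing_gain = round(WITH_POST_PROCESSING - CRF_PLUS_RULES, 2)

if __name__ == "__main__":
    # Hypothetical counts, only to show the function in use.
    print(f_score(true_positives=90, false_positives=10, false_negatives=10))
    print(rule_gain, post_processing_gain)
```

    The two gains are absolute differences in percentage points, not relative improvements.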
    Appears in Collections: [Graduate Institute of Computer Science and Information Engineering] Master's and Doctoral Theses


    All items in NCUIR are protected by copyright, with all rights reserved.
