NCU Institutional Repository (theses and dissertations, past exam papers, journal articles, and research projects available for download): Item 987654321/72251


    Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/72251


    Title: 基於生醫文本擷取功能性層級之生物學表徵語言敘述:由主成分分析發想之K近鄰算法; Extracting Function-level Statements in Biological Expression Language from Biomedical Literature: A K Nearest Neighbor Approach Inspired by Principal Component Analysis
    Author: Lo, Yu-Yan (羅玉燕)
    Contributor: Department of Computer Science and Information Engineering
    Keywords: Biomedical text mining; Biological Expression Language; Machine learning; Principal component analysis; K-nearest neighbor
    Date: 2016-08-19
    Uploaded: 2016-10-13 14:34:54 (UTC+8)
    Publisher: National Central University
    Abstract: Understanding protein signaling pathways in living organisms has long been a main goal of biomedical research, because these pathways involve many regulatory mechanisms. Different combinations of regulatory mechanisms form different signaling pathways, and these pathways are interrelated, connecting into an intricate signal transduction network. In recent years, advances in experimental techniques and easier exchange of information have led to rapid growth in the biomedical literature, and demand for biomedical text mining has grown with it. Biological Expression Language (BEL) is a representation for describing biomedical signaling networks. It can describe not only the positive or negative relationship between two biomedical entities (genes, proteins, chemicals, and so on) but also function-level information about those entities, such as whether an entity is a complex, a chaperone protein, or acts as a catalyst. In related research, the latest reported performance for extracting function-level BEL information is 30.5%, and this result affects the completeness of subsequent automated extraction of full BEL statements. To improve the completeness of BEL statements, we propose a K-nearest neighbor (KNN) approach inspired by Principal Component Analysis (PCA) to recognize function-level biomedical entities automatically. In our experiments, we also present a function-level entity classification method for an imbalanced dataset and compare the strengths and weaknesses of a support vector machine (SVM) against the PCA-inspired KNN approach. The results show that the PCA-inspired KNN approach performs better than the SVM-based method on the imbalanced dataset, achieving an F-score of 59.70%. We hope that this method for automatically recognizing function-level biomedical entities will improve the completeness of future biomedical signaling networks and thereby accelerate biomedical and pharmaceutical research for life scientists.
    Appears in Collections: [Graduate Institute of Computer Science and Information Engineering] Theses and Dissertations

    Files in This Item:

    File         Description   Size   Format   Views
    index.html                 0Kb    HTML     386


    All items in NCUIR are protected by original copyright.
