    Please use this permanent URL to cite or link to this document: http://ir.lib.ncu.edu.tw/handle/987654321/8764


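    Title: Analysis of high-resolution protein mass spectra based on peak feature selection (利用峰點特徵值來分析高解析度蛋白質質譜資料)
    Author: Tsung-Pai Chen (陳聰百)
    Contributor: In-service Master's Program, Department of Computer Science and Information Engineering
    Keywords: spectra alignment; peak detection; mass spectrometer; classification; baseline correction; feature selection; SELDI-TOF; MALDI-TOF
    Date: 2006-06-15
    Upload time: 2009-09-22 11:34:23 (UTC+8)
    Publisher: National Central University Library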
    Abstract: Surface-enhanced laser desorption/ionization time-of-flight (SELDI-TOF) and matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry are currently used to identify cancer biomarkers. This work uses two datasets: a SELDI-TOF ovarian cancer dataset from the National Cancer Institute, USA, and a MALDI-TOF oral cancer dataset from the Proteomics Center of Chang Gung University, Taiwan. In both datasets, samples are divided into a control group and a cancer patient group. The aim is to reduce the high dimensionality of the mass spectra and extract significant peak features for further study. Feature extraction uses baseline subtraction, peak detection, spectra alignment, and normalization. Feature selection uses the Kolmogorov-Smirnov (KS) test, logistic regression, and random forest. The selected discriminatory peak features are then classified with three methods: regression tree with bagging, k-nearest neighbor, and support vector machine (SVM). The 50 and 100 most discriminatory peak features were each evaluated with 1000 random replications of proportional 10-fold cross-validation, and all three classifiers yielded good sensitivity, specificity, accuracy, and precision. We also developed a correlation-based query system to identify highly correlated peaks in the cancer and non-cancer groups. The proposed analysis pipeline provides a relatively small peak-feature set that is discriminatory enough for classification and correlation-based studies.
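    The KS-test feature selection step named in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names `ks_statistic` and `rank_peaks`, the toy intensities, and the assumption that every spectrum has already been reduced to peak intensities on a shared, aligned m/z grid are all hypothetical. It ranks each peak feature by the two-sample KS statistic between the cancer and control groups and keeps the top-ranked ones.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap
    between the empirical CDFs of samples a and b."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    # The largest gap can only occur at one of the observed values.
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def rank_peaks(cancer, control, top_k):
    """Return indices of the top_k peak features, most discriminatory first.

    cancer and control are lists of spectra. Each spectrum is a list of
    peak intensities on a shared, already aligned m/z grid (an assumption
    of this sketch, not a detail taken from the thesis).
    """
    n_features = len(cancer[0])
    scores = []
    for j in range(n_features):
        d = ks_statistic([s[j] for s in cancer], [s[j] for s in control])
        scores.append((d, j))
    # Descending KS statistic; ties are broken by the lower feature index.
    scores.sort(key=lambda t: (-t[0], t[1]))
    return [j for _, j in scores[:top_k]]


# Toy data, made up for illustration: 4 spectra per group, 3 peak features.
cancer = [[5.0, 1.0, 2.0], [6.0, 1.2, 2.1], [5.5, 0.9, 3.0], [6.2, 1.1, 1.0]]
control = [[1.0, 1.0, 2.0], [1.5, 1.1, 2.2], [0.8, 1.2, 1.1], [1.2, 0.9, 2.9]]
selected = rank_peaks(cancer, control, top_k=2)
```

    On this toy data, feature 0 separates the two groups completely (KS statistic 1.0), feature 1 has identical distributions in both groups (0.0), and feature 2 falls in between (0.25), so `selected` is `[0, 2]`. The abstract's 50 or 100 features would correspond to `top_k=50` or `top_k=100` under the same assumptions.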
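    Appears in collections: [In-service Master's Program, Department of Computer Science and Information Engineering] Theses and Dissertations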

    All items in NCUIR are protected by the original copyright.
