Abstract: | Matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) is widely used to identify microorganisms, and in recent years many studies have also applied it to predict antibiotic resistance in bacteria. To distinguish antibiotic-resistant bacteria, various preprocessing methods are used to find informative peaks in the MS data. Different preprocessing methods yield different information; our aim is to obtain more informative peaks from the spectra and thereby improve the identification of antibiotic resistance. In this study, we combine multiple preprocessing methods, namely FlexAnalysis (Bruker Daltonics), MALDIquant (an R package), and a continuous wavelet transform-based method, to detect peaks, and we build machine learning classifiers (logistic regression, naïve Bayes, random forest, and support vector machine) to rapidly identify antibiotic resistance in Acinetobacter nosocomialis, Acinetobacter baumannii, Enterococcus faecium, and Group B Streptococci, based on MS data collected over many years by Chang Gung Memorial Hospital. We also compare the peaks found by the combined methods with those found by each individual preprocessing method in terms of performance in identifying antibiotic-resistant bacteria. For every species, the random forest built on features extracted by the combined methods has the highest accuracy, achieving 90.96%, 84.37%, 78.54%, and 70.12% accuracy on independent tests, respectively.
Through feature selection, important peaks associated with antibiotic resistance can also be identified from the integrated information. The prediction models can give clinicians timely information on antibiotic resistance, and the informative peaks can serve as a reference for further research on identifying antibiotic resistance in bacteria. |