Studies over the last few years have shown that methylation of histones and other proteins is involved in the regulation of gene transcription. Several computational approaches have been developed to identify potential methylation sites on lysine and arginine. Studies of protein tertiary structure have demonstrated that methylation sites tend to lie in surface regions of the protein that are easily accessible to other molecules. However, previous methods have not taken into account the solvent-accessible surface area (ASA) of the residues surrounding the methylation sites. This work presents a method named MASA that combines a support vector machine (SVM) with sequence and structural characteristics of proteins to identify methylation sites on lysine, arginine, glutamate and asparagine. Since most experimentally verified methylation sites have no corresponding protein tertiary structure in the Protein Data Bank (PDB), a solvent-accessibility prediction tool was used to estimate the ASA values of amino acids in these proteins. Cross-validation indicates that including the ASA values around the methylation sites improves prediction accuracy. Additionally, an independent test shows prediction accuracies of 80.8% and 85.0% for methylated lysine and arginine, respectively.
Finally, the proposed method is implemented as an effective system for identifying protein methylation sites. The developed web server is freely available at http://MASA.mbc.nctu.edu.tw/.
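
As a rough illustration of how sequence and ASA information around a candidate site might be combined into one feature vector for an SVM, consider the minimal sketch below. It is not the MASA implementation: the function names `candidate_sites` and `encode_site`, the one-hot encoding, the window size `flank`, and the zero-padding at sequence ends are all assumptions made for this example, and the ASA values are placeholder numbers standing in for the output of a solvent-accessibility predictor.

```python
from typing import List, Sequence

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def candidate_sites(sequence: str, targets: str = "KREN") -> List[int]:
    """Return 0-based positions of residues that could be methylated.

    The default targets are lysine (K), arginine (R), glutamate (E)
    and asparagine (N), the four residue types named in the abstract.
    """
    return [i for i, residue in enumerate(sequence) if residue in targets]


def encode_site(sequence: str, asa: Sequence[float],
                center: int, flank: int = 3) -> List[float]:
    """Encode the window around `center` as one-hot residues plus ASA.

    Each window position contributes 21 numbers: a 20-dimensional
    one-hot vector for the residue, followed by its ASA value.
    Positions beyond the sequence ends are zero-padded (an assumption).
    """
    if len(sequence) != len(asa):
        raise ValueError("sequence and asa must have the same length")
    features: List[float] = []
    for pos in range(center - flank, center + flank + 1):
        if 0 <= pos < len(sequence):
            residue = sequence[pos]
            features.extend(1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS)
            features.append(float(asa[pos]))
        else:
            # Window runs off the sequence: pad residue and ASA with zeros.
            features.extend([0.0] * (len(AMINO_ACIDS) + 1))
    return features


if __name__ == "__main__":
    seq = "MKRAK"                       # toy sequence, not real data
    asa = [0.1, 0.8, 0.6, 0.2, 0.9]     # placeholder relative ASA values
    for site in candidate_sites(seq):
        vector = encode_site(seq, asa, site, flank=1)
        print(site, seq[site], len(vector))
```

Vectors built this way, labeled as methylated or non-methylated, could then be passed to any SVM implementation; comparing a model trained with the ASA column against one trained without it mirrors the kind of comparison the abstract describes.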