Abstract (English) |
Protein phosphorylation, an important mechanism of post-translational modification, affects essential cellular processes such as metabolism, cell signaling, differentiation, and membrane transport. Phosphorylation is carried out by protein kinases. The aim of this work is to computationally predict phosphorylation sites within given protein sequences. Known phosphorylation sites are grouped by substrate sequence and by the protein kinase class that catalyzes them. A profile hidden Markov model (HMM) is trained on each group of sequences surrounding the phosphorylated residues. A prediction tool for protein phosphorylation sites, named KinasePhos, is implemented so that users can submit protein sequences for phosphorylation site prediction. Compared with previously developed approaches, our method achieves higher accuracy and reports not only the locations of the phosphorylation sites but also the corresponding catalytic protein kinases. |
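As a rough illustration of the workflow the abstract describes (group known sites by kinase, learn a model from the residues flanking each site, then scan a query sequence), the sketch below uses a gap-free position-specific log-odds profile. This is a deliberate simplification of a profile HMM (match states only, no insert or delete states) and is not the KinasePhos implementation. The function names, the flank width of 4, the pseudocount, the score threshold, and the `TOY_WINDOWS` training data are all assumptions made up for the example.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAD = "-"  # padding symbol for windows that run past the sequence ends

# Made-up 9-residue windows centred on a serine; NOT real training data.
TOY_WINDOWS = ["LRRASLGAA", "GRRGSVAEE", "KRRLSFAEP", "ARRNSVHDI"]


def extract_windows(sequence, residues="STY", flank=4):
    """Return (1-based position, window) pairs centred on each candidate residue."""
    padded = PAD * flank + sequence + PAD * flank
    windows = []
    for i, aa in enumerate(sequence):
        if aa in residues:
            windows.append((i + 1, padded[i:i + 2 * flank + 1]))
    return windows


def train_profile(site_windows, pseudocount=1.0):
    """Per-column log2-odds scores against a uniform amino acid background."""
    width = len(site_windows[0])
    background = 1.0 / len(AMINO_ACIDS)
    profile = []
    for col in range(width):
        counts = Counter(w[col] for w in site_windows if w[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        profile.append({
            aa: math.log2((counts[aa] + pseudocount) / total / background)
            for aa in AMINO_ACIDS
        })
    return profile


def score_window(profile, window):
    """Sum the column scores; padding symbols contribute zero."""
    return sum(col.get(aa, 0.0) for col, aa in zip(profile, window))


def predict_sites(sequence, profiles, threshold=0.0, flank=4):
    """Return (position, residue, kinase_group, score) for windows above threshold."""
    hits = []
    for pos, window in extract_windows(sequence, flank=flank):
        for group, profile in profiles.items():
            score = score_window(profile, window)
            if score > threshold:
                hits.append((pos, sequence[pos - 1], group, score))
    return hits
```

Calling `predict_sites("MKLRRASLGAAQ", {"toy_group": train_profile(TOY_WINDOWS)})` scans every S, T, and Y in the query and labels each hit with the group whose profile scored it, mirroring the idea of reporting both the site and the likely kinase class. A real profile HMM would additionally model insertions and deletions and would be trained on curated site data.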