Predicting the phosphorylation sites using hidden Markov models and machine learning methods. : WestminsterResearch

Publication dates
Title	Predicting the phosphorylation sites using hidden Markov models and machine learning methods.
Type	Journal article
Authors	Senawongse, P., Dalby, A.R. and Yang, Z.R.
Abstract	Accurately predicting phosphorylation sites in proteins is an important issue in postgenomics, for which how to efficiently extract the most predictive features from amino acid sequences for modeling is still challenging. Although both the distributed encoding method and the bio-basis function method work well, they still have some limits in use. The distributed encoding method is unable to code the biological content in sequences efficiently, whereas the bio-basis function method is a nonparametric method, which is often computationally expensive. As hidden Markov models (HMMs) can be used to generate one model for one cluster of aligned protein sequences, the aim in this study is to use HMMs to extract features from amino acid sequences, where sequence clusters are determined using available biological knowledge. In this novel method, HMMs are first constructed using functional sequences only. Both functional and nonfunctional training sequences are then inputted into the trained HMMs to generate functional and nonfunctional feature vectors. From this, a machine learning algorithm is used to construct a classifier based on these feature vectors. It is found in this work that (1) this method provides much better prediction accuracy than the use of HMMs only for prediction, and (2) the support vector machines (SVMs) algorithm outperforms decision trees and neural network algorithms when they are constructed on the features extracted using the trained HMMs.
Journal	Journal of Chemical Information and Modeling
Journal citation	45 (4), pp. 1147-1152
ISSN	1549-9596
	1549-960X
Year	2005
Publisher	ACS Publications
Digital Object Identifier (DOI)	https://doi.org/10.1021/ci050047+
PubMed ID	16045309
Web address (URL)	http://europepmc.org/abstract/med/16045309
Published	18 Jun 2005
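
The two-stage scheme described in the abstract (one model per cluster of functional sequences, model scores used as features, then a classifier trained on functional and nonfunctional feature vectors) can be illustrated with a minimal sketch. It is not the authors' implementation, and it makes several simplifying assumptions: each "HMM" is reduced to a match-state-only profile (position-specific emission probabilities, no insert or delete states), the classifier is a plain perceptron standing in for the SVM, and the 5-residue windows, cluster names and all function names are invented toy values.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def train_profile(seqs, pseudocount=1.0):
    """Estimate per-position emission probabilities from aligned, equal-length
    sequences. This is a simplified stand-in for a full profile HMM."""
    profile = []
    for pos in range(len(seqs[0])):
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for seq in seqs:
            counts[seq[pos]] += 1
        total = sum(counts.values())
        profile.append({aa: c / total for aa, c in counts.items()})
    return profile

def log_odds(profile, seq):
    """Log-likelihood ratio of seq under the profile versus a uniform background."""
    background = 1.0 / len(AMINO_ACIDS)
    return sum(math.log(profile[i][aa] / background) for i, aa in enumerate(seq))

def extract_features(profiles, seq):
    """One feature per trained model: the score of seq under that model."""
    return [log_odds(p, seq) for p in profiles]

def score(model, x):
    w, b = model
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(model, x):
    return 1 if score(model, x) > 0 else -1

def train_perceptron(X, y, epochs=500, lr=0.1):
    """Perceptron used here only as a dependency-free stand-in for an SVM."""
    model = ([0.0] * len(X[0]), 0.0)
    for _ in range(epochs):
        for x, t in zip(X, y):
            if predict(model, x) != t:
                w, b = model
                model = ([wi + lr * t * xi for wi, xi in zip(w, x)], b + lr * t)
    return model

# Toy data (hypothetical): two clusters of functional windows, one nonfunctional set.
functional_clusters = {
    "cluster_a": ["RRASV", "RKASL", "RRGSI"],
    "cluster_b": ["PLSPE", "ALSPK", "PVSPR"],
}
nonfunctional = ["GASGA", "LLSLL", "DESDE"]

# Step 1: build the models from functional sequences only.
profiles = [train_profile(seqs) for seqs in functional_clusters.values()]

# Step 2: pass both functional and nonfunctional sequences through the models.
functional = [s for seqs in functional_clusters.values() for s in seqs]
X = [extract_features(profiles, s) for s in functional + nonfunctional]
y = [1] * len(functional) + [-1] * len(nonfunctional)

# Step 3: train a classifier on the resulting feature vectors.
classifier = train_perceptron(X, y)
train_predictions = [predict(classifier, x) for x in X]
print(train_predictions)
```

In practice the profiles would be replaced by HMMs trained with a dedicated toolkit and the perceptron by an SVM implementation; the sketch only shows how model scores become a fixed-length feature vector for a downstream classifier.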

Related outputs

microRNA 1307 Is a Potential Target for SARS-CoV-2 Infection: An in Vitro Model
Arisan, Elif Damla, Dart, D. Alwyn, Grant, Guy H., Dalby, A.R., Kancagi, Derya Dilek, Turan, Raife Dilek, Yurtsever, Bulut, Karakus, Gozde Sir, Ovali, Ercument, Lange, Sigrun and Uysal-Onganer, P. 2022. microRNA 1307 Is a Potential Target for SARS-CoV-2 Infection: An in Vitro Model. ACS Omega. 7 (42), pp. 38003-38014. https://doi.org/10.1021/acsomega.2c05245

Bacterial Adaptation to Venom in Snakes and Arachnida
Esmaeilishirazifard, Elham, Usher, Louise, Trim, Carol, Denise, Hubert, Sangal, V., Tyson, G., Barlow, Axel, Redway, Keith F, Taylor, John D, Kremyda-Vlachou, Myrto, Davies, Sam, Loftus, Teresa D, Lock, Mikaella M G, Wright, Kstir, Dalby, Andrew, Snyder, L., Wuster, Wolfgang, Trim, Steve and Moschos, S. 2022. Bacterial Adaptation to Venom in Snakes and Arachnida. Microbiology Spectrum. 10 (3) e02408-21. https://doi.org/10.1128/spectrum.02408-21

Complete analysis of the H5 hemagglutinin and N8 neuraminidase phylogenetic trees reveals that the H5N8 subtype has been produced by multiple reassortment events
Dalby, A.R. 2016. Complete analysis of the H5 hemagglutinin and N8 neuraminidase phylogenetic trees reveals that the H5N8 subtype has been produced by multiple reassortment events. F1000Research. 5, p. 2463. https://doi.org/10.12688/f1000research.9261.1

Molecular dynamics simulations of the temperature-induced unfolding of crambin follow the Arrhenius equation
Dalby, A.R. and Shamsir, M. 2015. Molecular dynamics simulations of the temperature-induced unfolding of crambin follow the Arrhenius equation. F1000Research. 4 (589). https://doi.org/10.12688/f1000research.6831.1

The European and Japanese outbreaks of H5N8 derive from a single source population providing evidence for the dispersal along the long distance bird migratory flyways
Dalby, A.R. and Iqbal, M. 2015. The European and Japanese outbreaks of H5N8 derive from a single source population providing evidence for the dispersal along the long distance bird migratory flyways. PeerJ. 3 e934. https://doi.org/10.7717/peerj.934

A global phylogenetic analysis in order to determine the host species and geography dependent features present in the evolution of avian H9N2 influenza hemagglutinin
Dalby, A.R. and Iqbal, M. 2014. A global phylogenetic analysis in order to determine the host species and geography dependent features present in the evolution of avian H9N2 influenza hemagglutinin. PeerJ. 2 e655. https://doi.org/10.7717/peerj.655

The Robustness of Pathway Analysis in Identifying Potential Drug Targets in Non-Small Cell Lung Carcinoma
Dalby, A.R. and Bailey, I. 2014. The Robustness of Pathway Analysis in Identifying Potential Drug Targets in Non-Small Cell Lung Carcinoma. Microarrays. 3 (4), pp. 212-225. https://doi.org/10.3390/microarrays3040212

Analysis of gene expression data from non-small cell lung carcinoma cell lines reveals distinct sub-classes from those identified at the phenotype level
Dalby, A.R., Emam, I. and Franke, R. 2012. Analysis of gene expression data from non-small cell lung carcinoma cell lines reveals distinct sub-classes from those identified at the phenotype level. PLoS ONE. 7 (11) e50253. https://doi.org/10.1371/journal.pone.0050253

Identification of Schistosoma mansoni microRNAs
Simões, M.C., Lee, J., Djikeng, A., Cerqueira, G.C., Zerlotini, A., da Silva-Pereira, R.A., Dalby, A.R., LoVerde, P., El-Sayed, N.M. and Oliveira, G. 2011. Identification of Schistosoma mansoni microRNAs. BMC Genomics. 12 (47), pp. 1-17. https://doi.org/10.1186/1471-2164-12-47

Developing stochastic models for spatial inference: bacterial chemotaxis
Yu, Y.D., Choi, Y., Teo, Y.Y. and Dalby, A.R. 2010. Developing stochastic models for spatial inference: bacterial chemotaxis. PLoS ONE. 5 (5) e10464. https://doi.org/10.1371/journal.pone.0010464

A comparative proteomic analysis of the simple amino acid repeat distributions in Plasmodia reveals lineage-specific amino acid selection
Dalby, A.R. 2009. A comparative proteomic analysis of the simple amino acid repeat distributions in Plasmodia reveals lineage-specific amino acid selection. PLoS ONE. 4 (7) e6231. https://doi.org/10.1371/journal.pone.0006231

Beta-sheet containment by flanking prolines: molecular dynamic simulations of the inhibition of beta-sheet elongation by proline residues in human prion protein.
Shamsir, M.S. and Dalby, A.R. 2007. Beta-sheet containment by flanking prolines: molecular dynamic simulations of the inhibition of beta-sheet elongation by proline residues in human prion protein. Biophysical Journal. 92 (6), pp. 2080-2089. https://doi.org/10.1529/biophysj.106.092320

COPASAAR--a database for proteomic analysis of single amino acid repeats.
Depledge, D.P. and Dalby, A.R. 2005. COPASAAR--a database for proteomic analysis of single amino acid repeats. BMC Bioinformatics. https://doi.org/10.1186/1471-2105-6-196

Evaluation of mutual information and genetic programming for feature selection in QSAR.
Venkatraman, V., Dalby, A.R. and Yang, Z.R. 2004. Evaluation of mutual information and genetic programming for feature selection in QSAR. Journal of Chemical Information and Computer Sciences. 44 (5), pp. 1686-1692. https://doi.org/10.1021/ci049933v

Reduced bio basis function neural network for identification of protein phosphorylation sites: comparison with pattern recognition algorithms.
Berry, E.A., Dalby, A.R. and Yang, Z.R. 2004. Reduced bio basis function neural network for identification of protein phosphorylation sites: comparison with pattern recognition algorithms. Computational Biology and Chemistry. 28 (1), pp. 75-85. https://doi.org/10.1016/j.compbiolchem.2003.11.005

Constructing an enzyme-centric view of metabolism.
Horne, A.B., Hodgman, T.C., Spence, H.D. and Dalby, A.R. 2004. Constructing an enzyme-centric view of metabolism. Bioinformatics. 20 (13), pp. 2050-2055. https://doi.org/10.1093/bioinformatics/bth199

Mining HIV protease cleavage data using genetic programming with a sum-product function.
Yang, Z.R., Dalby, A.R. and Qiu, J. 2004. Mining HIV protease cleavage data using genetic programming with a sum-product function. Bioinformatics. 20 (18), pp. 3398-3405. https://doi.org/10.1093/bioinformatics/bth414

The structure of human liver fructose-1,6-bisphosphate aldolase
Dalby, A.R., Tolan, D.R. and Littlechild, J.A. 2002. The structure of human liver fructose-1,6-bisphosphate aldolase. Acta Crystallographica Section D. D57, pp. 1526-1533. https://doi.org/10.1107/s0907444901012719

Structural and functional comparisons between vanadium haloperoxidase and acid phosphatase enzymes.
Littlechild, J., Garcia-Rodriguez, E., Dalby, A.R. and Isupov, M. 2002. Structural and functional comparisons between vanadium haloperoxidase and acid phosphatase enzymes. Journal of Molecular Recognition. 15 (5), pp. 291-296. https://doi.org/10.1002/jmr.590

Crystal structure of dodecameric vanadium-dependent bromoperoxidase from the red algae Corallina officinalis.
Isupov, M.N., Dalby, A.R., Brindley, A.A., Izumi, Y., Tanabe, T., Murshudov, G.N. and Littlechild, J.A. 2000. Crystal structure of dodecameric vanadium-dependent bromoperoxidase from the red algae Corallina officinalis. Journal of Molecular Biology. 299 (4), pp. 1035-1049. https://doi.org/10.1006/jmbi.2000.3806

Crystal structure of human muscle aldolase complexed with fructose 1,6-bisphosphate: mechanistic implications.
Dalby, A.R., Dauter, Z. and Littlechild, J.A. 1999. Crystal structure of human muscle aldolase complexed with fructose 1,6-bisphosphate: mechanistic implications. Protein Science. 8 (2), pp. 291-297. https://doi.org/10.1110/ps.8.2.291

Structure of a phosphoglycerate mutase:3-phosphoglyceric acid complex at 1.7 Å.
Crowhurst, G.S., Dalby, A.R., Isupov, M.N., Campbell, J.W. and Littlechild, J.A. 1999. Structure of a phosphoglycerate mutase:3-phosphoglyceric acid complex at 1.7 Å. Acta Crystallographica Section D. D55, pp. 1822-1826. https://doi.org/10.1107/s0907444999009944

Preliminary X-ray analysis of a new crystal form of the vanadium-dependent bromoperoxidase from Corallina officinalis.
Brindley, A.A., Dalby, A.R., Isupov, M.N. and Littlechild, J.A. 1998. Preliminary X-ray analysis of a new crystal form of the vanadium-dependent bromoperoxidase from Corallina officinalis. Acta Crystallographica Section D: Structural Biology. D54 (Pt 3), pp. 454-457. https://doi.org/10.1107/s0907444997014558

Studies with type I aldolase to understand fructose intolerance and combat parasitic disease.
Dalby, A.R. and Littlechild, J.A. 1996. Studies with type I aldolase to understand fructose intolerance and combat parasitic disease. Journal of Pharmacy and Pharmacology. 48 (2), pp. 214-217. https://doi.org/10.1111/j.2042-7158.1996.tb07126.x

Permalink - https://westminsterresearch.westminster.ac.uk/item/vw758/predicting-the-phosphorylation-sites-using-hidden-markov-models-and-machine-learning-methods
