Abstract | Protein phosphorylation is a post-translational modification performed by a group of enzymes known as the protein kinases or phosphotransferases (Enzyme Commission classification 2.7). It is essential to the correct functioning of both proteins and cells, being involved with enzyme control, cell signalling and apoptosis. The major problem when attempting prediction of these sites is the broad substrate specificity of the enzymes. This study employs back-propagation neural networks (BPNNs), the decision tree algorithm C4.5 and the reduced bio-basis function neural network (rBBFNN) to predict phosphorylation sites. The aim is to compare prediction efficiency of the three algorithms for this problem, and examine knowledge extraction capability. All three algorithms are effective for phosphorylation site prediction. Results indicate that rBBFNN is the fastest and most sensitive of the algorithms. BPNN has the highest area under the ROC curve and is therefore the most robust, and C4.5 has the highest prediction accuracy. C4.5 also reveals the amino acid 2 residues upstream from the phosporylation site is important for serine/threonine phosphorylation, whilst the amino acid 3 residues upstream is important for tyrosine phosphorylation. |
---|