Prediction of kinase-specific phosphorylation sites through an integrative model of protein context and sequence

By Ralph Patrick, Coralie Horin, Bostjan Kobe, Kim-Anh Lê Cao, Mikael Boden

Posted 15 Mar 2016
bioRxiv DOI: 10.1101/043679 (published DOI: 10.1016/j.bbapap.2016.08.001)

The identification of kinase substrates and the specific phosphorylation sites they regulate is an important factor in understanding protein function regulation and signalling pathways. Computational prediction of kinase targets -- assigning kinases to putative substrates, and selecting from protein sequence the sites that kinases can phosphorylate -- requires considering both the cellular context in which kinases operate and their binding affinity. This consideration enables investigation of how phosphorylation influences a range of biological processes. We report here a novel probabilistic model for the classification of kinase-specific phosphorylation sites from sequence across three model organisms: human, mouse and yeast. The model incorporates position-specific amino acid frequencies, and counts of co-occurring amino acids from kinase binding sites, in a kinase- and family-specific manner. We show how this model can be seamlessly integrated with protein interactions and cell-cycle abundance profiles. When evaluating the prediction accuracy of our method, PhosphoPICK, on an independent hold-out set of kinase-specific phosphorylation sites, we found it achieved an average specificity of 97% while correctly predicting 32% of true positives. We also compared, through cross-validation, PhosphoPICK's ability to predict kinase-specific phosphorylation sites with that of alternative methods, and found that at high levels of specificity PhosphoPICK outperforms alternative methods for most comparisons made. We investigated the relationship between experimentally confirmed phosphorylation sites and predicted nuclear localisation signals by predicting the kinases most likely to regulate the phosphorylated residues immediately upstream or downstream from the localisation signal. We show that kinases PKA, Akt1 and AurB have an over-representation of predicted binding sites at particular positions downstream from predicted nuclear localisation signals, demonstrating an important role for these kinases in regulating the nuclear import of proteins.
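
As an illustration of how a figure such as "97% specificity while correctly predicting 32% of true positives" can be read, the following minimal Python sketch picks a score threshold that reaches a target specificity on known negatives and then reports the fraction of known positives scoring above it (the sensitivity). This is not PhosphoPICK's evaluation code: the function name `sensitivity_at_specificity` and the toy scores are hypothetical, and the abstract does not describe how the method's thresholds were actually chosen.

```python
from typing import Sequence, Tuple


def sensitivity_at_specificity(
    pos_scores: Sequence[float],
    neg_scores: Sequence[float],
    target_specificity: float,
) -> Tuple[float, float, float]:
    """Return (threshold, specificity, sensitivity).

    Hypothetical helper for illustration only. A site is called positive
    when its score is strictly greater than the threshold.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")
    if not 0.0 < target_specificity <= 1.0:
        raise ValueError("target_specificity must be in (0, 1]")

    # Try each distinct negative score as a threshold, lowest first, and
    # stop at the first one that reaches the target specificity.
    for threshold in sorted(set(neg_scores)):
        true_neg = sum(1 for s in neg_scores if s <= threshold)
        specificity = true_neg / len(neg_scores)
        if specificity >= target_specificity:
            true_pos = sum(1 for s in pos_scores if s > threshold)
            return threshold, specificity, true_pos / len(pos_scores)

    # Not reachable: the largest negative score always gives specificity 1.0.
    raise RuntimeError("no threshold found")


if __name__ == "__main__":
    # Toy scores, invented purely for the example.
    positives = [0.35, 0.5, 0.9]
    negatives = [0.1, 0.2, 0.3, 0.4]
    print(sensitivity_at_specificity(positives, negatives, 0.75))
```

Raising the target specificity pushes the threshold up, which typically lowers sensitivity; that trade-off is why a predictor tuned for very few false positives may recover only a minority of true sites.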

Download data

  • Downloaded 994 times
  • Download rankings, all-time:
    • Site-wide: 30,018
    • In bioinformatics: 3,279
  • Year to date:
    • Site-wide: 62,206
  • Since beginning of last month:
    • Site-wide: 47,407
