PHOSIDA (phosphorylation site database): management, structural and evolutionary investigation, and prediction of phosphosites

Florian Gnad et al. Genome Biol. 2007.

Abstract

PHOSIDA (http://www.phosida.com), a phosphorylation site database, integrates thousands of high-confidence in vivo phosphosites identified by mass spectrometry-based proteomics in various species. For each phosphosite, PHOSIDA lists matching kinase motifs, predicted secondary structures, conservation patterns, and its dynamic regulation upon stimulus. Using support vector machines, PHOSIDA also predicts phosphosites.

Figures

Figure 1

PHOSIDA: phosphorylation site information. For each detected phosphorylation site, the position within the protein sequence along with its surrounding region, maximum assignment localization value, matching kinase motifs, and accessibility is shown. In addition, all detected phosphopeptides that contain the selected phosphosite are displayed along with their corresponding database identification scores, ratios after stimulus, fractions, and occurrences in other proteins.

Figure 2

Accessibilities of phosphorylation sites as calculated by SABLE. The relative accessibility prediction assigns a value between 0 (fully buried) and 9 (fully exposed) to each residue. For phosphoserines, phosphothreonines and phosphotyrosines, accessibility is significantly higher than for their non-phosphorylated counterparts in the same proteins.

Figure 3

Proportion of phosphorylation sites located in loops and hinges as determined by SABLE. In each case, phosphosites are significantly more frequently located in flexible regions.

Figure 4

Proportions of phosphoproteins with orthologs. To examine the conservation of phosphoproteins in comparison to the entire human proteome, we aligned the human sequences in both directions against the protein sequences of Saccharomyces cerevisiae, Drosophila melanogaster, Danio rerio, Gallus gallus, Bos taurus, Rattus norvegicus and Mus musculus via BLASTP. Phosphoproteins (red) are much more likely to have an ortholog than the entire set of human proteins from SwissProt (blue).

Figure 5

PHOSIDA: evolutionary section. The phylogeny in 70 species is illustrated for each phosphoprotein. The degree of homology is indicated by colors. Red means that the selected phosphoprotein does not show any significant sequence similarity. Blue means that the sequence of the phosphoprotein is significantly similar to a protein of another organism, but only one-directionally according to BLASTP. Green means that the phosphoprotein is probably orthologous to a protein of the chosen organism, since its sequence is significantly similar to the homologous protein in both directions. To enable users to set more stringent criteria for homology relating to the identities of aligned sequences and to check the entire sequence similarity, the global alignments of homologous proteins are also provided.
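
The colour code in this caption amounts to a three-way classification over the two BLASTP directions. The following is a minimal sketch, assuming each direction has already been reduced to a boolean "significant hit" flag; the function name and its inputs are illustrative and are not taken from PHOSIDA itself.

```python
def homology_colour(forward_hit, reverse_hit):
    """Map BLASTP significance in each direction to the caption's colours.

    forward_hit: the phosphoprotein finds a significant match in the other
                 organism's proteome (assumed to be precomputed).
    reverse_hit: that match finds the phosphoprotein again when searched
                 back against the original proteome (assumed precomputed).
    """
    if forward_hit and reverse_hit:
        return "green"  # significant in both directions: probable ortholog
    if forward_hit or reverse_hit:
        return "blue"   # significant in one direction only
    return "red"        # no significant sequence similarity


print(homology_colour(True, True))    # both directions significant
print(homology_colour(True, False))   # one direction only
print(homology_colour(False, False))  # no similarity
```

How "significant" is decided (for example an E-value cutoff) is left outside the sketch, since the caption does not state it.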

Figure 6

PHOSIDA: evolutionary section. The conservation status of phosphorylation sites within global alignments of homologous proteins is indicated in green or red. Green means that the chosen phosphorylation site is conserved; red means that it is not. The surrounding aligned sequence is also displayed, so that the conservation of matching kinase motifs can be checked.

Figure 7

Percentage sequence identity of phosphoproteins with orthologs.

Figure 8

Conservation of phosphoserines (red) compared to non-phosphoserines (blue) in phosphoproteins. Phosphoserines are significantly more conserved except in yeast.

Figure 9

Conservation of phosphothreonines (red) compared to non-phosphothreonines (blue). Phosphothreonines are significantly more conserved within mammals.

Figure 10

Conservation of phosphotyrosines (red) compared to non-phosphotyrosines (blue). Tyrosine is very highly conserved in mammals in both forms. In more distantly related species the numbers are small and differences are not statistically significant.

Figure 11

Conservation of phosphorylation motifs. Bars represent the proportion of identical residues in zebrafish orthologs of human phosphoproteins. The red line is the average identity in the region -20 to +20 amino acids surrounding the phosphosite. For both (a) serine and (b) threonine, about five amino acids in each direction show elevated sequence identity.
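
The per-position bars described here can be illustrated with a short calculation. This is a sketch under stated assumptions: the aligned windows around each phosphosite are assumed to have been cut out of the global alignments beforehand, gaps are written as "-", and the function name and input layout are hypothetical.

```python
def positional_identity(pairs, flank=20):
    """Fraction of identical residues at each offset around the phosphosite.

    pairs: list of (human_window, ortholog_window) strings, each of length
           2 * flank + 1 and centred on the phosphosite (assumed input).
    Returns a dict mapping offset (-flank .. +flank) to the identity fraction.
    """
    width = 2 * flank + 1
    matches = [0] * width
    for human, other in pairs:
        if len(human) != width or len(other) != width:
            raise ValueError("each window must span 2 * flank + 1 columns")
        for i, (a, b) in enumerate(zip(human, other)):
            # A gap aligned to a gap is not counted as an identity.
            if a == b and a != "-":
                matches[i] += 1
    if not pairs:
        return {}
    return {i - flank: matches[i] / len(pairs) for i in range(width)}


# Toy example with a flank of 1 instead of 20.
toy = [("ASP", "ASP"), ("GSP", "ASP")]
print(positional_identity(toy, flank=1))
```

Averaging the returned values over all offsets gives a quantity analogous to the red line in the figure.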

Figure 12

Feature transformation of phosphorylation sites for in silico prediction. The surrounding sequence of a phosphorylation site comprises 260 dimensions. Each dimension is defined by the position within the surrounding region and the amino acid type, and takes the value 0 or 1. (a) Primary sequence only. (b) Extends set a by three dimensions that encode the predicted secondary structure of the phosphorylation site. (c) Extends set b by one dimension that contains the predicted accessibility. (d) Extends set a by three dimensions that reflect the conservation of the phosphosite in mammals and seven additional dimensions that describe the protein conservation in yeast, fly, zebrafish, chicken, cow, rat and mouse. (e) Combines set c and set d.
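
The binary encoding of set (a) can be sketched as a one-hot vector over position and amino acid type. The caption does not state the window size; the example below assumes a 13-residue window (the site plus six residues on each side), because 13 positions times 20 amino acids gives the 260 dimensions mentioned. The function name, the alphabet ordering and the handling of unknown residues are illustrative choices.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues; ordering is arbitrary


def one_hot_window(window):
    """Encode a sequence window as a flat 0/1 vector.

    Each position contributes 20 dimensions, exactly one of which is 1.
    Characters outside the 20-letter alphabet (for example padding at a
    protein terminus) leave their position all-zero; this is an assumption.
    """
    vec = [0] * (len(window) * len(AMINO_ACIDS))
    for pos, aa in enumerate(window):
        idx = AMINO_ACIDS.find(aa)
        if idx >= 0:
            vec[pos * len(AMINO_ACIDS) + idx] = 1
    return vec


features = one_hot_window("AAAAAASAAAAAA")  # hypothetical 13-mer centred on S
print(len(features))  # 13 positions x 20 residue types
```

Sets (b) to (e) would then append their extra structural and conservation dimensions to this vector before it is passed to the classifier.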

Figure 13

Precision-recall curve for phosphoserines. The two lines present the tradeoff between false positives and false negatives without (blue) and with (green) inclusion of structural and evolutionary constraints.
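
Each point on such a curve corresponds to one decision threshold applied to the classifier's scores. The sketch below shows how a single point is computed; the scores and labels are made-up toy values, not results from the paper, and the function name is hypothetical.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall when scores >= threshold are called positive.

    scores: classifier outputs, higher meaning more likely a phosphosite.
    labels: 1 for a true phosphosite, 0 otherwise.
    """
    tp = fp = fn = 0
    for score, label in zip(scores, labels):
        predicted = score >= threshold
        if predicted and label:
            tp += 1          # correctly predicted phosphosite
        elif predicted:
            fp += 1          # false positive
        elif label:
            fn += 1          # false negative (missed phosphosite)
    # Conventions for empty denominators are a choice made for this sketch.
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


toy_scores = [0.9, 0.8, 0.4, 0.2]
toy_labels = [1, 0, 1, 0]
for t in (0.5, 0.3):
    print(t, precision_recall(toy_scores, toy_labels, t))
```

Lowering the threshold recovers more true sites (higher recall) at the cost of more false positives, which is the tradeoff the two lines in the figure trace out.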

