Global Post-Translational Modification Discovery

Qiyao Li et al. J Proteome Res. 2017.

Abstract

A new global post-translational modification (PTM) discovery strategy, G-PTM-D, is described. A proteomics database containing UniProt-curated PTM information is supplemented with potential new modification types and sites discovered in a first-round search of mass spectrometry data with an ultrawide precursor mass tolerance. A second-round search against the supplemented database, conducted with standard narrow mass tolerances, yields deep coverage and a rich variety of peptide modifications with high confidence in complex unenriched samples. The G-PTM-D strategy represents a major advance over the previously reported G-PTM strategy and provides a powerful new capability to the proteomics research community.
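
The first-round step described in the abstract can be pictured as matching each observed precursor mass difference (ΔM) against the mass shifts of known modifications, then adding the matches to the database as candidates for the second-round search. The following Python sketch illustrates that idea only. The function names, the 0.01 Da matching tolerance, the input tuple layout, and the four-entry mass table are assumptions made for illustration, not details taken from the paper.

```python
# Monoisotopic mass shifts (Da) for a few common modifications.
# The paper considers many more types; this short table is illustrative.
KNOWN_SHIFTS = {
    "phosphorylation": 79.96633,
    "acetylation": 42.01057,
    "methylation": 14.01565,
    "oxidation": 15.99491,
}


def match_mass_shift(delta_m, tolerance=0.01):
    """Return the known modifications whose mass shift is within
    `tolerance` Da of the observed precursor mass difference.
    The tolerance value is an assumption, not a published setting."""
    return sorted(
        name
        for name, shift in KNOWN_SHIFTS.items()
        if abs(delta_m - shift) <= tolerance
    )


def propose_candidates(first_round_psms, tolerance=0.01):
    """Collect (protein, peptide, modification) candidates from
    first-round matches given as (protein, peptide, delta_m) tuples.
    This tuple layout is hypothetical."""
    candidates = set()
    for protein, peptide, delta_m in first_round_psms:
        for name in match_mass_shift(delta_m, tolerance):
            candidates.add((protein, peptide, name))
    return candidates
```

In this sketch, a ΔM that matches no known shift produces no candidate, so only mass differences explainable by a listed modification would reach the second-round search.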

Keywords: G-PTM-D; database search; post-translational modification discovery; proteomics.

Conflict of interest statement

The authors declare no competing financial interest.

Figures

Figure 1

G-PTM-D workflow, illustrated with results from the Jurkat cell data set. An expanded view (±160 Da) of the histogram of precursor mass error (ΔM) searched with ±1000 Da precursor mass tolerance is displayed here; the full histogram and a comparison to a ±200 Da search are shown in Supplementary Figure S1. The numbers of modified sites for each of the 27 identified modification types are also displayed.

Figure 2

Results from three types of searches of the Jurkat cell data set: a vPhospho search (using the UniProt FASTA database with phosphorylation as a variable modification), a G-PTM search (using the PTM-curated UniProt database), and a G-PTM-D search. (a) Numbers of modified proteins, unique peptides, and PSMs for each search. The 45 687 modified PSMs identified by G-PTM-D are shown in Supplementary Table S2, with a hyperlink to the MS-Viewer report for each PSM. (b) False discovery rate (FDR) for modified peptides. (c) Posterior error probability (PEP) for modified peptides as a function of the Morpheus score. All results are based on 1% global FDR.
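
For readers unfamiliar with how a 1% FDR cutoff is typically applied to search results, the sketch below uses the common target-decoy approach: matches are ranked by score, the FDR at each rank is estimated as decoys divided by targets, and each match receives a q-value. The caption does not state how FDR was computed, so treating it as target-decoy is an assumption here, and the function names and input layout are hypothetical.

```python
def q_values(psms):
    """Assign q-values to matches given as (score, is_decoy) pairs.

    Assumes a target-decoy estimate: FDR at a score threshold is
    (#decoys) / (#targets) among matches at or above that score.
    Returns (score, is_decoy, q_value) tuples, best score first.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)

    # Running FDR estimate at each rank.
    fdrs = []
    targets = decoys = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # q-value: the lowest FDR reachable at this rank or any later rank.
    q = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        q.append(running_min)
    q.reverse()

    return [(s, d, qv) for (s, d), qv in zip(ranked, q)]


def accepted_scores(psms, threshold=0.01):
    """Scores of target matches whose q-value is within the threshold."""
    return [s for s, d, qv in q_values(psms) if not d and qv <= threshold]
```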

Figure 3

Numbers of peptides with modifications identified by G-PTM or G-PTM-D for the four human cell lines. “Others” include trimethylation, carboxylation, sulfation, water loss, ammonia loss, and deamidation.

Figure 4

Histogram of the Δ Morpheus score, the difference between the Morpheus score from G-PTM-D and the Morpheus score from G-PTM, for all the Jurkat spectra that were identified by G-PTM-D at 1% FDR (orange), for those that were modified (blue), and for those that were assigned to a different base peptide sequence in G-PTM and G-PTM-D (gray). A Δ Morpheus score of zero indicates no difference between the two types of searches. Positive Δ Morpheus scores (15% of all assignments) indicate that G-PTM-D found a better match, and all of these improved cases were for modified spectra.

Figure 5

Annotation of the same spectrum from the Jurkat data set (fraction 6, spectrum number 18675) from (a) the G-PTM identification, which includes phosphorylation of serine 20, and (b) the G-PTM-D identification, which assigns phosphorylation to serine 4 and yields a much better match to fragment ions. Red font is used to represent ion matches. Note that the G-PTM search employed the curated phosphorylation sites from UniProt, which only included serine 20 for this peptide. G-PTM-D, however, was able to reassign this spectrum to phosphorylation of serine 4, which is likely the correct modification site, given the substantial improvement in fragment ion matches.

Figure 6

Results from searches of the Matrigel data set using pMatch, MODa, and G-PTM-D. (a) Numbers of identified PSMs (all and modified). (b) False discovery rate (FDR) for modified peptides. (c) Posterior error probability (PEP) for modified peptides as a function of score. Note that the PEP values were plotted versus normalized scores to account for the different score scales of Morpheus, pMatch, and MODa (scores were normalized to a scale from 1 to 13). All results in this Figure met a 1% global FDR.
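
The caption says scores were normalized to a scale from 1 to 13 but does not say how. One simple way to put scores from different tools on a shared scale is linear min-max rescaling, sketched below. Using min-max rescaling is an assumption for illustration and may differ from the authors' procedure; the function name is hypothetical.

```python
def rescale_scores(scores, low=1.0, high=13.0):
    """Linearly map scores so the smallest becomes `low` and the
    largest becomes `high`. Min-max rescaling is an assumed method,
    not one confirmed by the source."""
    smallest, largest = min(scores), max(scores)
    if largest == smallest:
        # All scores identical: map every value to the bottom of the scale.
        return [low for _ in scores]
    span = largest - smallest
    return [low + (s - smallest) * (high - low) / span for s in scores]
```

Applied separately to the scores from each tool, this would make the three PEP curves comparable along one axis while preserving the rank order within each tool.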
