Mass-spectrometry-based draft of the human proteome

References

  1. The UniProt Consortium. Update on activities at the Universal Protein Resource (UniProt) in 2013. Nucleic Acids Res. 41, D43–D47 (2013)
  2. Paik, Y. K. et al. The Chromosome-Centric Human Proteome Project for cataloging proteins encoded in the genome. Nature Biotechnol. 30, 221–223 (2012)
  3. Uhlen, M. et al. Towards a knowledge-based Human Protein Atlas. Nature Biotechnol. 28, 1248–1250 (2010)
  4. Vizcaíno, J. A. et al. ProteomeXchange provides globally coordinated proteomics data submission and dissemination. Nature Biotechnol. 32, 223–226 (2014)
  5. Farrah, T. et al. State of the human proteome in 2013 as viewed through PeptideAtlas: comparing the kidney, urine, and plasma proteomes for the biology- and disease-driven Human Proteome Project. J. Proteome Res. 13, 60–75 (2014)
  6. Wang, M. et al. PaxDb, a database of protein abundance averages across all three domains of life. Mol. Cell. Proteomics 11, 492–500 (2012)
  7. Cox, J. & Mann, M. MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nature Biotechnol. 26, 1367–1372 (2008)
  8. Perkins, D. N., Pappin, D. J., Creasy, D. M. & Cottrell, J. S. Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis 20, 3551–3567 (1999)
  9. Gupta, N., Bandeira, N., Keich, U. & Pevzner, P. A. Target-decoy approach and false discovery rate: when things may go wrong. J. Am. Soc. Mass Spectrom. 22, 1111–1120 (2011)
  10. Higdon, R. et al. IPM: An integrated protein model for false discovery rate estimation and identification in high-throughput proteomics. J. Proteomics 75, 116–121 (2011)
  11. Beausoleil, S. A., Villen, J., Gerber, S. A., Rush, J. & Gygi, S. P. A probability-based approach for high-throughput protein phosphorylation analysis and site localization. Nature Biotechnol. 24, 1285–1292 (2006)
  12. Reiter, L. et al. Protein identification false discovery rates for very large proteomics data sets generated by tandem mass spectrometry. Mol. Cell. Proteomics 8, 2405–2417 (2009)
  13. Nagaraj, N. et al. Deep proteome and transcriptome mapping of a human cancer cell line. Mol. Syst. Biol. 7, 548 (2011)
  14. Tran, J. C. et al. Mapping intact protein isoforms in discovery mode using top-down proteomics. Nature 480, 254–258 (2011)
  15. Lane, L. et al. Metrics for the Human Proteome Project 2013–2014 and strategies for finding missing proteins. J. Proteome Res. 13, 15–20 (2014)
  16. Cabili, M. N. et al. Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses. Genes Dev. 25, 1915–1927 (2011)
  17. Djebali, S. et al. Landscape of transcription in human cells. Nature 489, 101–108 (2012)
  18. Bánfai, B. et al. Long noncoding RNAs are rarely translated in two human cell lines. Genome Res. 22, 1646–1657 (2012)
  19. Guttman, M., Russell, P., Ingolia, N. T., Weissman, J. S. & Lander, E. S. Ribosome profiling provides evidence that large noncoding RNAs do not encode proteins. Cell 154, 240–251 (2013)
  20. Ingolia, N. T., Lareau, L. F. & Weissman, J. S. Ribosome profiling of mouse embryonic stem cells reveals the complexity and dynamics of mammalian proteomes. Cell 147, 789–802 (2011)
  21. Flintoft, L. Non-coding RNA: Ribosomes, but no translation, for lincRNAs. Nature Rev. Genet. 14, 520 (2013)
  22. Geiger, T., Wehner, A., Schaab, C., Cox, J. & Mann, M. Comparative proteomic analysis of eleven common cell lines reveals ubiquitous but varying expression of most proteins. Mol. Cell. Proteomics 11, M111.014050 (2012)
  23. Mertins, P. et al. Integrated proteomic analysis of post-translational modifications by serial enrichment. Nature Methods 10, 634–637 (2013)
  24. Moghaddas Gholami, A. et al. Global proteome analysis of the NCI-60 cell line panel. Cell Rep. 4, 609–620 (2013)
  25. Shiromizu, T. et al. Identification of missing proteins in the neXtProt database and unregistered phosphopeptides in the PhosphoSitePlus database as part of the Chromosome-centric Human Proteome Project. J. Proteome Res. 12, 2414–2421 (2013)
  26. Schirle, M., Heurtier, M. A. & Kuster, B. Profiling core proteomes of human cell lines by one-dimensional PAGE and liquid chromatography-tandem mass spectrometry. Mol. Cell. Proteomics 2, 1297–1305 (2003)
  27. Fagerberg, L. et al. Analysis of the human tissue-specific expression by genome-wide integration of transcriptomics and antibody-based proteomics. Mol. Cell. Proteomics 13, 397–406 (2014)
  28. Hughes, G. M., Teeling, E. C. & Higgins, D. G. Loss of olfactory receptor function in hominin evolution. PLoS ONE 9, e84714 (2014)
  29. Ahrné, E., Molzahn, L., Glatter, T. & Schmidt, A. Critical assessment of proteome-wide label-free absolute abundance estimation strategies. Proteomics 13, 2567–2578 (2013)
  30. Beck, M. et al. The quantitative proteome of a human cell line. Mol. Syst. Biol. 7, 549 (2011)
  31. Schwanhäusser, B. et al. Global quantification of mammalian gene expression control. Nature 473, 337–342 (2011)
  32. Geiger, T. et al. Initial quantitative proteomic map of 28 mouse tissues using the SILAC mouse. Mol. Cell. Proteomics 12, 1709–1722 (2013)
  33. Low, T. Y. et al. Quantitative and qualitative proteome characteristics extracted from in-depth integrated genomics and proteomics analysis. Cell Rep. 5, 1469–1478 (2013)
  34. Barretina, J. et al. The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature 483, 603–607 (2012)
  35. Koumangoye, R. B. et al. Reduced annexin A6 expression promotes the degradation of activated epidermal growth factor receptor and sensitizes invasive breast cancer cells to EGFR-targeted tyrosine kinase inhibitors. Mol. Cancer 12, 167 (2013)
  36. Klingelhöfer, J. et al. Epidermal growth factor receptor ligands as new extracellular targets for the metastasis-promoting S100A4 protein. FEBS J. 276, 5936–5948 (2009)
  37. Argenzio, E. et al. Proteomic snapshot of the EGF-induced ubiquitin network. Mol. Syst. Biol. 7, 462 (2011)
  38. Havugimana, P. C. et al. A census of human soluble protein complexes. Cell 150, 1068–1081 (2012)
  39. Ori, A. et al. Cell type-specific nuclear pores: a case in point for context-dependent stoichiometry of molecular machines. Mol. Syst. Biol. 9, 648 (2013)
  40. Hisamatsu, H. et al. Newly identified pair of proteasomal subunits regulated reciprocally by interferon gamma. J. Exp. Med. 183, 1807–1816 (1996)
  41. Nandi, D., Jiang, H. & Monaco, J. J. Identification of MECL-1 (LMP-10) as the third IFN-gamma-inducible proteasome subunit. J. Immunol. 156, 2361–2364 (1996)
  42. Mallick, P. et al. Computational prediction of proteotypic peptides for quantitative proteomics. Nature Biotechnol. 25, 125–131 (2007)
  43. Domon, B. Considerations on selected reaction monitoring experiments: implications for the selectivity and accuracy of measurements. Proteomics Clin. Appl. 6, 609–614 (2012)
  44. Gallien, S. et al. Targeted proteomic quantification on quadrupole-orbitrap mass spectrometer. Mol. Cell. Proteomics 11, 1709–1723 (2012). doi:10.1074/mcp.O112.019802
  45. Marx, H. et al. A large synthetic peptide and phosphopeptide reference library for mass spectrometry-based proteomics. Nature Biotechnol. 31, 557–564 (2013)
  46. Johansson, H. J. et al. Retinoic acid receptor alpha is associated with tamoxifen resistance in breast cancer. Nature Commun. 4, 2175 (2013)

Acknowledgements

The authors wish to thank all originators of the mass-spectrometry data used in this study for making their data available. We are grateful to P. Mallick, J. Cottrell and M. Schirle for conceptual discussions, to F. Pachl, S. Heinzlmeir, S. Klaeger, S. Maier, D. Helm, B. Ferreia, M. Frejno, H. Koch, M. Mundt, J. Zecha, D. Zolg, E. Gillmeier, B. Ruprecht, K. Kramer, G. Medard and X. Ku of TUM for the annotation of experiments, and to Y. Morad, A. Niadzelka, E. Kny, H. Cossmann, D. Schikora of SAP and V. Wichnalek, A. Klaus, M. Kroetz-Fahning, T. Schmidt of TUM for technical assistance.

Author information

Author notes

  1. Mathias Wilhelm, Judith Schlegl, Hannes Hahne and Amin Moghaddas Gholami: These authors contributed equally to this work.

Authors and Affiliations

  1. Chair of Proteomics and Bioanalytics, Technische Universität München, Emil-Erlenmeyer Forum 5, 85354 Freising, Germany
    Mathias Wilhelm, Hannes Hahne, Amin Moghaddas Gholami, Harald Marx, Simone Lemeer & Bernhard Kuster
  2. SAP AG, Dietmar-Hopp-Allee 16, 69190 Walldorf, Germany
    Mathias Wilhelm, Judith Schlegl, Marcus Lieberenz, Emanuel Ziegler, Lars Butzmann, Siegfried Gessulat, Joos-Hendrik Boese, Anja Gerstmair & Franz Faerber
  3. Cellzome GmbH, Meyerhofstraße 1, 69117 Heidelberg, Germany
    Mikhail M. Savitski, Toby Mathieson & Marcus Bantscheff
  4. JPT Peptide Technologies GmbH, Volmerstraße 5, 12489 Berlin, Germany
    Karsten Schnatbaum, Ulf Reimer & Holger Wenschuh
  5. Institute of Pathology, Technische Universität München, Trogerstraße 18, 81675 München, Germany
    Martin Mollenhauer & Julia Slotta-Huspenina
  6. Center for Integrated Protein Science Munich, Germany
    Bernhard Kuster

Authors

  1. Mathias Wilhelm
  2. Judith Schlegl
  3. Hannes Hahne
  4. Amin Moghaddas Gholami
  5. Marcus Lieberenz
  6. Mikhail M. Savitski
  7. Emanuel Ziegler
  8. Lars Butzmann
  9. Siegfried Gessulat
  10. Harald Marx
  11. Toby Mathieson
  12. Simone Lemeer
  13. Karsten Schnatbaum
  14. Ulf Reimer
  15. Holger Wenschuh
  16. Martin Mollenhauer
  17. Julia Slotta-Huspenina
  18. Joos-Hendrik Boese
  19. Marcus Bantscheff
  20. Anja Gerstmair
  21. Franz Faerber
  22. Bernhard Kuster

Contributions

M.W., J.S., M.L., E.Z., L.B., J.-H.B., S.G., A.G., H.H., A.M.G. and B.K. designed ProteomicsDB. H.H., K.S., U.R., M.M. and J.S.-H. performed experiments. M.W., H.H., A.M.G., M.M.S., H.M., T.M., S.L. and B.K. performed data analysis. H.W., M.B., F.F. and B.K. conceptualized the study. M.W., H.H., A.M.G. and B.K. wrote the manuscript.

Corresponding author

Correspondence to Bernhard Kuster.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Additional information

Mass-spectrometry data are available from ProteomicsDB (https://www.proteomicsdb.org) and ProteomeXchange (http://proteomecentral.proteomexchange.org; dataset identifier PXD000865).

Extended data figures and tables

Extended Data Figure 1 Peptide and protein identifications.

a, Spectrum viewer enabling real-time access to more than 70 million annotated tandem mass spectra of endogenous peptides and synthetic reference standards. b, Peptide length and score distribution for targets and decoys for the search engine Mascot. Note that the peptide- and protein-identification criteria followed a two-step process. First, for each LC-MS/MS run, we applied a global 1% target-decoy false discovery rate (FDR) cut at the level of peptide spectrum matches (PSMs, not shown); second, we applied a peptide-length-dependent local FDR cut of 5% to all PSMs, and the results are depicted here. c, Same as in b but for the search engine Andromeda. d, e, Heat maps showing FDRs as a function of search engine score and peptide length. Solid lines indicate the 5% local FDR.
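The two-step filtering described in this caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the PSM record layout (`peptide`, `score`, `decoy`), the function names and the score bin width are all hypothetical, and the local FDR is approximated here as the decoy-to-target ratio within each (peptide length, score bin) cell.

```python
from collections import defaultdict


def global_fdr_filter(psms, max_fdr=0.01):
    """Step 1 (per run): keep the longest top-scoring prefix of PSMs
    whose cumulative decoy/target ratio stays at or below max_fdr."""
    ranked = sorted(psms, key=lambda p: p["score"], reverse=True)
    targets = decoys = 0
    keep = 0  # number of top-ranked PSMs to retain
    for i, psm in enumerate(ranked, start=1):
        if psm["decoy"]:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            keep = i
    return ranked[:keep]


def local_fdr_filter(psms, max_local_fdr=0.05, bin_width=10.0):
    """Step 2 (aggregated data): drop target PSMs that fall into a
    (peptide length, score bin) cell with too many decoys.
    bin_width is an arbitrary choice made for this sketch."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [targets, decoys]
    for p in psms:
        cell = (len(p["peptide"]), int(p["score"] // bin_width))
        counts[cell][1 if p["decoy"] else 0] += 1
    kept = []
    for p in psms:
        if p["decoy"]:
            continue
        cell = (len(p["peptide"]), int(p["score"] // bin_width))
        n_targets, n_decoys = counts[cell]
        if n_decoys / n_targets <= max_local_fdr:
            kept.append(p)
    return kept


# Toy data: short, low-scoring peptides share a cell with a decoy hit.
example_psms = [
    {"peptide": "PEPTIDEK", "score": 60.0, "decoy": False},
    {"peptide": "LHYGLPVVVK", "score": 55.0, "decoy": False},
    {"peptide": "SHORTK", "score": 12.0, "decoy": False},
    {"peptide": "TROHSK", "score": 11.0, "decoy": True},
]
survivors = local_fdr_filter(example_psms)
```

In the toy data the short target peptide is removed by the second step because its cell contains as many decoys as targets, which mirrors the caption's point that the local cut depends on peptide length.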

Extended Data Figure 2 Protein-identification quality in very large data sets.

a, First filtering step. The first step filters every LC-MS/MS run at 1% PSM FDR. Top panel, score distribution for target and decoy PSMs following 1% PSM FDR filtering for MaxQuant identifications. Bottom panel, the binned peptide-length distribution for target PSMs. b, Same as a but for Mascot identifications. c, Second filtering step. Same as a, but this time applying an additional 5% local length- and score-dependent FDR on the total aggregated data for MaxQuant identifications in ProteomicsDB. It is apparent that the second filtering step improves the FDR about threefold and removes most PSMs shorter than 9 amino acids. d, Same as c but for Mascot identifications in ProteomicsDB. e, Comparative analysis of protein FDR characteristics of two different approaches based on Mascot analysis. In the classical target-decoy approach, aggregation of large quantities of data leads to accumulation of large numbers of decoy proteins and a concomitant loss of true target proteins when filtering the data at 1% protein FDR. The alternative ‘picked’ target-decoy method does not suffer from this scaling problem and maintains a constant decoy rate (and therefore lower protein FDR) but at the expense of lower sensitivity of target protein detection compared to the classical target-decoy approach. Please refer to the Supplementary Information for details and a discussion on the topic. Note that the two protein FDR methods were not used in this manuscript. Instead, we used the criteria shown in a and b.
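The difference between the two protein-level approaches in panel e can be illustrated with a small sketch. It assumes, as one common reading of the ‘picked’ idea, that the target and decoy versions of each protein are treated as a pair and only the higher-scoring member is counted; the caption itself does not spell out the procedure, so this pairing rule, the score dictionaries and the function names are assumptions made for illustration.

```python
def classical_protein_fdr(target_scores, decoy_scores, threshold):
    """Classical approach: count every target and every decoy protein
    that passes the score threshold, independently of each other."""
    n_targets = sum(s >= threshold for s in target_scores.values())
    n_decoys = sum(s >= threshold for s in decoy_scores.values())
    return n_decoys / n_targets if n_targets else 0.0


def picked_protein_fdr(target_scores, decoy_scores, threshold):
    """'Picked' approach (assumed pairing rule): for each protein keep
    only the better-scoring of its target and decoy versions, then
    count what passes the threshold."""
    n_targets = n_decoys = 0
    for protein in set(target_scores) | set(decoy_scores):
        t = target_scores.get(protein, float("-inf"))
        d = decoy_scores.get(protein, float("-inf"))
        best, is_decoy = (d, True) if d > t else (t, False)
        if best >= threshold:
            if is_decoy:
                n_decoys += 1
            else:
                n_targets += 1
    return n_decoys / n_targets if n_targets else 0.0


# Hypothetical best scores per protein, keyed by the same protein id.
targets = {"A": 100.0, "B": 50.0, "C": 20.0}
decoys = {"A": 30.0, "B": 10.0, "C": 25.0}
classical = classical_protein_fdr(targets, decoys, threshold=15.0)
picked = picked_protein_fdr(targets, decoys, threshold=15.0)
```

With these toy scores the decoy of protein A no longer counts in the picked variant because its target partner scores higher, while protein C is lost as a target; this is the trade-off between decoy rate and sensitivity that the caption describes.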

Extended Data Figure 3 Further characterization of the proteome.

a, Some proteins are refractory to identification using tryptic digestion because they do not generate sufficient—or any—peptides that are within the productive mass range of a mass spectrometer typically used for bottom-up proteomics. This can be improved by the use of alternative proteases; for example, chymotrypsin as shown here for one of the many keratin-associated proteins localized on chromosome 21 (detected chymotryptic peptides in red). b, c, Translation of lincRNAs is rare but does exist and can be identified (b) across all chromosomes as well as (c) in many tissues and in HeLa cells. d, Peptide-intensity distribution of protein-coding genes and non-coding transcripts. Interestingly, the abundance of translated lincRNAs is broadly similar to that of classical proteins.

Extended Data Figure 4 Further characterization of the proteome.

a, Proteome coverage rapidly saturates with the addition of shotgun proteomic data. Tissue proteomes saturate at approximately 16,000 proteins, but both body fluids and cell lines add small but noticeable numbers of proteins not covered in the tissues (see also b and c for a different ordering of samples). This indicates that proteome coverage is unlikely to increase much more by merely adding high-throughput data (although it may increase confidence in protein identifications and will probably also increase sequence coverage). b, Same plot as a but different ordering of samples. c, Saturation plots showing that PTMs and affinity purifications each contribute distinctly to the coverage of the proteome. d, Comparison of five large-scale projects suggesting that a ‘core proteome’ of 10,000–12,000 ubiquitously expressed proteins exists. Ellipses represent the corresponding publications. e, Abundance distribution of the ‘core proteome’ based on the normalized iBAQ method. The most highly expressed 10% of proteins are dominated by proteins relating to energy production and protein synthesis. The least abundant 10% of proteins are enriched in proteins with regulatory functions. f, Tree-view summary of Gene Ontology (GO) term analysis for the proteins constituting the ‘core proteome’, showing that the core proteome is mainly concerned with biological processes relating to the homeostasis and life cycle of cells. The colours represent the broader categories of the treemap.
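A saturation curve of the kind shown in panels a to c can be computed by accumulating the set of identified proteins as samples are added in a chosen order. The sketch below assumes each sample is simply a collection of protein identifiers; the function name and the toy identifiers are hypothetical.

```python
def saturation_curve(samples):
    """Return the cumulative number of distinct proteins after each
    sample is added, in the order the samples are given."""
    seen = set()
    curve = []
    for proteins in samples:
        seen.update(proteins)
        curve.append(len(seen))
    return curve


# Toy example: the third sample adds no new proteins.
curve = saturation_curve([["P1", "P2"], ["P2", "P3"], ["P1", "P3"]])
# curve == [2, 3, 3]
```

Because the curve depends on the order of the input, reordering the samples (as in panel b) changes its shape but not its final value.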

Extended Data Figure 5 Comparative analysis of five intensity-based label-free absolute-quantification approaches.

a, Linearity of intensity (U2-OS cell line data from ref. 22) and copies per cell for absolute protein quantification (AQUA)-quantified proteins (red dots, red regression line; same cell line, ref. 30) and derived copy-number estimates (grey dots, blue regression line; from the same study). b, Total sum normalization re-scales intensity distributions of Colo-205 cell digests measured on two different mass spectrometers (Orbitrap Elite data in red, LTQ Orbitrap XL data in blue; ref. 24). c, Quantile-quantile (Q-Q) plots of the normalized data presented in b illustrating good alignment of data across 4.5 orders of magnitude. d, Empirical cumulative distribution function (ECDF) of error distributions derived from a showing that all five methods have merit. e, Comparison of the fold error of iBAQ and top3 as a function of the number of quantified peptides. f, Same as e but for protein length. When peptide numbers are low, iBAQ shows slightly smaller errors than the top3 method. g, Comparison of iBAQ and total sum normalized iBAQ for heavy SILAC-labelled MCF-7 cell digests (red bars; ref. 32) and label-free quantified MCF-7 cell digests (same as MCF-7 deep proteome in a; blue bars) before (left panel) and after normalization (right panel) showing no influence of the presence of the SILAC label on quantification results. h, Comparison of iBAQ and total sum normalized iBAQ for iTRAQ reporter-ion-intensity-based quantification (red bars; MCF-7 cell digest, ref. 46) and label-free quantified MCF-7 cell digests (blue bars; same as a and c) before (left panel) and after normalization (right panel). The intensity-distribution characteristics of iTRAQ and label-free measurements are too different to allow for comparative analyses of MS1- and MS2-based quantification data. i, Normalized iBAQ distributions of 347 cell-line and tissue proteomes (all MS1 quantified) available in ProteomicsDB showing the general applicability of MS1-based quantification across many sources of biological material.
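For readers unfamiliar with the quantification measures compared here, the following sketch shows iBAQ, top3 and total sum normalization as they are commonly defined: iBAQ divides the summed peptide intensity of a protein by its number of theoretically observable peptides, top3 averages the three most intense peptides, and total sum normalization rescales each value by the sum over all proteins in a sample. The caption does not give formulas, so these definitions, the function names and the inputs (including how the observable-peptide count is obtained) are assumptions of this sketch.

```python
def ibaq(peptide_intensities, n_observable_peptides):
    """Summed peptide intensity divided by the number of theoretically
    observable peptides (supplied by the caller in this sketch)."""
    return sum(peptide_intensities) / n_observable_peptides


def top3(peptide_intensities):
    """Mean intensity of the (up to) three most intense peptides."""
    best = sorted(peptide_intensities, reverse=True)[:3]
    return sum(best) / len(best)


def total_sum_normalize(values):
    """Rescale per-protein values so that they sum to 1 within a sample."""
    total = sum(values.values())
    return {protein: v / total for protein, v in values.items()}


# Hypothetical protein with four observable peptides, one not detected.
intensities = [4.0, 2.0, 2.0, 0.0]
protein_ibaq = ibaq(intensities, n_observable_peptides=4)   # 2.0
protein_top3 = top3(intensities)
normalized = total_sum_normalize({"P1": 1.0, "P2": 3.0})
```

Normalizing to the sample total is what allows runs from different instruments, as in panel b, to be placed on a common scale.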

Extended Data Figure 6 Functional protein-expression analysis.

Gene ontology analysis of proteins with expression levels 10-fold above average in a particular organ or body fluid invariably highlights protein signatures with direct organ-related functional significance.

Extended Data Figure 7 Protein- versus mRNA-expression analysis.

a, Comparison of mRNA and protein expression of 12 human tissues showing the generally rather poor correlation of protein and mRNA levels, implying the widespread application of transcriptional, translational and post-translational control mechanisms of protein-abundance regulation. Spearman correlation coefficients vary from 0.41 (thyroid gland) to 0.55 (kidney). ‘Corner proteins’ (0.5 logs to either side of zero) are marked in colours. b, Clustering of mRNA expression (left triangle) and protein expression (right triangle) across the 12 tissues does not reveal tissues with common profiles, suggesting that the transcriptomes and proteomes of human tissues are quite different from each other. c, The ratio of protein and mRNA level for a protein is approximately constant across many tissues. The heat map shows proteins and tissues clustered according to their protein/mRNA ratio. d, Protein abundance can be predicted from mRNA levels. Using the median ratio of protein/mRNA across 12 tissues, it is possible to predict protein levels from mRNA levels for every tissue with a good correlation coefficient, underscoring the importance of the translation rate (and mRNA levels) on protein expression.
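The prediction in panel d amounts to multiplying a gene's mRNA level in a tissue by that gene's median protein/mRNA ratio across tissues. A minimal sketch, assuming linear-scale abundances stored per tissue in plain dictionaries (the data layout, function names and numbers are hypothetical):

```python
from statistics import median


def median_protein_mrna_ratio(protein_by_tissue, mrna_by_tissue):
    """Median protein/mRNA ratio of one gene over all tissues in which
    both values are available and the mRNA level is non-zero."""
    ratios = [
        protein_by_tissue[tissue] / mrna_by_tissue[tissue]
        for tissue in protein_by_tissue
        if mrna_by_tissue.get(tissue, 0.0) > 0.0
    ]
    return median(ratios)


def predict_protein(mrna_level, ratio):
    """Predicted protein abundance from an mRNA level and a gene-specific ratio."""
    return mrna_level * ratio


# Hypothetical values for one gene in three tissues.
protein = {"kidney": 20.0, "liver": 30.0, "lung": 80.0}
mrna = {"kidney": 2.0, "liver": 2.0, "lung": 4.0}
ratio = median_protein_mrna_ratio(protein, mrna)      # 15.0
predicted = predict_protein(3.0, ratio)               # 45.0
```

Using the median rather than the mean keeps a single outlying tissue from dominating the gene-specific ratio.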

Extended Data Figure 8 Protein markers for drug sensitivity and resistance.

a, Elastic net analysis of protein expression and drug sensitivity for the EGFR kinase inhibitor erlotinib. Positive-effect-size values indicate that high protein expression is associated with drug sensitivity. Negative-effect-size values indicate that high protein expression is associated with drug resistance. b, Same as in a but for the EGFR kinase inhibitor lapatinib. c, Correlation analysis of the elastic net effect sizes for erlotinib and lapatinib (proteins with elastic net frequencies of less than 600 are not shown for clarity). Proteins in the top-right quadrant are common markers for drug sensitivity (including EGFR as the primary target of both drugs). Proteins in the bottom-left quadrant are common markers for drug resistance (including S100A4, a known resistance marker for lapatinib). Proteins that are strong markers for sensitivity or resistance are annotated in each plot and most proteins can be easily placed into EGFR signalling and regulation pathways.

Extended Data Figure 9 Protein complex composition and stoichiometry from shotgun proteomic data.

a, Stoichiometry of the nuclear pore complex (NPC) reconstructed from shotgun proteomics data. To illustrate that normalized iBAQ values from shotgun experiments actually reflect protein copy numbers, we reconstructed the stoichiometry of the NPC (blue bars, data from nuclear extracts of HeLa cells, ref. 39; error bars indicate standard deviation from triplicate experiments) and compared it to the stoichiometry determined in the same study using AQUA peptides and SRM experiments (red bars). Note that, in most cases, the stoichiometries are in very good agreement between the methods and with the stoichiometries reported in the literature. b, Stoichiometry of the α- and β-subunits of the proteasome reconstructed from shotgun proteomics data (examples). β-subunits of the constitutive proteasome are indicated in grey, immunoproteasome subunits (β1i, β2i, β5i) are indicated in red. Note that PC-3 cells are devoid of the immunoproteasome, whereas cells in the lymph node almost exclusively express this version of the molecular machine. c, Systematic assessment of the fraction of βi subunits (red bars) and β-subunits (grey bars) across 29 tissue samples and 80 cell-line samples (tissue data from human body map (this study), cell-line data from refs 22 and 24). Note that many cell lines and tissues contain both versions of the proteasome and the data also suggest that further forms of the proteasome with different subunit compositions may exist.
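The fraction plotted in panel c can be approximated from per-subunit abundances as the share of the inducible β-subunits among all catalytic β-subunits. The sketch below assumes normalized iBAQ values keyed by plain subunit labels; the labels, function name and numbers are hypothetical, and the exact calculation used for the figure may differ.

```python
CONSTITUTIVE = ("beta1", "beta2", "beta5")
IMMUNO = ("beta1i", "beta2i", "beta5i")


def immunoproteasome_fraction(abundance):
    """Share of immunoproteasome subunits among the six catalytic
    beta-subunits, from linear-scale abundance estimates."""
    immuno = sum(abundance.get(s, 0.0) for s in IMMUNO)
    constitutive = sum(abundance.get(s, 0.0) for s in CONSTITUTIVE)
    total = immuno + constitutive
    return immuno / total if total else float("nan")


# Hypothetical sample in which the inducible subunits dominate.
sample = {
    "beta1": 1.0, "beta2": 1.0, "beta5": 1.0,
    "beta1i": 3.0, "beta2i": 3.0, "beta5i": 3.0,
}
fraction = immunoproteasome_fraction(sample)  # 0.75
```

A value near 0 would correspond to a sample lacking the immunoproteasome and a value near 1 to a sample that expresses it almost exclusively.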

Extended Data Figure 10 Examples for the analytical utility of large mass-spectrometry-based data collected in ProteomicsDB.

a, Enumeration of post-translational modifications and protein termini. b, Computation of proteotypic peptides. Generally the same one to five peptides are identified every time a protein is identified (top panel) making proteotypic peptides useful for assessing protein identification and as reagents for targeted mass-spectrometry measurements. We note that the proteotypicity of a peptide strongly depends on the presence or absence of a chemical modification (bottom panel, here tandem mass tags (TMT) or isobaric tags for relative and absolute quantification (iTRAQ)). c, Analysis of the selectivity of SRM transitions. The top panel shows the y8 transition of the peptide LHYGLPVVVK (β-catenin, marked with an arrow) in a slice of the precursor and fragment-ion window of 0.7 Da and 0.7 Da, respectively, typically employed on triple-quadrupole mass spectrometers. The size of the circle represents the relative intensity of the y8 fragment in a full tandem mass spectrum of this peptide. All other circles are interfering peptides (extracted from the entire ProteomicsDB) that have precursor and fragment ions in the same m/z window and with varying intensities (circle size). Interference can be reduced by using high-resolution mass spectrometry (middle panel) and confining the analysis to the tissue in question (here, a colon sample, bottom panel). Such interference plots in conjunction with the proteotypicity of peptides can be valuable for the design of targeted proteomic experiments.
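The interference analysis in panel c can be sketched as a search for library ions whose precursor and fragment m/z both fall inside the isolation windows around the target transition. The example assumes the 0.7 Da figures are full window widths centred on the target, and it uses a hypothetical in-memory list of transitions with made-up m/z values rather than real ProteomicsDB data.

```python
def interfering_transitions(target, library,
                            precursor_window=0.7, fragment_window=0.7):
    """Return library transitions from other peptides whose precursor and
    fragment m/z lie within half a window width of the target's values."""
    hits = []
    for ion in library:
        if ion["peptide"] == target["peptide"]:
            continue  # the target itself is not an interference
        precursor_ok = abs(ion["precursor_mz"] - target["precursor_mz"]) <= precursor_window / 2
        fragment_ok = abs(ion["fragment_mz"] - target["fragment_mz"]) <= fragment_window / 2
        if precursor_ok and fragment_ok:
            hits.append(ion)
    return hits


# Made-up m/z values chosen only to exercise the window logic.
target = {"peptide": "TARGET", "precursor_mz": 500.0, "fragment_mz": 800.0}
library = [
    {"peptide": "PEPA", "precursor_mz": 500.2, "fragment_mz": 800.1},
    {"peptide": "PEPB", "precursor_mz": 500.5, "fragment_mz": 800.0},
    {"peptide": "PEPC", "precursor_mz": 500.1, "fragment_mz": 800.4},
]
low_res = interfering_transitions(target, library)
high_res = interfering_transitions(target, library,
                                   precursor_window=0.02, fragment_window=0.02)
```

Narrowing both windows, as a high-resolution instrument effectively does, removes the remaining interference in this toy library; restricting `library` to peptides observed in one tissue would play the role of the bottom panel.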

Cite this article

Wilhelm, M., Schlegl, J., Hahne, H. et al. Mass-spectrometry-based draft of the human proteome. Nature 509, 582–587 (2014). https://doi.org/10.1038/nature13319