DTASelect and Contrast: tools for assembling and comparing protein identifications from shotgun proteomics
David L Tabb et al. J Proteome Res. 2002 Jan-Feb.
Abstract
The components of complex peptide mixtures can be separated by liquid chromatography, fragmented by tandem mass spectrometry, and identified by the SEQUEST algorithm. Inferring a mixture's source proteins requires that the identified peptides be reassociated. This process becomes more challenging as the number of peptides increases. DTASelect, a new software package, assembles SEQUEST identifications and highlights the most significant matches. The accompanying Contrast tool compares DTASelect results from multiple experiments. The two programs improve the speed and precision of proteomic data analysis.
Figures
Figure 1
Sample DTASelect.html fragment. Each protein identity is printed beside the count of peptide sequences associated with it. The number of spectra representing those sequences is also shown, along with the protein's sequence coverage, length in residues, molecular weight, calculated pI, and description from the specified database. If multiple proteins in the database correspond to the same set of peptide sequences, the proteins are grouped together. The peptides found for each collection of loci are listed beneath it. Spectra matching the same sequences but possessing different charge states (discernible by the “.2” vs “.3” suffixes on filenames) are not considered duplicates. Peptides that are uniquely found at a particular locus are indicated with asterisks. The fields enumerated for each peptide include file name, XCorr, DeltCN, precursor ion mass, Sp rank, percentage of fragment ions found, copy count, and sequence. Addition symbols (as seen with w26S.0501.0501.1) link to other proteins in the report that also contain the indicated peptide. The similarity for protein YOL055C to YPL258C is reported, showing that one peptide present for YOL055C matches to the other protein and one peptide does not.
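The grouping and uniqueness rules described in this caption can be illustrated with a minimal sketch. It is not DTASelect's own code: the function names, locus names, and peptide strings are hypothetical, and the input is assumed to be a plain mapping from each locus to its set of identified peptide sequences.

```python
from collections import defaultdict


def group_proteins(protein_peptides):
    """Group loci whose identified peptide sets are identical."""
    groups = defaultdict(list)
    for locus, peptides in protein_peptides.items():
        groups[frozenset(peptides)].append(locus)
    # Return (loci, peptides) pairs in a stable, sorted order.
    return sorted((sorted(loci), sorted(peps)) for peps, loci in groups.items())


def unique_peptides(grouped):
    """Return peptides that occur in exactly one group of loci."""
    counts = defaultdict(int)
    for _loci, peptides in grouped:
        for peptide in peptides:
            counts[peptide] += 1
    return {peptide for peptide, n in counts.items() if n == 1}


# Hypothetical input: locus name -> set of identified peptide sequences.
example = {
    "LOCUS_A": {"AAAK", "CCCR"},
    "LOCUS_B": {"AAAK", "CCCR"},
    "LOCUS_C": {"AAAK", "DDDK"},
}

grouped = group_proteins(example)
unique = unique_peptides(grouped)
print(grouped)
print(sorted(unique))
```

In this toy input, LOCUS_A and LOCUS_B collapse into one group because their peptide sets match, and AAAK is not flagged as unique because it also appears under LOCUS_C.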
Figure 2
Summary tables from DTASelect output for LC/MS/MS and MudPIT analysis of purified 26S proteasomes. (A) DTASelect summary output for LC/MS/MS analysis of 4 μg of purified 26S proteasome. Shown are total counts for proteins, peptides, and spectra. The difference between the nonredundant and redundant protein counts reflects that some proteins have been grouped together because of identical sequence coverage. When used with databases that contain a large number of related proteins (such as the human database), DTASelect's grouping functionality is a timesaver. (B) As in (A), except that results are for a MudPIT analysis of 40 μg of purified 26S proteasome.
Figure 3
DTASelect graphical user interface. Identified peaks are color-coded blue for y ions and red for b ions. The letters along the top of the window show the correspondence between fragment ions and sequence. Clicking on a peptide will cause its spectrum to be shown. Selecting a protein will show its sequence coverage.
Figure 4
Sample Contrast.html fragment. This represents a group of proteins that appear in the new MudPIT sample but not the previous experiment when the same criteria are used against each. Each row in the table represents one protein, and the numbers in the columns are the sequence coverage percentages found in each data set (or, in the Total column, the cumulative sequence coverage across multiple columns). The percentages link to each protein's location in a corresponding DTASelect.html file. If multiple proteins have identical sequence coverage, they are grouped together (for example, NRL_1IKFH and NRL_1INDH). Several such sections appear in each Contrast output file, one for each combination of presence and absence.
Figure 5
Sample Contrast.html summary. Each row in this table represents a particular combination of presence and absence in each of the data sets, with the “X” marks indicating this pattern. Each row's count links back to the appearance of the group above it in the Contrast.html file. Of the 118 proteins appearing, 60 were present in both samples, 18 were present only in the “new” analysis, and 40 were found only in the “prev” experiment.
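The presence/absence tally in this summary can be sketched as follows. This is an illustration under stated assumptions, not Contrast's implementation: the function name and protein identifiers are hypothetical, and each data set is assumed to be reduced to a set of protein identifiers that passed the same criteria. Only the data set labels "new" and "prev" come from the caption.

```python
from collections import Counter


def presence_patterns(datasets):
    """Count proteins for each combination of presence and absence.

    datasets maps a data set name to the set of proteins it contains.
    Each key of the result is a tuple of booleans, one per data set,
    in the same order as the names in `datasets`.
    """
    names = list(datasets)
    all_proteins = set().union(*datasets.values())
    return Counter(
        tuple(protein in datasets[name] for name in names)
        for protein in all_proteins
    )


# Hypothetical protein identifiers for two data sets.
datasets = {
    "new": {"P1", "P2", "P3"},
    "prev": {"P2", "P3", "P4", "P5"},
}

patterns = presence_patterns(datasets)
for pattern, count in sorted(patterns.items(), reverse=True):
    marks = " ".join("X" if present else "-" for present in pattern)
    print(marks, count)
```

The pattern counts always sum to the number of distinct proteins, which matches the caption's own arithmetic: 60 + 18 + 40 = 118.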
Figure 6
Sample Verbose Contrast.html fragment. Proteins YDR471W and YHR010W were found in both samples under this criteria set, though with different sequence coverages (17.6% and 21.3%, respectively). One peptide was found in both samples, but the other peptides were found in only one. The highest XCorr for each peptide in each sample is shown beside its sequence. Cumulatively, these peptides add up to 30.9% sequence coverage. The sequence coverage percentages for each sample lead to the relevant sections in the respective DTASelect output files. The cumulative sequence coverage links to a view of the protein's sequence overlaid with the peptide sequences.
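Cumulative coverage is smaller than the sum of the per-sample coverages whenever a peptide is shared, because shared residues are counted once. The sketch below shows that arithmetic under stated assumptions: the function name, the protein length, and the peptide positions are hypothetical, and peptides are assumed to be given as half-open (start, end) residue ranges.

```python
def sequence_coverage(protein_length, peptide_ranges):
    """Percent of residues covered by at least one peptide.

    peptide_ranges holds half-open (start, end) residue index pairs.
    Overlapping or repeated peptides are counted only once.
    """
    covered = set()
    for start, end in peptide_ranges:
        covered.update(range(start, end))
    return round(100.0 * len(covered) / protein_length, 1)


# Hypothetical 100-residue protein; one peptide (0, 10) is shared.
length = 100
sample_a = [(0, 10), (20, 28)]
sample_b = [(0, 10), (50, 61)]

coverage_a = sequence_coverage(length, sample_a)
coverage_b = sequence_coverage(length, sample_b)
coverage_total = sequence_coverage(length, sample_a + sample_b)
print(coverage_a, coverage_b, coverage_total)
```

Here the two samples cover 18% and 21% of the protein, but the cumulative value is 29% rather than 39% because the shared 10-residue peptide is not counted twice.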
Grants and funding
- P41 RR011823-04/RR/NCRR NIH HHS/United States
- P41 RR011823/RR/NCRR NIH HHS/United States
- R33CA81665/CA/NCI NIH HHS/United States
- R33 CA081665-04/CA/NCI NIH HHS/United States
- RR11823/RR/NCRR NIH HHS/United States