Semisupervised model-based validation of peptide identifications in mass spectrometry-based proteomics
doi: 10.1021/pr070542g. Epub 2007 Dec 27.
- PMID: 18159924
- DOI: 10.1021/pr070542g
Hyungwon Choi et al. J Proteome Res. 2008 Jan.
Abstract
Developing robust statistical methods for validating peptide assignments to tandem mass (MS/MS) spectra obtained by database searching remains an important problem. PeptideProphet is one of the commonly used computational tools for that purpose. A simple alternative approach to validating peptide assignments is to add decoy (reversed, randomized, or shuffled) sequences to the searched protein sequence database. The probabilistic modeling approach of PeptideProphet and the decoy strategy can be combined within a single semisupervised framework, leading to improved robustness and higher accuracy of the computed probabilities even for the most challenging data sets. We present a semisupervised expectation-maximization (EM) algorithm for constructing a Bayes classifier for peptide identification using the probability mixture model, extending PeptideProphet to incorporate decoy peptide matches. Using several data sets of varying complexity, from control protein mixtures to a human plasma sample, and three commonly used database search programs (SEQUEST, MASCOT, and TANDEM/k-score), we illustrate that more accurate mixture estimation leads to improved control of the false discovery rate in the classification of peptide assignments.
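The semisupervised idea described in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation or the PeptideProphet model: it assumes a single score per match, assumes both the correct and incorrect score distributions are Gaussian, and treats the mixing proportion as the fraction of correct matches among target hits only. All function names (`semisupervised_em`, `estimated_fdr`, `simulate`) and the simulated data are hypothetical. The one feature it is meant to show is that decoy matches are labeled as incorrect, so their posterior probability of being correct is pinned to zero in every E-step, while target matches remain unlabeled.

```python
import math
import random


def normal_pdf(x, mu, sd):
    """Density of a normal distribution with mean mu and standard deviation sd."""
    z = (x - mu) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def weighted_normal_fit(xs, ws):
    """Weighted mean and standard deviation, floored for numerical safety."""
    total = max(sum(ws), 1e-12)
    mu = sum(w * x for x, w in zip(xs, ws)) / total
    var = sum(w * (x - mu) ** 2 for x, w in zip(xs, ws)) / total
    return mu, max(math.sqrt(var), 1e-6)


def semisupervised_em(scores, is_decoy, n_iter=200):
    """Fit a two-component Gaussian mixture (assumed form) to search scores.

    Returns (pi, (mu_pos, sd_pos), (mu_neg, sd_neg), posteriors), where pi is
    the estimated fraction of correct matches among the target hits and
    posteriors come from the final E-step.
    """
    n_target = sum(1 for d in is_decoy if not d)
    if n_target == 0:
        raise ValueError("at least one target match is required")
    ordered = sorted(scores)
    mu_neg = ordered[len(ordered) // 4]        # crude start: lower quartile
    mu_pos = ordered[(3 * len(ordered)) // 4]  # crude start: upper quartile
    _, sd_all = weighted_normal_fit(scores, [1.0] * len(scores))
    sd_neg = sd_pos = sd_all
    pi = 0.5
    post = []
    for _ in range(n_iter):
        # E-step: decoys are known to be incorrect, so their posterior is 0.
        post = []
        for x, d in zip(scores, is_decoy):
            if d:
                post.append(0.0)
                continue
            p1 = pi * normal_pdf(x, mu_pos, sd_pos)
            p0 = (1.0 - pi) * normal_pdf(x, mu_neg, sd_neg)
            post.append(p1 / (p1 + p0) if p1 + p0 > 0.0 else 0.0)
        # M-step: decoys contribute only to the incorrect component.
        pi = min(max(sum(post) / n_target, 1e-6), 1.0 - 1e-6)
        mu_pos, sd_pos = weighted_normal_fit(scores, post)
        mu_neg, sd_neg = weighted_normal_fit(scores, [1.0 - p for p in post])
    return pi, (mu_pos, sd_pos), (mu_neg, sd_neg), post


def estimated_fdr(posteriors, is_decoy, threshold):
    """Model-based FDR among target matches accepted at the given threshold."""
    accepted = [p for p, d in zip(posteriors, is_decoy) if not d and p >= threshold]
    if not accepted:
        return 0.0
    return sum(1.0 - p for p in accepted) / len(accepted)


def simulate(n=300, seed=0):
    """Hypothetical data: n correct targets, n incorrect targets, n decoys."""
    rng = random.Random(seed)
    scores = [rng.gauss(4.0, 1.0) for _ in range(n)]
    scores += [rng.gauss(0.0, 1.0) for _ in range(2 * n)]
    is_decoy = [False] * (2 * n) + [True] * n
    return scores, is_decoy


if __name__ == "__main__":
    scores, is_decoy = simulate()
    pi, pos, neg, post = semisupervised_em(scores, is_decoy)
    print("estimated fraction correct among targets:", round(pi, 3))
    print("estimated FDR at p >= 0.9:", round(estimated_fdr(post, is_decoy, 0.9), 4))
```

In this toy setting the decoys anchor the incorrect-match distribution, which is the intuition behind the robustness claim: even when few target matches are correct, the negative component is still estimated from data with known labels. The FDR figure is the average of one minus the posterior probability over accepted target matches, a common model-based estimate rather than a result taken from the paper.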