Predicting functional effect of human missense mutations using PolyPhen-2
Ivan Adzhubei et al. Curr Protoc Hum Genet. 2013 Jan.
Abstract
PolyPhen-2 (Polymorphism Phenotyping v2), available as software and via a Web server, predicts the possible impact of amino acid substitutions on the stability and function of human proteins using structural and comparative evolutionary considerations. It performs functional annotation of single-nucleotide polymorphisms (SNPs), maps coding SNPs to gene transcripts, extracts protein sequence annotations and structural attributes, and builds conservation profiles. It then estimates the probability of the missense mutation being damaging based on a combination of all these properties. PolyPhen-2 features include a high-quality multiple protein sequence alignment pipeline and a prediction method employing machine-learning classification. The software also integrates the UCSC Genome Browser's human genome annotations and MultiZ multiple alignments of vertebrate genomes with the human genome. PolyPhen-2 is capable of analyzing large volumes of data produced by next-generation sequencing projects, thanks to built-in support for high-performance computing environments like Grid Engine and Platform LSF.
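As a rough illustration of how several independent lines of evidence can be combined into a single probability of a variant being damaging, the sketch below uses a naive Bayes-style update on log-odds. This is an assumption made for illustration only: the abstract states that a machine-learning classifier is used but does not give its form, and the function name, feature names, prior, and likelihood ratios here are all hypothetical, not PolyPhen-2 parameters.

```python
import math


def naive_bayes_posterior(prior_damaging, likelihood_ratios):
    """Return P(damaging | features) under a naive independence assumption.

    prior_damaging    -- prior probability that a variant is damaging (0 < p < 1)
    likelihood_ratios -- mapping of feature name to
                         P(feature | damaging) / P(feature | benign)
    All names and values are hypothetical; this is not PolyPhen-2's model.
    """
    if not 0.0 < prior_damaging < 1.0:
        raise ValueError("prior_damaging must lie strictly between 0 and 1")

    # Work in log-odds so that independent evidence simply adds up.
    log_odds = math.log(prior_damaging / (1.0 - prior_damaging))
    for name, ratio in likelihood_ratios.items():
        if ratio <= 0.0:
            raise ValueError(f"likelihood ratio for {name!r} must be positive")
        log_odds += math.log(ratio)

    # Convert the accumulated log-odds back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    # Illustrative inputs only: one conservation-like and one structure-like feature.
    evidence = {"conservation": 3.0, "structure": 1.5}
    print(round(naive_bayes_posterior(0.5, evidence), 3))
```

Working in log-odds keeps the update additive and avoids multiplying many small probabilities together; a real predictor would estimate the per-feature ratios from training data rather than set them by hand.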
Figures
Figure 7.20.1
PolyPhen-2 home Web page with the input form prepared to submit a single protein substitution query using Swiss-Prot accession as a protein identifier. Also supported are RefSeq and Ensembl protein identifiers; alternatively, a dbSNP reference SNP identifier can be entered, in which case no other input is required.
Figure 7.20.2
Detailed results of the PolyPhen-2 analysis for a single variant query. This format is used for all PolyPhen-2 query reports except the Batch Query. The top Query section includes the UniProtKB/Swiss-Prot description of the query protein, if it was recognized as a known database entry. The large “heatmap” color bar with the black indicator mark dominates the display, illustrating the strength of the putative damaging effect for the variant, assessed using the default HumDiv-trained predictor. Clicking on the [+] control boxes expands the Prediction/Confidence panel for the HumDiv-trained predictor, as well as additional panels with protein multiple sequence alignment and 3D-structure viewers. For the color version of this figure, go to http://www.currentprotocols.com/protocol/hg0720.
Figure 7.20.3
Detailed results of the PolyPhen-2 analysis for a single variant query with the multiple sequence alignment and 3-D-structure protein viewer panels expanded. The multiple sequence alignment panel displays a fixed 75-residue-wide window surrounding the variant’s position (the column indicated by the black frame), with the alignment colored using the ClustalX (Thompson et al., 1997) scheme for all columns above the 50% conservation threshold. Clicking on the link at the bottom of the alignment panel opens the Jalview (Waterhouse et al., 2009) alignment viewer applet with the complete multiple alignment loaded. Displayed below is a 3-D-structure viewer applet (Jmol) with the protein structure loaded and zoomed into the mutation residue using the Zoom into mutation button. The structure viewer window is fully interactive, and the protein structure can be rotated, moved, or zoomed in and out.
Figure 7.20.4
The PolyPhen-2 Batch Query Web page allows submitting a large number of variants for analysis in a single operation. Type or paste your variants into the Batch Query text input area (one variant per line) or upload a text file listing variants using the Upload batch file text box (locate the file using the Browse button). If you enter your e-mail address into the corresponding text box, you will be notified via e-mail when your query completes. To analyze protein variants in nonstandard or unannotated proteins, you can upload your own protein sequences in FASTA format using the Upload FASTA file text box. Genomic variants are also supported; see the Sample Batch panel for examples of the various input formats. Do not forget to select the genome assembly version matching your genomic SNP data under Advanced Options; the default assembly version is GRCh37/UCSC hg19.
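A batch file with one variant per line can be generated programmatically. The sketch below assumes a whitespace-separated layout of protein identifier, position, reference residue, and substituted residue. That layout is an assumption here, so check it against the Sample Batch panel before submitting. The class and function names are hypothetical, and the accession `P12345` is used purely as a placeholder.

```python
from dataclasses import dataclass

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


@dataclass(frozen=True)
class ProteinVariant:
    accession: str  # protein identifier, e.g. a Swiss-Prot accession
    position: int   # 1-based residue position in the protein sequence
    ref_aa: str     # reference (wild-type) residue, one-letter code
    alt_aa: str     # substituted residue, one-letter code


def format_batch_line(variant):
    """Format one variant as a single batch line (assumed layout, see lead-in)."""
    if variant.position < 1:
        raise ValueError("position must be a positive integer")
    for residue in (variant.ref_aa, variant.alt_aa):
        if residue not in AMINO_ACIDS:
            raise ValueError(f"unknown amino acid code: {residue!r}")
    if variant.ref_aa == variant.alt_aa:
        raise ValueError("reference and substituted residues are identical")
    return f"{variant.accession} {variant.position} {variant.ref_aa} {variant.alt_aa}"


def build_batch(variants):
    """Join variants into batch text, one variant per line."""
    return "\n".join(format_batch_line(v) for v in variants) + "\n"


if __name__ == "__main__":
    # Placeholder variants for illustration only.
    batch = build_batch([
        ProteinVariant("P12345", 42, "A", "V"),
        ProteinVariant("P12345", 108, "R", "W"),
    ])
    print(batch, end="")
```

Validating residues and positions before upload catches malformed lines locally, which is cheaper than waiting for a queued batch to report them in its logs.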
Figure 7.20.5
Grid Gateway Interface (GGI) Web page showing a PolyPhen-2 user session with one single-variant query completed and a Batch Query pending execution. Click on the View link to access results of a single-variant query (no errors were reported). This Batch Query was queued as a 7-stage pipeline; the status of each pipeline stage is tracked and displayed separately, with short stage explanations printed in the Description column. The batch will be completed when the last stage finishes. Grid Status shows a high Grid Load and a large number of other Pending jobs, so the wait for batch completion is likely to be substantial. Click on the Refresh button periodically to update session status. You can also close your browser and check your session at a later time: go to the PolyPhen-2 home page, click on the Check Status button, and you will be transferred to your session automatically.
Figure 7.20.6
Grid Gateway Interface (GGI) Web page showing a completed PolyPhen-2 Batch Query. Right-click on one of the SNPs, Short, or Full links in the Batches/Results column to download results to your computer; see Basic Protocol 2 for a description of the three types of report files. Click on the Logs link under Batches/Errors to view all error and warning messages generated by the pipeline. Note that most of the warnings are for your information only and do not indicate failure of the analysis. After downloading results, all batch data can be deleted by checking the corresponding Batches/Delete checkbox and clicking on the Refresh button. Be warned that the delete operation is irreversible and deleted data cannot be restored.