The 1000 Genomes Project Consortium. A map of human genome variation from population-scale sequencing. Nature 467, 1061–1073 (2010).
Lander, E. S. Initial impact of the sequencing of the human genome. Nature 470, 187–197 (2011).
Manly, K. F., Nettleton, D. & Hwang, J. T. Genomics, prior probability, and statistical tests of multiple hypotheses. Genome Res. 14, 997–1001 (2004). This is a valuable review of the relationships between prior probability, statistical significance and false-discovery rates as they pertain to genome-wide analyses.
Ng, S. B. et al. Exome sequencing identifies the cause of a mendelian disorder. Nature Genet. 42, 30–35 (2010).
Ng, S. B. et al. Targeted capture and massively parallel sequencing of 12 human exomes. Nature 461, 272–276 (2009). This is the first demonstration of exome sequencing being used to identify the causal variants for a Mendelian disease. Protein-based annotations of functional deleteriousness were essential to this effort.
Choi, M. et al. Genetic diagnosis by whole exome capture and massively parallel DNA sequencing. Proc. Natl Acad. Sci. USA 106, 19096–19101 (2009).
Erlich, Y. et al. Exome sequencing and disease-network analysis of a single family implicate a mutation in KIF1A in hereditary spastic paraparesis. Genome Res. 21, 658–664 (2011).
Kimura, M. The Neutral Theory Of Molecular Evolution (Cambridge Univ. Press, New York, 1983).
Cooper, G. M. & Brown, C. D. Qualifying the relationship between sequence conservation and molecular function. Genome Res. 18, 201–205 (2008).
McAuliffe, J. D., Jordan, M. I. & Pachter, L. Subtree power analysis and species selection for comparative genomics. Proc. Natl Acad. Sci. USA 102, 7900–7905 (2005).
Stone, E. A., Cooper, G. M. & Sidow, A. Trade-offs in detecting evolutionarily constrained sequence by comparative genomics. Annu. Rev. Genomics Hum. Genet. 6, 143–164 (2005).
The Chimpanzee Sequencing and Analysis Consortium. Initial sequence of the chimpanzee genome and comparison with the human genome. Nature 437, 69–87 (2005).
Davydov, E. V. et al. Identifying a high fraction of the human genome to be under selective constraint using GERP++. PLoS Comput. Biol. 6, e1001025 (2010).
The Mouse Genome Sequencing Consortium. Initial sequencing and comparative analysis of the mouse genome. Nature 420, 520–562 (2002).
The ENCODE Project Consortium. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature 447, 799–816 (2007).
Boffelli, D. et al. Phylogenetic shadowing of primate sequences to find functional regions of the human genome. Science 299, 1391–1394 (2003).
Prabhakar, S. et al. Close sequence comparisons are sufficient to identify human _cis_-regulatory elements. Genome Res. 16, 855–863 (2006).
Johnson, M. E. et al. Positive selection of a gene family during the emergence of humans and African apes. Nature 413, 514–519 (2001).
Wang, T. et al. Species-specific endogenous retroviruses shape the transcriptional network of the human tumor suppressor protein p53. Proc. Natl Acad. Sci. USA 104, 18613–18618 (2007).
Enard, W. et al. Molecular evolution of FOXP2, a gene involved in speech and language. Nature 418, 869–872 (2002).
Prabhakar, S. et al. Human-specific gain of function in a developmental enhancer. Science 321, 1346–1350 (2008). This study demonstrates that constraint-based measures may also identify sequences with human-specific functionality.
Stone, E. A. & Sidow, A. Physicochemical constraint violation by missense substitutions mediates impairment of protein function and disease severity. Genome Res. 15, 978–986 (2005). The authors describe a combined phylogenetic and biochemical approach to predict the effects of amino acid substitutions. They demonstrate a quantitative relationship between past evolutionary rates of biochemical change and present day deleteriousness.
De Gobbi, M. et al. A regulatory SNP causes a human genetic disease by creating a new transcriptional promoter. Science 312, 1215–1217 (2006).
Botstein, D. & Risch, N. Discovering genotypes underlying human phenotypes: past successes for mendelian disease, future approaches for complex disease. Nature Genet. 33, 228–237 (2003).
Ng, S. B. et al. Exome sequencing identifies MLL2 mutations as a cause of Kabuki syndrome. Nature Genet. 42, 790–793 (2010).
Grantham, R. Amino acid difference formula to help explain protein evolution. Science 185, 862–864 (1974).
Ng, P. C. & Henikoff, S. Predicting the effects of amino acid substitutions on protein function. Annu. Rev. Genomics Hum. Genet. 7, 61–80 (2006).
Care, M. A., Needham, C. J., Bulpitt, A. J. & Westhead, D. R. Deleterious SNP prediction: be mindful of your training data! Bioinformatics 23, 664–672 (2007).
Capriotti, E., Calabrese, R. & Casadio, R. Predicting the insurgence of human genetic diseases associated to single point protein mutations with support vector machines and evolutionary information. Bioinformatics 22, 2729–2734 (2006).
Ferrer-Costa, C., Orozco, M. & de la Cruz, X. Sequence-based prediction of pathological mutations. Proteins 57, 811–819 (2004).
Ng, P. C. & Henikoff, S. Predicting deleterious amino acid substitutions. Genome Res. 11, 863–874 (2001). This describes SIFT (also see reference 46), a commonly used tool to predict the effects of amino acid substitutions and an early demonstration of the importance of sequence conservation to functional predictions.
Schwarz, J. M., Rodelsperger, C., Schuelke, M. & Seelow, D. MutationTaster evaluates disease-causing potential of sequence alterations. Nature Methods 7, 575–576 (2010).
Ye, Z. Q. et al. Finding new structural and sequence attributes to predict possible disease association of single amino acid polymorphism (SAP). Bioinformatics 23, 1444–1450 (2007).
Bao, L., Zhou, M. & Cui, Y. nsSNPAnalyzer: identifying disease-associated nonsynonymous single nucleotide polymorphisms. Nucleic Acids Res. 33, W480–W482 (2005).
Sunyaev, S. et al. Prediction of deleterious human alleles. Hum. Mol. Genet. 10, 591–597 (2001). This paper describes polymorphism phenotyping (PolyPhen) (also see reference 35), a commonly used tool to predict the effects of amino acid substitutions, and illustrates the value of classifiers trained on numerous biochemical and evolutionary features.
Lynch, M. & Conery, J. S. The evolutionary fate and consequences of duplicate genes. Science 290, 1151–1155 (2000).
Marini, N. J., Thomas, P. D. & Rine, J. The use of orthologous sequences to predict the impact of amino acid substitutions on protein function. PLoS Genet. 6, e1000968 (2010).
Dobson, R. J., Munroe, P. B., Caulfield, M. J. & Saqi, M. A. Predicting deleterious nsSNPs: an analysis of sequence and structural attributes. BMC Bioinformatics 7, 217 (2006).
Saunders, C. T. & Baker, D. Evaluation of structural and evolutionary contributions to deleterious mutation prediction. J. Mol. Biol. 322, 891–901 (2002).
Yue, P., Li, Z. & Moult, J. Loss of protein structure stability as a major causative factor in monogenic disease. J. Mol. Biol. 353, 459–473 (2005).
Bao, L. & Cui, Y. Prediction of the phenotypic effects of non-synonymous single nucleotide polymorphisms using structural and evolutionary information. Bioinformatics 21, 2185–2190 (2005).
Li, Y. et al. Predicting disease-associated substitution of a single amino acid by analyzing residue interactions. BMC Bioinformatics 12, 14 (2011).
Hindorff, L. A. et al. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc. Natl Acad. Sci. USA 106, 9362–9367 (2009).
Musunuru, K. et al. From noncoding variant to phenotype via SORT1 at the 1p13 cholesterol locus. Nature 466, 714–719 (2010). This paper describes the precise identification of a common transcriptional regulatory variant that influences cholesterol levels and cardiovascular disease risk.
Nicolae, D. L. et al. Trait-associated SNPs are more likely to be eQTLs: annotation to enhance discovery from GWAS. PLoS Genet. 6, e1000888 (2010). This analysis demonstrated that expression-associated variants are enriched among trait-associated variants, suggesting that non-coding regulatory variants are causally relevant for many traits.
King, M. C. & Wilson, A. C. Evolution at two levels in humans and chimpanzees. Science 188, 107–116 (1975).
Lettice, L. A. et al. A long-range Shh enhancer regulates expression in the developing limb and fin and is associated with preaxial polydactyly. Hum. Mol. Genet. 12, 1725–1735 (2003). This study describes non-coding mutations that cause Mendelian limb defects by affecting enhancers important to developmental sonic hedgehog (Shh) gene regulation. A combination of evolutionary sequence conservation and mouse-based experimental assessments of variant function were used.
Stenson, P. D. et al. The Human Gene Mutation Database: providing a comprehensive central mutation database for molecular diagnostics and personalized genomics. Hum. Genomics 4, 69–72 (2009).
Treisman, R., Orkin, S. H. & Maniatis, T. Specific transcription and RNA splicing defects in five cloned β-thalassaemia genes. Nature 302, 591–596 (1983).
Woolfe, A. et al. Highly conserved non-coding sequences are associated with vertebrate development. PLoS Biol. 3, e7 (2005).
Dehal, P. et al. The draft genome of Ciona intestinalis: insights into chordate and vertebrate origins. Science 298, 2157–2167 (2002).
Pollard, K. S., Hubisz, M. J., Rosenbloom, K. R. & Siepel, A. Detection of nonneutral substitution rates on mammalian phylogenies. Genome Res. 20, 110–121 (2010).
Asthana, S., Roytberg, M., Stamatoyannopoulos, J. & Sunyaev, S. Analysis of sequence conservation at nucleotide resolution. PLoS Comput. Biol. 3, e254 (2007).
Margulies, E. H., Blanchette, M., Haussler, D. & Green, E. D. Identification and characterization of multi-species conserved sequences. Genome Res. 13, 2507–2518 (2003).
Dubchak, I. et al. Active conservation of noncoding sequences revealed by three-way species comparisons. Genome Res. 10, 1304–1306 (2000).
Parker, S. C., Hansen, L., Abaan, H. O., Tullius, T. D. & Margulies, E. H. Local DNA topography correlates with functional noncoding regions of the human genome. Science 324, 389–392 (2009).
Cooper, G. M. et al. Single-nucleotide evolutionary constraint scores highlight disease-causing mutations. Nature Methods 7, 250–251 (2010). This paper demonstrated that functionally agnostic nucleotide-level constraint scores, defined by GERP (also see references 17 and 67), offer considerable utility for causal variant discovery in exome analyses.
Wang, G. S. & Cooper, T. A. Splicing in disease: disruption of the splicing code and the decoding machinery. Nature Rev. Genet. 8, 749–761 (2007).
Drake, J. A. et al. Conserved noncoding sequences are selectively constrained and not mutation cold spots. Nature Genet. 38, 223–227 (2006).
Katzman, S. et al. Human genome ultraconserved elements are ultraselected. Science 317, 915 (2007).
Goode, D. L. et al. Evolutionary constraint facilitates interpretation of genetic variation in resequenced human genomes. Genome Res. 20, 301–310 (2010).
Pennacchio, L. A. et al. In vivo enhancer analysis of human conserved non-coding sequences. Nature 444, 499–502 (2006).
Margulies, E. H. et al. Analyses of deep mammalian sequence alignments and constraint predictions for 1% of the human genome. Genome Res. 17, 760–774 (2007).
The ENCODE Project Consortium. A user's guide to the encyclopedia of DNA elements (ENCODE). PLoS Biol. 9, e1001046 (2011).
Ge, B. et al. Global patterns of cis variation in human cells revealed by high-density allelic expression analysis. Nature Genet. 41, 1216–1222 (2009).
Nica, A. C. et al. Candidate causal regulatory effects by integration of expression QTLs with complex trait genetic associations. PLoS Genet. 6, e1000895 (2010).
Zheng, W., Zhao, H., Mancera, E., Steinmetz, L. M. & Snyder, M. Genetic analysis of variation in transcription factor binding in yeast. Nature 464, 1187–1191 (2010).
Patwardhan, R. P. et al. High-resolution analysis of DNA regulatory elements by synthetic saturation mutagenesis. Nature Biotech. 27, 1173–1175 (2009). This paper defined a method to exploit next-generation sequencing to comprehensively yet efficiently assay point mutations in transcriptional promoters.
Mortazavi, A., Williams, B. A., McCue, K., Schaeffer, L. & Wold, B. Mapping and quantifying mammalian transcriptomes by RNA-seq. Nature Methods 5, 621–628 (2008).
Johnson, D. S., Mortazavi, A., Myers, R. M. & Wold, B. Genome-wide mapping of in vivo protein-DNA interactions. Science 316, 1497–1502 (2007).
Cao, A. R. et al. Genome-wide analysis of transcription factor E2F1 mutant proteins reveals that N- and C-terminal protein interaction domains do not participate in targeting E2F1 to the human genome. J. Biol. Chem. 286, 11985–11996 (2011).
Botstein, D. & Shortle, D. Strategies and applications of in vitro mutagenesis. Science 229, 1193–1201 (1985).
Blow, M. J. et al. ChIP-seq identification of weakly conserved heart enhancers. Nature Genet. 42, 806–810 (2010).
Cheng, Y. et al. Erythroid GATA1 function revealed by genome-wide analysis of transcription factor occupancy, histone modifications, and mRNA expression. Genome Res. 19, 2172–2184 (2009).
Miller, D. T. et al. Consensus statement: chromosomal microarray is a first-tier clinical diagnostic test for individuals with developmental disabilities or congenital anomalies. Am. J. Hum. Genet. 86, 749–764 (2010).
Walsh, T. et al. Rare structural variants disrupt multiple genes in neurodevelopmental pathways in schizophrenia. Science 320, 539–543 (2008).
Markiewicz, P., Kleina, L. G., Cruz, C., Ehret, S. & Miller, J. H. Genetic studies of the lac repressor. XIV. Analysis of 4000 altered Escherichia coli lac repressors reveals essential and non-essential residues, as well as “spacers” which do not require a specific sequence. J. Mol. Biol. 240, 421–433 (1994).
Rennell, D., Bouvier, S. E., Hardy, L. W. & Poteete, A. R. Systematic mutation of bacteriophage T4 lysozyme. J. Mol. Biol. 222, 67–88 (1991).
Hardison, R. C. et al. HbVar: a relational database of human hemoglobin variants and thalassemia mutations at the globin gene server. Hum. Mutat. 19, 225–233 (2002).
Olivier, M. et al. The IARC TP53 database: new online mutation analysis and recommendations to users. Hum. Mutat. 19, 607–614 (2002).
Yip, Y. L. et al. The Swiss-Prot variant page and the ModSNP database: a resource for sequence and structure information on human protein variants. Hum. Mutat. 23, 464–470 (2004).
Brown, C. D., Johnson, D. S. & Sidow, A. Functional architecture and evolution of transcriptional elements that drive gene coexpression. Science 317, 1557–1560 (2007).
Moses, A. M., Chiang, D. Y., Kellis, M., Lander, E. S. & Eisen, M. B. Position specific variation in the rate of evolution in transcription factor binding sites. BMC Evol. Biol. 3, 19 (2003).
Dekker, J., Rippe, K., Dekker, M. & Kleckner, N. Capturing chromosome conformation. Science 295, 1306–1311 (2002).
Liu, D. J. & Leal, S. M. A novel adaptive method for the analysis of next-generation sequencing data to detect complex trait associations with rare variants due to gene main effects and interactions. PLoS Genet. 6, e1001156 (2010). This paper describes an approach to assess the significance of correlations between gene or locus aggregates of rare variants and phenotypes and may also be useful in identifying significant variant interactions.
Yandell, M. et al. A probabilistic disease-gene finder for personal genomes. Genome Res. 23 Jun 2011 (doi:10.1101/gr.123158.111). This paper defines a method, VAAST, to predict disease genes or loci on the basis of the total predicted deleteriousness of rare variants observed in affected individuals.
Gerke, J., Lorenz, K. & Cohen, B. Genetic interactions between transcription factors cause natural variation in yeast. Science 323, 498–501 (2009).
Gerke, J., Lorenz, K., Ramnarine, S. & Cohen, B. Gene–environment interactions at nucleotide resolution. PLoS Genet. 6, e1001144 (2010).
Bush, W. S. et al. A knowledge-driven interaction analysis reveals potential neurodegenerative mechanism of multiple sclerosis susceptibility. Genes Immun. (2011).
Rual, J. F. et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature 437, 1173–1178 (2005).
Emilsson, V. et al. Genetics of gene expression and its effect on disease. Nature 452, 423–428 (2008).
The Gene Ontology Consortium et al. Gene ontology: tool for the unification of biology. Nature Genet. 25, 25–29 (2000).
Risch, N. & Merikangas, K. The future of genetic studies of complex human diseases. Science 273, 1516–1517 (1996).
Rothman, K. J. No adjustments are needed for multiple comparisons. Epidemiology 1, 43–46 (1990).
Keinan, A., Mullikin, J. C., Patterson, N. & Reich, D. Measurement of the human allele frequency spectrum demonstrates greater genetic drift in East Asians than in Europeans. Nature Genet. 39, 1251–1255 (2007).