A genome-wide comparison of the functional properties of rare and common genetic variants in humans

Comparative Study

Qianqian Zhu et al. Am J Hum Genet. 2011.

Abstract

One of the longest-running debates in evolutionary biology concerns the kind of genetic variation that is primarily responsible for phenotypic variation in species. Here, we address this question for humans specifically from the perspective of population allele frequency of variants across the complete genome, including both coding and noncoding regions. We establish simple criteria to assess the likelihood that variants are functional based on their genomic locations and then use whole-genome sequence data from 29 subjects of European origin to assess the relationship between the functional properties of variants and their population allele frequencies. We find that for all criteria used to assess the likelihood that a variant is functional, the rarer variants are significantly more likely to be functional than the more common variants. Strikingly, these patterns disappear when we focus on only those variants in which the major alleles are derived. These analyses indicate that the majority of the genetic variation in terms of phenotypic consequence may result from a mutation-selection balance, as opposed to balancing selection, and have direct relevance to the study of human disease.

Copyright © 2011 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

Figures

Figure 1

The Distribution of SNVs in MAF Bins. The MAF range of each bin is shown in Table 1.

Figure 2

The Cumulative Distribution Plot of Conservation Scores Corresponding to SNVs in Each MAF Bin. Insets show the conservation score distribution of SNVs in the first bin (red line) and the last bin (blue line). Only the distribution of conservation scores for primates is shown. Consistent results were obtained using conservation scores for placental mammals or vertebrates (Figures S1 and S2).

Figure 3

The Enrichment of SNVs from Each MAF Bin in Functional Regions. The enrichment of SNVs from each MAF bin in units of gene structure (A) and regulatory regions (B). The relative fraction is calculated as the fraction of SNVs from each bin falling inside a particular type of functional region divided by the fraction corresponding to SNVs from the first bin. The result corresponding to protein-coding genes is very close to the result corresponding to genes and is therefore not shown in the figure. The relationship between MAF and SNV enrichment in the 5′UTR is not significant and is therefore not shown either.
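The relative fraction described in this caption can be illustrated with a minimal sketch. The input layout (pairs of a bin index and a set of region labels), the function name `relative_fractions`, and the convention that the lowest bin index is the "first bin" are all assumptions made for illustration, not the authors' pipeline.

```python
from collections import Counter


def relative_fractions(snvs, region):
    """Relative fraction of SNVs per MAF bin falling inside `region`.

    `snvs` is an iterable of (maf_bin, region_labels) pairs, where
    `maf_bin` is an integer bin index and `region_labels` is a set of
    functional-region names overlapping that SNV (hypothetical layout).
    The lowest bin index is treated as the first bin (an assumption).
    """
    totals = Counter()  # SNVs per bin
    hits = Counter()    # SNVs per bin that fall inside `region`
    for maf_bin, region_labels in snvs:
        totals[maf_bin] += 1
        if region in region_labels:
            hits[maf_bin] += 1

    # Fraction of each bin's SNVs that fall inside the region.
    fractions = {b: hits[b] / totals[b] for b in totals}

    baseline = fractions[min(fractions)]
    if baseline == 0:
        raise ValueError("first bin has no SNVs in the region; ratio undefined")

    # Normalize every bin by the first bin's fraction.
    return {b: fractions[b] / baseline for b in sorted(fractions)}


# Toy data: bin 1 has 2 of 4 SNVs in exons, bin 2 has 1 of 4.
toy_snvs = [
    (1, {"exon"}), (1, {"exon"}), (1, set()), (1, {"intron"}),
    (2, {"exon"}), (2, set()), (2, {"intron"}), (2, set()),
]
result = relative_fractions(toy_snvs, "exon")
```

With this toy input the first bin has a relative fraction of 1 by construction, and a value below 1 in a later bin means that bin is depleted in the region relative to the first bin.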

Figure 4

The Enrichment of SNVs in Functional Regions when Minor Alleles Are Ancestral or Derived. The enrichment of SNVs in which minor alleles are ancestral (A) or derived (B) in functional regions. The relative fraction is calculated as the fraction of SNVs from each bin falling inside a particular type of functional region divided by the fraction corresponding to SNVs from the first bin. The relationship between MAF and SNV enrichment in extremely conserved noncoding elements is not shown in (A) because only 189 SNVs are in this group, and no significant pattern is observed.
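The split between panels (A) and (B) depends on whether the minor allele at each SNV is the ancestral or the derived one. A minimal sketch of that classification follows, assuming biallelic sites, per-allele chromosome counts, and an ancestral allele already inferred elsewhere; the function name and input layout are hypothetical, and the page does not state how the authors handled ties or missing ancestral calls.

```python
def minor_allele_state(allele_counts, ancestral_allele):
    """Classify the minor allele of a biallelic SNV.

    `allele_counts` maps each of the two alleles to its chromosome count
    in the sample (hypothetical layout). Returns "ancestral" or "derived",
    or None when there is no unique minor allele or the ancestral allele
    is not one of the two observed alleles.
    """
    if len(allele_counts) != 2:
        raise ValueError("expected exactly two alleles")

    (allele_a, count_a), (allele_b, count_b) = allele_counts.items()
    if count_a == count_b:
        return None  # both alleles equally frequent: no minor allele
    if ancestral_allele not in allele_counts:
        return None  # ancestral state cannot be matched to this site

    minor = allele_a if count_a < count_b else allele_b
    return "ancestral" if minor == ancestral_allele else "derived"


# 29 diploid subjects give 58 chromosomes per site.
state_derived = minor_allele_state({"A": 50, "G": 8}, "A")
state_ancestral = minor_allele_state({"A": 50, "G": 8}, "G")
```

Here the first call treats G as a rare derived allele, while the second treats the same rare G as ancestral, which implies that the derived allele A is the major allele.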

Figure 5

Simulation Results Corresponding to Varying Proportions of Purifying and Positive Selection. ΔOR measures the enrichment of selected SNVs in rare SNVs versus common SNVs between two situations: minor allele is ancestral and minor allele is derived (see Material and Methods). The x axis is the ratio between the proportion of purifying selection and positive selection. The red line corresponds to the value observed from nonsynonymous SNVs in our real data when assuming all nonsynonymous SNVs are under selection.
