Signals of recent positive selection in a worldwide sample of human populations

Joseph K Pickrell et al. Genome Res. 2009 May.

Abstract

Genome-wide scans for recent positive selection in humans have yielded insight into the mechanisms underlying the extensive phenotypic diversity in our species, but have focused on a limited number of populations. Here, we present an analysis of recent selection in a global sample of 53 populations, using genotype data from the Human Genome Diversity-CEPH Panel. We refine the geographic distributions of known selective sweeps, and find extensive overlap between these distributions for populations in the same continental region but limited overlap between populations outside these groupings. We present several examples of previously unrecognized candidate targets of selection, including signals at a number of genes in the NRG-ERBB4 developmental pathway in non-African populations. Analysis of recently identified genes involved in complex diseases suggests that there has been selection on loci involved in susceptibility to type II diabetes. Finally, we search for local adaptation between geographically close populations, and highlight several examples.

Figures

Figure 1.

Top 10 iHS (A) and XP-EHH (B) signals by population cluster. Each row is a 200-kb genomic window, each column is a geographic region, and each cell is colored according to the position of the window in the empirical distribution of scores for that region. Plotted are the most extreme 10 windows for each geographic region. Gray cells in A are windows that have fewer than 20 SNPs for which iHS was calculated (see Methods). To the right of each row is a list of genes that fall in the window. Windows where the genes are in red are discussed in the text. Note that interpretation of the overlap in XP-EHH signals is complicated by the need for a reference population; see the main text.

Figure 2.

Evidence for selection in a region containing part of the gene C21orf34. (A) Haplotype plots in a 500-kb region on chromosome 21 surrounding the locus. Each row represents a haplotype, and each column a SNP. Rows are colored the same if and only if the underlying sequence is identical (some low-frequency SNPs are excluded). For full details on the generation of these plots, see Conrad et al. (2006). (B) Heterozygosity in the same region. Lines show heterozygosity calculated in a sliding window of three SNPs across the region in different populations. Black arrows at the top of the plot represent the positions of SNPs with _F_ST > 0.6 (i.e., in the 0.01% tail of worldwide _F_ST). (C) A pie chart of the worldwide distribution of a SNP that tags the red haplotype in A (rs2823850). (Red) The derived allele frequency; (blue) the ancestral allele frequency.
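
The sliding-window heterozygosity in panel B can be illustrated with a minimal sketch. The caption does not say which estimator was used, so this assumes the expected heterozygosity 2p(1 - p) at each biallelic SNP, averaged over each run of three consecutive SNPs. The function names and the toy allele frequencies are hypothetical and are not taken from the paper.

```python
from typing import List


def expected_heterozygosity(p: float) -> float:
    """Expected heterozygosity 2p(1-p) at a biallelic SNP (assumed estimator)."""
    return 2.0 * p * (1.0 - p)


def sliding_heterozygosity(freqs: List[float], window: int = 3) -> List[float]:
    """Mean per-SNP heterozygosity over each run of `window` consecutive SNPs."""
    if window < 1:
        raise ValueError("window must be at least 1")
    het = [expected_heterozygosity(p) for p in freqs]
    # One value per window start; the last window ends at the final SNP.
    return [sum(het[i:i + window]) / window
            for i in range(len(het) - window + 1)]


# Toy allele frequencies along a region (hypothetical, not HGDP data).
# The trailing fixed sites (p = 0 or 1) pull heterozygosity down to zero.
freqs = [0.5, 0.5, 0.0, 0.0, 1.0]
result = sliding_heterozygosity(freqs)
```

A trough in this statistic within one population, next to higher values in others, is the kind of pattern a local sweep is expected to leave.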

Figure 3.

_F_ST around loci involved in natural variation in pigmentation. For each SNP found to be associated with pigmentation in a genome-wide scan, we plot the maximum pairwise _F_ST between geographic regions in a 100-kb window surrounding the SNP in the HGDP data, as well as a histogram of the null distribution calculated by finding the maximum _F_ST in 100-kb windows surrounding each of 10,000 random SNPs. The dotted lines show the position beyond which 5% of the random SNPs fall, and the solid lines the position beyond which 1% of the random SNPs fall. Gene names that are starred fall in the 5% tail of at least one comparison, and those with two stars fall in the 1% tail of at least one comparison. Letters are positioned along the _y_-axis to improve readability. The key in the bottom right panel applies to all panels.
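
The comparison against the random-SNP null can be sketched as an empirical tail calculation. This is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, the null values are a toy list instead of the 10,000 window maxima, and ties are counted as falling in the tail.

```python
from typing import Sequence


def tail_fraction(observed: float, null_values: Sequence[float]) -> float:
    """Fraction of null values at least as large as `observed`."""
    if not null_values:
        raise ValueError("null_values must not be empty")
    return sum(1 for v in null_values if v >= observed) / len(null_values)


def star_label(observed: float, null_values: Sequence[float]) -> str:
    """Return '**' for the 1% tail, '*' for the 5% tail, '' otherwise."""
    frac = tail_fraction(observed, null_values)
    if frac <= 0.01:
        return "**"
    if frac <= 0.05:
        return "*"
    return ""


# Toy null distribution: 1,000 evenly spaced values in [0, 1) standing in
# for the maximum F_ST around each random SNP (hypothetical data).
null = [i / 1000 for i in range(1000)]

labels = [star_label(x, null) for x in (0.9955, 0.9705, 0.5)]
```

In the figure a gene is starred if any pairwise comparison between regions reaches the tail, so in practice the label would be taken as the strongest one across comparisons.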

Figure 4.

_F_ST around loci involved in natural variation in diabetes susceptibility. For each SNP associated with either type I or type II diabetes, we plot the maximum pairwise _F_ST between geographic regions in a 100-kb window surrounding the SNP in the HGDP data, as well as a histogram of the null distribution calculated by finding the maximum _F_ST in 100-kb windows surrounding each of 10,000 random SNPs. The dotted lines show the position beyond which 5% of the random SNPs fall, and the solid lines the position beyond which 1% of the random SNPs fall. Gene names that are starred fall in the 5% tail of at least one comparison, and those with two stars fall in the 1% tail of at least one comparison. Letters are positioned along the _y_-axis to improve readability. The key in the bottom panel of each column applies to the entire column.

Figure 5.

Selection signals in the NRG–ERBB4 pathway. (A) A schematic of the NRG–ERBB4 pathway, drawn from interactions reported in KEGG (Kanehisa et al. 2008) and Mei and Xiong (2008). Each oval represents a gene, and the colored circles denote the geographic regions that have significant selection signals (empirical scores in the top 5% of the distribution). We excluded Oceania and the Americas from this plot since selection scans are expected to have low power in these regions. For ADAM17, the selection statistic is XP-EHH; for the others it is iHS. (B) Haplotype plots at the putative selected region in ERBB4. (C) Worldwide allele frequencies of a SNP that tags the red haplotype in B (rs1505353). (Red) The derived allele; (blue) the ancestral allele.

Figure 6.

Worldwide allele frequencies of two nonsynonymous SNPs showing evidence of local adaptation. (A) Frequencies of rs5743810 in TLR6; (B) frequencies of rs12421620 in DPP3. (Red) The frequency of the derived allele; (blue) the frequency of the ancestral allele.
