Human population differentiation is strongly correlated with local recombination rate

Alon Keinan et al. PLoS Genet. 2010.

Abstract

Allele frequency differences across populations can provide valuable information both for studying population structure and for identifying loci that have been targets of natural selection. Here, we examine the relationship between recombination rate and population differentiation in humans by analyzing two uniformly-ascertained, whole-genome data sets. We find that population differentiation, as assessed by inter-continental F_ST, is negatively correlated with recombination rate: F_ST is reduced by 10% in the tenth of the genome with the highest recombination rate compared with the tenth of the genome with the lowest recombination rate (P ≪ 10⁻¹²). This pattern cannot be explained by the mutagenic properties of recombination and instead must reflect the impact of selection in the last 100,000 years since human continental populations split. The relationship between recombination rate and F_ST is qualitatively different for F_ST between African and non-African populations than for F_ST between European and East Asian populations, suggesting varying levels or types of selection in different epochs of human history.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Population differentiation in allele frequencies is inversely correlated with recombination rate.

We placed 1,110,338 SNPs into 10 bins according to the recombination rate in a 3 Mb window centered on each SNP. The x-axis of every panel shows the recombination rate; the tick values mark the boundaries between the 10 bins. For each bin, at the x-axis position corresponding to the median recombination rate of the SNPs in that bin, the figure presents (A) global population differentiation between African Americans, Europeans, and Chinese; (B) F_ST between African Americans and Europeans; (C) F_ST between African Americans and Chinese; and (D) F_ST between Europeans and Chinese. Error bars indicate ±1 standard error, estimated from 1,000 moving block bootstraps over the SNPs in the bin. The linear regression of the F_ST estimates on the median recombination rate ρ of each bin is also shown (solid line): (A) 0.1280 − 0.0048ρ, (B) 0.1138 − 0.0057ρ, (C) 0.1546 − 0.0067ρ, and (D) 0.1156 − 0.0022ρ. The corresponding correlation coefficients between F_ST and median recombination rate are (A) r = −0.962 (P = 8.9×10⁻⁶), (B) r = −0.815 (P = 0.0041), (C) r = −0.931 (P = 0.0001), and (D) r = −0.361 (P = 0.306). For comparison, population differentiation based on all SNPs in all bins combined is shown as a horizontal dotted line. The y-axis range differs among the four panels but spans 0.02 units in each. Figure S1 repeats Figure 1A for sets of SNPs in different minor allele frequency categories.
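The binning, bootstrap, and regression steps described in this caption can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' pipeline: the record layout `(rho, position, p1, p2)`, the function names, the block length, and the simple ratio-of-sums F_ST without sample-size correction are all hypothetical choices made here, and the excerpt does not say which F_ST estimator was actually used.

```python
import random
import statistics

def fst_ratio(freq_pairs):
    """Ratio-of-sums F_ST from (p1, p2) allele-frequency pairs.

    Assumption: a simplified estimator with no sample-size correction.
    """
    num = sum((p1 - p2) ** 2 for p1, p2 in freq_pairs)
    den = sum(p1 * (1 - p2) + p2 * (1 - p1) for p1, p2 in freq_pairs)
    return num / den if den > 0 else float("nan")

def decile_bins(records, n_bins=10):
    """Split (rho, position, p1, p2) records into equal-count bins by rho."""
    ordered = sorted(records, key=lambda r: r[0])
    n = len(ordered)
    return [ordered[i * n // n_bins:(i + 1) * n // n_bins] for i in range(n_bins)]

def block_bootstrap_se(freq_pairs, block_len, n_boot=1000, rng=None):
    """Standard error of F_ST from a moving block bootstrap.

    Overlapping blocks of consecutive SNPs are drawn with replacement until
    the resample reaches the original length.
    """
    rng = rng or random.Random(0)
    n = len(freq_pairs)
    block_len = min(block_len, n)
    starts = range(n - block_len + 1)
    n_blocks = -(-n // block_len)  # ceiling division
    estimates = []
    for _ in range(n_boot):
        sample = []
        for _ in range(n_blocks):
            s = rng.choice(starts)
            sample.extend(freq_pairs[s:s + block_len])
        estimates.append(fst_ratio(sample[:n]))
    return statistics.stdev(estimates)

def linear_fit(xs, ys):
    """Ordinary least squares; returns (intercept, slope, pearson_r)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope, sxy / (sxx * syy) ** 0.5

def summarize_bins(records, block_len=50, n_boot=1000):
    """Per bin: (median rho, F_ST, bootstrap SE), with SNPs in position order."""
    rows = []
    for snps in decile_bins(records):
        by_pos = sorted(snps, key=lambda r: r[1])
        pairs = [(p1, p2) for _, _, p1, p2 in by_pos]
        rows.append((statistics.median(r[0] for r in snps),
                     fst_ratio(pairs),
                     block_bootstrap_se(pairs, block_len, n_boot)))
    return rows
```

Passing the per-bin medians and F_ST values from `summarize_bins` to `linear_fit` yields an intercept, slope, and correlation coefficient of the kind reported for each panel. Sorting each bin by position before resampling assumes a single coordinate system, which real multi-chromosome data would need to handle explicitly.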

Figure 2. Population differentiation is more strongly correlated with recombination rate in genes than outside of genes.

Global population differentiation between African Americans, Europeans, and Chinese is presented for coding SNPs (cSNPs). Except that it is restricted to the 21,391 SNPs in coding exons, the figure is identical to Figure 1A. In addition to the linear regression of the F_ST estimates on the median recombination rate ρ in each bin (solid line; 0.1381 − 0.0081ρ), the linear regression for the rest of the data set (non-coding SNPs) is shown (dashed line; 0.1278 − 0.0048ρ); the latter is very similar to the regression based on the entire data set (Figure 1A). The correlation coefficient between F_ST of cSNPs and median recombination rate is r = −0.752 (P = 0.012).
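To make the comparison between the two regression lines concrete, the short sketch below evaluates both fitted lines and solves for the recombination rate at which they would cross. The coefficients are taken from the caption; the function names are hypothetical, the units of ρ are whatever the figure's x-axis uses (not stated in this excerpt), and whether the crossing point lies inside the observed range of ρ is not something the excerpt lets us confirm.

```python
# Fitted lines from the caption, as (intercept, slope) pairs.
CODING = (0.1381, -0.0081)      # cSNPs, solid line
NONCODING = (0.1278, -0.0048)   # non-coding SNPs, dashed line

def predicted_fst(line, rho):
    """F_ST predicted by a fitted line at recombination rate rho."""
    intercept, slope = line
    return intercept + slope * rho

def crossover_rho(line_a, line_b):
    """Recombination rate at which two fitted lines predict the same F_ST."""
    (a0, a1), (b0, b1) = line_a, line_b
    return (b0 - a0) / (a1 - b1)

def slope_ratio(line_a, line_b):
    """How many times steeper line_a is than line_b."""
    return line_a[1] / line_b[1]

if __name__ == "__main__":
    print("slope ratio (coding / non-coding):", slope_ratio(CODING, NONCODING))
    print("lines cross at rho =", crossover_rho(CODING, NONCODING))
```

Under these fits the coding slope is roughly 1.7 times the non-coding slope, which is the sense in which the correlation is described as stronger in genes.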

Figure 3. Confirmation of the correlation of allele frequency differentiation and recombination rate in uniformly-ascertained subsets of HapMap.

Similar to Figure 1, we placed 248,886 uniformly-ascertained HapMap SNPs into 10 bins according to the recombination rate and estimated for each bin (A) global population differentiation between YRI, CEU, and ASN (ASN denotes the combined CHB and JPT samples); (B) F_ST between YRI and CEU; (C) F_ST between YRI and ASN; and (D) F_ST between CEU and ASN. The linear regression on the median recombination rate ρ (solid line) is (A) 0.1473 − 0.0026ρ, (B) 0.1541 − 0.0028ρ, (C) 0.1819 − 0.0046ρ, and (D) 0.1060 − 0.0005ρ. The corresponding correlation coefficients between F_ST and median recombination rate are (A) r = −0.526 (P = 0.118), (B) r = −0.482 (P = 0.158), (C) r = −0.634 (P = 0.049), and (D) r = −0.066 (P = 0.857).

Figure 4. The relationship between population differentiation and recombination rate in the larger set of HapMap 3 populations.

We placed 1,326,404 autosomal HapMap 3 SNPs (release 2) into 10 bins according to recombination rate and estimated for each bin (A) global population differentiation between all 11 populations and (B–E) population differentiation between pairs of populations. To avoid clutter, panels B–E depict only the linear (dashed lines) and quadratic (solid lines) regressions of the F_ST estimates on the median recombination rate in each bin, with the population pairs partitioned as follows: (B) F_ST between an African and a non-African population, which is negatively correlated with recombination rate and has a convex quadratic regression; (C) F_ST between a population of European, East Asian, or South Asian ancestry and a population of a different one of these three ancestries, which shows a concave quadratic regression for all pairs of populations and recapitulates the result observed between North Europeans and East Asians in the uniformly-ascertained data sets (Figure 1D and Figure 3D) (a weaker phenomenon is observed for the South Asian GIH sample, which may be because this population is somewhat related to both Europeans and East Asians, thereby confounding the North European–East Asian signal); (D) F_ST between two African populations, which shows a much steeper linear regression than intercontinental F_ST, as well as a convex quadratic regression; and (E) F_ST between closely related non-African populations (within either Europe or East Asia; genome-wide F_ST < 0.008), which shows a very steep linear regression and a convex quadratic regression. F_ST based on all SNPs in all bins combined is shown as a horizontal dotted line; it equals 1 in panels B–E because these panels present normalized F_ST values, obtained by dividing each value by the genome-wide F_ST for the same pair of populations.
Population codes are as follows: WAF (“West African”) is a combined sample of YRI (Yoruba in Ibadan, Nigeria) and LWK (Luhya in Webuye, Kenya); EAS (“East Asia”) is a combined sample of CHB (Han Chinese in Beijing, China), CHD (Chinese in Metropolitan Denver, CO, USA), and JPT (Japanese in Tokyo, Japan); EUR (“Europe”) is a combined sample of CEU (ancestry from Northern and Western Europe) and TSI (Toscani in Italia); GIH is a sample of Gujarati Indians in Houston, TX, USA; MKK is a sample of Maasai in Kinyawa, Kenya; and CHI (Chinese) is a combined sample of CHB and CHD.
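The normalization and quadratic regression used in panels B–E can be illustrated with a small sketch. It assumes per-bin F_ST values and median recombination rates are already available; the function names are hypothetical, and the labels "convex" and "concave" follow the standard mathematical convention (positive versus negative second-order coefficient), which is an assumption about how the caption uses the terms.

```python
def normalize_fst(per_bin_fst, genome_wide_fst):
    """Divide each per-bin F_ST by the genome-wide F_ST for the same pair."""
    return [f / genome_wide_fst for f in per_bin_fst]

def quadratic_fit(xs, ys):
    """Least-squares fit y = c0 + c1*x + c2*x**2 via the 3x3 normal equations."""
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented matrix [X'X | X'y]
    a = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
    return tuple(a[i][3] / a[i][i] for i in range(3))

def curvature_label(c2, tol=1e-12):
    """Label the fitted parabola by the sign of its second-order coefficient."""
    if c2 > tol:
        return "convex"
    if c2 < -tol:
        return "concave"
    return "linear"
```

Because every normalized curve is expressed relative to its own genome-wide F_ST, pairs with very different absolute differentiation (for example panel D versus panel E) can be drawn on a common scale around 1.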
