Model-based analyses of whole-genome data reveal a complex evolutionary history involving archaic introgression in Central African Pygmies

PingHsun Hsieh et al. Genome Res. 2016 Mar.

Abstract

Comparisons of whole-genome sequences from ancient and contemporary samples have pointed to several instances of archaic admixture through interbreeding between the ancestors of modern non-Africans and now extinct hominids such as Neanderthals and Denisovans. One implication of these findings is that some adaptive features in contemporary humans may have entered the population via gene flow with archaic forms in Eurasia. Within Africa, fossil evidence suggests that anatomically modern humans (AMH) and various archaic forms coexisted for much of the last 200,000 yr; however, the absence of ancient DNA in Africa has limited our ability to make a direct comparison between archaic and modern human genomes. Here, we use statistical inference based on high coverage whole-genome data (greater than 60×) from contemporary African Pygmy hunter-gatherers as an alternative means to study the evolutionary history of the genus Homo. Using whole-genome simulations that consider demographic histories that include both isolation and gene flow with neighboring farming populations, our inference method rejects the hypothesis that the ancestors of AMH were genetically isolated in Africa, thus providing the first whole genome-level evidence of African archaic admixture. Our inferences also suggest a complex human evolutionary history in Africa, which involves at least a single admixture event from an unknown archaic population into the ancestors of AMH, likely within the last 30,000 yr.

© 2016 Hsieh et al.; Published by Cold Spring Harbor Laboratory Press.

Figures

Figure 1.

Significant excess of tests with very low _P_-values in the observed _S* P_-value distribution. Plotted is the observed (dashed lines) _S* P_-value distribution for the real data, calculated based on each of the four sets of whole-genome simulations. The four sets arise from the combination of the two demographic null models (Supplemental Fig. S2): Model-1 (the continuous asymmetric gene flow model) and Model-2 (the single-pulse admixture model); and the two genetic recombination maps: HapMap Yoruba map (HapMap) and African American map (AAMap). The solid line represents the null _S* P_-value distributions of the four whole-genome null simulation sets, derived by calculating _P_-values using a single randomly chosen simulation from each set. All four are uniformly distributed between 0 and 1 as expected. For the real data (dashed lines), all four analyses show a significant shift to small _P_-values in the observed _S* P_-value distribution (one-sided Mann-Whitney U test, _P_ < 2.2 × 10⁻¹⁶), thus rejecting our demographic null hypotheses including no archaic admixture.
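The comparison described in this caption can be illustrated with a minimal sketch of a one-sided Mann-Whitney U test, which asks whether one sample of _P_-values tends to be smaller than another. This is not the authors' code: the function name `mann_whitney_u_less`, the normal approximation with tie and continuity corrections, and the synthetic inputs are all assumptions made for illustration.

```python
import math
import random


def mann_whitney_u_less(x, y):
    """One-sided Mann-Whitney U test of H1: values in x tend to be smaller than in y.

    Returns (U statistic for x, approximate p-value). The p-value uses the
    normal approximation with tie and continuity corrections (an assumption
    of this sketch; exact tables would be preferable for tiny samples).
    """
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    # Assign average ranks to tied values and accumulate the tie correction.
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j

    # U for sample x: rank sum of x minus its minimum possible value.
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0

    n = n1 + n2
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))

    # Lower-tail probability: a small U means x is shifted toward small values.
    z = (u_x - mean_u + 0.5) / math.sqrt(var_u)
    p_value = 0.5 * math.erfc(-z / math.sqrt(2.0))
    return u_x, p_value


# Synthetic illustration only: a uniform "null" sample versus an "observed"
# sample enriched for small values. These are not the study's data.
random.seed(1)
null_p = [random.random() for _ in range(500)]
observed_p = [random.random() ** 2 for _ in range(500)]
u_stat, p_val = mann_whitney_u_less(observed_p, null_p)
print(f"U = {u_stat:.0f}, one-sided p = {p_val:.3g}")
```

A small one-sided _P_-value here indicates that the first sample is shifted toward smaller values than the second, which is the kind of shift the caption reports for the real data relative to the null simulations.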

Figure 2.

Haplotype network for the candidate introgressive locus Chr 16:8702222-8747116. Each circle is a haplotype with size proportional to the haplotype frequency, and the shade of gray indicates the haplotype frequency in the Pygmy (lighter gray) and Yoruba (darker gray) samples. Vertical bars along each branch indicate the number of mutations separating the haplotypes.

Figure 3.

Wide joint distribution of TMRCA and genetic length for the top 1% _S* P_-value candidate loci. Darker gray dots and lighter gray triangles are, respectively, the two candidate sets from the top 1% candidate loci from the observed _S* P_-value and empirical S* distributions. The top and right plots show the marginal density of genetic length and TMRCA, respectively, for both candidate sets.

Figure 4.

Decay of pairwise LD with respect to genetic distance for SNPs ascertained from the top 1% candidate introgressive loci. Black and blue dots are the average estimated LD among pairs of SNPs binned using genetic distance (in 0.001 cM increments) using real data and 100 bootstraps, respectively. The genetic distance is calculated based on the HapMap Yoruba map (The International HapMap Consortium 2007). For the cases of using the African American map (Hinch et al. 2011), see Supplemental Figure S10. The green curve is the fit of a single exponential using the data, while the red and orange curves are the fits of two exponentials using the real data and 100 bootstraps, respectively. (A) Fitting LD decay within genetic distance 0.02–1 cM. (B) Fitting LD decay within genetic distance 0.002–1 cM.
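The binning and single-exponential fit mentioned in this caption can be sketched as follows. The caption does not say how the curves were fitted, so this example makes its own assumptions: the helper names `bin_mean_ld` and `fit_single_exponential` are hypothetical, and the fit uses ordinary least squares on log-transformed LD, a simplification that only works for positive LD values. The two-exponential fit would need a nonlinear least-squares routine and is not attempted here.

```python
import math
from collections import defaultdict


def bin_mean_ld(pairs, bin_width=0.001):
    """Average LD over SNP pairs grouped into genetic-distance bins.

    pairs: iterable of (distance_cM, ld) tuples.
    Returns a list of (bin_center_cM, mean_ld), sorted by distance.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for dist_cm, ld in pairs:
        idx = int(dist_cm // bin_width)
        sums[idx][0] += ld
        sums[idx][1] += 1
    return [
        ((idx + 0.5) * bin_width, total / count)
        for idx, (total, count) in sorted(sums.items())
    ]


def fit_single_exponential(points):
    """Fit ld = a * exp(-rate * d) by least squares on log(ld).

    points: iterable of (distance_cM, ld); non-positive LD values are skipped.
    Returns (a, rate). Log-linear fitting is an assumption of this sketch,
    not a procedure stated in the source.
    """
    usable = [(d, math.log(ld)) for d, ld in points if ld > 0]
    n = len(usable)
    mean_x = sum(d for d, _ in usable) / n
    mean_y = sum(y for _, y in usable) / n
    sxx = sum((d - mean_x) ** 2 for d, _ in usable)
    sxy = sum((d - mean_x) * (y - mean_y) for d, y in usable)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Synthetic illustration only (not the study's data): points that lie
# exactly on a known exponential curve.
true_a, true_rate = 0.3, 50.0
synthetic = [(d / 1000.0, true_a * math.exp(-true_rate * d / 1000.0))
             for d in range(20, 1000, 20)]
a_hat, rate_hat = fit_single_exponential(synthetic)
print(f"a = {a_hat:.4f}, rate = {rate_hat:.2f} per cM")
```

Restricting the input to a distance window, such as the 0.02–1 cM and 0.002–1 cM ranges in panels A and B, would amount to filtering `points` before calling the fit.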

Figure 5.

Comparison of the joint distributions of TMRCA and genetic length for the top 1% _S* P_-value candidate loci from the data and whole-genome archaic admixture simulations. In each panel, the joint and marginal distributions of TMRCA (million yr ago) and genetic length (cM) of our candidate loci from the data (black cross and solid line in scatter and density plots, respectively) are compared with those from archaic admixture simulations (symbols and dashed lines in scatter and density plots, respectively): (A) two-wave archaic admixture model; (B) single-wave, 2% archaic admixture; (C) single-wave, 5% archaic admixture. TMRCA estimates for archaic simulation candidates were obtained from simulated coalescent trees in MaCS (Chen et al. 2009).
