The architecture of long-range haplotypes shared within and across populations

Alexander Gusev et al. Mol Biol Evol. 2012 Feb.

Abstract

Homologous long segments along the genomes of close or remote relatives that are identical by descent (IBD) from a common ancestor provide clues for recent events in human genetics. We set out to extensively map such IBD segments in large cohorts and investigate their distribution within and across different populations. We report analysis of several data sets, demonstrating that IBD is more common than expected by naïve models of population genetics. We show that the frequency of IBD pairs is population dependent and can be used to cluster individuals into populations, detect a homogeneous subpopulation within a larger cohort, and infer bottleneck events in such a subpopulation. Specifically, we show that Ashkenazi Jewish individuals are all connected through transitive remote family ties evident by sharing of 50 cM IBD to a publicly available data set of less than 400 individuals. We further expose regions where long-range haplotypes are shared significantly more often than elsewhere in the genome, observed across multiple populations, and enriched for common long structural variation. These are inconsistent with recent relatedness and suggest ancient common ancestry, with limited recombination between haplotypes.

Figures

FIG. 1.

Manhattan-style plots of IBD segment sharing in worldwide populations. The fraction of pairs of individuals that are IBD at a locus (y axis) is shown as a function of the genomic position of that locus (A) within Ashkenazi/European cohorts, (B) within HapMap cohorts, and (C) between HapMap continents/populations (scale not consistent with A, B). Panel C highlights enriched regions, consistent with intrapopulation sharing. Within populations, the normalization factor was equal to the number of unique pairs; between populations, the normalization factor was the product of the respective cohort sizes.
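
The two normalization rules stated in this caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names and the example counts are hypothetical.

```python
def unique_pairs(n: int) -> int:
    """Number of unordered pairs among n individuals: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def within_population_fraction(ibd_pairs: int, cohort_size: int) -> float:
    """Fraction of pairs IBD at a locus within one cohort.

    Normalized by the number of unique pairs in the cohort.
    """
    total = unique_pairs(cohort_size)
    if total == 0:
        raise ValueError("cohort must contain at least two individuals")
    return ibd_pairs / total


def between_population_fraction(ibd_pairs: int, size_a: int, size_b: int) -> float:
    """Fraction of cross-cohort pairs IBD at a locus.

    Normalized by the product of the two cohort sizes.
    """
    total = size_a * size_b
    if total == 0:
        raise ValueError("both cohorts must be non-empty")
    return ibd_pairs / total


# Hypothetical counts, for illustration only.
within = within_population_fraction(ibd_pairs=99, cohort_size=100)
between = between_population_fraction(ibd_pairs=20, size_a=50, size_b=40)
```

With these made-up counts, a cohort of 100 individuals has 4,950 unique pairs, and two cohorts of 50 and 40 individuals have 2,000 cross-cohort pairs, which is why the two kinds of panel are not on a common scale.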

FIG. 2.

Graph plot of IBD sharing in HapMap populations and resultant clusters. Nodes denote individuals, color-coded by cohort, and edges represent normalized genome-wide IBD sharing. (A) Initial clusters from unfiltered sharing: {GIH}, {LWK}, {JPT,CHD,CHB}, {CEU,TSI} segregate. (B) Final clusters after cross-cluster edges have been iteratively removed: {TSI}, {CEU} newly segregated.

FIG. 3.

Graph plot of IBD sharing between samples of Ashkenazi (blue/dark) and European (green/light) origin. Each colored vertex represents a sample from the respective population, edges represent IBD sharing between incident individuals, and edge width represents the total amount of sharing genome-wide. Ashkenazi samples form a “giant connected component” and have no edges longer than 100 cM to the European population.
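
The notion of a connected component in such a sharing graph can be sketched as below. This is an illustrative sketch only: the sample labels, the sharing values, and the `min_cm` threshold are hypothetical, and the traversal is a generic depth-first search, not necessarily the procedure used in the paper.

```python
from collections import defaultdict


def connected_components(samples, sharing, min_cm=0.0):
    """Group samples into connected components of an IBD-sharing graph.

    sharing maps an unordered pair (a, b) to the total cM shared
    genome-wide; only pairs sharing at least min_cm become edges.
    """
    adjacency = defaultdict(set)
    for (a, b), cm in sharing.items():
        if cm >= min_cm:
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen = set()
    components = []
    for start in samples:
        if start in seen:
            continue
        # Depth-first traversal from an unvisited sample.
        stack = [start]
        seen.add(start)
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return components


# Toy data (hypothetical): three AJ samples linked transitively, two EU samples.
samples = ["AJ1", "AJ2", "AJ3", "EU1", "EU2"]
sharing = {
    ("AJ1", "AJ2"): 60.0,
    ("AJ2", "AJ3"): 55.0,
    ("AJ3", "EU1"): 10.0,
    ("EU1", "EU2"): 5.0,
}
components = connected_components(samples, sharing, min_cm=50.0)
largest = max(components, key=len)
```

In the toy data, AJ1 and AJ3 share no edge directly but fall in the same component through AJ2, which is the sense in which ties are transitive.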

FIG. 4.

Relationship between segment length and amount of sharing in real and simulated data. We compute the expected number of IBD segments shared within each population (y axis, logarithmic scale) for the discrete segment length range of 3 to 30 cM (x axis). (A) AJ and EU populations are shown with dot and line; solid lines show simulated coalescent data drawn from a Wright–Fisher model (WF: dark/light gray) and a bottleneck model (BN: highlight). (B) HapMap populations shown in solid colors. The y-intercept correlates with ancestral population size; the decay loosely correlates with population growth. For both panels, only data points at which sharing exceeds 1 in 1,000 pairs of individuals (varies by population) are shown.
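
A per-pair sharing curve of the kind plotted here could be tabulated roughly as follows. This is a sketch under assumptions that the caption does not confirm: segments are binned by whole centimorgans, the rate is the segment count divided by the number of unique pairs, and the function name and input values are hypothetical.

```python
from collections import Counter


def sharing_by_length(segment_lengths_cm, n_individuals,
                      min_cm=3, max_cm=30, min_rate=1e-3):
    """Average number of IBD segments per pair, by whole-cM length bin.

    Bins whose rate does not exceed min_rate (1 in 1,000 pairs) are dropped,
    mirroring the display cutoff described in the caption.
    """
    n_pairs = n_individuals * (n_individuals - 1) // 2
    if n_pairs == 0:
        raise ValueError("need at least two individuals")

    # Assumption: a segment of length x falls in bin floor(x).
    counts = Counter(
        int(length) for length in segment_lengths_cm
        if min_cm <= length <= max_cm
    )
    rates = {length_bin: count / n_pairs
             for length_bin, count in sorted(counts.items())}
    return {length_bin: rate for length_bin, rate in rates.items()
            if rate > min_rate}


# Hypothetical segment lengths (cM) detected among 4 individuals (6 pairs).
curve = sharing_by_length([3.2, 3.7, 4.1, 10.0, 31.0], n_individuals=4)
```

The 31.0 cM segment falls outside the plotted range and is ignored; plotting `curve` with a logarithmic y axis would give one point per retained bin.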
