Reconstructing genetic ancestry blocks in admixed individuals

Hua Tang et al. Am J Hum Genet. 2006 Jul.

Abstract

A chromosome in an individual of recently admixed ancestry resembles a mosaic of chromosomal segments, or ancestry blocks, each derived from a particular ancestral population. We consider the problem of inferring ancestry along the chromosomes in an admixed individual and thereby delineating the ancestry blocks. Using a simple population model, we infer gene-flow history in each individual. Compared with existing methods, which are based on a hidden Markov model, the Markov-hidden Markov model (MHMM) we propose has the advantage of accounting for the background linkage disequilibrium (LD) that exists in ancestral populations. When there are more than two ancestral groups, we allow each ancestral population to admix at a different time in history. We use simulations to illustrate the accuracy of the inferred ancestry as well as the importance of modeling the background LD; not accounting for background LD between markers may mislead us to false inferences about mixed ancestry in an indigenous population. The MHMM makes it possible to identify genomic blocks of a particular ancestry by use of any high-density single-nucleotide-polymorphism panel. One application of our method is to perform admixture mapping without genotyping special ancestry-informative-marker panels.
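As an illustration of the baseline the abstract compares against, the following is a minimal sketch of a plain HMM that computes posterior ancestry along one haploid chromosome. It is not the authors' MHMM or their SABER software: it ignores background LD, uses a single admixing time for all populations, and assumes biallelic markers. The function name `ancestry_posteriors`, its parameters, and the transition form (no recombination since admixture with probability exp(-τd), otherwise a fresh draw from the admixture proportions) are assumptions made for this example.

```python
import math


def ancestry_posteriors(alleles, freqs, dists, pi, tau):
    """Posterior ancestry at each marker under a plain HMM (no background LD).

    alleles: observed alleles (0 or 1) on one haploid chromosome, length T
    freqs:   freqs[k][t] is the frequency of allele 1 at marker t in population k
    dists:   genetic distances in Morgans between adjacent markers, length T - 1
    pi:      admixture proportions, length K (assumed to sum to 1)
    tau:     generations since admixture (one value for all populations; an assumption)
    """
    K, T = len(pi), len(alleles)

    def emit(k, t):
        # Probability of the observed allele given ancestry k.
        p = freqs[k][t]
        return p if alleles[t] == 1 else 1.0 - p

    def trans(j, k, d):
        # Stay in the same block with probability exp(-tau * d);
        # otherwise draw a new ancestry from pi.
        stay = math.exp(-tau * d)
        return stay * (1.0 if j == k else 0.0) + (1.0 - stay) * pi[k]

    def normalize(row):
        s = sum(row)
        return [x / s for x in row]

    # Forward pass, rescaled at every marker to avoid underflow.
    fwd = [normalize([pi[k] * emit(k, 0) for k in range(K)])]
    for t in range(1, T):
        fwd.append(normalize([
            emit(k, t) * sum(fwd[t - 1][j] * trans(j, k, dists[t - 1]) for j in range(K))
            for k in range(K)
        ]))

    # Backward pass, rescaled the same way.
    bwd = [[1.0] * K for _ in range(T)]
    for t in range(T - 2, -1, -1):
        bwd[t] = normalize([
            sum(trans(j, k, dists[t]) * emit(k, t + 1) * bwd[t + 1][k] for k in range(K))
            for j in range(K)
        ])

    # Posterior is proportional to forward times backward at each marker.
    return [normalize([fwd[t][k] * bwd[t][k] for k in range(K)]) for t in range(T)]
```

A call such as `ancestry_posteriors([1, 1, 0], [[0.9, 0.8, 0.2], [0.1, 0.3, 0.7]], [0.001, 0.001], [0.5, 0.5], 10)` returns one probability vector per marker. Because this sketch treats alleles as independent given ancestry, LD within an ancestral population can be misread as evidence for ancestry switches, which is the failure mode the abstract describes.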

Figures

Figure 1.

Graphical representation of an HMM (a) and an MHMM (b).

Figure 2.

Two-marker haplotype frequency estimation. Unphased genotype data in 50 individuals were simulated on the basis of chromosome 22 haplotypes of the CEU individuals genotyped in the HapMap project. Each plot can be viewed as a two-dimensional histogram, in which the _X_-axis represents the true haplotype frequency and the _Y_-axis represents the corresponding estimated frequency. The intensity at each pixel indicates the height of the histogram, that is, the number of marker pairs whose true haplotype frequency is at the _X_-coordinate and whose estimated haplotype frequency is at the _Y_-coordinate. a, Naive haplotype frequency estimates. Both allele frequencies and haplotype frequencies are estimated from a small sample of individuals. b, Augmented haplotype frequency estimates. Haplotype frequencies were estimated from the same set of individuals as in panel a, but allele frequencies were estimated from a larger sample.

Figure 3.

Estimated admixing time, τ, of 400 simulated individuals. Red circles represent the MLE under the MHMM; blue triangles represent the MLE under the HMM by use of the same genotype data. True times are 25, 10, and 25, indicated with a yellow square. Some jitter is added to the MLEs to aid visualization.

Figure 4.

Ancestry for a simulated admixed individual. The _Y_-axis represents the posterior probability that one allele is derived from a specific ancestry; the _X_-axis indicates the physical locations of the markers. Top, True ancestral states. Middle, MHMM estimates. Bottom, HMM estimates.

Figure 5.

Comparison of percentage reduction in MSE. Percentage reduction for individual _n_ is defined as $(\mathrm{MSE}^{\mathrm{HMM}}_n - \mathrm{MSE}^{\mathrm{MHMM}}_n)/\mathrm{MSE}^{\mathrm{HMM}}_n$.
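The percentage reduction defined in this caption can be computed as in the short sketch below. The caption does not say how the per-individual MSE is obtained, so the `mse` helper here assumes it is the mean squared difference between true and estimated ancestry across markers; both function names are hypothetical.

```python
def mse(true_ancestry, estimated_ancestry):
    """Mean squared error across markers for one individual (assumed definition)."""
    pairs = list(zip(true_ancestry, estimated_ancestry))
    return sum((t - e) ** 2 for t, e in pairs) / len(pairs)


def pct_reduction(mse_hmm, mse_mhmm):
    """(MSE_HMM - MSE_MHMM) / MSE_HMM for one individual, as in the caption."""
    return (mse_hmm - mse_mhmm) / mse_hmm
```

A positive value means the MHMM estimate has lower error than the HMM estimate for that individual; a negative value means the opposite.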

Figure 6.

Estimated ancestry for a Han Chinese individual from Beijing. The _Y_-axis represents the posterior probability that one allele is derived from a specific ancestry; the _X_-axis indicates the physical locations of markers. Markers were sampled at an average spacing of 30 kb (top panels), approximating the density of a 100K SNP chip; 6 kb (middle panels), approximating the density of a 500K SNP chip; and 3 kb (bottom panels), using all HapMap SNPs. Left panels, MHMM correctly infers Asian ancestry (yellow) at most markers. Right panels, HMM assigns considerable probability of European ancestry (blue) or African ancestry (red) in several regions.

Figure 7.

Estimated ancestry for a simulated individual with asymmetric admixing history. The _Y_-axis represents the posterior probability that one allele is derived from a specific ancestry; the _X_-axis indicates the physical locations of markers. a, True ancestry along the paternal and the maternal chromosomes. The paternal chromosome was generated assuming τ=(25,25,25) and π=(0.4,0.4,0.2), whereas the maternal chromosome was generated assuming τ=(2,2,2) and π=(0.75,0.125,0.125). b, Posterior ancestry estimates at the MLE of τ. c, Posterior ancestry estimates under the assumption τ=(2,2,2). d, Posterior ancestry estimates under the assumption τ=(50,50,50).
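To give a feel for how τ and π shape the ancestry mosaic in a simulation like the one in Figure 7, here is a minimal sketch of a simple block generator. It is not the authors' simulation procedure: it assumes a single admixing time τ shared by all populations, ancestry switch points that arrive as a Poisson process with rate τ per Morgan, and a new ancestry drawn from π at each switch point. The function name and its parameters are hypothetical.

```python
import random


def simulate_ancestry_blocks(length_morgans, tau, pi, rng):
    """Simulate ancestry blocks on one chromosome.

    length_morgans: chromosome length in Morgans
    tau:            generations since admixture (single shared value; an assumption)
    pi:             admixture proportions, used as sampling weights
    rng:            a random.Random instance, so runs are reproducible

    Returns a list of (start, end, ancestry_index) tuples covering the chromosome.
    """
    populations = range(len(pi))
    blocks = []
    start = 0.0
    pos = 0.0
    anc = rng.choices(populations, weights=pi)[0]
    while True:
        # Distance to the next potential switch point: exponential with mean 1 / tau.
        pos += rng.expovariate(tau)
        if pos >= length_morgans:
            blocks.append((start, length_morgans, anc))
            return blocks
        new_anc = rng.choices(populations, weights=pi)[0]
        if new_anc != anc:
            # Close the current block only when the ancestry actually changes.
            blocks.append((start, pos, anc))
            start, anc = pos, new_anc
```

For example, `simulate_ancestry_blocks(1.0, 25, [0.4, 0.4, 0.2], random.Random(0))` uses the paternal parameters from the caption, and replacing 25 with 2 and the proportions with `[0.75, 0.125, 0.125]` uses the maternal ones. Under this model a larger τ tends to produce more, shorter blocks.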

Cited by

Inference of Locus-Specific Population Mixtures from Linked Genome-Wide Allele Frequencies.
Reyna-Blanco CS, Caduff M, Galimberti M, Leuenberger C, Wegmann D. Mol Biol Evol. 2024 Jul 3;41(7):msae137. doi: 10.1093/molbev/msae137. PMID: 38958167
Power comparison of admixture mapping and direct association analysis in genome-wide association studies.
Qin H, Zhu X. Genet Epidemiol. 2012 Apr;36(3):235-43. doi: 10.1002/gepi.21616. Epub 2012 Mar 28. PMID: 22460597
Dating the age of admixture via wavelet transform analysis of genome-wide data.
Pugach I, Matveyev R, Wollstein A, Kayser M, Stoneking M. Genome Biol. 2011;12(2):R19. doi: 10.1186/gb-2011-12-2-r19. Epub 2011 Feb 25. PMID: 21352535
Effect of genetic divergence in identifying ancestral origin using HAPAA.
Sundquist A, Fratkin E, Do CB, Batzoglou S. Genome Res. 2008 Apr;18(4):676-82. doi: 10.1101/gr.072850.107. Epub 2008 Mar 18. PMID: 18353807
Imputation of exome sequence variants into population-based samples and blood-cell-trait-associated loci in African Americans: NHLBI GO Exome Sequencing Project.
Auer PL, Johnsen JM, Johnson AD, Logsdon BA, Lange LA, Nalls MA, Zhang G, Franceschini N, Fox K, Lange EM, Rich SS, O'Donnell CJ, Jackson RD, Wallace RB, Chen Z, Graubert TA, Wilson JG, Tang H, Lettre G, Reiner AP, Ganesh SK, Li Y. Am J Hum Genet. 2012 Nov 2;91(5):794-808. doi: 10.1016/j.ajhg.2012.08.031. Epub 2012 Oct 25. PMID: 23103231

References

Web Resource

1. SABER, http://www.fhcrc.org/science/labs/tang/
