Estimating kinship in admixed populations

Timothy Thornton et al. Am J Hum Genet. 2012.

Abstract

Genome-wide association studies (GWASs) are commonly used for the mapping of genetic loci that influence complex traits. A problem that is often encountered in both population-based and family-based GWASs is that of identifying cryptic relatedness and population stratification because it is well known that failure to appropriately account for both pedigree and population structure can lead to spurious association. A number of methods have been proposed for identifying relatives in samples from homogeneous populations. A strong assumption of population homogeneity, however, is often untenable, and many GWASs include samples from structured populations. Here, we consider the problem of estimating relatedness in structured populations with admixed ancestry. We propose a method, REAP (relatedness estimation in admixed populations), for robust estimation of identity by descent (IBD)-sharing probabilities and kinship coefficients in admixed populations. REAP appropriately accounts for population structure and ancestry-related assortative mating by using individual-specific allele frequencies at SNPs that are calculated on the basis of ancestry derived from whole-genome analysis. In simulation studies with related individuals and admixture from highly divergent populations, we demonstrate that REAP gives accurate IBD-sharing probabilities and kinship coefficients. We apply REAP to the Mexican Americans in Los Angeles, California (MXL) population sample of release 3 of phase III of the International Haplotype Map Project; in this sample, we identify third- and fourth-degree relatives who have not previously been reported. We also apply REAP to the African American and Hispanic samples from the Women's Health Initiative SNP Health Association Resource (WHI-SHARe) study, in which hundreds of pairs of cryptically related individuals have been identified.
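As a rough illustration of the idea described in the abstract, the sketch below computes individual-specific allele frequencies as an ancestry-weighted average of ancestral-population allele frequencies and plugs them into a moment-style kinship estimate. Equations 3 and 4 are not reproduced in this record, so the formula, the function names (individual_allele_freqs, kinship_estimate), and the input layout are assumptions made for illustration. This is not the REAP software's implementation.

```python
from math import sqrt


def individual_allele_freqs(ancestry, pop_freqs):
    """Ancestry-weighted allele frequency at each SNP for one individual.

    ancestry:  K ancestry proportions for the individual (assumed to sum to 1).
    pop_freqs: K lists; pop_freqs[k][s] is the allele frequency of SNP s in
               ancestral population k (hypothetical input layout).
    """
    n_snps = len(pop_freqs[0])
    return [
        sum(a * freqs[s] for a, freqs in zip(ancestry, pop_freqs))
        for s in range(n_snps)
    ]


def kinship_estimate(geno_i, geno_j, mu_i, mu_j):
    """Assumed moment-style kinship estimate for one pair of individuals.

    geno_i, geno_j: allele counts (0, 1, or 2) at each SNP.
    mu_i, mu_j:     individual-specific allele frequencies at each SNP.
    Averages the standardized genotype cross-products over usable SNPs.
    """
    total = 0.0
    used = 0
    for gi, gj, mi, mj in zip(geno_i, geno_j, mu_i, mu_j):
        denom = sqrt(mi * (1.0 - mi) * mj * (1.0 - mj))
        if denom == 0.0:
            continue  # skip SNPs with no expected variation for this pair
        total += (gi - 2.0 * mi) * (gj - 2.0 * mj) / (4.0 * denom)
        used += 1
    return total / used if used else float("nan")


# Toy usage: two ancestral populations, three SNPs (made-up numbers).
pop_freqs = [[0.1, 0.5, 0.8], [0.7, 0.5, 0.2]]
mu_a = individual_allele_freqs([0.75, 0.25], pop_freqs)
mu_b = individual_allele_freqs([0.25, 0.75], pop_freqs)
print(kinship_estimate([0, 1, 2], [1, 1, 1], mu_a, mu_b))
```

With only three SNPs the printed value is essentially noise; the simulations described in the figure captions use 10,000 SNPs per estimate.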

Copyright © 2012 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

Figures

Figure 1

Kinship Coefficients Plotted against Zero-IBD-Sharing Probabilities

Estimated kinship coefficients plotted against zero-IBD-sharing-probability estimates for three population-structure settings. (A, C, and E) Scatter plots comparing the REAP kinship-coefficient estimator from Equation 3 with the REAP zero-IBD-sharing-probability estimator from Equation 4 for population-structure settings 1 (A), 2 (C), and 3 (E). (B, D, and F) Scatter plots comparing the homogeneous-population kinship-coefficient estimator from Equation 1 with the homogeneous-population zero-IBD-sharing-probability estimator from Equation 2 for population-structure settings 1 (B), 2 (D), and 3 (F). Zero-IBD-sharing-probability and kinship-coefficient estimates were calculated with 10,000 simulated random SNPs.

Figure 2

KING-Robust and REAP Kinship-Coefficient Histograms for Unrelated Pairs with Admixture

(A and B) Histograms of kinship coefficients estimated with the KING-robust kinship-coefficient estimator (A) and the REAP kinship-coefficient estimator from Equation 3 (B) for all pairs of unrelated individuals in population-structure setting 2. The vertical line at 0 in each histogram represents the true kinship coefficient for all pairs. Kinship-coefficient estimates were calculated with 10,000 simulated random SNPs.

Figure 3

Individual-Ancestry Estimates for HapMap MXL

Individual-ancestry estimates for 86 HapMap MXL sample individuals from a supervised structure analysis with the frappe software program. In the figure, each individual is represented by a vertical bar; European (HapMap CEU) and African (HapMap YRI) ancestry contributions are in blue and red, respectively, and Native American (HGDP samples from the Americas) ancestry contributions are in green.

Figure 4

REAP Kinship Coefficients versus Zero-IBD-Sharing Probabilities for HapMap MXL

REAP kinship-coefficient estimates are plotted against REAP zero-IBD-sharing-probability estimates for the HapMap MXL sample. REAP estimates were calculated with the kinship-coefficient and zero-IBD-sharing-probability estimators from Equations 3 and 4, respectively. Relative pairs were classified on the basis of kinship-coefficient and zero-IBD-sharing-probability estimates.
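The caption does not state the rule used to classify relative pairs. One simple way to turn a pair of estimates into a relationship label, shown here purely as a hypothetical sketch, is to assign each pair to the relationship whose theoretical (kinship coefficient, zero-IBD-sharing probability) point is nearest. The expected values below are the standard ones for outbred pedigrees; the nearest-point rule, the EXPECTED table name, and the classify_pair function are assumptions, not the authors' procedure.

```python
from math import hypot

# Theoretical (kinship coefficient, zero-IBD-sharing probability) for
# common relationships in an outbred pedigree.
EXPECTED = {
    "duplicate or monozygotic twin": (0.5, 0.0),
    "parent-offspring": (0.25, 0.0),
    "full siblings": (0.25, 0.25),
    "second-degree": (0.125, 0.5),
    "third-degree": (0.0625, 0.75),
    "fourth-degree": (0.03125, 0.875),
    "unrelated": (0.0, 1.0),
}


def classify_pair(kinship, k0):
    """Label a pair by the nearest expected (kinship, k0) point.

    Uses plain Euclidean distance; this is an illustrative rule only.
    """
    return min(
        EXPECTED,
        key=lambda name: hypot(kinship - EXPECTED[name][0], k0 - EXPECTED[name][1]),
    )


print(classify_pair(0.24, 0.27))  # full siblings
```

Note that the zero-IBD-sharing probability is what separates parent-offspring pairs from full siblings here, since both have an expected kinship coefficient of 0.25. Distant relationships lie close together in this plane, so a rule like this becomes unreliable when estimates are noisy.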

Figure 5

Example of an Extended Pedigree Reconstructed with REAP in HapMap MXL

REAP-inferred pedigree relationships for four HapMap-reported pedigrees from the MXL sample are given. HapMap-reported pedigree relationships are circled, and HapMap-reported pedigree identification numbers (M008, 2382, M011, and M012) are given in bold font in each of the circles.

Figure 6

Example of Two HapMap MXL Pedigrees Connected with REAP

Pedigree relationships for two HapMap-reported pedigrees from the MXL sample are given. HapMap-reported pedigree relationships are circled, and HapMap-reported pedigree identification numbers (M007 and M032) are given in bold font in each of the circles.

Figure 7

REAP Kinship Coefficients versus Zero-IBD-Sharing Probabilities for WHI-SHARe

(A and B) REAP kinship-coefficient estimates are plotted against REAP zero-IBD-sharing-probability estimates for the WHI-SHARe self-reported African Americans (A) and self-reported Hispanics (B). REAP estimates were calculated with the kinship-coefficient and zero-IBD-sharing-probability estimators from Equations 3 and 4, respectively.
