Family-based association tests for genomewide association scans

Wei-Min Chen et al. Am J Hum Genet. 2007 Nov.

Abstract

With millions of single-nucleotide polymorphisms (SNPs) identified and characterized, genomewide association studies have begun to identify susceptibility genes for complex traits and diseases. These studies involve the characterization and analysis of very-high-resolution SNP genotype data for hundreds or thousands of individuals. We describe a computationally efficient approach to testing association between SNPs and quantitative phenotypes, which can be applied to whole-genome association scans. In addition to observed genotypes, our approach allows estimation of missing genotypes, resulting in substantial increases in power when genotyping resources are limited. We estimate missing genotypes probabilistically using the Lander-Green or Elston-Stewart algorithms and combine high-resolution SNP genotypes for a subset of individuals in each pedigree with sparser marker data for the remaining individuals. We show that power is increased whenever phenotype information for ungenotyped individuals is included in analyses and that high-density genotyping of just three carefully selected individuals in a nuclear family can recover >90% of the information available if every individual were genotyped, for a fraction of the cost and experimental effort. To aid in study design, we evaluate the power of strategies that genotype different subsets of individuals in each pedigree and make recommendations about which individuals should be genotyped at a high density. To illustrate our method, we performed genomewide association analysis for 27 gene-expression phenotypes in 3-generation families (Centre d'Etude du Polymorphisme Humain pedigrees), in which genotypes for ~860,000 SNPs in 90 grandparents and parents are complemented by genotypes for ~6,700 SNPs in a total of 168 individuals. 
In addition to increasing the evidence of association at 15 previously identified cis-acting associated alleles, our genotype-inference algorithm allowed us to identify associated alleles at 4 cis-acting loci that were missed when analysis was restricted to individuals with the high-density SNP data. Our genotype-inference algorithm and the proposed association tests are implemented in software that is available for free.

Figures

Figure 1.

Exemplar scoring of expected genotype scores. In each panel, the first sibling (individual II-1) is marked with an arrow. In panel A, only the first sibling is genotyped, and no flanking-marker data are available. In panel B, hypothetical flanking-marker data are available and can be used to characterize IBD sharing between the genotyped individual and her siblings. In panel C, two individuals are genotyped, providing further information.
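
The idea behind the panels can be illustrated with a minimal single-marker sketch. It is not the Lander-Green or Elston-Stewart computation used by the authors; it only assumes one genotyped sibling, a known allele frequency, and given probabilities that the two siblings share 0, 1, or 2 alleles identical by descent (IBD). The function name and parameters are hypothetical. With no flanking-marker data (panel A) the prior sibling probabilities 1/4, 1/2, 1/4 apply; flanking markers (panel B) would replace them with sharper IBD estimates.

```python
def expected_genotype_score(observed_count, allele_freq,
                            ibd_probs=(0.25, 0.5, 0.25)):
    """Expected allele count (0 to 2) for an ungenotyped sibling.

    observed_count: copies of the allele (0, 1, or 2) in the genotyped sibling.
    allele_freq:    population frequency p of that allele (assumed known).
    ibd_probs:      P(IBD = 0), P(IBD = 1), P(IBD = 2) for the sibling pair;
                    the default is the prior for full siblings.
    """
    z0, z1, z2 = ibd_probs
    if abs(z0 + z1 + z2 - 1.0) > 1e-9:
        raise ValueError("IBD probabilities must sum to 1")
    if observed_count not in (0, 1, 2):
        raise ValueError("observed_count must be 0, 1, or 2")

    # IBD = 0: both alleles are independent draws from the population.
    score_ibd0 = 2.0 * allele_freq
    # IBD = 1: one allele is copied from the genotyped sibling (either of
    # that sibling's two alleles with equal chance), the other is a
    # population draw.
    score_ibd1 = observed_count / 2.0 + allele_freq
    # IBD = 2: the genotypes are identical.
    score_ibd2 = float(observed_count)

    return z0 * score_ibd0 + z1 * score_ibd1 + z2 * score_ibd2


# Panel A analogue: prior sibling IBD probabilities, no flanking markers.
prior_score = expected_genotype_score(2, 0.3)
# Panel B analogue: flanking markers indicate the pair shares both alleles.
informed_score = expected_genotype_score(2, 0.3, ibd_probs=(0.0, 0.0, 1.0))
```

With the prior probabilities the score reduces to half the observed count plus the allele frequency, so informative IBD data move the expected score away from the population mean and toward the genotyped sibling's genotype.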

Figure 2.

Genome scan for CTBP1 expression levels. The gene maps to the beginning of chromosome 4. A, Genome scan using 60 unrelated individuals only. B, Genome scan using all 90 individuals genotyped by the HapMap Consortium. C, Genome scan that augmented the observed genotypes with expected genotype scores for other individuals, resulting in a total sample size of 156 individuals. All analyses were performed using the computationally efficient SCORE statistic. D, Q-Q plot. E, log Q–log Q plot. The plots show that the statistic is behaving adequately.

Web Resources

    1. Ghost, http://www.sph.umich.edu/csg/chen/ghost/ (for the Elston-Stewart–based implementation of our method)
    2. Merlin, http://www.sph.umich.edu/csg/abecasis/Merlin/ (for the Lander-Green–based implementation of our method)
