The future of association studies: gene-based analysis and replication

Benjamin M Neale et al. Am J Hum Genet. 2004 Sep.

Abstract

Historically, association tests were limited to single variants, so that the allele was considered the basic unit for association testing. As marker density increases and indirect approaches are used to assess association through linkage disequilibrium, association is now frequently considered at the haplotypic level. We suggest that there are difficulties in replicating association findings at the single-nucleotide-polymorphism (SNP) or the haplotype level, and we propose a shift toward a gene-based approach in which all common variation within a candidate gene is considered jointly. Inconsistencies arising from population differences are more readily resolved by use of a gene-based approach rather than either a SNP-based or a haplotype-based approach. A gene-based approach captures all of the potential risk-conferring variations; thus, negative findings are subject only to the issue of power. In addition, chance findings due to multiple testing can be readily accounted for by use of a genewide-significance level. Meta-analysis procedures can be formalized for gene-based methods through the combination of P values. It is only a matter of time before all variation within genes is mapped, at which point the gene-based approach will become the natural end point for association analysis and will inform our search for functional variants relevant to disease etiology.
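As one illustration of how gene-level P values from independent studies might be combined, the sketch below uses Fisher's combined probability test. The abstract does not say which combination procedure the authors have in mind, so the choice of Fisher's method, the function name `fisher_combined_p`, and the example P values are all assumptions made for illustration.

```python
import math

def fisher_combined_p(p_values):
    """Combine independent P values with Fisher's method (illustrative choice).

    The statistic -2 * sum(ln p_i) follows a chi-square distribution with
    2k degrees of freedom under the null, where k is the number of P values.
    For even degrees of freedom the upper tail has a closed form:
        P(X >= x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    if not p_values:
        raise ValueError("need at least one P value")
    if any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("P values must lie in (0, 1]")

    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is x / 2

    # Accumulate the series sum_{i=0}^{k-1} (x/2)^i / i! term by term.
    term = 1.0
    series = 1.0
    for i in range(1, k):
        term *= half_stat / i
        series += term
    return math.exp(-half_stat) * series

# Hypothetical gene-level P values from three independent samples.
combined = fisher_combined_p([0.04, 0.20, 0.11])
print(f"combined P = {combined:.4f}")
```

Fisher's method assumes the studies are independent and tests the same gene-level null hypothesis; other combination rules, such as weighted Z scores, would be equally reasonable under different assumptions.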

Figures

Figure 1

Adapted from figure 1 of Williams et al. (2004). Blackened boxes 1–13 represent the coding regions of dysbindin; adjacent unblackened boxes represent alternative splicing sites. P1–P4 represent the four hypothesized promoter regions of dysbindin. The numbered loci constitute the initial 8-marker haplotype specified by Straub et al. (2002). The lettered loci are the SNPs Williams et al. (2004) discovered and analyzed. The Roman numerals specify the additional markers typed by Schwab et al. (2003). Van Den Bogaert et al. (2003) typed 2, 4, 5, 7, and 8, whereas Schwab et al. (2003) typed 2, 3, 4, 6, 7, and 8.

Figure 2

Minimum sample allele frequency for achieving different levels of significance. We assume equal numbers of cases and controls in an association sample and plot the behavior of the minimum-allele frequency capable of demonstrating significant association at the nominal (.05), genewide (.00167), and genomewide level ($5.56 \times 10^{-8}$) against total sample size. The genewide significance assumes 30 detectable haplotypes across the gene, in accordance with the Crawford et al. (2004) estimate, and the genomewide level assumes 30,000 genes. The minimum-allele frequency is derived from the instance in which all copies of the allele ($c$) are found in either the cases (disease predisposing) or controls (disease protective). The significance is defined as $0.5^{c-1}$, since, under the null hypothesis, the first copy of the allele must be in either the cases or controls, and each subsequent allele is regarded as independent and has .5 probability of being in the same group as the first allele. Note the convergence of the nominal and genewide frequencies as sample size increases.
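The calculation described in this caption can be sketched in a few lines of Python. The sketch finds the smallest number of allele copies c for which 0.5 raised to the power c - 1 falls at or below a given significance level, then converts that count to a sample allele frequency. The function names are hypothetical, and dividing by twice the total sample size (two chromosomes per diploid individual) is an assumption about how the plotted frequency is scaled, not something the caption states.

```python
NOMINAL = 0.05
GENEWIDE = 0.05 / 30               # 30 detectable haplotypes per gene
GENOMEWIDE = 0.05 / (30 * 30000)   # 30 haplotypes x 30,000 genes

def min_allele_count(alpha):
    """Smallest c such that 0.5 ** (c - 1) <= alpha.

    This is the most extreme case from the caption: every copy of the
    allele falls in the same group (all cases or all controls).
    """
    c = 1
    while 0.5 ** (c - 1) > alpha:
        c += 1
    return c

def min_allele_frequency(alpha, total_sample_size):
    """Minimum sample allele frequency able to reach significance alpha.

    Assumes diploid individuals, so the sample carries
    2 * total_sample_size chromosomes (an assumption of this sketch).
    """
    return min_allele_count(alpha) / (2 * total_sample_size)

for label, alpha in [("nominal", NOMINAL),
                     ("genewide", GENEWIDE),
                     ("genomewide", GENOMEWIDE)]:
    c = min_allele_count(alpha)
    freq = min_allele_frequency(alpha, 1000)
    print(f"{label}: c = {c}, frequency at N = 1000: {freq:.4f}")
```

Because the required count c depends only on the significance level, the gap between thresholds is a fixed number of allele copies, and the corresponding frequencies shrink toward each other as the sample grows, which is consistent with the convergence noted in the caption.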

References

Electronic-Database Information

    1. HapMap, http://www.hapmap.org/

References

    1. Allison DB, Heo M (1998) Meta-analysis of linkage data under worst-case conditions: a demonstration using the human OB region. Genetics 148:859–865
    2. Altshuler D, Kruglyak L, Lander E (1998) Genetic polymorphisms and disease. N Engl J Med 338:1626. doi:10.1056/NEJM199805283382214
    3. Ardlie KG, Kruglyak L, Seielstad M (2002) Patterns of linkage disequilibrium in the human genome. Nat Rev Genet 3:299–309. doi:10.1038/nrg777
    4. Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J Roy Stat Soc Ser B 57:289–300
    5. Byng MC, Whittaker JC, Cuthbert AP, Mathew CG, Lewis CM (2003) SNP subset selection for genetic association studies. Ann Hum Genet 67:543–556. doi:10.1046/j.1529-8817.2003.00055.x
