Enriching the analysis of genomewide association studies with hierarchical modeling

Am J Hum Genet. 2007 Aug;81(2):397-404.

doi: 10.1086/519794. Epub 2007 Jun 26.

Gary K Chen et al. Am J Hum Genet. 2007 Aug.

Abstract

Genomewide association studies (GWAs) initially investigate hundreds of thousands of single-nucleotide polymorphisms (SNPs), and the most promising SNPs are further evaluated with additional subjects, for replication or a joint analysis. Deciding which SNPs merit follow-up is one of the most crucial aspects of these studies. We present here an approach for selecting the most-promising SNPs that incorporates into a hierarchical model both conventional results and other existing information about the SNPs. The model is developed for general use, its potential value is shown by application, and tools are provided for undertaking hierarchical modeling. By quantitatively harnessing all available information in GWAs, hierarchical modeling may more clearly distinguish true causal variants from noise.

Figures

Figure 1.

The 500 smallest P values, plotted as −log10 P, from ordinary linear regression of the CHI3L2 gene-expression phenotype on the genotypes of 57 CEU individuals across chromosome 1. The causal SNP rs755467 is shown at 111.48 Mb with a −log10 P value of 7.29.
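
As a small worked conversion, the value 7.29 reported for rs755467 is on the −log10 scale, so it corresponds to a P value of roughly 5 × 10^-8. The sketch below only converts between the two scales; the function names are illustrative and are not taken from the paper's tools.

```python
import math

def neg_log10(p_value):
    """Convert a P value to the -log10 scale used on the figure's Y-axis."""
    return -math.log10(p_value)

def from_neg_log10(score):
    """Convert a -log10 score back to a P value."""
    return 10.0 ** (-score)

# 7.29 is the -log10 P value reported for rs755467 in the caption.
p_causal = from_neg_log10(7.29)
print(f"P value for rs755467: {p_causal:.2e}")
print(f"Back on the -log10 scale: {neg_log10(p_causal):.2f}")
```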

Figure 2.

A comparison of the 500 smallest P values, plotted as −log10 P, from the CHI3L2 example under hierarchical models with three values of the SD parameter ρ. Larger values of ρ weaken the shrinkage toward the second-stage mean in the region with strong prior evidence (i.e., the linked region in the center), whereas smaller values strengthen it.
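
The role of ρ in this caption can be illustrated with a minimal sketch of a standard normal-normal model. It assumes a first-stage estimate with a known standard error and a second-stage mean whose SD is ρ. The function name `shrink` and all numeric inputs are hypothetical, and this is not necessarily the estimator implemented in the authors' tools.

```python
def shrink(beta_hat, se, prior_mean, rho):
    """Posterior mean when beta_hat ~ N(beta, se^2) and beta ~ N(prior_mean, rho^2)."""
    # Weight given to the second-stage (prior) mean; it falls as rho grows.
    weight = se**2 / (se**2 + rho**2)
    return weight * prior_mean + (1.0 - weight) * beta_hat

# Hypothetical values for illustration only (not data from the paper).
beta_hat, se, prior_mean = 0.10, 0.20, 0.80
estimates = {rho: shrink(beta_hat, se, prior_mean, rho) for rho in (0.05, 0.2, 1.0)}
for rho, est in estimates.items():
    print(f"rho={rho}: estimate after shrinkage = {est:.3f}")
```

With these inputs, the smallest ρ pulls the estimate most of the way to the second-stage mean, and the largest ρ leaves it close to the first-stage estimate.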

Figure 3.

A comparison of the 500 smallest P values, plotted as −log10 P, from ordinary linear regression (in red, as shown in fig. 1) and from the hierarchical model with ρ = 0.05 (in blue, superimposed).

Figure 4.

Proportion of the top 500 SNPs located within windows centered at the causal variant for CHI3L2 gene expression, for ordinary linear regression and for the hierarchical model. The X-axis denotes the distance from the causal SNP to either edge of a window.
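
The quantity plotted in this figure can be sketched as follows: for a given half-width, count the top-ranked SNPs whose position lies within that distance of the causal SNP. This is a minimal illustration. The SNP positions below are made up, and only the causal location (111.48 Mb) comes from the figure 1 caption.

```python
def proportion_in_window(positions, causal_pos, half_width):
    """Fraction of SNP positions within half_width of causal_pos (inclusive)."""
    if not positions:
        return 0.0
    inside = sum(1 for p in positions if abs(p - causal_pos) <= half_width)
    return inside / len(positions)

CAUSAL_MB = 111.48  # position of rs755467 on chromosome 1, in Mb

# Hypothetical positions (Mb) standing in for a list of top-ranked SNPs.
top_snps = [111.40, 111.47, 111.52, 112.90, 150.00]

for half_width in (0.1, 2.0):
    prop = proportion_in_window(top_snps, CAUSAL_MB, half_width)
    print(f"within {half_width} Mb of the causal SNP: {prop:.2f}")
```

A method that concentrates its top-ranked SNPs near the causal variant gives higher proportions at small half-widths.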

Web Resources

    1. Ensembl, http://www.ensembl.org/
    2. Gene Expression Omnibus, http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi (for phenotype data about 57 CEU individuals [accession number GSE2552])
    3. International HapMap Project, http://www.hapmap.org/downloads/index.html.en (for genotype and LD data about SNPs)
    4. J.S.W. lab, http://www.epibiostat.ucsf.edu/witte_lab/
    5. NCBI FTP, http://www.ncbi.nlm.nih.gov/Ftp/
