Genome-wide association studies for complex traits: consensus, uncertainty and challenges
Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447, 661–678 (2007). In this study, high-density, genome-wide association data on 17,000 individuals identified many novel complex-trait susceptibility loci and explored key methodological and technical issues relevant to the GWA approach.
Todd, J. A. et al. Robust associations of four new chromosome regions from genome-wide analyses of type 1 diabetes. Nature Genet. 39, 857–864 (2007).
Hakonarson, H. et al. A genome-wide association study identifies KIAA0350 as a type 1 diabetes gene. Nature 448, 591–594 (2007).
Sladek, R. et al. A genome-wide association study identifies novel risk loci for type 2 diabetes. Nature 445, 881–885 (2007).
Zeggini, E. et al. Replication of genome-wide association signals in UK samples reveals risk loci for type 2 diabetes. Science 316, 1336–1341 (2007).
Scott, L. J. et al. A genome-wide association study of type 2 diabetes in Finns detects multiple susceptibility variants. Science 316, 1341–1345 (2007).
Diabetes Genetics Initiative. Genome-wide association analysis identifies loci for type 2 diabetes and triglyceride levels. Science 316, 1331–1336 (2007).
Steinthorsdottir, V. et al. A variant in CDKAL1 influences insulin response and risk of type 2 diabetes. Nature Genet. 39, 770–775 (2007).
Zeggini, E., Scott, L. J., Saxena, R., Voight, B. & DIAGRAM Consortium. Meta-analysis of genome-wide association data and large-scale replication identifies several additional susceptibility loci for type 2 diabetes. Nature Genet. 30 Mar 2008 (doi:10.1038/ng.120).
Parkes, M. et al. Sequence variants in the autophagy gene IRGM and multiple other replicating loci contribute to Crohn's disease susceptibility. Nature Genet. 39, 830–832 (2007).
Duerr, R. H. et al. A genome-wide association study identifies IL23R as an inflammatory bowel disease gene. Science 314, 1461–1463 (2006).
Rioux, J. D. et al. Genome-wide association study identifies new susceptibility loci for Crohn disease and implicates autophagy in disease pathogenesis. Nature Genet. 39, 596–604 (2007).
Libioulle, C. et al. Novel Crohn disease locus identified by genome-wide association maps to a gene desert on 5p13.1 and modulates expression of PTGER4. PLoS Genet. 3, e58 (2007).
Hampe, J. et al. A genome-wide association scan of nonsynonymous SNPs identifies a susceptibility variant for Crohn disease in ATG16L1. Nature Genet. 39, 207–211 (2007).
Gudmundsson, J. et al. Genome-wide association study identifies a second prostate cancer susceptibility variant at 8q24. Nature Genet. 39, 631–637 (2007).
Gudmundsson, J. et al. Two variants on chromosome 17 confer prostate cancer risk, and the one in TCF2 protects against type 2 diabetes. Nature Genet. 39, 977–983 (2007). This paper is one of the clearest demonstrations so far of the potential for pleiotropy: the same variants in TCF2 influence risk to both type 2 diabetes and prostate cancer.
Yeager, M. et al. Genome-wide association study of prostate cancer identifies a second risk locus at 8q24. Nature Genet. 39, 645–649 (2007).
Thomas, G. et al. Multiple loci identified in a genome-wide association study of prostate cancer. Nature Genet. 40, 310–315 (2008).
Gudmundsson, J. et al. Common sequence variants on 2p15 and Xp11.22 confer susceptibility to prostate cancer. Nature Genet. 40, 281–283 (2008).
Eeles, R. A. et al. Multiple newly identified loci associated with prostate cancer susceptibility. Nature Genet. 40, 316–321 (2008).
Hunter, D. J. et al. A genome-wide association study identifies alleles in FGFR2 associated with risk of sporadic postmenopausal breast cancer. Nature Genet. 39, 870–874 (2007).
Stacey, S. N. et al. Common variants on chromosomes 2q35 and 16q12 confer susceptibility to estrogen receptor-positive breast cancer. Nature Genet. 39, 865–869 (2007).
Moffatt, M. F. et al. Genetic variants regulating ORMDL3 expression contribute to the risk of childhood asthma. Nature 448, 470–473 (2007).
Helgadottir, A. et al. A common variant on chromosome 9p21 affects the risk of myocardial infarction. Science 316, 1491–1493 (2007).
Gudbjartsson, D. F. et al. Variants conferring risk of atrial fibrillation on chromosome 4q25. Nature 448, 353–357 (2007).
Willer, C. J. et al. Newly identified loci that influence lipid concentrations and risk of coronary artery disease. Nature Genet. 40, 161–169 (2008).
Kathiresan, S. et al. Six new loci associated with blood low-density lipoprotein cholesterol, high-density lipoprotein cholesterol or triglycerides in humans. Nature Genet. 40, 189–197 (2008).
Kooner, J. S. et al. Genome-wide scan identifies variation in MLXIPL associated with plasma triglycerides. Nature Genet. 40, 149–151 (2008).
Weedon, M. N. et al. A common variant of HMGA2 is associated with adult and childhood height in the general population. Nature Genet. 39, 1245–1250 (2007). This paper demonstrates the power of the GWA approach to identify genes influencing continuous biomedical phenotypes, in this case, height.
Sanna, S. et al. Common variants in the GDF5-UQCC region are associated with variation in human height. Nature Genet. 40, 198–203 (2008).
Weedon, M. N. et al. Genome-wide association analysis identifies 20 loci that influence adult height. Nature Genet. (in the press).
Lettre, G. et al. Genome-wide association studies identify 10 novel loci for height and highlight new biological pathways in human growth. Nature Genet. (in the press).
Frayling, T. M. et al. A common variant in the FTO gene is associated with body mass index and predisposes to childhood and adult obesity. Science 316, 889–894 (2007).
Scuteri, A. et al. Genome-wide association scan shows genetic variants in the FTO gene are associated with obesity-related traits. PLoS Genet. 3, e115 (2007).
Loos, R. J. F. et al. Association studies involving over 90,000 people demonstrate that common variants near to MC4R influence fat mass, weight and risk of obesity. Nature Genet. (in the press).
Li, M., Boehnke, M. & Abecasis, G. R. Efficient study designs for test of genetic association using sibship data and unrelated cases and controls. Am. J. Hum. Genet. 78, 778–792 (2006).
Howson, J. M., Barratt, B. J., Todd, J. A. & Cordell, H. J. Comparison of population- and family-based methods for genetic association analysis in the presence of interacting loci. Genet. Epidemiol. 29, 51–67 (2005).
Price, A. L. et al. Principal components analysis corrects for stratification in genome-wide association studies. Nature Genet. 38, 904–909 (2006).
Zheng, G., Freidlin, B. & Gastwirth, J. L. Robust genomic control for association studies. Am. J. Hum. Genet. 78, 350–356 (2006).
Paschou, P. et al. PCA-correlated SNPs for structure identification in worldwide human populations. PLoS Genet. 3, e160 (2007).
International HapMap Consortium. A haplotype map of the human genome. Nature 437, 1299–1320 (2005).
Laird, N. M. & Lange, C. Family-based designs in the age of large-scale gene-association studies. Nature Rev. Genet. 7, 385–394 (2006).
Clayton, D. G. et al. Population structure, differential bias and genomic control in a large-scale, case–control association study. Nature Genet. 37, 1243–1246 (2005). This paper presents a detailed description of the potential for bias and error to complicate the analysis of large-scale genetic association data.
Plagnol, V., Cooper, J. D., Todd, J. A. & Clayton, D. G. A method to address differential bias in genotyping in large-scale association studies. PLoS Genet. 3, e74 (2007).
Cupples, L. A. et al. The Framingham Heart Study 100k SNP genome-wide association study resource: overview of 17 phenotype working group reports. BMC Med. Genet. 8, S1 (2007).
Ridker, P. M. et al. Rationale, design, and methodology of the Women's Genome Health Study: A genome-wide association study of more than 25,000 initially healthy American women. Clin. Chem. 54, 249–255 (2008).
Cordell, H. J. & Clayton, D. G. Genetic association studies. Lancet 366, 1121–1131 (2005).
Wong, M. Y., Day, N. E., Luan, J. A., Chan, K. P. & Wareham, N. J. The detection of gene–environment interaction for continuous traits: should we deal with measurement error by bigger studies or better measurement? Int. J. Epidemiol. 32, 51–57 (2003).
Wong, M. Y., Day, N. E., Luan, J. A. & Wareham, N. J. Estimation of magnitude in gene–environment interactions in the presence of measurement error. Stat. Med. 23, 987–998 (2004).
Burke, W., Khoury, M. J., Stewart, A., Zimmern, R. L. & Bellagio Group. The path from genome-based research to population health: development of an international public health genomics network. Genet. Med. 8, 451–458 (2006).
Barrett, J. C. & Cardon, L. R. Evaluating coverage of genome-wide association studies. Nature Genet. 38, 659–662 (2006).
Pe'er, I. et al. Evaluating and improving power in whole-genome association studies using fixed marker sets. Nature Genet. 38, 663–667 (2006).
Marchini, J., Howie, B., Myers, S., McVean, G. & Donnelly, P. A new multipoint method for genome-wide association studies by imputation of genotypes. Nature Genet. 39, 906–913 (2007).
Servin, B. & Stephens, M. Imputation-based analysis of association studies: candidate regions and quantitative traits. PLoS Genet. 3, e114 (2007).
McCarroll, S. A. & Altshuler, D. M. Copy-number variation and association studies of human disease. Nature Genet. 39, S37–S42 (2007). This paper gives an excellent summary of the challenges to be addressed if large-scale genetic association studies are to be extended to CNVs.
Scherer, S. W. et al. Challenges and standards in integrating surveys of structural variation. Nature Genet. 39, S7–S15 (2007).
Weiss, L. A. et al. Association between microdeletion and microduplication at 16p11.2 and autism. N. Engl. J. Med. 358, 667–675 (2008).
Sham, P., Bader, J. S., Craig, I., O'Donovan, M. & Owen, M. DNA pooling: a tool for large-scale association studies. Nature Rev. Genet. 3, 862–871 (2002).
Cargill, M. et al. A large-scale genetic association study confirms IL12B and leads to the identification of IL23R as psoriasis-risk genes. Am. J. Hum. Genet. 80, 273–290 (2007).
Wang, W. Y., Barratt, B. J., Clayton, D. G. & Todd, J. A. Genome-wide association studies: theoretical and practical concerns. Nature Rev. Genet. 6, 109–118 (2005).
Hirschhorn, J. N. & Daly, M. J. Genome-wide association studies for common diseases and complex traits. Nature Rev. Genet. 6, 95–108 (2005).
Nicolae, D. L., Wu, X., Miyake, K. & Cox, N. J. GEL: a novel genotype calling algorithm using empirical likelihood. Bioinformatics 22, 1942–1947 (2006).
Rabbee, N. & Speed, T. P. A genotype calling algorithm for Affymetrix SNP arrays. Bioinformatics 22, 7–12 (2006).
Xiao, Y., Segal, M. R., Yang, Y. H. & Yeh, R. F. A multi-array multi-SNP genotyping algorithm for Affymetrix SNP microarrays. Bioinformatics 23, 1459–1467 (2007).
Wittke-Thompson, J. K., Pluzhnikov, A. & Cox, N. J. Rational inferences about departures from Hardy–Weinberg equilibrium. Am. J. Hum. Genet. 76, 967–986 (2005).
Cox, D. G. & Kraft, P. Quantification of the power of Hardy–Weinberg equilibrium testing to detect genotyping error. Hum. Hered. 61, 10–14 (2006).
Smyth, D. J. et al. A genome-wide association study of nonsynonymous SNPs identifies a type 1 diabetes locus in the interferon-induced helicase (IFIH1) region. Nature Genet. 38, 617–619 (2006).
Lettre, G., Lange, C. & Hirschhorn, J. N. Genetic model testing and statistical power in population-based association studies of quantitative traits. Genet. Epidemiol. 31, 358–362 (2007).
Risch, N. & Merikangas, K. The future of genetic studies of complex human diseases. Science 273, 1516–1517 (1996).
Hoggart, C. J. et al. Genome-wide significance for dense SNP and resequencing data. Genet. Epidemiol. 32, 179–185 (2008).
Wacholder, S., Chanock, S., Garcia-Closas, M., El Ghormli, L. & Rothman, N. Assessing the probability that a positive report is false: an approach for molecular epidemiology studies. J. Natl Cancer Inst. 96, 434–442 (2004). This is an influential paper setting out the rationale for a Bayesian interpretation of genetic association findings, focusing on methods for establishing the confidence with which any given positive association can be regarded.
Wakefield, J. A Bayesian measure of the probability of false discovery in genetic epidemiology studies. Am. J. Hum. Genet. 81, 208–227 (2007).
De Bakker, P. I. et al. Efficiency and power in genetic association studies. Nature Genet. 37, 1217–1223 (2005).
Morris, A. P. A flexible Bayesian framework for modeling haplotype association with disease, allowing for dominance effects of the underlying causative variants. Am. J. Hum. Genet. 79, 679–694 (2006).
De Bakker, P. I. et al. Transferability of tag SNPs in genetic association studies in multiple populations. Nature Genet. 38, 1298–1303 (2006).
Service, S. et al. Magnitude and distribution of linkage disequilibrium in population isolates and implications for genome-wide association studies. Nature Genet. 38, 556–560 (2006).
Zeggini, E. et al. An evaluation of HapMap sample size and tagging SNP performance in large-scale empirical and simulated data sets. Nature Genet. 37, 1320–1322 (2005).
Easton, D. F. et al. A systematic genetic assessment of 1,433 sequence variants of unknown clinical significance in the BRCA1 and BRCA2 breast cancer-predisposition genes. Am. J. Hum. Genet. 81, 873–883 (2007).
Marchini, J., Donnelly, P. & Cardon, L. R. Genome-wide strategies for detecting multiple loci that influence complex diseases. Nature Genet. 37, 413–417 (2005).
Hirschhorn, J. N., Lohmueller, K., Byrne, E. & Hirschhorn, K. A comprehensive review of genetic association studies. Genet. Med. 4, 45–61 (2002).
NCI-NHGRI Working Group on Replication in Association Studies. Replicating genotype–phenotype associations: what constitutes replication of a genotype–phenotype association, and how best can it be achieved? Nature 447, 655–660 (2007). This feature article is a thoughtful summary of the main issues relating to replication of genetic association studies.
Lohmueller, K. E., Pearce, C. L., Pike, M., Lander, E. S. & Hirschhorn, J. N. Meta-analysis of genetic association studies supports a contribution of common variants to susceptibility to common disease. Nature Genet. 33, 177–182 (2003).
Clarke, G. M., Carter, K. W., Palmer, L. J., Morris, A. P. & Cardon, L. R. Fine mapping versus replication in whole-genome association studies. Am. J. Hum. Genet. 81, 995–1007 (2007).
Skol, A. D., Scott, L. J., Abecasis, G. R. & Boehnke, M. Optimal designs for two-stage genome-wide association studies. Genet. Epidemiol. 31, 766–788 (2007).
Wang, H., Thomas, D. C., Pe'er, I. & Stram, D. O. Optimal two-stage genotyping designs for genome-wide association scans. Genet. Epidemiol. 30, 356–368 (2006).
Müller, H. H., Pahl, R. & Schäfer, H. Including sampling and phenotyping costs into the optimization of two stage designs for genome wide association studies. Genet. Epidemiol. 31, 844–852 (2007).
Zollner, S. & Pritchard, J. K. Overcoming the winner's curse: estimating penetrance parameters from case–control data. Am. J. Hum. Genet. 80, 605–615 (2007).
Gorrochurn, P., Hodge, S. E., Heiman, G. A., Durner, M. & Greenberg, D. A. Non-replication of association studies: 'pseudo-failures' to replicate? Genet. Med. 9, 325–331 (2007).
Ioannidis, J. P., Patsopoulos, N. A. & Evangelou, E. Heterogeneity in meta-analyses of genome-wide association investigations. PLoS ONE 2, e841 (2007).
Ioannidis, J. P. Non-replication and inconsistency in the genome-wide association setting. Hum. Hered. 64, 203–213 (2007).
Moonesinghe, R., Khoury, M. J., Liu, T. & Ioannidis, J. P. Required sample size and nonreplicability thresholds for heterogeneous genetic associations. Proc. Natl Acad. Sci. USA 105, 617–622 (2008).
The GAIN Collaborative Research Group. New models of collaboration in genome-wide association studies: the Genetic Association Information Network. Nature Genet. 39, 1045–1051 (2007).
Helgason, A. et al. Refining the impact of TCF7L2 gene variants on type 2 diabetes and adaptive evolution. Nature Genet. 39, 218–225 (2007).
Locke, D. P. et al. Linkage disequilibrium and heritability of copy-number polymorphisms within duplicated regions of the human genome. Am. J. Hum. Genet. 79, 275–290 (2006).
ENCODE Project Consortium. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature 447, 799–816 (2007). This is a detailed examination of the functional annotation of a subset of the human genome, which reveals the complexity of genomic organization.
Stranger, B. et al. Population genomics of human gene expression. Nature Genet. 39, 1217–1224 (2007).
Dixon, A. L. et al. A genome-wide association study of global gene expression. Nature Genet. 39, 1202–1207 (2007).
Goring, H. H. et al. Discovery of expression QTLs using large-scale transcriptional profiling in human lymphocytes. Nature Genet. 39, 1208–1216 (2007).
Ioannidis, J. P. & Kavvoura, F. K. Concordance of functional in vitro data and epidemiological associations in complex disease genetics. Genet. Med. 8, 583–593 (2006).
Lowe, C. E. et al. Large-scale genetic fine mapping and genotype–phenotype associations implicate polymorphism in the IL2RA region in type 1 diabetes. Nature Genet. 39, 1074–1082 (2007).
Ioannidis, J. P. et al. Assessment of cumulative evidence on genetic associations: interim guidelines. Int. J. Epidemiol. 37, 120–132 (2008).
Davey Smith, G. & Ebrahim, S. 'Mendelian randomization': can genetic epidemiology contribute to understanding environmental determinants of disease? Int. J. Epidemiol. 32, 1–22 (2003).
Zheng, S. L. et al. Cumulative association of five genetic variants with prostate cancer. N. Engl. J. Med. 358, 910–919 (2008).
Stratton, M. R. & Rahman, N. The emerging landscape of breast cancer susceptibility. Nature Genet. 40, 17–22 (2008).
Mailman, M. D. et al. The NCBI dbGaP database of genotypes and phenotypes. Nature Genet. 39, 1181–1186 (2007).
Zheng, S. L. et al. Association between two unlinked loci at 8q24 and prostate cancer risk among European Americans. J. Natl Cancer Inst. 99, 1499–1501 (2007).
Brazma, A. et al. Minimum information about a microarray experiment (MIAME) — toward standards for microarray data. Nature Genet. 29, 356–371 (2001).
Altman, D. & Moher, D. Developing guidelines for reporting healthcare research: scientific rationale and procedures. Med. Clin. (Barc). 125, 8–13 (2005).
Gluud, L. L. Bias in clinical intervention research. Am. J. Epidemiol. 163, 493–501 (2006).
Altman, D. G. et al. The revised CONSORT statement for reporting randomized trials: explanation and elaboration. Ann. Intern. Med. 134, 663–694 (2001).
Von Elm, E. et al. The strengthening the reporting of observational studies in epidemiology (STROBE) statement: guidelines for reporting observational studies. Lancet 370, 1453–1457 (2007).
Seminara, D. et al. The emergence of networks in human genome epidemiology: challenges and opportunities. Epidemiology 18, 1–8 (2007).
Ge, D. et al. WGAViewer: a software for genomic annotation of whole genome association studies. Genome Res. 3 Mar 2008 (doi:10.1101/gr.071571.107).
Janssens, A. C. J. W., Gwinn, M., Subramonia-Iyer, S. & Khoury, M. J. Does genetic testing really improve the prediction of future type 2 diabetes? PLoS Med. 3, e114 (2006).