Opportunities and challenges for transcriptome-wide association studies

Nature Genetics volume 51, pages 592–599 (2019)

Abstract

Transcriptome-wide association studies (TWAS) integrate genome-wide association studies (GWAS) and gene expression datasets to identify gene–trait associations. In this Perspective, we explore properties of TWAS as a potential approach to prioritize causal genes at GWAS loci, by using simulations and case studies of literature-curated candidate causal genes for schizophrenia, low-density-lipoprotein cholesterol and Crohn’s disease. We explore risk loci where TWAS accurately prioritizes the likely causal gene as well as loci where TWAS prioritizes multiple genes, some likely to be non-causal, owing to sharing of expression quantitative trait loci (eQTL). TWAS is especially prone to spurious prioritization with expression data from non-trait-related tissues or cell types, owing to substantial cross-cell-type variation in expression levels and eQTL strengths. Nonetheless, TWAS prioritizes candidate causal genes more accurately than simple baselines. We suggest best practices for causal-gene prioritization with TWAS and discuss future opportunities for improvement. Our results showcase the strengths and limitations of using eQTL datasets to determine causal genes at GWAS loci.
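
To make the eQTL-sharing problem concrete, the following is a minimal sketch of a summary-statistic TWAS test in the style of commonly used methods. It is an illustration under stated assumptions, not the analysis pipeline of this Perspective: the function name `twas_z`, the weights, the GWAS z-scores and the linkage disequilibrium (LD) matrix are all hypothetical toy values. The gene-level statistic is taken to be z = (w · z_GWAS) / sqrt(wᵀ R w), where w holds a gene's expression-prediction weights and R is the SNP correlation matrix.

```python
import math


def twas_z(weights, gwas_z, ld):
    """Gene-level z-score from summary statistics: (w . z) / sqrt(w' R w).

    weights: per-SNP expression-prediction weights for one gene
    gwas_z:  per-SNP GWAS z-scores for the trait
    ld:      SNP-by-SNP correlation (LD) matrix as a list of lists
    """
    n = len(weights)
    numerator = sum(weights[i] * gwas_z[i] for i in range(n))
    variance = sum(
        weights[i] * ld[i][j] * weights[j] for i in range(n) for j in range(n)
    )
    if variance <= 0:
        raise ValueError("predicted expression has no variance; check the weights")
    return numerator / math.sqrt(variance)


# Toy locus with three SNPs (all numbers are made up for illustration).
# SNP 0 and SNP 1 are in strong LD; SNP 2 is independent of both.
ld = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
gwas_z = [5.0, 4.0, 0.5]

# Gene A and gene B share the eQTL at SNP 0; gene C is driven by SNP 2 only.
gene_weights = {
    "gene_A": [0.6, 0.2, 0.0],
    "gene_B": [0.5, 0.0, 0.0],
    "gene_C": [0.0, 0.0, 0.7],
}

scores = {gene: twas_z(w, gwas_z, ld) for gene, w in gene_weights.items()}

for gene, z in scores.items():
    print(f"{gene}: TWAS z = {z:.2f}")
```

In this toy locus, gene A and gene B receive nearly identical, strongly significant scores because both are predicted from the same GWAS-associated SNP, even if at most one of them is causal. The statistic alone cannot separate them, which is the kind of multi-gene prioritization the abstract describes.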

Acknowledgements

We gratefully acknowledge J. Pritchard, H. Tang and members of the laboratory of N. Zaitlen for helpful discussions. This work was funded in part by the Natural Sciences and Engineering Research Council of Canada (NSERC) (grant PGSD3-476082-2015 to M.W.); a Stanford Bio-X Bowes fellowship (to M.W.); a Stanford Graduate Fellowship (to N.S.-A.); a National Defense Science & Engineering Grant (to N.S.-A.); NIH grants 1DP2OD022870 and U01HG009431 (to A.K.), 1U24HG008956 and 5U01HG009080 (to M.A.R.), R01HG009120 and R01MH115676 (to B.P.), R01MH107666, R01MH101820 and P30DK20595 (to H.K.I.), and R01HL125863 and R21TR001739 (to J.L.M.B.); NHGRI grant R01HG010140 (to M.A.R.); Leducq Foundation grant 12CVD02 (to J.L.M.B.); and American Heart Association grant A14SFRN20840000 (to J.L.M.B.). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Author information

Authors and Affiliations

  1. Department of Computer Science, Stanford University, Stanford, CA, USA
    Michael Wainberg & Anshul Kundaje
  2. Department of Genetics, Stanford University, Stanford, CA, USA
    Nasa Sinnott-Armstrong, David Golan & Anshul Kundaje
  3. Department of Pathology & Laboratory Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA
    Nicholas Mancuso & Bogdan Pasaniuc
  4. Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
    Alvaro N. Barbeira & Hae Kyung Im
  5. New York Genome Center, New York, NY, USA
    David A. Knowles
  6. Department of Computer Science, Columbia University, New York, NY, USA
    David A. Knowles
  7. Department of Cardiac Surgery, Tartu University Hospital, Tartu, Estonia
    Raili Ermel & Arno Ruusalepp
  8. Clinical Gene Networks AB, Stockholm, Sweden
    Arno Ruusalepp & Johan L. M. Björkegren
  9. Division of Cardiovascular Medicine, Stanford University, Stanford, CA, USA
    Thomas Quertermous
  10. Department of Genetics & Genomic Sciences, Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
    Ke Hao & Johan L. M. Björkegren
  11. Department of Pathophysiology, Institute of Biomedicine and Translational Medicine, University of Tartu, Tartu, Estonia
    Johan L. M. Björkegren
  12. Integrated Cardio Metabolic Centre, Department of Medicine, Karolinska Institutet, Karolinska Universitetssjukhuset, Huddinge, Sweden
    Johan L. M. Björkegren
  13. Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA
    Bogdan Pasaniuc
  14. Department of Computational Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA
    Bogdan Pasaniuc
  15. Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
    Manuel A. Rivas

Authors

  1. Michael Wainberg
  2. Nasa Sinnott-Armstrong
  3. Nicholas Mancuso
  4. Alvaro N. Barbeira
  5. David A. Knowles
  6. David Golan
  7. Raili Ermel
  8. Arno Ruusalepp
  9. Thomas Quertermous
  10. Ke Hao
  11. Johan L. M. Björkegren
  12. Hae Kyung Im
  13. Bogdan Pasaniuc
  14. Manuel A. Rivas
  15. Anshul Kundaje

Contributions

M.W., M.A.R. and A.K. conceived the study. M.W., N.M. and A.N.B. performed analyses. N.S.-A., D.A.K. and D.G. provided intellectual input. R.E., A.R., T.Q., K.H. and J.L.M.B. provided assistance with analysis of the STARNET dataset. H.K.I., B.P., M.A.R. and A.K. supervised the study. M.W., H.K.I., B.P., M.A.R. and A.K. wrote the manuscript. All authors reviewed the manuscript.

Corresponding authors

Correspondence to Johan L. M. Björkegren, Hae Kyung Im, Bogdan Pasaniuc, Manuel A. Rivas or Anshul Kundaje.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wainberg, M., Sinnott-Armstrong, N., Mancuso, N. et al. Opportunities and challenges for transcriptome-wide association studies. Nat Genet 51, 592–599 (2019). https://doi.org/10.1038/s41588-019-0385-z
