Opportunities and challenges for transcriptome-wide association studies
- Perspective
- Published: 29 March 2019
- Michael Wainberg (affiliation 1),
- Nasa Sinnott-Armstrong (affiliation 2; ORCID: orcid.org/0000-0003-4490-0601),
- Nicholas Mancuso (affiliation 3; ORCID: orcid.org/0000-0002-9352-5927),
- Alvaro N. Barbeira (affiliation 4; ORCID: orcid.org/0000-0002-9153-6120),
- David A. Knowles (affiliations 5, 6; ORCID: orcid.org/0000-0002-7408-146X),
- David Golan (affiliation 2),
- Raili Ermel (affiliation 7),
- Arno Ruusalepp (affiliations 7, 8),
- Thomas Quertermous (affiliation 9; ORCID: orcid.org/0000-0002-7645-9067),
- Ke Hao (affiliation 10; ORCID: orcid.org/0000-0002-1815-9197),
- Johan L. M. Björkegren (affiliations 8, 10, 11, 12; ORCID: orcid.org/0000-0003-1945-7425),
- Hae Kyung Im (affiliation 4; ORCID: orcid.org/0000-0003-0333-5685),
- Bogdan Pasaniuc (affiliations 3, 13, 14; ORCID: orcid.org/0000-0002-0227-2056),
- Manuel A. Rivas (affiliation 15; ORCID: orcid.org/0000-0003-1457-9925) &
- …
- Anshul Kundaje (affiliations 1, 2; ORCID: orcid.org/0000-0003-3084-2287)
Nature Genetics volume 51, pages 592–599 (2019)
Abstract
Transcriptome-wide association studies (TWAS) integrate genome-wide association studies (GWAS) and gene expression datasets to identify gene–trait associations. In this Perspective, we explore properties of TWAS as a potential approach to prioritize causal genes at GWAS loci, by using simulations and case studies of literature-curated candidate causal genes for schizophrenia, low-density-lipoprotein cholesterol and Crohn’s disease. We explore risk loci where TWAS accurately prioritizes the likely causal gene as well as loci where TWAS prioritizes multiple genes, some likely to be non-causal, owing to sharing of expression quantitative trait loci (eQTL). TWAS is especially prone to spurious prioritization with expression data from non-trait-related tissues or cell types, owing to substantial cross-cell-type variation in expression levels and eQTL strengths. Nonetheless, TWAS prioritizes candidate causal genes more accurately than simple baselines. We suggest best practices for causal-gene prioritization with TWAS and discuss future opportunities for improvement. Our results showcase the strengths and limitations of using eQTL datasets to determine causal genes at GWAS loci.
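To make the shared-eQTL issue concrete, the following is a minimal sketch, not the authors' pipeline. It assumes one commonly used summary-statistic form of the TWAS test, z = w'z_GWAS / sqrt(w'Rw), where w holds a gene's expression-prediction weights and R is the SNP correlation (LD) matrix. The function name `twas_z`, the two-SNP locus and all numbers are hypothetical and chosen only for illustration.

```python
import numpy as np


def twas_z(weights, gwas_z, ld):
    """Summary-statistic TWAS z-score: w'z / sqrt(w'Rw).

    weights: per-SNP expression-prediction weights for one gene (assumed given)
    gwas_z:  per-SNP GWAS z-scores at the locus
    ld:      SNP-by-SNP correlation (LD) matrix R
    """
    weights = np.asarray(weights, dtype=float)
    gwas_z = np.asarray(gwas_z, dtype=float)
    ld = np.asarray(ld, dtype=float)
    predicted_var = weights @ ld @ weights
    if predicted_var <= 0:
        raise ValueError("expression weights give zero predicted variance")
    return float(weights @ gwas_z / np.sqrt(predicted_var))


# Hypothetical locus with two SNPs in LD (r = 0.9); values are made up.
ld = np.array([[1.0, 0.9],
               [0.9, 1.0]])
gwas_z = np.array([5.0, 4.5])

# Gene A and gene B share the same eQTL (SNP 1), with different effect sizes.
w_gene_a = np.array([1.0, 0.0])
w_gene_b = np.array([0.5, 0.0])

z_a = twas_z(w_gene_a, gwas_z, ld)
z_b = twas_z(w_gene_b, gwas_z, ld)
print(z_a, z_b)  # both genes receive the same TWAS z-score
```

Because the statistic is invariant to rescaling of the weights, two genes driven by the same eQTL receive identical z-scores in this toy setting, so the test alone cannot say which gene, if either, is causal.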
Acknowledgements
We gratefully acknowledge J. Pritchard, H. Tang and members of the laboratory of N. Zaitlen for helpful discussions. This work was funded in part by the Natural Sciences and Engineering Research Council of Canada (NSERC) (grant PGSD3-476082-2015 to M.W.); a Stanford Bio-X Bowes fellowship (to M.W.); a Stanford Graduate Fellowship (to N.S.-A.); a National Defense Science & Engineering Grant (to N.S.-A.); NIH grants 1DP2OD022870 and U01HG009431 (to A.K.), 1U24HG008956 and 5U01HG009080 (to M.A.R.), R01HG009120 and R01MH115676 (to B.P.), R01MH107666, R01MH101820 and P30DK20595 (to H.K.I.), and R01HL125863 and R21TR001739 (to J.L.M.B.); NHGRI grant R01HG010140 (to M.A.R.); Leducq Foundation grant 12CVD02 (to J.L.M.B.); and American Heart Association grant A14SFRN20840000 (to J.L.M.B.). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Author information
Authors and Affiliations
- Department of Computer Science, Stanford University, Stanford, CA, USA
  Michael Wainberg & Anshul Kundaje
- Department of Genetics, Stanford University, Stanford, CA, USA
  Nasa Sinnott-Armstrong, David Golan & Anshul Kundaje
- Department of Pathology & Laboratory Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA
  Nicholas Mancuso & Bogdan Pasaniuc
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
  Alvaro N. Barbeira & Hae Kyung Im
- New York Genome Center, New York, NY, USA
  David A. Knowles
- Department of Computer Science, Columbia University, New York, NY, USA
  David A. Knowles
- Department of Cardiac Surgery, Tartu University Hospital, Tartu, Estonia
  Raili Ermel & Arno Ruusalepp
- Clinical Gene Networks AB, Stockholm, Sweden
  Arno Ruusalepp & Johan L. M. Björkegren
- Division of Cardiovascular Medicine, Stanford University, Stanford, CA, USA
  Thomas Quertermous
- Department of Genetics & Genomic Sciences, Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  Ke Hao & Johan L. M. Björkegren
- Department of Pathophysiology, Institute of Biomedicine and Translational Medicine, University of Tartu, Tartu, Estonia
  Johan L. M. Björkegren
- Integrated Cardio Metabolic Centre, Department of Medicine, Karolinska Institutet, Karolinska Universitetssjukhuset, Huddinge, Sweden
  Johan L. M. Björkegren
- Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA
  Bogdan Pasaniuc
- Department of Computational Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA
  Bogdan Pasaniuc
- Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
  Manuel A. Rivas
Authors
- Michael Wainberg
- Nasa Sinnott-Armstrong
- Nicholas Mancuso
- Alvaro N. Barbeira
- David A. Knowles
- David Golan
- Raili Ermel
- Arno Ruusalepp
- Thomas Quertermous
- Ke Hao
- Johan L. M. Björkegren
- Hae Kyung Im
- Bogdan Pasaniuc
- Manuel A. Rivas
- Anshul Kundaje
Contributions
M.W., M.A.R. and A.K. conceived the study. M.W., N.M. and A.N.B. performed analyses. N.S.-A., D.A.K. and D.G. provided intellectual input. R.E., A.R., T.Q., K.H. and J.L.M.B. provided assistance with analysis of the STARNET dataset. H.K.I., B.P., M.A.R. and A.K. supervised the study. M.W., H.K.I., B.P., M.A.R. and A.K. wrote the manuscript. All authors reviewed the manuscript.
Corresponding authors
Correspondence to Johan L. M. Björkegren, Hae Kyung Im, Bogdan Pasaniuc, Manuel A. Rivas or Anshul Kundaje.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Wainberg, M., Sinnott-Armstrong, N., Mancuso, N. et al. Opportunities and challenges for transcriptome-wide association studies. Nat. Genet. 51, 592–599 (2019). https://doi.org/10.1038/s41588-019-0385-z
- Received: 25 May 2018
- Accepted: 13 February 2019
- Published: 29 March 2019
- Version of record: 29 March 2019
- Issue date: April 2019
- DOI: https://doi.org/10.1038/s41588-019-0385-z