Understanding mechanisms underlying human gene expression variation with RNA sequencing
- Letter
- Published: 10 March 2010
- John C. Marioni1,
- Athma A. Pai1,
- Jacob F. Degner1,
- Barbara E. Engelhardt2,
- Everlyne Nkadori1,3,
- Jean-Baptiste Veyrieras1,
- Matthew Stephens1,4,
- Yoav Gilad1 &
- …
- Jonathan K. Pritchard1,3
Nature volume 464, pages 768–772 (2010)
Abstract
Understanding the genetic mechanisms underlying natural variation in gene expression is a central goal of both medical and evolutionary genetics, and studies of expression quantitative trait loci (eQTLs) have become an important tool for achieving this goal [1]. Although all eQTL studies so far have assayed messenger RNA levels using expression microarrays, recent advances in RNA sequencing enable the analysis of transcript variation at unprecedented resolution. We sequenced RNA from 69 lymphoblastoid cell lines derived from unrelated Nigerian individuals that have been extensively genotyped by the International HapMap Project [2]. By pooling data from all individuals, we generated a map of the transcriptional landscape of these cells, identifying extensive use of unannotated untranslated regions and more than 100 new putative protein-coding exons. Using the genotypes from the HapMap project, we identified more than a thousand genes at which genetic variation influences overall expression levels or splicing. We demonstrate that eQTLs near genes generally act by a mechanism involving allele-specific expression, and that variation that influences the inclusion of an exon is enriched within and near the consensus splice sites. Our results illustrate the power of high-throughput sequencing for the joint analysis of variation in transcription, splicing and allele-specific expression across individuals.
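To make the idea of an eQTL concrete, the sketch below tests whether genotype dosage at a single SNP is associated with the expression level of a single gene. It is only an illustration under stated assumptions: the abstract does not describe the authors' statistical procedure, so the least-squares slope with a permutation p-value used here is one common, generic choice rather than the method of the paper. The function names (`regression_slope`, `permutation_pvalue`) and the toy data are hypothetical.

```python
import random
from statistics import mean


def regression_slope(genotypes, expression):
    """Least-squares slope of expression on genotype dosage (0, 1 or 2)."""
    g_mean, e_mean = mean(genotypes), mean(expression)
    sxy = sum((g - g_mean) * (e - e_mean) for g, e in zip(genotypes, expression))
    sxx = sum((g - g_mean) ** 2 for g in genotypes)
    if sxx == 0:
        raise ValueError("genotype does not vary in this sample")
    return sxy / sxx


def permutation_pvalue(genotypes, expression, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the slope.

    Expression values are shuffled across individuals, which breaks any
    genotype-expression link while keeping both marginal distributions.
    """
    rng = random.Random(seed)
    observed = abs(regression_slope(genotypes, expression))
    shuffled = list(expression)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(regression_slope(genotypes, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Invented toy data: 12 individuals, dosage of the minor allele at one SNP,
# and a normalized expression value for one gene. Not data from the study.
genotypes = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
expression = [1.0, 1.2, 0.9, 1.1, 1.9, 2.1, 2.0, 1.8, 3.1, 2.9, 3.0, 3.2]

slope = regression_slope(genotypes, expression)
p_value = permutation_pvalue(genotypes, expression)
print(f"slope per allele copy: {slope:.2f}, permutation p-value: {p_value:.4f}")
```

In a genome-wide setting a test of this kind would be repeated for many SNP and gene pairs, so the resulting p-values would need correction for multiple testing; that step is omitted here.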
Accession codes
Primary accessions
Gene Expression Omnibus
Data deposits
Sequencing data have been deposited in Gene Expression Omnibus (GEO) under accession number GSE19480, and are also available at http://eqtl.uchicago.edu.
References
1. Rockman, M. V. & Kruglyak, L. Genetics of global gene expression. Nature Rev. Genet. 7, 862–872 (2006)
2. Frazer, K. A. et al. A second generation human haplotype map of over 3.1 million SNPs. Nature 449, 851–861 (2007)
3. Cheung, V. G. et al. Natural variation in human gene expression assessed in lymphoblastoid cells. Nature Genet. 33, 422–425 (2003)
4. Kwan, T. et al. Heritability of alternative splicing in the human genome. Genome Res. 17, 1210–1218 (2007)
5. Cheung, V. G. et al. Mapping determinants of human gene expression by regional and genome-wide association. Nature 437, 1365–1369 (2005)
6. Stranger, B. E. et al. Population genomics of human gene expression. Nature Genet. 39, 1217–1224 (2007)
7. Veyrieras, J.-B. et al. High-resolution mapping of expression-QTLs yields insight into human gene regulation. PLoS Genet. 4, e1000214 (2008)
8. Wang, Z., Gerstein, M. & Snyder, M. RNA-Seq: a revolutionary tool for transcriptomics. Nature Rev. Genet. 10, 57–63 (2009)
9. Li, H., Ruan, J. & Durbin, R. Mapping short DNA sequencing reads and calling variants using mapping quality scores. Genome Res. 18, 1851–1858 (2008)
10. Huang, R. S. et al. A genome-wide approach to identify genetic variants that contribute to etoposide-induced cytotoxicity. Proc. Natl Acad. Sci. USA 104, 9758–9763 (2007)
11. Miller, W. et al. 28-way vertebrate alignment and conservation track in the UCSC Genome Browser. Genome Res. 17, 1797–1808 (2007)
12. Wang, E. T. et al. Alternative isoform regulation in human tissue transcriptomes. Nature 456, 470–476 (2008)
13. Zhao, J., Hyman, L. & Moore, C. Formation of mRNA 3′ ends in eukaryotes: mechanism, regulation, and interrelationships with other steps in mRNA synthesis. Microbiol. Mol. Biol. Rev. 63, 405–445 (1999)
14. Xie, X. et al. Systematic discovery of regulatory motifs in human promoters and 3′ UTRs by comparison of several mammals. Nature 434, 338–345 (2005)
15. Sandberg, R., Neilson, J. R., Sarma, A., Sharp, P. A. & Burge, C. B. Proliferating cells express mRNAs with shortened 3′ untranslated regions and fewer microRNA target sites. Science 320, 1643–1647 (2008)
16. Mayr, C. & Bartel, D. P. Widespread shortening of 3′UTRs by alternative cleavage and polyadenylation activates oncogenes in cancer cells. Cell 138, 673–684 (2009)
17. Mortazavi, A., Williams, B. A., McCue, K., Schaeffer, L. & Wold, B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nature Methods 5, 621–628 (2008)
18. Cloonan, N. et al. Stem cell transcriptome profiling via massive-scale mRNA sequencing. Nature Methods 5, 613–619 (2008)
19. Choy, E. et al. Genetic analysis of human traits in vitro: drug response and gene expression in lymphoblastoid cell lines. PLoS Genet. 4, e1000287 (2008)
20. Kang, H. M., Ye, C. & Eskin, E. Accurate discovery of expression quantitative trait loci under confounding from spurious and genuine regulatory hotspots. Genetics 180, 1909–1925 (2008)
21. Stranger, B. E. et al. Genome-wide associations of gene expression variation in humans. PLoS Genet. 1, e78 (2005)
22. Montgomery, S. B. et al. Transcriptome genetics using second generation sequencing in a Caucasian population. Nature 10.1038/nature08903 (this issue)
23. Ge, B. et al. Global patterns of cis variation in human cells revealed by high-density allelic expression analysis. Nature Genet. 41, 1216–1222 (2009)
24. Verlaan, D. J. et al. Targeted screening of cis-regulatory variation in human haplotypes. Genome Res. 19, 118–127 (2009)
25. Watson, J. et al. Molecular Biology of the Gene 6th edn, Ch. 13 (Benjamin Cummings, 2008)
26. Fraser, H. B. & Xie, X. Common polymorphic transcript variation in human disease. Genome Res. 19, 567–575 (2009)
27. Moffatt, M. F. et al. Genetic variants regulating ORMDL3 expression contribute to the risk of childhood asthma. Nature 448, 470–473 (2007)
28. Marioni, J. C., Mason, C. E., Mane, S. M., Stephens, M. & Gilad, Y. RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays. Genome Res. 18, 1509–1517 (2008)
29. Guan, Y. & Stephens, M. Practical issues in imputation-based association mapping. PLoS Genet. 4, e1000279 (2008)
30. Degner, J. F. et al. Effect of read-mapping biases on detecting allele-specific expression from RNA-sequencing data. Bioinformatics 25, 3207–3212 (2009)
Acknowledgements
We thank D. Gaffney, J. Bell, K. Bullaughey, Y. Guan and other members of the Pritchard, M. Przeworski and Stephens laboratory groups for helpful discussions, M. Domanus and P. Zumbo for sequencing support, and J. Zekos for computational assistance. J.F.D. and A.A.P. are supported by an NIH Training Grant to the University of Chicago. This work was supported by the HHMI and by NIH grants MH084703-01 to J.K. Pritchard and GM077959 to Y.G.
Author Contributions: J.K. Pickrell performed most of the data analysis. J.C.M. contributed to the analysis of GC content and data normalizations and provided input on other aspects of data analysis. A.A.P. coordinated the cell culture and sequencing, and A.A.P. and E.N. prepared the sequencing libraries. The PCA-based normalization procedure was based on results from J.-B.V., B.E.E. and M.S. J.F.D. provided software for the analysis of allele-specific expression. All authors participated in regular, detailed discussions of study design and data analysis at all stages of the study. The project was designed and supervised by Y.G. and J.K. Pritchard with regular input from M.S. The paper was written by J.K. Pickrell, Y.G. and J.K. Pritchard, with input from all authors.
Author information
Authors and Affiliations
- Department of Human Genetics,
Joseph K. Pickrell, John C. Marioni, Athma A. Pai, Jacob F. Degner, Everlyne Nkadori, Jean-Baptiste Veyrieras, Matthew Stephens, Yoav Gilad & Jonathan K. Pritchard
- Department of Computer Science,
Barbara E. Engelhardt
- Howard Hughes Medical Institute,
Everlyne Nkadori & Jonathan K. Pritchard
- Department of Statistics, The University of Chicago, Chicago 60637, USA
Matthew Stephens
Authors
- Joseph K. Pickrell
- John C. Marioni
- Athma A. Pai
- Jacob F. Degner
- Barbara E. Engelhardt
- Everlyne Nkadori
- Jean-Baptiste Veyrieras
- Matthew Stephens
- Yoav Gilad
- Jonathan K. Pritchard
Corresponding authors
Correspondence to Joseph K. Pickrell, Yoav Gilad or Jonathan K. Pritchard.
Supplementary information
Supplementary Information
This file contains Supplementary Material including Supplementary Figures 1-19 with legends, Supplementary Tables 1-2, and Supplementary References. (PDF 1169 kb)
About this article
Cite this article
Pickrell, J., Marioni, J., Pai, A. et al. Understanding mechanisms underlying human gene expression variation with RNA sequencing. Nature 464, 768–772 (2010). https://doi.org/10.1038/nature08872
- Received: 23 September 2009
- Accepted: 01 February 2010
- Published: 10 March 2010
- Issue Date: 01 April 2010
- DOI: https://doi.org/10.1038/nature08872
Editorial Summary
RNA sequencing unlocks key to gene expression
There is currently much interest in the understanding of genetic mechanisms that underlie variation at the gene expression level. Two groups reporting in this issue of Nature use RNA sequencing to study global gene expression in two contrasting populations. Pickrell et al. sequenced RNA from 69 lymphoblastoid cell lines derived from unrelated Nigerian individuals who have been extensively genotyped as part of the HapMap Project. By pooling data from all the individuals it was possible to identify many genetic determinants of variation in gene expression. Montgomery et al. characterize the mRNA fraction of RNA isolated from lymphoblastoid cell lines derived from 63 HapMap individuals of Caucasian origin. They obtain a fine-scale view of the transcriptome and identify genetic variants that affect alternative splicing.