Understanding mechanisms underlying human gene expression variation with RNA sequencing

Nature volume 464, pages 768–772 (2010)

Abstract

Understanding the genetic mechanisms underlying natural variation in gene expression is a central goal of both medical and evolutionary genetics, and studies of expression quantitative trait loci (eQTLs) have become an important tool for achieving this goal [1]. Although all eQTL studies so far have assayed messenger RNA levels using expression microarrays, recent advances in RNA sequencing enable the analysis of transcript variation at unprecedented resolution. We sequenced RNA from 69 lymphoblastoid cell lines derived from unrelated Nigerian individuals that have been extensively genotyped by the International HapMap Project [2]. By pooling data from all individuals, we generated a map of the transcriptional landscape of these cells, identifying extensive use of unannotated untranslated regions and more than 100 new putative protein-coding exons. Using the genotypes from the HapMap project, we identified more than a thousand genes at which genetic variation influences overall expression levels or splicing. We demonstrate that eQTLs near genes generally act by a mechanism involving allele-specific expression, and that variation that influences the inclusion of an exon is enriched within and near the consensus splice sites. Our results illustrate the power of high-throughput sequencing for the joint analysis of variation in transcription, splicing and allele-specific expression across individuals.
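
To make the idea of an eQTL concrete, the sketch below regresses a gene's expression level on genotype dosage at a single SNP. It is a generic illustration and not the procedure used in the paper, whose methods are not described in this preview. The function name `eqtl_regression`, the 0/1/2 dosage coding and the toy values are all assumptions made for the example.

```python
from math import sqrt


def eqtl_regression(dosages, expression):
    """Least-squares fit of expression on genotype dosage.

    dosages: copies of one allele per individual (0, 1 or 2); assumed coding.
    expression: normalized expression level of one gene per individual.
    Returns (slope, intercept, t_statistic) for the dosage effect.
    """
    n = len(dosages)
    if n != len(expression) or n < 3:
        raise ValueError("need at least three matched samples")

    mean_g = sum(dosages) / n
    mean_e = sum(expression) / n

    # Sum of squares of genotype and cross-product with expression.
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    if sxx == 0:
        raise ValueError("genotype does not vary in this sample")
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(dosages, expression))

    slope = sxy / sxx
    intercept = mean_e - slope * mean_g

    # Residual variance gives the standard error of the slope.
    rss = sum((e - (intercept + slope * g)) ** 2
              for g, e in zip(dosages, expression))
    se = sqrt(rss / (n - 2) / sxx)
    t_stat = slope / se if se > 0 else float("inf")
    return slope, intercept, t_stat


# Toy data (invented): six individuals, two per genotype class.
toy_dosages = [0, 0, 1, 1, 2, 2]
toy_expression = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]
slope, intercept, t_stat = eqtl_regression(toy_dosages, toy_expression)
print(f"slope={slope:.3f} intercept={intercept:.3f} t={t_stat:.2f}")
```

A large absolute t-statistic suggests that expression changes with each additional copy of the allele. A genome-wide analysis would repeat such a test over many SNP and gene pairs and would need to correct for multiple testing and for confounders, which this sketch leaves out.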

Accession codes

Data deposits

Sequencing data have been deposited in Gene Expression Omnibus (GEO) under accession number GSE19480, and are also available at http://eqtl.uchicago.edu.

References

  1. Rockman, M. V. & Kruglyak, L. Genetics of global gene expression. Nature Rev. Genet. 7, 862–872 (2006)
  2. Frazer, K. A. et al. A second generation human haplotype map of over 3.1 million SNPs. Nature 449, 851–861 (2007)
  3. Cheung, V. G. et al. Natural variation in human gene expression assessed in lymphoblastoid cells. Nature Genet. 33, 422–425 (2003)
  4. Kwan, T. et al. Heritability of alternative splicing in the human genome. Genome Res. 17, 1210–1218 (2007)
  5. Cheung, V. G. et al. Mapping determinants of human gene expression by regional and genome-wide association. Nature 437, 1365–1369 (2005)
  6. Stranger, B. E. et al. Population genomics of human gene expression. Nature Genet. 39, 1217–1224 (2007)
  7. Veyrieras, J.-B. et al. High-resolution mapping of expression-QTLs yields insight into human gene regulation. PLoS Genet. 4, e1000214 (2008)
  8. Wang, Z., Gerstein, M. & Snyder, M. RNA-Seq: a revolutionary tool for transcriptomics. Nature Rev. Genet. 10, 57–63 (2009)
  9. Li, H., Ruan, J. & Durbin, R. Mapping short DNA sequencing reads and calling variants using mapping quality scores. Genome Res. 18, 1851–1858 (2008)
  10. Huang, R. S. et al. A genome-wide approach to identify genetic variants that contribute to etoposide-induced cytotoxicity. Proc. Natl Acad. Sci. USA 104, 9758–9763 (2007)
  11. Miller, W. et al. 28-way vertebrate alignment and conservation track in the UCSC Genome Browser. Genome Res. 17, 1797–1808 (2007)
  12. Wang, E. T. et al. Alternative isoform regulation in human tissue transcriptomes. Nature 456, 470–476 (2008)
  13. Zhao, J., Hyman, L. & Moore, C. Formation of mRNA 3′ ends in eukaryotes: mechanism, regulation, and interrelationships with other steps in mRNA synthesis. Microbiol. Mol. Biol. Rev. 63, 405–445 (1999)
  14. Xie, X. et al. Systematic discovery of regulatory motifs in human promoters and 3′ UTRs by comparison of several mammals. Nature 434, 338–345 (2005)
  15. Sandberg, R., Neilson, J. R., Sarma, A., Sharp, P. A. & Burge, C. B. Proliferating cells express mRNAs with shortened 3′ untranslated regions and fewer microRNA target sites. Science 320, 1643–1647 (2008)
  16. Mayr, C. & Bartel, D. P. Widespread shortening of 3′UTRs by alternative cleavage and polyadenylation activates oncogenes in cancer cells. Cell 138, 673–684 (2009)
  17. Mortazavi, A., Williams, B. A., McCue, K., Schaeffer, L. & Wold, B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nature Methods 5, 621–628 (2008)
  18. Cloonan, N. et al. Stem cell transcriptome profiling via massive-scale mRNA sequencing. Nature Methods 5, 613–619 (2008)
  19. Choy, E. et al. Genetic analysis of human traits in vitro: drug response and gene expression in lymphoblastoid cell lines. PLoS Genet. 4, e1000287 (2008)
  20. Kang, H. M., Ye, C. & Eskin, E. Accurate discovery of expression quantitative trait loci under confounding from spurious and genuine regulatory hotspots. Genetics 180, 1909–1925 (2008)
  21. Stranger, B. E. et al. Genome-wide associations of gene expression variation in humans. PLoS Genet. 1, e78 (2005)
  22. Montgomery, S. B. et al. Transcriptome genetics using second generation sequencing in a Caucasian population. Nature 10.1038/nature08903 (this issue)
  23. Ge, B. et al. Global patterns of cis variation in human cells revealed by high-density allelic expression analysis. Nature Genet. 41, 1216–1222 (2009)
  24. Verlaan, D. J. et al. Targeted screening of _cis_-regulatory variation in human haplotypes. Genome Res. 19, 118–127 (2009)
  25. Watson, J. et al. Molecular Biology of the Gene 6th edn, Ch. 13 (Benjamin Cummings, 2008)
  26. Fraser, H. B. & Xie, X. Common polymorphic transcript variation in human disease. Genome Res. 19, 567–575 (2009)
  27. Moffatt, M. F. et al. Genetic variants regulating ORMDL3 expression contribute to the risk of childhood asthma. Nature 448, 470–473 (2007)
  28. Marioni, J. C., Mason, C. E., Mane, S. M., Stephens, M. & Gilad, Y. RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays. Genome Res. 18, 1509–1517 (2008)
  29. Guan, Y. & Stephens, M. Practical issues in imputation-based association mapping. PLoS Genet. 4, e1000279 (2008)
  30. Degner, J. F. et al. Effect of read-mapping biases on detecting allele-specific expression from RNA-sequencing data. Bioinformatics 25, 3207–3212 (2009)

Acknowledgements

We thank D. Gaffney, J. Bell, K. Bullaughey, Y. Guan and other members of the Pritchard, M. Przeworski and Stephens laboratory groups for helpful discussions, M. Domanus and P. Zumbo for sequencing support, and J. Zekos for computational assistance. J.F.D. and A.A.P. are supported by an NIH Training Grant to the University of Chicago. This work was supported by the HHMI and by NIH grants MH084703-01 to J.K. Pritchard and GM077959 to Y.G.

Author Contributions J.K. Pickrell performed most of the data analysis. J.C.M. contributed to the analysis of GC content and data normalizations and provided input on other aspects of data analysis. A.A.P. coordinated the cell culture and sequencing, and A.A.P. and E.N. prepared the sequencing libraries. The PCA-based normalization procedure was based on results from J.-B.V., B.E.E. and M.S. J.F.D. provided software for the analysis of allele-specific expression. All authors participated in regular, detailed discussions of study design and data analysis at all stages of the study. The project was designed and supervised by Y.G. and J.K. Pritchard with regular input from M.S. The paper was written by J.K. Pickrell, Y.G. and J.K. Pritchard, with input from all authors.

Author information

Authors and Affiliations

  1. Department of Human Genetics
    Joseph K. Pickrell, John C. Marioni, Athma A. Pai, Jacob F. Degner, Everlyne Nkadori, Jean-Baptiste Veyrieras, Matthew Stephens, Yoav Gilad & Jonathan K. Pritchard
  2. Department of Computer Science
    Barbara E. Engelhardt
  3. Howard Hughes Medical Institute
    Everlyne Nkadori & Jonathan K. Pritchard
  4. Department of Statistics, The University of Chicago, Chicago 60637, USA
    Matthew Stephens

Authors

  1. Joseph K. Pickrell
  2. John C. Marioni
  3. Athma A. Pai
  4. Jacob F. Degner
  5. Barbara E. Engelhardt
  6. Everlyne Nkadori
  7. Jean-Baptiste Veyrieras
  8. Matthew Stephens
  9. Yoav Gilad
  10. Jonathan K. Pritchard

Corresponding authors

Correspondence to Joseph K. Pickrell, Yoav Gilad or Jonathan K. Pritchard.

Supplementary information

This file contains Supplementary Material including Supplementary Figures 1-19 with legends, Supplementary Tables 1-2, and Supplementary References. (PDF 1169 kb)

About this article

Cite this article

Pickrell, J., Marioni, J., Pai, A. et al. Understanding mechanisms underlying human gene expression variation with RNA sequencing. Nature 464, 768–772 (2010). https://doi.org/10.1038/nature08872

Editorial Summary

RNA sequencing unlocks key to gene expression

There is currently much interest in the understanding of genetic mechanisms that underlie variation at the gene expression level. Two groups reporting in this issue of Nature use RNA sequencing to study global gene expression in two contrasting populations. Pickrell et al. sequenced RNA from 69 lymphoblastoid cell lines derived from unrelated Nigerian individuals who have been extensively genotyped as part of the HapMap Project. By pooling data from all the individuals it was possible to identify many genetic determinants of variation in gene expression. Montgomery et al. characterize the mRNA fraction of RNA isolated from lymphoblastoid cell lines derived from 63 HapMap individuals of Caucasian origin. They obtain a fine-scale view of the transcriptome and identify genetic variants that affect alternative splicing.