Genetics of single-cell protein abundance variation in large yeast populations

Nature volume 506, pages 494–497 (2014)

This article has been updated

Abstract

Variation among individuals arises in part from differences in DNA sequences, but the genetic basis for variation in most traits, including common diseases, remains only partly understood. Many DNA variants influence phenotypes by altering the expression level of one or several genes. The effects of such variants can be detected as expression quantitative trait loci (eQTL) [1]. Traditional eQTL mapping requires large-scale genotype and gene expression data for each individual in the study sample, which limits sample sizes to hundreds of individuals in both humans and model organisms and reduces statistical power [2–6]. Consequently, many eQTL are probably missed, especially those with smaller effects [7]. Furthermore, most studies use messenger RNA rather than protein abundance as the measure of gene expression. Studies that have used mass-spectrometry proteomics [8–13] reported unexpected differences between eQTL and protein QTL (pQTL) for the same genes [9,10], but these studies have been even more limited in scope. Here we introduce a powerful method for identifying genetic loci that influence protein expression in the yeast Saccharomyces cerevisiae. We measure single-cell protein abundance through the use of green fluorescent protein tags in very large populations of genetically variable cells, and use pooled sequencing to compare allele frequencies across the genome in thousands of individuals with high versus low protein abundance. We applied this method to 160 genes and detected many more loci per gene than previous studies. We also observed closer correspondence between loci that influence protein abundance and loci that influence mRNA abundance of a given gene. Most loci that we detected were clustered in ‘hotspots’ that influence multiple proteins, and some hotspots were found to influence more than half of the proteins that we examined. The variants that underlie these hotspots have profound effects on the gene regulatory network and provide insights into genetic variation in cell physiology between yeast strains.
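
The logic of comparing allele frequencies between extreme pools can be illustrated with a small simulation. The sketch below is not the authors' pipeline: it assumes a single biallelic locus, a haploid population, normally distributed abundance, and hypothetical values for the effect size and the fraction of cells collected in each tail. It shows that an allele that raises abundance becomes enriched in the high pool and depleted in the low pool, whereas a locus without an effect shows similar frequencies in both pools.

```python
import random


def simulate_pools(n_cells=20000, effect=0.5, tail=0.05, seed=1):
    """Return (freq_high, freq_low) of the abundance-raising allele.

    All parameter values are illustrative assumptions, not values
    taken from the study.
    """
    rng = random.Random(seed)
    cells = []
    for _ in range(n_cells):
        allele = rng.randint(0, 1)  # 1 = allele that raises abundance
        abundance = effect * allele + rng.gauss(0.0, 1.0)
        cells.append((abundance, allele))

    cells.sort()  # ascending by abundance
    k = int(n_cells * tail)  # number of cells collected per tail
    low_pool = cells[:k]
    high_pool = cells[n_cells - k:]

    freq_low = sum(allele for _, allele in low_pool) / k
    freq_high = sum(allele for _, allele in high_pool) / k
    return freq_high, freq_low


if __name__ == "__main__":
    for eff in (0.0, 0.5):
        high, low = simulate_pools(effect=eff)
        print(f"effect={eff}: high={high:.3f} low={low:.3f} "
              f"difference={high - low:.3f}")
```

In pooled sequencing the two frequencies would be estimated from read counts at each variant rather than observed directly, so sequencing depth adds sampling noise that this sketch ignores.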

Change history

A minor change was made to the opening paragraph.

References

  1. Rockman, M. V. & Kruglyak, L. Genetics of global gene expression. Nature Rev. Genet. 7, 862–872 (2006)
  2. Smith, E. N. & Kruglyak, L. Gene–environment interaction in yeast gene expression. PLoS Biol. 6, e83 (2008)
  3. Rockman, M. V., Skrovanek, S. S. & Kruglyak, L. Selection at linked sites shapes heritable phenotypic variation in C. elegans. Science 330, 372–376 (2010)
  4. Huang, G. J. et al. High resolution mapping of expression QTLs in heterogeneous stock mice in multiple tissues. Genome Res. 19, 1133–1140 (2009)
  5. West, M. A. L. et al. Global eQTL mapping reveals the complex genetic architecture of transcript-level variation in Arabidopsis. Genetics 175, 1441–1450 (2007)
  6. Lappalainen, T. et al. Transcriptome and genome sequencing uncovers functional variation in humans. Nature 501, 506–511 (2013)
  7. Brem, R. B. & Kruglyak, L. The landscape of genetic complexity across 5,700 gene expression traits in yeast. Proc. Natl Acad. Sci. USA 102, 1572–1577 (2005)
  8. Foss, E. J. et al. Genetic basis of proteome variation in yeast. Nature Genet. 39, 1369–1375 (2007)
  9. Foss, E. J. et al. Genetic variation shapes protein networks mainly through non-transcriptional mechanisms. PLoS Biol. 9, e1001144 (2011)
  10. Ghazalpour, A. et al. Comparative analysis of proteome and transcriptome variation in mouse. PLoS Genet. 7, e1001393 (2011)
  11. Wu, L. et al. Variation and genetic control of protein abundance in humans. Nature 499, 79–82 (2013)
  12. Khan, Z., Bloom, J. S., Garcia, B. A., Singh, M. & Kruglyak, L. Protein quantification across hundreds of experimental conditions. Proc. Natl Acad. Sci. USA 106, 15544–15548 (2009)
  13. Skelly, D. A. et al. Integrative phenomics reveals insight into the structure of phenotypic diversity in budding yeast. Genome Res. 23, 1496–1504 (2013)
  14. Ehrenreich, I. M. et al. Dissection of genetically complex traits with extremely large pools of yeast segregants. Nature 464, 1039–1042 (2010)
  15. Huh, W.-K. et al. Global analysis of protein localization in budding yeast. Nature 425, 686–691 (2003)
  16. Edwards, M. D. & Gifford, D. K. High-resolution genetic mapping with pooled sequencing. BMC Bioinformatics 13, S8 (2012)
  17. Picotti, P. et al. A complete mass-spectrometric map of the yeast proteome applied to quantitative trait analysis. Nature 494, 266–270 (2013)
  18. Brem, R. B., Yvert, G., Clinton, R. & Kruglyak, L. Genetic dissection of transcriptional regulation in budding yeast. Science 296, 752–755 (2002)
  19. Litvin, O., Causton, H. C., Chen, B. J. & Pe’er, D. Modularity and interactions in the genetics of gene expression. Proc. Natl Acad. Sci. USA 106, 6441–6446 (2009)
  20. Zitomer, R. S. & Lowry, C. V. Regulation of gene expression by oxygen in Saccharomyces cerevisiae. Microbiol. Rev. 56, 1–11 (1992)
  21. Gaisne, M., Bécam, A. M., Verdiere, J. & Herbert, C. J. A. A ‘natural’ mutation in Saccharomyces cerevisiae strains derived from S288c affects the complex regulatory gene HAP1 (CYP1). Curr. Genet. 36, 195–200 (1999)
  22. Harbison, C. T. et al. Transcriptional regulatory code of a eukaryotic genome. Nature 431, 99–104 (2004)
  23. Butler, G. Hypoxia and gene expression in eukaryotic microbes. Annu. Rev. Microbiol. 67, 291–312 (2013)
  24. Zaman, S., Lippman, S. I., Zhao, X. & Broach, J. R. How Saccharomyces responds to nutrients. Annu. Rev. Genet. 42, 27–81 (2008)
  25. Zaman, S., Lippman, S. I., Schneper, L., Slonim, N. & Broach, J. R. Glucose regulates transcription in yeast through a network of signaling pathways. Mol. Syst. Biol. 5, 245 (2009)
  26. Spor, A. et al. Niche-driven evolution of metabolic and life-history strategies in natural and domesticated populations of Saccharomyces cerevisiae. BMC Evol. Biol. 9, 296 (2009)
  27. Warringer, J. et al. Trait variation in yeast is defined by population history. PLoS Genet. 7, e1002111 (2011)
  28. Fraser, H. B., Moses, A. M. & Schadt, E. E. Evidence for widespread adaptive evolution of gene expression in budding yeast. Proc. Natl Acad. Sci. USA 107, 2977–2982 (2010)
  29. Lewis, J. A. & Gasch, A. P. Natural variation in the yeast glucose-signaling network reveals a new role for the Mig3p transcription factor. G3 Genes Genomes Genetics 2, 1607–1612 (2012)
  30. Henras, A. K. et al. The post-transcriptional steps of eukaryotic ribosome biogenesis. Cell. Mol. Life Sci. 65, 2334–2359 (2008)
  31. Howson, R. et al. Construction, verification and experimental use of two epitope-tagged collections of budding yeast strains. Comp. Funct. Genomics 6, 2–16 (2005)
  32. Tong, A. H. Y. & Boone, C. High-throughput strain construction and systematic synthetic lethal screening in Saccharomyces cerevisiae. Methods in Microbiology 36, 369–707 (2007)
  33. Newman, J. R. S. et al. Single-cell proteomic analysis of S. cerevisiae reveals the architecture of biological noise. Nature 441, 840–846 (2006)
  34. Adey, A. et al. Rapid, low-input, low-bias construction of shotgun fragment libraries by high-density in vitro transposition. Genome Biol. 11, R119 (2010)
  35. Bloom, J. S., Ehrenreich, I. M., Loo, W. T., Lite, T.-L. V. & Kruglyak, L. Finding the sources of missing heritability in a yeast cross. Nature 494, 234–237 (2013)
  36. Meyer, M. & Kircher, M. Illumina Sequencing Library Preparation for Highly Multiplexed Target Capture and Sequencing. Cold Spring Harbor Protocols http://dx.doi.org/10.1101/pdb.prot5448 (2010)
  37. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009)
  38. Broman, K. W., Wu, H., Sen, S. & Churchill, G. A. R/qtl: QTL mapping in experimental crosses. Bioinformatics 19, 889–890 (2003)
  39. Yvert, G. et al. Trans-acting regulatory variation in Saccharomyces cerevisiae and the role of transcription factors. Nature Genet. 35, 57–64 (2003)
  40. Storey, J. D. & Tibshirani, R. Statistical significance for genomewide studies. Proc. Natl Acad. Sci. USA 100, 9440–9445 (2003)
  41. Spivak, A. T. & Stormo, G. D. ScerTF: a comprehensive database of benchmarked position weight matrices for Saccharomyces species. Nucleic Acids Res. 40, D162–D168 (2012)

Acknowledgements

We are grateful to C. DeCoste at the Princeton Flow Cytometry Resource Facility for technical assistance and advice on the experiments. This work was supported by National Institutes of Health (NIH) grant R01 GM102308, a James S. McDonnell Centennial Fellowship, and the Howard Hughes Medical Institute (L.K.), German Science Foundation research fellowship AL 1525/1-1 (F.W.A.), a National Science Foundation fellowship (J.S.B.), and NIH postdoctoral fellowship F32 GM101857-02 (S.T.).

Author information

Authors and Affiliations

  1. Department of Human Genetics, University of California, Los Angeles, 90095, California, USA
    Frank W. Albert, Joshua S. Bloom & Leonid Kruglyak
  2. Lewis Sigler Institute for Integrative Genomics, Princeton University, Princeton, 08544, New Jersey, USA
    Frank W. Albert & Sebastian Treusch
  3. Synthetic Genomics, 11149 North Torrey Pines Road, La Jolla, 92037, California, USA
    Arthur H. Shockley
  4. Howard Hughes Medical Institute, University of California, Los Angeles, 90095, California, USA
    Joshua S. Bloom & Leonid Kruglyak
  5. Department of Biological Chemistry, University of California, Los Angeles, 90095, California, USA
    Leonid Kruglyak

Authors

  1. Frank W. Albert
  2. Sebastian Treusch
  3. Arthur H. Shockley
  4. Joshua S. Bloom
  5. Leonid Kruglyak

Contributions

F.W.A. and L.K. conceived the project, designed research and wrote the paper. F.W.A. and A.H.S. performed experiments. F.W.A. analysed the data. S.T. provided advice on yeast strain construction, the initial experimental design and other experimental procedures. J.S.B. provided advice on experimental procedures and data analysis.

Corresponding authors

Correspondence to Frank W. Albert or Leonid Kruglyak.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Extended data figures and tables

Extended Data Figure 1 Overview of the experimental design.

Extended Data Figure 2 Illustration of FACS design.

Shown is GFP intensity and forward scatter (FSC, a measure of cell size) recorded during FACS. The correlation between cell size and GFP intensity is clearly visible. The superimposed collection gates are an illustration, and do not show the actual gates used for this gene. a, The low GFP (blue) and high GFP (red) gates sample extreme levels of GFP within a defined range of cell sizes. b, For the ‘null’ experiments, the same cell size range is collected, but without selecting on GFP.
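
The gating idea in this legend can be sketched in a few lines. The example below is an illustration under stated assumptions, not the sorting protocol: each cell is represented as a hypothetical (FSC, GFP) pair, the size window is a simple interval on FSC, and the fraction collected in each tail is a free parameter. Real FACS gates are drawn on the instrument and need not be rectangular.

```python
def gate_cells(cells, fsc_min, fsc_max, tail_fraction):
    """Split size-matched cells into low-GFP and high-GFP pools.

    cells: iterable of (fsc, gfp) pairs (hypothetical data layout).
    Returns (low_gfp, high_gfp, null_pool), where null_pool holds the GFP
    values of all cells in the size window, with no selection on GFP.
    """
    if not 0.0 < tail_fraction <= 0.5:
        raise ValueError("tail_fraction must be in (0, 0.5]")

    # Keep only cells within the defined range of cell sizes.
    sized = sorted(gfp for fsc, gfp in cells if fsc_min <= fsc <= fsc_max)

    k = int(len(sized) * tail_fraction)  # cells collected per GFP gate
    low_gfp = sized[:k]
    high_gfp = sized[len(sized) - k:]  # avoids sized[-0:] when k == 0
    return low_gfp, high_gfp, sized


if __name__ == "__main__":
    demo = [(50, 1.0), (100, 2.0), (110, 9.0), (120, 5.0),
            (130, 7.0), (300, 100.0)]
    print(gate_cells(demo, fsc_min=90, fsc_max=150, tail_fraction=0.25))
```

Restricting both gates to the same size window is what keeps the correlation between cell size and GFP intensity from dominating the comparison between pools.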

Extended Data Figure 3 Sequence analyses and X-pQTL detection example.

In all panels, physical genomic coordinates are shown on the _x_-axes. The position of the gene (LEU1) is indicated by the purple vertical line. Top panel: frequency of the BY allele in the high (red) and low (blue) GFP population. SNPs are indicated by dots, and loess-smoothed averages as solid lines. Note the fixation of the BY allele in all segregants at the gene position and at the mating type locus on chromosome III, as well as the fixation of the RM allele at the synthetic genetic array marker integrated at the CAN1 locus on the left arm of chromosome V. Middle panel: subtraction of allele frequencies in the low from those in the high GFP population. SNPs are indicated by grey dots, with the loess-smoothed average indicated in black. Note that, on average, there is no difference between the high and the low populations. Positive difference values correspond to a higher frequency of the BY allele in the high GFP population, which we interpret as higher expression being caused by the BY allele at that locus. The red horizontal lines indicate the 99.99% quantile from the empirical ‘null’ sort experiments. They are shown for illustration only and were not used for peak calling. The blue vertical boxes indicate positions of genome-wide X-pQTL, with the width representing the 2-lod drop interval. Bottom panel: lod scores obtained from MULTIPOOL [16]. The red horizontal line is the genome-wide significance threshold (lod = 4.5). Stars indicate X-pQTL called by our algorithm; these positions correspond to the blue bars in the middle panel. For this gene, 14 X-pQTL are called.
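
The 2-lod drop interval mentioned in this legend can be computed from a lod profile along one chromosome. The following is a minimal sketch under assumptions: it handles a single peak per profile, treats the threshold of 4.5 as inclusive, and extends the interval outwards while adjacent markers stay within 2 lod units of the peak. It is not the peak-calling algorithm used in the study, which is not described in this section, and the function name is hypothetical.

```python
def two_lod_interval(positions, lods, threshold=4.5, drop=2.0):
    """Return (left, peak, right) positions for the highest lod peak.

    Returns None if the highest lod score is below the threshold.
    The interval spans contiguous markers with lod >= peak lod - drop.
    """
    if len(positions) != len(lods) or not lods:
        raise ValueError("positions and lods must be non-empty, equal length")

    peak = max(range(len(lods)), key=lods.__getitem__)
    if lods[peak] < threshold:
        return None

    cutoff = lods[peak] - drop

    left = peak
    while left > 0 and lods[left - 1] >= cutoff:
        left -= 1

    right = peak
    while right < len(lods) - 1 and lods[right + 1] >= cutoff:
        right += 1

    return positions[left], positions[peak], positions[right]


if __name__ == "__main__":
    pos = [100, 200, 300, 400, 500, 600, 700]
    lod = [1.0, 3.5, 5.2, 6.0, 4.1, 2.0, 0.5]
    print(two_lod_interval(pos, lod))
```

A profile with several peaks, such as the 14 X-pQTL reported for this gene, would need an additional step that separates peaks before intervals are assigned.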

Extended Data Figure 4 Reproducibility examples.

Shown are allele frequency differences between the high and low GFP populations along the genome of replicates for three genes. The gene positions are indicated by purple vertical lines; note that YMR315W and GCN1 were ‘local’ experiments where peaks at the gene position are visible. The red horizontal lines indicate the 99.99% quantile from the empirical ‘null’ sort experiments. Note the near-perfect agreement for strong X-pQTL, with some differences discernible at weaker loci. See Supplementary Note 1 for details.

Extended Data Figure 5 Example for a local X-pQTL in the gene MAE1.

Shown is the difference in the frequency of the BY allele between the high and the low GFP population along the genome. Red dashed horizontal lines indicate the 99.99% quantile from the empirical ‘null’ sort experiments. They are shown for illustration only and were not used for peak calling.

Extended Data Figure 7 The impact of small effect sizes on the π1 estimate.

Each panel shows the _P_-value distribution obtained from 5,000 tests of a given effect size x, if two groups of 50 individuals each are compared using a _t_-test. The effect size x is given along with the corresponding variance explained (VE), the π1 estimate, and the fraction of tests that achieved nominal significance (P < 0.05). Note that π1 reaches 0.3 at VE = 0.5% – 1% (middle row, right columns). See Supplementary Note 2 for details.
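
A simulation of this kind can be sketched as follows. Several choices here are assumptions rather than details from the legend: the effect size is taken as a difference in means in units of the standard deviation, the P value uses a normal approximation to the _t_ distribution (reasonable at 98 degrees of freedom, but not exact), and π1 is estimated with a single-λ version of the Storey estimator, π0 = #{P > λ} / (m(1 − λ)), instead of the smoothed estimate that dedicated software typically provides.

```python
import math
import random
import statistics


def two_sample_p(a, b):
    """Two-sided P value for a difference in means (normal approximation)."""
    se = math.sqrt(statistics.variance(a) / len(a)
                   + statistics.variance(b) / len(b))
    t = (statistics.fmean(a) - statistics.fmean(b)) / se
    return math.erfc(abs(t) / math.sqrt(2.0))


def simulate_p_values(effect, n_tests=5000, n_per_group=50, seed=0):
    """P values from n_tests comparisons of two groups differing by `effect`."""
    rng = random.Random(seed)
    p_values = []
    for _ in range(n_tests):
        a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        b = [rng.gauss(effect, 1.0) for _ in range(n_per_group)]
        p_values.append(two_sample_p(a, b))
    return p_values


def pi1_estimate(p_values, lam=0.5):
    """Estimate the fraction of non-null tests from P values above `lam`."""
    pi0 = sum(p > lam for p in p_values) / (len(p_values) * (1.0 - lam))
    return max(0.0, 1.0 - min(1.0, pi0))


def fraction_significant(p_values, alpha=0.05):
    """Fraction of tests reaching nominal significance."""
    return sum(p < alpha for p in p_values) / len(p_values)
```

Calling `pi1_estimate(simulate_p_values(x))` for a range of small values of `x` gives a feel for how weak effects depress π1 even when every test has a true effect, which is the point the figure makes.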

Extended Data Figure 8 Genes regulated by the hotspots on chromosomes XI, XII and XV.

The table shows genes that have an X-pQTL at three hotspots. For each gene involved in aerobic respiration, we show the X-pQTL lod scores along the genome in the top half of the plot, and the eQTL and pQTL lod scores in the bottom half on an inverted scale. The hotspot locations are shown as grey bars labelled with the names of the causative genes. Purple vertical lines indicate the gene positions. Red dashed horizontal lines are significance thresholds. Stars indicate significant QTL.

Extended Data Table 1 mRNA-specific and protein-specific local QTL

Extended Data Table 2 Hotspot regulators of protein expression

Cite this article

Albert, F., Treusch, S., Shockley, A. et al. Genetics of single-cell protein abundance variation in large yeast populations. Nature 506, 494–497 (2014). https://doi.org/10.1038/nature12904

Editorial Summary

Mapping gene variance in yeast

Many DNA variants influence phenotypes by altering the expression level of one or several genes, hence the current interest in the mapping of these expression quantitative trait loci (eQTL). This paper presents a new method for eQTL mapping, designed to overcome the limitations of the existing approaches that focus either on RNA or protein abundance. The new approach relies on GFP (green fluorescent protein) tags to measure single-cell protein abundance in the yeast Saccharomyces cerevisiae. Pooled sequencing is then used to compare allele frequencies across the genome in thousands of individuals with high versus low protein abundance. The authors report close correspondence between loci that influence mRNA and protein abundance for a given gene and identify hotspot locations that influence multiple proteins; the latter have profound effects on the gene regulatory network.