Parallel single-cell sequencing links transcriptional and epigenetic heterogeneity
Accession codes
Primary accessions
Gene Expression Omnibus
Referenced accessions
Gene Expression Omnibus
Data deposits
This study reused some data from Smallwood et al. (ref. 3), available in the Gene Expression Omnibus under accession GSE56879.
References
- Shapiro, E., Biezuner, T. & Linnarsson, S. Nat. Rev. Genet. 14, 618–630 (2013).
- Guo, H. et al. Genome Res. 23, 2126–2135 (2013).
- Smallwood, S.A. et al. Nat. Methods 11, 817–820 (2014).
- Farlik, M. et al. Cell Rep. 10, 1386–1397 (2015).
- Levsky, J.M., Shenoy, S.M., Pezo, R.C. & Singer, R.H. Science 297, 836–840 (2002).
- Yan, L. et al. Nat. Struct. Mol. Biol. 20, 1131–1139 (2013).
- Macaulay, I.C. et al. Nat. Methods 12, 519–522 (2015).
- Dey, S.S., Kester, L., Spanjaard, B., Bienko, M. & van Oudenaarden, A. Nat. Biotechnol. 33, 285–289 (2015).
- Schübeler, D. Nature 517, 321–326 (2015).
- Jones, P.A. Nat. Rev. Genet. 13, 484–492 (2012).
- Singer, Z.S. et al. Mol. Cell 55, 319–331 (2014).
- Kalmar, T. et al. PLoS Biol. 7, e1000149 (2009).
- Chambers, I. et al. Nature 450, 1230–1234 (2007).
- Singh, A.M., Hamazaki, T., Hankowski, K.E. & Terada, N. Stem Cells 25, 2534–2542 (2007).
- Torres-Padilla, M.E. & Chambers, I. Development 141, 2173–2181 (2014).
- Ficz, G. et al. Cell Stem Cell 13, 351–359 (2013).
- Klein, A.M. et al. Cell 161, 1187–1201 (2015).
- Kolodziejczyk, A.A. et al. Cell Stem Cell 17, 471–485 (2015).
- Habibi, E. et al. Cell Stem Cell 13, 360–369 (2013).
- Stadler, M.B. et al. Nature 480, 490–495 (2011).
- Lee, H.J., Hore, T.A. & Reik, W. Cell Stem Cell 14, 710–719 (2014).
- Papp, B. & Plath, K. EMBO J. 31, 4255–4257 (2012).
- Whyte, W.A. et al. Cell 153, 307–319 (2013).
- Krueger, F. & Andrews, S.R. Bioinformatics 27, 1571–1572 (2011).
- Wu, T.D. & Nacu, S. Bioinformatics 26, 873–881 (2010).
- Love, M.I., Huber, W. & Anders, S. Genome Biol. 15, 550 (2014).
- Trapnell, C. et al. Nat. Biotechnol. 28, 511–515 (2010).
- Bourgon, R., Gentleman, R. & Huber, W. Proc. Natl. Acad. Sci. USA 107, 9546–9551 (2010).
Acknowledgements
We thank A. Kolodziejczyk and S.A. Teichmann for providing a list of 86 ESC pluripotency and differentiation genes (ref. 18). We thank W. Haerty for his supervision and valuable advice to T.X.H. We thank the Wellcome Trust Sanger Institute sequencing pipeline team for assistance with Illumina sequencing. We thank the members of the Sanger–European Bioinformatics Institute (EBI) Single-Cell Genomics Centre for general advice. W.R. is supported by the UK Biotechnology and Biological Sciences Research Council (BBSRC), the Wellcome Trust and the EU. G.K. is supported by the BBSRC, the UK Medical Research Council (MRC) and the EU. C.P.P. is supported by the Wellcome Trust and the MRC. T.V. is supported by the Wellcome Trust and KU Leuven (SymBioSys, PFV/10/016). H.J.L. is supported by EU Network of Excellence EpiGeneSys. O.S. is supported by the European Molecular Biology Laboratory (EMBL), the Wellcome Trust and the EU.
Author information
Author notes
- Christof Angermueller, Stephen J Clark, Heather J Lee and Iain C Macaulay: These authors contributed equally to this work.
Authors and Affiliations
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, UK
Christof Angermueller, Tim Xiaoming Hu & Oliver Stegle
- Epigenetics Programme, Babraham Institute, Cambridge, UK
Stephen J Clark, Heather J Lee, Sébastien A Smallwood, Gavin Kelsey & Wolf Reik
- Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK
Heather J Lee, Iain C Macaulay, Mabel J Teng, Tim Xiaoming Hu, Chris P Ponting, Thierry Voet & Wolf Reik
- Medical Research Council Functional Genomics Unit, University of Oxford, Oxford, UK
Tim Xiaoming Hu & Chris P Ponting
- Bioinformatics Group, Babraham Institute, Cambridge, UK
Felix Krueger
- Department of Human Genetics, Katholieke Universiteit Leuven, Leuven, Belgium
Thierry Voet
Authors
- Christof Angermueller
- Stephen J Clark
- Heather J Lee
- Iain C Macaulay
- Mabel J Teng
- Tim Xiaoming Hu
- Felix Krueger
- Sébastien A Smallwood
- Chris P Ponting
- Thierry Voet
- Gavin Kelsey
- Oliver Stegle
- Wolf Reik
Contributions
C.A. performed all statistical analyses of the data. H.J.L., I.C.M., S.J.C. and S.A.S. developed the protocol and performed experiments. H.J.L., I.C.M., C.A., S.J.C., O.S., W.R. and C.P.P. interpreted the results. M.J.T. contributed to method development. T.X.H. processed RNA-seq data. F.K. processed BS-seq data. W.R., G.K., I.C.M. and T.V. contributed protocols and reagents. H.J.L., I.C.M., W.R. and T.V. conceived the project. W.R., O.S., T.V. and G.K. jointly supervised the project. O.S., H.J.L., S.J.C., W.R. and I.C.M. wrote the paper with input from all other authors. Names of authors who contributed equally to this work are ordered alphabetically on the first page.
Corresponding authors
Correspondence to Thierry Voet, Gavin Kelsey, Oliver Stegle or Wolf Reik.
Ethics declarations
Competing interests
W.R. is a consultant and shareholder of Cambridge Epigenetix.
Integrated supplementary information
Supplementary Figure 1 Detailed flow chart of the scM&T-seq protocol.
Single cells are collected and lysed before poly-A RNA is captured on magnetic beads and physically separated from DNA. Amplified cDNA is generated from mRNA on beads whilst DNA is bisulfite converted and Illumina sequencing libraries are prepared from both components in parallel.
Supplementary Figure 2 Quality metrics of scRNA-seq data obtained from mouse ESCs profiled using scM&T-seq.
(a,b) Number of genes detected (y-axis) as a function of the expression cutoff (x-axis). In each cell, between 4,000 and 8,000 genes were expressed at TPM > 1 (dashed line at x = 1). High-quality cells generally have about 5,000 genes detectable at the TPM > 1 cutoff, indicating a high level of quality among the 61 serum ESCs (or the 14 2i ESCs). (c,d) Distribution of Pearson correlation coefficients calculated pairwise on the 61 serum ESCs (or the 14 2i ESCs). The observed correlation coefficients tended to lie between 0.7 and 0.99, indicating a high degree of technical consistency in the measured transcriptomes of the cells considered and attesting to the high quality of the scRNA-seq data.
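The curve in panels a and b can be illustrated with a minimal sketch: for each cell, count the genes whose TPM exceeds each cutoff. The function name `genes_detected` and the toy TPM values are hypothetical and are not taken from the study.

```python
def genes_detected(tpm_by_cell, cutoffs):
    """For each cell, count the genes whose TPM is strictly above each cutoff."""
    return {
        cell: [sum(1 for value in tpm if value > cutoff) for cutoff in cutoffs]
        for cell, tpm in tpm_by_cell.items()
    }


# Toy TPM vectors (hypothetical values, one entry per gene).
toy_tpm = {
    "cell_A": [0.0, 0.5, 2.0, 15.0, 120.0],
    "cell_B": [0.0, 1.5, 3.0, 0.2, 40.0],
}

# One count per cutoff, in the same order as the cutoffs.
counts = genes_detected(toy_tpm, cutoffs=[0.1, 1, 10])
```

Plotting each cell's counts against the cutoffs would give one line per cell, with the TPM > 1 cutoff corresponding to the dashed line described in the caption.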
Supplementary Figure 3 Quality metrics of single-cell methylomes in serum ESCs profiled using alternative protocols.
Shown are quality metrics for 20 serum ESCs profiled with the scM&T-seq protocol, compared with 20 serum ESCs profiled with scBS-seq (Smallwood et al. 2014). (a) Read mapping efficiency. (b) Read duplication rate. (c) Genome-wide CpG and CHH methylation rate per cell. (d) Analysis of representation bias for different genomic contexts. (e) FASTQC report of adapter content from one representative single-cell bisulfite library (Read 1 of cell B06). A large proportion of sequenced fragments are concatemers of the primer used in first-strand synthesis, which substantially limits the alignment rates of these libraries. It may be possible to improve mapping efficiencies by reducing oligo concentrations or reaction times, but this is likely to result in reduced genomic coverage.
Supplementary Figure 4 Methylation coverage in different genomic contexts.
Shown is the percentage of genomic contexts of different classes (y-axis) that are covered for an increasing number of minimum cells (x-axis), considering both scBS-seq (Smallwood et al. 2014, green) and scM&T-seq (blue). Note that the total number of serum cells is 20 for scBS-seq and 61 for scM&T-seq.
Supplementary Figure 5 Genome-wide methylation coverage.
Shown is the percentage of genome-wide 10kb, 5kb, and 1kb windows covered (y-axis) by an increasing minimum number of cells (x-axis), for scBS-seq (Smallwood et al. 2014, green) and scM&T-seq (blue). Note that the total number of serum cells is 20 for scBS-seq and 61 for scM&T-seq.
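The coverage curves in Supplementary Figs. 4 and 5 can be sketched as follows: given, for each window or annotated region, the set of cells with at least one covered CpG, compute the fraction of windows covered in at least k cells. The data layout and the name `coverage_curve` are assumptions made for illustration only.

```python
def coverage_curve(cells_per_window, n_cells):
    """Fraction of windows covered in at least k cells, for k = 1..n_cells."""
    n_windows = len(cells_per_window)
    return [
        sum(1 for cells in cells_per_window if len(cells) >= k) / n_windows
        for k in range(1, n_cells + 1)
    ]


# Toy input (hypothetical): each set lists the cells that cover one window.
toy_windows = [{"c1", "c2", "c3"}, {"c1"}, set(), {"c2", "c3"}]
curve = coverage_curve(toy_windows, n_cells=3)
```

Because the two protocols were run on different numbers of serum cells (20 versus 61), curves computed this way are only directly comparable at the same minimum number of cells.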
Supplementary Figure 6 Hierarchical clustering of DNA-methylation profiles generated by scM&T-seq and scBS-seq.
Shown is a joint hierarchical clustering of 61 serum and 16 2i cells profiled using scM&T-seq and of 20 serum and 12 2i ESCs profiled by scBS-seq (Smallwood et al. 2014), together with the corresponding synthetic bulk samples and an independent bulk BS-seq sample from serum ESCs (Ficz et al. 2013). The clustering analysis was performed on gene body methylation of the 500 genes with the largest epigenome heterogeneity.
Supplementary Figure 7 Correlation between single-cell methylomes and the methylome of a bulk cell population.
Shown is a scatter plot relating bulk gene-body methylation (Ficz et al. 2013) on the x-axis to synthetic bulk estimates of gene-body methylation derived using either scBS-seq (Smallwood et al. 2014, green) or scM&T-seq (blue) on the y-axis. Synthetic bulk methylation profiles are derived from averages of the single-cell methylation profiles. The true bulk methylation profile is concordant with both synthetic bulk profiles; the scM&T-seq bulk estimates correlate slightly better (R=0.77) than the scBS-seq bulk estimates (R=0.69).
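A minimal sketch of a synthetic bulk profile, assuming an unweighted per-gene mean over the cells in which the gene is covered (the caption states only that averages are used, so the handling of missing values here is an assumption). The toy methylation rates and the helper names are hypothetical.

```python
import math


def synthetic_bulk(profiles):
    """Per-gene mean methylation across cells, skipping uncovered genes (None)."""
    bulk = []
    for gene_index in range(len(profiles[0])):
        observed = [p[gene_index] for p in profiles if p[gene_index] is not None]
        bulk.append(sum(observed) / len(observed) if observed else None)
    return bulk


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Toy gene-body methylation rates (hypothetical): rows are cells, columns are genes.
cells = [
    [1.0, 0.0, 0.5, None],
    [0.8, 0.2, None, 0.4],
    [0.9, 0.1, 0.7, 0.6],
]
pseudo_bulk = synthetic_bulk(cells)

# Hypothetical "true" bulk values for the same four genes.
true_bulk = [0.85, 0.15, 0.55, 0.5]
r = pearson(pseudo_bulk, true_bulk)
```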
Supplementary Figure 8 Principal-component analysis of gene-body methylation and gene expression in serum-grown ESCs.
Shown are projections onto the first two principal components (left) alongside the percentage of variance explained by individual components (right) for both gene expression levels (a) and gene body methylation (b). Cells are color-coded based on clustering obtained using gene expression values, showing that the methylation principal components partially recapitulate the structure in the expression data.
Supplementary Figure 9 Scatter-plot matrix of principal components from methylation and gene expression profiles.
Shown are scatter plots between individual principal components of gene expression levels (y-axis) and corresponding gene body methylation (x-axis), using 61 serum cells profiled using scM&T-seq. Cells are color coded as in Supplementary Fig. 8. There is a strong correlation between the second principal component of DNA methylation and the corresponding component from gene expression, suggesting shared axes of variation between transcriptome and methylome profiles.
Supplementary Figure 10 Clustering analysis of transcriptome and methylation data from 61 serum ESCs.
Shown are heatmaps for the gene body methylation (left) and gene expression profiles (right) using the 300 most heterogeneous genes (based on gene expression). The order of genes was taken from an individual clustering analysis based on gene methylation whereas cells were clustered separately either using DNA methylation or expression data, showing unlinked clusters (colored clusters). The bar plots in the center show the heterogeneity in DNA methylation (left) and gene expression (right).
Supplementary Figure 11 Bootstrap robustness analysis of the gene-specific correlation analysis.
Shown are the absolute (a) and relative (b) reduction in the number of significant methylation-expression associations for different genomic contexts, as well as the root mean squared error of Pearson’s correlation coefficient (c), when considering either the full dataset or bootstrapped samples for the methylation-RNA correlation analysis. Bootstrap samples were obtained from independent draws of 60%, 70%, or 80% of the total set of cells. As expected, a reduction in the number of analyzed cells resulted in reduced power to detect significant associations (a, b). Overall, only a relatively small number of linkages were affected and the concordance to the full dataset remained high (c).
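The root mean squared error in panel c can be sketched as follows, assuming that each draw selects a fraction of the cells without replacement and that the error is pooled over genes and draws. Both choices are assumptions; the caption does not specify them. The simulated data and all names below are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def subsample_rmse(meth, expr, fraction, n_draws=50, seed=0):
    """RMSE between full-data and subsampled per-gene Pearson correlations."""
    rng = random.Random(seed)
    n_cells = len(next(iter(meth.values())))
    k = max(3, round(fraction * n_cells))
    full = {gene: pearson(meth[gene], expr[gene]) for gene in meth}
    squared_errors = []
    for _ in range(n_draws):
        idx = rng.sample(range(n_cells), k)  # draw cells without replacement
        for gene in meth:
            r = pearson([meth[gene][i] for i in idx], [expr[gene][i] for i in idx])
            squared_errors.append((r - full[gene]) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Simulated toy data: expression is negatively coupled to methylation plus noise.
data_rng = random.Random(1)
meth, expr = {}, {}
for gene in ("gene_a", "gene_b", "gene_c"):
    values = [data_rng.random() for _ in range(20)]
    meth[gene] = values
    expr[gene] = [-2.0 * v + data_rng.gauss(0, 0.5) for v in values]

rmse_by_fraction = {f: subsample_rmse(meth, expr, f) for f in (0.6, 0.7, 0.8, 1.0)}
```

Using all cells (fraction 1.0) reproduces the full-data correlations, so the error is zero up to floating-point rounding; smaller fractions give a nonzero error.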
Supplementary Figure 13 Volcano plots for association tests between DNA-methylation profiles in alternative genomic contexts and gene expression levels.
For each context, shown is the correlation coefficient (Pearson r, x-axis) versus the adjusted p-value (Benjamini Hochberg adjustment; y-axis). The blue horizontal line corresponds to the 10% FDR significance level. Each dot corresponds to a gene and the size to the adjusted p-value of the association test. Genes colored in red correspond to known pluripotency genes (Supplementary Table 5). The vertical orange line denotes the average correlation coefficient across all genes for a given annotation.
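For reference, the Benjamini-Hochberg adjustment named above can be written in a few lines. This is a generic sketch of the standard step-up procedure, not the authors' code, and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values from four association tests.
raw = [0.001, 0.04, 0.03, 0.5]
adjusted = benjamini_hochberg(raw)

# Tests passing a 10% FDR threshold, as marked by the blue line in the plots.
significant = [i for i, p in enumerate(adjusted) if p < 0.1]
```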
Supplementary Figure 14 Comparison of results of cell-specific correlation analysis with known covariates (mean CpG methylation rate).
Supplementary Figure 15 Comparison of cell-specific correlation analysis with known covariates (CpG coverage).
For alternative genomic contexts, shown are scatter plots between cell-specific methylation-expression correlation coefficients and the (technical) CpG coverage in the corresponding cell. The lack of associations suggests that technical factors do not drive the heterogeneity in the coupling between methylation and expression between cells.
About this article
Cite this article
Angermueller, C., Clark, S., Lee, H. et al. Parallel single-cell sequencing links transcriptional and epigenetic heterogeneity. Nat Methods 13, 229–232 (2016). https://doi.org/10.1038/nmeth.3728
- Received: 29 October 2015
- Accepted: 09 December 2015
- Published: 11 January 2016
- Issue Date: March 2016
- DOI: https://doi.org/10.1038/nmeth.3728