A comparative strategy for single-nucleus and single-cell transcriptomes confirms accuracy in predicted cell-type expression from nuclear RNA

Blue B Lake et al. Sci Rep. 2017.

Abstract

Significant heterogeneities in gene expression among individual cells are typically interrogated using single whole cell approaches. However, tissues that have highly interconnected processes, such as in the brain, present unique challenges. Single-nucleus RNA sequencing (SNS) has emerged as an alternative method of assessing a cell's transcriptome through the use of isolated nuclei. However, studies directly comparing expression data between nuclei and whole cells are lacking. Here, we have characterized nuclear and whole cell transcriptomes in mouse single neurons and provided a normalization strategy to reduce method-specific differences related to the length of genic regions. We confirmed a high concordance between nuclear and whole cell transcriptomes in the expression of cell type and metabolic modeling markers, but less so for a subset of genes associated with mitochondrial respiration. Therefore, our results indicate that single-nucleus transcriptome sequencing provides an effective means to profile cell type expression dynamics in previously inaccessible tissues.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Figure 1

SNS reveals excitatory neuron identity. (a) Overview of the SNS pipeline. S1 mouse cortex was dissociated to single nuclei for NeuN+ and DAPI+ sorting and capture on C1 chips for modified SmartSeq (SmartSeq+) reactions. Inset shows DAPI-positive nuclei in the C1 capture site. (b) Comparison of nuclear data sets with 100 random single S1 cortical or CA1 hippocampal data sets. Top panel: Pearson correlation (r) coefficients for comparison of ERCC TPM values with their input quantities. Bottom panel: proportion of genomic reads mapping to coding sequences (CDS Exons), introns, or untranslated regions (3′ or 5′ UTRs). (c) t-SNE plots showing cluster distribution of hippocampal CA1, cortical S1 cells and cortical S1 nuclei. (d) t-SNE plots as in (c) showing positive expression levels (low – gray; high – blue) of cell type marker genes for oligodendrocytes (Mbp), astrocytes (Aldoc), endothelial cells (Cldn5), mural cells (Acta2), neurons (Thy1), inhibitory neurons (Gad1), excitatory neurons (Slc17a7), and excitatory neuron subtypes Rasgrf2 (layer 2–3), Rorb (layer 4), Plcxd2 (layer 5), FoxP2 (layer 6) and Nr4a2 (layer 6b). (e) t-SNE plots showing expected identity of cluster groupings based on markers in (d) (Table S1; ambiguous data sets defined in Methods are shown in gray).

Figure 2

Nuclear transcriptomes accurately predict cell type. (a) Expression heatmap for cell type marker gene sets (colored bar) across all nuclear and cellular clusters (Fig. 1e). (b) Violin plots showing expression of select cell type marker genes across clusters.

Figure 3

Transcriptional heterogeneity within the measured nuclei and corresponding whole-cell subpopulations. (a) Top four statistically significant aspects of heterogeneity (rows) are shown for the measured nuclei (columns), as identified by PAGODA, labeled according to the key GO category or a gene driving each signature. (b) Expression patterns of genes driving the most prominent aspect, picked up by the synapse-associated GO category, are shown. (c) Expression of key marker genes defining subclasses of cortical neurons is shown. The synapse-distinguished neurons correspond to layer 2–3 (Rasgrf2+) neurons. (d) A t-SNE embedding view, showing placement of the nuclei along the synapse-driven heterogeneity aspects shown in (a), which also separates two major subpopulations. (e–h) Analogous plots for an independent analysis of S1 excitatory whole cell neuron measurements. Expression of the common synapse-associated (b) and marker (c) genes is shown (f and g), and the t-SNE embedding (h) is driven by the synapse-associated aspect shown in (e).

Figure 4

Gene length bias correction. (a) Scatter plots for nuclear and indicated cellular clusters using either all detected genes or the associated cell-type specific gene sets. Pearson correlation coefficients (r) are indicated. (b) Scatter plot indicated in (a) with genes detected higher in cells (red) or detected similarly between cells and nuclei (green) indicated. Inset is a violin plot of Tbr1 expression. (c) Boxplot illustrating significant difference in average gene length between genes detected as up or down in cells over nuclei (Supplementary Fig. S4; Student t test, p = 6.41 × 10^−51; Wilcoxon test, p = 3.77 × 10^−60). (d) The systematic length bias in the whole cell – nucleus comparison was captured by the generalized additive model. The plot shows the interaction of total gene length (genic) and exonic length of a gene (pink – higher M values (log2 fold expression difference between whole cells and nuclei), blue – lower M values; the levels are labeled on the contours). (e) Scatter plot as shown in (b) after gene length correction showing improved Tbr1 detection in nuclear data. (f) Boxplot on corrected expression values showing the absence of gene length bias (Supplementary Fig. S4; Student t test, p = 0.852; Wilcoxon test, p = 0.762).
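
The length correction described in panels (c) to (f) can be illustrated with a minimal sketch. It is not the authors' implementation: the caption states that a generalized additive model was used, whereas the code below swaps in a low-order polynomial surface fitted by least squares, using only NumPy. The function names, the pseudocount, the polynomial degree, and all of the data are assumptions made for illustration; in particular, the direction and size of the simulated bias are arbitrary.

```python
import numpy as np


def m_values(cell_expr, nuc_expr, pseudocount=1.0):
    """Per-gene M value: log2 fold difference, whole cells over nuclei."""
    cell_expr = np.asarray(cell_expr, dtype=float)
    nuc_expr = np.asarray(nuc_expr, dtype=float)
    return np.log2(cell_expr + pseudocount) - np.log2(nuc_expr + pseudocount)


def length_trend(m, genic_len, exonic_len, degree=2):
    """Fit M as a polynomial surface in log10 genic and log10 exonic length.

    A least-squares surface is used here as a simple stand-in for a GAM smooth.
    """
    x = np.log10(np.asarray(genic_len, dtype=float))
    y = np.log10(np.asarray(exonic_len, dtype=float))
    # Design matrix with all terms x^i * y^j where i + j <= degree
    design = np.column_stack([
        x**i * y**j
        for i in range(degree + 1)
        for j in range(degree + 1 - i)
    ])
    coef, *_ = np.linalg.lstsq(design, m, rcond=None)
    return design @ coef


# Synthetic data (assumption): nuclear signal scales with genic length.
rng = np.random.default_rng(0)
n_genes = 2000
genic_len = 10 ** rng.uniform(3, 6, n_genes)               # 1 kb to 1 Mb
exonic_len = genic_len * rng.uniform(0.02, 0.5, n_genes)   # exons are a fraction
base = rng.lognormal(mean=4.0, sigma=1.0, size=n_genes)
cell = base
nuc = base * (genic_len / 1e4) ** 0.3

m = m_values(cell, nuc)
m_corrected = m - length_trend(m, genic_len, exonic_len)

# Correlation of M with log10 genic length, before and after correction
log_len = np.log10(genic_len)
r_before = np.corrcoef(log_len, m)[0, 1]
r_after = np.corrcoef(log_len, m_corrected)[0, 1]
print(f"r before: {r_before:.3f}, r after: {r_after:.3g}")
```

Subtracting the fitted trend leaves residual M values that no longer vary systematically with gene length, which is the property the boxplot in panel (f) checks on the real data.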

Figure 5

Differential transcript abundances between nuclei and whole cells. (a) Top panel: Total number of genes detected (count ≥ 4) from nuclei (* indicates data sets generated from sorted nuclei frozen prior to C1 loading) and whole cell data sets representing S1 excitatory neurons. Lower panel: percentage of gene types detected, showing slightly more antisense transcripts detected in nuclear data and slightly more mitochondrial (Mt) rRNA detected in cellular data (arrow). (b) Heatmap of expression for top differentially detected genes (p < 1 × 10^−20) between cellular and nuclear data sets showing representative GO annotations for genes over-represented in cells. (c) Histogram showing a higher frequency of genes that were better detected in cellular compared to nuclear data sets for S1 excitatory neurons (Supplementary Table S3). (d) Box plot showing significance values for annotations of top (p < 1 × 10^−20) and bottom (p ≥ 1 × 10^−20) differentially detected genes (Biological Process and Cellular Component categories, Supplementary Tables S4–S5). Student t-test p value is indicated: **p = 0.0002.
