Review

Understanding development and stem cells using single cell-based analyses of gene expression

Pavithra Kumar et al. Development. 2017.

Abstract

In recent years, genome-wide profiling approaches have begun to uncover the molecular programs that drive developmental processes. In particular, technical advances that enable genome-wide profiling of thousands of individual cells have provided the tantalizing prospect of cataloging cell type diversity and developmental dynamics in a quantitative and comprehensive manner. Here, we review how single-cell RNA sequencing has provided key insights into mammalian developmental and stem cell biology, emphasizing the analytical approaches that are specific to studying gene expression in single cells.

Keywords: Computational biology; Gene regulatory networks; Pseudotime; RNA-Seq; Single cell; Stem cells.

© 2017. Published by The Company of Biologists Ltd.

Conflict of interest statement

The authors declare no competing or financial interests.

Figures

Fig. 1.

The growth of single cell genome-wide profiling techniques. A surge in scRNA-Seq applications can be observed. The cumulative number of cells that have been subjected to scRNA-Seq is shown, separated by species. Landmark studies are highlighted. Tang et al. (2009), Tang (2010b), Islam (2011), Ramskold (2012) and Hashimshony (2012) are the first five scRNA-Seq studies. They introduced the major varieties of scRNA-Seq: Tang protocol, STRT-Seq, CEL-Seq and Smart-Seq. Yan (2013) and Xue (2013) leverage scRNA-Seq to explore the dynamics of human zygotic genome activation. Picelli (2013) introduces Smart-Seq2 with increased sensitivity. Macosko (2015) and Klein (2015) introduce high-throughput, low-cost droplet-based methods that have vastly increased the number of cells that can be sequenced.

Fig. 2.

Typical approaches for analyzing scRNA-Seq datasets. Several types of analysis are commonly applied to scRNA-Seq datasets. (A) When trying to identify cell types, dimension reduction techniques such as independent component analysis, principal component analysis, t-distributed stochastic neighbor embedding, ZIFA (Pierson and Yau, 2015) or weighted gene co-expression network analysis (Langfelder and Horvath, 2008) are first used to project high-dimensional data into a smaller number of dimensions to ease visual evaluation and interpretation. Clusters of similar cells can be identified using generally applicable methods, such as Gaussian mixture modeling (Fraley and Raftery, 2002) or K-means clustering, or methods devised specifically for single cell data, such as StemID (Grün et al., 2016), SCUBA, SNN-Cliq (Xu and Su, 2015), Destiny (Angerer et al., 2015) or BackSpin (Zeisel et al., 2015). Clusters can then be annotated based on domain-specific knowledge of the expression of a few genes, or automatically based on gene set enrichment. Finally, specific genes that are differentially expressed between clusters can be identified using scRNA-Seq-specific methods such as SCDE (Kharchenko et al., 2014) and MAST (Finak et al., 2015). (B) Most pseudotime analyses (which place each cell on a statistically derived axis that represents progression along a process, such as developmental time) start by performing dimension reduction. They then determine trajectories through the reduced dimensionality data; some algorithms identify bifurcation points and generate a distinct trajectory for each branch. The trajectories can then be used to order single cells along the process and to identify candidate regulators of stage transitions, for example, by finding stage-specific transcription factors (TF1-TF5). (C) One of the major drawbacks of scRNA-Seq is the loss of spatial context information when cells are dissociated and/or isolated. Spatial reconstruction methods attempt to ameliorate this issue by leveraging prior knowledge of landmark gene expression. Typically, localized expression of select genes is generated from in situ hybridization. Spatial reconstruction algorithms then compare scRNA-Seq profiles to discretized in situ hybridization profiles, and cells are placed in silico in the anatomical region with a matching profile. Machine-learning approaches can be used to estimate the expression of landmark genes to overcome the noisy nature of scRNA-Seq data.
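
The cell type identification workflow in panel A (dimension reduction followed by clustering) can be illustrated with a minimal sketch. It is not taken from the review or from any of the tools cited above: the function names `pca`, `kmeans` and `simulate_counts` are hypothetical, the data are simulated Poisson counts for two made-up cell types, and plain PCA with K-means stands in for the more specialized methods listed in the caption.

```python
import numpy as np


def pca(X, n_components=2):
    """Project the rows of X (cells x genes) onto the top principal components."""
    Xc = X - X.mean(axis=0)                      # center each gene
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T              # cells x n_components


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's K-means; returns one cluster label per row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance from every cell to every center, then nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def simulate_counts(n_per_type=50, n_genes=200, seed=0):
    """Simulated counts (an assumption, not real data): two cell types,
    each with its own block of 20 up-regulated marker genes."""
    rng = np.random.default_rng(seed)
    base = rng.gamma(2.0, 2.0, size=n_genes)     # baseline expression rates
    rates_a = base.copy()
    rates_a[:20] *= 8                            # markers of type 0
    rates_b = base.copy()
    rates_b[20:40] *= 8                          # markers of type 1
    counts = np.vstack([
        rng.poisson(rates_a, size=(n_per_type, n_genes)),
        rng.poisson(rates_b, size=(n_per_type, n_genes)),
    ])
    truth = np.repeat([0, 1], n_per_type)
    return counts, truth


counts, truth = simulate_counts()
log_expr = np.log1p(counts)          # log-transform to stabilize variance
embedding = pca(log_expr, n_components=2)
labels = kmeans(embedding, k=2)
```

Here `embedding` holds the two-dimensional coordinates that would be plotted for visual inspection, and `labels` holds the cluster assignment per cell. In a real analysis the clusters would then be annotated from known marker genes and tested for differential expression, as the caption describes.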

Fig. 3.

The ‘Janus’ progenitor state. scRNA-Seq has enabled the identification of embryonic progenitors that simultaneously express genes that were previously suspected of being lineage specific. (A) PCA of scRNA-Seq profiles of 198 developing murine lung cells has identified a cluster that expresses markers for both AT1 and AT2 cells, corroborating single-cell qPCR data from E16.5 alveolar progenitors. (B) scRNA-Seq profiles of E11.5 metanephric mesenchyme have identified cells that co-express Foxd1 and Six2, which mark stromal-committing cells and nephron-committing cells, respectively. (C) A binary cell fate decision between the macrophage lineage and the neutrophil lineage was unveiled when bipotent progenitors were shown to co-express Irf8 and Gfi1, which regulate macrophage and neutrophil specification, respectively.

References

    1. Achim K., Pettit J.-B., Saraiva L. R., Gavriouchkina D., Larsson T., Arendt D. and Marioni J. C. (2015). High-throughput spatial mapping of single-cell RNA-seq data to tissue of origin. Nat. Biotechnol. 33, 503-509. doi:10.1038/nbt.3209
    2. Angerer P., Haghverdi L., Büttner M., Theis F. J., Marr C. and Buettner F. (2015). Destiny: diffusion maps for large-scale single-cell data in R. Bioinformatics 32, 1241-1243. doi:10.1093/bioinformatics/btv715
    3. Angermueller C., Clark S. J., Lee H. J., Macaulay I. C., Teng M. J., Hu T. X., Krueger F., Smallwood S. A., Ponting C. P., Voet T. et al. (2016). Parallel single-cell sequencing links transcriptional and epigenetic heterogeneity. Nat. Methods 13, 229-232. doi:10.1038/nmeth.3728
    4. Bacher R. and Kendziorski C. (2016). Design and computational analysis of single-cell RNA-sequencing experiments. Genome Biol. 17, 63. doi:10.1186/s13059-016-0927-y
    5. Bandura D. R., Baranov V. I., Ornatsky O. I., Antonov A., Kinach R., Lou X., Pavlov S., Vorobiev S., Dick J. E. and Tanner S. D. (2009). Mass cytometry: technique for real time single cell multitarget immunoassay based on inductively coupled plasma time-of-flight mass spectrometry. Anal. Chem. 81, 6813-6822. doi:10.1021/ac901049w
