Reconstructing lineage hierarchies of the distal lung epithelium using single-cell RNA-seq

Accession codes

Primary accessions

Gene Expression Omnibus

Data deposits

The transcriptome sequencing data for all single cells have been deposited in the Gene Expression Omnibus database under accession number GSE52583.

References

  1. Kim, C. F. B. et al. Identification of bronchioalveolar stem cells in normal lung and lung cancer. Cell 121, 823–835 (2005)
  2. Zemke, A. C. et al. Molecular staging of epithelial maturation using secretory cell-specific genes as markers. Am. J. Respir. Cell Mol. Biol. 40, 340–348 (2009)
  3. Guha, A. et al. Neuroepithelial body microenvironment is a niche for a distinct subset of Clara-like precursors in the developing airways. Proc. Natl Acad. Sci. USA 109, 12592–12597 (2012)
  4. Gonzalez, R. et al. Freshly isolated rat alveolar type I cells, type II cells, and cultured type II cells have distinct molecular phenotypes. Am. J. Physiol. Lung Cell. Mol. Physiol. 288, L179–L189 (2005)
  5. Xu, Y. et al. Transcriptional programs controlling perinatal lung maturation. PLoS ONE 7, e37046 (2012)
  6. Desai, T. J., Brownfield, D. G. & Krasnow, M. A. Alveolar progenitor and stem cells in lung development, renewal and cancer. Nature 507, 190–194 (2014)
  7. Wu, A. R. et al. Quantitative assessment of single-cell RNA-sequencing methods. Nature Methods 11, 41–46 (2013)
  8. Islam, S. et al. Characterization of the single-cell transcriptional landscape by highly multiplex RNA-seq. Genome Res. 21, 1160–1167 (2011)
  9. Islam, S. et al. Highly multiplexed and strand-specific single-cell RNA 5′ end sequencing. Nature Protocols 7, 813–828 (2012)
  10. Shalek, A. K. et al. Single-cell transcriptomics reveals bimodality in expression and splicing in immune cells. Nature 498, 236–240 (2013)
  11. Sasagawa, Y. et al. Quartz-Seq: a highly reproducible and sensitive single-cell RNA-Seq reveals non-genetic gene expression heterogeneity. Genome Biol. 14, R31 (2013)
  12. Liu, C. L., Bernstein, B. E. & Schreiber, S. L. Whole genome amplification by T7-based linear amplification of DNA (TLAD). II. Second-strand synthesis and in vitro transcription. CSH Protocols, http://dx.doi.org/10.1101/pdb.prot5003 (2008)
  13. Hashimshony, T., Wagner, F., Sher, N. & Yanai, I. CEL-Seq: single-cell RNA-Seq by multiplexed linear amplification. Cell Rep. 2, 666–673 (2012)
  14. Ramskold, D. et al. Full-length mRNA-Seq from single-cell levels of RNA and individual circulating tumor cells. Nature Biotechnol. 30, 777–782 (2012)
  15. Tariq, M. A., Kim, H. J., Jejelowo, O. & Pourmand, N. Whole-transcriptome RNAseq analysis from minute amount of total RNA. Nucleic Acids Res. 39, e120 (2011)
  16. Picelli, S. et al. Smart-seq2 for sensitive full-length transcriptome profiling in single cells. Nature Methods 10, 1096–1098 (2013)
  17. Brennecke, P. et al. Accounting for technical noise in single-cell RNA-seq experiments. Nature Methods 10, 1093–1095 (2013)
  18. Tang, F. et al. mRNA-Seq whole-transcriptome analysis of a single cell. Nature Methods 6, 377–382 (2009)
  19. Tang, F. et al. RNA-Seq analysis to capture the transcriptome landscape of a single cell. Nature Protocols 5, 516–535 (2010)
  20. Tang, F. et al. Tracing the derivation of embryonic stem cells from the inner cell mass by single-cell RNA-Seq analysis. Cell Stem Cell 6, 468–478 (2010)
  21. Huang, D. W., Sherman, B. T. & Lempicki, R. A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nature Protocols 4, 44–57 (2009)
  22. Yin, Z. et al. Hop functions downstream of Nkx2.1 and GATA6 to mediate HDAC-dependent negative regulation of pulmonary gene expression. Am. J. Physiol. Lung Cell. Mol. Physiol. 291, L191–L199 (2006)
  23. Sock, E. et al. Gene targeting reveals a widespread role for the high-mobility-group transcription factor Sox11 in tissue remodeling. Mol. Cell. Biol. 24, 6635–6644 (2004)
  24. Wang, X. et al. Gene expression profiling and chromatin immunoprecipitation identify DBN1, SETMAR and HIG2 as direct targets of SOX11 in mantle cell lymphoma. PLoS ONE 5, e14085 (2010)
  25. Trapnell, C., Pachter, L. & Salzberg, S. L. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25, 1105–1111 (2009)
  26. Dalerba, P. et al. Single-cell dissection of transcriptional heterogeneity in human colon tumors. Nature Biotechnol. 29, 1120–1127 (2011)
  27. R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. http://www.R-project.org/
  28. Chapman, H. A. et al. Integrin α6β4 identifies an adult distal lung epithelial population with regenerative potential in mice. J. Clin. Invest. 121, 2855–2862 (2011)
  29. Takeda, N. et al. Interconversion between intestinal stem cell populations in distinct niches. Science 334, 1420–1424 (2011)
  30. Babraham Institute. Babraham Bioinformatics. FASTQC. http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc
  31. Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal 17, 10–12 (2011)
  32. Schmieder, R. & Edwards, R. Quality control and preprocessing of metagenomic datasets. Bioinformatics 27, 863–864 (2011)
  33. Langmead, B., Trapnell, C., Pop, M. & Salzberg, S. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009)
  34. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nature Methods 9, 357–359 (2012)
  35. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009)
  36. Baker, S. C. et al. The External RNA Controls Consortium: a progress report. Nature Methods 2, 731–734 (2005)
  37. Jiang, L. et al. Synthetic spike-in standards for RNA-seq experiments. Genome Res. 21, 1543–1551 (2011)
  38. Zhang, H.-M. et al. AnimalTFDB: a comprehensive animal transcription factor database. Nucleic Acids Res. 40, D144–D149 (2012)
  39. Walker, M. G., Volkmuth, W., Sprinzak, E., Hodgson, D. & Klingler, T. Prediction of gene function by genome-scale expression analysis: prostate cancer-associated genes. Genome Res. 9, 1198–1203 (1999)
  40. Huang, D. W., Sherman, B. T. & Lempicki, R. A. Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res. 37, 1–13 (2009)
  41. Greif, D. M. et al. Radial construction of an arterial wall. Dev. Cell 23, 482–493 (2012)

Acknowledgements

We thank W. Koh and B. Passarelli for help and discussions regarding bioinformatic pipelines and statistical analysis, S. I. Gonzalez for help with immunofluorescence, and J. G. Camp and members of the Krasnow laboratory for critical discussion and reading of the manuscript. This work was supported by National Heart, Lung, and Blood Institute (NHLBI) U01HL099995 Progenitor Cell Biology Consortium Grant (B.T., M.A.K., S.R.Q.), by National Institutes of Health (NIH) T32HD007249 (D.G.B.), by a Parker B. Francis Foundation Fellowship and NIH 5KO8HL084095 Award (T.J.D.), and by NIH grant U01HL099999 (A.R.W., N.F.N.). M.A.K. and S.R.Q. are investigators of the Howard Hughes Medical Institute.

Author information

Author notes

  1. Barbara Treutlein and Doug G. Brownfield: These authors contributed equally to this work.

Authors and Affiliations

  1. Departments of Bioengineering and Applied Physics, Stanford University School of Medicine and Howard Hughes Medical Institute, Stanford, 94305, California, USA
    Barbara Treutlein, Angela R. Wu, Norma F. Neff, Gary L. Mantalas & Stephen R. Quake
  2. Department of Biochemistry, Stanford University School of Medicine and Howard Hughes Medical Institute, Stanford, 94305, California, USA
    Doug G. Brownfield, F. Hernan Espinoza & Mark A. Krasnow
  3. Department of Internal Medicine, Division of Pulmonary and Critical Care Medicine, Stanford University School of Medicine, Stanford, 94305, California, USA
    Tushar J. Desai

Authors

  1. Barbara Treutlein
  2. Doug G. Brownfield
  3. Angela R. Wu
  4. Norma F. Neff
  5. Gary L. Mantalas
  6. F. Hernan Espinoza
  7. Tushar J. Desai
  8. Mark A. Krasnow
  9. Stephen R. Quake

Contributions

B.T., D.G.B., T.J.D., M.A.K. and S.R.Q. conceived the study and designed the experiments. B.T., D.G.B., F.H.E., A.R.W., N.F.N., G.L.M. and T.J.D. performed the experiments. B.T., D.G.B., A.R.W., F.H.E., T.J.D., M.A.K. and S.R.Q. analysed the data and/or provided intellectual guidance in their interpretation. B.T., D.G.B., F.H.E., T.J.D., M.A.K. and S.R.Q. wrote the paper.

Corresponding authors

Correspondence to Tushar J. Desai, Mark A. Krasnow or Stephen R. Quake.

Ethics declarations

Competing interests

S.R.Q. is a founder and consultant for Fluidigm Corporation.

Extended data figures and tables

Extended Data Figure 1 Schematic illustration of the process of sacculation.

a, Schematic illustration of morphological and molecular changes of the distal airways during development. Cell differentiation progresses in a directional manner from the bronchio-alveolar junction (proximal) to the distal tip (distal) of each terminal airway; progenitor cells therefore persist the longest at the tips. Ciliated (green) and Clara (blue) cells mature first, followed by the differentiation of flat alveolar type 1 (AT1, orange) and cuboidal type 2 (AT2, red) cells from cuboidal alveolar progenitors during sacculation (E16–18.5), when distal airway tubules widen as nascent AT1 cells flatten to form the gas-exchange surface. b, Micrographs of alveolar (E18.5, postnatal 3 days (PN3d)) and bronchiolar (PN3d) sections of a mouse lung co-stained for Clara (Scgb1a1, green) and ciliated (Foxj1, red) cell markers as well as AT1 (Pdpn, green) and AT2 (Sftpc, red) specific markers. Progenitor cells at the tips of sacculating alveoli are detected by an overlap of AT1 and AT2 specific markers. Newly forming alveolar sacs are marked by asterisks.

Extended Data Figure 2 Single-cell transcriptomics analysis workflow.

a, Workflow of single-cell transcriptomics analysis of mouse lung epithelial cells. A single captured lung epithelial cell stained with Alexa488 for EpCAM (green) is indicated by a red arrow. b, Single lung epithelial cells captured in microfluidic chips with capture sites designed to trap cells with a diameter of 10–17 μm (medium, left) or 17–25 μm (large, right). Cells were stained for viability with Calcein AM. Even cells captured by the large chip did not exceed a diameter of ∼15 μm, indicating that the medium-sized chips are sufficient for comprehensively profiling distal mouse lung epithelial cells.

Extended Data Figure 3 Assessment of required sequencing depth, technical and biological variation, dynamic range and reproducibility of single-cell RNA-seq data of 80 single distal lung epithelial cells at E18.5.

a, Saturation analysis reveals the sequencing depth required for the detection of most genes expressed by single cells. To detect most expressed genes, single-cell RNA-seq libraries have to be sequenced only to a depth of about 10^6 reads, whereas libraries of bulk samples have to be sequenced more deeply. The number of genes detected in the ensemble of all single cells (synthetic bulk) is comparable to the number of genes detected in the true bulk experiment. Each point on the saturation curve was generated by randomly selecting a number of raw reads from each sample library (bulk, 200-cell bulk library; single cell, single-cell RNA-seq libraries of 80 lung epithelial cells; single-cell ensemble, bioinformatically pooled single-cell libraries) and then using the same alignment pipeline to call genes with a mean FPKM of more than 1. Each point represents four replicate subsamplings; error bars represent s.e.m. b, Technical noise and biological variation in single-cell RNA-seq data. Relationship between mean expression level and coefficient of variation for 10,946 genes in single embryonic lung epithelial cells. Several genes show strong biological variation (blue): they show higher variability than the average noise at a given average gene expression. Housekeeping genes are shown in yellow. c, Average detected transcript levels (mean FPKM, log2) for 92 ERCC RNA spike-ins as a function of provided number of molecules per lysis reaction for each of the three independent single-cell RNA-seq experiments performed at E18.5. Linear regression fits through data points are shown. The length of each ERCC RNA spike-in transcript is encoded in the size and colour of the data points. No particular bias towards the detection of shorter versus longer transcripts is observed. The method shows single-transcript sensitivity as well as a dynamic range of approximately six orders of magnitude, in agreement with a previous study evaluating microfluidic single-cell RNA-seq (ref. 7).
d, e, Correlation between transcript levels of a 200-cell population and median transcript levels of single cells of the same pool of embryonic lungs (d), and transcript levels of two single AT2 cells (e). r, Pearson correlation coefficients. f, g, Correlation between transcript levels of all genes detected in the single lung and the pooled lung experiment (f) and between transcript levels of all genes detected in the two independent experiments on pooled embryonic lungs (g). Pearson correlation coefficients r are given.
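
The subsampling procedure described for panel a can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes each read has already been assigned to a gene, and it substitutes a simple read-count threshold for the alignment step and the FPKM > 1 cutoff. The function names, the `min_reads` parameter and the toy library are hypothetical.

```python
import random
import statistics

def genes_detected(reads, depth, min_reads=1, rng=random):
    """Subsample `depth` reads without replacement; count genes with >= min_reads reads."""
    sample = rng.sample(reads, depth)
    counts = {}
    for gene in sample:
        counts[gene] = counts.get(gene, 0) + 1
    return sum(1 for n in counts.values() if n >= min_reads)

def saturation_point(reads, depth, replicates=4, seed=0):
    """Mean and s.e.m. of detected genes over replicate subsamplings at one depth."""
    rng = random.Random(seed)
    values = [genes_detected(reads, depth, rng=rng) for _ in range(replicates)]
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / len(values) ** 0.5
    return mean, sem

# Made-up library: two abundant genes plus 300 genes seen once each.
toy_reads = ["Sftpc"] * 500 + ["Pdpn"] * 200 + [f"gene{i}" for i in range(300)]

for depth in (10, 100, 1000):
    mean, sem = saturation_point(toy_reads, depth)
    print(depth, round(mean, 1), round(sem, 2))
```

Plotting the mean against depth, with the s.e.m. as error bars, gives one saturation curve per library; four replicates per point matches the number stated in the legend.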

Extended Data Figure 4 Lineage-specific genes identified by single-cell transcriptome analysis allow functional description of individual distal lung epithelial cell populations.

a, Results of gene ontology (GO) and KEGG pathway enrichment analyses for distal lung epithelial cell types based on lineage-specific genes identified by single-cell RNA-seq of 80 E18.5 distal lung epithelial cells (Supplementary Data). b, c, Correlograms visualizing correlation of single-cell gene expression profiles between transcription factors (b) or receptors/ligands (c) and the major canonical marker genes for bronchiolar and alveolar lineages (AT1: Pdpn; AT2: Sftpc; Clara: Scgb1a1; ciliated: Foxj1). The colour bar denotes the Pearson correlation coefficient from −1 (blue, anticorrelated genes) to 1 (green, positively correlated genes). d, Validation of previously unknown marker genes by single-cell multiplexed qPCR on 74 single cells isolated from the distal mouse lung epithelium at E18.5. Lineage-specific expression of seven new marker genes is shown by clustering with known markers for respective lineages (AT2, red, previously unknown: Cftr, Cebpa, Sftpd and Id2; AT1, orange, previously unknown: Vegfa; ciliated, green, previously unknown: Itgb4 and Top2a; Clara, blue). e, Validation of Hopx expression in AT1 cells. A lung section from a transgenic Hopx>GFP adult mouse (Hopx-Cre-ERT2 +/− ;mTmG +/tg) was co-stained for AT1 marker Pdpn. Maximum-intensity projections of confocal z stacks show that AT1 cells expressing the membrane-localized GFP reporter (green) also express Pdpn (white). Scale bar, 50 μm. f, Hierarchical clustering of 46 transgenically labelled mature Sftpc+ AT2 cells, isolated by FACS from adult mouse lung. Most genes identified as AT2 lineage-specific from single-cell transcriptomes at E18.5 are transcribed also by mature AT2 cells. In contrast, no or low expression is observed in mature AT2 cells for the genes specific to the other alveolar or bronchiolar lineages as identified from single-cell RNA-seq data at E18.5.
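
The correlograms in panels b and c are built from Pearson correlation coefficients between a candidate gene and each canonical marker across single cells. The sketch below shows that calculation in plain Python. The expression values are invented for illustration, and the helper names `pearson` and `best_marker` are hypothetical rather than taken from the authors' scripts.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length expression vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def best_marker(candidate, markers):
    """Name of the marker whose profile correlates most strongly with the candidate."""
    return max(markers, key=lambda name: pearson(candidate, markers[name]))

# Made-up expression levels in six cells for the four canonical markers.
markers = {
    "Pdpn":    [8, 7, 0, 1, 0, 0],   # AT1
    "Sftpc":   [0, 1, 9, 8, 0, 0],   # AT2
    "Scgb1a1": [0, 0, 0, 0, 9, 1],   # Clara
    "Foxj1":   [0, 0, 0, 0, 0, 8],   # ciliated
}
candidate = [7, 8, 1, 0, 0, 0]       # hypothetical candidate gene

for name, profile in markers.items():
    print(name, round(pearson(candidate, profile), 2))
print("best match:", best_marker(candidate, markers))
```

A full correlogram repeats this for every transcription factor or receptor/ligand and colours each cell by the coefficient, from −1 to 1.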

Extended Data Figure 5 Molecular profiles distinguish developmental intermediates during the differentiation of AT1 and AT2 cells from a common BP.

a, Hierarchical clustering of multiplexed qPCR gene expression data for 33 single cells from E16.5 lung epithelium (CD45−/EpCAM+) suggests the presence at this time point of two major cell lineages, bronchiolar (cyan) and alveolar (brown) progenitors. Note that alveolar progenitors express a subset of both AT1 and AT2 marker genes. b, PCA of multiplexed qPCR data of lung epithelial cells at E16.5 identifies two gene groups in contrast to three observed at E18.5 (Fig. 1c). AT1 and AT2 specific marker genes do not segregate into distinct populations at E16.5. c, Hierarchical clustering of multiplexed qPCR gene expression data for 74 single embryonic lung epithelial cells (CD45−/EpCAM+) at E18.5 shows multiple distinct cell populations consistent with RNA-sequencing data at this time point: BP, AT1, AT2, Clara and ciliated cells. Each row represents a single cell and each column a gene. Cells are clustered on the basis of expression of marker genes for alveolar and bronchiolar lineages (AT2: Abca3, Sftpb, Muc1, Lyz2, Sftpc; AT1: Aqp5, Pdpn, Ager; ciliated: Foxj1; Clara: Scgb1a1). d, PCA of multiplexed qPCR data replicates gene families found by single-cell RNA-seq at E18.5. Gene groups were characterized on the basis of differential correlation with the first two principal components. e, Developmental sequence of AT1 (orange) and AT2 (red) specification from a common BP (brown). Two and three maturation intermediates were identified in the specification process of AT2 and AT1 cell types, respectively, on the basis of the expression of known and previously unknown marker genes for both alveolar lineages measured by single-cell RNA-seq (Fig. 3). Transcription factors and receptors/ligands shown here were found to be expressed in BP cells and subsequently restricted to one of the alveolar lineages. Arrows, differentiation pathway; grey braces, change in transcript level of respective genes with tip pointing towards lower expression. 
f–i, Protein-level heterogeneity of alveolar epithelial markers during sacculation. f, Immunofluorescent micrograph from an E19.5 lung with mature AT1 and AT2 cells stained for their respective markers (Pdpn (white) and Ager (red) for AT1; Sftpc (green) for AT2). BPs are positive for all three markers. Cells in intermediate states are observed, such as early AT1 (Pdpn and Ager positive, Sftpc low) and early AT2 cells (Sftpc positive, and either Pdpn positive/Ager low or Pdpn low/Ager negative). Scale bar, 10 μm. g, Markers of late AT2 cells are expressed heterogeneously at E18.5. Immunofluorescence micrograph of a lung from a Lyz2–enhanced green fluorescent protein (eGFP) transgenic mouse, in which within the epithelium (E-cadherin, blue) only a subset of Sftpc (green)-positive AT2 cells are Lyz2 (red)-positive. Scale bar, 20 μm. h, Immunofluorescent staining of E18.5 lung tissue for Lamp3 (red) shows heterogeneous expression of Lamp3 in Sftpc-positive cells (green): proximal cells show higher Lamp3 expression than distal cells. Blue, DAPI-stained nuclei. Scale bar, 20 μm. i, Immunofluorescent staining of E18.5 lung tissue for S100a6 (red) shows heterogeneous expression of the secreted protein S100a6 in Pdpn-positive cells (green). Blue, DAPI-stained nuclei. Scale bar, 20 μm.

Extended Data Figure 6 Following Sftpc-expressing cells throughout their life cycle.

a, Whole-mount in situ hybridizations of embryonic mouse lungs at E11.5, E13.5 and E14.5 using probes against Sftpc mRNA show expression of Sftpc specific to the tips of the epithelial tree branches. Moreover, variations in signal intensity indicate heterogeneity in the level of Sftpc expression across cells, which is in agreement with our single-cell RNA-seq data of Sftpc+ cells at E14.5 (see Fig. 4a). b, Diagram of the different transcriptional states in the specification of an AT2 cell as identified by single-cell RNA-seq of Sftpc+ cells from distal mouse lung epithelium of embryonic (E14.5, E16.5 and E18.5) and adult mice. The cell passes through early (A) and late (B) early progenitor (EP) states into a BP state before either taking the AT1 fate (nascent AT1), or following the AT2 pathway to become a nascent and finally a mature AT2 cell. Groups of genes turning on/up or off/down during the individual transitions are shown above and below each arrow, respectively (Fig. 4a and Supplementary Data). Whereas EP and BP cells are double positive for Sftpc and Pdpn, nascent and mature AT2 cells express Sftpc but turn off expression of the AT1 marker Pdpn. The developmental time points at which the individual cell states were detected, and their putative locations, are shown.

Extended Data Figure 7 The number of unique genes and the total number of transcripts expressed by a single cell strongly correlates with its differentiation state.

a, Saturation analysis of single-cell RNA-seq data of lung epithelial cells at different embryonic and adult time points (E14.5, E18.5 and adult AT2) reveals that the number of unique genes expressed by single lung epithelial cells decreases with progressing differentiation state. Distal lung epithelial cells at E14.5 express more than 6,000 genes, whereas cells at E18.5 express about 3,000 genes, and mature AT2 cells only about 2,000 genes. Each point on the saturation curve was generated by randomly selecting a number of raw reads from each sample library and then using the same alignment pipeline to call genes with a mean FPKM of more than 1. Each point represents four replicate subsamplings. Error bars represent s.e.m. All libraries were sequenced to a depth of at least 2 × 10^6 reads. b, Single-cell RNA-seq reveals that the total number of transcripts expressed by single cells decreases with increasing differentiation state of the cell. The number of transcripts per cell was calculated from the FPKM values of all genes in each cell, using the correlation between number of transcripts of exogenous spike-in mRNA sequences and their respective measured mean FPKM values (example calibration curves are shown in Extended Data Fig. 3c for three replicates at E18.5). Area-normalized density distributions are shown for embryonic cells at E14.5 (45 cells), E16.5 (27 cells) and E18.5 (80 cells), and for 46 Sftpc+ adult AT2 cells. The number of transcripts is highest in lung epithelial progenitor cells at E16.5 and E14.5 and decreases in cells at E18.5 and even further in mature AT2 cells. Note that single-cell RNA-seq libraries for E14.5, E18.5 and adult AT2 cells were sequenced to a depth of (2–6) × 10^6 reads, whereas the libraries for cells at E16.5 were sequenced to a lower depth of 100,000–550,000 reads. c, Calibration of Ct values measured by single-cell qPCR to number of molecules.
Average detected transcript levels (log2Ex = Ct,LoD − Ct, with Ct,LoD = 22) for six ERCC RNA spike-ins as a function of provided number of molecules per lysis reaction for each of three independent single-cell qPCR experiments performed on embryonic (E16.5, two replicates; red and green) and adult mouse lung (adult AT2, one replicate; blue). Linear regression fits through data points and corresponding equations are shown and were used to convert Ct values measured by qPCR into numbers of transcripts. d, Single-cell qPCR confirms the presence of a higher number of transcripts in lung epithelial progenitor cells in comparison with fully differentiated alveolar epithelial cells. The median number of transcripts per cell as detected by single-cell RNA-seq (y axis) and by single-cell multiplexed qPCR of 90 genes (x axis) is shown for distal lung epithelial cells at E16.5 (qPCR, 33 cells; RNA-seq, 27 cells) and mature AT2 cells (qPCR, 48 cells; RNA-seq, 46 cells).
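
The calibration in panel c can be sketched as follows. The only quantities taken from the legend are the definition log2Ex = Ct,LoD − Ct and the value Ct,LoD = 22. Everything else is an assumption made for illustration: the spike-in values are invented, the fit is assumed to be linear in log2(molecules), and the function names are hypothetical.

```python
import math

CT_LOD = 22  # limit-of-detection Ct given in the legend

def log2_ex(ct):
    """Expression on a log2 scale relative to the limit of detection."""
    return CT_LOD - ct

def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx

def ct_to_molecules(ct, slope, intercept):
    """Invert the calibration line to estimate the number of molecules from a Ct value."""
    return 2 ** ((log2_ex(ct) - intercept) / slope)

# Made-up spike-in series: molecules added per reaction and the Ct measured for each.
spike_molecules = [1, 4, 16, 64, 256, 1024]
spike_ct = [21.0, 19.0, 17.0, 15.0, 13.0, 11.0]

slope, intercept = fit_line([math.log2(m) for m in spike_molecules],
                            [log2_ex(ct) for ct in spike_ct])
print("slope:", slope, "intercept:", intercept)
print("molecules at Ct 14:", ct_to_molecules(14.0, slope, intercept))
```

The same idea, with mean FPKM in place of log2Ex, underlies the transcript-number estimates in panel b.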

Extended Data Figure 8 Transcriptional states during the early lifetime of the Clara cell lineage identified by single-cell RNA-seq of Scgb3a2+ cells at E14.5, E16.5 and E18.5.

a, Hierarchical clustering of 24 Scgb3a2-positive cells from distal mouse lung epithelium at different embryonic time points (E14.5, E16.5 and E18.5) based on the genes with highest principal-component loadings in an unbiased PCA analysis of all cells and all genes (shown in c). Cells are shown in rows, genes in columns. Cells cluster into three major groups. Scgb3a2 and Scgb1a1 transcript levels are shown in bars on the right. Whereas canonical Clara cell marker Scgb1a1 is first detected at E18.5, Scgb3a2 is detected as early as E14.5, suggesting that it is an early Clara cell marker. b, Gene Ontology (GO) enrichments of the three different gene clusters as well as transcription factors (TFs) belonging to the different groups of genes. c, PCA analysis of all Scgb3a2-positive cells and all genes identifies three different cell populations that were identified as bronchiolar progenitor as well as Clara and ciliated cells.
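
Selecting genes by their principal-component loadings, as described for panel a, can be illustrated with a small pure-Python sketch. It computes only the first principal component, using power iteration on the centred cell-by-gene matrix, and ranks genes by absolute loading. The matrix values, gene labels and function names are hypothetical, and the authors' R scripts may well proceed differently.

```python
import math

def first_pc_loadings(matrix, iterations=200):
    """Unit-length gene loadings of the first principal component (rows = cells)."""
    n_cells, n_genes = len(matrix), len(matrix[0])
    means = [sum(row[j] for row in matrix) / n_cells for j in range(n_genes)]
    centered = [[row[j] - means[j] for j in range(n_genes)] for row in matrix]
    # Asymmetric start vector, to avoid starting orthogonal to the leading component.
    vec = [float(j + 1) for j in range(n_genes)]
    for _ in range(iterations):
        # Project cells onto the current direction, then map back to gene space.
        scores = [sum(c * v for c, v in zip(row, vec)) for row in centered]
        new = [sum(centered[i][j] * scores[i] for i in range(n_cells))
               for j in range(n_genes)]
        norm = math.sqrt(sum(x * x for x in new))
        vec = [x / norm for x in new]
    return vec

def top_loading_genes(genes, loadings, k):
    """The k gene names with the largest absolute loadings."""
    ranked = sorted(zip(genes, loadings), key=lambda gl: abs(gl[1]), reverse=True)
    return [g for g, _ in ranked[:k]]

# Made-up expression matrix: six cells (rows) by four genes (columns).
genes = ["geneA", "geneB", "geneC", "geneD"]
cells = [
    [9, 1, 5, 2], [8, 0, 5, 3], [9, 1, 5, 2],   # geneA-high cells
    [1, 8, 5, 3], [0, 9, 5, 2], [1, 8, 5, 3],   # geneB-high cells
]

loadings = first_pc_loadings(cells)
print(top_loading_genes(genes, loadings, 2))
```

The selected genes would then be passed to a hierarchical clustering step, which is omitted here to keep the sketch short.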

Supplementary information

Supplementary Tables

This file contains Supplementary Tables 1-2. (PDF 214 kb)

Supplementary Data 1

This file contains alignment statistics for all single cells with sequenced transcriptome. (XLSX 61 kb)

Supplementary Data 2

This zipped file contains the R scripts used to analyze single-cell RNA-seq data, as .txt files. (ZIP 21 kb)

Supplementary Data 3

This file contains single-cell RNA-seq expression data (log2(FPKM) values) for all 80 lung epithelial cells at E18.5 together with the putative cell type of each cell in a .txt file. (TXT 6859 kb)

Supplementary Data 4

This file contains a list of putative novel marker genes for bronchiolar and alveolar cell types identified by single-cell transcriptome analysis, together with correlation coefficients and p-values (Methods) as well as information regarding previous detection of each of these genes in cell types in the lung, available literature or known mouse knock-out phenotypes. (XLSX 154 kb)

Supplementary Data 5

This file contains the gene ontology and KEGG pathway enrichment analysis results of cell-type-specific genes for AT1, AT2, Clara and ciliated cells as identified in the single-cell RNA-seq data at E18.5, in an Excel file. (XLSX 190 kb)

Supplementary Data 6

This file contains the genes identified by PCA to describe the variation in the data set of all Sftpc+ cells across four different time points, together with gene ontology enrichment analysis results for the different groups of genes. (XLSX 569 kb)

Cite this article

Treutlein, B., Brownfield, D., Wu, A. et al. Reconstructing lineage hierarchies of the distal lung epithelium using single-cell RNA-seq. Nature 509, 371–375 (2014). https://doi.org/10.1038/nature13173