Robust lineage reconstruction from high-dimensional single-cell data

Gregory Giecold et al. Nucleic Acids Res. 2016.

Abstract

Single-cell gene expression data provide invaluable resources for systematic characterization of cellular hierarchy in multi-cellular organisms. However, cell lineage reconstruction is still often associated with significant uncertainty due to technological constraints. Such uncertainties have not been taken into account in current methods. We present ECLAIR (Ensemble Cell Lineage Analysis with Improved Robustness), a novel computational method for the statistical inference of cell lineage relationships from single-cell gene expression data. ECLAIR uses an ensemble approach to improve the robustness of lineage predictions, and provides a quantitative estimate of the uncertainty of lineage branchings. We show that the application of ECLAIR to published datasets successfully reconstructs known lineage relationships and significantly improves the robustness of predictions. ECLAIR is a powerful bioinformatics tool for single-cell data analysis. It can be used for robust lineage reconstruction with a quantitative estimate of prediction accuracy.

© The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

Figures

Figure 1.

Overview of the ECLAIR method. First, multiple subsamples are randomly drawn from the data. Each subsample is divided into cell clusters with similar gene expression patterns, and a minimum spanning tree is constructed to connect the cell clusters. Next, a consensus clustering is constructed by aggregating information from all cell clusters. Finally, a lineage tree connecting the consensus clusters (CC) is constructed by aggregating information from the tree ensemble.
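The steps in this caption can be illustrated with a small Python sketch. It is not the ECLAIR implementation: the caption does not name the clustering or aggregation algorithms, so the sketch assumes k-means with maximin seeding for the per-subsample clustering, a co-association matrix with average-linkage clustering for the consensus step, and ensemble-averaged hop counts for the final tree. The function names and all parameters (`n_runs`, `frac`, `k`, `n_consensus`) are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.cluster.vq import kmeans2
from scipy.sparse.csgraph import minimum_spanning_tree, shortest_path
from scipy.spatial.distance import cdist, squareform


def farthest_first(X, k, rng):
    """Pick k spread-out rows of X as initial centroids (maximin seeding)."""
    chosen = [int(rng.integers(len(X)))]
    d = np.linalg.norm(X - X[chosen[0]], axis=1)
    for _ in range(k - 1):
        chosen.append(int(d.argmax()))
        d = np.minimum(d, np.linalg.norm(X - X[chosen[-1]], axis=1))
    return X[chosen]


def ensemble_lineage_sketch(X, n_runs=20, frac=0.8, k=4, n_consensus=3, seed=0):
    """Return (consensus labels per cell, tree edges between consensus clusters)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    together = np.zeros((n, n))  # runs in which a cell pair shared a cluster
    sampled = np.zeros((n, n))   # runs in which both cells were subsampled
    hop_sum = np.zeros((n, n))   # summed tree hops between the pair's clusters

    for _ in range(n_runs):
        # 1. Draw a random subsample of cells.
        idx = rng.choice(n, size=int(frac * n), replace=False)
        sub = X[idx]
        # 2. Cluster the subsample (k-means is an assumed stand-in).
        centroids, labels = kmeans2(sub, farthest_first(sub, k, rng), minit="matrix")
        # 3. Connect the cluster centroids with a minimum spanning tree.
        mst = minimum_spanning_tree(cdist(centroids, centroids))
        hops = shortest_path(mst, directed=False, unweighted=True)
        pair = np.ix_(idx, idx)
        sampled[pair] += 1
        together[pair] += labels[:, None] == labels[None, :]
        hop_sum[pair] += hops[np.ix_(labels, labels)]

    seen = np.maximum(sampled, 1)  # avoid division by zero for unseen pairs
    # 4. Consensus clustering from co-association frequencies.
    dist = 1.0 - together / seen
    np.fill_diagonal(dist, 0.0)
    Z = linkage(squareform(dist, checks=False), method="average")
    consensus = fcluster(Z, t=n_consensus, criterion="maxclust")  # labels 1..m

    # 5. Tree over consensus clusters, weighted by ensemble-averaged hop counts.
    mean_hops = hop_sum / seen
    m = int(consensus.max())
    W = np.zeros((m, m))
    for a in range(m):
        for b in range(a + 1, m):
            W[a, b] = mean_hops[np.ix_(consensus == a + 1, consensus == b + 1)].mean()
    rows, cols = minimum_spanning_tree(W).nonzero()
    edges = sorted((int(min(r, c)) + 1, int(max(r, c)) + 1) for r, c in zip(rows, cols))
    return consensus, edges
```

Calling `ensemble_lineage_sketch(X)` on a cells-by-features array returns one consensus label per cell and a list of edges between consensus clusters. Averaging over subsamples is what makes the result less sensitive to any single clustering run, which is the point of the ensemble design.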

Figure 2.

ECLAIR correctly reconstructs the lineage tree in the early mouse embryo. (A) The lineage tree constructed by SCUBA, based on temporal information in the data. This tree has been experimentally validated. (B) The lineage tree constructed by ECLAIR, without using temporal information. The size of each node is proportional to the number of cells in the corresponding cluster. The color of each node indicates the gene expression pattern associated with the corresponding cell cluster. In (B), the edge width is inversely proportional to the estimated dispersion rate.

Figure 3.

ECLAIR identifies a lineage tree in blood-forming potential cells. The pie charts represent the cell-type composition of each node using the same color scheme as in (19). The numbers indicate the labels for each CC.

Figure 4.

Analysis of the single-cell RNAseq data in (20). (A) The lineage tree inferred by ECLAIR. The numbers represent the label for each CC. (B) Heatmap showing the expression pattern of 33 lineage markers.

Figure 5.

Comparison of the reproducibility between ECLAIR (A and B) and SPADE (C and D). Each heatmap shows the probability density of the cell-pair path length estimated using two trees obtained from the same method. (A) Two independent runs of ECLAIR on the same training set. (B) Two independent runs of ECLAIR on different training datasets. (C) Two independent runs of SPADE on the same training set. (D) Two independent runs of SPADE on different training datasets.
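As a rough illustration of how such a heatmap could be computed, the Python sketch below assumes that the path length between two cells is the number of edges separating the tree nodes that contain them; the caption does not state the exact definition. The function names and the input layout (an edge list plus one node index per cell) are hypothetical.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import shortest_path


def cell_pair_path_lengths(edges, n_nodes, cell_to_node):
    """Hop count between the tree nodes of every unordered cell pair."""
    rows, cols = zip(*edges)
    adj = coo_matrix((np.ones(len(edges)), (rows, cols)), shape=(n_nodes, n_nodes))
    hops = shortest_path(adj.tocsr(), directed=False, unweighted=True)
    cell_to_node = np.asarray(cell_to_node)
    i, j = np.triu_indices(len(cell_to_node), k=1)  # each pair once, i < j
    return hops[cell_to_node[i], cell_to_node[j]]


def path_length_density(len_a, len_b):
    """Joint density of path lengths from two trees, one bin per integer length."""
    top = int(max(len_a.max(), len_b.max())) + 1
    bins = np.arange(top + 1) - 0.5  # bin edges centered on 0, 1, 2, ...
    density, _, _ = np.histogram2d(len_a, len_b, bins=bins, density=True)
    return density
```

If two runs of a method place the same cells at the same tree distances, the density concentrates on the diagonal; off-diagonal mass indicates pairs whose separation changed between runs.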

Figure 6.

Correlation between the intra-ensemble and inter-ensemble dispersion rates.

References

    1. Sandberg R. Entering the era of single-cell transcriptomics in biology and medicine. Nat. Methods. 2014;11:22–24.
    2. Saadatpour A., Lai S., Guo G., Yuan G.C. Single-cell analysis in cancer genomics. Trends Genet. 2015;31:576–586.
    3. Stegle O., Teichmann S.A., Marioni J.C. Computational and analytical challenges in single-cell transcriptomics. Nat. Rev. Genet. 2015;16:133–145.
    4. Qiu P., Simonds E.F., Bendall S.C., Gibbs K.D., Jr, Bruggner R.V., Linderman M.D., Sachs K., Nolan G.P., Plevritis S.K. Extracting a cellular hierarchy from high-dimensional cytometry data with SPADE. Nat. Biotechnol. 2011;29:886–891.
    5. Treutlein B., Brownfield D.G., Wu A.R., Neff N.F., Mantalas G.L., Espinoza F.H., Desai T.J., Krasnow M.A., Quake S.R. Reconstructing lineage hierarchies of the distal lung epithelium using single-cell RNA-seq. Nature. 2014;509:371–375.
