CellTag Indexing: genetic barcode-based sample multiplexing for single-cell genomics

Chuner Guo et al. Genome Biol. 2019.

Abstract

High-throughput single-cell assays increasingly require special consideration in experimental design, sample multiplexing, batch effect removal, and data interpretation. Here, we describe a lentiviral barcode-based multiplexing approach, CellTag Indexing, which uses predefined genetic barcodes that are heritable, enabling cell populations to be tagged, pooled, and tracked over time in the same experimental replicate. We demonstrate the utility of CellTag Indexing by sequencing transcriptomes using a variety of cell types, including long-term tracking of cell engraftment and differentiation in vivo. Together, this presents CellTag Indexing as a broadly applicable genetic multiplexing tool that is complementary with existing single-cell technologies.

Conflict of interest statement

All animal procedures were based on animal care guidelines approved by the Institutional Animal Care and Use Committee at Washington University in St. Louis under protocol number 20150192.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures

Fig. 1

Validation of CellTag Indexing for genetic labeling of biological samples. a Schematic of CellTag Indexing. CellTag barcodes are positioned in the 3′ UTR of a lentiviral GFP construct with an SV40 polyadenylation signal. Barcoded viruses produced from CellTag constructs are used to transduce the cells to be “tagged.” Tagged cells can then be pooled for single-cell profiling. Prior to analysis, cell identity is demultiplexed by our classifier pipeline: a CellTag digital gene expression (DGE) matrix is generated by extracting and counting CellTag sequences for each cell; the DGE is then collapsed by consensus clustering of the detected CellTags; after filtering and log normalization, the DGE is processed by dynamic binarization and classification. Classification results can be visualized as metadata overlaying single-cell transcriptomes projected onto reduced dimensions. b Scatter plot of 18,159 transcriptomes from the 2-tag species mixing experiment, classified by the 10x Genomics Cell Ranger pipeline into 9357 single human cells, 7456 single mouse cells, and 1346 multiplets based on alignment to the custom hg19-mm10 reference genome. c Scatter plot of 18,159 transcriptomes from the 2-tag species mixing experiment, demultiplexed by CellTag Indexing into 7510 human cells (CellTagA), 6397 mouse cells (CellTagB), 1040 multiplets, and 3212 non-determined cells. d Log-normalized CellTag expression of the 4673 transcriptomes from the 5-tag species mixing experiment, demultiplexed into their respective sample identity on the _x_-axis; CellTag barcodes, _y_-axis. e Transcriptomes from the 5-tag species mixing experiment projected onto reduced dimensions by _t_-SNE, visualized with CellTag classification. CellTagC, CellTagD, CellTagE, and CellTagA label HEK293Ts; CellTagB labels MEFs
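The classification step described in panel a can be illustrated with a small sketch. This is not the authors' pipeline: the consensus-clustering collapse is omitted, and the "dynamic binarization" here is a stand-in rule (a per-cell threshold set to a fraction of that cell's strongest log-normalized CellTag signal). The function name `classify_cells` and the parameters `min_total` and `frac` are hypothetical and chosen only for illustration.

```python
import math

def classify_cells(counts, tags, min_total=2, frac=0.8):
    """Assign each cell to one CellTag, 'multiplet', or 'non-determined'.

    counts: dict mapping cell barcode -> dict mapping CellTag -> UMI count.
    tags:   list of CellTag names expected in the pool.
    min_total, frac: hypothetical tuning parameters (assumptions, not
    values taken from the paper).
    """
    calls = {}
    for cell, row in counts.items():
        total = sum(row.get(t, 0) for t in tags)
        # Filtering: too few CellTag reads to make a call.
        if total < min_total:
            calls[cell] = "non-determined"
            continue
        # Log normalization: scale to 10,000 per cell, then log1p.
        norm = {t: math.log1p(row.get(t, 0) / total * 10_000) for t in tags}
        # Stand-in binarization: the threshold adapts to each cell's top signal.
        threshold = frac * max(norm.values())
        detected = [t for t in tags if norm[t] >= threshold]
        # Exactly one detected tag is a singlet; more than one is a multiplet.
        calls[cell] = detected[0] if len(detected) == 1 else "multiplet"
    return calls

toy_counts = {
    "cell1": {"CellTagA": 20, "CellTagB": 1},   # clearly CellTagA
    "cell2": {"CellTagA": 0, "CellTagB": 15},   # clearly CellTagB
    "cell3": {"CellTagA": 10, "CellTagB": 9},   # both tags present
    "cell4": {"CellTagA": 1, "CellTagB": 0},    # too few reads
}
calls = classify_cells(toy_counts, ["CellTagA", "CellTagB"])
```

On this toy input the four cells are called CellTagA, CellTagB, multiplet, and non-determined, mirroring the four outcome classes reported in panel c.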

Fig. 2

CellTag Indexing for long-term tracking of cells demonstrated in a competitive transplant experiment. A Schematic of the experiment: iEPs are generated, enriched into EcadHigh and EcadLow populations by FACS, labeled with CellTagA and CellTagB respectively, pooled in equal proportions, and transplanted into a mouse model of colonic injury. The engrafted colon is then processed for single-nucleus RNA-seq. B Fluorescent microscopic images of the lumen of the engrafted colon, showing patches of GFP+ iEPs. Scale bar, 100 μm. C H&E-stained section of the engrafted colon showing normal intestinal architecture with evidence of epithelial injury. Scale bar, 100 μm. D DAPI-stained section of the engrafted colon showing GFP+ iEPs in the mucosa. Scale bar, 100 μm. E Transcriptomes from three post-engraftment colon tissues sequenced and analyzed, visualized by UMAP, revealing 16 clusters. F Annotation of the 16 clusters into (a) Lgr5− Lrig1+ intestinal stem cells (ISCs), (b) Lgr5+ ISCs, (c) deep crypt secretory cells, (d) endothelial cells, (e) enteric neurons, (f) enterocytes, (g) enteroendocrine cells, (h) fibroblasts, (i) goblet cells, (j) iEPs, (k) immune cells, (l) muscle, (m) Nkain2+ Csmd1+ cells, and (n) Reln+ Prox1+ cells. G Marker expression in annotated cell types

Fig. 3

CellTag Indexing revealed iEP engraftment and transition through an intestinal stem cell fate. a CellTags identified engrafted iEPs enriched in cluster 4 (early engraftment iEPs) and the main intestinal epithelial clusters. b Density heatmap confirms enrichment of CellTagged cells in the early engraftment iEP cluster and the main intestinal epithelial cell clusters. c, d Stacked bar plots of CellTagged cells show enrichment in clusters 0, 1, and 4. e Permutation test of cluster enrichment or depletion for each CellTag in intestinal clusters shows statistically significant enrichment of EcadHigh/CellTagA cells in cluster 0 (Lgr5− Lrig1+ ISCs, p = 4.03 × 10⁻⁵) and cluster 1 (Lgr5+ ISCs, p = 9.83 × 10⁻³). _y_-axis, negative log10 of p value for cluster enrichment, log10 of p value for cluster depletion. Dotted lines correspond to a p value of 0.05. f RNA velocity analysis shows velocity vectors from iEPs towards Lgr5− Lrig1+ ISCs and from the ISC clusters towards the differentiated enterocyte clusters. g Subset of velocity vectors of CellTagged cells confirms transcriptional kinetics of engrafted iEPs in the direction towards intestinal stem cells
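The permutation test in panel e can be sketched as follows. This is a generic one-sided label-shuffling test, written under the assumption that enrichment is measured by the number of cells carrying a given CellTag in a given cluster; the caption does not specify the authors' exact test statistic or number of permutations. The function name `enrichment_p_value` and its parameters are hypothetical.

```python
import random

def enrichment_p_value(clusters, tags, cluster, tag, n_perm=2000, seed=0):
    """One-sided permutation p value for enrichment of `tag` in `cluster`.

    clusters: list of cluster labels, one per cell.
    tags:     list of CellTag labels, one per cell (same order).
    The null distribution is built by shuffling tag labels across cells,
    which keeps cluster sizes and tag totals fixed.
    """
    rng = random.Random(seed)

    def count(labels):
        return sum(1 for c, t in zip(clusters, labels) if c == cluster and t == tag)

    observed = count(tags)
    shuffled = list(tags)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if count(shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Toy data: CellTagA is concentrated in cluster 0, CellTagB in cluster 1.
toy_clusters = [0] * 20 + [1] * 20
toy_tags = ["A"] * 18 + ["B"] * 2 + ["A"] * 2 + ["B"] * 18

p_enriched = enrichment_p_value(toy_clusters, toy_tags, cluster=0, tag="A")
p_not_enriched = enrichment_p_value(toy_clusters, toy_tags, cluster=0, tag="B")
```

A depletion p value could be obtained the same way by counting shuffles with a count at or below the observed one. For plotting as in panel e, enrichment p values would be shown as their negative log10 and compared against the dotted line at p = 0.05.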
