MULTI-seq: sample multiplexing for single-cell RNA sequencing using lipid-tagged indices

Nat Methods. 2019 Jul;16(7):619-626.

doi: 10.1038/s41592-019-0433-8. Epub 2019 Jun 17.

Christopher S McGinnis, David M Patterson, Juliane Winkler, Daniel N Conrad, Marco Y Hein, Vasudha Srivastava, Jennifer L Hu, Lyndsay M Murrow, Jonathan S Weissman, Zena Werb, Eric D Chow, Zev J Gartner

Christopher S McGinnis et al. Nat Methods. 2019 Jul.

Abstract

Sample multiplexing facilitates scRNA-seq by reducing costs and identifying artifacts such as cell doublets. However, universal and scalable sample barcoding strategies have not been described. We therefore developed MULTI-seq: multiplexing using lipid-tagged indices for single-cell and single-nucleus RNA sequencing. MULTI-seq reagents can barcode any cell type or nucleus from any species with an accessible plasma membrane. The method involves minimal sample processing, thereby preserving cell viability and endogenous gene expression patterns. When cells are classified into sample groups using MULTI-seq barcode abundances, data quality is improved through doublet identification and recovery of cells with low RNA content that would otherwise be discarded by standard quality-control workflows. We use MULTI-seq to track the dynamics of T-cell activation, perform a 96-plex perturbation experiment with primary human mammary epithelial cells and multiplex cryopreserved tumors and metastatic sites isolated from a patient-derived xenograft mouse model of triple-negative breast cancer.
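The classification step described above, assigning cells to sample groups from barcode abundances, can be illustrated with a minimal sketch. This is not the authors' classification algorithm: the thresholding rule (a fold change over each barcode's median count), the function name `classify_cells`, and the parameters `min_fold` and `pseudocount` are assumptions made for illustration. The median is used as a background estimate, which is only reasonable when each barcode labels a minority of the pooled cells.

```python
from statistics import median


def classify_cells(barcode_counts, min_fold=5.0, pseudocount=1.0):
    """Assign each cell to a sample group from its barcode UMI counts.

    barcode_counts: dict mapping cell ID -> list of UMI counts,
    one entry per sample barcode (same order for every cell).
    A barcode is called positive in a cell when its count is at least
    `min_fold` times that barcode's background level (hypothetical rule).
    """
    cells = list(barcode_counts)
    n_barcodes = len(barcode_counts[cells[0]])

    # Background per barcode: median count across all cells plus a pseudocount.
    background = [
        median(barcode_counts[c][b] for c in cells) + pseudocount
        for b in range(n_barcodes)
    ]

    calls = {}
    for c in cells:
        positive = [
            b for b in range(n_barcodes)
            if barcode_counts[c][b] >= min_fold * background[b]
        ]
        if len(positive) == 1:
            calls[c] = f"sample_{positive[0]}"   # singlet
        elif len(positive) > 1:
            calls[c] = "doublet"                 # more than one barcode detected
        else:
            calls[c] = "negative"                # no barcode above background
    return calls
```

Cells positive for exactly one barcode are treated as singlets, cells positive for several as doublets, and cells positive for none as unclassified; a production workflow would estimate thresholds from the count distributions rather than use a fixed fold change.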

Conflict of interest statement

Z.J.G., E.D.C., D.M.P., and C.S.M. have filed patent applications related to the MULTI-seq barcoding method. The contents of this manuscript are solely the responsibility of the authors and do not necessarily represent the official views of the National Institutes of Health.

Figures

Figure 1: MULTI-seq demultiplexes cell types, culture conditions, and time points for single-cell and single-nucleus RNA sequencing.

(A) Diagram of the anchor/co-anchor LMO and CMO scaffolds (black) with hybridized sample barcode oligonucleotide (red). LMOs and CMOs are distinguished by their unique lipophilic moieties (e.g., lignoceric acid, palmitic acid, or cholesterol). (B) Schematic overview of a proof-of-concept single-cell RNA sequencing experiment using MULTI-seq. Three samples (HEKs and HMECs with and without TGF-β stimulation) were barcoded with either LMOs or CMOs and sequenced alongside unlabeled controls. Cells were pooled together prior to scRNA-seq. Next-generation sequencing produces two UMI count matrices corresponding to gene expression and barcode abundances. (C) Cell type annotations for LMO-labeled cells demonstrate separation between HEKs (pink), MEPs (cyan), and LEPs (dark teal) in gene expression space (see Fig. S2A). Ambiguous cells positive for multiple marker genes are displayed in grey. n = 6,186 MULTI-seq barcoded cells. (D) MULTI-seq sample classifications for HEKs (dark red), unstimulated HMECs (green), and TGF-β-stimulated HMECs (blue) match cell state annotations. Cells classified as doublets (black) predominantly overlap with ambiguously annotated cells. n = 6,186 MULTI-seq barcoded cells. (E) TGF-β-stimulated HMECs (blue) exhibited elevated TGFBI expression relative to unstimulated HMECs (green). *** = Wilcoxon rank sum test (two-sided), p ≤ 10⁻¹⁶. n = 1,950 MULTI-seq barcoded HMECs. Data are represented as mean ± SEM. (F) Single-nucleus MULTI-seq sample classification proportions for each cell type identified by clustering in gene expression space (see Fig. S2E–G). n = 5,894 MULTI-seq barcoded nuclei. (G) MULTI-seq sample classifications illuminate temporal gene expression patterns in Jurkat cells following activation with ionomycin and PMA for varying amounts of time. Time-point centroids in gene expression space are denoted with larger circles. n = 3,709 Jurkat nuclei. (H) Violin plots of gene expression marking different stages of Jurkat cell activation. n = 3,709 Jurkat nuclei.
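Panel E reports a two-sided Wilcoxon rank sum test. As a rough sketch of how such a test compares expression between two groups of cells, the function below uses the large-sample normal approximation with a tie correction and no continuity correction. The function name `rank_sum_test` is hypothetical, and the source does not state which implementation or software the authors used.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test via the normal approximation.

    Returns (z, p). Ties receive average ranks and the variance is
    tie-corrected. Assumes the pooled values are not all identical.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool both groups, remembering which group each value came from.
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)

    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0      # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t ** 3 - t            # contribution of this tie group
        i = j

    # Rank sum of the first group and its null mean and variance.
    w = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    mean_w = n1 * (n + 1) / 2.0
    var_w = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))

    z = (w - mean_w) / math.sqrt(var_w)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail probability
    return z, p
```

With thousands of cells per group, as in this panel, the normal approximation is the usual choice; for very small groups an exact test would be more appropriate.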

Figure 2: MULTI-seq barcoding of multiplexed HMEC culture conditions

(A) Barcode UMI abundances (left) and doublet classifications (right) mapped onto barcode space. MULTI-seq barcode #3 is used as a representative example. Doublets localize to the peripheries of sample groups in large-scale sample multiplexing experiments. n = 25,166 cells. (B) Cell state annotations demonstrate separation between MEPs (cyan) and LEPs (dark teal) in gene expression space (left, see Fig. S5A). Ambiguous cells positive for multiple marker genes are displayed in grey. MULTI-seq classifications grouped by culture composition (right), e.g., LEP-alone (blue), MEP-alone (green), and both cell types together (dark red), match cell state annotations. The discordant region where annotated MEPs are classified as doublets by MULTI-seq is indicated with arrows. n = 25,166 cells. (C) MULTI-seq doublet classifications (left) and computational predictions produced by DoubletFinder (right) largely overlap in gene expression space. The discordant region where DoubletFinder-defined doublets are classified as singlets by MULTI-seq is indicated with arrows. n = 25,166 cells. (D) MEP co-culture induces LEP proliferation and TGF-β signaling. Clusters corresponding to resting (black) and proliferative (green) LEPs are identifiable in gene expression space (Fig. S5B). Projecting sample classification densities onto gene expression space for co-cultured LEPs (dark red, top left) and LEPs cultured alone (blue, top right) illustrates that co-cultured LEPs are enriched in the proliferative state (table, bottom left). Co-cultured LEPs also express more TGFBI than LEPs cultured alone. Each point represents an average of LEPs grouped according to growth factor condition. *** = Wilcoxon rank sum test (two-sided), p = 3.1×10⁻⁶. n = 32 signaling molecule condition groups. Data are represented as mean ± SEM. (E) Hierarchical clustering and heat map analysis of resting LEPs grouped by treatment. Emphasized genes are known EGFR signaling targets. RNA UMI abundances are scaled from 0–1 for each gene. Values correspond to the average expression within each signaling molecule treatment group. Dendrogram labels: E = EGF, W = WNT4, A = AREG, I = IGF-1, R = RANKL, C = Control.
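The per-gene 0–1 scaling mentioned for panel E can be sketched as min-max scaling of group averages. This assumes the scaling is applied to the treatment-group means, one gene at a time; the caption does not specify the order of averaging and scaling, and the function name `scale_genes_0_1` is hypothetical.

```python
def scale_genes_0_1(group_means):
    """Min-max scale each gene across treatment groups to the range 0-1.

    group_means: dict mapping gene -> dict mapping group label -> mean
    expression of that gene within the group.
    """
    scaled = {}
    for gene, by_group in group_means.items():
        lo = min(by_group.values())
        hi = max(by_group.values())
        span = hi - lo
        scaled[gene] = {
            # A gene with identical means in every group is set to 0.
            group: (value - lo) / span if span > 0 else 0.0
            for group, value in by_group.items()
        }
    return scaled
```

Scaling each gene independently puts highly and lowly expressed genes on the same color scale in a heat map, so that patterns across treatment groups remain comparable.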

Figure 3: PDX sample multiplexing demonstrates low-RNA cell detection, reveals immune cell proportional shifts and classical monocyte heterogeneity in the progressively metastatic lung.

(A) Schematic overview of PDX experiment. (B) MULTI-seq sample classifications (WT, early, mid, late tumor progression) mapped onto barcode space. Replicate tissues are denoted as ‘A’ or ‘B’. n = 10,427 cells. (C) MULTI-seq classifications facilitate low-RNA and low-quality cell deconvolution. CellRanger discards cell barcodes with low RNA UMI counts (red dotted line). Gene expression profiles for classified low-RNA cells reflect established immune cell types (top right, see Fig. S6F). Unclassified low-RNA cells resemble low-quality single-cell transcriptomes (bottom right, see Table S4). n = 2,580 (classified), 583 (unclassified) cells. (D) Cell state annotations (top) and tumor stages (bottom) for lung immune cells in gene expression space. Mono. = monocyte, C = classical, NC = non-classical, Mac. = macrophage, DC = dendritic cell, pDC = plasmacytoid DC. Cells with undeterminable annotations are displayed in grey. n = 5,965 cells. (E) Statistically significant shifts in lung immune cell type proportions for each tumor stage relative to WT. Two-proportion z-test with Bonferroni multiple comparisons adjustment, * = 0.05 > p > 10⁻¹⁰; ** = 10⁻¹⁰ > p > 10⁻²⁰; *** = p < 10⁻²⁰. n = 44 tumor-stage/cell type groups. Statistically insignificant proportional shifts are omitted. (F) Subsetted classical monocyte gene expression space overlaid with sample classification densities corresponding to tumor stage. Inset illustrates heterogeneity within late-stage classical monocytes characterized by differential expression of Thbs1 and Cd14. n = 2,496 (all), 1,087 (inset) cells.
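The statistics in panel E, a two-proportion z-test followed by a Bonferroni adjustment, can be sketched as follows. The star thresholds are taken from the caption; the pooled-variance form of the z-test and the function names `two_proportion_z_test`, `bonferroni`, and `stars` are assumptions made for illustration, not the authors' code.

```python
import math


def two_proportion_z_test(k1, n1, k2, n2):
    """Two-sided z-test for equality of two proportions k1/n1 and k2/n2.

    Uses the pooled proportion to estimate the standard error.
    Returns (z, p).
    """
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail probability
    return z, p


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def stars(p_adj):
    """Map an adjusted p-value to the significance marks in the caption."""
    if p_adj < 1e-20:
        return "***"
    if p_adj < 1e-10:
        return "**"
    if p_adj < 0.05:
        return "*"
    return ""  # not significant; such shifts are omitted from the panel
```

Here `k1` and `k2` would be the numbers of cells of a given type at a tumor stage and in WT, and `n1` and `n2` the corresponding total cell counts, with the adjustment applied across all tumor-stage/cell type comparisons.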
