xCell: digitally portraying the tissue cellular heterogeneity landscape

Dvir Aran et al. Genome Biol. 2017.

Abstract

Tissues are complex milieus consisting of numerous cell types. Several recent methods have attempted to enumerate cell subsets from transcriptomes. However, the available methods have used limited sources for training and give only a partial portrayal of the full cellular landscape. Here we present xCell, a novel gene signature-based method, and use it to infer 64 immune and stromal cell types. We harmonized 1822 pure human cell type transcriptomes from various sources, employed a curve fitting approach for linear comparison of cell types, and introduced a novel spillover compensation technique for separating them. Using extensive in silico analyses and comparison to cytometry immunophenotyping, we show that xCell outperforms other methods. xCell is available at http://xCell.ucsf.edu/.

Conflict of interest statement

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures

Fig. 1

xCell study design. a A summary of the data sources used in the study to generate the gene signatures, showing the number of pure cell types and number of samples curated from them. b Our compendium of 64 human cell type gene signatures grouped into five cell type families. c The xCell pipeline. Using the data sources and based on different thresholds, we derived gene signatures for 64 cell types. Of this collection of 6573 signatures, we chose the 489 most reliable signatures, three for each cell type from each data source where available. The raw score is then the average single-sample GSEA (ssGSEA) score of all signatures corresponding to the cell type. Using simulations of gene expression for each cell type, we derived a function to transform the non-linear association between the scores to a linear scale. Using the simulations, we also derived the dependencies between cell type scores and applied a spillover compensation method to adjust the scores
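The three scoring steps in panel c (averaging signature scores, transforming to a linear scale, and compensating for spillover) can be illustrated with a minimal Python sketch. This is not the xCell implementation: the function names, the power-law form of the transform, its exponent, and the two-cell-type spillover matrix are all assumptions chosen for illustration, and the enrichment scores are taken as given rather than computed by ssGSEA.

```python
from statistics import mean


def raw_score(signature_scores):
    """Average the enrichment scores of all signatures for one cell type."""
    return mean(signature_scores)


def linearize(score, power):
    """Map a raw score to an approximately linear scale.

    The power-law form is an assumption for illustration; negative
    scores are clipped to zero before the transform.
    """
    return max(score, 0.0) ** power


def compensate_pair(observed, spill):
    """Undo spillover between two cell types.

    spill[i][j] is the (hypothetical) share of cell type j's signal that
    appears in the score of cell type i. Solves spill * true = observed
    with Cramer's rule and clips negative results to zero.
    """
    (a, b), (c, d) = spill
    det = a * d - b * c
    if det == 0:
        raise ValueError("spillover matrix is singular")
    x = (observed[0] * d - b * observed[1]) / det
    y = (a * observed[1] - c * observed[0]) / det
    return [max(x, 0.0), max(y, 0.0)]


# Hypothetical example: three signature scores for one cell type.
score = linearize(raw_score([0.2, 0.4, 0.6]), power=0.5)

# Hypothetical spillover between two related cell types.
spill = [[1.0, 0.3],
         [0.1, 1.0]]
adjusted = compensate_pair([0.56, 0.25], spill)
```

With more than two cell types the same idea becomes a linear system over the full spillover matrix, which would normally be solved with a numerical library rather than by hand.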

Fig. 2

Evaluation of the performance of xCell using simulated mixtures. a An overview of adjusted scores for 43 cell types in 259 purified cell type samples from the Blueprint and ENCODE data sources (other data sources are in Additional file 2: Figure S4). Most signatures clearly distinguish the corresponding cell type from all other cell types. b A simulation analysis using GSE60424 as the data source [26], which was not used in the development of xCell. This data source contains 114 RNA-seq samples from six major immune cell types. Left: Pearson correlation coefficients using our method before spillover adjustment and after the adjustment. Dependencies between CD4+ T cells, CD8+ T cells, and NK cells were greatly reduced; spillover from monocytes to neutrophils was also removed. Right: Comparison of the correlation coefficients across the different methods. The first column corresponds to xCell’s predictions of the underlying abundances of the cell types in the simulations (both color and pie chart correspond to average Pearson coefficients). Bindea, Charoentong, Palmer, Rooney, and Tirosh represent sets of signatures for cell types from the corresponding manuscripts. Newman refers to the inferences produced using CIBERSORT on the simulations. xCell outperformed the other methods in 17 of 18 comparisons. c Comparison of the correlation coefficients across the different methods based on 18 simulations generated using the left-out testing samples. Here rows correspond to methods and columns show the average Pearson coefficient for the corresponding cell type across the simulations. Independent simulations are available in Additional file 2: Figure S6. xCell outperformed the other methods in 64 of 67 comparisons
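The evaluation described in this caption, scoring simulated mixtures and correlating the scores with the known cell type abundances, can be sketched as follows. The two expression profiles, the marker genes, the noise level, and the marker-average score are hypothetical stand-ins, not the signatures or simulations used in the study.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def mix(profiles, fractions):
    """Gene-wise weighted sum of pure cell type expression profiles."""
    n_genes = len(profiles[0])
    return [sum(f * p[g] for f, p in zip(fractions, profiles))
            for g in range(n_genes)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical pure profiles over four genes; genes 0 and 1 mark type A.
profile_a = [10.0, 8.0, 1.0, 1.0]
profile_b = [1.0, 1.0, 9.0, 7.0]

true_a, score_a = [], []
for _ in range(50):
    f = rng.uniform(0.0, 0.5)                      # true fraction of type A
    mixture = mix([profile_a, profile_b], [f, 1.0 - f])
    noisy = [v + rng.gauss(0.0, 0.2) for v in mixture]
    true_a.append(f)
    score_a.append((noisy[0] + noisy[1]) / 2.0)    # marker-average score

r = pearson(true_a, score_a)
print(f"Pearson r between true fraction and score: {r:.3f}")
```

Repeating this for each cell type and each scoring method gives a table of correlation coefficients of the kind compared in panels b and c.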

Fig. 3

Comparison of digital dissection methods with flow cytometry counts. Left: Scatter plots of CyTOF fractions in PBMCs vs. cell type scores from whole blood of 61 samples from SDY311 (top) and 104 samples from SDY420 (bottom). Only the top correlating cell types in each study are shown. Right: Correlation coefficients produced by our method compared to other methods. Only cell types with abundance of at least 1% on average, as measured by CyTOF, are shown. Non-significant correlations (p value ≥ 0.05) are marked with a gray “x”

Fig. 4

Cell type enrichment analysis in tumors. a Average scores for nine cell types across 24 cancer types from TCGA (The Cancer Genome Atlas). Scores were normalized across rows. Signatures were chosen such that they are the cell of origin of a cancer type or the most significant signature of the cancer type compared to all others. b t-SNE (t-Distributed Stochastic Neighbor Embedding) plot of 8875 primary cancer samples from TCGA and TARGET colored by cancer type. The t-SNE plot was generated using the enrichment scores of 48 non-epithelial, non-stem cell, and non-cell type-specific scores. Many of the cancer types create distinct clusters, emphasizing the important role of the tumor microenvironment in characterizing tumors

References

    1. Galon J, Costes A, Sanchez-Cabo F, Kirilovsky A, Mlecnik B, Lagorce-Pagès C, et al. Type, density, and location of immune cells within human colorectal tumors predict clinical outcome. Science. 2006;313:1960–4. doi: 10.1126/science.1129139.
    2. Hanahan D, Coussens LM. Accessories to the crime: functions of cells recruited to the tumor microenvironment. Cancer Cell. 2012;21:309–22. doi: 10.1016/j.ccr.2012.02.022.
    3. Gentles AJ, Newman AM, Liu CL, Bratman SV, Feng W, Kim D, et al. The prognostic landscape of genes and infiltrating immune cells across human cancers. Nat Med. 2015;21:938–45. doi: 10.1038/nm.3909.
    4. Abbas AR, Wolslegel K, Seshasayee D, Modrusan Z, Clark HF. Deconvolution of blood microarray data identifies cellular activation patterns in systemic lupus erythematosus. PLoS One. 2009;4:e6098.
    5. Shen-Orr SS, Gaujoux R. Computational deconvolution: extracting cell type-specific information from heterogeneous samples. Curr Opin Immunol. 2013;25:571–8. doi: 10.1016/j.coi.2013.09.015.
