Data exploration, quality control and testing in single-cell qPCR-based gene expression experiments
Andrew McDavid et al. Bioinformatics. 2013.
Abstract
Motivation: Cell populations are never truly homogeneous; individual cells exist in biochemical states that define functional differences between them. New technology based on microfluidic arrays combined with multiplexed quantitative polymerase chain reactions (qPCR) now enables high-throughput single-cell gene expression measurement, allowing assessment of cellular heterogeneity. However, few analytic tools have been developed specifically for the statistical and analytical challenges of single-cell qPCR data.
Results: We present a statistical framework for the exploration, quality control and analysis of single-cell gene expression data from microfluidic arrays. We assess accuracy and within-sample heterogeneity of single-cell expression and develop quality control criteria to filter unreliable cell measurements. We propose a statistical model accounting for the fact that genes at the single-cell level can be on (and a continuous expression measure is recorded) or dichotomously off (and the recorded expression is zero). Based on this model, we derive a combined likelihood ratio test for differential expression that incorporates both the discrete and continuous components. Using an experiment that examines treatment-specific changes in expression, we show that this combined test is more powerful than either the continuous or dichotomous component in isolation, or a t-test on the zero-inflated data. Although developed for measurements from a specific platform (Fluidigm), these tools are generalizable to other multi-parametric measures over large numbers of events.
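The combined test described above can be illustrated with a minimal sketch. It is not the SingleCellAssay implementation: the function names are hypothetical, positive values are assumed to already be on a log-like scale, the continuous part assumes a common variance across the two groups, and the summed statistic is referred to a chi-square distribution with 2 degrees of freedom (whose survival function is exp(-x/2)).

```python
import math

def _binom_loglik(k, n, p):
    """Bernoulli log-likelihood for k expressing cells out of n (0*log 0 taken as 0)."""
    ll = 0.0
    if k > 0:
        ll += k * math.log(p)
    if n - k > 0:
        ll += (n - k) * math.log(1.0 - p)
    return ll

def bernoulli_lrt(group0, group1):
    """Discrete component: LRT for equal frequency of expression (value > 0)."""
    n0, n1 = len(group0), len(group1)
    k0 = sum(1 for v in group0 if v > 0)
    k1 = sum(1 for v in group1 if v > 0)
    pooled = (k0 + k1) / (n0 + n1)
    alt = _binom_loglik(k0, n0, k0 / n0) + _binom_loglik(k1, n1, k1 / n1)
    null = _binom_loglik(k0, n0, pooled) + _binom_loglik(k1, n1, pooled)
    return max(0.0, 2.0 * (alt - null))

def normal_lrt(group0, group1):
    """Continuous component: LRT for equal means of the positive values only."""
    pos0 = [v for v in group0 if v > 0]
    pos1 = [v for v in group1 if v > 0]
    if not pos0 or not pos1:
        # Assumption of this sketch: a group with no expressing cells
        # contributes no continuous evidence.
        return 0.0
    m = len(pos0) + len(pos1)
    mean0 = sum(pos0) / len(pos0)
    mean1 = sum(pos1) / len(pos1)
    grand = (sum(pos0) + sum(pos1)) / m
    ss_alt = sum((v - mean0) ** 2 for v in pos0) + sum((v - mean1) ** 2 for v in pos1)
    ss_null = sum((v - grand) ** 2 for v in pos0 + pos1)
    if ss_alt <= 0.0:
        # Degenerate case: no within-group variation.
        return math.inf if ss_null > 0.0 else 0.0
    return max(0.0, m * math.log(ss_null / ss_alt))

def combined_lrt(group0, group1):
    """Sum both components; p-value from the chi-square(2) survival function."""
    stat = bernoulli_lrt(group0, group1) + normal_lrt(group0, group1)
    return stat, math.exp(-stat / 2.0)

# Made-up expression values for one gene; zeros mean "not detected".
control = [0, 0, 0, 0, 12.1, 11.4, 0, 12.8]
treated = [14.2, 13.5, 0, 15.1, 14.8, 13.9, 0, 14.4]
stat, p_value = combined_lrt(control, treated)
print(f"combined LRT statistic = {stat:.2f}, p = {p_value:.4g}")
```

In this sketch a gene can reach significance through a shift in the proportion of expressing cells, a shift in the mean of the expressing cells, or both, which is the intuition behind combining the two components.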
Availability: All results presented here were obtained using the SingleCellAssay R package available on GitHub (http://github.com/RGLab/SingleCellAssay).
Figures
Fig. 1.
Histogram and theoretical (normal) distribution of expression values for single-cell (left, light gray) and 100-cell experiments (right, dark gray). Genes FASLG, IFN-γ, BIRC3 and CD69 are depicted. The frequency of expression of each gene in the single-cell experiments is printed above each histogram. The mean of the 100-cell and single-cell experiments is indicated by a thick black line along the x-axis
Fig. 2.
Concordance between the 100-cell measurements and the in silico average of single-cell wells for datasets A, B and C. In the top row, wells with no detected expression are included and treated as exact zeroes. In the middle row, they are excluded, resulting in a clear lack of concordance. In the final row, wells are filtered as per Section 2.3. Dark, thin lines show the initial location of a gene before filtering and connect to the location of the gene after filtering. Each panel reports the concordance correlation coefficient and the average weighted squared deviation of expression measurements. The dotted black line shows a loess fit through the data. In all cases, the expression values are transformed using a shifted log-transformation, so a graphed value of zero corresponds to a zero expression value
Fig. 3.
Number of discoveries (genes × units) versus FDR, by treatment, dataset A. The combined LRT is compared with a Bernoulli-only or normal-theory-only LRT, as well as a t-test of the raw expression values, including zero measurements
Fig. 4.
Results of tests (genes × units) versus frequencies of expression of the genes. The Bernoulli, normal-theory and combined LRTs are plotted. An asterisk indicates that a test differs from the combined test at 5% significance in a Wilcoxon signed-rank test
Fig. 5.
Heatmap of signed values for selected genes (rows, see main text) and all 16 individuals (columns). The color above each column indicates the antigen stimulation applied to the cells; thus, individuals are randomly arranged in each antigen block. Red and purple are two different CMV antigen pools; yellow and orange are two different HIV antigen pools
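The concordance measure named in the Fig. 2 caption can be sketched as follows, using Lin's concordance correlation coefficient. The function names are hypothetical, and the shift of 1 in the log-transformation is an assumption of this sketch, chosen only so that a zero expression value maps to zero.

```python
import math

def shifted_log2(value, shift=1.0):
    """Shifted log-transformation; the shift of 1 is an assumption, so that 0 maps to 0."""
    return math.log2(value + shift)

def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient between paired measurements."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Unlike Pearson correlation, the squared mean difference in the
    # denominator penalizes a systematic offset between x and y.
    return 2.0 * cov / (var_x + var_y + (mean_x - mean_y) ** 2)

# Made-up per-gene values: 100-cell measurement vs. in silico single-cell average.
bulk = [shifted_log2(v) for v in [0.0, 30.0, 250.0, 1200.0]]
in_silico = [shifted_log2(v) for v in [0.0, 22.0, 310.0, 900.0]]
print(f"concordance = {concordance_correlation(bulk, in_silico):.3f}")
```

A coefficient near 1 requires the points to lie close to the identity line, not merely along some straight line, which is why it suits a comparison of two measurements of the same quantity.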