QUAliFiER: an automated pipeline for quality assessment of gated flow cytometry data

Greg Finak et al. BMC Bioinformatics. 2012.

Abstract

Background: Effective quality assessment is an important part of any high-throughput flow cytometry data analysis pipeline, especially when considering the complex designs of the typical flow experiments applied in clinical trials. Technical issues like instrument variation, problematic antibody staining, or reagent lot changes can lead to biases in the extracted cell subpopulation statistics. These biases can manifest themselves in non-obvious ways that can be difficult to detect without leveraging information about the study design or other experimental metadata. Consequently, a systematic and integrated approach to quality assessment of flow cytometry data is necessary to effectively identify technical errors that impact multiple samples over time. Gated cell populations and their statistics must be monitored within the context of the experimental run, assay, and the overall study.

Results: We have developed two new packages, flowWorkspace and QUAliFiER, to construct a pipeline for quality assessment of gated flow cytometry data. flowWorkspace makes manually gated data accessible to BioConductor's computational flow tools by importing pre-processed and gated data from the widely used manual gating tool, FlowJo (Tree Star Inc, Ashland OR). The QUAliFiER package takes advantage of the manual gates to perform an extensive series of statistical quality assessment checks on the gated cell sub-populations. It takes into account the structure of the data and the study design to monitor the consistency of population statistics across staining panels, subjects, aliquots, channels, or other experimental variables. QUAliFiER implements SVG-based interactive visualization methods, allowing investigators to examine quality assessment results across different views of the data, and it has a flexible interface allowing users to tailor quality checks and outlier detection routines to suit their data analysis needs.

Conclusion: We present a pipeline constructed from two new R packages for importing manually gated flow cytometry data and performing flexible and robust quality assessment checks. The pipeline addresses the increasing demand for tools capable of performing quality checks on large flow data sets generated in typical clinical trials. The QUAliFiER tool objectively, efficiently, and reproducibly identifies outlier samples in an automated manner by monitoring cell population statistics from gated or ungated flow data conditioned on experiment-level metadata.

Figures

Figure 1

Functionality of flowWorkspace applied to the QA template gating hierarchy of the test data in the QUAliFiER pipeline. A) The design of flowWorkspace and its interface with QUAliFiER, FlowJo and BioConductor. B) The gating hierarchy for the first sample in the test workspace imported by flowWorkspace and displayed via plot. The names of the QA gates defined in the FlowJo workspace for the gating template are displayed. This gating template was designed for performing quality assessment of flow data from the ITN (Immune Tolerance Network). MNC is a mononuclear cell gate. WBC_perct is a white blood cell gate. Gates on specific channels with MFI and margin are for detecting the positive populations and boundary events, respectively, in each channel. C) Agreement between imported population statistics computed using flowWorkspace and statistics computed from FlowJo, measured via the coefficient of variation between the two values. Slight deviations are due to FlowJo’s discretization of the data transformation function, which must be interpolated by flowWorkspace. Overall the CV is a fraction of a percent, indicating successful import. D) Example of the dot plots generated by flowWorkspace to visualize gated populations.
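The agreement measure in panel C can be illustrated with a minimal Python sketch (not the R packages' own code). It assumes the coefficient of variation is the sample standard deviation of the two values divided by their mean; the caption does not state the exact estimator, and the gate counts below are hypothetical values made up for illustration.

```python
import statistics

def pairwise_cv(a: float, b: float) -> float:
    """Coefficient of variation (sample SD / mean) of two measurements."""
    mean = statistics.mean([a, b])
    if mean == 0:
        # Avoid dividing by zero; identical zero counts agree perfectly.
        return 0.0 if a == b else float("inf")
    return statistics.stdev([a, b]) / mean

# Hypothetical event counts for the same gates (not taken from the paper).
imported = {"MNC": 10234, "WBC_perct": 45110}   # as recomputed after import
flowjo = {"MNC": 10240, "WBC_perct": 45102}     # as reported by the gating tool

# CV per gate, expressed in percent.
cv_percent = {
    gate: 100 * pairwise_cv(imported[gate], flowjo[gate]) for gate in imported
}
```

With counts this close, each CV is well below one percent, which is the pattern the caption describes as a successful import.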

Figure 2

QA result of MFI stability vs time for the FITC channel. We can see clear examples where the MFI is not stable over time, i.e. it is either increasing (HLADR, CD8, CD11c), or decreasing (LD). Some stains show residuals that are not normally distributed, suggesting non-linear trends (HLADR, CD8, CD57), while others are generally stable with the occasional sample outlier (CD1c, IgG1, 6B11). The formula used to generate the plot is MFI ~ RecdDt | channel * stain, where the MFI is plotted against RecdDt, the date the sample was run, which is defined in the study metadata. Channel and stain are generated upon parsing the workspace. Outlier calls are done within each combination of channel and stain. An additional argument to plot(..,subset=channel='FITC',..) tells the function to plot only the output for the FITC channel. The population for this qaTask is MFI, which selects all MFI gates.
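The idea of calling outliers within each combination of channel and stain, relative to a trend over time, can be sketched as follows. This is a generic Python illustration, not QUAliFiER's implementation: it assumes an ordinary least-squares line per group and a robust z-score on the residuals (median and MAD, cutoff 3.5), both of which are stand-in choices. The function name, record fields, and MFI values are hypothetical.

```python
from collections import defaultdict
from statistics import mean, median

def flag_trend_outliers(records, cutoff=3.5):
    """Return ids of samples whose MFI deviates from the per-group time trend.

    records: iterable of dicts with keys id, channel, stain, day, mfi.
    """
    # Stratify by (channel, stain), mirroring "| channel * stain".
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["channel"], rec["stain"])].append(rec)

    flagged = []
    for group in groups.values():
        x = [rec["day"] for rec in group]
        y = [rec["mfi"] for rec in group]
        x_bar, y_bar = mean(x), mean(y)
        sxx = sum((xi - x_bar) ** 2 for xi in x)
        if sxx == 0:
            continue  # all samples share one date: no trend to fit
        # Least-squares fit of MFI against time within this group.
        slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
        intercept = y_bar - slope * x_bar
        residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
        # Robust z-score: centre on the median, scale by the MAD.
        centre = median(residuals)
        mad = median(abs(r - centre) for r in residuals)
        if mad == 0:
            continue  # residuals carry no spread to scale against
        for rec, r in zip(group, residuals):
            if abs(0.6745 * (r - centre) / mad) > cutoff:
                flagged.append(rec["id"])
    return flagged

# Hypothetical MFI series (one value per run day), not data from the paper.
cd8 = [100, 102, 104, 106, 150, 110, 112, 114]   # drifting up, one spike
igg1 = [50, 51, 50, 49, 50, 51, 50, 49]          # stable
records = [
    {"id": f"{stain}-{day}", "channel": "FITC", "stain": stain,
     "day": day, "mfi": mfi}
    for stain, series in (("CD8", cd8), ("IgG1", igg1))
    for day, mfi in enumerate(series)
]
outliers = flag_trend_outliers(records)
```

Because the fit is done per group, the steady upward drift in the hypothetical CD8 series is absorbed by the trend line and only the spike on day 4 is flagged.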

Figure 3

Consistency of the mononuclear cell gate across aliquots. The plot shows the consistency of the MNC population across aliquots (coresampleid). The plot type is bwplot, and the formula for generating this output is coresampleid ~ proportion, while the qaCheck was generated with proportion ~ coresampleid. Additional lattice plot arguments to generate a vertical boxplot layout are passed through the @par slot of the qaTask object. Note that there is no stratification variable. The bwplot plot type implies a grouping using the coresampleid variable (defined in the study metadata). Boxplots are generated for each level of coresampleid, and outlier calls (red boxes and points) are made within each group as well as between groups (i.e., identifying groups with larger than expected variability). The population for this qaTask is MNC, the mononuclear cell gate.
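A within-group outlier call of the kind a boxplot suggests can be sketched in Python with Tukey's 1.5 × IQR fences. This is an assumption about the rule, chosen because it is the conventional boxplot criterion; the caption does not specify the outlier function, and the between-group check is not covered here. The proportions and the second sample id are hypothetical.

```python
from statistics import quantiles

def tukey_outliers(values, k=1.5):
    """Return the values lying outside the Tukey fences Q1 - k*IQR, Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4)  # quartiles, default 'exclusive' method
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical MNC proportions per coresampleid (one value per aliquot).
mnc_proportion = {
    "11732": [0.60, 0.61, 0.62, 0.61, 0.60, 0.30],
    "20001": [0.58, 0.59, 0.60, 0.61, 0.62, 0.60],
}

# Outlier calls are made separately inside each coresampleid group.
within_group = {
    sample: tukey_outliers(props) for sample, props in mnc_proportion.items()
}
```

Here only the 0.30 aliquot in the first group falls below its lower fence; the second group has no values outside its fences.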

Figure 4

Consistency of red blood cell lysis across staining panels. If red blood cells are not properly lysed, they will be detected as events in the FCM experiment. Under ideal conditions, only white blood cells would be detected. The outlier threshold is set such that at least 80% of events should be within the white blood cell gate. Between one and two samples within each staining panel were identified as having lower than expected red blood cell lysis efficiency. Closer inspection revealed these to be from the same coresampleid.
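The fixed-threshold check described in this caption is simple enough to sketch directly. The Python below is an illustration only: the field names, panel labels, and event counts are hypothetical, and the 0.80 cutoff is the one value taken from the caption.

```python
from collections import defaultdict

def flag_low_wbc(samples, min_fraction=0.80):
    """Group, by staining panel, the samples whose WBC gate holds too few events.

    samples: iterable of dicts with keys id, panel, wbc_events, total_events
    (total_events is assumed to be positive).
    """
    by_panel = defaultdict(list)
    for s in samples:
        fraction = s["wbc_events"] / s["total_events"]
        if fraction < min_fraction:
            by_panel[s["panel"]].append(s["id"])
    return dict(by_panel)

# Hypothetical event counts, not data from the paper.
samples = [
    {"id": "a1", "panel": "P1", "wbc_events": 9200, "total_events": 10000},
    {"id": "a2", "panel": "P1", "wbc_events": 6100, "total_events": 10000},
    {"id": "b1", "panel": "P2", "wbc_events": 8000, "total_events": 10000},
    {"id": "b2", "panel": "P2", "wbc_events": 7400, "total_events": 10000},
]
failing = flag_low_wbc(samples)
```

A sample sitting exactly at 80% passes in this sketch, since the caption asks for "at least 80%" of events inside the gate.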

Figure 5

An example of the HTML quality assessment report generated by the QUAliFiER package. A) qaTasks are categorized by assay level, channel level, or tube level, depending on the grouping variables for outlier detection. Within each category, a summary of the number of FCS files failing that qaTask is visible. B) Clicking on the “+” signs expands a more detailed view of the qaTask, including summary plots and tables. The summary plots themselves are interactive through the use of SVG graphics. The consistency of redundant staining is shown as boxplots of the % of marker-positive cells grouped by stain. C) Clicking on individual points in the summary plots opens more detailed plots of the cell populations for individual samples failing the qaTask. Density plots of one of the outlier groups show two samples with inconsistent positive staining in the FITC channel.

Figure 6

Dot plots of outlier MNC samples. Dot plots of the MNC gates for coresampleid 11732, one of the group outliers identified by the MNC stability qaTask. Samples with lower proportions of lymphocytes inside the gate are readily visible, caused by elevated debris in the sample.
