Evaluation of microarray preprocessing algorithms based on concordance with RT-PCR in clinical samples
Balazs Gyorffy et al. PLoS One. 2009.
Abstract
Background: Several preprocessing algorithms for Affymetrix gene expression microarrays have been developed, and their performance on spike-in data sets has been evaluated previously. However, a comprehensive comparison of preprocessing algorithms on samples taken under research conditions has not been performed.
Methodology/principal findings: We used TaqMan RT-PCR arrays as a reference to evaluate the accuracy of expression values from Affymetrix microarrays in two experimental data sets: one comprising 84 genes in 36 colon biopsies, and the other comprising 75 genes in 29 cancer cell lines. We evaluated consistency using the Pearson correlation between measurements obtained on the two platforms. Also, we introduce the log-ratio discrepancy as a more relevant measure of discordance between gene expression platforms. Of nine preprocessing algorithms tested, PLIER+16 produced expression values that were most consistent with RT-PCR measurements, although the difference in performance between most of the algorithms was not statistically significant.
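As a rough illustration of the consistency measure described above, the sketch below computes one Pearson correlation per gene across samples. It assumes that both platforms have already been converted to log2 expression values for the same samples in the same order (for RT-PCR, for example, a negated delta-Ct value). The function names and the dictionary layout are hypothetical and are not taken from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A gene with no variation on one platform has no defined correlation.
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def per_gene_correlations(array_log2, pcr_log2):
    """Map each gene measured on both platforms to its cross-platform r.

    Both arguments are dicts of gene name -> list of log2 values,
    one value per sample, with samples in the same order (assumed layout).
    """
    return {
        gene: pearson(array_log2[gene], pcr_log2[gene])
        for gene in array_log2
        if gene in pcr_log2
    }
```

Running this once per preprocessing algorithm yields one distribution of coefficients per algorithm, which is the kind of quantity a box plot can then summarize.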
Conclusions/significance: Our results support the choice of PLIER+16 for the preprocessing of clinical Affymetrix microarray data. However, other algorithms performed similarly and are probably also good choices.
Conflict of interest statement
Competing Interests: The authors have declared that no competing interests exist.
Figures
Figure 1. The colon and cell line data sets are representative of clinical microarray data.
For several Affymetrix data sets, box-and-whiskers plots indicate the distribution of three bias metrics: a) RNA degradation slope, b) median perfect-match probe intensity, and c) fraction of probe sets called present. A narrower distribution indicates greater consistency in technical conditions. LatinSquare133 and LatinSquare95 are spike-in data sets produced by the microarray manufacturer; Gyorffy_cells and Gyorffy_colon are the data sets analyzed in this paper; the other five are publicly available clinical data sets.
Figure 2. Pearson correlation coefficients between microarray and RT-PCR.
The distribution of Pearson correlation coefficients for each microarray preprocessing algorithm is indicated by a box plot, for a) the colon cancer data set (84 genes, 36 samples), and b) the cell line data set (75 genes, 29 samples). The box indicates the 25th to 75th percentile, and the heavier line indicates the median. Algorithms are displayed in decreasing order of the median, such that the more accurate algorithms are at the top. The colorgrams on the right-hand side indicate P values (Wilcoxon test) comparing each pair of algorithms.
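The pairwise comparison behind such a colorgram might be sketched as follows. The caption says only "Wilcoxon test", so this example assumes the paired signed-rank variant (each gene contributes one coefficient per algorithm) and uses the large-sample normal approximation without tie or continuity corrections. All names are hypothetical, and a statistics library routine such as `scipy.stats.wilcoxon` would normally be preferable to this hand-rolled version.

```python
import math
from itertools import combinations


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def signed_rank_p(x, y):
    """Two-sided p-value, Wilcoxon signed-rank test, normal approximation."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # drop zero differences
    n = len(diffs)
    if n == 0:
        return 1.0
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2))


def pairwise_p_values(scores):
    """scores: dict of algorithm name -> per-gene values, genes in the same order."""
    return {
        (a, b): signed_rank_p(scores[a], scores[b])
        for a, b in combinations(sorted(scores), 2)
    }
```

The resulting dictionary has one entry per unordered pair of algorithms, which is the information a P value colorgram displays.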
Figure 3. Log-ratio discrepancy between microarray and RT-PCR.
The distribution of the log-ratio discrepancy for each microarray preprocessing algorithm is indicated by a box plot, for a) the colon cancer data set, and b) the cell line data set. Algorithms are displayed in order of the median, such that the more accurate algorithms are at the top. The colorgrams on the right-hand side indicate P values (Wilcoxon test) comparing each pair of algorithms.
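The text here does not define the log-ratio discrepancy, so the following is only one plausible reading, stated as an assumption: for a given gene and each pair of samples, take the absolute difference between the log ratio measured on the microarray and the log ratio measured by RT-PCR. The function name and input layout are hypothetical, and the paper's exact definition may differ.

```python
from itertools import combinations
from statistics import median


def log_ratio_discrepancies(array_log2, pcr_log2):
    """Assumed definition: |array log ratio - RT-PCR log ratio| per sample pair.

    Inputs are log2 expression values for ONE gene, one value per sample,
    with samples in the same order on both platforms.
    """
    if len(array_log2) != len(pcr_log2):
        raise ValueError("platforms must cover the same samples")
    result = []
    for i, j in combinations(range(len(array_log2)), 2):
        array_ratio = array_log2[i] - array_log2[j]  # log2(sample i / sample j)
        pcr_ratio = pcr_log2[i] - pcr_log2[j]
        result.append(abs(array_ratio - pcr_ratio))
    return result


# Toy values for a single gene in three samples (made up for illustration).
toy_array = [1.0, 3.0, 4.0]
toy_pcr = [2.0, 4.0, 4.0]
toy_median = median(log_ratio_discrepancies(toy_array, toy_pcr))
```

Under this reading, a constant offset between platforms cancels out, because only within-platform differences between samples enter the measure, whereas a difference in scale between platforms does not cancel.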
References
- Affymetrix. 2008 Latin Square Data [http://www.affymetrix.com/support/technical/sample_data/datasets.affx]
- Cope LM, Irizarry RA, Jaffee HA, Wu ZJ, Speed TP. A benchmark for Affymetrix GeneChip expression measures. Bioinformatics. 2004;20:323–331.
- Tong W, Lucas AB, Shippy R, Fan X, Fang H, et al. Evaluation of external RNA controls for the assessment of microarray performance. Nat Biotechnol. 2006;24:1132–1139.