Comparing independent microarray studies: the case of human embryonic stem cells

Comparative Study

Mayte Suárez-Fariñas et al. BMC Genomics. 2005.

Abstract

Background: Microarray studies of the same phenomenon in different labs often appear at variance because the published lists of regulated transcripts have disproportionately small intersections. We demonstrate that comparing studies by intersecting lists in this manner is methodologically flawed by reanalyzing three studies of the molecular signature of "stemness" in human embryonic stem cells. There are only 7 genes common to all three published lists, suggesting disagreement.

Results: Carefully reanalyzing all three studies together from the raw data, we detect 111 genes upregulated and 95 downregulated in all three studies. The upregulated list was subjected to real-time RT-PCR analysis, and 75% of the genes were confirmed.

Conclusion: Our findings show that the three studies have a substantial core of common genes, which is missed if only the published lists are examined. Combined analysis of multiple experiments can be a powerful way to distil coherent conclusions.

Figures

Figure 1

Common published genes. Intersection (per UniGene) between the published lists of up-regulated genes for each study.

Figure 2

Coherence Scores. Given the expression profile for four genes A, B, C, and D in two studies 1 and 2 shown in panel (a), we can calculate the correlation of the expression profiles for every pair of genes in each of the two studies. In panel (b) those pairwise correlations are plotted for study 1 against study 2. For example, pair AB is positively correlated in both studies and pair BD is negatively correlated in study 1 and positively in study 2. Correlations for pairs AB, AC, AD lie approximately on a line of positive slope, so gene A is called "coherent". The coherence score of A is the correlation coefficient of those points, which is positive. Gene D has a negative correlation coefficient and so is incoherent. Study 3 in panel (c) is study 2 where condition and control have been swapped; all genes are perfectly coherent for studies 2 and 3, yet each gene which is up-regulated in study 2 is down-regulated in study 3.
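The coherence score described in this caption can be sketched in a few lines of Python. This is a minimal illustration, assuming each study is a NumPy array with the same genes in the same row order; the function name `coherence_scores` and the random toy data are hypothetical and not taken from the paper, and the authors' actual implementation may differ in details such as gene filtering or the correlation measure.

```python
import numpy as np

def coherence_scores(study1, study2):
    """Per-gene coherence between two studies (illustrative sketch).

    study1, study2: arrays of shape (n_genes, n_samples). Rows must be the
    same genes in the same order; the number of samples may differ.
    """
    c1 = np.corrcoef(study1)  # gene-by-gene correlations within study 1
    c2 = np.corrcoef(study2)  # gene-by-gene correlations within study 2
    n_genes = c1.shape[0]
    scores = np.empty(n_genes)
    for g in range(n_genes):
        others = np.arange(n_genes) != g  # exclude the gene's self-correlation
        # Correlate gene g's pairwise correlations across the two studies.
        scores[g] = np.corrcoef(c1[g, others], c2[g, others])[0, 1]
    return scores

# Toy data (hypothetical): "study 3" is study 2 with the sample order reversed,
# a stand-in for swapping condition and control.
rng = np.random.default_rng(0)
study2 = rng.normal(size=(5, 6))
study3 = study2[:, ::-1]
print(coherence_scores(study2, study3))
```

Reordering the samples leaves every gene-gene correlation unchanged, so all genes come out coherent in this toy case even though the direction of regulation is reversed, which is the point made by panel (c).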

Figure 3

Coherence Scores Distribution. (a) Histograms of the coherence scores for the pairwise comparisons and for the score averaged over all three comparisons. The histograms are bimodal, which implies that in these studies the genes can be divided into two distinct categories, coherent and incoherent. When "erratic" genes are discarded, there is a marked improvement in the agreement between studies. (b) Integrated Correlation and Correlation of M-values calculated using the genes in the top percentiles of Coherence Score. Red: Bhattacharya-Sperger; blue: Bhattacharya-Sato; black: Sperger-Sato.

Figure 4

Improved intersection of significant genes. The number of transcripts that are up-regulated (a) and down-regulated (b) in stem cells for the three studies after our screening process. Most genes in each study now also appear in at least one other study (81% Bhattacharya, 85% Sperger, 94% Sato; about 11% expected by chance), and a substantial fraction is common to all three (about 3 genes expected by chance at the intersection).
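To make the "expected by chance" figures concrete, the following sketch computes chance overlaps under a simple null model in which each list is an independent random draw from a common gene universe. The null model, the function names, and the example list sizes are assumptions for illustration only; the caption does not state how the authors computed their chance expectations, and these numbers are not the paper's.

```python
def expected_triple_overlap(n1, n2, n3, universe):
    """Expected number of genes shared by three independent random lists
    of sizes n1, n2, n3 drawn from `universe` genes."""
    return n1 * n2 * n3 / universe**2

def chance_in_another_list(n_other_a, n_other_b, universe):
    """Probability that a given gene appears in at least one of two other
    independent random lists of the given sizes."""
    return 1 - (1 - n_other_a / universe) * (1 - n_other_b / universe)

# Hypothetical sizes: three lists of 100 genes from a universe of 1000.
print(expected_triple_overlap(100, 100, 100, 1000))
print(chance_in_another_list(100, 100, 1000))
```

Comparing observed overlaps against such a baseline is what distinguishes a meaningful common core from an intersection that random lists of the same size would produce anyway.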

Figure 5

RT-PCR results. Comparison between the real-time RT-PCR results and the microarray results. The blue lines are linear fits (without intercept) through all 106 genes, while the magenta lines fit only the 67 genes with a fold change greater than 0.5 (log2 scale) in the RT-PCR analysis (magenta points). (a) log2 fold change of RT-PCR vs. mean of all microarray studies (R² = 0.5, r = 0.32, p < 10⁻¹⁶ for all genes; R² = 0.83, r = 0.51 for the 67 top genes); (b) log2 fold change of RT-PCR vs. the Sato study alone, which had identical conditions to our study (R² = 0.53, r = 0.41, p < 10⁻¹⁶ for all genes; R² = 0.85, r = 0.59 for the 67 top genes).
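A linear fit without intercept, as mentioned in this caption, can be sketched as follows. The function name `fit_through_origin` and the toy values are hypothetical. The R² here is the uncentered form (measured against zero rather than against the mean of y), which is one common convention for through-origin fits; whether the caption's R² values follow this convention is an assumption and is not stated in the text.

```python
import numpy as np

def fit_through_origin(x, y):
    """Least-squares fit of y = slope * x with no intercept term.

    Returns (slope, r2_uncentered), where r2_uncentered compares the
    residual sum of squares with the sum of squares of y about zero.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope = np.dot(x, y) / np.dot(x, x)  # closed-form least-squares slope
    residuals = y - slope * x
    r2_uncentered = 1 - np.dot(residuals, residuals) / np.dot(y, y)
    return slope, r2_uncentered

# Toy log2 fold changes (hypothetical values, not from the paper).
microarray = [0.5, 1.0, 1.5, 2.0]
rt_pcr = [1.2, 1.9, 3.1, 4.0]
slope, r2 = fit_through_origin(microarray, rt_pcr)
print(slope, r2)
```

Under this convention R² need not equal the square of Pearson's r, so the two statistics can legitimately be reported side by side with different values.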
