Mining for coexpression across hundreds of datasets using novel rank aggregation and visualization methods

Priit Adler et al. Genome Biol. 2009.

Abstract

We present a web resource, MEM (Multi-Experiment Matrix), for gene expression similarity searches across many datasets. MEM features large collections of microarray datasets and utilizes rank aggregation to merge information from different datasets into a single global ordering with simultaneous statistical significance estimation. Unique features of MEM include automatic detection, characterization and visualization of datasets that include the strongest coexpression patterns. MEM is freely available at http://biit.cs.ut.ee/mem/.
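The rank aggregation idea can be illustrated with a minimal sketch. It assumes an order-statistics scheme: each gene has one normalized rank in (0, 1] per dataset, and under the null hypothesis these ranks are independent and uniform. The function names, the input layout and the choice of the minimum over order statistics as a score are assumptions made for illustration and are not taken from the MEM implementation. The minimum is an uncorrected score rather than a calibrated P-value.

```python
from math import comb


def order_statistic_pvalue(x, k, n):
    """P(k-th smallest of n independent Uniform(0,1) values <= x)."""
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(k, n + 1))


def aggregate_ranks(normalized_ranks):
    """Score one gene from its per-dataset normalized ranks.

    Returns (score, k): the smallest order-statistic probability and the
    number k of best ranks that produced it. The score is not corrected
    for having taken the minimum over k (assumption of this sketch).
    """
    ordered = sorted(normalized_ranks)
    n = len(ordered)
    return min(
        (order_statistic_pvalue(x, k, n), k)
        for k, x in enumerate(ordered, start=1)
    )


def rank_genes(rank_table):
    """Order genes by aggregated score, most significant first.

    rank_table maps a gene name to its list of normalized ranks,
    one entry per dataset (hypothetical input layout).
    """
    scores = {gene: aggregate_ranks(ranks) for gene, ranks in rank_table.items()}
    return sorted(scores, key=lambda gene: scores[gene][0])


if __name__ == "__main__":
    table = {
        "gene_a": [0.01, 0.02, 0.9],  # highly ranked in two of three datasets
        "gene_b": [0.4, 0.5, 0.6],    # mediocre everywhere
    }
    print(rank_genes(table))
    print(aggregate_ranks(table["gene_a"]))
```

In this toy input, `gene_a` is ranked first because two of its three ranks are very small. The returned `k` identifies how many of the best ranks drove the score, so datasets where the gene ranks poorly do not dilute the result.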

Figures

Figure 1

MEM user interface and results for the transcription factor NANOG. The top of the page contains controls for the query: gene input field, dataset selection and advanced options. The bottom of the page shows the results of the query. The genes, which are displayed as rows, are ordered by multi-experiment similarity to the query gene. Additionally, the single-experiment similarity ranks are displayed as a matrix of colored squares, where red and blue denote small and large ranks, respectively. The larger squares indicate the ranks that contributed to the final _P_-value. Each element corresponds to an experiment and the columns are clustered. Hovering over the results brings up context-specific information: (a) word cloud that characterizes the corresponding experiments; (b) single dataset annotations; (c) gene names with short descriptions. The row of links above the results facilitates further analysis of the results. For example, the user can visualize the expression of selected datasets (marked with green ticks) as a heat map (d).

Figure 2

NANOG targets among the first 50 MEM results. A MEM query with the transcription factor NANOG retrieves more of its targets among the top 50 genes than queries on any one dataset individually. Each point represents the overlap between NANOG targets and the top 50 query results in one of the 487 datasets. The datasets are sorted by variation and the ones that pass the standard deviation filter are highlighted. Most of the datasets that retrieve a high number of NANOG targets pass the filter, which shows the specificity of the filter.
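A standard deviation filter of the kind mentioned in this caption can be sketched as follows. The sketch assumes the filter keeps datasets in which the query gene's expression varies by at least a chosen threshold. The function name, the input layout and the threshold value are hypothetical, and the caption does not state the criterion MEM actually applies.

```python
from statistics import stdev


def filter_datasets_by_sd(query_expression, min_sd):
    """Keep datasets where the query gene varies enough to be informative.

    query_expression maps a dataset id to the query gene's expression
    values in that dataset (hypothetical layout). min_sd is an assumed
    threshold. Returns the retained dataset ids, most variable first.
    """
    sds = {name: stdev(values) for name, values in query_expression.items()}
    by_variation = sorted(sds, key=sds.get, reverse=True)
    return [name for name in by_variation if sds[name] >= min_sd]


if __name__ == "__main__":
    expression = {
        "d1": [1.0, 1.0, 1.0, 1.1],  # nearly constant
        "d2": [1.0, 5.0, 2.0, 8.0],  # strongly varying
        "d3": [2.0, 3.0, 2.0, 3.0],  # moderately varying
    }
    print(filter_datasets_by_sd(expression, min_sd=0.5))
```

A dataset in which the query gene is nearly constant cannot yield meaningful correlations, which is the motivation for filtering on variation before searching.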

Figure 3

Functional descriptions of the modules found in the mouse coexpression network constructed with MEM. Annotations of the six largest modules are shown in (a). Two smaller modules, along with their functional annotations, are shown in (b) and (c).

Figure 4

Increasing the number of datasets for MEM queries improves prediction of Mini Chromosome Maintenance (MCM) subunits. As additional datasets are incorporated for MEM analysis, MCM complex subunits show more consistent expression patterns as measured by median distance between subunits in MEM ranked lists of most correlated genes (decreasing bar height). According to one-sided Kolmogorov-Smirnov tests, MEM analysis with different numbers of datasets (left bars) significantly outperforms correlation (rightmost bar). In addition, MEM analysis for all the 145 selected datasets gives improved results compared to plain correlation across the concatenated dataset (light blue and orange lines).
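The "median distance between subunits" measure described in this caption can be sketched as below. The sketch assumes that each subunit is used as a query, that the positions of the other subunits in the resulting ranked list are recorded, and that the median of all such positions is reported. The function name and input layout are hypothetical, and the Kolmogorov-Smirnov comparison is not reproduced here.

```python
from statistics import median


def median_subunit_rank(ranked_lists, subunits):
    """Median position of complex members in each other's ranked lists.

    ranked_lists maps a query gene to its result list, best match first
    (hypothetical layout). Smaller values mean the subunits retrieve
    each other more consistently.
    """
    positions = []
    for query in subunits:
        # 1-based position of every gene in this query's result list.
        index = {gene: i + 1 for i, gene in enumerate(ranked_lists[query])}
        positions.extend(
            index[gene] for gene in subunits if gene != query and gene in index
        )
    return median(positions)


if __name__ == "__main__":
    results = {
        "A": ["B", "X", "C", "Y"],
        "B": ["A", "C", "X"],
        "C": ["X", "A", "B"],
    }
    print(median_subunit_rank(results, ["A", "B", "C"]))
```

Subunits missing from a result list are skipped in this sketch; a stricter variant could instead assign them the worst possible rank.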
