Argus--a new database system for Web-based analysis of multiple microarray data sets

J Comander et al. Genome Res. 2001 Sep.

Abstract

The ongoing revolution in microarray technology allows biologists studying gene expression to routinely collect >10^5 data points in a given experiment. Widely accessible and versatile database software is required to process this large amount of raw data into a format that facilitates the development of new biological insights. Here, we present a novel microarray database software system, named Argus, designed to process, analyze, manage, and publish microarray data. Argus imports the intensities and images of externally quantified microarray spots, performs normalization, and calculates ratios of gene expression between conditions. The database can be queried locally or over the Web, providing a convenient format for Web-publishing entire microarray data sets. Searches for regulated genes can be conducted across multiple experiments, and the integrated results incorporate images of the actual hybridization spots for artifact screening. Query results are presented in a clone- or gene-oriented fashion to rapidly identify highly regulated genes, and scatterplots of expression ratios allow an individual ratio to be interpreted in the context of all data points in the experiment. Algorithms were developed to optimize response times for queries of regulated genes. Supporting databases are updated easily to maintain current gene identity information, and hyperlinks to the Web provide access to descriptions of gene function. Query results also can be exported for higher-order analyses of expression patterns. This combination of features currently is not available in similar software. Argus is available at http://vessels.bwh.harvard.edu/software/Argus.
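The normalization and ratio steps mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which normalization method Argus uses, so the median scaling below is an assumption chosen for simplicity, and the function names, clone identifiers, and intensity values are all hypothetical.

```python
from statistics import median


def normalize(intensities, target=1000.0):
    """Scale one condition's spot intensities so their median equals `target`.

    Median scaling is an assumed method; the abstract does not specify one.
    """
    scale = target / median(intensities.values())
    return {clone: value * scale for clone, value in intensities.items()}


def expression_ratios(reference, test):
    """Return test/reference ratios for clones measured in both conditions."""
    return {
        clone: test[clone] / reference[clone]
        for clone in reference
        if clone in test and reference[clone] > 0
    }


# Hypothetical raw intensities; condition B was scanned at half the brightness.
condition_a = {"cloneX": 4000.0, "cloneY": 2000.0, "cloneZ": 1000.0}
condition_b = {"cloneX": 1000.0, "cloneY": 1000.0, "cloneZ": 500.0}

ratios = expression_ratios(normalize(condition_a), normalize(condition_b))
print(ratios)
```

After normalization the overall brightness difference cancels out, so only cloneX shows a twofold decrease while the other two clones have a ratio of 1.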

Figures

Figure 1

Data flow diagram. First, microarray spots are quantified using an external program (arrow, top left). Argus then imports all data necessary to analyze a set of microarray experiments (counterclockwise from right): the latest version of the UniGene database, a description of the location of each clone on the arrays, the actual images of the scanned microarrays, and the intensity and quality value of each spot on the arrays. Argus processes these data and produces a database and a set of supporting files that are transferred (arrow, bottom right) to a local or centrally managed Web server. Users (bottom center), whether local or at a remote location, can access all analysis features of the interactive database using a Web browser and also can conveniently access remote biological databases (bottom left) for additional gene information.

Figure 2

A typical query for regulated genes. (A) The search form is configured to retrieve clones that were down-regulated at least twofold when comparing condition B to condition A and that have a minimum intensity of 3000, from any array or experiment in the database. (B) Three clones meet these criteria for at least one of their replicate measurements. If desired, ratios from all conditions can be shown on this page. Clicking on the accession number next to fibronectin 1 retrieves all data from that clone (C), including thumbnail images of the actual hybridization spots. Clicking on the image of a spot produces a scatterplot (D) highlighting that data point, in this case showing that the point is outside the scatter of points around the unity line.
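The search criteria in panel A can be expressed as a small filter. This is only a sketch of the query logic as the caption describes it, not Argus code: the `Measurement` record, the accession numbers, and the data values are hypothetical, and applying the minimum intensity to the reference intensity is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    accession: str        # clone accession number (hypothetical values below)
    gene: str
    condition: str        # condition compared against reference condition A
    ref_intensity: float  # assumed: the intensity the minimum is applied to
    ratio: float          # expression in `condition` divided by condition A


def find_down_regulated(measurements, condition, min_fold=2.0, min_intensity=3000.0):
    """Return accessions with at least one replicate meeting all criteria."""
    max_ratio = 1.0 / min_fold  # twofold down-regulation means ratio <= 0.5
    hits = {
        m.accession
        for m in measurements
        if m.condition == condition
        and m.ratio <= max_ratio
        and m.ref_intensity >= min_intensity
    }
    return sorted(hits)


data = [
    Measurement("ACC001", "fibronectin 1", "B", 8000.0, 0.4),
    Measurement("ACC001", "fibronectin 1", "B", 7500.0, 0.6),  # replicate, fails
    Measurement("ACC002", "hypothetical gene 2", "B", 2000.0, 0.3),  # too dim
    Measurement("ACC003", "hypothetical gene 3", "B", 5000.0, 0.9),  # unregulated
    Measurement("ACC004", "hypothetical gene 4", "C", 6000.0, 0.2),  # other condition
]

matches = find_down_regulated(data, "B")
print(matches)
```

ACC001 is returned even though one of its replicates fails the ratio cutoff, which mirrors the caption's "at least one of their replicate measurements" rule.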

Figure 3

Display of multiple clones from the same gene. From the details page of any clone, a new search can be initiated that displays all clones with the same gene name. In both the top and bottom clones, fibroblast growth factor receptor 3 was not regulated in condition B but was down-regulated in conditions C and D. The top clone was nearly five times as intense as the bottom clone, as seen in the Ref Intensity column, yet the intensity ratios between conditions are very similar.

Figure 4

Query processing using lookup tables. When a user submits a query from a Web browser, precalculated lookup tables are used to identify a list of accession numbers that match the search criteria. These accession numbers are sorted and merged with the additional information shown in italics, and the results are displayed in a Web browser window.
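The idea of answering a query from precalculated lookup tables can be sketched as follows. The caption does not describe how the Argus tables are laid out, so the structure here (one ratio-sorted table per condition, searched with a binary search) is an assumption, and all names and values are hypothetical.

```python
from bisect import bisect_right


def build_lookup(ratios_by_condition):
    """Precalculate, per condition, ratios sorted ascending with their accessions."""
    lookup = {}
    for condition, ratios in ratios_by_condition.items():
        pairs = sorted((ratio, acc) for acc, ratio in ratios.items())
        lookup[condition] = ([r for r, _ in pairs], [a for _, a in pairs])
    return lookup


def query(lookup, annotations, condition, max_ratio):
    """Find accessions with ratio <= max_ratio, then sort and merge annotations."""
    sorted_ratios, accessions = lookup[condition]
    cut = bisect_right(sorted_ratios, max_ratio)  # binary search, no full scan
    hits = sorted(accessions[:cut])
    return [(acc, annotations.get(acc, "unknown")) for acc in hits]


# Hypothetical ratios (condition / reference) and gene annotations.
ratios_by_condition = {
    "B": {"ACC003": 0.9, "ACC001": 0.4, "ACC002": 0.5},
    "C": {"ACC001": 1.1},
}
annotations = {"ACC001": "fibronectin 1", "ACC002": "hypothetical gene 2"}

lookup = build_lookup(ratios_by_condition)
results = query(lookup, annotations, "B", 0.5)
print(results)
```

The sorting cost is paid once when the table is built, so each query only needs a binary search followed by the merge with the additional clone information.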
