The generic genome browser: a building block for a model organism system database

Lincoln D Stein et al. Genome Res. 2002 Oct.

Abstract

The Generic Model Organism System Database Project (GMOD) seeks to develop reusable software components for model organism system databases. In this paper we describe the Generic Genome Browser (GBrowse), a Web-based application for displaying genomic annotations and other features. For the end user, features of the browser include the ability to scroll and zoom through arbitrary regions of a genome, to enter a region of the genome by searching for a landmark or performing a full text search of all features, and the ability to enable and disable tracks and change their relative order and appearance. The user can upload private annotations to view them in the context of the public ones, and publish those annotations to the community. For the data provider, features of the browser software include reliance on readily available open source components, simple installation, flexible configuration, and easy integration with other components of a model organism system Web site. GBrowse is freely available under an open source license. The software, its documentation, and support are available at http://www.gmod.org.

Figures

Figure 1

The user enters GBrowse by typing a landmark name into the text field at top. Landmarks can be gene names, clone names, accession numbers, or any other identifier configured by the administrator. Once a region is selected, it is displayed in a detailed view that summarizes annotations and other genomic features. An overview panel and a navigation bar together allow the user to move from one place to another.

Figure 2

The detailed view after zooming out to 200 kb, showing semantic zooming.

Figure 3

A search for the term “7 transmembrane receptor.”

Figure 4

The Generic Genome Browser is built from multiple software modules. In this illustration, modules that were not produced as part of this project are shown in a lighter color.

Figure 5

The Bio::DB::GFF database uses a minimal schema to represent features on sequences. The main tables are fdata, which contains the position and type of each feature; fgroup, which tracks the grouping of subfeatures into features, such as high-similarity pairs in a gapped alignment; fdna, which stores the raw DNA sequence; and fattribute_to_feature, which allows attribute information to be attached to features. Attributes store textual information such as notes, synonyms, and evidence codes. The fattribute table holds attribute names, and the ftype table holds the method and source fields. For retrieval efficiency, the fdna table splits each DNA sequence into small fragments and stores the starting offset of each fragment in the foffset field.
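
As a rough illustration of the schema described in this caption, the following Python sketch builds the six tables in an in-memory SQLite database and shows how a fragmented fdna table could serve a subsequence request. Only the table names and the foffset field come from the caption. The other column names, the fragment size CHUNK, and the helper functions build_db, store_dna, fetch_dna, and features_in_region are assumptions made for illustration; this is not the Bio::DB::GFF implementation, which is written in Perl.

```python
import sqlite3

CHUNK = 10  # hypothetical fragment size, kept tiny so the example is easy to follow


def build_db():
    """Create the six tables named in the caption (column names are assumed)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE ftype  (ftypeid INTEGER PRIMARY KEY, fmethod TEXT, fsource TEXT);
        CREATE TABLE fgroup (gid INTEGER PRIMARY KEY, gname TEXT);
        CREATE TABLE fdata  (fid INTEGER PRIMARY KEY, fref TEXT, fstart INTEGER,
                             fstop INTEGER, ftypeid INTEGER, gid INTEGER);
        CREATE TABLE fattribute (fattribute_id INTEGER PRIMARY KEY,
                                 fattribute_name TEXT);
        CREATE TABLE fattribute_to_feature (fid INTEGER, fattribute_id INTEGER,
                                            fattribute_value TEXT);
        CREATE TABLE fdna (fref TEXT, foffset INTEGER, fdna TEXT,
                           PRIMARY KEY (fref, foffset));
    """)
    return db


def store_dna(db, ref, seq):
    """Split a sequence into CHUNK-sized fragments keyed by 0-based start offset."""
    for offset in range(0, len(seq), CHUNK):
        db.execute("INSERT INTO fdna VALUES (?, ?, ?)",
                   (ref, offset, seq[offset:offset + CHUNK]))


def fetch_dna(db, ref, start, stop):
    """Return bases start..stop (1-based, inclusive), reading only the needed fragments."""
    first = (start - 1) // CHUNK * CHUNK  # offset of the fragment containing 'start'
    rows = db.execute(
        "SELECT fdna FROM fdna WHERE fref = ? AND foffset >= ? AND foffset < ? "
        "ORDER BY foffset", (ref, first, stop)).fetchall()
    joined = "".join(piece for (piece,) in rows)
    return joined[start - 1 - first:stop - first]


def features_in_region(db, ref, start, stop):
    """List (group name, method, source, start, stop) for features overlapping a region."""
    return db.execute(
        "SELECT g.gname, t.fmethod, t.fsource, d.fstart, d.fstop "
        "FROM fdata d JOIN ftype t ON t.ftypeid = d.ftypeid "
        "JOIN fgroup g ON g.gid = d.gid "
        "WHERE d.fref = ? AND d.fstart <= ? AND d.fstop >= ? "
        "ORDER BY d.fstart", (ref, stop, start)).fetchall()


if __name__ == "__main__":
    db = build_db()
    store_dna(db, "chrI", "ACGT" * 10)
    db.execute("INSERT INTO ftype VALUES (1, 'exon', 'curated')")
    db.execute("INSERT INTO fgroup VALUES (1, 'geneA')")
    db.execute("INSERT INTO fdata VALUES (1, 'chrI', 5, 12, 1, 1)")
    db.execute("INSERT INTO fdata VALUES (2, 'chrI', 20, 28, 1, 1)")
    print(fetch_dna(db, "chrI", 8, 23))
    print(features_in_region(db, "chrI", 10, 25))
```

The point of the fragmenting step is that a request for a short region touches only the rows whose offsets fall inside that region, rather than loading an entire chromosome-length string. Both exon rows share one fgroup entry, which mirrors the way the caption describes subfeatures being grouped into a single feature.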

Figure 6

Creating a new track of targeted deletions using GBrowse.
