BioViews: Java-based tools for genomic data visualization

G A Helt et al. Genome Res. 1998 Mar.

Abstract

Visualization tools for bioinformatics ideally should provide universal access to the most current data in an interactive and intuitive graphical user interface. Since the introduction of Java, a language designed for distributed programming over the Web, the technology now exists to build a genomic data visualization tool that meets these requirements. Using Java we have developed a prototype genome browser applet (BioViews) that incorporates a three-level graphical view of genomic data: a physical map, an annotated sequence map, and a DNA sequence display. Annotated biological features are displayed on the physical and sequence-based maps, and the different views are interconnected. The applet is linked to several databases and can retrieve features and display hyperlinked textual data on selected features. In addition to browsing genomic data, different types of analyses can be performed interactively and the results of these analyses visualized alongside prior annotations. Our genome browser is built on top of extensible, reusable graphic components specifically designed for bioinformatics. Other groups can (and do) reuse this work in various ways. Genome centers can reuse large parts of the genome browser with minor modifications, bioinformatics groups working on sequence analysis can reuse components to build front ends for analysis programs, and biology laboratories can reuse components to publish results as dynamic Web documents.

Figures

Figure 1

Physical and genetic map viewer. The physical map of a 3-Mb region of Drosophila genomic DNA is displayed within the physical map widget of the Genome Browser. DNA sequence length (kb) is shown at the top of the display. Chromosomal divisions and chromosome bands are shown underneath as labeled boxes. Below the chromosomal bands, the P1 contigs that map to the region are displayed as labeled blue bars, and the P1 clones that make up each contig are shown as smaller green bars underneath their respective contigs. Sequenced P1s are shown in light green. The Physical Map viewer calculates where to place contigs on the map by using the cytogenetic map position of their constituent P1 clones. The length of the contig is estimated from the number of P1s in the contig and the average P1 size, and individual P1s are drawn with a length proportional to the number of STSs contained within the P1 clone. In the current display, only the minimal tiling path of P1s per contig is shown, because displaying all the P1 clones for each contig would add much more data to the display but contribute little to the information conveyed. In addition to the contigs, all the genes that have been mapped genetically to the region are displayed as labeled spans, based on their genetic mapping to a range of chromosome bands; these data were obtained from FlyBase (FlyBase 1995). Lethal P-element transposon insertions that have been mapped cytogenetically to this region are also shown as pink squares at the bottom of the display (Spradling et al. 1995). From this high-level view it is immediately apparent that the distribution of mapped P-element insertions is very uneven, a recognized biological phenomenon in which some genomic regions have proven to be “hot spots” for P-element insertion. (A) Low-resolution view. The user has invoked the Physical Map viewer, with the resolution of the display set to the lowest level of detail. At this level of zoom, chromosomal divisions, each composed of several bands, are shown as labeled black rectangles just below the DNA sequence length scale. (B) Higher-resolution views. Here just the Physical Map viewer chromosomal band display is shown. The user has used the zoom slider at the top left of the display to show successively higher resolution views of the region shown in A. Note that the chromosome divisions (labeled 35C, etc.) shown in A have resolved into their constituent bands (labeled 35C5, etc.) and that finer scale markings have appeared in the kilobase scale.
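
The layout rules described in this caption can be illustrated with a minimal Java sketch. It is not the BioViews code: the class and method names are hypothetical, the contig length is assumed to be the plain product of P1 count and average P1 size, the contig is assumed to be centered on the mean position of its P1 clones, and the 80-kb average P1 size is an assumed value that the caption does not state.

```java
final class ContigLayoutSketch {
    // Assumed average P1 insert size in kb; the caption does not give the value used.
    static final double ASSUMED_AVG_P1_KB = 80.0;

    /** Estimated contig length (kb): number of P1 clones times the average P1 size (assumed rule). */
    static double estimateContigLengthKb(int p1Count, double avgP1Kb) {
        return p1Count * avgP1Kb;
    }

    /** Contig center (kb): mean of the map positions of its constituent P1 clones (assumed rule). */
    static double contigCenterKb(double[] p1PositionsKb) {
        if (p1PositionsKb.length == 0) {
            throw new IllegalArgumentException("a contig needs at least one P1 clone");
        }
        double sum = 0.0;
        for (double position : p1PositionsKb) {
            sum += position;
        }
        return sum / p1PositionsKb.length;
    }

    /** Drawn length of one P1 (kb): proportional to the number of STSs it contains. */
    static double p1DrawLengthKb(int stsCount, double kbPerSts) {
        return stsCount * kbPerSts;
    }
}
```

Under these assumptions, a contig of five P1 clones would be drawn 400 kb long, centered on the average cytogenetic position of those clones.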

Figure 2

Annotated sequence map. The sequence map widget is shown above at the lowest level of resolution (i.e., the zoom control is set all the way to the left). The P1 clone displayed here is a gene-dense 83-kb clone (GenBank accession no. L49408) that the BDGP is using as a testbed for computational and experimental sequence analysis methods. The cactus locus has been selected. The sequence map displays the length of the DNA shown in a kilobase scale along the horizontal axis in the center of the display. Many of the features have directionality (i.e., they are present on either the forward or reverse strand). Therefore, features on the forward strand are displayed above the axis, and features on the reverse strand are shown below the axis. P-element insertions that inactivate essential genes and that have been mapped to the exact nucleotide of the insertion are shown as vertical lines (Spradling et al. 1995). GenBank entries for known genes in the region are shown in black labeled with the gene name. The intron/exon structure of known genes is represented as black boxes (exons) connected by black lines (introns). Similarly, gene structures determined by comparison with cDNAs sequenced by the BDGP are shown in blue. Putative exons identified by gene prediction programs are shown as purple boxes (Drosophila GRAIL) or as green boxes connected by lines representing introns (Genefinder). The results from TBLASTX homology searches are shown in maroon. Gold arrows depict annotations added by human curation; these correspond to previously uncharacterized genes predicted by biological data and computational analyses. The vast majority of these predictions were later confirmed by isolation of corresponding cDNAs from a Drosophila cDNA library (L. Hong, D. Harvey, and G.M. Rubin, unpubl.).
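
The strand and color conventions in this caption can be summarized in a short Java sketch. The type and method names are hypothetical, and the stacking of features into numbered tiers is an assumption made for illustration; only the above/below rule and the color assignments come from the caption.

```java
final class SequenceMapLayoutSketch {
    enum Strand { FORWARD, REVERSE }

    /** Annotation sources and the display colors listed in the caption. */
    enum Source {
        GENBANK("black"),
        CDNA("blue"),
        GRAIL("purple"),
        GENEFINDER("green"),
        TBLASTX("maroon"),
        CURATED("gold");

        final String color;

        Source(String color) {
            this.color = color;
        }
    }

    /**
     * Signed vertical offset from the central axis, in pixels.
     * Screen y grows downward, so forward-strand features (drawn above
     * the axis) get a negative offset and reverse-strand features a
     * positive one. Tiers are numbered from 1 (assumed stacking scheme).
     */
    static int yOffset(Strand strand, int tier, int tierHeightPx) {
        int distance = tier * tierHeightPx;
        return strand == Strand.FORWARD ? -distance : distance;
    }
}
```

With a 12-pixel tier height, a forward-strand feature in tier 2 would sit 24 pixels above the axis and a reverse-strand feature in tier 1 would sit 12 pixels below it.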

Figure 3

DNA view. The user has increased the level of zoom in the sequence map display so that a 30-kb, rather than 80-kb, region of the P1 is shown. The user has then selected a short region over the 40-kb mark on the central horizontal axis of the sequence map display, and has pressed the Show DNA button at the top of the browser window. Both the sequence data and the code for the sequence display viewer are dynamically loaded only as needed. Once the DNA view appears, a “shadow” representing its span along the sequence map appears as a gray box outlined in green on the horizontal axis. The user has selected an exon (predicted by Drosophila GRAIL) within the sequence map, resulting in this feature being highlighted in red in the sequence view and in yellow in the DNA view. The DNA view can be resized, and the DNA viewer deals with resizing intelligently by ensuring that each row always displays a multiple of ten bases. To further provide positional cues, two slightly different background colors are used for adjacent 10-base columns.
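
The resizing behavior described here could be implemented along the following lines. This is a sketch under stated assumptions, not the BioViews implementation: the names are hypothetical, a fixed-width font is assumed, and the row is assumed to hold at least one 10-base block even in a very narrow window.

```java
final class DnaViewLayoutSketch {
    /** Bases per column block, as described in the caption. */
    static final int BLOCK = 10;

    /**
     * Number of bases shown per row: the largest multiple of ten that
     * fits in the view, never fewer than one block (assumed minimum).
     * charWidthPx must be positive.
     */
    static int basesPerRow(int viewWidthPx, int charWidthPx) {
        int fitting = viewWidthPx / charWidthPx;
        return Math.max(BLOCK, (fitting / BLOCK) * BLOCK);
    }

    /**
     * True when the base at this 0-based position within a row falls in
     * an odd-numbered 10-base column, which gets the second background color.
     */
    static boolean usesAlternateBackground(int positionInRow) {
        return (positionInRow / BLOCK) % 2 == 1;
    }
}
```

For example, a 437-pixel-wide view with 8-pixel characters fits 54 bases, so `basesPerRow(437, 8)` rounds down to 50; positions 0 to 9 use the first background color and positions 10 to 19 the second.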

Figure 4

Hyperlinks. (A) In situ hybridization pattern of Ipa-6d in an early Drosophila embryo. The user has selected gene Ipa-6d in the Sequence Map viewer and has pressed the Image button at the top of the browser window. This selection invokes a window showing the embryonic expression pattern of the Ipa-6d gene as determined by RNA in situ hybridization. (B) GenBank entry for cactus accessed through the Sequence browser. The user has selected the cactus locus, and has pressed the Get Info button at the top of the Sequence Map viewer display. The cactus GenBank record with HTML hyperlinks is returned in a new separate browser window (the lower box). The lower window has the full capabilities of typical browser windows in that the user is able to follow the hyperlinks shown in blue.

Figure 5

Analysis tools. (A) Primer selection in the Genome browser. The user has asked the browser to design PCR primers suitable for amplification of the region selected within the Sequence Map window. The results are shown above. In the Sequence Map window, sense and antisense primers are displayed as maroon and blue arrows, respectively. In the Sequence Display window (bottom right), the sequence for the sense primer recommended by the primer selection program is highlighted in yellow. (B) Restriction mapping in the Sequence browser. The user has pressed the Rest. Sites button at the top of the browser window. This action has activated a new browser window (shown at the left) that permits selection of restriction enzymes. The user has selected three: AclI, AhaIII, and EcoRI. Sites for these enzymes are shown in light blue, magenta, and green, respectively. Note that these sites in the original map window are marked along the linear map, and that they are also highlighted in yellow within the Show DNA window.
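
Restriction mapping of the kind shown in panel B amounts to searching the sequence for each enzyme's recognition site. The Java sketch below illustrates the idea with hypothetical names; it is not the browser's code. The recognition sequences are the standard ones for these enzymes (AclI AACGTT, AhaIII TTTAAA, EcoRI GAATTC). All three are palindromic, so scanning one strand is sufficient, which would not hold for non-palindromic sites.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class RestrictionSiteSketch {
    /** Recognition sequences for the three enzymes selected in the figure. */
    static final Map<String, String> ENZYMES = new LinkedHashMap<>();
    static {
        ENZYMES.put("AclI", "AACGTT");
        ENZYMES.put("AhaIII", "TTTAAA");
        ENZYMES.put("EcoRI", "GAATTC");
    }

    /** Returns the 0-based start index of every occurrence of the site, including overlapping ones. */
    static List<Integer> findSites(String dna, String site) {
        List<Integer> hits = new ArrayList<>();
        String sequence = dna.toUpperCase();
        int index = sequence.indexOf(site);
        while (index >= 0) {
            hits.add(index);
            index = sequence.indexOf(site, index + 1);
        }
        return hits;
    }
}
```

A viewer could iterate over `ENZYMES`, call `findSites` for each recognition sequence, and draw one colored mark per returned position on the linear map.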
