Savant: genome browser for high-throughput sequencing data

Marc Fiume et al. Bioinformatics. 2010.

Abstract

Motivation: The advent of high-throughput sequencing (HTS) technologies has made it affordable to sequence many individuals' genomes. At the same time, the computational analysis of the large volumes of data generated by the new sequencing machines remains a challenge. While a plethora of tools are available to map the resulting reads to a reference genome and to conduct primary analysis of the mappings, it is often necessary to visually examine the results and underlying data to confirm predictions and understand the functional effects, especially in the context of other datasets.

Results: We introduce Savant, the Sequence Annotation, Visualization and ANalysis Tool, a desktop visualization and analysis browser for genomic data. Savant was developed for visualizing and analyzing HTS data, with special care taken to enable dynamic visualization in the presence of gigabases of genomic reads and references the size of the human genome. Savant supports the visualization of genome-based sequence, point, interval and continuous datasets, and multiple visualization modes that enable easy identification of genomic variants (including single nucleotide polymorphisms, structural and copy number variants), and functional genomic information (e.g. peaks in ChIP-seq data) in the context of genomic annotations.

Availability: Savant is freely available at http://compbio.cs.toronto.edu/savant.

Figures

Fig. 1.

Screenshot of Savant. (A) Range controls. Selection, zoom and pan controls for coarse navigation; text fields for fine navigation. Zooming and panning are also possible via keyboard and mouse commands. (B) Tracks. These represent the data in the current range. Top: read alignments, with colored pixels representing differences between the reads and the reference. Bottom: color representation of the genome sequence. (C) Table View module, detached from the main interface, displaying the mapped reads with SAM format fields. (D) Bookmarks module.

Fig. 2.

Read alignments, visualized at various resolutions and using two modes. (A) Chromosome-wide view of read mappings, showing the overall coverage (with no coverage in the centromere). (B) Regional view, still visualized as a coverage map, showing higher coverage in certain regions of the genome. (C) Local view, in which the reads are shown separately and differences between the reads and the reference genome are colored. Reads on the forward and reverse strands are shown in different shades of blue. (D) Matepair (arc) mode, showing the relative distance between the two reads of a pair. Taller arcs indicate larger distances between the reads of a pair.
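
The two quantities this caption refers to, per-base coverage and the distance between the reads of a pair, can be illustrated with a minimal Python sketch. This is not Savant's code: the function names `coverage` and `mate_distance`, the representation of a read as an inclusive `(start, end)` tuple, and the choice to measure pair distance as the outer span of the two reads are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: a read is an inclusive (start, end) pair
# of reference coordinates on a single chromosome.
Read = Tuple[int, int]


def coverage(reads: List[Read], region_start: int, region_end: int) -> List[int]:
    """Count how many reads cover each position of an inclusive region."""
    depth = [0] * (region_end - region_start + 1)
    for start, end in reads:
        # Clip the read to the displayed region before counting.
        lo = max(start, region_start)
        hi = min(end, region_end)
        for pos in range(lo, hi + 1):
            depth[pos - region_start] += 1
    return depth


def mate_distance(read_a: Read, read_b: Read) -> int:
    """Outer span of a read pair (assumed definition of pair distance)."""
    leftmost = min(read_a[0], read_b[0])
    rightmost = max(read_a[1], read_b[1])
    return rightmost - leftmost + 1
```

In an arc-style display, a value such as the one returned by `mate_distance` would determine arc height, so pairs spanning an unusually large distance stand out from the rest.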

Fig. 3.

Code used to make the Bookmark Intersection plug-in. The details of the UI that allows the user to select two tracks have been omitted. Once the two tracks are selected, the bookMarkTrackIntersections() method is run; for each interval of one track, it finds overlapping intervals of the other and saves intervals with overlap to the bookmark panel.
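
The figure's code is not reproduced in this record. As a rough illustration of the logic the caption describes, the following Python sketch checks each interval of one track against the intervals of another and collects those with an overlap. It is not the plug-in's actual code and does not use Savant's plug-in API: the `Interval` type, the `overlaps` helper, the function name `bookmark_track_intersections` and the use of a plain list in place of the bookmark panel are all hypothetical, and the choice to save the interval from the first track is an assumption.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    """Hypothetical interval record with inclusive coordinates."""
    chrom: str
    start: int
    end: int


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def bookmark_track_intersections(
    track_a: List[Interval], track_b: List[Interval]
) -> List[Interval]:
    """Return the intervals of track_a that overlap any interval of track_b.

    The returned list stands in for the bookmark panel.
    """
    bookmarks = []
    for a in track_a:
        # Quadratic scan, kept simple for clarity; a sorted sweep or an
        # interval tree would scale better on large tracks.
        if any(overlaps(a, b) for b in track_b):
            bookmarks.append(a)
    return bookmarks


if __name__ == "__main__":
    exons = [Interval("chr1", 100, 200), Interval("chr1", 300, 400)]
    indels = [Interval("chr1", 150, 160)]
    print(bookmark_track_intersections(exons, indels))
```

With exons as the first track and predicted variants as the second, the result lists the exons touched by at least one variant; swapping the arguments lists the variants that touch an exon instead.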

Fig. 4.

(A) Visualization of two SNP variants. The displayed tracks are, top to bottom, Conservation, Gene Models, Read Alignments, the Genome sequence and known SNPs from dbSNP. Two potential SNP variants are indicated by consistent mismatching colors within a column; the downstream SNP was previously known, while the upstream one, a heterozygous variant, is not in dbSNP. (B) Identification and visualization of MoDIL indel variants that overlap exons via the plug-in framework. The visualized tracks are Gene Models, read mappings (visualized in the matepair mode) and MoDIL predictions. The panel on the top right is the Bookmark module, while the panel on the bottom right is the BookMarkIntersection plug-in described in Figure 3. To identify variants of interest, the user selects the two tracks in the BookMarkIntersection window, and those variants that overlap exons are added to the list of bookmarks. The user can then easily go through this list. In this particular case the indel likely occurs in the intron between the exons, but because MoDIL cannot accurately identify the borders of variants, the variant is shown as overlapping.
