The RAST Server: rapid annotations using subsystems technology

doi: 10.1186/1471-2164-9-75.

Ramy K Aziz, Daniela Bartels, Aaron A Best, Matthew DeJongh, Terrence Disz, Robert A Edwards, Kevin Formsma, Svetlana Gerdes, Elizabeth M Glass, Michael Kubal, Folker Meyer, Gary J Olsen, Robert Olson, Andrei L Osterman, Ross A Overbeek, Leslie K McNeil, Daniel Paarmann, Tobias Paczian, Bruce Parrello, Gordon D Pusch, Claudia Reich, Rick Stevens, Olga Vassieva, Veronika Vonstein, Andreas Wilke, Olga Zagnitko

Ramy K Aziz et al. BMC Genomics. 2008.

Abstract

Background: The number of available prokaryotic genome sequences is growing steadily, and faster than our ability to annotate them accurately.

Description: We describe a fully automated service for annotating bacterial and archaeal genomes. The service identifies protein-encoding, rRNA and tRNA genes, assigns functions to the genes, predicts which subsystems are represented in the genome, uses this information to reconstruct the metabolic network and makes the output easily downloadable for the user. In addition, the annotated genome can be browsed in an environment that supports comparative analysis with the annotated genomes maintained in the SEED environment. The service normally makes the annotated genome available within 12-24 hours of submission, but ultimately the quality of such a service will be judged in terms of accuracy, consistency, and completeness of the produced annotations. We summarize our attempts to address these issues and discuss plans for incrementally enhancing the service.

Conclusion: By providing accurate, rapid annotation freely to the community we have created an important community resource. The service has now been utilized by over 120 external users annotating over 350 distinct genomes.

Figures

Figure 1

Example Tricarballylate Utilization Subsystem. A) The subsystem comprises 4 functional roles. B) The Subsystem Spreadsheet is populated with genes from 5 organisms (simplified from the original subsystem), where each row represents one organism and each column one functional role. Genes performing a given functional role in a given organism populate the corresponding cell. Gray shading of cells indicates that the respective genes lie close together on the chromosome. There are two distinct variants of the subsystem: variant 1, with all 4 functional roles, and variant 2, in which the 3rd functional role is missing.
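The spreadsheet layout described in this caption can be sketched as a small data structure. The following is a minimal illustration only, not the SEED or RAST implementation: the role names, organism names, and gene identifiers are hypothetical placeholders, and the variant rule simply restates the caption (variant 1 has all 4 roles, variant 2 lacks the 3rd).

```python
from typing import Dict, List, Optional

# Hypothetical placeholder names for the 4 functional roles (columns).
ROLES: List[str] = ["role_1", "role_2", "role_3", "role_4"]

# Each row is one organism; each cell lists the genes performing that role.
# Organism names and gene identifiers are invented for illustration.
spreadsheet: Dict[str, Dict[str, List[str]]] = {
    "organism_A": {
        "role_1": ["gene_a1"], "role_2": ["gene_a2"],
        "role_3": ["gene_a3"], "role_4": ["gene_a4"],
    },
    "organism_B": {
        "role_1": ["gene_b1"], "role_2": ["gene_b2"],
        "role_3": [], "role_4": ["gene_b4"],
    },
}


def classify_variant(row: Dict[str, List[str]]) -> Optional[int]:
    """Return 1 if all roles are filled, 2 if only the 3rd is missing, else None."""
    present = [bool(row.get(role)) for role in ROLES]
    if all(present):
        return 1
    if present == [True, True, False, True]:
        return 2
    return None  # pattern not covered by the two variants in the caption


for organism, row in spreadsheet.items():
    print(organism, "-> variant", classify_variant(row))
```

Keeping each cell as a list allows for more than one gene per role in an organism; whether the real spreadsheet permits this is an assumption of the sketch.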

Figure 2

Genes connected to subsystems and their distribution in different categories. The categories are expandable down to the specific gene (see Secondary Metabolism).

Figure 3

Job Overview page. The colours in the progress bar have the following meaning: gray – not started, blue – queued for computation, yellow – in progress, red – requires user input, brown – failed with an error, green – successfully completed.
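The colour legend in this caption amounts to a fixed mapping from progress-bar colour to job stage status. A minimal sketch of that mapping follows; the enum and function names are hypothetical and are not taken from the RAST code base.

```python
from enum import Enum


class StageStatus(Enum):
    """Progress-bar colours from the caption, keyed by a hypothetical status name."""
    NOT_STARTED = "gray"
    QUEUED = "blue"
    IN_PROGRESS = "yellow"
    REQUIRES_USER_INPUT = "red"
    FAILED = "brown"
    COMPLETE = "green"


def describe(colour: str) -> str:
    """Translate a progress-bar colour into a readable status string."""
    status = StageStatus(colour.lower())  # raises ValueError for unknown colours
    return status.name.replace("_", " ").lower()


print(describe("red"))
```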

Figure 4

Job Detail page. The RAST annotation progress can be monitored by each user.

Figure 5

Genome Browser. The annotated genome can be browsed starting from a whole-genome view and zooming in to a specific feature.

Figure 6

Annotation Overview. For each annotated feature, RAST presents an overview page that includes comparative genomics views and the connection to a subsystem, if one was asserted.

Figure 7

Compare Metabolic Reconstruction tool. In this example, the RAST metabolic reconstruction of the submitted genome of S. pyogenes Manfredo was compared to the metabolic reconstruction for S. pyogenes MGAS315, which is part of the comparative environment of the SEED. All three columns of subsystem categories are expandable. In cases where RAST was conservative in asserting a subsystem, a manual attempt to retrieve the missing function(s) can be made by clicking the find button.

Figure 8

View Features page. All annotated features can be viewed and downloaded in table format. For each protein-encoding gene (peg), the table gives the location on the contig, the functional role assignment, the EC number (if present), the GO category, the connection to a subsystem, and a KEGG reaction (if appropriate).
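The columns listed in this caption suggest a simple record type for one row of the feature table. The sketch below is an assumption-laden illustration: the field names, the example row, and the tab-separated output format are hypothetical and do not reflect the actual RAST download format.

```python
import csv
import io
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class FeatureRow:
    """One row of a feature table; field names are hypothetical."""
    feature_id: str
    contig: str
    start: int
    stop: int
    functional_role: str
    ec_number: Optional[str] = None      # only if present
    go_category: Optional[str] = None
    subsystem: Optional[str] = None      # only if a subsystem was asserted
    kegg_reaction: Optional[str] = None  # only if appropriate


def to_tsv(rows: List[FeatureRow]) -> str:
    """Serialise rows as tab-separated text, leaving missing values empty."""
    names = [f.name for f in fields(FeatureRow)]
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(names)
    for row in rows:
        values = [getattr(row, name) for name in names]
        writer.writerow(["" if v is None else v for v in values])
    return buf.getvalue()


# Invented example row for illustration only.
example = [FeatureRow("peg.1", "contig_1", 10, 400, "hypothetical protein")]
print(to_tsv(example))
```

Optional fields mirror the "if present" and "if appropriate" qualifiers in the caption, so a row without an EC number or KEGG reaction still serialises cleanly.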

Figure 9

View Scenarios page. A genome-specific reaction network can be viewed on a scenario-by-scenario basis. The scenarios are organized on the left by subsystem, and the subsystems are in turn organized by category of metabolic function. If a path through a scenario was found in a given subsystem, the subsystem name is highlighted in blue. In this case, one path was found through the Uroporphyrinogen III generation scenario in the Porphyrin, Heme and Siroheme Biosynthesis subsystem. The table on the right shows the input and output compounds for the scenario, including their stoichiometry, and the reactions that make up the path through the scenario.

Figure 10

Comparison of a set of genomes manually curated in the SEED and automatically annotated in RAST. The number of genes annotated as hypothetical and the number of genes linked to subsystems (our mechanism of manual curation) are shown to provide an initial assessment of the performance of RAST.
