Vienna RNA secondary structure server

Journal Article

Abstract

The Vienna RNA secondary structure server provides a web interface to the most frequently used functions of the Vienna RNA software package for the analysis of RNA secondary structures. It currently offers prediction of secondary structure from a single sequence, prediction of the consensus secondary structure for a set of aligned sequences and the design of sequences that will fold into a predefined structure. All three services can be accessed via the Vienna RNA web server at http://rna.tbi.univie.ac.at/.

Received February 15, 2003; Revised and Accepted April 5, 2003

INTRODUCTION

Biomolecules exhibit a close interplay between structure and function. Therefore the growing number of RNA molecules with complex functions, beyond that of encoding proteins, has brought increased demand for RNA structure prediction methods. While prediction of tertiary structure is usually infeasible, the area of RNA secondary structures is an example where computational methods have been highly successful.

The first practical dynamic programming algorithms to predict the optimal secondary structure of an RNA sequence date back over 20 years (1). Since then they have been extended to predict suboptimal structures (2,3) and thermodynamic ensembles (4), which make it possible to assign a confidence level or ‘well-definedness’ to the predictions (5).

Recently, several methods have addressed the problem of predicting a consensus structure for a group of related RNA sequences (6–11). Such conserved structures are of particular interest, since conservation of structure in spite of sequence variation implies that the structure must be functionally important. By enhancing energy rules with sequence covariation these methods also obtain much better prediction accuracies.

The Vienna RNA package (12) is a free software package that implements a variety of algorithms for the prediction and analysis of RNA secondary structures. The package is, however, strongly geared toward Unix command-line users and programmers. For the less computer savvy, or occasional user, it provides neither a point-and-click graphical user interface nor even pre-compiled binaries.

The Vienna RNA web site tries to address these shortcomings by offering access to the most popular features via an easy-to-use web interface. It consists of three CGI scripts equivalent to the RNAfold, RNAalifold and RNAinverse command-line programs, respectively. While the servers have to limit request sizes for performance reasons, they return for each request an equivalent command-line invocation. This makes it easier for users to make the transition to locally installed software, should their requirements exceed the limits of the web service.

THE RNAfold SERVER

Of the three services, the RNAfold server provides both the most basic and most widely used function. Input consists of a single sequence that has to be typed or pasted into a text field of the input form.

In the simplest case, the server predicts only the minimum free energy (mfe) structure of a single sequence using the classic algorithm of Zuker and Stiegler (1). In addition to mfe folding the server can calculate equilibrium base pairing probabilities via John McCaskill's partition function algorithm (4).
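The general shape of such a dynamic programming recursion can be illustrated with a much simpler objective. The sketch below is not the Zuker and Stiegler energy minimization used by the server; it is a Nussinov-style stand-in that merely maximizes the number of nested base pairs. The function name `max_base_pairs` and the minimum hairpin loop size of 3 are assumptions made for this illustration.

```python
def max_base_pairs(seq, min_loop=3):
    """Toy stand-in for mfe folding: maximize the number of nested base pairs.

    Returns (pair_count, dot_bracket). This is NOT an energy model; it only
    shows how optimal substructures are combined by dynamic programming.
    """
    canonical = {("A", "U"), ("U", "A"), ("G", "C"),
                 ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    # best[i][j] = maximum number of pairs within seq[i..j] (inclusive)
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]            # case 1: j is unpaired
            for k in range(i, j - min_loop):  # case 2: j pairs with k
                if (seq[k], seq[j]) in canonical:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score

    # Traceback: recover one optimal structure in bracket notation.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue                          # interval too short for a pair
        if best[i][j] == best[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in canonical:
                left = best[i][k - 1] if k > i else 0
                if best[i][j] == left + 1 + best[k + 1][j - 1]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return (best[0][n - 1] if n else 0), "".join(structure)


count, fold = max_base_pairs("GGGAAAUCC")
print(count, fold)
```

The real algorithm replaces the pair count with loop-based free energies, but the decomposition into "position unpaired" and "position paired with some partner" is analogous.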

By default the RNA energy parameters of the Turner group (13) are used, but single stranded DNA sequences can be handled as well, by selecting the DNA parameter set provided by John SantaLucia (14).

The fold server output consists of a static html page presenting the predicted mfe structure as a string in bracket notation and links to the plots generated for visualization. Three types of plots can be produced. Firstly, the predicted mfe structure is plotted as a conventional secondary structure graph using the naview layout method (15). The pair probabilities can be visualized in a so-called ‘dot plot’: on a square grid of n × n we draw for each possible pair (i, j) a box with area proportional to its probability. Finally, we produce a mountain plot depicting both the predicted mfe and pair probabilities. A mountain plot is an xy-graph that plots the number of base pairs enclosing a sequence position (for pair probabilities the average number of enclosing pairs). See Figure 1 for examples of all three representations.
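To make the bracket notation and the mountain representation concrete, the following minimal sketch converts a bracket string into a list of base pairs and then computes, for each position, the number of pairs enclosing it. The function names are hypothetical and positions are 0-based here for convenience; this is not code from the Vienna RNA package.

```python
def parse_dot_bracket(structure):
    """Return the sorted list of base pairs (i, j), 0-based, encoded by a bracket string."""
    pairs, stack = [], []
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % pos)
            pairs.append((stack.pop(), pos))
        elif ch != ".":
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return sorted(pairs)


def mountain_heights(structure):
    """Height at position k = number of base pairs (i, j) with i < k < j."""
    heights = [0] * len(structure)
    for i, j in parse_dot_bracket(structure):
        for k in range(i + 1, j):
            heights[k] += 1
    return heights


print(parse_dot_bracket("((..))"))
print(mountain_heights("((..))"))
```

Plotting the returned heights against sequence position gives the mfe curve of a mountain plot; for pair probabilities each pair would contribute its probability instead of 1.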

Secondary structure drawing and dot plots are always produced in Postscript format. Postscript is used not only because it gives the highest print quality, but also because it allows the actual data to be embedded in the file, e.g. all pair probabilities are contained in the dot plot in an easy to parse format. On the other hand, Postscript files cannot be used for inline images on web pages and require additional software for viewing (e.g. gsview, http://www.ghostscript.com/).
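As an illustration of how the embedded data might be recovered, the sketch below extracts pair probabilities from the text of a dot plot file. It assumes that each probability is stored on its own line in the form `i j value ubox` and that `value` is the square root of the pair probability; this layout is an assumption and should be checked against the actual files returned by the server. The function name is hypothetical.

```python
import re

# Assumed line layout: "<i> <j> <sqrt(p)> ubox" (1-based positions).
_UBOX_LINE = re.compile(r"^\s*(\d+)\s+(\d+)\s+([0-9.eE+-]+)\s+ubox\s*$")


def read_pair_probabilities(ps_text):
    """Return {(i, j): probability} parsed from dot plot PostScript text."""
    probabilities = {}
    for line in ps_text.splitlines():
        match = _UBOX_LINE.match(line)
        if match:
            i, j = int(match.group(1)), int(match.group(2))
            root = float(match.group(3))
            probabilities[(i, j)] = root * root  # assumed: stored value is sqrt(p)
    return probabilities


example = """%!PS-Adobe-3.0 EPSF-3.0
1 9 0.5 ubox
2 8 0.9 ubox
1 9 0.95 lbox
"""
print(read_pair_probabilities(example))
```

Lines that do not match the assumed layout, including the PostScript header and drawing commands, are simply ignored.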

A suitable alternative is the new standard for Scalable Vector Graphics, SVG (http://www.w3.org/Graphics/SVG). Users with SVG-enabled browsers (typically through the use of Adobe's SVG plugin, http://www.adobe.com/svg/) can request structure drawings in SVG, which allows some interactivity such as toggling annotation.

Currently the server accepts sequences up to a maximum length of 4000 nt. Sequences up to 300 nt are processed immediately, while longer jobs are submitted to a batch queue, in which case the user is notified by email after completion.

THE Alifold SERVER

The Alifold service predicts the consensus secondary structure for a set of aligned RNA or DNA sequences by using modified dynamic programming algorithms that add a covariance term to the standard energy model (11). Again, it supports prediction of both mfe structures and pair probabilities. Usage is almost identical to that of the RNAfold service. Instead of typing an input sequence, a precomputed sequence alignment is uploaded via the input form. Currently, only alignments in Clustal format are accepted. The server restricts both the size of the upload and the length of the alignment, current limits being 10 Kb and 2000 nt, respectively.

Results are again visualized in Postscript plots that are enhanced by information on sequence variation. In the structure drawings, mutations supporting the predicted structure are marked by circles; in the dot plots and mountain plots, color indicates the number of different pair types. Examples and detailed explanation of these representations can be found on the online help page (http://www.tbi.univie.ac.at/~ivo/RNA/alifoldcgi.html).
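The notion of "number of different pair types" can be sketched as follows. For two alignment columns, the code collects the distinct canonical base pairs formed by the individual sequences. This is only a counting illustration; the covariance term of the actual algorithm (11) is more elaborate and is not reproduced here. The function name and the toy alignment are made up for this example.

```python
CANONICAL_PAIRS = {"AU", "UA", "GC", "CG", "GU", "UG"}


def pair_types(alignment, i, j):
    """Return the set of distinct canonical pair types formed by columns i and j.

    `alignment` is a list of equal-length aligned sequences; i and j are
    0-based column indices. DNA input is handled by reading T as U.
    """
    types = set()
    for row in alignment:
        pair = (row[i] + row[j]).upper().replace("T", "U")
        if pair in CANONICAL_PAIRS:
            types.add(pair)
    return types


toy_alignment = [
    "GCAAAGC",
    "ACAAAGU",
    "GGAAACC",
]
print(sorted(pair_types(toy_alignment, 0, 6)))
print(sorted(pair_types(toy_alignment, 1, 5)))
```

Two or more pair types at the same column pair indicate compensatory or consistent mutations, which is the kind of evidence the colored plots are meant to highlight.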

THE INVERSE FOLD SERVER

Finding sequences that fold into a predefined structure is the inverse of the structure prediction problem. Often it is useful to design such sequences, e.g. in order to experimentally test a hypothesis about functional structures. While this is often done manually for very short sequences, it quickly becomes tedious and error prone.

Our inverse folding service treats sequence design as an optimization problem in sequence space that is solved heuristically (12). There are again two variants based on mfe and partition function folding. In the first case we minimize the dissimilarity between the predicted mfe structure and the desired target structure. In the second case we optimize the frequency of the target structure in the thermodynamic ensemble. While the mfe optimization typically yields sequences that are marginally stable, i.e. have many alternative foldings, optimization via the partition function produces sequences with a very strong preference for the target structure.
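The text does not state which dissimilarity measure is minimized in the mfe variant. One common choice, assumed here purely for illustration, is the base-pair distance: the number of pairs present in exactly one of the two structures. The helper names below are hypothetical.

```python
def base_pairs(structure):
    """Return the set of base pairs (i, j), 0-based, of a bracket string."""
    pairs, stack = set(), []
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % pos)
            pairs.add((stack.pop(), pos))
        elif ch != ".":
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return pairs


def base_pair_distance(predicted, target):
    """Number of base pairs found in exactly one of the two structures."""
    if len(predicted) != len(target):
        raise ValueError("structures must have equal length")
    return len(base_pairs(predicted) ^ base_pairs(target))


print(base_pair_distance("((..))", "(....)"))
print(base_pair_distance("(...).", ".(...)"))
```

A search in sequence space would mutate a candidate sequence, refold it, and keep the mutation whenever such a distance to the target does not increase; a distance of zero means the candidate folds exactly into the target.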

Input consists simply of the desired structure in bracket notation. The maximum structure length is currently 100 nt. The time needed for the search varies widely depending on the ubiquity of the target structure. Most valid secondary structure strings never occur as the mfe structure of any sequence (i.e. many sequence design problems have no solution), while some others are extremely common (see, for example, 16). Conversely, the number of search steps performed by the algorithm is a good indicator for the frequency of a structure in sequence space.

FUTURE PLANS

The Vienna RNA secondary structure server presented here provides only basic access to a subset of the functions in the Vienna RNA software package. Nevertheless, it provides a convenient interface for users who need RNA structure prediction only occasionally and a shallow learning curve for those new to the field.

Work is underway to further improve the visualization of the results, e.g. by producing structure drawings annotated with various measures of well-definedness. As SVG enabled browsers become more widespread, a combination of SVG graphics and client side javascript should allow users to explore the predicted structures interactively.

The output web page produced by the server is designed for the interactive user and thus is not ideal for automatic parsing and further processing of the results. To facilitate such interoperation with other programs and web services we plan to offer input and output in a standardized data exchange format. A promising candidate for this is the recently proposed RNAML format (17), an XML based language for the storage of information on RNA sequence and structure.

While the server currently runs on a somewhat dated dual Pentium II 450 MHz machine, the use of a batch queuing system allows jobs to be distributed to other machines should that become necessary.

ACKNOWLEDGEMENTS

This work is supported by the Austrian Fonds zur Förderung der Wissenschaftlichen Forschung, Projects FWF 15893 and P-13545-MAT.

Figure 1. Three representations for secondary structure and structure ensembles as produced by the RNAfold service: structure graph, mountain plot and dot plot. As can be seen clearly in the dot plot, the sequence has two almost equally good but very dissimilar foldings.

References

1. Zuker,M. and Stiegler,P. (1981) Optimal computer folding of larger RNA sequences using thermodynamics and auxiliary information. Nucleic Acids Res., 9, 133–148.

2. Zuker,M. (1989) The use of dynamic programming algorithms in RNA secondary structure prediction. In Waterman,M.S. (ed.), Mathematical Methods for DNA Sequences, CRC Press, Boca Raton, FL, pp. 159–184.

3. Wuchty,S., Fontana,W., Hofacker,I.L. and Schuster,P. (1999) Complete suboptimal folding of RNA and the stability of secondary structures. Biopolymers, 49, 145–165.

4. McCaskill,J.S. (1990) The equilibrium partition function and base pair binding probabilities for RNA secondary structure. Biopolymers, 29, 1105–1119.

5. Zuker,M. and Jacobson,A.B. (1995) ‘Well-determined’ regions in RNA secondary structure prediction: analysis of small subunit ribosomal RNA. Nucleic Acids Res., 23, 2791–2798.

6. Gorodkin,J., Heyer,L.J. and Stormo,G.D. (1997) Finding the most significant common sequence and structure motifs in a set of RNA sequences. Nucleic Acids Res., 25, 3724–3732.

7. Hofacker,I.L., Fekete,M., Flamm,C., Huynen,M.A., Rauscher,S., Stolorz,P.E. and Stadler,P.F. (1998) Automatic detection of conserved RNA structure elements in complete RNA virus genomes. Nucleic Acids Res., 26, 3825–3836.

8. Lück,R., Graf,S. and Steger,G. (1999) Construct: a tool for thermodynamic controlled prediction of conserved secondary structure. Nucleic Acids Res., 27, 4208–4217.

9. Juan,V. and Wilson,C. (1999) RNA secondary structure prediction based on free energy and phylogenetic analysis. J. Mol. Biol., 289, 935–947.

10. Knudsen,B. and Hein,J. (1999) RNA secondary structure prediction using stochastic context-free grammars and evolutionary history. Bioinformatics, 15, 446–454.

11. Hofacker,I.L., Fekete,M. and Stadler,P.F. (2002) Secondary structure prediction for aligned RNA sequences. J. Mol. Biol., 319, 1059–1066.

12. Hofacker,I.L., Fontana,W., Stadler,P.F., Bonhoeffer,S., Tacker,M. and Schuster,P. (1994) Fast folding and comparison of RNA secondary structures. Monatsh. Chem., 125, 167–188.

13. Mathews,D., Sabina,J., Zuker,M. and Turner,D.H. (1999) Expanded sequence dependence of thermodynamic parameters provides robust prediction of RNA secondary structure. J. Mol. Biol., 288, 911–940.

14. SantaLucia,J. Jr. (1998) A unified view of polymer, dumbbell, and oligonucleotide DNA nearest-neighbor thermodynamics. Proc. Natl Acad. Sci. USA, 95, 1460–1465.

15. Bruccoleri,R.E. and Heinrich,G. (1988) An improved algorithm for nucleic acid secondary structure display. CABIOS, 4, 167–173.

16. Schuster,P., Fontana,W., Stadler,P.F. and Hofacker,I.L. (1994) From sequences to shapes and back: a case study in RNA secondary structures. Proc. Royal Society London B, 255, 279–284.

17. Waugh,A., Gendron,P., Altman,R., Brown,J., Case,D., Gautheret,D., Harvey,S., Leontis,N., Westbrook,J., Westhof,E. et al. (2002) RNAML: a standard syntax for exchanging RNA information. RNA, 8, 707–717.
