Blind prediction of noncanonical RNA structure at atomic accuracy
Related papers
Atomic accuracy in predicting and designing non-canonical RNA structure
Nature methods, 2010
We present a Rosetta full-atom framework for predicting and designing the non-canonical motifs that define RNA tertiary structure, called FARFAR (Fragment Assembly of RNA with Full Atom Refinement). For a test set of thirty-two 6-to-20-nucleotide motifs, the method recapitulated 50% of the experimental structures at near-atomic accuracy. Additionally, design calculations recovered the native sequence at the majority of RNA residues engaged in non-canonical interactions, and mutations predicted to stabilize a signal recognition particle domain were experimentally validated.
All-atom knowledge-based potential for RNA structure prediction and assessment
Bioinformatics/computer Applications in The Biosciences, 2011
Over recent years, the view that RNA simply serves as an information transfer molecule has changed dramatically. The study of the sequence/structure/function relationships in RNA is becoming more important. As a direct consequence, the total number of experimentally solved RNA structures has dramatically increased and new computer tools for predicting RNA structure from sequence are rapidly emerging. Therefore, new and accurate methods for assessing the accuracy of RNA structure models are clearly needed. Results: Here, we introduce an all-atom knowledge-based potential for the assessment of RNA three-dimensional (3D) structures. We have benchmarked our new potential, called Ribonucleic Acids Statistical Potential (RASP), with two different decoy datasets composed of near-native RNA structures. In one of the benchmark sets, RASP was able to rank the closest model to the X-ray structure as the best and within the top 10 models for ∼93 and ∼95% of decoys, respectively. The average correlation coefficient between model accuracy, calculated as the root mean square deviation and global distance test-total score (GDT-TS) measures of C3' atoms, and the RASP score was 0.85 and 0.89, respectively. Based on a recently released benchmark dataset that contains hundreds of 3D models for 32 RNA motifs with non-canonical base pairs, the RASP scoring function compared favorably to the ROSETTA FARFAR force field in the selection of accurate models. Finally, using the self-splicing group I intron and the stem-loop IIIc from hepatitis C virus internal ribosome entry site as test cases, we show that RASP is able to discriminate between known structure-destabilizing mutations and compensatory mutations. Availability: RASP can be readily applied to assess all-atom or coarse-grained RNA structures and thus should be of interest to both developers and end-users of RNA structure prediction methods. The computer software and knowledge-based potentials are freely available at
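The two evaluation measures quoted in this abstract (how well a score ranks the decoy closest to the native structure, and how strongly the score correlates with RMSD) can be sketched as follows. This is an illustrative sketch only, not RASP itself: the function names, the toy RMSD and score values, and the convention that a lower score is better are all assumptions made for the example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def rank_of_closest_model(rmsds, scores):
    """1-based rank, by ascending score, of the decoy with the lowest RMSD.

    Assumption: a lower score means a better model (energy-like score).
    """
    closest = min(range(len(rmsds)), key=rmsds.__getitem__)
    order = sorted(range(len(scores)), key=scores.__getitem__)
    return order.index(closest) + 1


# Hypothetical decoy set: RMSD to the native structure and a made-up score.
toy_rmsds = [0.8, 2.1, 3.5, 5.0, 6.2]
toy_scores = [-120.0, -95.0, -101.0, -60.0, -42.0]

rank = rank_of_closest_model(toy_rmsds, toy_scores)
print("closest decoy ranked first:", rank == 1)
print("closest decoy within top 10:", rank <= 10)
print("score/RMSD correlation:", round(pearson(toy_rmsds, toy_scores), 2))
```

Repeating the rank check over many decoy sets and counting how often the result is 1, or at most 10, gives percentages of the kind reported in the abstract.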
Bridging the gap in RNA structure prediction
Current Opinion in Structural Biology, 2007
The field of RNA structure prediction has experienced significant advances in the past several years, thanks to the availability of new experimental data and improved computational methodologies. These methods determine RNA secondary structures and pseudoknots from sequence alignments, thermodynamics-based dynamic programming algorithms, genetic algorithms and combined approaches. Computational RNA three-dimensional modeling uses this information in conjunction with manual manipulation, constraint satisfaction methods, molecular mechanics and molecular dynamics. The ultimate goal of automatically producing RNA three-dimensional models from given secondary and tertiary structure data, however, is still not fully realized. Recent developments in the computational prediction of RNA structure have helped bridge the gap between RNA secondary structure prediction, including pseudoknots, and three-dimensional modeling of RNA.
Algorithms for the Prediction of Secondary Structures of RNA Macromolecules
In this paper, we tackle the problem of predicting stable RNA secondary structures by energy computation. We present, under the Hypothesis of Loops Dependent Energy (HLDE), our dynamic programming algorithm to compute the free energies of the stable secondary structures and our traceback algorithm to predict these structures. We compute the free energies of the stable secondary structures by using a new approach, called m-Multiloop Approach (m-MA), m > 1. This computation is achieved within a time proportional to n^4 and using a memory space proportional to n^2. The prediction of the stable secondary structures is achieved within a time proportional to n^3 * log 3 (n). Compared to other approaches, the m-MA enables us to improve the estimation of the minimum energetic contributions of the multiloops and hence the estimation of the free energies of the stable secondary structures.
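The abstract above describes a dynamic programming fill followed by a traceback. A minimal sketch of that general pattern is given here using the much simpler Nussinov base-pair maximization, which is swapped in purely for illustration and is not the m-MA algorithm or its loop-dependent energy model. The pairing rules, the `min_loop` parameter and the function names are assumptions.

```python
# Watson-Crick and G-U wobble pairs (an assumption for this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def nussinov_fill(seq, min_loop=3):
    """best[i][j] = maximum number of base pairs in seq[i..j] (inclusive)."""
    n = len(seq)
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: position j stays unpaired
            for k in range(i, j - min_loop):  # case 2: j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best


def nussinov_traceback(seq, best, min_loop=3):
    """Recover one optimal structure in dot-bracket notation."""
    n = len(seq)
    structure = ["."] * n
    stack = [(0, n - 1)] if n else []
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue  # interval too short to contain a pair
        if best[i][j] == best[i][j - 1]:
            stack.append((i, j - 1))  # j is unpaired in an optimal solution
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in PAIRS:
                left = best[i][k - 1] if k > i else 0
                if left + 1 + best[k + 1][j - 1] == best[i][j]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return "".join(structure)


sequence = "GGGAAAUCC"  # toy sequence, not from the paper
table = nussinov_fill(sequence)
print(nussinov_traceback(sequence, table))
```

The fill step of this simplified scheme takes time proportional to n^3 and memory proportional to n^2; energy-based algorithms such as the one in the abstract replace the pair count with loop-dependent free energies.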
RNA structure prediction: Progress and perspective
Chinese Physics B, 2014
Many recent exciting discoveries have revealed the versatility of RNAs and their importance in a variety of cellular functions, which are strongly coupled to RNA structures. To understand the functions of RNAs, a number of structure prediction models have been developed in recent years. In this review, the progress in computational models for RNA structure prediction is introduced and the distinguishing features of many outstanding algorithms are discussed, emphasizing three-dimensional (3D) structure prediction. A promising coarse-grained model for predicting RNA 3D structure, stability and salt effect is also introduced briefly. Finally, we discuss the major challenges in RNA 3D structure modeling.
Lectures L2.1 RNAComposer: automated high-resolution structure prediction for large RNAs
2012
In contrast to the protein field, a much smaller number of RNA tertiary structures has been assessed by X-ray crystallography, NMR spectroscopy and cryo-EM, and deposited in structural data banks. In view of the rapidly growing access to RNA secondary structures, their 3D structure prediction is in great demand in the RNA community. Only a few programs and web-accessible tools have been proposed for semi-automated and automated prediction of RNA tertiary structure. Automated methods make use of coarse-grained and atomic-level molecular dynamics, internal coordinate space dynamics, fragment assembly and comparative modelling using templates. They vary considerably in terms of the required input data (RNA sequence, secondary structure, conformational data or structural templates), structure prediction quality across different RNA sizes and computation time. Recently we have developed a novel approach for the fully automated RNA 3D structure prediction from the user-defined secon...
RNA, 2011
RNA molecules play integral roles in gene regulation, and understanding their structures gives us important insights into their biological functions. Despite recent developments in template-based and parameterized energy functions, the structure of RNA, in particular the nonhelical regions, is still difficult to predict. Knowledge-based potentials have proven efficient in protein structure prediction. In this work, we describe two differentiable knowledge-based potentials derived from a curated data set of RNA structures, with all-atom or coarse-grained representation, respectively. We focus on one aspect of the prediction problem: the identification of native-like RNA conformations from a set of near-native models. Using a variety of near-native RNA models generated from three independent methods, we show that our potential is able to distinguish the native structure and identify native-like conformations, even at the coarse-grained level. The all-atom version of our knowledge-based potential performs better and appears to be more effective at discriminating near-native RNA conformations than one of the most highly regarded parameterized potentials. The fully differentiable form of our potentials will additionally likely be useful for structure refinement and/or molecular dynamics simulations.
Bioinformatics in Rice Research
Ribonucleic acid (RNA) is one of the major classes of molecules present in living cells. RNA structural elements modulate various biological processes, including epigenetic function, mRNA stability, and alternative splicing. The study of RNA secondary structure is therefore crucial for interpreting the roles and regulatory mechanisms of RNA transcripts. However, experimental methods are tedious, time-consuming and expensive, require special equipment, and thus often cannot be applied. Statistical simulation methods offer an option that runs parallel to experimental approaches. Additionally, the findings from RNA-Puzzles, a joint research effort on the estimation of RNA structures, suggest that computational methods can be employed for effective RNA modeling. However, there is still room for improvement. Considering this, in this chapter the authors attempt to understand the various forms of RNA and how computational approaches can be employed to predict their structure more precisely. The RNA
RNA Secondary Structure Prediction Via Energy Density Minimization
Lecture Notes in Computer Science, 2006
There is a resurgence of interest in the RNA secondary structure prediction problem (a.k.a. the RNA folding problem) due to the discovery of many new families of non-coding RNAs with a variety of functions. The vast majority of the computational tools for RNA secondary structure prediction are based on free energy minimization. Here the goal is to compute a non-conflicting collection of structural elements such as hairpins, bulges and loops, whose total free energy is as small as possible. Perhaps the most commonly used tool for structure prediction, mfold/RNAfold, is designed to fold a single RNA sequence. More recent methods, such as RNAscf and alifold, were developed to improve the prediction quality of this tool by aiming to minimize the free energy of a number of functionally similar RNA sequences simultaneously. Typically, the (stack) prediction quality of the latter approach improves as the number of sequences to be folded and/or the similarity between the sequences increase. If the number of available RNA sequences to be folded is small, then the predictive power of multiple sequence folding methods can deteriorate to that of the single sequence folding methods or worse. In this paper we show that delocalizing the thermodynamic cost of forming an RNA substructure by considering the energy density of the substructure can significantly improve on secondary structure prediction via free energy minimization. We describe a new algorithm and a software tool that we call Densityfold, which aims to predict the secondary structure of an RNA sequence by minimizing the sum of energy densities of individual substructures. We show that when only one or a small number of input sequences are available, Densityfold can outperform all available alternatives. It is our hope that this approach will help to better understand the process of nucleation that leads to the formation of biologically relevant RNA substructures.
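The difference between ranking substructures by total free energy and by energy density can be illustrated with a toy comparison. This is only a sketch of the idea of normalizing free energy by substructure length; it is not the Densityfold algorithm, and the `Substructure` type, the candidate names and the energy values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Substructure:
    name: str           # hypothetical label for the example
    start: int          # 0-based index of the first nucleotide
    end: int            # 0-based index of the last nucleotide (inclusive)
    free_energy: float  # made-up free energy in kcal/mol

    @property
    def length(self) -> int:
        return self.end - self.start + 1

    @property
    def energy_density(self) -> float:
        # Free energy normalized by the length of the underlying sequence span.
        return self.free_energy / self.length


candidates = [
    Substructure("long_helix", start=0, end=29, free_energy=-6.0),
    Substructure("short_hairpin", start=40, end=49, free_energy=-4.0),
]

by_energy = min(candidates, key=lambda s: s.free_energy)
by_density = min(candidates, key=lambda s: s.energy_density)

print("lowest free energy:   ", by_energy.name)
print("lowest energy density:", by_density.name)
```

In this toy case the longer substructure wins on total free energy, while the shorter one wins once the energy is spread over the number of nucleotides it spans.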