ARTEM: a method for RNA tertiary motif identification with backbone permutations, and its example application to kink-turn-like motifs
Related papers
FR3D: finding local and composite recurrent structural motifs in RNA 3D structures
Journal of Mathematical Biology, 2008
New methods are described for finding recurrent three-dimensional (3D) motifs in RNA atomic-resolution structures. Recurrent RNA 3D motifs are sets of RNA nucleotides with similar spatial arrangements. They can be local or composite. Local motifs comprise nucleotides that occur in the same hairpin or internal loop. Composite motifs comprise nucleotides belonging to three or more different RNA strand segments or molecules. We use a base-centered approach to construct efficient, yet exhaustive search procedures using geometric, symbolic, or mixed representations of RNA structure that we implement in a suite of MATLAB programs, “Find RNA 3D” (FR3D). The first modules of FR3D preprocess structure files to classify base-pair and -stacking interactions. Each base is represented geometrically by the position of its glycosidic nitrogen in 3D space and by the rotation matrix that describes its orientation with respect to a common frame. Base-pairing and base-stacking interactions are calculated from the base geometries and are represented symbolically according to the Leontis/Westhof basepairing classification, extended to include base-stacking. These data are stored and used to organize motif searches. For geometric searches, the user supplies the 3D structure of a query motif which FR3D uses to find and score geometrically similar candidate motifs, without regard to the sequential position of their nucleotides in the RNA chain or the identity of their bases. To score and rank candidate motifs, FR3D calculates a geometric discrepancy by rigidly rotating candidates to align optimally with the query motif and then comparing the relative orientations of the corresponding bases in the query and candidate motifs. Given the growing size of the RNA structure database, it is impossible to explicitly compute the discrepancy for all conceivable candidate motifs, even for motifs with less than ten nucleotides. 
The screening algorithm that we describe finds all candidate motifs whose geometric discrepancy with respect to the query motif falls below a user-specified cutoff discrepancy. This technique can be applied to RMSD searches. Candidate motifs identified geometrically may be further screened symbolically to identify those that contain particular basepair types or base-stacking arrangements or that conform to sequence continuity or nucleotide identity constraints. Purely symbolic searches for motifs containing user-defined sequence, continuity and interaction constraints have also been implemented. We demonstrate that FR3D finds all occurrences, both local and composite and with nucleotide substitutions, of sarcin/ricin and kink-turn motifs in the 23S and 5S ribosomal RNA 3D structures of the H. marismortui 50S ribosomal subunit and assigns the lowest discrepancy scores to bona fide examples of these motifs. The search algorithms have been optimized for speed to allow users to search the non-redundant RNA 3D structure database on a personal computer in a matter of minutes.
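The idea of cheaply screening candidates against a cutoff before a full geometric comparison can be illustrated with a rotation-invariant score. The sketch below is not FR3D's discrepancy or its screening algorithm: it swaps in a plain distance RMSD (dRMSD) over one representative point per nucleotide, and the names `pairwise_distances`, `distance_rmsd`, and `screen` are hypothetical. It assumes that the correspondence between query and candidate nucleotides is already fixed.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """Distances between every pair of 3D points, in a fixed pair order."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def distance_rmsd(query, candidate):
    """Root-mean-square difference between the two pairwise-distance lists.

    The score is unchanged by rotating or translating either point set,
    so no superposition is needed. It cannot tell mirror images apart.
    """
    if len(query) != len(candidate):
        raise ValueError("query and candidate must have the same number of points")
    dq = pairwise_distances(query)
    dc = pairwise_distances(candidate)
    if not dq:  # fewer than two points: nothing to compare
        return 0.0
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(dq, dc)) / len(dq))


def screen(query, candidates, cutoff):
    """Names of candidates whose dRMSD to the query is at or below the cutoff."""
    return [name for name, pts in candidates.items()
            if distance_rmsd(query, pts) <= cutoff]


# Hypothetical coordinates: one point per nucleotide (assumption for illustration).
query = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
candidates = {
    # query rotated 90 degrees about z and shifted by (5, 5, 5)
    "rotated_copy": [(5.0, 5.0, 5.0), (5.0, 6.0, 5.0), (4.0, 5.0, 5.0)],
    # one point moved far from its query position
    "stretched": [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
}
hits = screen(query, candidates, cutoff=0.5)
```

A screen of this kind only prunes; surviving candidates would still need the full geometric comparison, since a distance-based score ignores base orientation and chirality.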
A structural database for k-turn motifs in RNA
RNA, 2010
The kink-turn (k-turn) is a common structural motif in RNA that introduces a tight kink into the helical axis. k-turns play an important architectural role in RNA structures and serve as binding sites for a number of proteins. We have created a database of known and postulated k-turn sequences and three-dimensional (3D) structures, available via the internet. This site provides (1) a database of sequence and structure, as a resource for the RNA community, and (2) a tool to enable the manipulation and comparison of 3D structures where known.
De novo discovery of structural motifs in RNA 3D structures through clustering
As functional components in three-dimensional conformation of an RNA, the RNA structural motifs provide an easy way to associate the molecular architectures with their biological mechanisms. In the past years, many computational tools have been developed to search motif instances by using the existing knowledge of well-studied families. Recently, with the rapidly increasing number of resolved RNA 3D structures, there is an urgent need to discover novel motifs with the newly presented information. In this work, we classify all the loops in non-redundant RNA 3D structures to detect plausible RNA structural motif families by using a clustering pipeline. Compared with other clustering approaches, our method has two benefits: first, the underlying alignment algorithm is tolerant to the variations in 3D structures; second, sophisticated downstream analysis has been performed to ensure the clusters are valid and easily applied to further research. The final clustering results contain many ...
bioRxiv (Cold Spring Harbor Laboratory), 2024
Non-coding RNAs play a major role in diverse processes in living cells with their sequence and spatial structure serving as the principal determinants of their function. Superposition of RNA 3D structures is the most accurate method for comparative analysis of RNA molecules and for inferring sequence alignments. Topology-independent superposition is particularly relevant, as evidenced by structurally similar RNAs with sequence permutations such as tRNA and Y RNA. To date, state-of-the-art methods for RNA 3D structure superposition rely on intricate heuristics, and the potential for topology-independent superposition has not been exhausted. Recently, we introduced the ARTEM method for unrestrained pairwise superposition of RNA 3D modules and now we developed it further to solve the global RNA 3D structure alignment problem. Our new tool ARTEMIS significantly outperforms state-of-the-art tools in both sequentially-ordered and topology-independent RNA 3D structure superposition. Using ARTEMIS we discovered a helical packing motif to be preserved within different backbone topology contexts across various non-coding RNAs, including multiple ribozymes and riboswitches. We anticipate that ARTEMIS will be essential for elucidating the landscape of RNA 3D folds and motifs featuring sequence permutations that thus far remained unexplored due to limitations in previous computational approaches.
RNA Bricks--a database of RNA 3D motifs and their interactions
Nucleic Acids Research, 2014
The RNA Bricks database (http://iimcb.genesilico.pl/rnabricks) stores information about recurrent RNA 3D motifs and their interactions, found in experimentally determined RNA structures and in RNA-protein complexes. In contrast to other similar tools (RNA 3D Motif Atlas, RNA Frabase, Rloom), RNA motifs, i.e. 'RNA bricks', are presented in the molecular environment in which they were determined, including RNA, protein, metal ions, water molecules and ligands. All nucleotide residues in RNA bricks are annotated with structural quality scores that describe real-space correlation coefficients with the electron density data (if available), backbone geometry and possible steric conflicts, which can be used to identify poorly modeled residues. The database is also equipped with an algorithm for 3D motif search and comparison. The algorithm compares spatial positions of backbone atoms of the user-provided query structure and of stored RNA motifs, without relying on sequence or secondary structure information. This enables the identification of local structural similarities among evolutionarily related and unrelated RNA molecules. In addition, the search utility enables searching 'RNA bricks' according to sequence similarity, and makes it possible to identify motifs with modified ribonucleotide residues at specific positions.
Nucleic Acids Research, 2012
Similarities in the 3D patterns of RNA base interactions or arrangements can provide insights into their functions and roles in stabilization of the RNA 3D structure. Nucleic Acids Search for Substructures and Motifs (NASSAM) is a graph theoretical program that can search for 3D patterns of base arrangements by representing the bases as pseudo-atoms. The geometric relationship of the pseudo-atoms to each other as a pattern can be represented as a labeled graph where the pseudo-atoms are the graph's nodes while the edges are the interpseudo-atomic distances. The input files for NASSAM are PDB formatted 3D coordinates. This web server can be used to identify matches of base arrangement patterns in a query structure to annotated patterns that have been reported in the literature or that have possible functional and structural stabilization implications. The NASSAM program is freely accessible without any login requirement at
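The graph representation described above (pseudo-atoms as nodes, inter-pseudo-atomic distances as edges) lends itself to a small backtracking search. The following is a minimal sketch under stated assumptions, not NASSAM's implementation: each base is reduced to a single labeled pseudo-atom, labels are base letters, the tolerance value is arbitrary, and `find_matches` is a hypothetical name.

```python
import math


def find_matches(pattern, structure, tol=0.5):
    """Find occurrences of a labeled distance pattern in a structure.

    pattern, structure: lists of (label, (x, y, z)) pseudo-atoms.
    Returns tuples of structure indices, one index per pattern node,
    such that labels agree and every pairwise distance agrees within tol.
    """
    matches = []

    def extend(assigned):
        k = len(assigned)
        if k == len(pattern):
            matches.append(tuple(assigned))
            return
        p_label, p_xyz = pattern[k]
        for i, (s_label, s_xyz) in enumerate(structure):
            if i in assigned or s_label != p_label:
                continue
            # Edge check: distances to all previously matched nodes must agree.
            consistent = all(
                abs(math.dist(p_xyz, pattern[j][1])
                    - math.dist(s_xyz, structure[assigned[j]][1])) <= tol
                for j in range(k)
            )
            if consistent:
                extend(assigned + [i])

    extend([])
    return matches


# Hypothetical pseudo-atom coordinates (assumption for illustration).
pattern = [("A", (0.0, 0.0, 0.0)), ("G", (3.0, 0.0, 0.0)), ("U", (0.0, 4.0, 0.0))]
structure = [
    ("G", (13.0, 10.0, 10.0)),
    ("C", (20.0, 20.0, 20.0)),
    ("A", (10.0, 10.0, 10.0)),
    ("U", (10.0, 14.0, 10.0)),
    ("A", (50.0, 50.0, 50.0)),
]
found = find_matches(pattern, structure)
```

Because matching depends only on labels and distances, the order of nodes along the chain plays no role, which is what lets a search of this kind find arrangements regardless of sequence position.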
Nucleic Acids Research, 2012
RNA secondary structure is important for designing therapeutics, understanding protein-RNA binding and predicting tertiary structure of RNA. Several databases and downloadable programs exist that specialize in the three-dimensional (3D) structure of RNA, but none focus specifically on secondary structural motifs such as internal, bulge and hairpin loops. The RNA Characterization of Secondary Structure Motifs (RNA CoSSMos) database is a freely accessible and searchable online database and website of 3D characteristics of secondary structure motifs. To create the RNA CoSSMos database, 2156 Protein Data Bank (PDB) files were searched for internal, bulge and hairpin loops, and each loop's structural information, including sugar pucker, glycosidic linkage, hydrogen bonding patterns and stacking interactions, was included in the database. False positives were defined, identified and reclassified or omitted from the database to ensure the most accurate results possible. Users can search via general PDB information, experimental parameters, sequence and specific motif and by specific structural parameters in the subquery page after the initial search. Returned results for each search can be viewed individually or a complete set can be downloaded into a spreadsheet to allow for easy comparison. The RNA CoSSMos database is automatically updated weekly and is available at http://cossmos.slu.edu.
RAG-3D: a search tool for RNA 3D substructures
Nucleic Acids Research, 2015
To address many challenges in RNA structure/function prediction, the characterization of RNA's modular architectural units is required. Using the RNA-As-Graphs (RAG) database, we have previously explored the existence of secondary structure (2D) submotifs within larger RNA structures. Here we present RAG-3D, a dataset of RNA tertiary (3D) structures and substructures plus a web-based search tool, designed to exploit graph representations of RNAs for the goal of searching for similar 3D structural fragments. The objects in RAG-3D consist of 3D structures translated into 3D graphs, cataloged based on the connectivity between their secondary structure elements. Each graph is additionally described in terms of its subgraph building blocks. The RAG-3D search tool then compares a query RNA 3D structure to those in the database to obtain structurally similar structures and substructures. This comparison reveals conserved 3D RNA features and thus may suggest functional connections. Though...
A method for automated discovering of RNA tertiary motifs
2008
We used a novel graph-based approach to identify recurrent RNA tertiary motifs embedded within secondary structure. We catalogued all the secondary structural elements of the RNA molecule and clustered them using an innovative graph similarity measure. We applied our method to three widely studied structures: H.m 50S, E.coli 50S and T.th 16S. We identified 10 known motifs without any prior knowledge of their shapes or