Frequency and isostericity of RNA base pairs
Jesse Stombaugh et al. Nucleic Acids Res. 2009 Apr.
Abstract
Most of the hairpin, internal and junction loops that appear single-stranded in standard RNA secondary structures form recurrent 3D motifs, where non-Watson-Crick base pairs play a central role. Non-Watson-Crick base pairs also play crucial roles in tertiary contacts in structured RNA molecules. We previously classified RNA base pairs geometrically so as to group together those base pairs that are structurally similar (isosteric) and therefore able to substitute for each other by mutation without disrupting the 3D structure. Here, we introduce a quantitative measure of base pair isostericity, the IsoDiscrepancy Index (IDI), to more accurately determine which base pair substitutions can potentially occur in conserved motifs. We extract and classify base pairs from a reduced-redundancy set of RNA 3D structures from the Protein Data Bank (PDB) and calculate centroids (exemplars) for each base combination and geometric base pair type (family). We use the exemplars and IDI values to update our online Basepair Catalog and the Isostericity Matrices (IM) for each base pair family. From the database of base pairs observed in 3D structures we derive base pair occurrence frequencies for each of the 12 geometric base pair families. In order to improve the statistics from the 3D structures, we also derive base pair occurrence frequencies from rRNA sequence alignments.
Figures
Figure 1.
Representation of the three contributions to the IDI illustrated using non-isosteric base pairs. To calculate the IDI for two base pairs, the bases designated ‘first base’ in each base pair are superposed (bases on the left in each panel) and then the following three quantities are evaluated, normalized and summed: (1) The difference, Δc, in the intra-base pair C1′–C1′ distances, illustrated for two non-isosteric cWW base pairs, AG and AU. (2) The inter-base pair C1′–C1′ distance, t1, between the C1′ atoms of the second bases of the base pairs, illustrated for the near isosteric cWW AU and AC base pairs. We also calculate the corresponding distance t2 after first superposing the second bases of the base pairs. (3) The angle, θ, about an axis perpendicular to the base pair plane, required to superpose the second bases, illustrated using non-isosteric cWW AU and cWS AU base pairs. For some pairs of base pairs, a 180° rotation (flip) about an axis in the base pair plane is required to superpose the second bases (case not shown).
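The three contributions described in this caption can be illustrated with a minimal planar sketch. It is not the authors' implementation: the `PlanarPair` representation, the function names and the `scales` normalization constants are all hypothetical, since the caption does not state how each term is normalized. The sketch assumes the first bases are already superposed, so both pairs share one 2D frame, and it takes t2 (measured after superposing the second bases) as a separate input.

```python
import math
from dataclasses import dataclass


@dataclass
class PlanarPair:
    """Hypothetical 2D stand-in for a base pair, expressed in the frame
    of its first base (i.e. first bases are already superposed).
    Coordinates are in angstroms, the angle in degrees."""
    c1_first: tuple       # (x, y) of the C1' atom of the first base
    c1_second: tuple      # (x, y) of the C1' atom of the second base
    second_angle: float   # in-plane orientation of the second base


def contributions(p, q):
    """Return (delta_c, t1, theta) for two pairs in a shared frame."""
    # (1) difference in intra-base pair C1'-C1' distances
    delta_c = abs(math.dist(p.c1_first, p.c1_second)
                  - math.dist(q.c1_first, q.c1_second))
    # (2) distance between the C1' atoms of the two second bases
    t1 = math.dist(p.c1_second, q.c1_second)
    # (3) smallest in-plane rotation that aligns the second bases
    theta = abs(p.second_angle - q.second_angle) % 360.0
    theta = min(theta, 360.0 - theta)
    return delta_c, t1, theta


def combined_index(delta_c, t1, t2, theta, scales=(1.0, 1.0, 1.0, 1.0)):
    """Divide each term by a placeholder scale and sum the results.
    The default scales are assumptions, not values from the paper."""
    terms = (delta_c, t1, t2, theta)
    return sum(value / scale for value, scale in zip(terms, scales))
```

For two identical pairs every contribution is zero, so the combined value is zero; larger values indicate greater geometric dissimilarity under whatever scales are chosen.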
Figure 2.
Histograms of IDIs between sets of identical (upper left), isosteric (upper right), near isosteric (lower left) and non-isosteric (lower right) base pair instances from the 3D structures in the reduced-redundancy dataset having better than 3.0 Å resolution. Upper left: IDIs calculated between identical base pairs (i.e. GC cWW with GC cWW, UA tWH with UA tWH, etc.). Upper right: IDIs between 200 GC cWW and 200 UA cWW pairs. Lower left: IDIs between 200 GC cWW and 200 GU cWW pairs. Lower right: IDIs between 200 GU cWW and 200 UG cWW pairs.
Figure 3.
Part of 3D structural alignment of E. coli and H. marismortui 23S rRNAs, illustrating structural conservation of a complex motif of Domain I that includes Helix 24. (a) The 3D structural alignment of corresponding base pairs from the E. coli (left) and H. marismortui (right) structures. (b) The annotated 2D structures for E. coli and H. marismortui using the base pair symbols. (c) Stereo view of the E. coli 3D structure, highlighting bases that differ between structures. The base pairs in the alignment and in the 2D and 3D structures are color-coded by geometric base pair family. Letters that correspond to bases which differ between organisms are marked in the secondary structure by a magenta circle and in the 3D structure with thicker lines.
Figure 4.
Histograms of IDIs between actual base pairs in the 3D–3D alignment of E. coli and T. thermophilus 5S, 16S and 23S rRNAs. The IDIs used in these histograms were calculated before the revision of the 3D structures to correct syn-anti errors. The upper-left panel shows the IDI between all aligned base pairs, whether in the same geometric family or not. The base pairs with IDI > 6.0 are discussed in section ‘Base pair discrepancies between aligned positions in the rRNA 3D structural alignments’. The upper-right panel shows the IDI between aligned base pairs that belong to the same geometric family, and the lower panels subdivide these into two cases: those with identical base combinations (lower left) and those with different base combinations (lower right). All IDI values above 6 are placed in the rightmost bin in each histogram.
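The binning convention in this caption, where every value above 6 is collected in the rightmost bin, can be sketched as follows. The function name and the bin width of 0.5 are assumptions made for illustration; the caption does not give the bin width used in the figure.

```python
def histogram_with_overflow(values, bin_width=0.5, cap=6.0):
    """Count non-negative values in equal-width bins on [0, cap).
    Every value at or above `cap` is added to one final overflow bin."""
    n_regular = int(round(cap / bin_width))
    counts = [0] * (n_regular + 1)  # last slot is the overflow bin
    for v in values:
        if v >= cap:
            counts[-1] += 1
        else:
            counts[int(v // bin_width)] += 1
    return counts
```

With the defaults the result has 13 entries: 12 regular bins covering 0 to 6, plus the overflow bin that keeps large outliers visible without stretching the axis.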
Figure 5.
A graphical summary of the base pair occurrence frequencies within each base pair family, obtained from rRNA sequence data (data from Supplementary Table S8). For cWW, tHH, tWH, tHS, tWS and tSS, one base combination accounts for >50% of instances. The gray boxes in each matrix indicate base combinations that do not form that type of base pair. For example, there is no GG cWW base pair.
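As a rough illustration of how within-family occurrence frequencies and the ">50% of instances" observation could be computed from raw counts, consider the sketch below. The function names are hypothetical, and any counts passed in are the caller's own; no values from Supplementary Table S8 are reproduced here.

```python
def family_frequencies(counts):
    """Convert per-combination counts for one base pair family
    into fractions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {combo: n / total for combo, n in counts.items()}


def dominant_combination(counts, threshold=0.5):
    """Return the base combination whose frequency exceeds `threshold`,
    or None if no single combination does."""
    freqs = family_frequencies(counts)
    if not freqs:
        return None
    combo = max(freqs, key=freqs.get)
    return combo if freqs[combo] > threshold else None
```

Calling `dominant_combination` on made-up counts such as `{"GC": 60, "AU": 25, "GU": 15}` returns `"GC"`, whereas a family with no combination above one half returns `None`.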
Similar articles
- Recurrent structural RNA motifs, Isostericity Matrices and sequence alignments.
  Lescoute A, Leontis NB, Massire C, Westhof E. Nucleic Acids Res. 2005 Apr 28;33(8):2395-409. doi: 10.1093/nar/gki535. Print 2005. PMID: 15860776. Free PMC article.
- Isostericity and tautomerism of base pairs in nucleic acids.
  Westhof E. FEBS Lett. 2014 Aug 1;588(15):2464-9. doi: 10.1016/j.febslet.2014.06.031. Epub 2014 Jun 17. PMID: 24950426. Review.
- Motif prediction in ribosomal RNAs Lessons and prospects for automated motif prediction in homologous RNA molecules.
  Leontis NB, Stombaugh J, Westhof E. Biochimie. 2002 Sep;84(9):961-73. doi: 10.1016/s0300-9084(02)01463-3. PMID: 12458088.
- ISFOLD: structure prediction of base pairs in non-helical RNA motifs from isostericity signatures in their sequence alignments.
  Mokdad A, Frankel AD. J Biomol Struct Dyn. 2008 Apr;25(5):467-72. doi: 10.1080/07391102.2008.10531239. PMID: 18282001.
- Analysis of RNA motifs.
  Leontis NB, Westhof E. Curr Opin Struct Biol. 2003 Jun;13(3):300-8. doi: 10.1016/s0959-440x(03)00076-9. PMID: 12831880. Review.
Cited by
- RNAMotifProfile: a graph-based approach to build RNA structural motif profiles.
  Rahaman MM, Zhang S. NAR Genom Bioinform. 2024 Sep 26;6(3):lqae128. doi: 10.1093/nargab/lqae128. eCollection 2024 Sep. PMID: 39328267. Free PMC article.
- Assessing RNA atomistic force fields via energy landscape explorations in implicit solvent.
  Röder K, Pasquali S. Biophys Rev. 2024 Jun 17;16(3):285-295. doi: 10.1007/s12551-024-01202-9. eCollection 2024 Jun. PMID: 39099837. Free PMC article. Review.
- Concurrent prediction of RNA secondary structures with pseudoknots and local 3D motifs in an integer programming framework.
  Loyer G, Reinharz V. Bioinformatics. 2024 Feb 1;40(2):btae022. doi: 10.1093/bioinformatics/btae022. PMID: 38230755. Free PMC article.
- When will RNA get its AlphaFold moment?
  Schneider B, Sweeney BA, Bateman A, Cerny J, Zok T, Szachniuk M. Nucleic Acids Res. 2023 Oct 13;51(18):9522-9532. doi: 10.1093/nar/gkad726. PMID: 37702120. Free PMC article.
- Computer-aided comprehensive explorations of RNA structural polymorphism through complementary simulation methods.
  Röder K, Stirnemann G, Faccioli P, Pasquali S. QRB Discov. 2022 Oct 17;3:e21. doi: 10.1017/qrd.2022.19. eCollection 2022. PMID: 37529277. Free PMC article. Review.
Grants and funding
- R01 GM085328/GM/NIGMS NIH HHS/United States
- R15 GM055898/GM/NIGMS NIH HHS/United States
- R15 GM055898-04/GM/NIGMS NIH HHS/United States
- 2 R15GM055898-04/GM/NIGMS NIH HHS/United States