Rfam: updates to the RNA families database

Nucleic Acids Res. 2009 Jan;37(Database issue):D136-40.

doi: 10.1093/nar/gkn766. Epub 2008 Oct 25.

Paul P Gardner et al. Nucleic Acids Res. 2009 Jan.

Abstract

Rfam is a collection of RNA sequence families, represented by multiple sequence alignments and covariance models (CMs). The primary aim of Rfam is to annotate new members of known RNA families on nucleotide sequences, particularly complete genomes, using sensitive BLAST filters in combination with CMs. A minority of families with a very broad taxonomic range (e.g. tRNA and rRNA) provide the majority of the sequence annotations, whilst the majority of Rfam families (e.g. snoRNAs and miRNAs) have a limited taxonomic range and provide a limited number of annotations. Recent improvements to the website, methodologies and data used by Rfam are discussed. Rfam is freely available on the Web at http://rfam.sanger.ac.uk/ and http://rfam.janelia.org/.

Figures

Figure 1.

An outline of the Rfam 9.0 databases and methods. RFAMSEQ is drawn from EMBL excluding only the EST, synthetic and patented divisions. There are 603 Rfam families in release 9.0, which are used to scan RFAMSEQ for homologues using first WU-BLAST filters followed by the more accurate CM-based methods cmsearch and cmalign. This results in 603 FULL alignments annotating 636 138 regions.
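The filter-then-model strategy described in this caption can be illustrated with a minimal sketch. It is not the Rfam pipeline and does not call WU-BLAST, cmsearch or cmalign; `fast_filter` and `slow_score` are hypothetical stand-ins for a cheap sequence filter and an expensive CM scoring step, and the `Region` tuple and `threshold` parameter are assumptions made only for illustration.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical representation of a candidate region: (sequence id, start, end).
Region = Tuple[str, int, int]


def two_stage_search(
    regions: Iterable[Region],
    fast_filter: Callable[[Region], bool],
    slow_score: Callable[[Region], float],
    threshold: float,
) -> List[Tuple[Region, float]]:
    """Return regions that pass the cheap filter and then score at or above
    the threshold under the expensive model."""
    hits = []
    for region in regions:
        if not fast_filter(region):
            continue  # rejected cheaply; the expensive model never sees it
        score = slow_score(region)
        if score >= threshold:
            hits.append((region, score))
    return hits
```

The point of the design is that the expensive scoring function runs only on the regions the cheap filter lets through, so overall sensitivity is bounded by the filter's sensitivity.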

Figure 2.

An example of the new secondary structure markups used by Rfam. The coronavirus 3′-UTR pseudoknot is shown (Rfam Accession RF00165). We display coloured markups of sequence conservation (A), covariation (B), base-pair conservation, also known as the fraction of canonical base pairs (C), and CM scores (D).
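As an illustration of the base-pair conservation measure named in this caption, the following sketch computes the fraction of canonical base pairs for one pair of alignment columns. It rests on assumptions not stated in the source: canonical pairs are taken to be the Watson-Crick pairs plus the G-U wobble pair, T is treated as U, gaps count as non-canonical, and the function name and toy alignment are hypothetical.

```python
from typing import Sequence

# Assumed definition of "canonical": Watson-Crick pairs plus G-U wobble.
CANONICAL = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def fraction_canonical(alignment: Sequence[str], i: int, j: int) -> float:
    """Fraction of sequences whose residues in columns i and j (0-based)
    form a canonical base pair. Gaps are counted as non-canonical."""
    if not alignment:
        return 0.0
    canonical = 0
    for seq in alignment:
        left = seq[i].upper().replace("T", "U")
        right = seq[j].upper().replace("T", "U")
        if (left, right) in CANONICAL:
            canonical += 1
    return canonical / len(alignment)


# Toy alignment: columns 0 and 4 are assumed to be paired.
toy = ["GAAAC", "CAAAG", "AAAAU", "GAAAA"]
print(fraction_canonical(toy, 0, 4))  # three of four rows pair canonically: 0.75
```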

Figure 3.

An example of how PDB structures are displayed in Rfam. In this case, the structure 1lng, containing the SRP19-7S.S RNA Complex from M. jannaschii, is rendered as cartoons using Jmol. Protein regions are coloured using the following scheme: beta-sheets (yellow), helices (magenta) and unstructured regions (white). RNA bases that match the Rfam model are coloured according to the key given in the web page (not shown here). In this structure, green represents a match to the eukaryotic SRP model, whereas unmatched bases are coloured orange.
