Genome analysis: RNA recognition motif (RRM) and K homology (KH) domain RNA-binding proteins from the flowering plant Arabidopsis thaliana

Zdravko J Lorković et al. Nucleic Acids Res. 2002.

Abstract

Regulation of gene expression at the post-transcriptional level is mainly achieved by proteins containing well-defined sequence motifs involved in RNA binding. The most widely spread motifs are the RNA recognition motif (RRM) and the K homology (KH) domain. In this article, we survey the complete Arabidopsis thaliana genome for proteins containing RRM and KH RNA-binding domains. The Arabidopsis genome encodes 196 RRM-containing proteins, a more complex set than found in Caenorhabditis elegans and Drosophila melanogaster. In addition, the Arabidopsis genome contains 26 KH domain proteins. Most of the Arabidopsis RRM-containing proteins can be classified into structural and/or functional groups, based on similarity with either known metazoan or Arabidopsis proteins. Approximately 50% of Arabidopsis RRM-containing proteins do not have obvious homologues in metazoa, and for most of those that are predicted to be orthologues of metazoan proteins, no experimental data exist to confirm this. Additionally, the function of most Arabidopsis RRM proteins and of all KH proteins is unknown. Based on the data presented here, it is evident that among all eukaryotes, only those RNA-binding proteins that are involved in the most essential processes of post-transcriptional gene regulation are preserved in structure and, most probably, in function. However, the higher complexity of RNA-binding proteins in Arabidopsis, as evident in groups of SR splicing factors and poly(A)-binding proteins, may account for the observed differences in mRNA maturation between plants and metazoa. This survey provides a first systematic analysis of plant RNA-binding proteins, which may serve as a basis for functional characterisation of this important protein group in plants.

Figures

Figure 1

Schematic representation of the modular structure of Arabidopsis RRM-containing proteins. Only major types of domain combinations are shown. Individual modules are identified by different shapes and colours. Different types of domains (RNA-binding, auxiliary domains and other distinctive regions of proteins) are listed at the bottom.

Figure 2

Dendrogram of Arabidopsis PABPs and their orthologues from yeast and human. The dendrogram and sequence alignments were generated with the PileUp program, using default parameters (Genetics Computer Group, Madison, WI).

Figure 3

Sequence analysis of Arabidopsis hnRNP A/B-like proteins. Alignment of the first (top) and the second (bottom) RRM of Arabidopsis hnRNP A/B proteins with metazoan members of the hnRNP A/B group of proteins. Sequences were aligned using the Clustal W program and shaded by the BoxShade server. Amino acids identical or similar in 50% of the sequences are shaded by black or grey background, respectively. Conserved secondary structure elements are indicated between alignments of the two RRMs. Asterisks indicate the position of residues located in the conserved hydrophobic core (52). Residues involved in formation of inter-RRM salt bridges are indicated with blue squares. The two acidic amino acids in the second RRM, possibly involved in salt bridges in Arabidopsis proteins, are indicated with purple squares. The RNP1 and RNP2 motifs are indicated with red boxes. Consensus sequences at the bottom of each alignment indicate residues that are conserved in 10/13 sequences for the whole alignment or in 5/6 sequences for Arabidopsis proteins. Six groups of similar amino acids are indicated as follows: B = H, K, R; J = I, L, M, V; O = F, W, Y; U = S, T; X = A, G; Z = D, E. Hs, Homo sapiens; Xl, X.laevis; Dm, D.melanogaster; Sa, Schistocerca americana; Ce, C.elegans. Accession numbers of proteins used in alignment are as follows: HsA1, SWISS-PROT P09651; HsB1, SWISS-PROT M29064; XlA1a, SWISS-PROT M31041; Dmhrp36, SWISS-PROT P48810; Dmhrp48, SWISS-PROT P48809; SaA1, SWISS-PROT P21522; CeA1, SWISS-PROT D10877.
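
The consensus rule described in this caption (a residue, or one of the six similarity groups, reported when it occurs in at least a threshold number of aligned sequences) can be illustrated with a minimal Python sketch. This is not the authors' procedure: the function name `consensus`, the `.` placeholder for non-conserved positions, the precedence of identical residues over groups, and the toy sequences are all assumptions made for illustration. Only the six group definitions are taken from the caption.

```python
from collections import Counter

# Similarity groups exactly as listed in the Figure 3 caption.
GROUPS = {
    "B": "HKR",
    "J": "ILMV",
    "O": "FWY",
    "U": "ST",
    "X": "AG",
    "Z": "DE",
}
RESIDUE_TO_GROUP = {aa: g for g, members in GROUPS.items() for aa in members}


def consensus(aligned, min_count):
    """Return a consensus string for equal-length aligned sequences.

    Assumed rule (hypothetical, for illustration only): emit a residue if
    it occurs in at least min_count sequences; otherwise emit a group
    letter if members of one group together reach min_count; otherwise
    emit '.'. Gap characters ('-') are ignored.
    """
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("aligned sequences must have equal length")
    out = []
    for column in zip(*aligned):
        residues = [aa for aa in column if aa != "-"]
        identical = Counter(residues).most_common(1)
        grouped = Counter(
            RESIDUE_TO_GROUP[aa] for aa in residues if aa in RESIDUE_TO_GROUP
        ).most_common(1)
        if identical and identical[0][1] >= min_count:
            out.append(identical[0][0])
        elif grouped and grouped[0][1] >= min_count:
            out.append(grouped[0][0])
        else:
            out.append(".")
    return "".join(out)


# Toy alignment (made up, not taken from the paper).
toy = ["KLFVG", "RIFVG", "KVYIG", "HLW-A"]
print(consensus(toy, 3))
```

With a threshold analogous to the caption's 10/13 or 5/6, a position is reported as a group letter only when no single residue reaches the threshold but a whole similarity group does.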

Figure 4

Alignment of RRMs of eight Arabidopsis GR-RBPs. The three Arabidopsis RZ-1 orthologues and RZ-1_like protein are not included. Details as in Figure 2. Consensus sequence at the bottom of alignment indicates residues that are conserved in 7/8 sequences.

Figure 5

Alignment of RRMs of 15 Arabidopsis S-RBPs. Details as in Figure 2. Consensus sequence at the bottom of alignment indicates residues that are conserved in 11/15 sequences.

Figure 6

Alignment of RRMs of eight Arabidopsis 30K-RRM proteins. Details as in Figure 2. Consensus sequence at the bottom of alignment indicates residues that are conserved in 6/8 sequences.
