Eukaryote-specific domains in translation initiation factors: implications for translation regulation and evolution of the translation system

L Aravind et al. Genome Res. 2000 Aug.

Abstract

Computational analysis of sequences of proteins involved in translation initiation in eukaryotes reveals a number of specific domains that are not represented in bacteria or archaea. Most of these eukaryote-specific domains are known or predicted to possess an alpha-helical structure, which suggests that such domains are easier to invent in the course of evolution than are domains of other structural classes. A previously undetected, conserved region predicted to form an alpha-helical domain is delineated in the initiation factor eIF4G, in Nonsense-mediated mRNA decay 2 protein (NMD2/UPF2), in the nuclear cap-binding protein CBP80, and in other, poorly characterized proteins; this region is named the NIC (NMD2, eIF4G, CBP80) domain. Biochemical and mutagenesis data on NIC-containing proteins indicate that this predicted domain is one of the central adapters in the regulation of mRNA processing, translation, and degradation. It is demonstrated that, in the course of eukaryotic evolution, initiation factor eIF4G, of which NIC is the core, conserved portion, has accreted several additional, distinct predicted domains such as MI (MA-3 and eIF4G) and W2, which was probably accompanied by the acquisition of new regulatory interactions.

Figures

Figure 1

(A) (See pages 1175 and 1176.) Multiple sequence alignment of the predicted NIC domain. (B) Multiple sequence alignment of the predicted MI domain. The proteins are designated by their name, followed by the species abbreviation and GenBank gene identifier. The numbers on either side of the alignment represent the position of the first and last residue of the respective domain in each protein. A consensus secondary structure predicted using the PSIPRED and PHD programs is shown above the alignment. The coloring is based on the 80% consensus in part A and the 85% consensus in part B: h, hydrophobic residues / l, aliphatic residues shaded yellow (YFWLIVMA); c, charged residues colored magenta; p, polar residues colored purple (STQNEDRKH); s, small residues colored green (SAGTVPNHD); t, tiny residues shaded green (GAS); and b, big residues shaded gray (KREQWFYLMI). The point mutations in the human eIF4G1 and yeast eIF4G2 that affect translation are highlighted in blue in the NIC domain alignment. In the case of the NIC domain alignment, six distinct families are delineated by brackets to the right of the alignment panel 1. These families are (1) Nucampholin-like, (2) SGD1p-like, (3) KIAA0427-like, (4) CBP80-like, (5) eIF4G-like, and (6) NMD2-like. The species abbreviations are: At, Arabidopsis thaliana; Ce, Caenorhabditis elegans; Dm, Drosophila melanogaster; Mm, Mus musculus; Hs, Homo sapiens; Sc, Saccharomyces cerevisiae; Sp, Schizosaccharomyces pombe; Ta, Triticum aestivum; Pf, Plasmodium falciparum; and Lm, Leishmania major.
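The consensus coloring described in the legend can be illustrated with a short sketch. The source does not say how the consensus was actually computed, so everything here is an assumption made for illustration: the function names, the rule that gaps count toward the column total, and the choice to report the smallest qualifying residue class first. The residue sets are taken from the legend; the charged class (c) is omitted because the legend lists no residues for it.

```python
from collections import Counter

# Residue classes copied from the figure legend. The legend gives no residue
# set for the charged class (c), so it is left out rather than guessed.
RESIDUE_CLASSES = {
    "t": set("GAS"),         # tiny
    "h": set("YFWLIVMA"),    # hydrophobic
    "s": set("SAGTVPNHD"),   # small
    "p": set("STQNEDRKH"),   # polar
    "b": set("KREQWFYLMI"),  # big
}


def column_consensus(column, threshold=0.8):
    """Return a consensus symbol for one alignment column, or '.' if none.

    Assumption: gap characters ('-') stay in the denominator, so a column
    with many gaps rarely reaches the threshold.
    """
    residues = [r.upper() for r in column]
    total = len(residues)
    if total == 0:
        return "."

    # A single residue that reaches the threshold is reported as itself.
    residue, count = Counter(residues).most_common(1)[0]
    if residue != "-" and count / total >= threshold:
        return residue

    # Otherwise try the classes from the smallest set to the largest, so the
    # most specific class that reaches the threshold is reported.
    by_size = sorted(RESIDUE_CLASSES.items(), key=lambda item: len(item[1]))
    for symbol, members in by_size:
        if sum(r in members for r in residues) / total >= threshold:
            return symbol
    return "."


def alignment_consensus(sequences, threshold=0.8):
    """Apply column_consensus to every column of equal-length aligned rows."""
    return "".join(column_consensus(col, threshold) for col in zip(*sequences))
```

Under these assumptions, calling `alignment_consensus(rows, threshold=0.8)` would correspond to the 80% setting quoted for part A, and `threshold=0.85` to the setting quoted for part B.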

Figure 2

Domain organization of selected eukaryotic translation initiation factors and their homologs. The individual domains are drawn approximately to scale and are labeled by the acronyms that are indicated in the text. Zn in eIF2β and eIF5 indicates a zinc-ribbon. The additional unlabeled regions in CBP80 and YGR278w represent predicted globular domains that are not found in other proteins. The orange bar at the carboxyl terminus of YGR278w shows the RS repeat segment that is also found in several proteins participating in RNA metabolism. I-patch is an isoleucine-rich hexapeptide repeat domain, and NUCT is a nucleotidyl transferase domain of the sugar-nucleotide diphosphate-transferase family (Koonin 1999).
