Comprehensive classification of nucleotidyltransferase fold proteins: identification of novel families and their representatives in human

Krzysztof Kuchta et al. Nucleic Acids Res. 2009 Dec.

Abstract

This article presents a comprehensive review of the large and highly diverse superfamily of nucleotidyltransferase fold proteins, providing a global picture of their evolutionary history, sequence-structure diversity and functional roles. Using a state-of-the-art homology detection method combined with transitive searches and fold recognition, we revised the extent of this superfamily in numerous databases of catalogued protein families and structures, and identified 10 new nucleotidyltransferase fold families. These families include hundreds of previously uncharacterized and poorly annotated proteins such as Fukutin/LICD, NFAT, FAM46, Mab-21 and NRAP. Some of these proteins seem to play important roles not previously observed for this superfamily, such as regulation of gene expression or choline incorporation into the cell membrane. Importantly, within the newly detected families we identified 25 novel superfamily members in the human genome. Among these newly assigned members are proteins known to be involved in congenital muscular dystrophy, neurological diseases and retinitis pigmentosa, which sheds new light on the molecular background of these genetic disorders. Twelve of the new human nucleotidyltransferase fold proteins belong to the Mab-21 family, known to be involved in organogenesis and development. Determining the specific biological functions of these newly detected proteins remains a challenging task.

Figures

Figure 1.

NTase fold superfamily structures. An example of an NTase fold structure is presented together with the observed variation of secondary structure elements and additional domains in representative superfamily proteins of known structure. Positions of conserved active site motifs involved in catalysis ([DE]h[DE]h, h[DE]h) and substrate binding (hG[GS]) are marked above the corresponding secondary structure elements. Side chains of critical motif residues (aspartates and serine) are shown as balls and sticks. Suffixes ‘_CT’ and ‘_NC’ refer to catalytic and non-catalytic NTase domains, respectively.
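The motif notation in this caption can be read as simple sequence patterns. The following is a minimal sketch, not the authors' detection method (which relies on profile-based homology searches and fold recognition). It assumes that `h` stands for any hydrophobic residue, and the residue set used for `h`, the function name `find_motifs` and the example sequence are all illustrative assumptions.

```python
import re

# Assumption: 'h' in the caption's motifs means a hydrophobic residue.
# The exact residue set below is illustrative, not taken from the paper.
HYDROPHOBIC = "AVILMFWYC"

# Caption motifs translated into regular expressions.
MOTIFS = {
    "hG[GS]": f"[{HYDROPHOBIC}]G[GS]",
    "[DE]h[DE]h": f"[DE][{HYDROPHOBIC}][DE][{HYDROPHOBIC}]",
    "h[DE]h": f"[{HYDROPHOBIC}][DE][{HYDROPHOBIC}]",
}


def find_motifs(sequence):
    """Return {motif: [0-based start positions]} for a protein sequence."""
    seq = sequence.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        # A lookahead is used so that overlapping matches are also reported.
        regex = re.compile(f"(?=({pattern}))")
        hits[name] = [m.start() for m in regex.finditer(seq)]
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequence, not a real NTase fold protein.
    for motif, positions in find_motifs("MKLGSAADLDVQQVDL").items():
        print(motif, positions)
```

Patterns this short match by chance in many unrelated sequences, so a hit is only a hint and cannot replace the profile-based searches described in the abstract.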

Figure 2.

Multiple sequence alignment for the NTase fold superfamily. Only conserved regions of the fold core are shown for representative proteins of known structure and single representative sequences for groups lacking an experimentally solved structure. Sequences are labeled according to PDB code or NCBI gene identification (gi) number and the number of the group they belong to (XVII–XXVI, newly detected NTase fold proteins). Multiple NTase fold domains occurring in some proteins are denoted by consecutive Roman numerals. The numbers of excluded residues are specified in square brackets. Conservation of residues is denoted with the following scheme: uncharged, highlighted in yellow; small, letters in red; critical active site aspartates/glutamates, highlighted in black. Locations of secondary structure elements (E, β-strand; H, α-helix) are marked above the sequences.

Figure 3.

Connectivity network for the NTase fold superfamily. Groups of closely related proteins that can be linked by both PSI-BLAST and RPS-BLAST are surrounded by black (known members of the NTase fold superfamily) and red (newly detected) circles. Detected Meta-BASIC connections between PFAM, KOG and COG families, PDB structures and human proteins are shown as lines colored according to the Meta-BASIC score: pink, confident scores above 40; grey, below-threshold scores between 28 and 40. A short description and the taxonomic distribution are provided for each group.
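The color scheme in this caption amounts to a two-threshold rule on the Meta-BASIC score. The sketch below only restates that rule and is not part of Meta-BASIC. The function name `edge_color` is hypothetical, and the handling of a score of exactly 40 (treated as grey) and of scores below 28 (treated as not drawn) is an assumption, since the caption does not specify either case.

```python
def edge_color(score):
    """Map a Meta-BASIC score to the line color used in the caption.

    Assumptions: a score of exactly 40 counts as below threshold (grey),
    and scores under 28 correspond to connections that are not drawn.
    """
    if score > 40:
        return "pink"   # confident connection
    if score >= 28:
        return "grey"   # below-threshold connection
    return None         # not shown in the network


if __name__ == "__main__":
    for s in (55.2, 40, 28, 12):
        print(s, edge_color(s))
```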

Figure 4.

NTase fold proteins in human. Locations of genes encoding NTase fold proteins in the human genome are marked with triangles, while rectangles correspond to known genetic diseases associated with mutations found in some of these genes. ENSEMBL identifiers are provided for each protein in blue and red boxes, which correspond to known and newly identified groups in our classification, respectively.

Figure 5.

Genetic mutations in Fukutin and Fukutin-related protein connected with congenital muscular dystrophy. Positions of single point mutations found in genes encoding Fukutin (in Fukuyama-type congenital muscular dystrophy, FCMD) and Fukutin-related protein (in congenital muscular dystrophy type 1C, MDC1C) are shown as spheres on the respective 3D structural models. Mutations denoted with green spheres are clustered in the area of the active site and probably impede catalysis or substrate binding. Red spheres represent mutations grouped on the opposite side of the protein surface, where they may impair protein–protein or protein–membrane interactions.
