Polyvalent Proteins, a Pervasive Theme in the Intergenomic Biological Conflicts of Bacteriophages and Conjugative Elements

J Bacteriol. 2017 Jul 11;199(15):e00245-17.

doi: 10.1128/JB.00245-17. Print 2017 Aug 1.



Lakshminarayan M Iyer et al. J Bacteriol. 2017.

Abstract

Intense biological conflicts between prokaryotic genomes and their genomic parasites have resulted in an arms race in terms of the molecular "weaponry" deployed on both sides. Using a recursive computational approach, we uncovered a remarkable class of multidomain proteins with 2 to 15 domains in the same polypeptide deployed by viruses and plasmids in such conflicts. Domain architectures and genomic contexts indicate that they are part of a widespread conflict strategy involving proteins injected into the host cell along with parasite DNA during the earliest phase of infection. Their unique feature is the combination of domains with highly disparate biochemical activities in the same polypeptide; accordingly, we term them polyvalent proteins. Of the 131 domains in polyvalent proteins, a large fraction are enzymatic domains predicted to modify proteins, target nucleic acids, alter nucleotide signaling/metabolism, and attack peptidoglycan or cytoskeletal components. They further contain nucleic acid-binding domains, virion structural domains, and 40 novel uncharacterized domains. Analysis of their architectural network reveals both pervasive common themes and specialized strategies for conjugative elements and plasmids or (pro)phages. The themes include likely processing of multidomain polypeptides by zincin-like metallopeptidases and mechanisms to counter restriction or CRISPR/Cas systems and jump-start transcription or replication. DNA-binding domains acquired by eukaryotes from such systems have been reused in XPC/RAD4-dependent DNA repair and mitochondrial genome replication in kinetoplastids. 
Characterization of the novel domains discovered here, such as RNases and peptidases, is likely to aid in the development of new reagents and elucidation of the spread of antibiotic resistance.

IMPORTANCE: This is the first report of the widespread presence of large proteins, termed polyvalent proteins, predicted to be transmitted by genomic parasites such as conjugative elements, plasmids, and phages during the initial phase of infection along with their DNA. They are typified by the presence of multiple domains with disparate activities combined in the same protein. While some of these domains are predicted to assist the invasive element in replication, transcription, or protection of its DNA, several are likely to target various host defense systems or modify the host to favor the parasite's life cycle. Notably, DNA-binding domains from these systems have been transferred to eukaryotes, where they have been incorporated into DNA repair and mitochondrial genome replication systems.

Keywords: DNA replication; DNA-binding proteins; RNases; antirestriction; bacteriophages; biological conflicts; effectors; metallopeptidase; plasmids; transcription.

Copyright © 2017 Iyer et al.


Figures

FIG 1


(A) Schematic showing the recursive process used to find domains in polyvalent proteins. (B) Frequency distribution of the top 100 domains observed in polyvalent proteins. (C) Examples of recovered polyvalent proteins and their domain architectures. (D) Examples of neighborhoods of polyvalent-protein-encoding genes illustrating the broad types of genome contexts. (E) Plot illustrating the number of polyvalent proteins with a given number of domains. Note that the y axis is on a log scale.

FIG 2


Polyvalent protein domain networks. (A) Chord diagram of domain cooccurrences in polyvalent proteins. The plot includes all cooccurrences of domains. Thus, an edge is drawn between two domains in a protein whether they are adjacent or not. (B) Domain architecture network of polyvalent proteins. Domains linked in the same protein are connected by arrows with each arrowhead pointing to the C-terminal domain. Here and in the subsequent network images, an edge is drawn only between two adjacent domains. (C) Clique subnetwork of the domain architecture network merging large cliques with seven or eight nodes. The network reveals two distinct subgraphs as described in the text. (D) Largest biconnected subnetwork of the domain architecture network. In all of the networks, the node size and color are scaled on the basis of the number of connections per node (degree). Nodes with two or fewer connections are gray. Edge thickness is based on the number of edge occurrences. Edges occurring <15 times are gray, those occurring 16 to 90 times are cadet blue, and those occurring >90 times are maroon.
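The two edge rules in this caption (every cooccurring pair for the chord diagram, adjacent pairs only for the architecture network) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the example architectures and all function names are hypothetical, degree is assumed to mean the number of distinct neighbors, and an edge count of exactly 15, which the caption's color bins do not cover, is treated as gray here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical architectures, each listing domains from N- to C-terminus.
architectures = [
    ["MPTase", "ART", "Helicase"],
    ["MPTase", "ART"],
    ["ParB", "MPTase", "ART"],
]

def cooccurrence_edges(archs):
    """Undirected edges between every pair of distinct domains in a protein."""
    counts = Counter()
    for arch in archs:
        for a, b in combinations(sorted(set(arch)), 2):
            counts[(a, b)] += 1
    return counts

def adjacency_edges(archs):
    """Directed edges between adjacent domains, pointing to the C-terminal one."""
    counts = Counter()
    for arch in archs:
        for n_dom, c_dom in zip(arch, arch[1:]):
            counts[(n_dom, c_dom)] += 1
    return counts

def degrees(edge_counts):
    """Distinct neighbors per node, ignoring direction (assumed definition)."""
    neighbors = {}
    for a, b in edge_counts:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    return {node: len(ns) for node, ns in neighbors.items()}

def edge_color(n):
    """Color bins from the caption; n == 15 is assigned to gray by assumption."""
    if n <= 15:
        return "gray"
    if n <= 90:
        return "cadet blue"
    return "maroon"
```

With these toy inputs, MPTase and Helicase are linked under the cooccurrence rule but not under the adjacency rule, which is why the chord diagram is denser than the architecture network.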

FIG 3


Multiple-sequence alignments of MPTase (A), Pol-β NTase (B), DdrB-ParB (C), and PBECR (D) domains. Secondary structure and element labeling is provided at the top line, and residue consensus at various percentages is provided at the bottom line. Sequences are identified on the left by gene name, organism abbreviation, and NCBI accession number separated by vertical lines and to the right by family name. Sequences of structures are labeled with PDB codes, shaded in orange. Poorly conserved secondary structure elements are colored white. Family-specific conserved residues described in the text are denoted by blue boxes. Alignments are colored as follows: h (hydrophobic), l (aliphatic), and a (aromatic) are shaded yellow; p (polar), + (positively charged), − (negatively charged), and c (charged) are shaded blue; s (small) and t (tiny) are shaded green; b (big) is shaded gray; absolutely conserved residues are in white lettering and shaded in black. For organism abbreviations, see File S4 at ftp://ftp.ncbi.nih.gov/pub/aravind/polyvalent/polyvalent.html.

FIG 4


Domain architectures and gene neighborhoods of domains found in polyvalent proteins grouped by the presence of various principal domains or groups of domains, including peptidases (A), ARTs (B), kinases (C), GNATs (D), Pol-β NTases (E), helicases (F), DdrB-like ParB (G), RadC (H), SMS/RadA (I), and Toprim fold and Primpol primases (J). These include only a small sample of the entire diaspora of associations. For the complete set of domain architectures and operons, see ftp://ftp.ncbi.nih.gov/pub/aravind/polyvalent/polyvalent.html. Proteins and gene neighborhoods are shown with their species names and GenBank accession numbers. For gene neighborhoods, the accession number of the gene in dark pink is used. Gene names are shown only for well-studied proteins. Genes in neighborhoods are shown as boxed arrows, with the arrowhead pointing to the 3′ gene. Domains are not drawn to scale.

FIG 5


Domain architectures and gene neighborhoods of domains found in polyvalent proteins grouped by the presence of various principal domains or groups of domains, including PBECR (A); 2H (B); RelA/SpoT (C); inorganic pyrophosphatases (D); phosphoribosyltransferases (E); peptidoglycan and cytoskeleton-targeting domains (F); nucleic acid binding domains (G); and DarA, ArdA, and YodL domains (H). Domain and gene neighborhood designations are as in Fig. 4.


