Biological systems discovery in silico: radical S-adenosylmethionine protein families and their target peptides for posttranslational modification
Daniel H Haft et al. J Bacteriol. 2011 Jun.
Abstract
Data mining methods in bioinformatics and comparative genomics commonly rely on working definitions of protein families from prior computation. Partial phylogenetic profiling (PPP), by contrast, optimizes family sizes during its searches for the cooccurring protein families that serve different roles in the same biological system. In a large-scale investigation of the incredibly diverse radical S-adenosylmethionine (SAM) enzyme superfamily, PPP aided in building a collection of 68 TIGRFAMs hidden Markov models (HMMs) that define nonoverlapping and functionally distinct subfamilies. Many identify radical SAM enzymes as molecular markers for multicomponent biological systems; HMMs defining their partner proteins also were constructed. Newly found systems include five groupings of protein families in which at least one marker is a radical SAM enzyme while another, encoded by an adjacent gene, is a short peptide predicted to be its substrate for posttranslational modification. The most prevalent, in over 125 genomes, featuring a peptide that we designate SCIFF (six cysteines in forty-five residues), is conserved throughout the class Clostridia, a distribution inconsistent with putative bacteriocin activity. A second novel system features a tandem pair of putative peptide-modifying radical SAM enzymes associated with a highly divergent family of peptides in which the only clearly conserved feature is a run of His-Xaa-Ser repeats. A third system pairs a radical SAM domain peptide maturase with selenocysteine-containing targets, suggesting a new biological role for selenium. These and several additional novel maturases that cooccur with predicted target peptides share a C-terminal additional 4Fe4S-binding domain with PqqE, the subtilosin A maturase AlbA, and the predicted mycofactocin and Nif11-class peptide maturases as well as with activators of anaerobic sulfatases and quinohemoprotein amine dehydrogenases. 
Radical SAM enzymes with this additional domain, as detected by TIGR04085, significantly outnumber lantibiotic synthases and cyclodehydratases combined in reference genomes while being highly enriched for members whose apparent targets are small peptides. Interpretation of comparative genomics evidence suggests unexpected (nonbacteriocin) roles for natural products from several of these systems.
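The idea of letting a search choose its own family size can be illustrated with a minimal sketch. The abstract does not describe the PPP scoring procedure, so everything below is an assumption made for illustration: hits to a query protein are taken best first, and at each depth a binomial tail probability measures how well the genomes seen so far agree with a target profile (the set of genomes that carry the system). The names `binom_tail` and `best_family_cutoff` and the genome labels are hypothetical.

```python
from math import comb


def binom_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def best_family_cutoff(ranked_hit_genomes, profile, n_genomes):
    """Pick the hit-list depth that best matches a phylogenetic profile.

    ranked_hit_genomes: genome of each similarity-search hit, best hit first.
    profile: set of genomes in which the system of interest is present.
    n_genomes: total number of genomes searched.
    Returns (depth, score), where depth counts distinct genomes and a lower
    score means better agreement with the profile.
    ASSUMPTION: the binomial tail score is illustrative, not taken from the
    abstract.
    """
    p = len(profile) / n_genomes  # chance a random genome is in the profile
    seen = set()
    in_profile = 0
    best = (0, 1.0)
    for genome in ranked_hit_genomes:
        if genome in seen:
            continue  # count each genome only once (ignore paralogs)
        seen.add(genome)
        in_profile += genome in profile
        score = binom_tail(in_profile, len(seen), p)
        if score < best[1]:
            best = (len(seen), score)
    return best


# Hypothetical data: 10 genomes, 3 of which carry the system.
depth, score = best_family_cutoff(
    ["g1", "g2", "g3", "g7", "g8"], {"g1", "g2", "g3"}, 10
)
```

In this toy case the score is lowest after the third genome, so the sketch would place the family boundary there and exclude the two weaker hits from genomes outside the profile.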
Figures
Fig. 1.
The SCIFF system multiple sequence alignment and genomic regions. (A) The TIGRFAMs seed alignment TIGR03973 for the SCIFF (_s_ix _c_ysteines _i_n _f_orty-_f_ive residues) protein is shown shaded according to the degree of sequence identity in each column. Sequences more than 80% identical were removed. The six cysteines that are universal or nearly so are indicated with arrows. A run of 10 residues, SCQSACKTSC, is invariant except for two sequences with one conservative substitution each. The first, third, fourth, and fifth cysteines are flanked on one or both sides by amino acids with small side chains (Gly, Ser, or Ala), as is common for posttranslational modifications that cross-link cysteines to other residues during peptide maturation. The species of origin for the sequences shown, in order from top to bottom, are Clostridium perfringens ATCC 13124, Clostridium novyi NT, Thermosinus carboxydivorans Nor1, Desulfotomaculum reducens MI-1, Caldicellulosiruptor saccharolyticus DSM 8903, Clostridium sp. strain L2-50, Faecalibacterium prausnitzii M21/2, Paenibacillus larvae subsp. larvae BRL-230010, Clostridium scindens ATCC 35704, Epulopiscium sp. ‘N.t. morphotype B’, Anaerofustis stercorihominis DSM 17244, “Candidatus Desulforudis audaxviator” MP104C, Natranaerobius thermophilus JW/NM-WN-LF, Eubacterium biforme DSM 3989, Dethiobacter alkaliphilus AHT 1, Anaerococcus lactolyticus ATCC 51172, Acidaminococcus sp. strain D21, Shuttleworthia satelles DSM 14600, Selenomonas flueggei ATCC 43531, Eubacterium saphenum ATCC 49989, Desulfotomaculum acetoxidans DSM 771, Dialister invisus DSM 15470, Ammonifex degensii KC4, Subdoligranulum variabile DSM 15176, Clostridium hathewayi DSM 13479, Thermoanaerobacter italicus Ab9, Ethanoligenens harbinense YUAN-3, Filifactor alocis ATCC 35896, and Carboxydothermus hydrogenoformans Z-2901. 
(B) Genome region figure showing the SCIFF precursor and its maturase (red) appearing in a housekeeping gene context with the queuosine tRNA modification genes queA and tgt (green) and the Sec system subunit genes yajC, secD, and secF (black). In some species, an additional conserved hypothetical protein (c.h.p.) is also present (gray).
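The sequence features named in this caption lend themselves to a simple screen. The sketch below is an illustration only, not the TIGR03973 model: it counts the most cysteines found in any 45-residue window and checks for the exact 10-residue run SCQSACKTSC. Because the caption notes that two sequences carry a conservative substitution in that run, an exact-match test like this one would miss them. The function names and the test peptide are hypothetical.

```python
CORE_MOTIF = "SCQSACKTSC"  # 10-residue run quoted in the Fig. 1 caption


def max_cys_in_window(seq: str, window: int = 45) -> int:
    """Return the largest number of cysteines in any window of the given size."""
    seq = seq.upper()
    if len(seq) <= window:
        return seq.count("C")
    count = seq[:window].count("C")
    best = count
    # Slide the window one residue at a time, updating the running count.
    for i in range(window, len(seq)):
        count += (seq[i] == "C") - (seq[i - window] == "C")
        best = max(best, count)
    return best


def looks_like_sciff(seq: str, min_cys: int = 6, window: int = 45) -> bool:
    """Crude screen: exact core motif plus at least min_cys Cys in one window."""
    return CORE_MOTIF in seq.upper() and max_cys_in_window(seq, window) >= min_cys


# Hypothetical peptide built for illustration; not a real SCIFF sequence.
toy_peptide = "MKTLNAGCGECQTSCQSACKTSCGVANQACE"
is_candidate = looks_like_sciff(toy_peptide)
```

A profile HMM such as the one described in the caption tolerates substitutions and gaps that this string test cannot, so the sketch is best read as a first-pass filter.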
Fig. 2.
Multiple sequence alignment of His-Xaa-Ser proteins. Sequences were aligned by MUSCLE and minimally hand edited at sites from the first His-Xaa-Ser repeat to the C terminus. The three shortest sequences are shown at their full lengths, although others have additional C-terminal sequence not shown. Three sequences, identified by genus names, were not previously identified as protein-coding features. The sequences shown, in order from top to bottom, are from Vibrio parahaemolyticus RIMD 2210633, Stigmatella aurantiaca DW4/3-1, Rhodobacter sp. strain SW2, Blautia hansenii DSM 20583, Phenylobacterium zucineum HLK1, Pseudomonas fluorescens SBW25, Bacteroides sp. strain D2, Victivallis vadensis ATCC BAA-548, Ralstonia eutropha H16, Desulfovibrio vulgaris strain Miyazaki F, “Candidatus Azobacteroides pseudotrichonymphae” genomovar CFP2, Aeromonas salmonicida subsp. salmonicida A449, and Opitutaceae bacterium TAV2. Member sequences occur in close proximity to paired radical SAM enzymes, one each from families TIGR03977 and TIGR03978.
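Since the run of His-Xaa-Ser repeats is described as the only clearly conserved feature of these peptides, a pattern search is a natural first screen. The sketch below is a hedged illustration that assumes the repeats are strictly tandem with no spacer residues; the helper name `find_hxs_runs`, the minimum repeat count, and the example sequence are hypothetical.

```python
import re


def find_hxs_runs(seq: str, min_repeats: int = 2):
    """Return (start_index, repeat_count) for each run of tandem H-x-S triplets.

    ASSUMPTION: repeats are contiguous; real family members may differ.
    """
    pattern = re.compile(r"(?:H[A-Z]S){%d,}" % min_repeats)
    return [
        (m.start(), len(m.group()) // 3)  # each repeat is three residues long
        for m in pattern.finditer(seq.upper())
    ]


# Hypothetical sequence: three tandem repeats, then one isolated triplet.
runs = find_hxs_runs("MKKHASHGSHTSLLHQS")  # [(3, 3)]
```

The isolated HQS triplet near the end is ignored because it falls below the two-repeat minimum.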
Fig. 3.
Multiple sequence alignment and genomic region view of selenobacteriocin precursor peptides. (A) Multiple alignment. The letter U represents UGA (normally a stop) codon at the start of a bacterial selenocysteine insertion element (SECIS) translated as selenocysteine (SeCys), the 21st amino acid. The two alignment columns that contain at least one U are indicated with arrows; all non-SeCys residues in those columns are Cys. Model TIGR04081 describes sequences up to the column immediately past the first selenocysteine-containing column. Sequences, in order from top to bottom, include putative (seleno)bacteriocins from Geobacter sulfurreducens PCA (extended), Geobacter sp. strain M18 (extended), Chlorobium phaeobacteroides BS1, Prosthecochloris aestuarii DSM 271, Desulfococcus oleovorans Hxd3 (no gene shown in GenBank), Desulfomicrobium baculatum DSM 4028 (extended), Desulfohalobium retbaense DSM 5692 (extended), Desulfurivibrio alkaliphilus AHT2, Desulfonatronospira thiodismutans ASO3-1, and Geobacter lovleyi SZ (extended). (B) Corrected genomic region for the GSU_1558/GSU_1559 and GSU_1560 genes from Geobacter sulfurreducens PCA. Diagonal arrows indicate the positions of the two predicted SeCys residues. Underneath the arrow diagram are the identified selenocysteine insertion elements, or SECIS. The SECIS elements begin with UGA codons that are translated as SeCys and are 80% identical through their first 30 bases.
Fig. 4.
A ribosomal peptide natural product cassette in Clostridium botulinum A2 Kyoto. This six-gene cluster for a CLI_3235-type system shows two genes for which models were not constructed at the left and right (black). At the left is a transporter, CLM_3254, with both the ATP-binding and permease domains of ABC transporters. At the right is a peptidase, CLM_3249. The central four genes (red) are a locally cysteine-rich putative RTNP precursor of family TIGR04065, a radical SAM enzyme of family TIGR04068, a conserved hypothetical protein described by family TIGR04066, and an acyl carrier protein homolog described by TIGR04069. The related cassette in Clostridium botulinum F Langeland contains the same genes in the same order (plus one additional gene). Cassettes are flanked by unrelated genes in the genomes of these two C. botulinum strains.
Similar articles
- Biochemical and Spectroscopic Characterization of a Radical S-Adenosyl-L-methionine Enzyme Involved in the Formation of a Peptide Thioether Cross-Link.
Bruender NA, Wilcoxen J, Britt RD, Bandarian V. Biochemistry. 2016 Apr 12;55(14):2122-34. doi: 10.1021/acs.biochem.6b00145. Epub 2016 Apr 1. PMID: 27007615.
- Bioinformatic Mapping of Radical S-Adenosylmethionine-Dependent Ribosomally Synthesized and Post-Translationally Modified Peptides Identifies New Cα, Cβ, and Cγ-Linked Thioether-Containing Peptides.
Hudson GA, Burkhart BJ, DiCaprio AJ, Schwalen CJ, Kille B, Pogorelov TV, Mitchell DA. J Am Chem Soc. 2019 May 22;141(20):8228-8238. doi: 10.1021/jacs.9b01519. Epub 2019 May 13. PMID: 31059252.
- Radical-mediated enzymatic methylation: a tale of two SAMS.
Zhang Q, van der Donk WA, Liu W. Acc Chem Res. 2012 Apr 17;45(4):555-64. doi: 10.1021/ar200202c. Epub 2011 Nov 18. PMID: 22097883. Review.
- Structural features and substrate engagement in peptide-modifying radical SAM enzymes.
Cheek LE, Zhu W. Arch Biochem Biophys. 2024 Jun;756:110012. doi: 10.1016/j.abb.2024.110012. Epub 2024 Apr 23. PMID: 38663796. Review.
Cited by
- Decarboxylation in Natural Products Biosynthesis.
Nguyen NA, Forstater JH, McIntosh JA. JACS Au. 2024 Jul 25;4(8):2715-2745. doi: 10.1021/jacsau.4c00425. eCollection 2024 Aug 26. PMID: 39211618. Review.
- Darobactin Substrate Engineering and Computation Show Radical Stability Governs Ether versus C-C Bond Formation.
Woodard AM, Peccati F, Navo CD, Jiménez-Osés G, Mitchell DA. J Am Chem Soc. 2024 May 22;146(20):14328-14340. doi: 10.1021/jacs.4c03994. Epub 2024 May 10. PMID: 38728535.
- Phylogenomics and genetic analysis of solvent-producing Clostridium species.
Jensen RO, Schulz F, Roux S, Klingeman DM, Mitchell WP, Udwary D, Moraïs S, Reynoso V, Winkler J, Nagaraju S, De Tissera S, Shapiro N, Ivanova N, Reddy TBK, Mizrahi I, Utturkar SM, Bayer EA, Woyke T, Mouncey NJ, Jewett MC, Simpson SD, Köpke M, Jones DT, Brown SD. Sci Data. 2024 May 1;11(1):432. doi: 10.1038/s41597-024-03210-6. PMID: 38693191.
- Genome Mining for New Enzyme Chemistry.
Nguyen DT, Mitchell DA, van der Donk WA. ACS Catal. 2024 Mar 12;14(7):4536-4553. doi: 10.1021/acscatal.3c06322. eCollection 2024 Apr 5. PMID: 38601780. Review.
- Structural, Biochemical, and Bioinformatic Basis for Identifying Radical SAM Cyclopropyl Synthases.
Lien Y, Lachowicz JC, Mendauletova A, Zizola C, Ngendahimana T, Kostenko A, Eaton SS, Latham JA, Grove TL. ACS Chem Biol. 2024 Feb 16;19(2):370-379. doi: 10.1021/acschembio.3c00583. Epub 2024 Jan 31. PMID: 38295270.