Comparative genomics of Bacillus thuringiensis phage 0305phi8-36: defining patterns of descent in a novel ancient phage lineage

Comparative Study

Stephen C Hardies et al. Virol J. 2007.

Abstract

Background: The recently sequenced 218 kb genome of the morphologically atypical Bacillus thuringiensis phage 0305phi8-36 exhibited only limited detectable homology to known bacteriophages. The only known relative of this phage is a string of phage-like genes called BtI1 in the chromosome of B. thuringiensis israelensis. The high degree of divergence and novelty of phage genomes poses challenges for describing a phage from its genomic sequence.

Results: Phage 0305phi8-36 and BtI1 are estimated to have diverged 2.0–2.5 billion years ago. Positionally biased BLAST searches aligned 30 homologous structure or morphogenesis genes between 0305phi8-36 and BtI1 that have maintained the same gene order. Functional clustering of the genes helped identify additional gene functions. A conserved long tape measure gene indicates that a long tail is an evolutionarily stable property of this phage lineage. An unusual form of the tail chaperonin system, split into two genes, was characterized, as was a hyperplastic homologue of the T4 gp27 hub gene. Within the structure-encoding region, some segments were best described as encoding a conservative array of structure domains fused with a variable component of exchangeable domains. Other segments were best described as multigene units engaged in modular horizontal exchange. The non-structural genes of 0305phi8-36 appear to include the remnants of two replicative systems, leading to the hypothesis that the genome plan was created by fusion of two ancestral viruses. The case for a member of the RNAi RNA-directed RNA polymerase family residing in 0305phi8-36 was strengthened by extending the hidden Markov model of this family. Finally, prospective transcriptional promoters were found to be distributed in a gradient of small to large transcripts starting from a fixed end of the genome.

Conclusion: Genomic organization at a level higher than individual gene sequence comparison can be analyzed to aid in understanding large phage genomes. Methods of analysis include 1) applying a time scale, 2) augmenting BLAST scores with positional information, 3) categorizing genomic rearrangements into one of several processes with characteristic rates and outcomes, and 4) correlating apparent transcript sizes with genomic position, gene content, and promoter motifs.
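
As an illustration of method 2, one simple way to combine similarity scores with positional information is collinear chaining: among candidate gene-to-gene hits, keep the highest-scoring subset whose gene order is the same in both genomes. The sketch below is a generic example under stated assumptions, not the procedure used in the paper; the `Hit` tuple, the function name `best_collinear_chain`, and the example scores are hypothetical.

```python
from typing import List, Tuple

# Hypothetical hit record: (query gene index, subject gene index, bit score).
Hit = Tuple[int, int, float]


def best_collinear_chain(hits: List[Hit]) -> List[Hit]:
    """Return the highest-scoring subset of hits that preserves gene order."""
    if not hits:
        return []
    ordered = sorted(hits)  # by query index, then subject index
    n = len(ordered)
    best = [h[2] for h in ordered]  # best chain score ending at each hit
    prev = [-1] * n                 # back-pointer to the previous hit in the chain
    for i in range(n):
        qi, si, score = ordered[i]
        for j in range(i):
            qj, sj, _ = ordered[j]
            # A hit may extend a chain only if it lies further along both genomes.
            if qj < qi and sj < si and best[j] + score > best[i]:
                best[i] = best[j] + score
                prev[i] = j
    end = max(range(n), key=best.__getitem__)
    chain = []
    while end != -1:
        chain.append(ordered[end])
        end = prev[end]
    return chain[::-1]


# Made-up scores: the weak hit (2, 2) is retained because it fits the gene
# order, while the stronger but out-of-order hit (2, 7) is rejected.
example_hits = [(1, 1, 80.0), (2, 2, 25.0), (3, 3, 90.0), (2, 7, 60.0)]
chain = best_collinear_chain(example_hits)
```

In this toy case a marginal match is accepted because of where it sits, which is the general idea behind letting position support an otherwise weak similarity score.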

Figures

Figure 1

Map of the genome of 0305φ8-36 showing distribution of features. The features are from ref. [1]. The scale is in kilobase pairs. Arrows – orfs color coded as: green – encodes virion protein, dark green – encodes high copy virion protein, grey – implied virion protein by sequence analysis only, blue – non-structural, and red – non-structural in terminal repeat. The orf number for every 10th orf is given, with the exception of numbers that are not consecutive, for which each orf is labelled. Purple rectangles – tRNA-like sequences of unclear significance. Abbreviations include: TMP – tape measure protein; thy. kinase – thymidine kinase; mreB – mreB-like rod determination protein; hsdM – HsdM, Type I restriction-modification system methyltransferase subunit; nrd – ribonucleoside reductase; rec. exo – DNA repair exonuclease; UDG – uracil-DNA glycosylase. Italic indicates a tentative assignment. Noncoding regions greater than 40 bp are marked above the orfs in cyan if they contain, or in brown if they do not contain, a promoter candidate of the class described in Figure 6.

Figure 2

Main structure-encoding region of 0305φ8-36 showing similarities to BtI1 and paralogous domains. The figure was modified from Gbrowse output as described in the Methods. Phage 0305φ8-36 orfs are color coded as in Figure 1. BtI1 orfs are color coded as follows: Green – N terminus of a BtI1 gene. Shades of red from bright to pale indicate assignment of homology with increasing reliance on positional information as described in the Methods. The size of a connector dropping below the chain of matches indicates the amount of DNA missing in BtI1 versus 0305φ8-36. A triangle above the chain of matches indicates the amount of DNA in BtI1 in excess over 0305φ8-36. Boundaries of BtI1 frames marked with an asterisk were revised over those indicated in GenBank. Red angle brackets fuse two BtI1 orfs by correcting a frameshift. The left end of the BtI1 chain of glyphs is at the end of a contig. Colored rectangles below the 0305φ8-36 orfs indicate paralogous domains in 0305φ8-36. Open black boxes immediately under 0305φ8-36 orfs or within BtI1 orfs indicate FN3 domains. Closed black boxes indicate domains as follows: Under orf147 – T4 gp27 domain, under orf163 – a C-terminal intimin domain, under orf164 – bacterial von Willebrand's factor domain, within RBTH_07677 – LysM domains. Abbreviations include: Lg. ter. – large terminase; c.f. – putative curly fiber protein gene; pr./scaf. – protease with nested scaffold gene; h.d. – putative head decoration gene; TMP – tape measure protein; hub – homologue of T4 gp27; V – homologue of P2 gpV; J – homologue of P2 gpJ.

Figure 3

Comparison of predicted secondary structure between bacteriophage λ gpG and 0305φ8-36 gp143/gp144.

Figure 4

Homology among the T4 gp27 hub family, the P2 gpD family, and the 0305φ8-36 gp147 family. Domains 1 and 3 refer to folding domains described for the T4 gp27 hub [26]. Sequences within each family were aligned by SAM and converted to logos as indicated in Methods. The logo segments shown are aligned with each other as found by HHSearch [25] without assistance from secondary structure. Secondary structure was annotated after the alignment to act as a second opinion on its quality. Red and blue bars below the T4 logos represent α helices and β strands from the crystal structure. Red and blue bars below the other logos represent secondary structure predictions.

Figure 5

Alignment of 0305φ8-36 orf99 to diagnostic motifs of the RNA-dependent RNA polymerase family Pfam05183. The motifs [34] are represented by segments of the sequence logo obtained from Pfam. The orf99 sequences were aligned to the motifs using SAM.

Figure 6

DNA sequence logo representing 14 candidates for 0305φ8-36-specific promoters. The corresponding 14 noncoding segments are indicated in cyan in Figure 1.

References

    1. Thomas JA, Hardies SC, Rolando M, Hayes S, Lieman K, Carroll CA, Weintraub ST, Serwer P. Complete genomic sequence and mass spectrometric analysis of highly diverse, atypical Bacillus thuringiensis phage 0305phi8-36. Virology. 2007;368:405–421.
    2. Thomas JA, Hardies SC, Serwer P. The complete genomic sequence of Bacillus thuringiensis phage 0305φ8-36 [GenBank:EF583821] 2007.
    3. Serwer P, Hayes SJ, Zaman S, Lieman K, Rolando M, Hardies SC. Improved isolation of undersampled bacteriophages: finding of distant terminase genes. Virology. 2004;329:412–424.
    4. Serwer P, Hayes S, Thomas J, Hardies SC. Propagating the missing bacteriophages: a large bacteriophage in a new class. Virology J. 2007;4:21. doi: 10.1186/1743-422X-4-21.
    5. Serwer P, Hayes SJ, Thomas J, Griess GA, Hardies SC. Rapid determination of genomic DNA length for new bacteriophages. Electrophoresis. 2007;28:1896–1902. doi: 10.1002/elps.200600672.
