Origin of giant viruses from smaller DNA viruses not from a fourth domain of cellular life

Natalya Yutin et al. Virology. 2014 Oct.

Abstract

The numerous and diverse eukaryotic viruses with large double-stranded DNA genomes that at least partially reproduce in the cytoplasm of infected cells apparently evolved from a single virus ancestor. This major group of viruses is known as Nucleocytoplasmic Large DNA Viruses (NCLDV) or the proposed order Megavirales. Among the "Megavirales", there are three groups of giant viruses with genomes exceeding 500 kb, namely Mimiviruses, Pithoviruses, and Pandoraviruses, the last of which hold the current record of viral genome size, about 2.5 Mb. Phylogenetic analysis of conserved, ancestral NCLDV genes clearly shows that these three groups of giant viruses have three distinct origins within the "Megavirales". The Mimiviruses constitute a distinct family that is distantly related to Phycodnaviridae, Pandoraviruses originate from a common ancestor with Coccolithoviruses within the Phycodnaviridae family, and Pithoviruses are related to Iridoviridae and Marseilleviridae. Maximum likelihood reconstruction of gene gain and loss events during the evolution of the "Megavirales" indicates that each group of giant viruses evolved from viruses with substantially smaller and simpler gene repertoires. Initial phylogenetic analysis of universal genes, such as translation system components, encoded by some giant viruses, in particular Mimiviruses, has led to the hypothesis that giant viruses descend from a fourth, probably extinct domain of cellular life. The results of our comprehensive phylogenomic analysis of giant viruses refute the fourth domain hypothesis and instead indicate that the universal genes have been independently acquired by different giant viruses from their eukaryotic hosts.

Keywords: Domains of cellular life; Genome evolution; Giant viruses; Phylogenomics.

Published by Elsevier Inc.

Figures

Figure 1. Phylogenies of the large subunits of DNA-dependent RNA polymerase

(a) Subunit a. (b) Subunit b. Support values represent expected-likelihood weights of 1,000 local rearrangements; branches with support less than 50 were collapsed. “Megavirales” sequences are highlighted in orange, eukaryotic sequences in blue, archaeal sequences in purple. OLPG: Organic Lake phycodnavirus – Phaeocystis globosa virus clade.

Figure 2. Phylogenies of aminoacyl-tRNA synthetases encoded by giant viruses

(a) Tyrosyl-tRNA synthetase (b) Arginyl-tRNA synthetase (c) Aspartyl/asparaginyl-tRNA synthetases (d) Cysteinyl-tRNA synthetase (e) Isoleucyl-tRNA synthetase (f) Methionyl-tRNA synthetase (g) Tryptophanyl-tRNA synthetase Support values represent expected-likelihood weights of 1,000 local rearrangements; branches with support less than 50 were collapsed. “Megavirales” sequences are highlighted in orange, eukaryotic sequences in blue, bacterial sequences in green, archaeal sequences in purple. Taxa abbreviations: Ac, Crenarchaeota; Ae, Euryarchaeota; Az, unclassified Archaea; Ba, Actinobacteria; Bb, Bacteroidetes/Chlorobi group; Bc, Cyanobacteria; Bi, Acidobacteria; Bp, Proteobacteria; Bs, Spirochaetes; Bv, Chlamydiae/Verrucomicrobia group; E2, Fornicata; E7, Rhodophyta; E8, stramenopiles; E9, Viridiplantae; Ea, Amoebozoa; Ec, Alveolata; Eh, Cryptophyta; Ek, Euglenozoa; El, Opisthokonta; Eq, Heterolobosea; Ew, Parabasalidea.

Figure 3. Phylogenies of translation factors encoded in “Megavirales” genomes

(a) elongation factor EF-1alpha (b) initiation factor eIF-1 (SUI1) (c) initiation factor eIF-2beta (d) initiation factor eIF-4A (e) initiation factor SUA5 (f) peptide chain release factor eRF1 Support values represent expected-likelihood weights of 1,000 local rearrangements; branches with support less than 50 were collapsed. “Megavirales” sequences are highlighted in orange, eukaryotic sequences in blue, archaeal sequences in purple. Taxa abbreviations: Ae, Euryarchaeota; Ba, Actinobacteria; Bf, Firmicutes; Bj, Tenericutes; Bp, Proteobacteria; E2, Fornicata; E7, Rhodophyta; E9, Viridiplantae; Ea, Amoebozoa; Ec, Alveolata; Eh, Cryptophyta; Ek, Euglenozoa; El, Opisthokonta; Ep, Haptophyceae; Eq, Heterolobosea; Ew, Parabasalidea.

Figure 4. Phylogenomic breakdown of giant virus genes

Figure 5. Relationships between the gene contents of giant viruses and their smaller relatives

(a) Matrix of shared genes. Lower left: number of shared gene families. Upper right: Jaccard similarity of gene complements. Diagonal: number of annotated genes in the genome. Intra-family comparisons are shaded. (b) Tree of gene contents. Bold lines indicate branches with high (>70%) bootstrap support; thin lines indicate branches with low bootstrap support. Branches that disagree with the tree reconstructed from the universal core genes are highlighted in red (except the poorly resolved branches inside the Poxviridae); dashed lines indicate the relationships expected from the core phylogeny.
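The layout of the shared-gene matrix in panel (a) can be illustrated with a minimal sketch. It assumes each genome is represented simply as a set of gene-family identifiers; the genome names, family labels, and the function names `jaccard` and `shared_gene_matrix` are hypothetical and are not taken from the paper. The diagonal here holds the number of families in the set, which only approximates the "number of annotated genes" reported in the figure.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B|; taken as 0.0 if both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def shared_gene_matrix(genomes: dict) -> tuple:
    """Build a square matrix laid out like the figure panel.

    Lower left: number of shared gene families.
    Upper right: Jaccard similarity of the two gene complements.
    Diagonal: size of each genome's family set (an assumption standing in
    for the number of annotated genes).
    """
    names = list(genomes)
    n = len(names)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        matrix[i][i] = len(genomes[names[i]])
    for i, j in combinations(range(n), 2):  # i < j
        a, b = genomes[names[i]], genomes[names[j]]
        matrix[j][i] = len(a & b)      # lower-left triangle
        matrix[i][j] = jaccard(a, b)   # upper-right triangle
    return names, matrix


# Toy input: hypothetical genomes and gene families, for illustration only.
toy = {
    "virus_A": {"fam1", "fam2", "fam3", "fam4"},
    "virus_B": {"fam2", "fam3", "fam5"},
    "virus_C": {"fam6"},
}
names, matrix = shared_gene_matrix(toy)
for name, row in zip(names, matrix):
    print(name, row)
```

In this toy input, virus_A and virus_B share two of five distinct families, so the corresponding Jaccard similarity is 2/5.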

Figure 6. Phylogenetic tree of the (nearly) universal core genes of the “Megavirales” and reconstruction of gene gain and loss

Numbers above the branch indicate the estimated number of NCVOG families (plus the number of singletons for extant genomes) at the end of the branch. Numbers below the branch indicate the estimated number of gained and lost NCVOG families (plus the number of acquired singletons for extant genomes). Dashed lines extend the branches. The estimated number of “Megavirales” ancestral families is indicated in a circle at the tree root.
