Phylogenetic and phyletic studies of informational genes in genomes highlight existence of a 4th domain of life including giant viruses

Mickaël Boyer et al. PLoS One. 2010.

Abstract

The discovery of Mimivirus, with its very large genome content, made it possible to identify genes common to the three domains of life (Eukarya, Bacteria and Archaea) and to generate controversial phylogenomic trees congruent with that of ribosomal genes, branching Mimivirus at its root. Here we used sequences from metagenomic databases, Marseillevirus and three new viruses extending the Mimiviridae family to generate the phylogenetic trees of eight proteins involved in different steps of DNA processing. Compared to the three ribosomally defined domains, we report a single common origin for the DNA processing genes of Nucleocytoplasmic Large DNA Viruses (NCLDVs), rooted between Archaea and Eukarya, with a topology congruent with that of the ribosomal tree. As for translation, we found in our new viruses, together with Mimivirus, five proteins rooted deeply in the eukaryotic clade. In addition, comparison of informational gene repertoires based on phyletic pattern analysis supports the existence of a clade containing NCLDVs that is clearly distinct from those of Eukarya, Bacteria and Archaea. We hypothesize that the core genome of NCLDVs is as ancient as the three currently accepted domains of life.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. NCLDV proteins involved in DNA, RNA and protein biosynthesis.

The frames on the right represent DNA processing steps with the related enzymes found in NCLDVs, and the frames on the left with grey backgrounds show the viruses that have the corresponding enzymes. The first frame from the top right corresponds to dNTP biosynthesis, catalyzed by RNR and ThyA. The second frame shows four enzymes involved in DNA replication: DNAP B, TopoIIA, PCNA and FEN. Proteins involved in these two DNA biosynthesis steps were found in diverse large DNA viruses, such as NCLDVs, Herpesviridae and Myoviridae. The third frame represents the DNA transcription step, in which RNAP II and the transcription factor TFIIB are involved; viral homologs of these proteins are found only in NCLDVs. The last frame shows the Mimiviridae aminoacyl-tRNA synthetases (aa tRNA synthetase) involved in mRNA translation. Interestingly, viral lineages were progressively less well represented at each step from nucleotide to protein biosynthesis, with the exception of NCLDVs, which were represented in every step. a.a., amino acid.

Figure 2. Phylogenetic tree of the RNA polymerase II beta subunit.

The ML tree of RNAP II was inferred from a curated alignment of 80 sequences from the six supergroups of Eukarya (blue), Bacteria (purple), Archaea (green) and NCLDVs (red). The tree is unrooted. Values near branches are SH-like local supports computed by the FastTree program and serve as confidence values for the branches. The scale bar represents the number of estimated changes per position for a unit of branch length.

Figure 3. Phylogenetic tree of the Transcription factor II B (TFIIB).

The TFIIB phylogenetic tree was inferred with a Bayesian approach from a curated alignment of 32 sequences (155 conserved positions) from Eukarya (blue), Archaea (green), NCLDVs (red) and metagenomic databases (black). Bayesian posterior probabilities are shown near branches as percentages and serve as confidence values for the branches. The scale bar represents the number of estimated changes per position for a unit of branch length.

Figure 4. Hierarchical clustering of Eukarya (blue), Bacteria (purple), Archaea (green) and NCLDVs (red) by phyletic pattern.

The phyletic patterns of the putative orthologous sets of informational genes, which indicate the presence or absence of each gene in each cellular organism and virus, were used to construct the dendrogram.
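
To illustrate the idea behind clustering by phyletic pattern, here is a minimal Python sketch. It is not the authors' pipeline: the genome names and presence/absence values are made up, and the Jaccard distance and average linkage are assumptions chosen for illustration, since the caption does not state which distance or linkage method was used.

```python
from itertools import combinations

# Hypothetical phyletic patterns: 1 = gene set present, 0 = absent.
# Names and values are toy data, not taken from the study.
PATTERNS = {
    "genome_a1": (1, 1, 1, 1, 0, 0),
    "genome_a2": (1, 1, 1, 1, 0, 1),
    "genome_b1": (1, 0, 0, 0, 1, 1),
    "genome_b2": (1, 0, 0, 0, 1, 0),
}


def jaccard_distance(a, b):
    """1 - (genes present in both) / (genes present in at least one)."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 1.0 - both / either if either else 0.0


def cluster(patterns):
    """Average-linkage agglomerative clustering; returns a nested-tuple tree."""
    # Each active cluster is a pair: (subtree, list of leaf names).
    active = [(name, [name]) for name in sorted(patterns)]
    while len(active) > 1:
        best = None
        for i, j in combinations(range(len(active)), 2):
            leaves_i, leaves_j = active[i][1], active[j][1]
            # Average pairwise distance between the two clusters' leaves.
            d = sum(
                jaccard_distance(patterns[a], patterns[b])
                for a in leaves_i
                for b in leaves_j
            ) / (len(leaves_i) * len(leaves_j))
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        merged = ((active[i][0], active[j][0]), active[i][1] + active[j][1])
        active = [c for k, c in enumerate(active) if k not in (i, j)] + [merged]
    return active[0][0]


if __name__ == "__main__":
    print(cluster(PATTERNS))
```

Genomes with similar gene repertoires merge first, so the nested tuples play the role of the dendrogram: in the toy data the two `genome_a` entries pair up, as do the two `genome_b` entries, before the two pairs are joined.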

Figure 5. Scenario of NCLDVs emerging from the rhizome of life with roots appearing at the very beginning of life.

The four small pictures represent living species according to the current classification of organisms: eukaryotes (yellow cell), bacteria (green cell), Archaea (blue cell) and viruses (magenta Mimivirus). A classification based on ribosomal proteins discriminates among cellular organisms, but it excludes viruses de facto from historical scenarios of the evolution of life. Our study shows that eight proteins involved in DNA processing, present both in viruses (particularly NCLDVs) and in two or three other domains of life, can be used for phylogenetic tree reconstruction. The schematic topology for each protein is drawn as a colored line: light green for RNR, yellow for ThyA, blue-green for DNAP B, blue for TopoIIA, magenta for PCNA, orange for FEN, red for RNAP II and purple for TFIIB. The topologies were either similar to the ribosome-based one (RNAP II, FEN, PCNA and TFIIB) or showed bifurcations (ThyA, RNR, DNAP B and TopoIIA) that probably reflect lateral gene transfer. Thus, this figure illustrates that ribosome-encoding organisms (REOs) and capsid-encoding organisms (CEOs, bona fide viruses) share a common set of genes involved in DNA processing that evolved from a common ancestral source of genes. Long-branch attraction makes it difficult to identify the exact deep rooting of each analyzed gene, so the rooting is hidden by a black mark.
