Zaremba-Niedzwiedzka, K. et al. Asgard archaea illuminate the origin of eukaryotic cellular complexity. Nature 541, 353–358 (2017).
Eme, L., Spang, A., Lombard, J., Stairs, C. W. & Ettema, T. J. G. Archaea and the origin of eukaryotes. Nat. Rev. Microbiol. 15, 711–723 (2017).
Baker, B. J. et al. Diversity, ecology and evolution of Archaea. Nat. Microbiol. 5, 887–900 (2020).
Spang, A. et al. Proposal of the reverse flow model for the origin of the eukaryotic cell based on comparative analyses of Asgard archaeal metabolism. Nat. Microbiol. 4, 1138–1148 (2019).
Bell, P. J. L. Evidence supporting a viral origin of the eukaryotic nucleus. Virus Res. 289, 198168 (2020).
Forterre, P. & Gaïa, M. Giant viruses and the origin of modern eukaryotes. Curr. Opin. Microbiol. 31, 44–49 (2016).
Malone, L. M. et al. A jumbo phage that forms a nucleus-like structure evades CRISPR-Cas DNA targeting but is vulnerable to type III RNA-based immunity. Nat. Microbiol. 5, 48–55 (2020).
Iyer, L. M., Aravind, L. & Koonin, E. V. Common origin of four diverse families of large eukaryotic DNA viruses. J. Virol. 75, 11720–11734 (2001).
Krupovic, M., Dolja, V. V. & Koonin, E. V. The LUCA and its complex virome. Nat. Rev. Microbiol. 18, 661–670 (2020).
Makarova, K. S. et al. Evolutionary classification of CRISPR-Cas systems: a burst of class 2 and derived variants. Nat. Rev. Microbiol. 18, 67–83 (2020).
Dombrowski, N., Teske, A. P. & Baker, B. J. Expansive microbial metabolic versatility and biodiversity in dynamic Guaymas Basin hydrothermal sediments. Nat. Commun. 9, 4999 (2018).
Castelle, C. J. et al. Protein family content uncovers lineage relationships and bacterial pathway maintenance mechanisms in DPANN Archaea. Front. Microbiol. 12, 660052 (2021).
Langwig, M. V. et al. Large-scale protein level comparison of Deltaproteobacteria reveals cohesive metabolic groups. ISME J. https://doi.org/10.1038/s41396-021-01057-y (2021).
Kieft, K., Zhou, Z. & Anantharaman, K. VIBRANT: automated recovery, annotation and curation of microbial viruses, and evaluation of viral community function from genomic sequences. Microbiome 8, 90 (2020).
Prangishvili, D. et al. The enigmatic archaeal virosphere. Nat. Rev. Microbiol. 15, 724–739 (2017).
Nayfach, S. et al. CheckV assesses the quality and completeness of metagenome-assembled viral genomes. Nat. Biotechnol. 39, 578–585 (2021).
Kazlauskas, D., Krupovic, M. & Venclovas, Č. The logic of DNA replication in double-stranded DNA viruses: insights from global analysis of viral genomes. Nucleic Acids Res. 44, 4551–4564 (2016).
Pons, J. C. et al. VPF-Class: Taxonomic assignment and host prediction of uncultivated viruses based on viral protein families. Bioinformatics https://doi.org/10.1093/bioinformatics/btab026 (2021).
Krupovic, M., Cvirkaite-Krupovic, V., Iranzo, J., Prangishvili, D. & Koonin, E. V. Viruses of archaea: structural, functional, environmental and evolutionary genomics. Virus Res. 244, 181–193 (2018).
Yutin, N., Wolf, Y. I., Raoult, D. & Koonin, E. V. Eukaryotic large nucleo-cytoplasmic DNA viruses: clusters of orthologous genes and reconstruction of viral genome evolution. Virol. J. 6, 223 (2009).
Koonin, E. V. & Dolja, V. V. Virus world as an evolutionary network of viruses and capsidless selfish elements. Microbiol. Mol. Biol. Rev. 78, 278–303 (2014).
Iranzo, J., Koonin, E. V., Prangishvili, D., Krupovic, M. & Sandri-Goldin, R. M. Bipartite network analysis of the archaeal virosphere: evolutionary connections between viruses and capsidless mobile elements. J. Virol. 90, 11043–11055 (2016).
Kala, S. et al. HNH proteins are a widespread component of phage DNA packaging machines. Proc. Natl Acad. Sci. USA 111, 6022–6027 (2014).
Guilliam, T. A., Keen, B. A., Brissett, N. C. & Doherty, A. J. Primase-polymerases are a functionally diverse superfamily of replication and repair enzymes. Nucleic Acids Res. 43, 6651–6664 (2015).
Gupta, A., Lad, S. B., Ghodke, P. P., Pradeepkumar, P. I. & Kondabagil, K. Mimivirus encodes a multifunctional primase with DNA/RNA polymerase, terminal transferase and translesion synthesis activities. Nucleic Acids Res. 47, 6932–6945 (2019).
Mazzon, C. et al. Cytosolic and mitochondrial deoxyribonucleotidases: activity with substrate analogs, inhibitors and implications for therapy. Biochem. Pharmacol. 66, 471–479 (2003).
Colson, P., La Scola, B., Levasseur, A., Caetano-Anollés, G. & Raoult, D. Mimivirus: leading the way in the discovery of giant viruses of amoebae. Nat. Rev. Microbiol. 15, 243–254 (2017).
Doherty, A. J., Serpell, L. C. & Ponting, C. P. The helix-hairpin-helix DNA-binding motif: a structural basis for non-sequence-specific recognition of DNA. Nucleic Acids Res. 24, 2488–2497 (1996).
Iyer, L. M., Balaji, S., Koonin, E. V. & Aravind, L. Evolutionary genomics of nucleo-cytoplasmic large DNA viruses. Virus Res. 117, 156–184 (2006).
Sim, S., Hughes, K., Chen, X. & Wolin, S. L. The bacterial Ro60 protein and its noncoding Y RNA regulators. Annu. Rev. Microbiol. 74, 387–407 (2020).
Ho, C. K., Wang, L. K., Lima, C. D. & Shuman, S. Structure and mechanism of RNA ligase. Structure 12, 327–339 (2004).
Tang, Q., Wu, P., Chen, H. & Li, G. Pleiotropic roles of the ubiquitin-proteasome system during viral propagation. Life Sci. 207, 350–354 (2018).
Murphy, J., Mahony, J., Ainsworth, S., Nauta, A. & van Sinderen, D. Bacteriophage orphan DNA methyltransferases: insights from their bacterial origin, function, and occurrence. Appl. Environ. Microbiol. 79, 7547–7555 (2013).
Jeudy, S. et al. Exploration of the propagation of transpovirons within Mimiviridae reveals a unique example of commensalism in the viral world. ISME J. 14, 727–739 (2020).
Agarkova, I. V., Dunigan, D. D. & Van Etten, J. L. Virion-associated restriction endonucleases of chloroviruses. J. Virol. 80, 8114–8123 (2006).
Markine-Goriaynoff, N. et al. Glycosyltransferases encoded by viruses. J. Gen. Virol. 85, 2741–2754 (2004).
Piacente, F., Gaglianone, M., Laugieri, M. E. & Tonetti, M. G. The autonomous glycosylation of large DNA viruses. Int. J. Mol. Sci. 16, 29315–29328 (2015).
Hagelueken, G. et al. A coiled-coil domain acts as a molecular ruler to regulate O-antigen chain length in lipopolysaccharide. Nat. Struct. Mol. Biol. 22, 50–56 (2014).
Joshi, N. A. & Fass, J. N. Sickle: a sliding-window, adaptive, quality-based trimming tool for FastQ files (Version 1.33) [Software] (2011). https://github.com/najoshi/sickle
Peng, Y., Leung, H. C. M., Yiu, S. M. & Chin, F. Y. L. IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth. Bioinformatics 28, 1420–1428 (2012).
Parks, D. H., Imelfort, M., Skennerton, C. T., Hugenholtz, P. & Tyson, G. W. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 25, 1043–1055 (2015).
Alneberg, J. et al. Binning metagenomic contigs by coverage and composition. Nat. Methods 11, 1144–1146 (2014).
Kang, D. D. et al. MetaBAT 2: an adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies. PeerJ 7, e7359 (2019).
Sieber, C. M. K. et al. Recovery of genomes from metagenomes via a dereplication, aggregation and scoring strategy. Nat. Microbiol. 3, 836–843 (2018).
Chen, I.-M. A. et al. IMG/M v.5.0: an integrated data management and comparative analysis system for microbial genomes and microbiomes. Nucleic Acids Res. 47, D666–D677 (2019).
Biswas, A., Staals, R. H. J., Morales, S. E., Fineran, P. C. & Brown, C. M. CRISPRDetect: a flexible algorithm to define CRISPR arrays. BMC Genomics 17, 356 (2016).
Fu, L., Niu, B., Zhu, Z., Wu, S. & Li, W. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics 28, 3150–3152 (2012).
Bland, C. et al. CRISPR recognition tool (CRT): a tool for automatic detection of clustered regularly interspaced palindromic repeats. BMC Bioinformatics 8, 209 (2007).
Padilha, V. A., Alkhnbashi, O. S., Shah, S. A., de Carvalho, A. C. P. L. F. & Backofen, R. CRISPRcasIdentifier: machine learning for accurate identification and classification of CRISPR-Cas systems. Gigascience 9, giaa062 (2020).
Koonin, E. V., Makarova, K. S. & Zhang, F. Diversity, classification and evolution of CRISPR-Cas systems. Curr. Opin. Microbiol. 37, 67–78 (2017).
Hyatt, D. et al. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11, 119 (2010).
Aramaki, T. et al. KofamKOALA: KEGG Ortholog assignment based on profile HMM and adaptive score threshold. Bioinformatics 36, 2251–2252 (2019).
El-Gebali, S. et al. The Pfam protein families database in 2019. Nucleic Acids Res. 47, D427–D432 (2019).
Grazziotin, A. L., Koonin, E. V. & Kristensen, D. M. Prokaryotic virus orthologous groups (pVOGs): a resource for comparative genomics and protein family annotation. Nucleic Acids Res. 45, D491–D498 (2017).
Guo, J. et al. VirSorter2: a multi-classifier, expert-guided approach to detect diverse DNA and RNA viruses. Microbiome 9, 37 (2021).
Cantu, V. A. et al. PhANNs, a fast and accurate tool and web server to classify phage structural proteins. PLoS Comput. Biol. 16, e1007845 (2020).
Zimmermann, L. et al. A completely reimplemented MPI bioinformatics toolkit with a new HHpred server at its core. J. Mol. Biol. 430, 2237–2243 (2018).
Grant, J. R. & Stothard, P. The CGView server: a comparative genomics tool for circular genomes. Nucleic Acids Res. 36, W181–W184 (2008).
Bin Jang, H. et al. Taxonomic assignment of uncultivated prokaryotic virus genomes is enabled by gene-sharing networks. Nat. Biotechnol. 37, 632–639 (2019).
Nepusz, T., Yu, H. & Paccanaro, A. Detecting overlapping protein complexes in protein-protein interaction networks. Nat. Methods 9, 471–472 (2012).
Enright, A. J., Van Dongen, S. & Ouzounis, C. A. An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res. 30, 1575–1584 (2002).
Sayers, E. W. et al. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 37, D5–D15 (2009).
Shannon, P. et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498–2504 (2003).
RStudio: Integrated Development Environment for R (RStudio Team, 2019).
R: A Language and Environment for Statistical Computing (R Core Team, 2020).
Rudis, B. & Gandy, D. waffle: create waffle chart visualizations in R (2016).
Yutin, N., Wolf, Y. I. & Koonin, E. V. Origin of giant viruses from smaller DNA viruses not from a fourth domain of cellular life. Virology 466–467, 38–52 (2014).
Paez-Espino, D. et al. IMG/VR: a database of cultured and uncultured DNA viruses and retroviruses. Nucleic Acids Res. 45, D457–D465 (2017).
Wu, F. et al. Unique mobile elements and scalable gene flow at the prokaryote–eukaryote boundary revealed by circularized Asgard archaea genomes. Nat. Microbiol. 7, 200–212 (2022).
Andersson, A. F. & Banfield, J. F. Virus population dynamics and acquired virus resistance in natural microbial communities. Science 320, 1047–1050 (2008).
De Anda, V. et al. Understanding the mechanisms behind the response to environmental perturbation in microbial mats: a metagenomic-network based approach. Front. Microbiol. 9, 2606 (2018).
Zhang, R. et al. SpacePHARER: sensitive identification of phages from CRISPR spacers in prokaryotic hosts. Bioinformatics 37, 3364–3366 (2021).
Guglielmini, J., Woo, A. C., Krupovic, M., Forterre, P. & Gaia, M. Diversification of giant and large eukaryotic dsDNA viruses predated the origin of modern eukaryotes. Proc. Natl Acad. Sci. USA 116, 19585–19592 (2019).
Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).
Minh, B. Q. et al. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol. 37, 1530–1534 (2020).
Kalyaanamoorthy, S., Minh, B. Q., Wong, T. K. F., von Haeseler, A. & Jermiin, L. S. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat. Methods 14, 587–589 (2017).
Letunic, I. & Bork, P. Interactive Tree Of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 49, W293–W296 (2021).