Expanded diversity of Asgard archaea and their relationships with eukaryotes

Data availability

Asgard archaea genomes generated in this study have been deposited in the eLibrary of Microbial Systematics and Genomics (https://www.biosino.org/elmsg/index) and are also available from the NCBI under BioProject identifier PRJNA680430. Publicly available genomes were retrieved from NCBI GenBank, MG-RAST and the figshare repository. The accession numbers of the newly generated and the public genomes are available in Supplementary Table 1. Supplementary data file 1 comprises the complete Asgard COG data archive (supplementary_data_file_1.tgz), and supplementary data file 2 contains the phylogenetic trees and alignments archive (supplementary_data_file_2.tgz); these files are available without restriction from https://doi.org/10.5281/zenodo.4624280 or https://ftp.ncbi.nih.gov/pub/wolf/_suppl/asgard20/. Any other relevant data are available from the corresponding authors upon reasonable request.

References

  1. Spang, A. et al. Complex archaea that bridge the gap between prokaryotes and eukaryotes. Nature 521, 173–179 (2015).
  2. Zaremba-Niedzwiedzka, K. et al. Asgard archaea illuminate the origin of eukaryotic cellular complexity. Nature 541, 353–358 (2017).
  3. MacLeod, F., Kindler, G. S., Wong, H. L., Chen, R. & Burns, B. P. Asgard archaea: diversity, function, and evolutionary implications in a range of microbiomes. AIMS Microbiol. 5, 48–61 (2019).
  4. Cai, M. et al. Diverse Asgard archaea including the novel phylum Gerdarchaeota participate in organic matter degradation. Sci. China Life Sci. 63, 886–897 (2020).
  5. Williams, T. A., Cox, C. J., Foster, P. G., Szöllősi, G. J. & Embley, T. M. Phylogenomics provides robust support for a two-domains tree of life. Nat. Ecol. Evol. 4, 138–147 (2020).
  6. Williams, T. A., Foster, P. G., Cox, C. J. & Embley, T. M. An archaeal origin of eukaryotes supports only two primary domains of life. Nature 504, 231–236 (2013).
  7. Cox, C. J., Foster, P. G., Hirt, R. P., Harris, S. R. & Embley, T. M. The archaebacterial origin of eukaryotes. Proc. Natl Acad. Sci. USA 105, 20356–20361 (2008).
  8. Yutin, N., Makarova, K. S., Mekhedov, S. L., Wolf, Y. I. & Koonin, E. V. The deep archaeal roots of eukaryotes. Mol. Biol. Evol. 25, 1619–1630 (2008).
  9. Da Cunha, V., Gaia, M., Gadelle, D., Nasir, A. & Forterre, P. Lokiarchaea are close relatives of Euryarchaeota, not bridging the gap between prokaryotes and eukaryotes. PLoS Genet. 13, e1006810 (2017).
  10. Da Cunha, V., Gaia, M., Nasir, A. & Forterre, P. Asgard archaea do not close the debate about the universal tree of life topology. PLoS Genet. 14, e1007215 (2018).
  11. Spang, A. et al. Asgard archaea are the closest prokaryotic relatives of eukaryotes. PLoS Genet. 14, e1007080 (2018).
  12. Forterre, P. The universal tree of life: an update. Front. Microbiol. 6, 717 (2015).
  13. Lombard, J., López-García, P. & Moreira, D. The early evolution of lipid membranes and the three domains of life. Nat. Rev. Microbiol. 10, 507–515 (2012).
  14. Akıl, C. & Robinson, R. C. Genomes of Asgard archaea encode profilins that regulate actin. Nature 562, 439–443 (2018).
  15. Akıl, C. et al. Insights into the evolution of regulated actin dynamics via characterization of primitive gelsolin/cofilin proteins from Asgard archaea. Proc. Natl Acad. Sci. USA 117, 19904–19913 (2020).
  16. Lu, Z. et al. Coevolution of eukaryote-like Vps4 and ESCRT-III subunits in the Asgard archaea. mBio 11, e00417-20 (2020).
  17. Imachi, H. et al. Isolation of an archaeon at the prokaryote–eukaryote interface. Nature 577, 519–525 (2020).
  18. Zhang, J.-W. et al. Newly discovered Asgard archaea Hermodarchaeota potentially degrade alkanes and aromatics via alkyl/benzyl-succinate synthase and benzoyl-CoA pathway. ISME J. https://doi.org/10.1038/s41396-020-00890-x (2021).
  19. Ciccarelli, F. D. et al. Toward automatic reconstruction of a highly resolved tree of life. Science 311, 1283–1287 (2006).
  20. Rinke, C. et al. Insights into the phylogeny and coding potential of microbial dark matter. Nature 499, 431–437 (2013).
  21. Nayfach, S. et al. A genomic catalog of Earth’s microbiomes. Nat. Biotechnol. 39, 499–509 (2021).
  22. Balasingam, N., Brandon, H. E., Ross, J. A., Wieden, H.-J. & Thakor, N. Cellular roles of the human Obg-like ATPase 1 (hOLA1) and its YchF homologs. Biochem. Cell Biol. 98, 1–11 (2020).
  23. Rinke, C. et al. Resolving widespread incomplete and uneven archaeal classifications based on a rank-normalized genome-based taxonomy. Preprint at https://doi.org/10.1101/2020.03.01.972265 (2020).
  24. Eme, L., Spang, A., Lombard, J., Stairs, C. W. & Ettema, T. J. G. Archaea and the origin of eukaryotes. Nat. Rev. Microbiol. 15, 711–723 (2017).
  25. Klinger, C. M., Spang, A., Dacks, J. B. & Ettema, T. J. G. Tracing the archaeal origins of eukaryotic membrane-trafficking system building blocks. Mol. Biol. Evol. 33, 1528–1541 (2016).
  26. Puigbò, P., Lobkovsky, A. E., Kristensen, D. M., Wolf, Y. I. & Koonin, E. V. Genomes in turmoil: quantification of genome dynamics in prokaryote supergenomes. BMC Biol. 12, 66 (2014).
  27. Christ, L., Raiborg, C., Wenzel, E. M., Campsteijn, C. & Stenmark, H. Cellular functions and molecular mechanisms of the ESCRT membrane-scission machinery. Trends Biochem. Sci. 42, 42–56 (2017).
  28. Su, M.-Y., Fromm, S. A., Zoncu, R. & Hurley, J. H. Structure of the C9orf72 ARF GAP complex that is haploinsufficient in ALS and FTD. Nature 585, 251–255 (2020).
  29. de Martín Garrido, N. & Aylett, C. H. S. Nutrient signaling and lysosome positioning crosstalk through a multifunctional protein, folliculin. Front. Cell Dev. Biol. 8, 108 (2020).
  30. Shen, K. et al. Architecture of the human GATOR1 and GATOR1–Rag GTPases complexes. Nature 556, 64–69 (2018).
  31. López-García, P. & Moreira, D. The syntrophy hypothesis for the origin of eukaryotes revisited. Nat. Microbiol. 5, 655–667 (2020).
  32. Martin, W. & Müller, M. The hydrogen hypothesis for the first eukaryote. Nature 392, 37–41 (1998).
  33. Moreira, D. & López-García, P. Symbiosis between methanogenic archaea and δ-proteobacteria as the origin of eukaryotes: the syntrophic hypothesis. J. Mol. Evol. 47, 517–530 (1998).
  34. López-García, P. & Moreira, D. Cultured Asgard archaea shed light on eukaryogenesis. Cell 181, 232–235 (2020).
  35. Spang, A. et al. Proposal of the reverse flow model for the origin of the eukaryotic cell based on comparative analyses of Asgard archaeal metabolism. Nat. Microbiol. 4, 1138–1148 (2019).
  36. Koonin, E. V. & Yutin, N. The dispersed archaeal eukaryome and the complex archaeal ancestor of eukaryotes. Cold Spring Harb. Perspect. Biol. 6, a016188 (2014).
  37. Liu, Y. et al. Comparative genomic inference suggests mixotrophic lifestyle for Thorarchaeota. ISME J. 12, 1021–1031 (2018).
  38. Cheng, L. et al. Isolation and characterization of Methanoculleus receptaculi sp. nov. from Shengli oil field, China. FEMS Microbiol. Lett. 285, 65–71 (2008).
  39. Peng, J., Lü, Z., Rui, J. & Lu, Y. Dynamics of the methanogenic archaeal community during plant residue decomposition in an anoxic rice field soil. Appl. Environ. Microbiol. 74, 2894–2901 (2008).
  40. Uritskiy, G. V., DiRuggiero, J. & Taylor, J. MetaWRAP–a flexible pipeline for genome-resolved metagenomic data analysis. Microbiome 6, 158 (2018).
  41. Li, D., Liu, C. M., Luo, R., Sadakane, K. & Lam, T. W. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics 31, 1674–1676 (2015).
  42. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
  43. Kang, D. D. et al. MetaBAT 2: an adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies. PeerJ 7, e7359 (2019).
  44. Sieber, C. M. K. et al. Recovery of genomes from metagenomes via a dereplication, aggregation and scoring strategy. Nat. Microbiol. 3, 836–843 (2018).
  45. Peng, Y., Leung, H. C. M., Yiu, S. M. & Chin, F. Y. L. IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth. Bioinformatics 28, 1420–1428 (2012).
  46. Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).
  47. Nurk, S., Meleshko, D., Korobeynikov, A. & Pevzner, P. A. metaSPAdes: a new versatile metagenomic assembler. Genome Res. 27, 824–834 (2017).
  48. Parks, D. H., Imelfort, M., Skennerton, C. T., Hugenholtz, P. & Tyson, G. W. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 25, 1043–1055 (2015).
  49. Hyatt, D. et al. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11, 119 (2010).
  50. Seemann, T. Prokka: rapid prokaryotic genome annotation. Bioinformatics 30, 2068–2069 (2014).
  51. Chan, P. P. & Lowe, T. M. tRNAscan-SE: searching for tRNA genes in genomic sequences. Methods Mol. Biol. 1962, 1–14 (2019).
  52. Schäffer, A. A. et al. Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. Nucleic Acids Res. 29, 2994–3005 (2001).
  53. Steinegger, M. & Söding, J. MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nat. Biotechnol. 35, 1026–1028 (2017).
  54. Edgar, R. C. MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics 5, 113 (2004).
  55. Söding, J. Protein homology detection by HMM–HMM comparison. Bioinformatics 21, 951–960 (2005).
  56. Price, M. N., Dehal, P. S. & Arkin, A. P. FastTree 2—approximately maximum-likelihood trees for large alignments. PLoS ONE 5, e9490 (2010).
  57. Rice, P., Longden, I. & Bleasby, A. EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet. 16, 276–277 (2000).
  58. Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).
  59. Nguyen, L.-T., Schmidt, H. A., von Haeseler, A. & Minh, B. Q. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 32, 268–274 (2015).
  60. Larsson, A. AliView: a fast and lightweight alignment viewer and editor for large datasets. Bioinformatics 30, 3276–3278 (2014).
  61. Esterman, E. S., Wolf, Y. I., Kogay, R., Koonin, E. V. & Zhaxybayeva, O. Evolution of DNA packaging in gene transfer agents. Virus Evol. 7, veab015 (2021).
  62. Puigbò, P., Wolf, Y. I. & Koonin, E. V. Search for a ‘Tree of Life’ in the thicket of the phylogenetic forest. J. Biol. 8, 59 (2009).
  63. R Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2019).
  64. Zimmermann, L. et al. A completely reimplemented MPI bioinformatics toolkit with a new HHpred server at its core. J. Mol. Biol. 430, 2237–2243 (2018).
  65. Yang, M., Derbyshire, M. K., Yamashita, R. A. & Marchler-Bauer, A. NCBI’s conserved domain database and tools for protein domain analysis. Curr. Protoc. Bioinformatics 69, e90 (2020).
  66. Søndergaard, D., Pedersen, C. N. S. & Greening, C. HydDB: a web tool for hydrogenase classification and analysis. Sci. Rep. 6, 34212 (2016).
  67. Criscuolo, A. & Gribaldo, S. BMGE (block mapping and gathering with entropy): a new software for selection of phylogenetic informative regions from multiple sequence alignments. BMC Evol. Biol. 10, 210 (2010).
  68. Swofford, D. L. & Maddison, W. P. Reconstructing ancestral character states under Wagner parsimony. Math. Biosci. 87, 199–229 (1987).

Acknowledgements

We thank P. Forterre and T. Williams for critical reading of the manuscript and helpful suggestions; J. Chen, H. Li, P. Du and D. Zou for support with sampling and preliminary analysis of Changjiang estuary sediments; Z. Zhou and J.-D. Gu for support with sampling and preliminary analysis of Mai Po Nature Reserve sediments; S. Zheng and F. Liu for support with sampling and preliminary analysis of Rongcheng Swan Lake Nature Reserve sediments; and the crew and scientific team of RV Xiangyanghong 09, the pilots and the supporting team of the Jiaolong manned submersible in the 37th Dayang Cruise for the sampling. M.L., Y.L., X.Z., W.X., Z.L. and L.C. are supported by the National Natural Science Foundation of China (grant nos. 91851105, 31970105, 92051102, 31700430, 91951102, 41776170 and 92051108), the Innovation Team Project of Universities in Guangdong Province (no. 2020KCXTD023), the Shenzhen Science and Technology Program (grant nos. JCYJ20200109105010363 and JCYJ20190808152403587), the Scientific Research Foundation of Third Institute of Oceanography, MNR (2019022), the China Ocean Mineral Resources R&D Association (COMRA) Program (DY135-B2-09) and the National Key Basic Research Program of China (‘973’-Program, 2015CB755903). K.S.M., Y.I.W., A.N.N. and E.V.K. are supported by the Intramural Research Program of the National Institutes of Health of the USA (National Library of Medicine).

Author information

Author notes

  1. These authors contributed equally: Yang Liu, Kira S. Makarova, Wen-Cong Huang

Authors and Affiliations

  1. Shenzhen Key Laboratory of Marine Microbiome Engineering, Institute for Advanced Study, Shenzhen University, Shenzhen, P. R. China
    Yang Liu, Wen-Cong Huang, Xinxu Zhang, Mingwei Cai, Cui-Jing Zhang & Meng Li
  2. National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
    Kira S. Makarova, Yuri I. Wolf, Anastasia N. Nikolskaya & Eugene V. Koonin
  3. Key Laboratory of Marine Biogenetic Resources, Third Institute of Oceanography, Ministry of Natural Resources, Xiamen, P. R. China
    Wei Xu & Zhuhua Luo
  4. Key Laboratory of Development and Application of Rural Renewable Energy, Biogas Institute of Ministry of Agriculture, Chengdu, P. R. China
    Lei Cheng

Contributions

M.L., E.V.K., K.S.M. and Y.L. initiated the study; Y.L., W.-C.H., M.C., C.-J.Z., W.X., Z.L. and L.C. participated in sample collections; Y.L., X.Z., M.C., C.-J.Z., W.X., Z.L. and L.C. performed metagenomic assembly and binning analysis. Y.L. performed metabolism analysis; K.S.M., A.N.N. and Y.I.W. performed comparative genomic analysis; Y.L., K.S.M., Y.I.W. and W.-C.H. performed phylogenetic analysis; K.S.M. and Y.I.W. constructed Asgard COGs; K.S.M., Y.I.W., Y.L., M.L. and E.V.K. analysed the data; Y.L., K.S.M., W.-C.H., X.Z., M.C., C.-J.Z., W.X., Z.L., L.C., E.V.K. and M.L. wrote the manuscript that was read, edited and approved by all authors.

Corresponding authors

Correspondence to Eugene V. Koonin or Meng Li.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information: Nature thanks the anonymous reviewer(s) for their contribution to the peer review of this work.

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Global distribution of the Asgard genomes analysed in this Article.

The world map was generated using the R package rnaturalearth v.0.1.0 in R v.3.6.3 (ref. 63). The pie chart shows the proportion of Asgard genomes that were found in a given biotope. The numbers of these genomes per biotope are as follows: coastal sediment, 94; freshwater sediment, 15; hot spring, 1; hydrothermal vent, 13; hypersaline lake sediment, 1; marine sediment, 26; marine water, 26; petroleum seep (marine), 6; and petroleum field, 1. Boldface in the map indicates the sampling locations.
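The pie-chart proportions follow directly from the per-biotope counts listed in the caption. A minimal sketch of that arithmetic is given below; the names `BIOTOPE_COUNTS` and `biotope_fractions` are illustrative and are not taken from the authors' plotting code.

```python
# Genome counts per biotope, copied from the caption above.
BIOTOPE_COUNTS = {
    "coastal sediment": 94,
    "freshwater sediment": 15,
    "hot spring": 1,
    "hydrothermal vent": 13,
    "hypersaline lake sediment": 1,
    "marine sediment": 26,
    "marine water": 26,
    "petroleum seep (marine)": 6,
    "petroleum field": 1,
}


def biotope_fractions(counts):
    """Return each biotope's share of all listed genomes as a fraction in [0, 1]."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


if __name__ == "__main__":
    fractions = biotope_fractions(BIOTOPE_COUNTS)
    # Print the largest slices first.
    for name, frac in sorted(fractions.items(), key=lambda kv: -kv[1]):
        print(f"{name:28s} {frac:6.1%}")
```

The counts as listed sum to 183 genomes, so coastal sediment accounts for slightly more than half of the pie.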

Extended Data Fig. 2 Completeness and contamination for 75 Asgard MAGs.

These MAGs were assessed using CheckM v.1.0.12. a, Distribution of completeness and contamination for the 75 Asgard MAGs. b, c, Distribution of depth coverage (b) and N50 statistics (c) for Asgard MAGs reconstructed in this Article. The numbers in parentheses indicate the number of Asgard genomes recovered from a given sampling location. In cases in which fewer than three samples were recovered, these are presented as individual points. Thick black bar, median; lower and upper bounds of the box, first and third quartile, respectively; upper and lower whiskers, largest and smallest values within 1.5× interquartile range of the box, respectively; black points, values beyond 1.5× interquartile range of the box. Data for this plot are given in Supplementary Table 1.
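The box-plot conventions described in this caption (median bar, quartile box, whiskers limited to 1.5× the interquartile range, and raw points for groups of fewer than three samples) can be sketched as follows. This is an illustration under stated assumptions, not the authors' plotting code: the function name `box_plot_summary` is hypothetical, and the quartile interpolation method (`inclusive`) is an assumption because the caption does not specify one.

```python
from statistics import median, quantiles


def box_plot_summary(values, k=1.5):
    """Summarise one group of values using the conventions in the caption.

    Groups with fewer than three samples are returned as individual points.
    """
    values = sorted(values)
    if len(values) < 3:
        return {"points": values}

    # Quartile interpolation is an assumption; the caption does not state it.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr

    # Whiskers reach the most extreme values that are still inside the fences.
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "median": median(values),
        "q1": q1,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": [v for v in values if v < low_fence or v > high_fence],
    }
```

For example, calling `box_plot_summary` on hypothetical completeness values such as `[90, 92, 95, 96, 97, 98, 60]` reports 60 as an outlier point, while a group of two values is returned unchanged as individual points.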

Extended Data Fig. 3 Gene commonality plot for Asgard archaea and the TACK superphylum.

Gene commonality plot showing the number of Asgard COGs (log scale) (y axis) that include the given fraction of analysed genomes (x axis). The Asgard plot is compared with the TACK superphylum plot on the basis of the assignment of TACK genomes to archaeal COGs.

Extended Data Fig. 4 Comparison of the mean amino acid identity of Asgard and TACK superphyla.

In this figure, -archaeota is omitted from the phylum names. Sample sizes of less than three are presented as individual points. a, Shared amino acid identity across Asgard and TACK lineages. Comparison of representative genomes from all Asgard and TACK lineages analysed in this Article (excluding the six putative phyla proposed in this Article), which characterizes the distribution of amino acid identities that is typical of a phylum. b–m, Amino acid identity comparisons between Thorarchaeota (b), Hermodarchaeota (c), Odinarchaeota (d), Baldrarchaeota (e), Lokiarchaeota (f), Helarchaeota (g), Borrarchaeota (h), Heimdallarchaeota (i), Kariarchaeota (j), Gerdarchaeota (k), Hodarchaeota (l) and Wukongarchaeota (m) and other Asgard and TACK lineages. Thick black bar, median; lower and upper bounds of the box, first and third quartile, respectively; upper and lower whiskers, largest and smallest values within 1.5× interquartile range of the box, respectively; black points, values beyond 1.5× interquartile range of the box; number in the parentheses, number of genomes in the lineage. Data for this plot are given in Supplementary Table 2.

Extended Data Fig. 5 Comparison of the 16S rRNA gene sequence identity of Asgard and TACK lineages.

In this figure, -archaeota is omitted from the phylum names. Sample sizes of less than three are presented as individual points. a, 16S rRNA gene sequence identity across Asgard and TACK lineages. Comparison of 16S rRNA gene sequences from representative genomes of all Asgard and TACK lineages analysed in this Article (excluding the six putative phyla proposed in this Article), which characterizes the distribution of 16S rRNA sequence identity that is typical of a phylum. b–k, Comparison of 16S rRNA gene sequence identity between Thorarchaeota (b), Hermodarchaeota (c), Odinarchaeota (d), Lokiarchaeota (e), Helarchaeota (f), Heimdallarchaeota (g), Kariarchaeota (h), Gerdarchaeota (i), Hodarchaeota (j) and Wukongarchaeota (k) and other Asgard and TACK lineages. Thick black bar, median; lower and upper bounds of the box, first and third quartile, respectively; upper and lower whiskers, largest and smallest values within 1.5× interquartile range of the box, respectively; black points, values beyond 1.5× interquartile range of the box; number in the parentheses, number of genomes in the lineage. Data for this plot are given in Supplementary Table 3.

Extended Data Fig. 6 Classification of Asgard archaea by the phyletic patterns and the core gene set of Asgard archaea.

a, Classical multidimensional scaling analysis of binary presence–absence phyletic patterns for 13,939 Asgard COGs that are represented in at least two genomes (Methods). b, Functional breakdown of Asgard core genes (378 Asgard COGs) compared with TACK-superphylum core genes (489 archaeal COGs). Values were normalized as described in the Methods. Functional classes of genes: J, translation, ribosomal structure and biogenesis; K, transcription; L, replication, recombination and repair; D, cell cycle control, cell division and chromosome partitioning; V, defence mechanisms; T, signal transduction mechanisms; M, biogenesis of the cell wall, membrane or envelope; N, cell motility; U, intracellular trafficking, secretion and vesicular transport; O, posttranslational modification, protein turnover and chaperones; X, mobilome (prophages, plasmids and transposons); C, energy production and conversion; G, carbohydrate transport and metabolism; E, amino acid transport and metabolism; F, nucleotide transport and metabolism; H, coenzyme transport and metabolism; I, lipid transport and metabolism; P, inorganic ion transport and metabolism; Q, secondary metabolites biosynthesis, transport and catabolism; R, general function prediction only; S, function unknown. c, Presence–absence of orthologues of Asgard core genes in other archaea, bacteria and eukaryotes.

Extended Data Fig. 7 Phylogenetic trees.

a, Phylogenetic tree of bacteria, archaea and eukaryotes (inferred with IQ-tree using the LG + R10 model) that was constructed from the concatenated alignments of the protein sequences of 30 universally conserved genes (Methods). The tree shows the relationships between the major clades. b, Phylogenetic tree of COG0012 (ribosome-binding ATPase YchF). The tree was reconstructed using IQ-tree with the LG + R10 evolutionary model (selected by IQ-tree ModelFinder as the best fit). c, Phylogenetic tree of COG0201 (preprotein translocase subunit SecY). The tree was reconstructed using IQ-tree with the LG + F + R10 evolutionary model (selected by IQ-tree ModelFinder as the best fit). d, Phylogenetic tree of the reduced set of bacteria, archaea and eukaryotes (excluding the genomes of derived parasites), constructed from concatenated alignments of the protein sequences of 29 universal markers (excluding COG0012) using IQ-tree with the LG + R10 evolutionary model (selected by IQ-tree ModelFinder as the best fit). The tree shows the relationships between the major clades. e, Phylogenetic analysis of the evolutionary relationship between archaea and eukaryotes, excluding the Asgard superphylum. The tree was reconstructed from a concatenated alignment of the 29 universal markers (excluding COG0012) using IQ-tree with the LG + R10 evolutionary model (selected by IQ-tree ModelFinder as the best fit).
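Several of these trees are built from concatenated alignments of marker proteins. The sketch below illustrates one common way to concatenate per-marker alignments into a single supermatrix. It is not the authors' pipeline: the function name `concatenate_alignments` is hypothetical, and padding a taxon that lacks a marker with gap characters is an assumption, because the caption does not say how missing markers were handled.

```python
def concatenate_alignments(alignments, gap="-"):
    """Join per-marker alignments into one supermatrix.

    alignments: list of dicts, one per marker, mapping taxon name to its
    aligned sequence. Taxa missing from a marker are padded with gaps
    (an assumption made for this sketch).
    """
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    parts = {taxon: [] for taxon in taxa}

    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("all sequences in one marker alignment must have equal length")
        width = lengths.pop()
        for taxon in taxa:
            parts[taxon].append(aln.get(taxon, gap * width))

    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}
```

With two toy markers, `[{"A": "MK-L", "B": "MKQL"}, {"A": "GG", "C": "GA"}]`, taxon B receives two gap columns for the second marker and taxon C receives four gap columns for the first, so every row of the supermatrix has the same length.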

Extended Data Fig. 8 Phyletic patterns of ESPs in Asgard genomes.

All 505 Asgard COGs that correspond to ESPs are grouped by distance between binary presence–absence phyletic patterns. For a given pair of Asgard COGs A and B that are present in the sets of genomes $\{G_A\}$ and $\{G_B\}$, respectively, we calculate the similarity between the patterns as $S_{A,B} = |\{G_A\} \times \{G_B\}| / |\{G_A\} + \{G_B\}|$, and the distance between the patterns as $D_{A,B} = -\ln(S_{A,B})$. A dendrogram was reconstructed using the unweighted-pair group method with arithmetic mean, from the distance matrix $D$; the order of leaves in the tree determines the order of Asgard COGs in the figure. Top, patterns are shown schematically by pale blue lines, in which the respective Asgard COG is present and mapped to the 12 major Asgard lineages (as shown by the coloured bar above). The Asgard COGs that correspond to the most highly conserved ESP protein families are shown within the red rectangle. Bottom, plot of the number of Asgard COGs that correspond to ESPs in each of 76 genomes is shown. Complete data are provided in Supplementary Table 7. The colour code for the plot is the same as for the bar graph.
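The grouping procedure in this caption can be sketched in a few lines. The example below reads the product and sum symbols in the similarity formula as set intersection and set union, which is an assumption; under that reading S is the Jaccard index and D = −ln(S) is zero for identical patterns. The function names are hypothetical, the clustering is a plain unweighted-pair group method with arithmetic mean written for illustration, and none of this is the authors' code.

```python
import math
from itertools import combinations


def pattern_distance(genomes_a, genomes_b):
    """D = -ln(S), with S taken as |intersection| / |union| (an assumption)."""
    a, b = set(genomes_a), set(genomes_b)
    shared = len(a & b)
    if shared == 0:
        return math.inf  # no shared genome: S = 0, so the distance is unbounded
    return -math.log(shared / len(a | b))


def upgma_leaf_order(patterns):
    """Order COG names by UPGMA clustering of their phyletic patterns.

    patterns: dict mapping a COG name to the set of genomes that encode it.
    """
    # Each cluster is a tuple of leaf names kept in dendrogram order.
    clusters = [(name,) for name in patterns]
    leaf_dist = {
        frozenset((x, y)): pattern_distance(patterns[x], patterns[y])
        for x, y in combinations(patterns, 2)
    }

    def average_distance(c1, c2):
        # UPGMA: arithmetic mean over all leaf pairs between two clusters.
        total = sum(leaf_dist[frozenset((x, y))] for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Merge the two clusters with the smallest average distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]

    return list(clusters[0]) if clusters else []
```

COGs with identical presence–absence patterns merge first, so they end up adjacent in the returned order, which is the property the figure relies on when it lays out the 505 patterns by dendrogram leaf order.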

Extended Data Fig. 9 Metabolic features of Asgard archaea.

Schematic of the presence and absence of selected metabolic features in all phyla and putative phyla of Asgard archaea.

Extended Data Fig. 10 Phylogenetic analysis of [NiFe] hydrogenases in Asgard archaea.

a, Phylogenetic analysis of group-4 [NiFe] hydrogenases in Asgard archaea. The unrooted maximum-likelihood phylogenetic tree was built from an alignment of 425 sequences that included 110 sequences of Asgard archaea, with 308 amino acid positions. b, Phylogenetic analysis of group-3 [NiFe] hydrogenases in Asgard archaea. The unrooted maximum-likelihood phylogenetic tree was built from an alignment of 813 sequences that included 335 sequences of Asgard archaea, with 331 amino acid positions. c, Phylogenetic analysis of group-1 [NiFe] hydrogenases in the Asgard archaea. The unrooted maximum-likelihood phylogenetic tree was built from an alignment of 541 sequences that included 2 sequences of Wukongarchaeota, with 376 amino acid positions.

Supplementary information

Supplementary Information

This file contains (1) Description of new taxa; (2) Clusters of orthologous genes of Asgard archaea; (3) The core gene set of Asgard archaea; (4) Phylogenomic analysis of the Asgard superphylum and Asgard-eukaryote evolutionary relationship; (5) Eukaryotic Signature Proteins in Asgard archaea; and (6) Reconstruction of metabolic pathways in Asgard archaea.

Reporting Summary

Supplementary Table 1

Genome information, proposed taxonomy and isolation data.

Supplementary Table 2

Mean amino-acid identity values (%) among 66 TACK genomes and 184 Asgard genomes (162 high quality and 22 low-quality).

Supplementary Table 3

The 16S rRNA gene sequence identity (%) among TACK and Asgard lineages. Identity was calculated using sequences longer than 1,300 bp.

Supplementary Table 4

Species and phylogenetic markers used for the tree of life reconstruction.

Supplementary Table 5

Data for phylogenetic trees: methods, markers, bootstrap data and comments. The trees in Newick format and the underlying alignments are provided in supplementary data file 2.

Supplementary Table 6

The core asCOGs list.

Supplementary Table 7

Eukaryotic signature proteins in Asgard archaea.

Supplementary Table 8

The presence-absence of metabolic enzymes in Asgard archaea.

About this article

Cite this article

Liu, Y., Makarova, K.S., Huang, WC. et al. Expanded diversity of Asgard archaea and their relationships with eukaryotes. Nature 593, 553–557 (2021). https://doi.org/10.1038/s41586-021-03494-3