Expanded diversity of Asgard archaea and their relationships with eukaryotes
Data availability
Asgard archaea genomes generated in this study have been deposited in the eLibrary of Microbial Systematics and Genomics (https://www.biosino.org/elmsg/index) and are also available from the NCBI under BioProject identifier PRJNA680430. Publicly available genomes were retrieved from NCBI GenBank, MG-RAST and the figshare repository. The accession numbers of the newly generated and the public genomes are available in Supplementary Table 1. Supplementary data file 1 comprises the complete Asgard COG data archive (supplementary_data_file_1.tgz), and supplementary data file 2 contains the phylogenetic trees and alignments archive (supplementary_data_file_2.tgz); these files are available without restriction from https://doi.org/10.5281/zenodo.4624280 or https://ftp.ncbi.nih.gov/pub/wolf/_suppl/asgard20/. Any other relevant data are available from the corresponding authors upon reasonable request.
References
- Spang, A. et al. Complex archaea that bridge the gap between prokaryotes and eukaryotes. Nature 521, 173–179 (2015).
- Zaremba-Niedzwiedzka, K. et al. Asgard archaea illuminate the origin of eukaryotic cellular complexity. Nature 541, 353–358 (2017).
- MacLeod, F., Kindler, G. S., Wong, H. L., Chen, R. & Burns, B. P. Asgard archaea: diversity, function, and evolutionary implications in a range of microbiomes. AIMS Microbiol. 5, 48–61 (2019).
- Cai, M. et al. Diverse Asgard archaea including the novel phylum Gerdarchaeota participate in organic matter degradation. Sci. China Life Sci. 63, 886–897 (2020).
- Williams, T. A., Cox, C. J., Foster, P. G., Szöllősi, G. J. & Embley, T. M. Phylogenomics provides robust support for a two-domains tree of life. Nat. Ecol. Evol. 4, 138–147 (2020).
- Williams, T. A., Foster, P. G., Cox, C. J. & Embley, T. M. An archaeal origin of eukaryotes supports only two primary domains of life. Nature 504, 231–236 (2013).
- Cox, C. J., Foster, P. G., Hirt, R. P., Harris, S. R. & Embley, T. M. The archaebacterial origin of eukaryotes. Proc. Natl Acad. Sci. USA 105, 20356–20361 (2008).
- Yutin, N., Makarova, K. S., Mekhedov, S. L., Wolf, Y. I. & Koonin, E. V. The deep archaeal roots of eukaryotes. Mol. Biol. Evol. 25, 1619–1630 (2008).
- Da Cunha, V., Gaia, M., Gadelle, D., Nasir, A. & Forterre, P. Lokiarchaea are close relatives of Euryarchaeota, not bridging the gap between prokaryotes and eukaryotes. PLoS Genet. 13, e1006810 (2017).
- Da Cunha, V., Gaia, M., Nasir, A. & Forterre, P. Asgard archaea do not close the debate about the universal tree of life topology. PLoS Genet. 14, e1007215 (2018).
- Spang, A. et al. Asgard archaea are the closest prokaryotic relatives of eukaryotes. PLoS Genet. 14, e1007080 (2018).
- Forterre, P. The universal tree of life: an update. Front. Microbiol. 6, 717 (2015).
- Lombard, J., López-García, P. & Moreira, D. The early evolution of lipid membranes and the three domains of life. Nat. Rev. Microbiol. 10, 507–515 (2012).
- Akıl, C. & Robinson, R. C. Genomes of Asgard archaea encode profilins that regulate actin. Nature 562, 439–443 (2018).
- Akıl, C. et al. Insights into the evolution of regulated actin dynamics via characterization of primitive gelsolin/cofilin proteins from Asgard archaea. Proc. Natl Acad. Sci. USA 117, 19904–19913 (2020).
- Lu, Z. et al. Coevolution of eukaryote-like Vps4 and ESCRT-III subunits in the Asgard archaea. mBio 11, e00417-20 (2020).
- Imachi, H. et al. Isolation of an archaeon at the prokaryote–eukaryote interface. Nature 577, 519–525 (2020).
- Zhang, J.-W. et al. Newly discovered Asgard archaea Hermodarchaeota potentially degrade alkanes and aromatics via alkyl/benzyl-succinate synthase and benzoyl-CoA pathway. ISME J. https://doi.org/10.1038/s41396-020-00890-x (2021).
- Ciccarelli, F. D. et al. Toward automatic reconstruction of a highly resolved tree of life. Science 311, 1283–1287 (2006).
- Rinke, C. et al. Insights into the phylogeny and coding potential of microbial dark matter. Nature 499, 431–437 (2013).
- Nayfach, S. et al. A genomic catalog of Earth’s microbiomes. Nat. Biotechnol. 39, 499–509 (2021).
- Balasingam, N., Brandon, H. E., Ross, J. A., Wieden, H.-J. & Thakor, N. Cellular roles of the human Obg-like ATPase 1 (hOLA1) and its YchF homologs. Biochem. Cell Biol. 98, 1–11 (2020).
- Rinke, C. et al. Resolving widespread incomplete and uneven archaeal classifications based on a rank-normalized genome-based taxonomy. Preprint at https://doi.org/10.1101/2020.03.01.972265 (2020).
- Eme, L., Spang, A., Lombard, J., Stairs, C. W. & Ettema, T. J. G. Archaea and the origin of eukaryotes. Nat. Rev. Microbiol. 15, 711–723 (2017).
- Klinger, C. M., Spang, A., Dacks, J. B. & Ettema, T. J. G. Tracing the archaeal origins of eukaryotic membrane-trafficking system building blocks. Mol. Biol. Evol. 33, 1528–1541 (2016).
- Puigbò, P., Lobkovsky, A. E., Kristensen, D. M., Wolf, Y. I. & Koonin, E. V. Genomes in turmoil: quantification of genome dynamics in prokaryote supergenomes. BMC Biol. 12, 66 (2014).
- Christ, L., Raiborg, C., Wenzel, E. M., Campsteijn, C. & Stenmark, H. Cellular functions and molecular mechanisms of the ESCRT membrane-scission machinery. Trends Biochem. Sci. 42, 42–56 (2017).
- Su, M.-Y., Fromm, S. A., Zoncu, R. & Hurley, J. H. Structure of the C9orf72 ARF GAP complex that is haploinsufficient in ALS and FTD. Nature 585, 251–255 (2020).
- de Martín Garrido, N. & Aylett, C. H. S. Nutrient signaling and lysosome positioning crosstalk through a multifunctional protein, folliculin. Front. Cell Dev. Biol. 8, 108 (2020).
- Shen, K. et al. Architecture of the human GATOR1 and GATOR1–Rag GTPases complexes. Nature 556, 64–69 (2018).
- López-García, P. & Moreira, D. The syntrophy hypothesis for the origin of eukaryotes revisited. Nat. Microbiol. 5, 655–667 (2020).
- Martin, W. & Müller, M. The hydrogen hypothesis for the first eukaryote. Nature 392, 37–41 (1998).
- Moreira, D. & López-García, P. Symbiosis between methanogenic archaea and δ-proteobacteria as the origin of eukaryotes: the syntrophic hypothesis. J. Mol. Evol. 47, 517–530 (1998).
- López-García, P. & Moreira, D. Cultured Asgard archaea shed light on eukaryogenesis. Cell 181, 232–235 (2020).
- Spang, A. et al. Proposal of the reverse flow model for the origin of the eukaryotic cell based on comparative analyses of Asgard archaeal metabolism. Nat. Microbiol. 4, 1138–1148 (2019).
- Koonin, E. V. & Yutin, N. The dispersed archaeal eukaryome and the complex archaeal ancestor of eukaryotes. Cold Spring Harb. Perspect. Biol. 6, a016188 (2014).
- Liu, Y. et al. Comparative genomic inference suggests mixotrophic lifestyle for Thorarchaeota. ISME J. 12, 1021–1031 (2018).
- Cheng, L. et al. Isolation and characterization of Methanoculleus receptaculi sp. nov. from Shengli oil field, China. FEMS Microbiol. Lett. 285, 65–71 (2008).
- Peng, J., Lü, Z., Rui, J. & Lu, Y. Dynamics of the methanogenic archaeal community during plant residue decomposition in an anoxic rice field soil. Appl. Environ. Microbiol. 74, 2894–2901 (2008).
- Uritskiy, G. V., DiRuggiero, J. & Taylor, J. MetaWRAP–a flexible pipeline for genome-resolved metagenomic data analysis. Microbiome 6, 158 (2018).
- Li, D., Liu, C. M., Luo, R., Sadakane, K. & Lam, T. W. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics 31, 1674–1676 (2015).
- Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
- Kang, D. D. et al. MetaBAT 2: an adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies. PeerJ 7, e7359 (2019).
- Sieber, C. M. K. et al. Recovery of genomes from metagenomes via a dereplication, aggregation and scoring strategy. Nat. Microbiol. 3, 836–843 (2018).
- Peng, Y., Leung, H. C. M., Yiu, S. M. & Chin, F. Y. L. IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth. Bioinformatics 28, 1420–1428 (2012).
- Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).
- Nurk, S., Meleshko, D., Korobeynikov, A. & Pevzner, P. A. metaSPAdes: a new versatile metagenomic assembler. Genome Res. 27, 824–834 (2017).
- Parks, D. H., Imelfort, M., Skennerton, C. T., Hugenholtz, P. & Tyson, G. W. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 25, 1043–1055 (2015).
- Hyatt, D. et al. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11, 119 (2010).
- Seemann, T. Prokka: rapid prokaryotic genome annotation. Bioinformatics 30, 2068–2069 (2014).
- Chan, P. P. & Lowe, T. M. tRNAscan-SE: searching for tRNA genes in genomic sequences. Methods Mol. Biol. 1962, 1–14 (2019).
- Schäffer, A. A. et al. Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. Nucleic Acids Res. 29, 2994–3005 (2001).
- Steinegger, M. & Söding, J. MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nat. Biotechnol. 35, 1026–1028 (2017).
- Edgar, R. C. MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics 5, 113 (2004).
- Söding, J. Protein homology detection by HMM–HMM comparison. Bioinformatics 21, 951–960 (2005).
- Price, M. N., Dehal, P. S. & Arkin, A. P. FastTree 2 – approximately maximum-likelihood trees for large alignments. PLoS ONE 5, e9490 (2010).
- Rice, P., Longden, I. & Bleasby, A. EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet. 16, 276–277 (2000).
- Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).
- Nguyen, L.-T., Schmidt, H. A., von Haeseler, A. & Minh, B. Q. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 32, 268–274 (2015).
- Larsson, A. AliView: a fast and lightweight alignment viewer and editor for large datasets. Bioinformatics 30, 3276–3278 (2014).
- Esterman, E. S., Wolf, Y. I., Kogay, R., Koonin, E. V. & Zhaxybayeva, O. Evolution of DNA packaging in gene transfer agents. Virus Evol. 7, veab015 (2021).
- Puigbò, P., Wolf, Y. I. & Koonin, E. V. Search for a ‘Tree of Life’ in the thicket of the phylogenetic forest. J. Biol. 8, 59 (2009).
- R Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2019).
- Zimmermann, L. et al. A completely reimplemented MPI bioinformatics toolkit with a new HHpred server at its core. J. Mol. Biol. 430, 2237–2243 (2018).
- Yang, M., Derbyshire, M. K., Yamashita, R. A. & Marchler-Bauer, A. NCBI’s conserved domain database and tools for protein domain analysis. Curr. Protoc. Bioinformatics 69, e90 (2020).
- Søndergaard, D., Pedersen, C. N. S. & Greening, C. HydDB: a web tool for hydrogenase classification and analysis. Sci. Rep. 6, 34212 (2016).
- Criscuolo, A. & Gribaldo, S. BMGE (block mapping and gathering with entropy): a new software for selection of phylogenetic informative regions from multiple sequence alignments. BMC Evol. Biol. 10, 210 (2010).
- Swofford, D. L. & Maddison, W. P. Reconstructing ancestral character states under Wagner parsimony. Math. Biosci. 87, 199–229 (1987).
Acknowledgements
We thank P. Forterre and T. Williams for critical reading of the manuscript and helpful suggestions; J. Chen, H. Li, P. Du and D. Zou for support with sampling and preliminary analysis of Changjiang estuary sediments; Z. Zhou and J.-D. Gu for support with sampling and preliminary analysis of Mai Po Nature Reserve sediments; S. Zheng and F. Liu for support with sampling and preliminary analysis of Rongcheng Swan Lake Nature Reserve sediments; and the crew and scientific team of RV Xiangyanghong 09, the pilots and the supporting team of Jiaolong manned submersible in the 37th Dayang Cruise for the sampling. M.L., Y.L., X.Z., W.X., Z.L. and L.C. are supported by National Natural Science Foundation of China (grant no. 91851105, 31970105, 92051102, 31700430, 91951102, 41776170 and 92051108), the Innovation Team Project of Universities in Guangdong Province (no. 2020KCXTD023), the Shenzhen Science and Technology Program (grant no. JCYJ20200109105010363 and JCYJ20190808152403587), the Scientific Research Foundation of Third Institute of Oceanography, MNR (2019022), the China Ocean Mineral Resources R&D Association (COMRA) Program (DY135-B2-09) and the National Key Basic Research Program of China (‘973’-Program, 2015CB755903). K.S.M., Y.I.W., A.N. and E.V.K. are supported by the Intramural Research Program of the National Institutes of Health of the USA (National Library of Medicine).
Author information
Author notes
- These authors contributed equally: Yang Liu, Kira S. Makarova, Wen-Cong Huang
Authors and Affiliations
- Shenzhen Key Laboratory of Marine Microbiome Engineering, Institute for Advanced Study, Shenzhen University, Shenzhen, P. R. China
Yang Liu, Wen-Cong Huang, Xinxu Zhang, Mingwei Cai, Cui-Jing Zhang & Meng Li
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
Kira S. Makarova, Yuri I. Wolf, Anastasia N. Nikolskaya & Eugene V. Koonin
- Key Laboratory of Marine Biogenetic Resources, Third Institute of Oceanography, Ministry of Natural Resources, Xiamen, P. R. China
Wei Xu & Zhuhua Luo
- Key Laboratory of Development and Application of Rural Renewable Energy, Biogas Institute of Ministry of Agriculture, Chengdu, P. R. China
Lei Cheng
Authors
- Yang Liu
- Kira S. Makarova
- Wen-Cong Huang
- Yuri I. Wolf
- Anastasia N. Nikolskaya
- Xinxu Zhang
- Mingwei Cai
- Cui-Jing Zhang
- Wei Xu
- Zhuhua Luo
- Lei Cheng
- Eugene V. Koonin
- Meng Li
Contributions
M.L., E.V.K., K.S.M. and Y.L. initiated the study; Y.L., W.-C.H., M.C., C.-J.Z., W.X., Z.L. and L.C. participated in sample collections; Y.L., X.Z., M.C., C.-J.Z., W.X., Z.L. and L.C. performed metagenomic assembly and binning analysis. Y.L. performed metabolism analysis; K.S.M., A.N.N. and Y.I.W. performed comparative genomic analysis; Y.L., K.S.M., Y.I.W. and W.-C.H. performed phylogenetic analysis; K.S.M. and Y.I.W. constructed Asgard COGs; K.S.M., Y.I.W., Y.L., M.L. and E.V.K. analysed the data; Y.L., K.S.M., W.-C.H., X.Z., M.C., C.-J.Z., W.X., Z.L., L.C., E.V.K. and M.L. wrote the manuscript that was read, edited and approved by all authors.
Corresponding authors
Correspondence to Eugene V. Koonin or Meng Li.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Peer review information Nature thanks the anonymous reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Fig. 1 Global distribution of the Asgard genomes analysed in this Article.
The world map was generated using the R package rnaturalearth v.0.1.0 in R v.3.6.3 (ref. 63). The pie chart shows the proportion of Asgard genomes that were found in a given biotope. The numbers of these genomes per biotope are as follows: coastal sediment, 94; freshwater sediment, 15; hot spring, 1; hydrothermal vent, 13; hypersaline lake sediment, 1; marine sediment, 26; marine water, 26; petroleum seep (marine), 6; and petroleum field, 1. Boldface in the map indicates the sampling locations.
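The pie-chart proportions can be recomputed from the per-biotope counts listed in this caption. The following is a minimal sketch in Python (not the R code used for the figure); the dictionary and function names are illustrative. The listed counts sum to 183.

```python
# Per-biotope genome counts, copied from the caption of Extended Data Fig. 1.
BIOTOPE_COUNTS = {
    "coastal sediment": 94,
    "freshwater sediment": 15,
    "hot spring": 1,
    "hydrothermal vent": 13,
    "hypersaline lake sediment": 1,
    "marine sediment": 26,
    "marine water": 26,
    "petroleum seep (marine)": 6,
    "petroleum field": 1,
}


def biotope_proportions(counts):
    """Return each biotope's share of the total as a fraction in [0, 1]."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


if __name__ == "__main__":
    shares = biotope_proportions(BIOTOPE_COUNTS)
    # Print the biotopes from the largest share to the smallest.
    for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
        print(f"{name:28s} {share:6.1%}")
```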
Extended Data Fig. 2 Completeness and contamination for 75 Asgard MAGs.
These MAGs were assessed using CheckM v.1.0.12. a, Distribution of completeness and contamination for 75 Asgard MAGs assessed by CheckM v.1.0.12. b, c, Distribution of depth coverage (b) and N50 statistics (c) for Asgard MAGs reconstructed in this Article. The numbers in parentheses indicate the number of Asgard genomes recovered from a given sampling location. In cases in which fewer than three samples were recovered, these are presented as individual points. Thick black bar, median; upper and lower bounds of the box plot, first and third quartile, respectively; upper and lower whiskers, largest and smallest values less than 1.5× interquartile range, respectively; black points, values greater than 1.5× interquartile range. Data for this plot are given in Supplementary Table 1.
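The box-plot conventions described above (median bar, quartile box, whiskers within 1.5× the interquartile range, points beyond it) can be sketched as follows. This is an illustrative Python version, not the plotting code used for the figures; the quantile interpolation rule ("inclusive") and the function name are assumptions, and the example values in the usage note are hypothetical.

```python
import statistics


def box_plot_summary(values):
    """Tukey-style box-plot statistics for a list of numbers.

    Returns None for fewer than three values, mirroring the caption's rule
    that such groups are drawn as individual points instead of a box.
    """
    if len(values) < 3:
        return None
    data = sorted(values)
    # Quartiles with inclusive interpolation (an assumption of this sketch).
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        # Whiskers end at the most extreme values still inside the fences.
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        # Values beyond the fences are drawn as separate points.
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }
```

For example, calling `box_plot_summary([40, 82, 85, 88, 90, 91, 93, 95, 96])` on these hypothetical completeness values treats 40 as an outlier and ends the lower whisker at 82.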
Extended Data Fig. 3 Gene commonality plot for Asgard archaea and the TACK superphylum.
Gene commonality plot showing the number of Asgard COGs (log scale) (y axis) that include the given fraction of analysed genomes (x axis). The Asgard plot is compared with the TACK superphylum plot on the basis of the assignment of TACK genomes to archaeal COGs.
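A gene commonality tally of this kind can be sketched as follows: for each COG, compute the fraction of analysed genomes in which it is present, then count the COGs falling into each fraction bin. This is a minimal Python illustration; the equal-width binning, the function name and the input layout (a mapping from COG identifier to the genomes containing it) are assumptions, not the authors' procedure. The figure plots the counts on a log scale, whereas the sketch returns raw counts.

```python
def gene_commonality(cog_to_genomes, n_genomes, n_bins=10):
    """Count COGs per bin of the fraction of genomes in which they occur.

    cog_to_genomes: dict mapping a COG identifier to an iterable of genome names.
    n_genomes: total number of analysed genomes.
    Returns a list of n_bins counts; bin i covers fractions [i/n_bins, (i+1)/n_bins),
    and the last bin also includes COGs present in every genome.
    """
    counts = [0] * n_bins
    for genomes in cog_to_genomes.values():
        k = len(set(genomes))  # number of distinct genomes carrying this COG
        if k == 0:
            continue  # a COG with no members contributes nothing
        # Integer arithmetic avoids floating-point edge effects at bin borders.
        index = min(k * n_bins // n_genomes, n_bins - 1)
        counts[index] += 1
    return counts
```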
Extended Data Fig. 4 Comparison of the mean amino acid identity of Asgard and TACK superphyla.
In this figure, -archaeota is omitted from the phylum names. Sample sizes of less than three are presented as individual points. a, Shared amino acid identity across Asgard and TACK lineages. Comparison of representative genomes from all Asgard and TACK lineages analysed in this Article (excluding the six putative phyla proposed in this Article), which characterizes the distribution of amino acid identities that is typical of a phylum. b–m, Amino acid identity comparisons between Thorarchaeota (b), Hermodarchaeota (c), Odinarchaeota (d), Baldrarchaeota (e), Lokiarchaeota (f), Helarchaeota (g), Borrarchaeota (h), Heimdallarchaeota (i), Kariarchaeota (j), Gerdarchaeota (k), Hodarchaeota (l) and Wukongarchaeota (m) and other Asgard and TACK lineages. Thick black bar, median; upper and lower bounds of the box plot, first and third quartile respectively; upper and lower whiskers, largest and smallest values less than 1.5× interquartile range, respectively; black points, values greater than 1.5× interquartile range; number in the parentheses, number of genomes in the lineage. Data for this plot are given in Supplementary Table 2.
Extended Data Fig. 5 Comparison of the 16S rRNA gene sequence identity of Asgard and TACK lineages.
In this figure, -archaeota is omitted from the phylum names. Sample sizes of less than three are presented as individual points. a, 16S rRNA gene sequence identity across Asgard and TACK lineages. Comparison of 16S rRNA gene sequences from representative genomes of all Asgard and TACK lineages analysed in this Article (excluding the six putative phyla proposed in this Article), which characterizes the distribution of 16S rRNA sequence identities that is typical of a phylum. b–k, Comparison of 16S rRNA gene sequence identity between Thorarchaeota (b), Hermodarchaeota (c), Odinarchaeota (d), Lokiarchaeota (e), Helarchaeota (f), Heimdallarchaeota (g), Kariarchaeota (h), Gerdarchaeota (i), Hodarchaeota (j) and Wukongarchaeota (k) and other Asgard and TACK lineages. Thick black bar, median; upper and lower bounds of the box plot, first and third quartile respectively; upper and lower whiskers, largest and smallest values less than 1.5× interquartile range, respectively; black points, values greater than 1.5× interquartile range; number in the parentheses, number of genomes in the lineage. Data for this plot are given in Supplementary Table 3.
Extended Data Fig. 6 Classification of Asgard archaea by the phyletic patterns and the core gene set of Asgard archaea.
a, Classical multidimensional scaling analysis of binary presence–absence phyletic patterns for 13,939 Asgard COGs that are represented in at least two genomes (Methods). b, Functional breakdown of Asgard core genes (378 Asgard COGs) compared with TACK-superphylum core genes (489 archaeal COGs). Values were normalized as described in the Methods. Functional classes of genes: J, translation, ribosomal structure and biogenesis; K, transcription; L, replication, recombination and repair; D, cell cycle control, cell division and chromosome partitioning; V, defence mechanisms; T, signal transduction mechanisms; M, biogenesis of the cell wall, membrane or envelope; N, cell motility; U, intracellular trafficking, secretion and vesicular transport; O, posttranslational modification, protein turnover and chaperones; X, mobilome (prophages, plasmids and transposons); C, energy production and conversion; G, carbohydrate transport and metabolism; E, amino acid transport and metabolism; F, nucleotide transport and metabolism; H, coenzyme transport and metabolism; I, lipid transport and metabolism; P, inorganic ion transport and metabolism; Q, secondary metabolites biosynthesis, transport and catabolism; R, general function prediction only; S, function unknown. c, Presence–absence of orthologues of Asgard core genes in other archaea, bacteria and eukaryotes.
Extended Data Fig. 7 Phylogenetic trees.
a, Phylogenetic tree of bacteria, archaea and eukaryotes (inferred with IQ-tree using the LG + R10 model) that was constructed from the concatenated alignments of the protein sequences of 30 universally conserved genes (Methods). The tree shows the relationships between the major clades. b, Phylogenetic tree of COG0012 (ribosome-binding ATPase YchF). The tree was reconstructed using IQ-tree with LG + R10 evolutionary model (selected by IQ-tree ModelFinder as the best fit). c, Phylogenetic tree of COG0201 (preprotein translocase subunit SecY). The tree was reconstructed using IQ-tree with LG + F + R10 evolutionary model (selected by IQ-tree ModelFinder as the best fit). d, Phylogenetic tree of the reduced set of bacteria, archaea and eukaryotes (excluding the genomes of derived parasites), constructed from concatenated alignments of the protein sequences of 29 universal markers (excluding COG0012) using IQ-tree with LG + R10 evolutionary model (selected by IQ-tree ModelFinder as the best fit). The tree shows the relationships between the major clades. e, Phylogenetic analysis of the evolutionary relationship between archaea and eukaryotes, excluding the Asgard superphylum. The tree was reconstructed from a concatenated alignment of the 29 universal markers (excluding COG0012) using IQ-tree with LG + R10 evolutionary model (selected by IQ-tree ModelFinder as the best fit).
Extended Data Fig. 8 Phyletic patterns of ESPs in Asgard genomes.
All 505 Asgard COGs that correspond to ESPs are grouped by distance between binary presence–absence phyletic patterns. For a given pair of Asgard COGs A and B that are present in the sets of genomes $\{G_A\}$ and $\{G_B\}$, respectively, we calculate the similarity between the patterns as $S_{A,B} = |\{G_A\} \times \{G_B\}| / |\{G_A\} + \{G_B\}|$, and the distance between the patterns as $D_{A,B} = -\ln(S_{A,B})$. A dendrogram was reconstructed using the unweighted-pair group method with arithmetic mean, from the distance matrix $D$; the order of leaves in the tree determines the order of Asgard COGs in the figure. Top, patterns are shown schematically by pale blue lines, in which the respective Asgard COG is present and mapped to the 12 major Asgard lineages (as shown by the coloured bar above). The Asgard COGs that correspond to the most highly conserved ESP protein families are shown within the red rectangle. Bottom, plot of the number of Asgard COGs that correspond to ESPs in each of 76 genomes is shown. Complete data are provided in Supplementary Table 7. The colour code for the plot is the same as for the bar graph.
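The grouping described in this caption can be illustrated with a small Python sketch. It rests on one loudly stated assumption: the product and sum in the similarity formula are read here as the intersection and union of the two genome sets, so that the similarity is the shared fraction of genomes (a Jaccard-type index) and lies between 0 and 1. The function names, the handling of non-overlapping patterns (infinite distance) and the toy COG names are illustrative, not taken from the authors' code.

```python
import math
from itertools import combinations


def pattern_similarity(genomes_a, genomes_b):
    """Shared fraction of genomes: |intersection| / |union| (assumed reading)."""
    a, b = set(genomes_a), set(genomes_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pattern_distance(genomes_a, genomes_b):
    """D = -ln(S); patterns with no shared genome get an infinite distance."""
    s = pattern_similarity(genomes_a, genomes_b)
    return -math.log(s) if s > 0 else math.inf


def upgma_leaf_order(patterns):
    """Order COGs by UPGMA clustering of their phyletic patterns.

    patterns: dict mapping a COG name to the set of genomes that contain it.
    Returns the COG names in the left-to-right leaf order of the dendrogram.
    """
    if not patterns:
        return []
    leaf_dist = {
        frozenset((x, y)): pattern_distance(patterns[x], patterns[y])
        for x, y in combinations(patterns, 2)
    }

    def cluster_distance(c1, c2):
        # UPGMA: arithmetic mean of all leaf-to-leaf distances between clusters.
        total = sum(leaf_dist[frozenset((x, y))] for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    clusters = [(name,) for name in patterns]  # each cluster is a tuple of leaves
    while len(clusters) > 1:
        # Merge the closest pair of clusters, keeping their leaves adjacent.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return list(clusters[0])
```

With four hypothetical COGs, two sharing genomes g1 to g3 and two sharing g5 and g6, the returned order places each similar pair side by side, which is how the leaf order determines the order of COGs in the figure.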
Extended Data Fig. 9 Metabolic features of Asgard archaea.
Schematic of the presence and absence of selected metabolic features in all phyla and putative phyla of Asgard archaea.
Extended Data Fig. 10 Phylogenetic analysis of [NiFe] hydrogenases in Asgard archaea.
a, Phylogenetic analysis of group-4 [NiFe] hydrogenases in Asgard archaea. The unrooted maximum-likelihood phylogenetic tree was built from an alignment of 425 sequences that included 110 sequences of Asgard archaea, with 308 amino acid positions. b, Phylogenetic analysis of group-3 [NiFe] hydrogenases in Asgard archaea. The unrooted maximum-likelihood phylogenetic tree was built from an alignment of 813 sequences that included 335 sequences of Asgard archaea, with 331 amino acid positions. c, Phylogenetic analysis of group-1 [NiFe] hydrogenases in the Asgard archaea. The unrooted maximum-likelihood phylogenetic tree was built from an alignment of 541 sequences that included 2 sequences of Wukongarchaeota, with 376 amino acid positions.
Supplementary information
Supplementary Information
This file contains (1) Description of new taxa; (2) Clusters of orthologous genes of Asgard archaea; (3) The core gene set of Asgard archaea; (4) Phylogenomic analysis of the Asgard superphylum and Asgard-eukaryote evolutionary relationship; (5) Eukaryotic Signature Proteins in Asgard archaea; and (6) Reconstruction of metabolic pathways in Asgard archaea.
Reporting Summary
Supplementary Table 1
Genome information, proposed taxonomy and isolation data.
Supplementary Table 2
Mean amino-acid identity values (%) among 66 TACK genomes and 184 Asgard genomes (162 high-quality and 22 low-quality).
Supplementary Table 3
The 16S rRNA gene sequence identity (%) among TACK and Asgard lineages. Identity was calculated using sequences longer than 1,300 bp.
Supplementary Table 4
Species and phylogenetic markers used for the tree of life reconstruction.
Supplementary Table 5
Data for phylogenetic trees: methods, markers, bootstrap data and comments. The trees in the Newick format and the underlying alignments are provided in supplementary data file 2.
Supplementary Table 6
The core asCOGs list.
Supplementary Table 7
Eukaryotic signature proteins in Asgard archaea.
Supplementary Table 8
The presence-absence of metabolic enzymes in Asgard archaea.
About this article
Cite this article
Liu, Y., Makarova, K.S., Huang, WC. et al. Expanded diversity of Asgard archaea and their relationships with eukaryotes. Nature 593, 553–557 (2021). https://doi.org/10.1038/s41586-021-03494-3
- Received: 15 November 2020
- Accepted: 26 March 2021
- Published: 28 April 2021
- Issue Date: 27 May 2021
- DOI: https://doi.org/10.1038/s41586-021-03494-3