Resolving conflict in eutherian mammal phylogeny using phylogenomics and the multispecies coalescent model

Sen Song et al. Proc Natl Acad Sci U S A. 2012.

Abstract

The reconstruction of the Tree of Life has relied almost entirely on concatenation methods, which do not accommodate gene tree heterogeneity, a property that simulations and theory have identified as a likely cause of incongruent phylogenies. However, this incongruence has not yet been demonstrated in empirical studies. Several key relationships among eutherian mammals remain controversial and conflict among previous studies, including the root of the eutherian tree and the relationships within Euarchontoglires and Laurasiatheria. Both Bayesian and maximum-likelihood analyses of genome-wide data of 447 nuclear genes from 37 species show that concatenation methods indeed yield strong incongruence in the phylogeny of eutherian mammals, as revealed by subsampling analyses of loci and taxa, which produced strongly conflicting topologies. In contrast, the coalescent methods, which accommodate gene tree heterogeneity, yield a phylogeny that is robust to variable gene and taxon sampling and is congruent with geographic data. The data also demonstrate that incomplete lineage sorting, a major source of gene tree heterogeneity, is relevant to deep-level phylogenies, such as those among eutherian mammals. Our results firmly place the eutherian root between Atlantogenata and Boreoeutheria and support ungulate polyphyly and a sister-group relationship between Scandentia and Primates. This study demonstrates that the incongruence introduced by concatenation methods is a major cause of long-standing uncertainty in the phylogeny of eutherian mammals, and the same may apply to other clades. Our analyses suggest that such incongruence can be resolved using phylogenomic data and coalescent methods that deal explicitly with gene tree heterogeneity.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.

Evolutionary relationships of eutherian mammals. The phylogeny was estimated using the maximum-pseudolikelihood coalescent method MP-EST with multilocus bootstrapping (8, 40). The numbers on the tree indicate bootstrap support values; values are omitted for nodes with bootstrap support >90%. Branch lengths were estimated by fitting the concatenated sequence data for all 447 loci to the MP-EST topology using standard ML and an appropriate substitution model in PAUP* v.4.0 (45). (Inset) The eutherian phylogeny estimated using the Bayesian concatenation method implemented in MrBayes (22). The ML concatenation tree built by RAxML (23) is identical to the Bayesian concatenation tree in topology. Branches of the concatenation tree are coded by the same colors as in the MP-EST tree. The blue asterisks indicate the positions of Scandentia (tree shrews), Chiroptera (bats), Perissodactyla (odd-toed ungulates), and Carnivora (carnivores), whose placement differs from the coalescent tree. The Bayesian concatenation tree received a posterior probability support of 1.0 for all nodes. The concatenation tree with taxon names is shown in SI Appendix, Fig. S2.

Fig. 2.

Trends in bootstrap support for coalescent analyses and incongruence of concatenation estimates for eutherian phylogeny. (A) Gradual increase in bootstrap support values with increasing gene numbers using coalescent methods for three clades: Scandentia–Primates within Euarchontoglires, and Perissodactyla–Carnivora and Cetartiodactyla–(Perissodactyla, Carnivora) within Laurasiatheria. The gray dashed line indicates bootstrap support of 90%. (B) Concatenation analyses yield conflicting phylogenies within Euarchontoglires and Laurasiatheria for subsampled gene and taxon sets. We constructed coalescent and concatenation trees for different sets of 25, 50, 100, 200, and 300 genes randomly selected from the 447-gene set, with 10 replicates for each gene-set size except the full 447-gene set. We also constructed trees for two reduced taxon sets by excluding 6 and 12 eutherian taxa. White cells in the heatmap indicate that the support for all replicates is <0.9 or 90%. Colored cells indicate relationships that received node support values >0.9 or 90% for at least one replicate. Cells with two colors indicate two highly supported but conflicting relationships among different replicates. Note that the concatenation analyses frequently support conflicting relationships for different gene and taxon sets, whereas the coalescent methods consistently support the same topology.
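
The gene subsampling design and the heatmap cell rule described in this caption can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the integer gene identifiers, the fixed random seed, and the cell labels are all assumptions made for illustration, and support values are assumed to be on a 0 to 1 scale. The caption does not say how a support value of exactly 0.9 is treated, so the sketch arbitrarily counts only values strictly above the threshold as supported.

```python
import random


def subsample_gene_sets(n_genes=447, sizes=(25, 50, 100, 200, 300),
                        replicates=10, seed=0):
    """Draw `replicates` random gene subsets (without replacement) per size.

    Genes are represented by hypothetical integer ids 0..n_genes-1.
    """
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    gene_ids = range(n_genes)
    return {
        size: [sorted(rng.sample(gene_ids, size)) for _ in range(replicates)]
        for size in sizes
    }


def classify_cell(replicate_results, threshold=0.9):
    """Classify one heatmap cell from (relationship, support) pairs.

    Returns "white" if no replicate exceeds the threshold, "one color" if
    exactly one relationship is highly supported in at least one replicate,
    and "two colors" if different replicates highly support conflicting
    relationships.
    """
    supported = {rel for rel, support in replicate_results if support > threshold}
    if not supported:
        return "white"
    if len(supported) == 1:
        return "one color"
    return "two colors"
```

In an actual analysis each subset would be passed to a tree-building program, and the resulting support values for the clade of interest would feed `classify_cell`; that step is outside the scope of this sketch.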

Fig. 3.

The mammal data set is consistent with the multispecies coalescent model. (A) Distribution of expected and observed gene tree distances. Expected gene trees were simulated from the MP-EST species tree under the multispecies coalescent model. Observed gene trees were estimated from the 447 genes in the full data set. Gene tree distances were calculated using standard measures (27). Note that the expected gene tree distance can account for about 77% of the observed gene tree distance. (B–E) Distribution of majority and minority gene tree triplets for specific eutherian clades. In cases where one of the three taxa in the triplet consists of multiple species, we counted the frequency of all relevant gene tree triplets for a given gene and then assigned the majority triplet to that gene. Ties were ignored, and hence the totals sometimes do not sum to 447 genes.
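
The per-gene majority-triplet rule described for panels B–E can be sketched as follows. This is an illustrative reading of the caption, not the authors' implementation: the function names, the string encoding of triplet topologies (for example "AB|C"), and the input layout are assumptions. A gene whose most frequent triplets are tied is dropped, which is why totals can fall below the number of genes.

```python
from collections import Counter


def majority_triplet(triplets):
    """Return the most frequent triplet topology for one gene, or None on a tie.

    `triplets` lists the topology observed for each relevant species
    combination in that gene's tree (hypothetical labels such as "AB|C").
    """
    ranked = Counter(triplets).most_common()
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie between the top topologies: the gene is ignored
    return ranked[0][0]


def tally_triplets(triplets_by_gene):
    """Count how many genes are assigned to each triplet topology."""
    assigned = (majority_triplet(t) for t in triplets_by_gene.values())
    return Counter(t for t in assigned if t is not None)
```

For example, a gene with observed triplets `["AB|C", "AB|C", "AC|B"]` is assigned to `"AB|C"`, while a gene with `["AB|C", "AC|B"]` is tied and contributes to no count.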

References

    1. de Queiroz A, Gatesy J. The supermatrix approach to systematics. Trends Ecol Evol. 2007;22(1):34–41.
    2. William J, Ballard O. Combining data in phylogenetic analysis. Trends Ecol Evol. 1996;11:334.
    3. Belfiore NM, Liu L, Moritz C. Multilocus phylogenetics of a rapid radiation in the genus Thomomys (Rodentia: Geomyidae). Syst Biol. 2008;57(2):294–310.
    4. Edwards SV. Is a new and general theory of molecular systematics emerging? Evolution. 2009;63(1):1–19.
    5. Kubatko LS, Degnan JH. Inconsistency of phylogenetic estimates from concatenated data under coalescence. Syst Biol. 2007;56(1):17–24.
