Retroposon analysis and recent geological data suggest near-simultaneous divergence of the three superorders of mammals

Abstract

As a consequence of recent developments in molecular phylogenomics, all extant orders of placental mammals have been grouped into 3 lineages: Afrotheria, Xenarthra, and Boreotheria, which originated in Africa, South America, and Laurasia, respectively. Despite this advancement, the order of divergence of these 3 lineages remains unresolved. Here, we performed extensive retroposon analysis with mammalian genomic data. Surprisingly, we identified a similar number of informative retroposon loci that support each of the 3 possible phylogenetic hypotheses: the basal position for Afrotheria (22 loci), Xenarthra (25 loci), and Boreotheria (21 loci). This result indicates that the divergence of the placental common ancestor into the 3 lineages occurred nearly simultaneously. Thus, we examined whether these molecular data could be integrated into the geological context by incorporating recent geological data. We obtained firm evidence that complete separation of Gondwana into Africa and South America occurred 120 ± 10 Ma. Accordingly, the previously reported time frame (division of Pangea into Gondwana and Laurasia at 148–138 Ma and division of Gondwana at 105 Ma) cannot be used to validate mammalian divergence order. Instead, we use our retroposon results and the recent geological data to propose that near-simultaneous divisions of continents leading to isolated Africa, South America, and Laurasia caused nearly concomitant divergence of the ancient placental ancestor into 3 lineages, Afrotheria, Xenarthra, and Boreotheria, ≈120 Ma.

Keywords: continental divisions, incomplete lineage sorting, long interspersed element 1, mammalian phylogeny, paleobiogeography


Recent phylogenetic analyses using dozens of gene sequences and other genomic markers such as insertions/deletions (indels) and retroposon insertions have largely revealed the interordinal relationships among placental mammals (eutherians) (1–7). These studies agree that 18 orders of placental mammals can be grouped into 3 major lineages, Afrotheria, Xenarthra, and Boreotheria (or Boreoeutheria; Fig. 1A). Afrotheria comprises 6 orders that include elephants, manatees, and aardvarks, and it is considered to have originated in Africa. Xenarthra includes armadillos, sloths, and anteaters and originated in South America. Boreotheria comprises 2 major mammalian groups, Laurasiatheria (e.g., carnivorans and bats) and Euarchontoglires (or Supraprimates; e.g., primates and rodents) and is considered to have originated in Laurasia of the Northern Hemisphere. Despite large-scale molecular phylogenetic analyses, the relationship among these 3 groups, which will finally establish the root of the placental mammalian tree, remains unresolved (4, 8). Three phylogenetic trees have been proposed as follows: tree 1, basal-Afrotheria (Boreotheria + Xenarthra; Exafroplacentalia); tree 2, basal-Xenarthra (Boreotheria + Afrotheria; Epitheria); and tree 3, basal-Boreotheria (Afrotheria + Xenarthra; Atlantogenata) (Fig. 1B).

Fig. 1.

Placental mammalian tree and 3 phylogenetic hypotheses of major lineages. (A) Mammalian interordinal phylogeny previously analyzed by retroposon insertion analysis (6) with the time scale (53). The bold vertical line denotes the root of placental mammals that was addressed in this study. (B) Three phylogenetic hypotheses to root the eutherian tree: tree 1, Exafroplacentalia (Afrotheria basal); tree 2, Epitheria (Xenarthra basal); and tree 3, Atlantogenata (Boreotheria basal).

A possible association between continental fission/fusion events and the divergence of mammals was first proposed by Hedges et al. (9). They estimated molecular divergence times among many mammalian orders to be ≈100 Ma, which is inconsistent with fossil records suggesting that most mammals had diverged after 65 Ma. One of their interpretations was that the early divergence of mammals occurred because of several continental fissions. This hypothesis prompted many biologists to consider available geological data when estimating lineage divergence. One of the outcomes of this idea is Afrotheria, which is a good representative case for the association between continental drift and mammalian evolution (10, 11). Springer et al. (10) provided the first evidence that African-endemic mammals are monophyletic, and this hypothesis has been supported by subsequent molecular phylogenetic studies (1–3, 12) and analysis of retroposon insertions among orders (13, 14). Afrotherian mammals are considered to have evolved in Africa after it became isolated from other landmasses for several tens of millions of years (10, 11).

Based on the fact that placental mammal groups of Afrotheria, Xenarthra, and Boreotheria originated in Africa, South America, and Laurasia, respectively, the divergence of these mammals has been proposed to be linked to the separation of continents and subsequent land mass drift (3, 15–17). Paleogeological estimates reported by Smith et al. (18) indicate that Laurasia (where Boreotheria originated) separated first from Pangea at 148–138 Ma and that the supercontinent Gondwana in the Southern Hemisphere divided into Africa and South America at 105 Ma. Indeed, molecular analyses have estimated the time of divergence among the 3 mammalian lineages to be ≈120–95 Ma (17, 19), approximately coincident with the geological estimation. If the phylogeny of these 3 mammalian ancestral lineages accurately reflects the order of continental divisions as proposed by Smith et al. (18), then tree 3 (basal-Boreotheria) should be validated by phylogenetic studies. To address this issue, many researchers have evaluated the 3 phylogenetic hypotheses (i.e., trees 1–3) (Table 1). Previous molecular studies using a few dozen gene sequences favored tree 1 (basal-Afrotheria), but the bootstrap probability was low (1–3, 8, 20–22). Molecular analyses using genomic data (phylogenomics) have recently been developed, and one such analysis using a several-hundred-kbp phylogenomic dataset favored tree 1 (23). However, more extensive phylogenomic analyses using mega-base sequence datasets strongly supported tree 3 (basal-Boreotheria) (16, 24), although it is possible that these results are misleading because of model misspecification (25).

Table 1.

Previous studies supporting trees 1–3

Hypothesis  Dataset, sequence length (ref.)
Tree 1      Nuclear + mitochondrial genes, 16.4 kb (3)
Tree 1      Nuclear + mitochondrial genes, 17.7 kb (21)
Tree 1      Three nuclear genes, 5 kb (20)
Tree 1      Mitochondrial genes, 3,516 residues (55)
Tree 1      Nuclear + mitochondrial genes (22)
Tree 1      Nuclear genes, 205 kb (23)
Tree 2      Nuclear + mitochondrial genes, 7,999 residues (55)
Tree 2      Two retroposon loci (5)
Tree 3      Three nuclear genes (55, 56)
Tree 3      Two retroposon loci (17)
Tree 3      1,700 nuclear genes, 1.4 Mb (16)
Tree 3      2,840 nuclear genes, 723,000 residues (24)
Tree 3      Nuclear genes + conserved sequences, 1.9 Mb (57)

Rare genomic changes such as retroposon insertions and indels are quite useful phylogenetic markers. The nearly homoplasy-free character of retroposons has been established (26, 27), and application of retroposons to reveal true phylogeny already has a credible scientific history (5, 6, 14, 17, 28–35). In addition, duplication of several nucleotides [target-site duplications (TSDs)], which often occurs upon retrotransposition (36), can also be used as a hallmark of homoplasy-free insertion of retroposons. Interestingly, retroposon analysis can also be applied to cases in which multiple species divergences occurred in a short evolutionary time. In these cases, ancestral polymorphisms based on the presence or absence of a retroposon should be retained after species divergence, and each retroposon would be randomly fixed or lost in a diverged lineage, resulting in distinct retroposon insertion patterns (6, 33, 37–39). Identification of such inconsistent insertions in a certain species group leads to the conclusion that the lineages diverged in a very short evolutionary time (38).

Kriegs et al. (5) reported 2 retroposon insertion loci that support tree 2 (basal-Xenarthra). Their data, however, were somewhat ambiguous because of ragged end points of the aligned sequences as described by Murphy et al. (17). Murphy et al. reported 2 unambiguous retroposon loci and 4 indels in support of tree 3 (basal-Boreotheria).

Overall, recent phylogenomic analyses of both mega-base gene sequence data and rare genomic changes have provided evidence that Boreotheria first diverged from the common ancestor of placental mammals and that the other placentals (Atlantogenata) diverged into Xenarthra in South America and Afrotheria in Africa along with the division of Gondwana (tree 3) (Table 1). This scenario is apparently consistent with the geological data by Smith et al. (18). In the present study, however, we present a different scenario from those proposed previously. We performed an extensive retroposon analysis using genomic data and provide strong evidence that the 3 lineages diverged nearly simultaneously. Our molecular data prompted us to re-examine recent geological data, incorporation of which suggests that the Africa–South America separation occurred at ≈120 Ma. This estimated time frame is much earlier than that (i.e., 105 Ma) proposed by Smith et al. (18). We also discuss caveats for applying Smith et al.'s geological data to biogeographic questions.

Results and Discussion

Isolation of Multiple Long Interspersed Element 1 (L1) Loci Supporting Each of the 3 Phylogenetic Hypotheses.

In this study, genomic data of human, armadillo, and elephant were used as representatives of Boreotheria, Xenarthra, and Afrotheria, respectively. There are 83,331, 178,357, and 135,739 L1MB elements in the genomes of human, armadillo, and elephant, respectively. Among them, 19,783 loci were compared among the 3 species by using the method shown in Fig. S1, revealing 1,382 possible informative loci (i.e., loci where L1MB was present in any 2 of the 3 species). After removing false-positive loci by referring to alignments available in the University of California Santa Cruz Genome Browser, 68 phylogenetically-informative loci were aligned, including the additional sequences of other placental mammals such as chimpanzee, rhesus macaque, mouse, rat, rabbit, dog, cow, and tenrec (Fig. 2 and Fig. S2). Among them, trees 1, 2, and 3 were supported by 22, 25, and 21 loci, respectively. Using our strategy, we successfully identified the 4 loci reported by Kriegs et al. (5) and Murphy et al. (17) (Table 2), an indication of the comprehensiveness of the strategy. Our data include some cases of ambiguous L1 insertions in which no TSD was observed. However, even considering only bona fide L1 insertions having a TSD in the flanking regions, we still found an obvious inconsistency involving 16, 21, and 16 unambiguous L1 insertions that support trees 1, 2, and 3, respectively (Table 2).
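The near-equality of the three locus counts can be checked with a simple goodness-of-fit calculation. The sketch below is an illustration added here and is not part of the analysis described in the text: it assumes a null model in which each informative locus supports any of the 3 trees with equal probability, and it uses the closed-form upper tail of the chi-square distribution with 2 degrees of freedom so that only the Python standard library is needed. The function names are hypothetical.

```python
import math


def chi_square_uniform(counts):
    """Chi-square goodness-of-fit statistic against equal expected counts."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


def p_value_df2(chi2):
    """Upper-tail probability for a chi-square variable with 2 degrees of freedom.

    For df = 2 the survival function is exactly exp(-x / 2), which is the
    case that applies to three categories.
    """
    return math.exp(-chi2 / 2.0)


# Locus counts taken from the text, in the order tree 1, tree 2, tree 3.
all_loci = [22, 25, 21]  # all 68 informative loci
tsd_loci = [16, 21, 16]  # only loci with a target-site duplication

for label, counts in (("all loci", all_loci), ("TSD only", tsd_loci)):
    chi2 = chi_square_uniform(counts)
    print(f"{label}: chi2 = {chi2:.3f}, p = {p_value_df2(chi2):.3f}")
```

Under this assumed null model, neither set of counts departs detectably from equal thirds (p is roughly 0.83 for all loci and 0.62 for the TSD-only subset). A failure to reject equal frequencies is compatible with near-simultaneous divergence but does not by itself prove it.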

Fig. 2.

Three examples of L1 insertion loci supporting tree 1 (A), tree 2 (B), and tree 3 (C). Black and gray boxes denote the L1 and the TSD sequences, respectively. The latter were generated during L1 integration and indicate bona fide retrotransposition. The central region of the inserted L1 sequence has been omitted.

Table 2.

Informative L1MB loci supporting trees 1–3

Tree 1                            Tree 2                            Tree 3
Locus     L1 subfamily  TSD, nt   Locus     L1 subfamily  TSD, nt   Locus     L1 subfamily  TSD, nt
HDL1007   L1MB5         -         HDL2003   L1MB4         4         HDL3016   L1MB5         19
HDL1040   L1MB5         -         HDL2090   L1MB5         10        HDL3051   L1MB7         7
HDL1061   L1MB5         -         HDL2102   L1MB4         13        HDL3074   L1MB7         -
HDL1081   L1MB2         8         HDL2121   L1MB5         6         HDL3078   L1MB5         15
HDL1119   L1MB8         7         HDL2203   L1MB5         4         HDL3089   L1MB8         13
HDL1122   L1MB5         7         HDL2223   L1MB8         9         HDL3101   L1MB5         13
HDL1125   L1MB7         12        HDL2237   L1MB5         14        HDL3133   L1MB7         -
HDL1136   L1MB2         15        HDL2242   L1MB5         11        HDL3138   L1MB5         5
HDL1141   L1MB7         -         HDL2279   L1MB8         15        HDL3146   L1MB5         10
HDL1144   L1MB8         18        HDL2307   L1MB5         -         HDL3161   L1MB5         6
HDL1171   L1MB5         5         HDL2309   L1MB5         10        HDL3214   L1MB5         -
HDL1200   L1MB5         9         HDL2333   L1MB5         16        HDL3225   L1MB8         14
HDL1208   L1MB5         14        HDL2340   L1MB5         10        HDL3266   L1MB5         15
HDL1233   L1MB7         -         HDL2345   L1MB7         -         HDL3283   L1MB5         8
HDL1256   L1MB4         14        HDL2368   L1MB5         15        HDL3295   L1MB5         6
HDL1262   L1MB4         14        HDL2370   L1MB4         8         HDL3314   L1MB5         -
HDL1276   L1MB5         16        HDL2380   L1MB5         9         HDL3324   L1MB4         -
HDL1287   L1MB8         11        HDL2387   L1MB5         13        HDL3347   L1MB4         7
HDL1337   L1MB5         7         HDL2433*  L1MB5         6         HDL3355   L1MB8         6
HDL1360   L1MB5         -         HDL2443   L1MB5         15        HDL3366   L1MB5         7
HDL1372   L1MB5         8         HDL2446   L1MB7         10        HDL3369   L1MB5         10
HDL1373   L1MB5         14        HDL2457   L1MB4         8
                                  HDL2483   L1MB8         6
                                  HDL2499*  L1MB5         -
                                  HDL2548   L1MB8         -

Near-Simultaneous Divergence of a Common Ancestor of Placental Mammals into 3 Lineages.

As briefly described in the Introduction, inconsistency of retroposon insertions will be observed if species had diverged over a short evolutionary time span in which ancestral polymorphisms were followed by incomplete lineage sorting in a diverged lineage (38). Our group has already reported 3 cases, namely charr (40), cichlids (37, 41), and baleen whales (39), in which the order of species divergence could not be deduced because of extensive retroposon insertion inconsistency. In the present case, it is remarkable that such a large number of inconsistent loci could be characterized (Table 2). The fact that our L1 insertion data evenly support 3 different trees provides convincing evidence that 3 ancient lineages of placental mammals, namely Boreotheria, Xenarthra, and Afrotheria, diverged nearly simultaneously from a common ancestor.
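The link between a short internal branch and evenly split support can be made concrete with the standard three-lineage coalescent result for gene trees. The sketch below is an added illustration, not a calculation from this study: it assumes that presence/absence patterns of retroposon insertions sort among lineages in the same way as gene genealogies, and it measures the interval between the 2 successive splits in coalescent units (generations divided by twice the effective population size). The function name is hypothetical.

```python
import math


def three_lineage_tree_probs(internal_branch):
    """Expected gene-tree frequencies for 3 lineages under the neutral coalescent.

    internal_branch: time between the 2 successive splits, in coalescent units
        (generations divided by 2N, with N the effective population size).
    Returns (p_species_tree, p_each_alternative_tree); the tuple satisfies
    p_species_tree + 2 * p_each_alternative_tree == 1.
    """
    if internal_branch < 0:
        raise ValueError("internal_branch must be non-negative")
    # If the two sister lineages fail to coalesce on the internal branch
    # (probability exp(-T)), each of the 3 topologies is equally likely.
    p_alternative = math.exp(-internal_branch) / 3.0
    p_species = 1.0 - 2.0 * p_alternative
    return p_species, p_alternative


for t in (0.0, 0.1, 0.5, 1.0, 3.0):
    p_main, p_alt = three_lineage_tree_probs(t)
    print(f"T = {t:>3}: species tree {p_main:.3f}, each alternative {p_alt:.3f}")
```

As the internal branch shrinks toward zero, all 3 topologies approach a frequency of 1/3, which is the pattern expected when loci support the 3 trees about equally; a long internal branch would instead leave one tree with a clear majority.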

Analysis of gene sequences alone generally is insufficient to provide convincing evidence that multiple divergences occurred nearly simultaneously. Even if the same analysis of 2 different genes in several lineages yields different trees, this is not necessarily an indication that they diverged nearly simultaneously because the result might be misleading owing to unusual gene characteristics or systematic errors such as long branch attraction, compositional bias, or heterotachy. Therefore, a comprehensive survey of retroposon loci from genomewide sequence data yields very useful markers for determining whether several ancient lineages diverged nearly simultaneously, as is demonstrated by our current results.

In general, acquisition of additional sequence data may allow resolution of a confounding phylogenetic relationship. As such, it could be argued that additional sequence data from the armadillo and elephant genomes could resolve the relationship among the 3 mammalian groups. However, a nearly equivalent number of L1 loci for the 3 trees was obtained from our extensive screen of all L1 loci from the complete human genome and the 2× genomic sequence coverage of armadillo and elephant data, corresponding to ≈70% of their whole genomes. Thus, given the data presented in Table 2, we consider that, even if additional sequence data were obtained once the armadillo and elephant genome sequences are completed, the number of L1 loci for the 3 trees still would not differ significantly. In any case, the conclusion of nearly simultaneous divergence of 3 early mammalian lineages would not change.

Geological Evidence Supporting Multiple Continental Divisions, Occurring Over a Short Period ≈120 Ma, that Hindered Dispersion of Terrestrial Mammals.

As described in the Introduction, the divergence and dispersal of mammals have been linked to continental separation and subsequent drift, as has often been suggested by evolutionary biologists (9–11). Also, in the case of the 3 ancient mammalian lineages, based on the origins of Boreotheria in Laurasia, Xenarthra in South America, and Afrotheria in Africa, the association between mammalian divergence and continental divisions has been discussed by many researchers (3, 15–17, 42), with many referring to the geological data collected by Smith et al. (18), who proposed division of Pangea into Laurasia and Gondwana at 148–138 Ma and division of Gondwana into Africa and South America at 105 Ma. Although the conclusions of Smith et al. are meaningful from a geological viewpoint, we propose that they may not be directly applicable to biogeographic questions, as discussed below.

First, we examined the physical connection between Africa and South America. The data by Smith et al. (18) were based on the spatiotemporal distribution of magnetic stripes on the Atlantic Ocean floor, which were identified by measuring magnetic patterns of crust at the deep sea bottom from a ship. Magnetic stripes are generated by the normal polarity reversals of the Earth's magnetism over time. Accordingly, these data indicate the appearance at 105 Ma of a new oceanic crust with a particular magnetic stripe, the depth of which is always ≈2,700 m above the active midoceanic ridge. Taking into account the time required to create an ocean of 2,700-m depth from a continental division, the geographical separation between Africa and South America that hindered terrestrial dispersion of mammals would have occurred considerably earlier than 105 Ma. Curiously, this time lag has never been discussed by any molecular evolutionist who cited Smith et al.'s work. In addition, the geomagnetic time scale that Smith et al. used was revised recently. Further, magnetic stripes must be at least tens or hundreds of kilometers wide to be measured accurately, and thus the resolution of time by this method is limited.

However, paleogeographic maps of Africa have recently been published by Guiraud and Maurin (43) and Guiraud et al. (44) based on extensive compilation of drilling results of Mesozoic shallow marine sediments in Africa. These studies show the spatiotemporal distribution of the shallow marine ocean, which progressively extended landward because of the global sea-level rise (transgression) of ≈300 m from the Jurassic to the Cretaceous. Using on-land drilling data by Milani and Thomaz Filho (45) and paleogeography by Guiraud et al. (44) and de Wit (46), we reconstructed the paleogeography of the southern half of Pangea (Fig. 3). Because the North and South American continents were not connected above sea level from the Jurassic until the Miocene (20 Ma) (47), we focused on the relationships between Africa and South America and between Africa and Laurasia. At 146 Ma (Fig. 3A), the Central Atlantic Ocean was born by the eastward drift of Africa. Ridge propagation to open the South Atlantic Ocean proceeded not only southward from the Central Atlantic Ocean but also from south to north. The final connection point between South America and Africa, designated here as the Brazilian Bridge, remained at 120 Ma (Fig. 3 B and D). At 90 Ma (Fig. 3C), the opening of the Central and South Atlantic Oceans was completed. We used the revised geomagnetic time scale (47, 48) to update the paleogeographic map. In generating these maps, we considered changes in sea level (Fig. 3E). As shown in Fig. 3, from 146 Ma to 90 Ma, the area of the landmass (brown regions in Fig. 3 A–C) gradually decreased because of an increase in the sea level (compare the light-blue regions, representing shallow oceans, in Fig. 3 A–C), resulting in fragmentation of the continents.
Thus, according to the updated geological estimation, we propose that the Brazilian Bridge was the last connection point between Africa and South America and that 120 Ma (or slightly later) represents the most likely period when west and east Gondwana became separated. Indeed, unconformity in the basins of eastern South America ≈115–120 Ma (figures 10 and 15 in ref. 45) suggests that the connection was finally broken around that time.

Fig. 3.

A revised paleogeographic scenario based on recent geological data. (A–C) Paleogeographic reconstruction of Pangea at 146 Ma (A), 120 Ma (B), and 90 Ma (C) (modified after refs. 43–48 and 52). (D) A magnified map at 120 Ma (boxed area in B) is shown emphasizing 2 bridges that connect 3 continents. Blue and light blue represent deep and shallow ocean depths, respectively. Brown areas and the yellow jagged line represent landmasses and the mid-Atlantic ridge, respectively. Only deep ocean is represented for Laurasia because drilling data are currently unavailable. (E) Sea-level rise (54) to yield subsequent fragmentation of Africa and South America was taken into consideration for the reconstruction.

Interestingly, based on the presence of fossils of ammonites and other invertebrates from the north extending their range to the south through an open corridor, Jacobs et al. (49) proposed that the Atlantic Ocean was completed by the opening of the Equatorial Atlantic Gateway, which occurred in the Early Cretaceous (≈115 Ma). Therefore, terrestrial dispersion of an ancestral mammal between Africa and South America might have been blocked just before that time. Accordingly, our estimation of 120 Ma for that time is consistent with the estimate of Jacobs et al.

Regarding the relationship between Laurasia (Europe) and Africa, although conclusive geological evidence such as drilling data of the western Mediterranean Sea is not yet available, the geographic position of the west African promontory (Morocco) was close to the Iberian peninsula at 120 Ma, suggesting the possibility of a connection (i.e., Gibraltar Bridge) between Africa and Laurasia (see SI Text). As shown in Fig. 3D, landmasses (brown regions) of 3 continents (Laurasia, Africa, and South America) all can be connected through the Gibraltar and Brazilian Bridges. Traditionally, as shown in Table 3, the timing of the final separation of Africa from Europe has been variously defined by researchers as ranging from 148 Ma (18) to 110 Ma (50). Therefore, based on the updated drilling data supporting the final separation of Africa and South America ≈120 Ma (45), the possibility remains that the near-simultaneous separations of Laurasia, South America, and Africa might have occurred at ≈120 Ma. This geological estimation is not inconsistent with our present conclusion based on retroposon evidence.

Table 3.

Estimated ages (in Ma) of the divisions of Africa–Laurasia and Africa–South America as determined in several studies

Africa–S. America, Ma   Africa–Laurasia, Ma   Ref./source
-                       110                   50
105                     148–138               18
115                     -                     49
ca. 120                 ca. 120?              This study

Conclusion

Contrary to recent phylogenomic results supporting tree 3 (16, 17, 24), an extensive genome-scale retroposon analysis in the present study revealed that the common ancestor of placental mammals diverged into Boreotheria, Xenarthra, and Afrotheria nearly simultaneously. From a geological viewpoint, Smith et al. (18) proposed that Laurasia was first separated at 148–138 Ma, followed by division of Gondwana into South America and Africa at 105 Ma, the time difference of which has been used by evolutionary biologists as a rationale to support a biogeographic scenario that Boreotheria diverged first. Here, we show that the data of Smith et al. (18) cannot be directly applied to biogeographic questions, especially with regard to the terrestrial dispersion of mammals. By re-examining the geological data, including newly-accumulated drilling data, and considering the sea-level change during the Cretaceous, we provide evidence that the last connection between Africa and South America was through the Brazilian Bridge at 120 Ma, which is considerably earlier than the 105 Ma (18) that has been referred to by evolutionary biologists (3, 15, 17, 42). We note the possibility that Laurasia and Africa were connected by the Gibraltar Bridge until 120 Ma, although definitive evidence to confirm this hypothesis is currently lacking. Finally, by combining the updated geological data with our retroposon evidence, we provide a scenario in which the divergence of the eutherian ancestor into Boreotheria, Xenarthra, and Afrotheria occurred nearly simultaneously at 120 Ma, possibly with almost concomitant cut-off of 2 bridges that had connected the 3 ancient continents, Laurasia, South America, and Africa (Fig. S3).

Materials and Methods

Genomic sequences of human (hg18), 9-banded armadillo (Dasypus novemcinctus) (dasNov1; 2× genome coverage), and African elephant (Loxodonta africana) (loxAfr1; 2× genome coverage) were obtained from the University of California Santa Cruz Genome Bioinformatics database (http://genome.ucsc.edu). We compared patterns based on the presence or absence of L1, a mammalian retroposon. Among >50 L1 subfamilies in mammalian genomes (51), L1 elements belonging to the L1MB subgroup are considered to be the most appropriate subfamily for analyses of mammalian interordinal relationships immediately before and after the divergence of the 3 major lineages of mammalian ancestors, the availability of which was shown in our previous study (6). In addition, the 4 L1 elements reported by Kriegs et al. (5) and Murphy et al. (17) also belong to the L1MB subgroup. Therefore, we used only elements of the L1MB subgroup as phylogenetic markers. The genomic data for human, armadillo, and African elephant were screened for L1MB elements by using the RepeatMasker program (www.repeatmasker.org). From the 17-way genomic sequence alignment in the University of California Santa Cruz database, human–armadillo and human–elephant alignments were constructed and used to compare the orthologs of each L1MB locus. As shown in Fig. S1, 4 sites located 50, 100, 200, and 500 nt upstream and downstream of the L1MB sequence were set as queries for ortholog searches with the D. novemcinctus and L. africana genomes. Orthologs of the 8 sites were sought in the 2 species by using the genomic alignment data. If at least 1 orthologous site was found on both sides of the L1 in the other 2 species (armadillo and elephant; Fig. S1), the presence or absence of an L1MB element between the 2 nearest orthologous sites was verified by referring to the RepeatMasker results.
Among the loci compared, we used only those that were potentially phylogenetically informative (i.e., L1MB was present in 2 of the species and absent in the other species). Next, we checked alignments of these loci in the University of California Santa Cruz Genome Browser to remove clear false positive and paralogous loci. Most of the false positive results were caused by independent insertions of L1 within the genomic region compared. For the remaining 68 loci, orthologous sequences of various mammals were collected and aligned by using the GENETYX version 5.2 program.
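The screening logic described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the pipeline actually used: the function names, the simple integer coordinate system, and the boolean inputs are all hypothetical, and the ortholog lookup against the genome alignments and the RepeatMasker check are taken as already done. Only the flank offsets (50, 100, 200, and 500 nt), the requirement for at least 1 orthologous site on each side, and the rule that an informative locus has L1MB in exactly 2 of the 3 species come from the text.

```python
# Distances of the query sites from the L1MB boundaries, as stated in the text.
FLANK_OFFSETS = (50, 100, 200, 500)


def flank_query_sites(l1_start, l1_end):
    """Return (upstream, downstream) query positions for one L1MB locus.

    Coordinates are plain integers on a hypothetical reference axis; no
    clamping at sequence ends is attempted in this sketch.
    """
    upstream = [l1_start - d for d in FLANK_OFFSETS]
    downstream = [l1_end + d for d in FLANK_OFFSETS]
    return upstream, downstream


def is_comparable(upstream_hits, downstream_hits):
    """True if at least 1 orthologous site was found on each side of the L1."""
    return any(upstream_hits) and any(downstream_hits)


def supported_tree(in_human, in_armadillo, in_elephant):
    """Map an L1MB presence/absence pattern to the tree it supports.

    Human, armadillo, and elephant stand for Boreotheria, Xenarthra, and
    Afrotheria. Returns None when the locus is not informative.
    """
    pattern = (in_human, in_armadillo, in_elephant)
    if pattern == (True, True, False):
        return "tree 1"  # Boreotheria + Xenarthra share the insertion
    if pattern == (True, False, True):
        return "tree 2"  # Boreotheria + Afrotheria share the insertion
    if pattern == (False, True, True):
        return "tree 3"  # Xenarthra + Afrotheria share the insertion
    return None


print(flank_query_sites(10_000, 12_000))
print(supported_tree(True, True, False))
```

Loci present in all 3 species, or in only 1, return None because they cannot discriminate among the 3 rooting hypotheses.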

Supplementary Material

Supporting Information

Acknowledgments.

This work was supported by research grants from the Ministry of Education, Culture, Sports, Science, and Technology of Japan (to N.O.).

Footnotes

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.
