Housekeeping Genes for Phylogenetic Analysis of Eutherian Relationships

Morgan Kullberg, Maria A. Nilsson, Ulfur Arnason, Eric H. Harley, Axel Janke, Housekeeping Genes for Phylogenetic Analysis of Eutherian Relationships, Molecular Biology and Evolution, Volume 23, Issue 8, August 2006, Pages 1493–1503, https://doi.org/10.1093/molbev/msl027

Abstract

The molecular relationship of placental mammals has attracted great interest in recent years. However, 2 crucial and conflicting hypotheses remain, one with respect to the position of the root of the eutherian tree and the other the relationship between the orders Rodentia, Lagomorpha (rabbits, hares), and Primates. Although most mitochondrial (mt) analyses have suggested that rodents have a basal position in the eutherian tree, some nuclear data in combination with mt-rRNA genes have placed the root on the so-called African clade or on a branch that includes this clade and the Xenarthra (e.g., anteater and armadillo). In order to generate a new and independent set of molecular data for phylogenetic analysis, we have established cDNA sequences from different tissues of various mammalian species. With this in mind, we have identified and sequenced 8 housekeeping genes with moderately fast rate of evolution from 22 placental mammals, representing 11 orders. In order to determine the root of the eutherian tree, the same genes were also sequenced for 3 marsupial species, which were used as outgroup. Inconsistent with the analyses of nuclear + mt-rRNA gene data, the current data set did not favor a basal position of the African clade or Xenarthra in the eutherian tree. Similarly, by joining rodents and lagomorphs on the same basal branch (Glires hypothesis), the data set is also inconsistent with the tree commonly favored in mtDNA analyses. The analyses of the currently established sequences have helped examination of problematic parts in the eutherian tree at the same time as they caution against suggestions that have claimed that basal eutherian relationships have been conclusively settled.

Introduction

The problems in resolving the placental mammal tree were outlined by Novacek (1992) in a synthesis of classical and molecular data. At that time, the amount of molecular data suitable for phylogenetic analyses was limited, and many relationships at the ordinal level were still unresolved owing to missing or contradictory information. These molecular studies were commonly based on limited sequence data such as those from single genes or fragments thereof (de Jong et al. 1981; Miyamoto and Goodman 1986; Irwin et al. 1991; Springer et al. 1997). During the 1990s, the 10–13 kb of protein-coding regions from mitochondrial (mt) genomes became a popular tool for phylogenetic analysis (mitogenomics). The continuously increasing amount of molecular data during the 1990s provided improved resolution of eutherian relationships and led to the acknowledgment of some unexpected relationships such as the sister group relationship between carnivores (e.g., cat and dog) and perissodactyls (e.g., horse and rhinoceros) (Xu et al. 1996). Only a decade after Novacek's (1992) synthesis, some 60 complete mt genomes representing all orders of placental mammals had been sequenced and used for phylogenetic reconstruction (Arnason et al. 2002). These mitogenomic analyses also provided insight into the times of mammalian inter- and intraordinal divergences (e.g., Janke et al. 1994; Arnason et al. 1996; Nilsson et al. 2004), which otherwise in many instances had to depend on an incomplete fossil record.

Murphy et al. (2001) presented the results of phylogenetic analysis based on a combination of nuclear and mt-rRNA sequences. The study encompassed 44 species, and the amount of data representing each species was in the range of 8–12 kb in most instances. The analysis was performed on all 3 codon positions (cdp's) and included gaps and also chimeric and missing data as the result of the limited amount of sequences representing some taxa. Malia et al. (2003) and Misawa and Nei (2003) discussed the problems associated with analysis of data sets of this kind. The unrooted nuclear + mt-rRNA results were nevertheless largely consistent with earlier mitogenomic trees, and in the few instances where the trees differed, there appeared to be a greater consistency between morphology and the nuclear + mt-rRNA tree than between morphology and the mitogenomic tree. At the present time, the tree presented by Murphy et al. (2001) is probably the most popular hypothesis with respect to placental mammal evolution, and in the following we refer to this tree as the “Murphy tree.” Similarly, we refer to the indel free and nonchimeric tree of Arnason et al. (2002) and its variants as the “mitogenomic tree.”

Many questions in mammalian phylogeny involve the position and relationships of rodents. Figure 1 shows a simplified tree of the 3 major hypotheses of eutherian relationships. Mitogenomic analyses generally do not support a monophyletic grouping of rodents, and some mitogenomic analyses have strongly advocated rodent paraphyly (D'Erchia et al. 1996; Reyes et al. 1998). As discussed by Arnason et al. (2002), this question is at least partly related to the position of Erinaceidae (hedgehogs) in the eutherian tree. Rodent monophyly is unequivocally supported by morphological data (Simpson 1945; Novacek 1992) and also by the analyses by Murphy et al. (2001) and Huchon et al. (2002) of nuclear + mt-rRNA data. Removal of Erinaceidae from their basal eutherian position (Lin et al. 2002) and extended rodent sampling have provided stronger support for monophyletic Rodentia in the mitogenomic eutherian tree (Reyes et al. 2004).

FIG. 1.—

Simplified relationships among placental mammals showing (a) the morphological hypothesis (Novacek 1992), (b) the mitogenomic tree (Arnason et al. 2002), and (c) the Murphy tree (Murphy et al. 2001). ART: Artiodactyla; PRI: Primates; ROD: Rodentia; LAG: Lagomorpha; PRO: Proboscidea (elephants); XEN: Xenarthra (anteater and armadillo); MAR: Marsupialia.

Another difference between the Murphy tree and the mitogenomic tree is the relationship between rodents and lagomorphs (rabbits, hares). Consistent with the Glires hypothesis (Gregory 1910; Liu and Miyamoto 1999), rodents and lagomorphs join on a common branch in the Murphy tree (fig. 1). Other analyses of nuclear genes have not identified monophyletic Glires (Li et al. 1990; Misawa and Janke 2003), but reanalysis of the data set of Misawa and Janke (2003) has supported Glires monophyly when using rate heterogeneity models (Douzery and Huchon 2004). Thus, this issue, which is crucial to the interpretation of basal eutherian divergences, remains controversial.

The Murphy tree joins Glires with Primates, Scandentia (tree shrews), and Dermoptera (flying lemurs) in the superordinal clade “Euarchontaglires” (Murphy et al. 2001). This relationship is not favored by mitogenomic analysis, whereas morphological data have suggested a grouping of Primates, Scandentia, Dermoptera, and Chiroptera (bats), in the superorder Archonta (Gregory 1910; Novacek and Wyss 1986).

As discussed by Arnason et al. (2002), the essential difference between the mitogenomic tree and the Murphy tree is in the location of the root. Analyses of nuclear + mt-rRNA data identify a basal position of the so-called African clade, a grouping of elephants, dugongs, hyraxes, the aardvark, golden moles, elephant shrews, and tenrecs. Some of these lineages were morphologically joined by Le Gros Clark and Sonntag (1926). In contrast, mitogenomic analyses preferably place the root on the Erinaceidae. When the Erinaceidae are disregarded, the rodents constitute a basal group in the mitogenomic tree (Arnason et al. 2002), a position that is consistent with some nuclear data (e.g., Li et al. 1990; Rosenberg and Kumar 2001; Jorgensen et al. 2005) but is inconsistent with the Murphy tree. This difference with respect to the phylogenetic position of basal lineages is crucial for the interpretation of eutherian relationships as it inter alia affects the positions of primates and rodents.

The study of Murphy et al. (2001) provided an important alternative to the results coming from mitogenomic analyses of eutherian relationships as for the first time a significant amount of nuclear protein-coding gene data was compiled for all eutherian orders. About 30% of the eutherian data did not have a counterpart in the 2 marsupial data sets that were used to root the eutherian tree. Whether this discrepancy may have affected the rooting of the eutherian tree is not known. The nature of some of the nuclear data may have influenced the outcome of the analyses, however. For example, the molecular distances between some of these genes are strikingly high. Thus, the amino acid distances between humans and mice for the von Willebrand factor (vWF), the interphotoreceptor retinoid-binding protein (IRBP), and the breast and ovarian cancer susceptibility protein 1 (BRCA1) sequences are 20%–45%. Distances of this degree are problematic as the background noise of the data may interfere with the resolving power of the sequences (Page and Holmes 1998; Nei and Kumar 2000). The odd evolutionary pattern of the BRCA1 gene with all cdp's evolving at the same rate (e.g., Adkins et al. 2001; Delsuc et al. 2002) is an additional complication that challenges the use of this gene in phylogenetic analysis.

It is evident that more comprehensive and better-suited nuclear data sets than those currently available are needed for establishing the tree of placental mammals. However, the intron–exon structure of protein-coding nuclear genes makes it difficult to acquire these sequences from genomic DNA. Here we have circumvented this problem by establishing via cDNA procedures the sequences of housekeeping genes from a number of different mammalian orders. The decision to use housekeeping genes (for definition see Watson et al. 1965; Warrington et al. 2000; Hsiao et al. 2001) was based on their presence in most tissues as this facilitates the recovery of cDNA sequences from tissues at different developmental stages and from cell cultures. In order to meet the need for versatility, we have directed our efforts toward genes that are also present and expressed outside the placental mammals (preferably in all vertebrates) and which have evolutionary rates that allow analysis of both basal vertebrate divergences and ordinal and subordinal divergences within different classes. For this reason, we have primarily selected genes that show amino acid distances of 2%–10% between humans and mice. The use of conserved sequences simplifies identification of these genes in mammals and other vertebrates and their alignment, but most importantly, it reduces problems in phylogenetic reconstruction due to multiple substitutions.

The cDNA is produced by standard methods, in order to make our approach universally applicable. The new data are then used to reconstruct and investigate the major branches and the basal divergences of the placental mammal tree.

Materials and Methods

Search for Housekeeping Genes

Housekeeping genes were identified by searching the human, mouse, rat, cow, and dog genomes in the National Center for Biotechnology Information (NCBI) database and by investigating housekeeping genes compiled by Warrington et al. (2000). Sequences were aligned using Se-Al v2.0a11 (Rambaut 1996). After removing indel differences, absolute distances were calculated using PAUP* (Swofford 1998) in order to select slowly evolving genes. From the alignments of these genes, oligonucleotide primer pairs (17–21 mers) were designed manually at conserved sites.
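The distance screen described above can be illustrated with a minimal sketch. It is not the PAUP* procedure used in the study; it only assumes that two amino acid sequences are already aligned, skips columns containing a gap (the indel removal step), and counts the differing sites. The function names, the gap symbols, and the 2%–10% window (taken from the Introduction) are assumptions made for illustration.

```python
def count_differences(seq_a: str, seq_b: str, gap_chars: str = "-?"):
    """Return (differing sites, compared sites) for two aligned sequences.

    Columns in which either sequence has a gap character are skipped,
    mirroring the removal of indel differences before distance calculation.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in gap_chars or b in gap_chars:
            continue
        compared += 1
        if a != b:
            differing += 1
    return differing, compared


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites (absolute distance / compared sites)."""
    differing, compared = count_differences(seq_a, seq_b)
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def in_target_window(distance: float, low: float = 0.02, high: float = 0.10) -> bool:
    """True if a human-mouse distance falls in the hypothetical 2%-10% window."""
    return low <= distance <= high
```

With real data, `seq_a` and `seq_b` would be the human and mouse protein sequences of a candidate gene taken from the same alignment.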

RNA Isolation and cDNA Synthesis

Whenever possible, tissue samples were collected within minutes postmortem. Most samples were frozen in liquid nitrogen and then stored at −80 °C. Primary fibroblast cell cultures were either established by the authors or purchased from American Type Culture Collection (http://www.atcc.org) or European Collection of Cell Cultures (http://www.ecacc.org.uk). The cells were cultured under standard conditions at 37 °C in 5% CO2 in Dulbecco's modified Eagle's medium supplemented with 10% fetal calf serum, 2 mM L-glutamine, 0.1 mg/ml sodium pyruvate, 100 U/ml penicillin, and 0.1 mg/ml streptomycin (Freshney 2000).

The homogenization procedure was found to be crucial for efficient RNA isolation. When tissue samples were homogenized, the best results were obtained using a small motor-driven Ultra-Turrax (IKA Werke model T8 S5N-5G). RNA purification was performed using the guanidinium thiocyanate method of Chomczynski and Sacchi (1987). The approach produced high-quality RNA from all types of tissues with virtually no contamination by RNases, DNA, or protein. The quality of the preparations was assessed by judging the degradation of the 18S and 28S rRNA bands on a denaturing, ethidium bromide–stained agarose gel (Sambrook and Russell 2001). Table 1 summarizes the species and respective tissue samples that were used in this study.

Table 1

Common and Scientific Species Names and Their Tissue Types Used for RNA Isolation

| Order | Scientific Name | Common Name | Tissue |
| --- | --- | --- | --- |
| Galliformes | Gallus gallus | Chicken | NCBI |
| Didelphimorphia | Monodelphis domestica | Short-tailed gray opossum | Liver, heart, kidney |
| Didelphimorphia | Didelphis virginiana | North American opossum | Cell culture |
| Dasyuromorphia | Sminthopsis douglasi | Julia Creek dunnart | Cell culture |
| Rodentia | Mus musculus | House mouse | Liver, heart, NCBI |
| Rodentia | Rattus norvegicus | Brown rat | Liver, NCBI |
| Rodentia | Mesocricetus auratus | Golden hamster | Liver, muscle, skin |
| Rodentia | Cavia porcellus | Guinea pig | Heart, cell culture |
| Lagomorpha | Lepus europaeus | European hare | Liver |
| Lagomorpha | Oryctolagus cuniculus | Rabbit | Liver |
| Primates | Homo sapiens | Human | Cell culture, NCBI |
| Primates | Aotus trivirgatus | Owl monkey | Cell culture |
| Scandentia | Tupaia glis | Tree shrew | Liver, kidney |
| Proboscidea | Loxodonta africana | African elephant | Cell culture |
| Xenarthra | Tamandua tetradactyla | Southern tamandua (anteater) | Cell culture |
| Xenarthra | Dasypus novemcinctus | Nine-banded armadillo | Cell culture |
| Chiroptera | Tadarida brasiliensis | Brazilian free-tailed bat | Cell culture |
| Carnivora | Canis familiaris | Dog | Liver |
| Carnivora | Felis catus | Cat | Heart, liver |
| Perissodactyla | Equus caballus | Horse | Testicle |
| Perissodactyla | Ceratotherium simum | White rhinoceros | Cell culture |
| Perissodactyla | Diceros bicornis | Black rhinoceros | Cell culture |
| Cetacea | Balaenoptera physalus | Finback whale | Cell culture |
| Artiodactyla | Hippopotamus amphibius | Hippopotamus | Cell culture |
| Artiodactyla | Bos taurus | Cow | Liver, NCBI |
| Artiodactyla | Sus scrofa | Pig | Liver |

NOTE.—In the case of Gallus gallus, NCBI indicates that the sequences were collected from databases. In the remaining instances, NCBI indicates that the currently established sequences were compared with the database sequences as evaluation of the mRNA/cDNA approach.

The RNA was reverse transcribed to cDNA using 2.5 μM oligo-dT primer (20-mers) and 22 U BcaBEST RT-polymerase (TaKaRa) in a total volume of 20 μl according to the manufacturer's recommendations. The reaction mix was incubated on a programmable thermocycler at 65 °C for 1 min and at 30 °C for 5 min. The temperature was then gradually raised to 65 °C over a period of 20 min. The sample was then incubated at 65 °C for 20 min, at 98 °C for 5 min, and finally kept at 5 °C until processed further. Specific cDNA sequences (table 2) were polymerase chain reaction (PCR) amplified in 25 μl volumes using 1 μl of the cDNA reaction with conserved primer pairs (table 3) and the Ex-Taq polymerase (TaKaRa) according to the manufacturer's protocol. More details on the RNA isolation from different tissue, the RNA stability, cDNA synthesis, and a comparison of different methods will be published elsewhere.

Table 2

Names, Accession Numbers, Amino Acid Distances, and Lengths of Housekeeping Genes Used for Phylogenetic Analysis

| Abbreviation | Name | Accession Number | Distance (%) | Length (nt) | Distance, cDNA (%) | Length, cDNA (nt) |
| --- | --- | --- | --- | --- | --- | --- |
| ATP5b | ATP synthase, beta polypeptide | NM_001686 | 3.8 | 1590 | 0.5 | 663 |
| CS | Citrate synthase | NM_004077 | 5.8 | 1401 | 4.4 | 609 |
| GAPDH | Glyceraldehyde-3-P-dehydrogenase | M33197 | 5.7 | 1008 | 4.3 | 621 |
| IDH1 | Isocitrate dehydrogenase 1 | NM_005896 | 4.6 | 1245 | 4.7 | 897 |
| MDH2 | Malate dehydrogenase 2, NAD (mt) | NM_005918 | 5.0 | 1017 | 3.7 | 651 |
| RPL18 | Ribosomal protein L18 | NM_000979 | 9.5 | 567 | 6.0 | 399 |
| SDHA | Succinate dehydrogenase subunit A | NM_004168 | 5.4 | 1995 | 2.4 | 1482 |
| SDHB | Succinate dehydrogenase subunit B | NM_003000 | 8.9 | 843 | 1.8 | 504 |
| Total | | | | | | 5826 |

NOTE.—The accession number refers to the human sequence in the NCBI database. Distance and Length denote the observed amino acid distance between humans and mice and the nucleotide length of the complete protein-coding region, whereas Distance, cDNA and Length, cDNA refer to the amino acid distance and nucleotide length of the cDNA sequence used for phylogenetic analysis.
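As a quick consistency check on the table, the cDNA lengths can be summed and converted to codons. The sketch below uses only the values listed in Table 2; the variable names are illustrative.

```python
# Lengths (nt) of the cDNA regions used for phylogenetic analysis, from Table 2.
CDNA_LENGTHS_NT = {
    "ATP5b": 663,
    "CS": 609,
    "GAPDH": 621,
    "IDH1": 897,
    "MDH2": 651,
    "RPL18": 399,
    "SDHA": 1482,
    "SDHB": 504,
}

# Total length of the concatenated nucleotide alignment.
total_nt = sum(CDNA_LENGTHS_NT.values())

# Every fragment is a whole number of codons, so the concatenated
# amino acid alignment has total_nt / 3 sites.
all_in_frame = all(length % 3 == 0 for length in CDNA_LENGTHS_NT.values())
total_codons = total_nt // 3

print(total_nt, total_codons, all_in_frame)
```

The sum reproduces the table total of 5826 nt, which corresponds to 1942 amino acid sites.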

Table 3

Primers Used for Amplifying and Sequencing Housekeeping Genes

| Primer/Gene | Tm (°C) | Sequence 5′–3′ |
| --- | --- | --- |
| ATP5b-F | 54.8 | CTSCYATTCATGCTGAGG |
| ATP5b-F2 | 68.5 | CKGCYYCGGCCTCCGGGG |
| ATP5b-F3 | 67.3 | GCCTCCGGGGCCTTGCGG |
| ATP5b-R | 57.1 | TGCCACRGCTTCAATGGG |
| ATP5b-R2 | 56.7 | CTTCAATGGGTCCCACCAT |
| CS-F | 55.6 | KGCAGCHAAGATCTACCG |
| CS-R | 57.1 | YGTGCTCATGGACTTGGG |
| GAPDH-F | 65.4 | CGCCTGGTCACCAGGGCT |
| GAPDH-F2 | 50.3 | CATYAAYGAYCCCTTCAT |
| GAPDH-R | 58.8 | GCCTGCTTCACCACCTTCT |
| GAPDH-R2 | 59.4 | RCGGCANGTCAGRTCCAC |
| IDH1-F | 52.4 | AAGGAGATGAAATGACACG |
| IDH1-F2 | 51.0 | TDGARATGCAAGGAGATG |
| IDH1-R | 53.4 | GAMCTTAAAGTTTGGCCTG |
| IDH1-R2 | 48.0 | GTTTRTCCAWRAACTCAAA |
| MDH2-F | 66.2 | GCTSTCCGCYCTCGCCCG |
| MDH2-F2 | 58.8 | CYTCCRGCCCAGAACAATG |
| MDH2-R | 57.1 | CSCCYTTCTTGATGGAGG |
| MDH2-R2 | 55.6 | AGAAGAACYTRGGCATYG |
| RPL18-F | 56.0 | CAAGGACCGAAAGGTTCG |
| RPL18-F2 | 59.4 | RGGAGCCMAARAGCCAGG |
| RPL18-R | 52.4 | CCAGGGTTAGTTTTTGTA |
| RPL18-R2 | 58.2 | TYTTGTAGCCYCGGCTGG |
| SDHA-F | 55.9 | GKCRTCHGCTAAAGTTTCAG |
| SDHA-F2 | 55.6 | CHYTGGAYCTYCTGATGG |
| SDHA-F3 | 54.5 | TYTAYCAGCGWGCATTTGG |
| SDHA-F4 | 50.2 | TRGAYCATGAATWTGATGC |
| SDHA-R | 61.0 | GSGTGTGCTTCCTCCAGTG |
| SDHA-R2 | 58.8 | CTTTRCGAGCYTCAGCACC |
| SDHA-R3 | 57.6 | CTGGCATGAGCTCCACG |
| SDHB-F | 61.6 | GGAGCCAARATGGCGGCG |
| SDHB-R | 54.5 | GGTYGCCATCATYTTCTTG |
| SDHB-F2 | 51.4 | TTKCCATYTAYMGATGGG |
| SDHB-R2 | 51.4 | RCAGTTCATGATRGTGTG |

NOTE.—F and R refer to forward and reverse orientation of primers. The figures following F and R refer to different primers used for the same gene. Tm = primer melting temperature.
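Several primers in Table 3 contain IUPAC ambiguity codes (e.g., S, Y, R, N), meaning that each is a mixture of related oligonucleotides. As a minimal sketch, assuming the standard IUPAC nucleotide code table, the number of distinct sequences in such a mixture and the sequences themselves can be enumerated as follows. The function names are illustrative and not part of the original protocol.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list:
    """List every unambiguous sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


# ATP5b-F from Table 3 has two 2-fold ambiguous positions (S and Y).
for sequence in expand("CTSCYATTCATGCTGAGG"):
    print(sequence)
```

Under this code table, ATP5b-F represents 4 sequences and GAPDH-R2 (two R positions and one N) represents 16, whereas ATP5b-F3 is nondegenerate.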

The PCR product was purified by 2 subsequent ethanol precipitations, and both strands were sequenced using the BIG-DYE v. 3 cycle sequencing kit on an ABI Prism 3100 Genetic Analyzer. If the PCR product was longer than 1000 nt, an internal primer was designed for that gene in order to sequence the remaining region. The accession numbers of the new cDNA genes are: DQ402944–DQ402968 (malate dehydrogenase 2, NAD [mt]), DQ402969–DQ402993 (succinate dehydrogenase subunit A), DQ402994–DQ403018 (succinate dehydrogenase subunit B [SDHB]), DQ403019–DQ403043 (ribosomal protein L18), DQ403044–DQ403068 (glyceraldehyde-3-P-dehydrogenase), DQ403069–DQ403093 (isocitrate dehydrogenase 1), DQ403094–DQ403118 (ATP synthase, beta polypeptide), and DQ403119–DQ403143 (citrate synthase).

Phylogenetic Reconstruction

Phylogenetic relationships were analyzed using the Tree-Puzzle (Schmidt et al. 2002), PHYLIP (Felsenstein 1993), MOLPHY (Adachi and Hasegawa 1996), MrBayes (Huelsenbeck and Ronquist 2001), TREEFINDER (Jobb et al. 2004), PHYML (Guindon and Gascuel 2003), or PAUP* (Swofford 1998) program packages. The best-fitted model for nucleotide sequence evolution and parameters were determined with Modeltest (Posada and Crandall 1998). The WAG2000 model of amino acid sequence evolution (Whelan and Goldman 2001) and the general time reversible (GTR) and 3-state GTR model of nucleotide evolution (Lanave et al. 1984; Gibson et al. 2005) were used for distance and likelihood analyses. All analyses were performed in parallel with or without assuming a gamma model of rate heterogeneity (Gu et al. 1995) with 4 classes of variable sites and 1 class of invariable site (4Γ + I). Bayesian analysis was carried out under standard conditions with the above-mentioned models, running 3 cold and 1 heated chain and using 1,000,000 generations, a sample and print frequency of 100 and a burnin of 100,000 generations. Maximum likelihood (ML) bootstrap support was evaluated from 1000 bootstrap replicates of the amino acid data set generated by SEQBOOT and with PHYML under the WAG2000 + 4Γ + I model using the tree in figure 2 as the starting tree. The resulting topologies were summarized by CONSENS.
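The bootstrap step resamples alignment columns with replacement to create pseudoreplicate data sets of the original length. The following is a minimal sketch of that resampling idea only; it is not the SEQBOOT implementation, and the function name, taxon labels, and toy sequences are hypothetical.

```python
import random


def bootstrap_alignment(alignment: dict, rng: random.Random) -> dict:
    """Return one bootstrap pseudoreplicate of an alignment.

    `alignment` maps taxon names to aligned sequences of equal length.
    Columns are drawn with replacement, and the same column indices are
    applied to every taxon so that sites stay intact.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    n_sites = lengths.pop()
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }


# Toy alignment (hypothetical sequences), seeded for reproducibility.
toy = {"human": "MKTLLA", "mouse": "MKSLLA", "opossum": "MRSLIA"}
replicate = bootstrap_alignment(toy, random.Random(1))
```

In a full analysis, each of the 1000 replicates would be passed to the tree-building program, and the resulting topologies summarized in a consensus tree.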

FIG. 2.—

ML tree of mammals based on amino acid analysis applying WAG2000 + 4Γ + I.

Dating of Divergence Times

Estimates of divergence times were based on the amino acid tree shown in figure 2 and were obtained with the R8S version 1.70 program package (Sanderson 2002), using nonparametric rate smoothing with optimization by the Powell method as implemented in R8S. To calculate standard deviations (SDs), 100 bootstrap replicates of the sequence data were created using SEQBOOT. The branch lengths for each replicate were estimated using Tree-Puzzle under the WAG2000 + 4Γ + I model on the tree given in figure 2. For each of those tree replicates, the divergence times were estimated by the R8S program, and standard errors (SEs) were calculated from the different results.
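The final step, summarizing the spread of the divergence time estimates over bootstrap replicates, can be sketched as follows. This assumes that each replicate yields one age estimate per node and that the spread is taken as the sample standard deviation of those estimates; the node label and the ages are invented for illustration and are not results from the study.

```python
from statistics import mean, stdev


def summarize_replicates(ages_by_node: dict) -> dict:
    """Map each node to (mean age, sample SD) across bootstrap replicates."""
    summary = {}
    for node, ages in ages_by_node.items():
        if len(ages) < 2:
            raise ValueError("need at least 2 replicates per node")
        summary[node] = (mean(ages), stdev(ages))
    return summary


# Hypothetical ages (MYBP) for one node from three replicates.
example = {"node_X": [90.0, 100.0, 110.0]}
print(summarize_replicates(example))
```

With 100 replicates, as in the procedure above, each list would hold 100 age estimates per node.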

In addition to R8S, divergence times were estimated with the program MULTIDIVTIME, as implemented in the T3 program package (http://abacus.gene.ucl.ac.uk/), under the WAG2000 + 8Γ model. Eight categories of gamma-distributed rate heterogeneity among sites (8Γ) were used to compensate for the program's inability to account for invariable sites. Other important parameters used to build the model file (modelinf) were an absolute maximum age of 200 million years before present (MYBP) for the origin of placental mammals and 100,000 generations for the Markov chain Monte Carlo run.

For calibration of the evolutionary rates and estimation of the divergence times, we used the split between cetaceans and ruminants at ≈60 MYBP, AC-60 (Arnason et al. 1996), and the split between equids and rhinoceroses set at ≈50 MYBP, ER-50 (Arnason et al. 1998). The reference AC-60, the divergence between artiodactyl (A) ruminants and cetaceans (C), is upheld by (1) the age, ≈53 Myr, of the oldest archaeocete fossils, (2) the divergence between Hippopotamidae and Cetacea (molecularly dated to ≈55 MYBP), and (3) its consistency with the age, 34–35 Myr, of the oldest fossils diagnostic for the divergence between odontocetes and mysticetes (Arnason, Gullberg, Gretarsdottir, et al. 2000; Arnason et al. 2004 and references in these papers). The lower bound of this reference is more difficult to define, but based on paleontological evidence, its lower limit can be placed at ≈65 MYA (Gatesy and O'Leary 2001), a date that we maintain in the current study. On these grounds, the split between cetaceans and ruminants has been set to 60–65 MYA. The age of ER-50 (Arnason et al. 1998) is substantiated by the age, ≈48 Myr, of the oldest rhinocerotid fossil, Hyrachyus. It may be argued, however, that the lower bound of ER-50 should be placed earlier than 50 MYA in line with the early Eocene age of Hyracotherium. The status of Hyracotherium as a phylogenetic wastebasket (Hooker 1989; Prothero and Schoch 1989) makes the argumentation problematic, however. In the current estimates, we have tentatively placed the lower bound of ER at 58 MYBP, the same as used by Garland et al. (1993). The metatherian/eutherian split has 125 MYA as upper limit, in agreement with the appearance of the oldest metatherians (Sinodelphys) and eutherians (Eomaia) in the fossil record (Ji et al. 2002; Luo et al. 2003). As the lower limit, we have placed this split at 170 MYBP. This age is without paleontological substantiation but is among the oldest molecular-based estimates (Kumar and Hedges 1998).

Results

cDNA Sequences from Selected Housekeeping Genes

From the 535 housekeeping genes listed by Warrington et al. (2000), about 50 genes were selected for further investigation. The evolutionary distance of these genes between humans and mice was 3%–10% at the amino acid level. For about 20 of these genes, conserved primers could be designed that spanned at least 500 nt of the protein-coding region and that would PCR amplify the corresponding gene from most placental mammals. In order to optimize the specificity of the primers, the number of ambiguities in the primer sequence was kept to a minimum.

For 8 of the selected genes, cDNA sequences could be produced from every placental species that was included in the study. The respective genes, their names, and properties are listed in table 2. The table also shows the accession number of the human homologue, the amino acid distance between humans and mice, and the length and amino acid distances of the protein-coding regions that were used in the alignment for the phylogenetic analysis. The actual distance values for the sequenced cDNA region were lower than initially selected for because the conserved primer pairs usually covered the central and more conserved regions of the genes. The corresponding reverse transcription (RT)–PCR and sequencing primer pairs and their properties are listed in table 3.

In order to examine the consistency between the currently established sequences and sequences from whole genome projects, a number of cDNA sequences were also produced from model organisms with available genome data. The database sequence and the cDNA sequences were identical or differed at fewer than 3 nucleotide sites per sequence in all cases. This indicates that the primer pairs and RT–PCR conditions applied have produced homologous sequences in these model organisms and conceivably also in the other mammalian species studied.

Tree Reconstruction

All mammalian sequences aligned readily. Only a single indel difference was identified, in the SDHB amino acid sequence of the Xenarthra as compared with the remaining mammals. The position of this indel, which was removed from the alignment, was unambiguous. The chicken genome may not be completely sequenced. Therefore, some chicken sequences were obtained from expressed sequence tag (EST) data (U.D. Chick EST Database). The EST sequences rarely covered the complete protein-coding region, and therefore, 5% of the sequences are missing from the chicken in the alignment. Exclusion of the chicken increased the number of sites that were indel free, but the phylogenetic results remained unaffected. The alignment is available from TreeBASE (http://www.treebase.org/treebase/index.html), accession number SN2729.

Concatenation of the amino acid sequences resulted in a data set with a total length of 1942 amino acid (aa) sites for mammals. The directly calculated distance between humans and mice was 3.6%. Tree-Puzzle estimated the distance between humans and mice at 3.2% under the WAG2000 model, whereas including gamma rate heterogeneity (4Γ + I) yielded an estimate of 3.3%. The current focus on genes that have low amino acid distance values limits the number of multiple substitutions that may have occurred in the sequences. Based on a Poisson model (d = −ln [1 − p]), the expected amino acid distance (d) can be calculated from p, the observed amino acid distance (Nei and Kumar 2000). With a distance of 3.6% between humans and mice and a sequence length of 1942 aa, the calculation suggests that only about one extra substitution may have remained undetected as the result of multiple substitutions. If so, the effect of unaccounted substitutions is negligible and below the SE for p. Correspondingly low values were observed for nonsynonymous nucleotide sites. This simplified estimate illustrates the low frequency of multiple substitutions in conserved sequences.
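The Poisson calculation can be reproduced with a few lines of Python. The values of p and the alignment length are taken from the text. The binomial formula used for the SE of p is an assumption of this sketch, because the text does not state how the SE was obtained.

```python
import math

p = 0.036        # observed amino acid distance, human vs. mouse (from the text)
n_sites = 1942   # concatenated alignment length in aa sites (from the text)

d = -math.log(1.0 - p)                      # Poisson-corrected distance
extra = (d - p) * n_sites                   # substitutions hidden by multiple hits
se_p = math.sqrt(p * (1.0 - p) / n_sites)   # binomial SE of p (assumed form)

print(f"d = {d:.4f}, hidden substitutions = {extra:.1f}, SE(p) = {se_p:.4f}")
```

Under these assumptions the corrected distance exceeds the observed one by less than 0.001, which corresponds to roughly 1.3 substitutions over 1942 sites and is well below the SE of p.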

The tree in figure 2 shows the ML tree using 1942 aa sites under the WAG2000 + 4Γ + I model. Rodentia and Lagomorpha constitute sistergroups forming the cohort Glires, which is placed in a basal position among placental mammals. The next crownward group is Scandentia, followed by the primates. The analysis identified a sister group relationship between the elephant, representing the African clade, and Xenarthra (anteater and armadillo). This group was sister to Cetferungulata (Arnason et al. 1999) plus Chiroptera (bats). Within the Cetferungulata, the Carnivora (carnivores) group with the Perissodactyla (horse and rhinos) and the Artiodactyla with the Cetacea (whales). Artiodactyla itself is paraphyletic because the hippopotamus clusters with the Cetacea, whereas cow and pig are sistergroups.

Glires was not placed in a basal eutherian position in all analyses. Instead, the rodents branched off first when amino acids or all cdp's were analyzed under rate homogeneity, and the muroids were basal when first and second cdp's were analyzed under GTR, both of which would make the Glires paraphyletic. The likelihood difference of these trees relative to the one shown in figure 2 was less than 0.2 times the SE. Thus, there was no significant difference between these groupings. Most data sets and analytical methods reconstructed the Glires as shown in figure 2, however. Also, when using only the chicken or only the marsupials as outgroup, most phylogenetic analyses placed the root of the eutherian tree on the Glires or members thereof.

The only analysis that did not identify the root at or within the Glires was the ML analysis using all 3 cdp's and a GTR + 4Γ + I model of nucleotide substitution. Under this scenario, the elephant and Xenarthra were basal in the eutherian tree, and Glires and primates became sistergroups (fig. 3b). However, this analysis included third cdp's, which show large distance values relative to the outgroup. At these sites, the distance value between marsupials and eutherians is about 40%. This indicates a high level of randomization due to multiple substitutions. Third cdp's also markedly deviate in nucleotide composition among all orders. Inclusion of these sites is therefore of questionable value, although this is common practice in some analyses of eutherian evolution (Murphy et al. 2001). When differences at third cdp's are limited to transversions (R = purines, Y = pyrimidines), ML analysis assuming a GTR + 4Γ + I model of evolution placed the eutherian root again in Glires. Use of a 3-state GTR + 4Γ + I model as implemented in TREEFINDER for the analysis of all 3 cdp's also placed Glires basal among eutherian mammals. Table 4 summarizes an ML analysis of different tree topologies and different data sets.
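Limiting third codon positions to transversions is commonly done by RY coding, in which purines (A, G) become R and pyrimidines (C, T) become Y. The following Python sketch shows the recoding on an in-frame sequence. The function name `ry_code_third_positions` and the example sequence are hypothetical, and the sketch is not meant to reproduce how any of the programs used in the study handle this step.

```python
def ry_code_third_positions(seq):
    """Recode every third codon position as R (A/G) or Y (C/T).

    Assumes the sequence starts at the first position of a codon.
    """
    ry = {"A": "R", "G": "R", "C": "Y", "T": "Y"}
    out = []
    for i, base in enumerate(seq.upper()):
        if i % 3 == 2:
            # Third codon position: keep gaps and ambiguity codes unchanged.
            out.append(ry.get(base, base))
        else:
            out.append(base)
    return "".join(out)

recoded = ry_code_third_positions("ATGGCCAAATTT")
print(recoded)  # ATRGCYAARTTY
```

After recoding, two sequences that differ only by a transition at a third position (for example ATA and ATG) become identical, so only transversions contribute to the distance at these sites.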

FIG. 3.—

Alternative simplified topologies that were evaluated in an ML analysis (table 4). AVE: Aves; MAR: Marsupialia; CET: Cetferungulata; CHI: Chiroptera; PRI: Primates; SCA: Scandentia; ROD: Rodentia; LAG: Lagomorpha; PRO: Proboscidea (elephants); XEN: Xenarthra (anteater and armadillo).

Table 4

ML Analysis of Alternative Trees

| | Amino Acid | | | Amino Acid | | | 1st + 2nd cdp's | | | 1st + 2nd cdp's | | | 1st + 2nd + 3rd cdp's | | | 1st + 2nd + 3rd cdp's | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | WAG2000 | | | WAG2000 + 4Γ + I | | | GTR | | | GTR + 4Γ + I | | | GTR | | | GTR + 4Γ + I | | |
| | Δlog L | SE | pSH | Δlog L | SE | pSH | Δlog L | SE | pSH | Δlog L | SE | pSH | Δlog L | SE | pSH | Δlog L | SE | pSH |
| Topology as in figure 2 | <−11 491> | | 1.00 | <−11 049> | | 1.00 | <−12 805> | | 1.00 | <−12 165> | | 1.00 | <−45 865> | | 1.00 | −17 | ±12 | 0.51 |
| Root on Proboscidea + Xenarthra | −15 | ±15 | 0.54 | −9 | ±11 | 0.57 | −11 | ±17 | 0.74 | −5 | ±10 | 0.77 | −12 | ±27 | 0.80 | <−41 850> | | 1.00 |
| Proboscidea + Xenarthra placed basal | −24 | ±16 | 0.36 | −15 | ±12 | 0.42 | −15 | ±17 | 0.60 | −7 | ±11 | 0.68 | −109 | ±29 | 0.02 | −50 | ±16 | 0.05 |
| Murphy tree | −28 | ±20 | 0.29 | −22 | ±14 | 0.26 | −40 | ±22 | 0.17 | −25 | ±13 | 0.17 | −106 | ±38 | 0.02 | −56 | ±16 | 0.03 |
| Mitogenomic tree | −29 | ±27 | 0.29 | −25 | ±19 | 0.21 | −46 | ±29 | 0.15 | −36 | ±19 | 0.09 | −97 | ±45 | 0.06 | −94 | ±26 | 0.00 |

Based on the analysis of amino acid sequences, the difference in the log likelihood (Δlog L) between the topology in figure 3b and those in figures 2 and 3a is about 1 SE. In the analysis of first and second cdp's, assuming a GTR + 4Γ + I model, the trees in figure 3b and c are half an SE worse than the best one. This difference is not significant but indicates that placing the root at the elephant and Xenarthra may not be optimal and may not hold when more data are included. Alternative trees with the root either on the elephant or the xenarthrans show the same tendency. The recently proposed Murphy tree and the mitogenomic tree (fig. 3d and e) were not favored in any ML analyses. Although they cannot be statistically rejected by the Shimodaira–Hasegawa test, they are more than 1 SE worse and in many analyses more than 2 SE worse than the best tree.
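The statement about the Murphy and mitogenomic trees can be checked directly against table 4 by dividing each Δlog L by its SE. The Python sketch below does this. The numbers are copied from table 4, and the variable names are illustrative.

```python
# (|Δlog L|, SE) pairs from table 4, in column order:
# amino acid WAG2000, amino acid WAG2000 + 4Γ + I,
# 1st + 2nd cdp's GTR, 1st + 2nd cdp's GTR + 4Γ + I,
# all 3 cdp's GTR, all 3 cdp's GTR + 4Γ + I
table4 = {
    "Murphy tree": [(28, 20), (22, 14), (40, 22), (25, 13), (106, 38), (56, 16)],
    "Mitogenomic tree": [(29, 27), (25, 19), (46, 29), (36, 19), (97, 45), (94, 26)],
}

# Express each likelihood difference in units of its own SE.
ratios = {
    name: [round(dlogl / se, 2) for dlogl, se in rows]
    for name, rows in table4.items()
}

for name, values in ratios.items():
    print(name, values)
```

All ratios are above 1, and the largest values occur in the analyses that include third cdp's.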

The position of Scandentia (represented by the tree shrew) remained essentially unsettled. Some analyses using first and second cdp's placed Scandentia and Primates as sister groups, but most analyses did not identify this relationship. However, the difference in the log likelihood for the position of the tree shrew as in figure 2 or as sistergroup to the primates is about −1, that is, only 1/10 of the SE of the log L estimate. The position of the Scandentia relative to the primates is therefore best regarded as unresolved by the current data set.

The Bayesian probability values for internal branches (a–w in fig. 2) are given in table 5. MrBayes provided some support for the root of the eutherian tree being on Glires using amino acid data or on the rodents based on nucleotide data. However, the probability for lagomorphs grouping with other eutherians as in the nucleotide analysis is limited (P = 0.44), and the Glires hypothesis cannot be rejected. The low MrBayes support for Glires (P = 0.61) based on amino acid sequence analysis may reflect an influence from the placement of the root on rodents as observed in some other analyses. In contrast to the topology shown in figure 2, MrBayes identified some affinities between Scandentia and Primates and also between Chiroptera and Artiodactyla using nucleotide data. There is surprisingly strong support for the pig grouping with the cow. This association has not been favored in previous phylogenetic studies (Murphy et al. 2001; Arnason et al. 2002). The log likelihood for placing the pig basal to the (cow, (hippopotamus, whale)) group is, depending on analysis of amino acid or nucleotide sequences, 0.7 to 3 times the SE lower than that for the position of the pig in the best ML tree.

Table 5

Divergence Times and Bayesian and ML Bootstrap Support for Different Relationships

| Node | R8S (MYA ± SD) | M.DIVTIME (MYA ± SD) | MrBaa | MrB12 | MLboot |
| --- | --- | --- | --- | --- | --- |
| a | 103 ± 8.3 | 95 ± 11.4 | 1.00 | 1.00 | 97 |
| b | 46 ± 7.1 | 42 ± 11.5 | 1.00 | 1.00 | 100 |
| c | 97 ± 4.4 | 102 ± 9.6 | 1.00 | 1.00 | 100 |
| d | 93 ± 5.5 | 93 ± 10.4 | 0.61 | <0.5 | <0.5 |
| e | 79 ± 8.4 | 79 ± 11.5 | 0.92 | <0.5 | 56 |
| f | 58 ± 8.3 | 59 ± 11.8 | 0.99 | 1.00 | 88 |
| g | 43 ± 6.7 | 44 ± 10.9 | 1.00 | 1.00 | 76 |
| h | 32 ± 11.5 | 23 ± 9.6 | 1.00 | 1.00 | 100 |
| i | 93 ± 3.9 | 98 ± 9.2 | 0.99 | 0.77 (SP) | 64 |
| j | 88 ± 4.1 | 93 ± 8.8 | 0.64 | 0.92 | 51 |
| k | 48 ± 11.3 | 46 ± 10.7 | 1.00 | 1.00 | 100 |
| l | 85 ± 4.3 | 89 ± 8.4 | 0.85 | 0.71 | <0.5 |
| m | 78 ± 4.0 | 82 ± 7.8 | 0.51 (CA) | 0.74 (CA) | 88 |
| n | 74 ± 4.4 | 76 ± 7.1 | 1.00 | 1.00 | 52 |
| o | 59 ± 3.3 | 61 ± 5.1 | 1.00 | 1.00 | 100 |
| p | 52 ± 4.1 | 52 ± 6.6 | 0.86 | 0.83 | 87 |
| q | 48 ± 5.8 | 42 ± 8.1 | 1.00 | 1.00 | 93 |
| r | 70 ± 3.7 | 69 ± 7.1 | 0.85 | 0.61 | 58 |
| s | 52 ± 1.4 | 56 ± 1.1 | 1.00 | 1.00 | 98 |
| t | 10 ± 8.7 | 7 ± 4.6 | 1.00 | 1.00 | 100 |
| u | 42 ± 9.1 | 32 ± 9.3 | 1.00 | 1.00 | 99 |
| v | 80 ± 4.4 | 82 ± 8.9 | 0.99 | 0.96 | 77 |
| w | 38 ± 5.8 | 37 ± 8.8 | 1.00 | 1.00 | 100 |

NOTE.—SP: Scandentia/Primates as sister group to node l. CA: the support for Chiroptera and Artiodactyla as sistergroups. MrBaa, MrB12: Bayesian probability for branches based on amino acid sequences and 1st plus 2nd cdp's, respectively. SD: standard deviation as estimated by the MULTIDIVTIME (M.DIVTIME) program. MLboot: Bootstrap support for amino acid using ML under the WAG2000 + 4Γ + I model.

Divergence Times

The estimated divergence times of major lineages are summarized in table 5. These times are generally slightly younger than those estimated previously using mitogenomic data, but the dates do not differ significantly in most instances. The R8S and MULTIDIVTIME packages, despite their different underlying methods, provide similar divergence times for the eutherian and marsupial divergences. Most estimates differ by less than 1 SE/SD.
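The agreement between the two dating programs can be tabulated from table 5 with a short Python sketch. The criterion used here, that the two estimates for a node differ by no more than the larger of the two SDs, is an assumption made for illustration and is not necessarily the comparison the authors had in mind.

```python
# node: (R8S age, R8S SD, MULTIDIVTIME age, MULTIDIVTIME SD), in MYA, from table 5
table5 = {
    "a": (103, 8.3, 95, 11.4), "b": (46, 7.1, 42, 11.5), "c": (97, 4.4, 102, 9.6),
    "d": (93, 5.5, 93, 10.4), "e": (79, 8.4, 79, 11.5), "f": (58, 8.3, 59, 11.8),
    "g": (43, 6.7, 44, 10.9), "h": (32, 11.5, 23, 9.6), "i": (93, 3.9, 98, 9.2),
    "j": (88, 4.1, 93, 8.8), "k": (48, 11.3, 46, 10.7), "l": (85, 4.3, 89, 8.4),
    "m": (78, 4.0, 82, 7.8), "n": (74, 4.4, 76, 7.1), "o": (59, 3.3, 61, 5.1),
    "p": (52, 4.1, 52, 6.6), "q": (48, 5.8, 42, 8.1), "r": (70, 3.7, 69, 7.1),
    "s": (52, 1.4, 56, 1.1), "t": (10, 8.7, 7, 4.6), "u": (42, 9.1, 32, 9.3),
    "v": (80, 4.4, 82, 8.9), "w": (38, 5.8, 37, 8.8),
}

# Nodes whose two estimates differ by more than the larger SD (assumed criterion).
outside = [
    node
    for node, (age_r8s, sd_r8s, age_mdt, sd_mdt) in table5.items()
    if abs(age_r8s - age_mdt) > max(sd_r8s, sd_mdt)
]

print(f"{len(table5) - len(outside)} of {len(table5)} nodes agree; exceptions: {outside}")
```

With this criterion, 21 of the 23 nodes agree, and the exceptions are nodes s and u.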

Discussion

RT–PCR of Housekeeping Genes

In the current study, cDNA sequences from 8 genes were produced from 25 species representing 11 eutherian and 2 marsupial orders and a large variety of tissues. Although the isolation and handling of RNA is not unproblematic (Farrell 1998; Sambrook and Russell 2001), the study has demonstrated the utility of RNA/cDNA-based approaches in phylogenetic studies.

The phylogenetic analyses were based on sequence data of the same 8 genes of all species, thereby eliminating the potential influence of missing data. This contrasts with several phylogenetic studies that have been based on concatenated nuclear genes. Such studies often lacked sequences, especially for the outgroup, and included genes that are difficult to align (Murphy et al. 2001). The possibility to choose from hundreds of different housekeeping genes provides many opportunities for selecting genes that are suitable for phylogenetic analysis. Such a selection can be made, as here, on the basis of distance values and the facility with which the sequences can be unambiguously aligned. Various other parameters for selecting suitable genes, such as nucleotide composition or evolutionary rate differences, can be evaluated using whole genomic data from model organisms (e.g., human, cow, dog, mouse, rat) prior to establishing the corresponding genes from other species.

The individual genes that were selected for this study have distance values of 3%–10% between rodents and primates. As mentioned above, these distances appear to be within a range that minimizes the number of multiple substitutions while at the same time providing enough sequence variation to allow phylogenetic analysis at the ordinal and subordinal levels. The approach differs from studies that have made use of genes with up to 45% distance between the purportedly orthologous amino acid sequences among eutherians and up to 60% distance between the marsupials and the eutherian ingroup. Superficially, genes such as vWF, IRBP, and BRCA that have been used in previous studies of mammalian relationships may appear to provide extended phylogenetic information. This may be illusory, however, as distances at this level will by necessity carry a heavy load of noise that may interfere with the correct phylogenetic signal.

The sequence consistency within each individual gene chosen for the current study greatly facilitated the alignment of ambiguous sites and the exclusion of indel differences. Sites of this kind have an as yet poorly examined influence on phylogenetic reconstruction. PAUP* uses the largest pairwise distance in the data set as a default value in distance analysis if a sequence pair is missing (Swofford 1998), and this may skew the analysis. Most other programs do not specify how partial and missing sequences or gaps are treated. Some ignore the whole indel column for all species in the alignment, whereas others ignore a gapped site only in pairwise comparisons. Either way, ambiguous alignment sites can introduce an erroneous signal into the data.

Several nuclear genes that have been used for mammalian phylogenetic analysis are specific for mammals (Murphy et al. 2001). This restricts the use of these genes to studies of this particular group. In contrast, housekeeping genes, like mitogenomic data, can be used for phylogenetic analysis of a wider range of vertebrates. Another advantage of using cDNA sequences rather than the PCR products of putative exons is that inclusion of pseudogenic sequences in the phylogenetic analyses is essentially avoided.

Phylogeny

The current analysis of 11 out of the 18 traditional placental orders allows examination of the major basal branching pattern of the eutherian tree because the sampling includes rodents, primates, and artiodactyls. The crucial question is whether rodents have a basal position in this collection of taxa or whether they join the primates on a common branch. The phylogenetic findings based on the new nuclear data set show general consistency with previously reported relationships among placental mammals. The relationships within Cetferungulata and the Chiroptera/Cetferungulata grouping corroborate earlier mitogenomic analyses (Pumo et al. 1998; Arnason et al. 2002) and results based on a combination of nuclear and mt sequences (Murphy et al. 2001). The grouping of rodents and lagomorphs in Glires is also supported by this new data set. This confirms earlier findings using nuclear genes (Murphy et al. 2001; Huchon et al. 2002; Douzery and Huchon 2004) but is in conflict with most mitogenomic analyses, which do not suggest a sistergroup relationship between these taxa.

The placement of the root of the eutherian tree on Glires as favored by the new set of nuclear genes is somewhat unexpected. It is inconsistent with the Murphy tree and other similar studies of combined data (Madsen et al. 2001; Murphy et al. 2001), which placed the root on the African clade and/or Xenarthra. More recent reports on selected genes from the same data set also placed the root on the African clade plus Xenarthra (Delsuc et al. 2002, 2004). In contrast to these results, a basal position of rodents has been described in other studies using nuclear genes (Li et al. 1990; Jorgensen et al. 2005). In the current study, a basal position of the African clade (as represented by the elephant) plus Xenarthra was only favored after including all 3 cdp's in an ML analysis that applied the GTR + 4Γ + I model of sequence evolution. All other analyses, whether performed on amino acid sequences or first and second cdp's (see table 4 for details), placed the root on Glires. It is possible that the inclusion of third cdp's interferes with the analysis as a result of high distance values and nonhomogeneous composition. The long branches of the elephant and Xenarthra in figure 2 reflect the fast evolutionary rates of these taxa. Long-branch attraction may constitute a problem in phylogenetic analysis (Felsenstein 1978) by dragging the affected taxa to a basal position. If long-branch attraction has affected the current analyses, it has not done so to the extent of placing the African clade/Xenarthra in a basal position in the eutherian tree.

Analyses of protein-coding sequences from whole genomes (>200,000 sites) have identified a basal position of rodents relative to human and the pig, using the fugu as an outgroup (Jorgensen et al. 2005). Although the taxon sampling of the study is limited, the relationship between rodents, primates, and artiodactyls is consistent with that of the current study.

The relationship among the cetferungulate orders (Carnivora, Perissodactyla, Artiodactyla, and Cetacea) is consistent with the mitogenomic and the Murphy trees, corroborating the phylogenetic utility of the new set of data. In comparison, analyses of single nuclear genes have failed to resolve these relationships (Miyamoto and Goodman 1986; Springer et al. 1997; DeBry and Sagel 2001; Madsen et al. 2001). The strongly supported grouping of the pig and the cow by the new data set is surprising, however, as this relationship has not been identified in other molecular analyses, which rather place the pig basal to a (cow, (hippopotamus, whale)) grouping. It remains to be seen if this relationship will be upheld with the inclusion of additional taxa or sequences.

Dating

The estimated times of most divergences in this study corroborate earlier estimates based on mitogenomic (Janke et al. 1994; Arnason et al. 1996; Arnason et al. 1998) or nuclear data (Li et al. 1990; Kumar and Hedges 1998). The algorithms behind the R8S and MULTIDIVTIME programs yielded concurring results, indicating that both programs compensate similarly for differences in evolutionary rates when calculating divergence times. Among the eutherians there are, nevertheless, a few notable differences relative to estimates based on nuclear data. The divergence time between rat and mouse is consistent with that of most mitogenomic studies that have used nonrodent calibration points (Arnason et al. 1996; Arnason, Gullberg, Gretarsdottir, et al. 2000). The estimate is, however, in conflict with recent nuclear studies of rodent relationships (Douzery and Huchon 2004). Similarly, the divergence between Old and New World monkeys at ≈50 MYBP conforms to mitogenomic estimates (Arnason et al. 1998; Arnason, Gullberg, Gretarsdottir, et al. 2000; Arnason, Gullberg, Schweizer Burguete, and Janke 2000) and primate paleontology (Godinot and Mahboubi 1992) but challenges estimates that have used this split, set at 30–35 MYBP, as a calibration point to place the divergence between human and chimpanzee at 5 MYBP (e.g., Sarich 1970; Goodman 1996). As should be evident to the reader, a placement of the Old/New World primate calibration point at 50 MYBP rather than at 30–35 MYBP will automatically place the human/chimpanzee split proportionally earlier, consistent with the paleontological rebuttal (Senut et al. 2001) of a human/chimpanzee divergence 5 MYBP. Whether the differences in estimates between this study and earlier studies are related to limited taxon sampling within the respective groups or to a limited amount of sequence data remains to be shown. Nevertheless, the general congruence of the divergence date estimates between this and other studies provides additional confidence in the data used here for phylogenetic analysis.
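The proportional shift of the human/chimpanzee split mentioned above can be illustrated with simple arithmetic. The Python sketch below assumes strict proportionality, that is, a single constant rate between the calibration point and the dated node. This is a simplification, and the resulting numbers are an illustration of the scaling argument only, not estimates from this study. The function name `rescale_age` is hypothetical.

```python
def rescale_age(age, old_calibration, new_calibration):
    """Rescale a divergence time when the calibration age changes (assumes a constant rate)."""
    return age * new_calibration / old_calibration

human_chimp = 5.0        # MYBP, value tied to the 30-35 MYBP calibration (from the text)
new_calibration = 50.0   # MYBP, Old/New World monkey split discussed in the text

rescaled = {
    old: round(rescale_age(human_chimp, old, new_calibration), 1)
    for old in (30.0, 35.0)
}
print(rescaled)  # {30.0: 8.3, 35.0: 7.1}
```

Under this assumption, moving the calibration from 30–35 MYBP to 50 MYBP shifts a 5 MYBP estimate to roughly 7 to 8 MYBP.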

Summary

By applying mRNA/cDNA procedures, nuclear-encoded housekeeping genes can be recovered from a large variety of mammalian species and used for phylogenetic analysis. The approach enables the preselection of genes useful for phylogenetic reconstruction. Preselection of this kind can be carried out independently of a phylogeny, for example, on the basis of distance values and base composition. The current data set was established by concentrating on relatively slowly evolving genes, where only a limited number of multiple substitutions can be expected. The facility with which homologous sites could be unambiguously identified and aligned constituted another advantage connected to the relatively slow evolutionary rates of these genes. This new mammalian data set suggests that neither the Murphy tree nor the mitogenomic tree is entirely correct. Most elements of each tree are supported but a few are not. A major result from the study is that rodents and lagomorphs and not the African clade or Xenarthra are basal in the eutherian tree. This disrupts some groups that have been defined on the basis of the Murphy tree. The definition and naming of these unconventional clades may need reconsideration after analysis of larger and more comprehensive data sets, preferably of housekeeping genes, which due to their generally slow evolutionary rates also allow the examination of phylogenetic relationships of greater depths than those of mammalian orders.

William Martin, Associate Editor

The Swedish Science Council, the Erik Philip-Sörensen Foundation, the Jörgen Lindström Foundation, and the Nilsson-Ehle Foundation supported the study. The authors would like to express their gratitude to Lars Hegestad, Jan Nilsson, and Eberhart Fuchs for providing tissue samples.

References

Adachi J, Hasegawa M. 1996. MOLPHY version 2.3: programs for molecular phylogenetics based on maximum likelihood. Comput Sci Monogr 28:1–150.

Adkins RM, Gelke EL, Rowe D, Honeycutt RL. 2001. Molecular phylogeny and divergence time estimates for major rodent groups: evidence from multiple genes. Mol Biol Evol 18:777–91.

Arnason U, Adegoke JA, Bodin K, Born EW, Esa YB, Gullberg A, Nilsson M, Short RV, Xu X, Janke A. 2002. Mammalian mitogenomic relationships and the root of the eutherian tree. Proc Natl Acad Sci USA 99:8151–6.

Arnason U, Gullberg A, Gretarsdottir S, Ursing B, Janke A. 2000. The mitochondrial genome of the sperm whale and a new molecular reference for estimating eutherian divergence dates. J Mol Evol 50:569–78.

Arnason U, Gullberg A, Janke A. 1998. Molecular timing of primate divergences as estimated by two nonprimate calibration points. J Mol Evol 47:718–27.

Arnason U, Gullberg A, Janke A. 1999. The mitochondrial DNA molecule of the aardvark, Orycteropus afer, and the position of the Tubulidentata in the eutherian tree. Proc Biol Sci 266:339–45.

Arnason U, Gullberg A, Janke A. 2004. Mitogenomic analyses provide new insights into cetacean origin and evolution. Gene 333:27–34.

Arnason U, Gullberg A, Janke A, Xu X. 1996. Pattern and timing of evolutionary divergences among hominoids based on analyses of complete mtDNAs. J Mol Evol 43:650–61.

Arnason U, Gullberg A, Schweizer Burguete A, Janke A. 2000. Molecular estimates of primate divergences and new hypotheses for primate dispersal and the origin of modern humans. Hereditas 133:217–28.

Chomczynski P, Sacchi N. 1987. Single-step method of RNA isolation by acid guanidinium thiocyanate-phenol-chloroform extraction. Anal Biochem 162:156–9.

DeBry RW, Sagel RM. 2001. Phylogeny of Rodentia (Mammalia) inferred from nuclear-encoded gene IRBP. Mol Phylogenet Evol 19:290–301.

de Jong WW, Zweers A, Goodman M. 1981. Relationship of aardvark to elephants, hyraxes and sea cows from α-crystallin sequences. Nature 292:538–40.

Delsuc F, Scally M, Madsen O, Stanhope MJ, de Jong WW, Catzeflis FM, Springer MS, Douzery EJ. 2002. Molecular phylogeny of living xenarthrans and the impact of character and taxon sampling on the placental tree rooting. Mol Biol Evol 19:1656–71.

Delsuc F, Vizcaíno SF, Douzery EJ. 2004. Influence of Tertiary paleoenvironmental changes on the diversification of South American mammals: a relaxed molecular clock study within xenarthrans. BMC Evol Biol 4:11.

D'Erchia AM, Gissi C, Pesole G, Saccone C, Arnason U. 1996. The guinea pig is not a rodent. Nature 381:597–600.

Douzery EJ, Huchon D. 2004. Rabbits, if anything, are likely Glires. Mol Phylogenet Evol 33:922–35.

Farrell RE. 1998. RNA methodologies: a laboratory guide for isolation and characterization. San Diego, CA: Academic Press.

Felsenstein J. 1978. Cases in which parsimony or compatibility methods will be positively misleading. Syst Zool 27:401–10.

Felsenstein J. 1993. Phylogenetic inference programs (PHYLIP). Seattle, WA: University of Washington.

Freshney RI. 2000. Culture of animal cells: a manual of basic technique. 4th ed. New York: Wiley-Liss.

Garland TJ, Dickerman AW, Janis CM, Jones JA. 1993. Phylogenetic analysis of covariance by computer simulation. Syst Biol 42:265–92.

Gatesy J, O'Leary MA. 2001. Deciphering whale origins with molecules and fossils. Trends Ecol Evol 16:562–70.

Gibson A, Gowri-Shankar V, Higgs PG, Rattray M. 2005. A comprehensive analysis of mammalian mitochondrial genome base composition and improved phylogenetic methods. Mol Biol Evol 22:251–64.

Godinot M, Mahboubi M. 1992. Earliest known simian primate found in Algeria. Nature 357:324–6.

Goodman M. 1996. Epilogue: a personal account of the origins of a new paradigm. Mol Phylogenet Evol 5:269–85.

Gregory WK. 1910. The orders of mammals. Bull Am Mus Nat Hist 27:1–524.

Gu X, Fu YX, Li WH. 1995. Maximum likelihood estimation of the heterogeneity of substitution rate among nucleotide sites. Mol Biol Evol 12:546–57.

Guindon S, Gascuel O. 2003. A simple, fast and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol 52:696–704.

Hooker JJ. 1989. Character polarities in early perissodactyls and their significance for Hyracotherium and infraordinal relationships. In: Prothero DR, Schoch RM, editors. The evolution of perissodactyls. New York: Clarendon Press. p 79–101.

Hsiao LL, Dangond F, Yoshida T, et al. (23 co-authors). 2001. A compendium of gene expression in normal human tissues. Physiol Genomics 7:97–104.

Huchon D, Madsen O, Sibbald MJ, Ament K, Stanhope MJ, Catzeflis F, de Jong WW, Douzery EJ. 2002. Rodent phylogeny and a timescale for the evolution of Glires: evidence from an extensive taxon sampling using three nuclear genes. Mol Biol Evol 19:1053–65.

Huelsenbeck JP, Ronquist F. 2001. MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics 17:754–55.

Irwin DM, Kocher TD, Wilson AC. 1991. Evolution of the cytochrome b gene of mammals. J Mol Evol 32:128–44.

Janke A, Feldmaier-Fuchs G, Thomas WK, von Haeseler A, Pääbo S. 1994. The marsupial mitochondrial genome and the evolution of placental mammals. Genetics 137:243–56.

Ji Q, Luo ZX, Yuan CX, Wible JR, Zhang JP, Georgi JA. 2002. The earliest known eutherian mammal. Nature 416:816–22.

Jobb G, von Haeseler A, Strimmer K. 2004. TREEFINDER: a powerful graphical analysis environment for molecular phylogenetics. BMC Evol Biol 4:18.

Jorgensen FG, Hobolth A, Hornshoj H, Bendixen C, Fredholm M, Schierup MH. 2005. Comparative analysis of protein coding sequences from human, mouse and the domesticated pig. BMC Biol 3:2.

Kumar S, Hedges SB. 1998. A molecular timescale for vertebrate evolution. Nature 392:917–20.

Lanave C, Preparata G, Saccone C, Serio G. 1984. A new method for calculating evolutionary substitution rates. J Mol Evol 20:86–93.

Le Gros Clark WE, Sonntag CF. 1926. A monograph of Orycteropus afer. Proc Zool Soc 30:445–85.

Li WH, Gouy M, Sharp PM, O'hUigin C, Yang YW. 1990. Molecular phylogeny of Rodentia, Lagomorpha, Primates, Artiodactyla, and Carnivora and molecular clocks. Proc Natl Acad Sci USA 87:6703–7.

Lin Y-H, McLenachan PA, Gore AR, Phillips MJ, Ota R, Hendy MD, Penny D. 2002. Four new mitochondrial genomes and increased stability of evolutionary trees of mammals from improved taxon sampling. Mol Biol Evol 19:2060–70.

Liu FG, Miyamoto MM. 1999. Phylogenetic assessment of molecular and morphological data for eutherian mammals. Syst Biol 48:54–64.

Luo ZX, Ji Q, Wible JR, Yuan CX. 2003. An Early Cretaceous tribosphenic mammal and metatherian evolution. Science 302:1934–40.

Madsen O, Scally M, Douady CJ, Kao DJ, DeBry RW, Adkins R, Amrine HM, Stanhope MJ, de Jong WW, Springer MS. 2001. Parallel adaptive radiations in two major clades of placental mammals. Nature 409:610–4.

Malia MJ, Lipscomb DL, Allard MW. 2003. The misleading effects of composite taxa in supermatrices. Mol Phylogenet Evol 27:522–7.

Misawa K, Janke A. 2003. Revisiting the Glires concept—phylogenetic analysis of nuclear sequences. Mol Phylogenet Evol 28:320–7.

Misawa K, Nei M. 2003. Reanalysis of Murphy et al.'s data gives various mammalian phylogenies and suggests overcredibility of Bayesian trees. J Mol Evol 57:S290–6.

Miyamoto MM, Goodman M. 1986. Biomolecular systematics of eutherian mammals: phylogenetic patterns and classification. Syst Zool 35:230–40.

Murphy WJ, Eizirik E, O'Brien SJ, et al. (11 co-authors). 2001. Resolution of the early placental mammal radiation using Bayesian phylogenetics. Science 294:2348–51.

Nei M, Kumar S. 2000. Molecular evolution and phylogenetics. New York: Oxford University Press.

Nilsson MA, Arnason U, Spencer PB, Janke A. 2004. Marsupial relationships and a timeline for marsupial radiation in South Gondwana. Gene 340:189–96.

Novacek MJ. 1992. Mammalian phylogeny: shaking the tree. Nature 356:121–5.

Novacek MJ, Wyss AR. 1986. Higher-level relationships of the recent eutherian orders: morphological evidence. Cladistics 2:257–87.

Page RDM, Holmes EC. 1998. Molecular evolution—a phylogenetic approach. Oxford: Blackwell Science Ltd.

Posada D, Crandall KA. 1998. Modeltest: testing the model of DNA substitution. Bioinformatics 14:817–8.

Prothero DR, Schoch RM. 1989. Origin and evolution of the Perissodactyla: summary and synthesis. In: Prothero DR, Schoch RM, editors. The evolution of perissodactyls. New York: Clarendon Press. p 504–29.

Pumo DE, Finamore PS, Franek WR, Phillips CJ, Tarzami S, Balzarano D. 1998. Complete mitochondrial genome of a neotropical fruit bat, Artibeus jamaicensis, and a new hypothesis of the relationship of bats to other eutherian mammals. J Mol Evol 47:709–17.

Reyes A, Gissi C, Catzeflis F, Nevo E, Pesole G, Saccone C. 2004. Congruent mammalian trees from mitochondrial and nuclear genes using Bayesian methods. Mol Biol Evol 21:397–403.

Reyes A, Pesole G, Saccone C. 1998. Complete mitochondrial DNA sequence of the fat dormouse, Glis glis: further evidence of rodent paraphyly. Mol Biol Evol 15:499–505.

Rosenberg MS, Kumar S. 2001. Incomplete taxon sampling is not a problem for phylogenetic inference. Proc Natl Acad Sci USA 98:10751–6.

Sambrook J, Russell DW. 2001. Molecular cloning, a laboratory manual. New York: Cold Spring Harbor Press.

Sanderson MJ. 2002. Estimating absolute rates of molecular evolution and divergence times: a penalized likelihood approach. Mol Biol Evol 19:101–9.

Sarich VM. 1970. Primate systematics with special reference to Old World monkeys. In: Napier JR, Napier PH, editors. Old World monkeys: evolution, systematics and behavior. New York: Academic Press. p 175–266.

Schmidt HA, Strimmer K, Vingron M, von Haeseler A. 2002. TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics 18:502–4.

Senut B, Pickford M, Gommery D, Mein P, Cheboi K, Coppens Y. 2001. First hominid from the Miocene (Lukeino Formation, Kenya). C R Acad Sci Paris Sci Terre Planètes 332:137–44.

Simpson GG. 1945. The principles of classification and a classification of mammals. Bull Am Mus Nat Hist 85:1–272.

Springer MS, Cleven GC, Madsen O, de Jong WW, Waddell VG, Amrine HM, Stanhope MJ. 1997. Endemic African mammals shake the phylogenetic tree. Nature 388:61–4.

Swofford DL. 1998. Phylogenetic analysis using parsimony (*and other methods). Version 4. Sunderland, MA: Sinauer Associates.

Warrington JA, Nair A, Mahadevappa M, Tsyganskaya M. 2000. Comparison of human adult and fetal expression and identification of 535 housekeeping/maintenance genes. Physiol Genomics 2:143–7.

Watson JD, Hopkins NH, Roberts JW, Steitz JA, Weiner AM. 1965. Molecular biology of the gene. Menlo Park, CA: Benjamin/Cummings.

Whelan S, Goldman N. 2001. A general empirical model of protein evolution derived from multiple protein families using a maximum likelihood approach. Mol Biol Evol 18:691–9.

Xu X, Janke A, Arnason U. 1996. The complete mitochondrial DNA sequence of the Greater Indian Rhinoceros, Rhinoceros unicornis, and the phylogenetic relationship among Carnivora, Perissodactyla and Artiodactyla (+Cetacea). Mol Biol Evol 13:1167–73.

Author notes

*Division of Evolutionary Molecular Systematics, Department of Cell and Organism Biology, University of Lund, Lund, Sweden; and †Division of Chemical Pathology, Department of Clinical Laboratory Sciences, University of Cape Town, South Africa

© The Author 2006. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: [email protected]
