Revealing the prehistoric settlement of Australia by Y chromosome and mtDNA analysis

Abstract

Published and new samples of Aboriginal Australians and Melanesians were analyzed for mtDNA (n = 172) and Y variation (n = 522), and the resulting profiles were compared with the branches known so far within the global mtDNA and the Y chromosome tree. (i) All Australian lineages are confirmed to fall within the mitochondrial founder branches M and N and the Y chromosomal founders C and F, which are associated with the exodus of modern humans from Africa ≈50–70,000 years ago. The analysis reveals no evidence for any archaic maternal or paternal lineages in Australians, despite some suggestively robust features in the Australian fossil record, thus weakening the argument for continuity with any earlier Homo erectus populations in Southeast Asia. (ii) The tree of complete mtDNA sequences shows that Aboriginal Australians are most closely related to the autochthonous populations of New Guinea/Melanesia, indicating that prehistoric Australia and New Guinea were occupied initially by one and the same Palaeolithic colonization event ≈50,000 years ago, in agreement with current archaeological evidence. (iii) The deep mtDNA and Y chromosomal branching patterns between Australia and most other populations around the Indian Ocean point to a considerable isolation after the initial arrival. (iv) We detect only minor secondary gene flow into Australia, and this could have taken place before the land bridge between Australia and New Guinea was submerged ≈8,000 years ago, thus calling into question that certain significant developments in later Australian prehistory (the emergence of a backed-blade lithic industry, and the linguistic dichotomy) were externally motivated.

Keywords: human evolution, population genetics


Australia was probably occupied by humans at least 50,000 years ago (1), at a time when lowered sea levels created a land bridge between Australia and neighboring New Guinea (NG) and when the region was separated from the Eurasian land mass by only narrow straits such as Wallace's Line (Fig. 1). Australia's archaeological record remains mysterious. To begin with, Australia harbors among the oldest modern human fossils outside Africa dating to ≈46,000 years (2, 3), despite the large geographic distance from the African homeland of mankind (4). Moreover, the earliest known Australian skeletons, at Lake Mungo, are gracile, whereas some younger skeletal finds (e.g., at Kow Swamp) have robust morphology (5). Some modern Australian aboriginals retain elements of this robustness, for example, in the form of pronounced brow ridges (supraorbital tori) (6, 7). Various explanations can be put forward for the inconsistent morphological record, for example that local Homo erectus of Southeast Asia admixed into the modern human gene pool to a lesser or greater extent (5, 8), or that there have been multiple migrations to Australia that gave rise to the differing morphologies at different times, such as hypothetical new migrants from India (9), or that Australia has been genetically isolated for a sufficiently long time to produce marked continent-/Australian-specific features (10–14).

Fig. 1.

Coastlines of Australia and NG ≈50,000 years ago. After the initial spread of H. sapiens out of Africa to Sahul (the formerly connected land mass of Australia and NG), the principal processes are differentiation of the mitochondrial DNA clades Q and S. Subsequent to that process, there is little migration within Sahul other than Q from NG to Australia. The genetic isolation of Australia is, in the main, already clearly evident before the Sahul land bridge disappears ≈8,000 years ago. See Results and Discussion for further details.

Archaeological data indicate an increase in the density and complexity of stone tools in Australia during the Holocene period and the emergence of backed-blade stone-tool technology (15). The first dingoes (Canis lupus dingo) also appear at about the same time (3,500–4,000 years ago) and were proposed to have been introduced by new human arrivals from India (16), along with new stone tool types (17). This debate is ongoing (15, 18, 19).

Recent molecular studies on humans have likewise yielded a diversity of interpretations, ranging from a deep but undated split distinguishing Australians even from their immediate neighbors to the north in NG (20) to a very recent immigration event within the Holocene in the past 10,000 years (21, 22).

Using new Australian and NG samples screened for mtDNA and Y chromosome variation and benefiting from the increasing genetic sample coverage available for Australia [Fig. 1 (4, 20–32)], we can now attempt to clarify some of the salient features of the record of Australian population history and confirm its considerable isolation.

Results and Discussion

African Ancestry of Australian and NG Y and mtDNA Types.

We carried out a phylogenetic analysis of our Australian and NG complete mtDNA sequences and compared the resulting branches with the Asian mtDNA tree, as known so far (Fig. 2). The result confirms that both Australian and NG maternal lineages consist exclusively of the known out-of-Africa founder types M and N, dated to ≈50–70,000 years ago, and their derivatives (24–27, 29, 30, 33–37). This mitochondrial finding is mirrored in our Y chromosome data (Fig. 3), where we observe the paternal lineages in Australians and New Guineans to fall into either branch C or branch F, proposed to be the earliest out-of-Africa founder types (31). These results indicate that Australians and New Guineans are ultimately descended from the same African emigrant group 50–70,000 years ago, as all other Eurasians. In other words, these data provide further evidence that local H. erectus or archaic Homo sapiens populations did not contribute to the modern aboriginal Australian gene pool, nor did Australians and New Guineans derive from a hypothetical second migration out of Africa (38), nor is there any suggestion of a specific relationship with India (9, 21, 22).

Fig. 2.

Simplified tree of autochthonous Near Oceanian mtDNA branches. East and Southeast Asian, and Indian specific clusters are added for comparison. Mutations relevant to Australia, Melanesia, and NG are shown along the branches. Only branches identified by at least two complete mtDNA sequences are included. For data and a detailed tree, see SI Fig. 4.

Fig. 3.

Simplified Y chromosomal phylogeny including the recently discovered Australia-specific marker M347. For data and a detailed tree, see SI Fig. 5.

Comparing the Australian complete mtDNA sequences within the context of the Asian phylogeny (25, 26, 38–45), we find that the Australians do not share any derived branches with Asians more recent than the founding types M, N, and R (Fig. 2). Similarly, our increased resolution of regionally differentiated Y chromosomal types, C5 in India, C4 in Australia, and C2 in NG, provides evidence of significant long-term isolation (Fig. 3). Although the confirmed existence of F* chromosomes in India (13, 46) suggests they may also exist in Australia and NG, incomplete molecular analysis for types G–J in some previous studies (28, 47) leaves the issue of the presence of basal F* chromosomes in Australia and NG unresolved. The implication is that the migration rate of the founders from Africa along the Indian Ocean has been rapid relative to the mutation rate of the complete mtDNA genome [one mutation in ≈5,000 years; see Mishmar _et al._ (37)]. These findings support the relatively rapid migration of the Eurasian founder types to Southeast Asia (45) and, as we can now confirm, all the way to Australia. It should be noted that migration in this context refers not simply to travel but also to successful colonization. Applying the given mutation rate to the M, N, and R founders, the migration from southwestern Asia to Australia would have taken <5,200 years at 95% confidence, assuming a Poisson mutation process. This migration speed is of the same order of magnitude as estimated for other prehistoric continental settlements (48).
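The Poisson argument can be illustrated with a short sketch. It assumes, as one possible reading of the text rather than the authors' stated calculation, that the three founders M, N, and R are treated as independent lineages, each showing no new mutation during the migration; the function name and the rounded rate of 5,000 years per mutation are illustrative choices. Under these assumptions the bound comes out at about 5,000 years, the same order as the figure quoted above, although it is not expected to reproduce that figure exactly.

```python
import math


def max_time_zero_mutations(years_per_mutation, n_lineages, confidence=0.95):
    """Upper bound on elapsed time when no mutation is observed on any lineage.

    With a Poisson process of rate 1/years_per_mutation per lineage,
    P(no mutation on n lineages in time t) = exp(-n * t / years_per_mutation).
    Setting this probability to (1 - confidence) and solving for t gives the bound.
    """
    alpha = 1.0 - confidence
    return -math.log(alpha) * years_per_mutation / n_lineages


# Assumed inputs: ~5,000 years per mutation, three founder lineages (M, N, R).
bound = max_time_zero_mutations(5000, 3)
print(f"95% upper bound on migration time: {bound:.0f} years")
```

With a single lineage the same bound would be three times larger, which shows how much the conclusion depends on treating the founders jointly.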

Australian and NG Founder Lineages.

An important result in our high-resolution mtDNA data is the discovery that Australians and New Guineans not only share the same M and N founders dating from the African exodus but furthermore, within M, share a characteristic variant at nucleotide position 13500, which is widespread in Australia, NG, and neighboring Melanesia but not found elsewhere in the world. Taken together with the fact that the ancestral node, but not the derived lineages, is shared between Australia and NG/Melanesia (Fig. 2), we argue for a single founder group settling the whole region of Australia and NG ≈50,000 years ago. Strongly supporting evidence for this view comes from the N portion of the mtDNA phylogeny (Fig. 2), where a major deep subclade P is found in both Australia and NG/Melanesia but not elsewhere, with the time-depth estimates for P again ranging around the 50,000-year mark (Table 1).

Table 1.

Age estimates for mtDNA branches found in Australians, New Guineans, and Melanesians

Region    Haplogroup   n    ρ     SE    Age, yr
Aus/Mel   M            50   7.9   1.1   53,400 ± 7,500
Aus/Mel   Q'M29        27   6.6   1.4   44,300 ± 9,800
Aus/Mel   Q            22   4.7   1.0   32,000 ± 6,500
Mel       Q1           11   3.2   0.9   21,500 ± 6,100
Aus/Mel   Q2           4    4.5   1.4   30,400 ± 9,300
Mel       Q3           7    3.1   0.8   21,300 ± 5,500
Mel       M29          5    2.8   1.2   18,900 ± 8,300
Mel       M27          7    5.9   1.4   39,600 ± 9,800
Mel       M28          8    3.0   1.0   20,300 ± 6,500
Mel       M28a         6    1.7   0.7   11,300 ± 4,500
Aus       M42          6    6.0   1.3   40,600 ± 9,000
Aus/Mel   N            51   7.9   1.1   53,200 ± 7,300
Aus       N12          4    2.5   1.1   16,900 ± 7,200
Aus       S            12   3.8   0.8   25,400 ± 5,200
Aus       S1           4    3.3   1.1   22,000 ± 7,700
Aus       S2           4    2.3   0.8   15,200 ± 5,100
Aus/Mel   R            33   8.6   1.2   58,400 ± 8,400
Aus/Mel   P            31   7.6   0.9   51,700 ± 5,800
Mel       P1           6    4.5   1.0   30,400 ± 6,500
Mel       P2           7    1.9   0.6   12,600 ± 4,000
Aus/Mel   P3           5    5.8   1.2   39,200 ± 8,200
Aus/Mel   P4           8    9.8   1.9   65,900 ± 13,200
Aus       P4b          3    7.0   1.7   47,300 ± 11,700
Mel       P4a          5    3.8   1.1   25,700 ± 7,500

Within Australia, the ancient mtDNA branch S (27) stands out, because it is found in 34% of our Australians [supporting information (SI) Table 2], and it is well represented in other regions of Australia, as detailed in Fig. 1 (4, 21, 23–27, 29) and has so far not been detected elsewhere in the world, based on the available global mtDNA database of >40,000 sequences (49). This branch is distinguished from the root of macrohaplogroup N by a transition at nucleotide position 8404. Nearly all Australians who do not have mtDNA type S nevertheless harbor deep mtDNA branches specific to Australia, several of which are described in this study (SI Fig. 4). These deep and continent-specific branches indicate substantial isolation since the first colonization of Australia. Although NG and Australia were not separated until 8,000 years ago, we can estimate the time depth for the arrival in Australia both qualitatively and quantitatively. Qualitatively, as argued above, the relatively nested phylogenetic structure, with no mutation events separating the M, N, and R founders around the Indian Ocean even at the highly resolved level of the complete mtDNA sequence, indicates an arrival in Australia soon after the African exodus, the latter dated to 50–70,000 years ago (14, 36, 48). Quantitatively, the absolute date estimates for the founder clusters in Near Oceania yield dates of up to 58,000 ± 8,000 years ago (Table 1).

Occurrence of a “New Guinean” Lineage in Northern Australia.

There is an important exception to the general pattern of Australian-specific lineages in Australian aboriginals, and this concerns mtDNA branch Q. Thus far, Q has been considered as having a geographic distribution restricted to NG and Melanesia (25, 26, 30, 33–35, 50). Surprisingly, in our northern Australian Kalumburu sample, we now find an Aboriginal Australian mtDNA lineage bearing all of the basic mutations characteristic of haplogroup Q. This Australian Q lineage does not appear to be a recent arrival from NG (nor indeed a case of sample confusion), because the lineage does not belong to any of the common and widespread Q subclusters known so far from NG and Melanesia. The Australian Q instead branches deeply within Q to a depth of five mtDNA mutations. The mutational time of separation of this Q lineage from existing NG Q branches is estimated at 30,400 ± 9,300 years (Table 1). The geographically restricted appearance of Q in northern Australia may suggest a secondary arrival of settlers from NG well before the land bridge between Australia and NG was submerged ≈8,000 years ago.

Apart from this potential signal of secondary migration into Australia, there seem to be no further lineages either on the Australian Y or mtDNA tree that would provide clear evidence for extensive genetic contact since the first settlement, except possibly for a P3 sublineage shared between Australia and NG (Fig. 2). Thus, Australia appears to have been largely isolated since initial settlement, in agreement with one interpretation of the fossil record (10, 11). In particular, there are no lineages exclusively shared between Australia and India that might have indicated common ancestry as originally proposed by Huxley (9). Indeed, we have identified a new Y marker M347 (Fig. 3), which distinguishes all Australian C types from Indian or other Asian C types and adds weight to the rejection of the Huxley hypothesis. NG, in contrast, does carry a clear imprint of new arrivals at least along its coasts, where the “Austronesian” B mtDNA type has been established (51, 52).

This conclusion may have a negative bearing on the much-discussed emergence of a new stone tool industry in Australia, the “small tool” tradition, characterized by backed blades (15, 53). There is currently no evidence in Australia to associate this change in the material culture record with the arrival of new maternal and paternal lineages.

A major question that has not been addressed here and awaits resolution is the intriguing linguistic landscape of Australia, where seven-eighths of the continent is dominated by a single language family (Pama-Nyungan), whereas all other language families are concentrated in the northwestern region of Australia [Fig. 1 (54)]. Our samples from Kalumburu are from the linguistically diverse northern zone, where we have identified potential secondary gene flow into Australia as evidenced by a mitochondrial Q lineage distantly related to current NG Q lineages. The secondary migration ≈30,000 years ago associated with the arrival of the Q lineage would be considered too early, in the view of most linguists, to account for this dichotomy. Future more exhaustive genetic surveys of the Australian continent may one day resolve whether the Australian linguistic landscape can be better understood with the identification of such potential contact events. At present, it may seem preferable to seek an explanation for the dichotomy in terms of events and processes internal to Australia.

Conclusions

The mitochondrial and Y chromosomal results presented here point toward one early founder group settling both Australia and NG soon after the exodus from Africa ≈50–70,000 years ago, at a time when the lowered sea levels joined the two islands into one land mass, necessitating sea travel only across narrow straits such as Wallace's Line. The deep and specific phylogenetic lineages today within this former landmass indicate a small founding population size and subsequent isolation of Australia and, to a lesser extent, of NG, from the rest of the world. These founder events and the lack of contact could underlie the divergent morphological development seen in the Australian human fossil record and could also help explain the remarkably restricted range of Pleistocene Australian lithic industries and bone artifacts compared with contemporaneous cultures elsewhere in the world (55).

Materials and Methods

Samples.

In total, 172 Australian and Melanesian mtDNAs and 522 Y chromosome profiles were used in this study. Samples were obtained with informed consent. The following mtDNA sequences were generated: 32 sampled Aboriginal Australians from Kalumburu in northwestern Australia and 48 NG highlanders from the Bundi area (Fig. 1). Four of the Australian individuals had been characterized by Y chromosome short tandem repeat analysis (20). In addition, mtDNA sequences were generated from the following DNA samples described in Kivisild et al. (27): two Aboriginal Australian samples (Oc06 and Oc10), two NG samples (Oc01 and Oc16), and two Melanesian samples (Oc03 and Oc04). Extended Y chromosomal profiles were generated for the males within these samples (6/32 Aboriginal Australians and 19/48 New Guineans).

Previously published mtDNA profiles were included as follows: 33 complete or nearly complete mtDNA sequences from Australia (25–27, 29, 30); 32 complete or nearly complete mtDNA sequences from NG (25–27, 30, 33, 34); and 27 complete or nearly complete mtDNA sequences from Melanesia (25, 27, 30, 34, 35).

Published Eurasian and Near Oceanian Y chromosomal haplotypes in the present study include: 102 Aboriginal Australians (28, 31, 32); 395 New Guineans (28, 31, 32, 47); 1,021 individuals from Southeast Asian populations (28, 31, 32, 47, 56); 1,141 individuals from the Indian subcontinent and Pakistan (13, 31, 32, 46); 358 individuals from East Asian populations (13, 28, 31, 32); and 1,065 individuals from Northeast and Central Asian populations (31, 32, 56) (see SI Fig. 5 for further details).

mtDNA Typing.

The first hypervariable segment (HVS-1) of mtDNA (nps 16024–16383) and the stretch 57–302 of HVS-2 were sequenced directly from both strands in all samples. Additionally, two macrohaplogroup M and N defining mutations, namely 10398 A>G and 10400 C>T, were genotyped in all samples by RFLP (DdeI 10394 and AluI 10397, respectively). M types were further analyzed, by direct sequencing, for the Q and M29 marker 13500C (33, 50), and N types were checked for the S marker 8404C (27) and the P marker 15607G (33, 50). Additional coding region markers were analyzed in S and P mtDNA types (SI Table 3). The haplotypes defined by control region sequences and coding region SNPs were further grouped by their mutational motifs under the following subhaplogroups: B4a1a1, M7b1, P1, P2, P3, P4b, Q1, and Q2 (see SI Table 3 and SI Fig. 4 for further details) (30, 33, 43, 51).
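The hierarchical screening described above can be summarized as a small decision procedure. The sketch below is a simplification under stated assumptions, not the authors' pipeline: the function name and the dictionary input are hypothetical, a sample lacking 10400T is treated as N (reasonable here only because every lineage in this study falls within M or N), and the 10398 status and the additional coding region markers are ignored.

```python
def assign_branch(alleles):
    """Assign a coarse mtDNA branch from typed marker alleles.

    `alleles` maps nucleotide position -> observed base (hypothetical input format).
    """
    if alleles.get(10400) == "T":
        # Macrohaplogroup M; 13500C marks the Q and M29 lineages.
        if alleles.get(13500) == "C":
            return "M: Q or M29"
        return "M (other)"
    # Assumption of this sketch: anything that is not M is treated as N.
    if alleles.get(8404) == "C":
        return "N: S"
    if alleles.get(15607) == "G":
        return "N: P"
    return "N (other)"


# Illustrative inputs only; these are not genotypes from the study.
print(assign_branch({10400: "T", 13500: "C"}))
print(assign_branch({10400: "C", 8404: "C"}))
print(assign_branch({10400: "C", 15607: "G"}))
```

Samples that end in an "(other)" category correspond to the haplotypes that needed further markers or complete sequencing for classification.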

Most NG (40/48) and approximately one-half of Aboriginal Australian (14/32) mtDNA haplotypes could be sufficiently well characterized using existing mtDNA haplogroup nomenclature (SI Tables 2 and 3). Of the 26 mtDNA control region sequences that did not show clear affiliation to previously described haplogroups, nine Australian and NG individuals were selected for complete mtDNA sequencing. All recently characterized mutations that were found during the complete mitochondrial genome sequencing were typed in individuals with similar or identical mtDNA control region sequences (SI Table 3).

Multiplex SNP Assay.

A mtDNA multiplex PCR was designed and performed in a reaction volume of 25 μl containing 1× PCR buffer, 6.5 mM MgCl2, 600 mM each dNTP, 0.01–0.2 mM of each primer (SI Table 4), and 2 units of AmpliTaq Gold DNA polymerase (Applied Biosystems, Tartu, Estonia). The thermal cycling program was: denaturation at 95°C for 10 min followed by 35 cycles of 95°C for 30 s, 60°C for 30 s, and 65°C for 30 s, followed by 6 min at 65°C.

Excess primers and dNTPs were removed by addition of 1 μl (1 unit/μl) of shrimp alkaline phosphatase and 0.02 μl (10 units/μl) of Exonuclease I (Amersham Pharmacia Biotech, Piscataway, NJ) to 2.5 μl of PCR product and incubating the mixture at 37°C for 30 min followed by 80°C for 15 min.

Single-base extension (SBE) reactions were performed in 5 μl with 1 μl of purified PCR product, 3 μl of SNaPshot (Applied Biosystems), or SNuPe (Amersham Biosciences, Piscataway, NJ) reaction mix, 0.5 μl of SBE primer mix (0.01–0.3 mM each primer; see SI Table 5), and 0.5 μl of water. The SBE primer mix was diluted in 160 mM ammonium sulfate (Sigma–Aldrich, Helsinki, Finland) to minimize primer-dimer artifacts. Excess nucleotides were removed by addition of 1 μl (1 unit/μl) shrimp alkaline phosphatase to the SBE mix and incubation at 37°C for 20 min followed by incubation at 80°C for 15 min. Two microliters of SBE product were mixed with 18 μl of Hi-Di formamide (Applied Biosystems) and 0.1 μl of GeneScan-120 Liz internal size standard (Applied Biosystems), and analyzed by capillary electrophoresis using ABI Prism 3730XL Genetic Analysers with 50 cm capillary arrays and POP-6 polymer (Applied Biosystems) or a MegaBACE Analysis System (Amersham Biosciences). Full methodological and theoretical details are available elsewhere (57, 58).

Y Chromosome Typing.

Eighteen Y chromosomal markers (M4, M9, M11, M38, M45, M70, M89, M130, M147, M175, M177, M208, M210, M214, M230, M231, M347, and M356) (SI Fig. 5) were typed in 25 of the Kalumburu and Bundi samples. One previously unpublished biallelic M347 marker is reported here. M347 was amplified by using primers (F, 5′-AAGTGGAGGGTATGTTTCAGCC-3′; R, 5′-GGCAACAATAGGCAGATGGCTC-3′) specific for a single 558-bp amplicon. The thermal cycling program was: denaturation at 95°C for 3 min followed by 36 cycles of 95°C for 30 s, 53°C for 30 s, and 72°C for 40 s, followed by 5 min at 72°C. Nucleotide position 374 A>G (ancestral>derived) variant was sequenced directly by using one of the same external primers. The following were additionally genotyped: haplogroup C* lineages (altogether 13 individuals) from the study by Kivisild et al. (46) were typed for the presence of the M356 marker reported by Sengupta et al. (13); NG haplogroup K* lineages from the study by Underhill et al. (31, 32) were typed for the presence of M230 marker; and the M347 marker was typed in Australian C lineages from the study by Underhill et al. (31, 32).

Coalescence Age Estimation.

Phylogenetic trees were constructed manually and confirmed by using the Network software (www.fluxus-engineering.com) (59, 60). Coalescence ages of mtDNA haplogroups were calculated with the rho (ρ) statistic as described in refs. 61 and 62, using the coding region mutation rate of one synonymous transition per 6,764 years (27).
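As a hedged illustration of this calculation, the sketch below computes ρ as the average number of mutations separating the sampled sequences from the root of a clade and converts it to years with the rate given above. The function names and the toy tree are hypothetical. The standard error uses the branch-based variance formula commonly attributed to Saillard et al., and whether that matches the procedure of refs. 61 and 62 in every detail is an assumption.

```python
import math

YEARS_PER_MUTATION = 6764  # one synonymous transition per 6,764 years (from the text)


def rho_and_se(branches, n_samples):
    """Return (rho, standard error) for a clade.

    `branches` is a list of (mutations_on_branch, n_descendant_samples) pairs,
    one per branch of the tree below the root.
    rho = sum(n_i * l_i) / n, variance = sum(n_i**2 * l_i) / n**2.
    """
    rho = sum(l * k for l, k in branches) / n_samples
    variance = sum(l * k * k for l, k in branches) / n_samples ** 2
    return rho, math.sqrt(variance)


def to_years(rho, se):
    """Convert rho and its standard error to years."""
    return rho * YEARS_PER_MUTATION, se * YEARS_PER_MUTATION


# Toy star-like tree: four samples, each on its own branch from the root.
rho, se = rho_and_se([(2, 1), (3, 1), (1, 1), (2, 1)], n_samples=4)
age, age_se = to_years(rho, se)
print(f"rho = {rho:.2f} ± {se:.2f}; age = {age:.0f} ± {age_se:.0f} years")
```

Applying `to_years` to the rounded Table 1 entry for haplogroup M (ρ = 7.9, SE = 1.1) gives roughly 53,400 ± 7,400 years, close to the tabulated 53,400 ± 7,500; the small difference presumably reflects rounding of ρ and SE in the table.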

Supplementary Material

Supporting Information

Abbreviations

NG, New Guinea; SBE, single-base extension.

Footnotes

The authors declare no conflict of interest.

Data deposition: The sequences reported in this paper have been deposited in the GenBank database [accession nos. EF495214–EF495222 (complete mtDNA sequences), EF524341–EF524420 (mtDNA HVS-1 sequences), and EF524421–EF524500 (partial HVS-2 sequences)].
