Hierarchical Patterns of Global Human Y-Chromosome Diversity


Michael F. Hammer, Tatiana M. Karafet, Alan J. Redd, Hamdi Jarjanazi, Silvana Santachiara-Benerecetti, Himla Soodyall, Stephen L. Zegura, Hierarchical Patterns of Global Human Y-Chromosome Diversity, Molecular Biology and Evolution, Volume 18, Issue 7, July 2001, Pages 1189–1203, https://doi.org/10.1093/oxfordjournals.molbev.a003906

Abstract

We examined 43 biallelic polymorphisms on the nonrecombining portion of the Y chromosome (NRY) in 50 human populations encompassing a total of 2,858 males to study the geographic structure of Y-chromosome variation. Patterns of NRY diversity varied according to geographic region and method/level of comparison. For example, populations from Central Asia had the highest levels of heterozygosity, while African populations exhibited a higher level of mean pairwise differences among haplotypes. At the global level, 36% of the total variance of NRY haplotypes was attributable to differences among populations (i.e., ΦST = 0.36). When a series of AMOVA analyses was performed on different groupings of the 50 populations, high levels of among-groups variance (ΦCT) were found between Africans, Native Americans, and a single group containing all 36 remaining populations. The same three population groupings formed distinct clusters in multidimensional scaling plots. A nested cladistic analysis (NCA) demonstrated that both population structure processes (recurrent gene flow restricted by isolation by distance and long-distance dispersals) and population history events (contiguous range expansions and long-distance colonizations) were instrumental in explaining this tripartite division of global NRY diversity. As in our previous analyses of smaller NRY data sets, the NCA detected a global contiguous range expansion out of Africa at the level of the total cladogram. Our new results support a general scenario in which, after an early out-of-Africa range expansion, global-scale patterns of NRY variation were mainly influenced by migrations out of Asia. Two other notable findings of the NCA were (1) Europe as a “receiver” of intercontinental signals primarily from Asia, and (2) the large number of intracontinental signals within Africa. 
Our AMOVA analyses also supported the hypothesis that patrilocality effects are evident at local and regional scales, rather than at intercontinental and global levels. Finally, our results underscore the importance of subdivision of the human paternal gene pool and imply that caution should be exercised when using models and experimental strategies based on the assumption of panmixia.

Introduction

Knowledge of just how genetic variation is partitioned among human populations has important implications for studies of human origins, DNA forensics, and the etiology of human disease. Compared with other species, humans have relatively low levels of genetic diversity. Also, the proportion of that diversity that exists between all levels of human population is correspondingly low (Nei 1987 ). For example, studies of blood groups and protein polymorphisms have shown that approximately 85% of all human genetic diversity can be found within a single population, while only 5%–10% of the total diversity is partitioned among major geographic regions (Lewontin 1972 ; Nei and Roychoudhury 1974 ; Latter 1980 ). A similar apportionment of genetic diversity has been revealed at the DNA level by genomewide analyses of a variety of different markers (Bowcock and Cavalli-Sforza 1991 ; Batzer et al. 1994 ; Deka et al. 1995 ; Barbujani et al. 1997 ; Jorde et al. 2000 ). Concordant results have also been obtained from other human traits, such as cranial morphology (Relethford and Harpending 1994 ). Overall, these studies have generally underscored the lack of discontinuity among human groups (Lewontin 1972 ; Barbujani et al. 1997 ) and the relative homogeneity of the human species. This observation, in turn, may reflect a relatively recent origin for human between-groups differentiation, relatively high rates of migration among different human groups, or both (Relethford 1995 ).

Assuming a 1:1 sex-ratio, autosomal and X-linked regions of the genome have four- and threefold higher effective sizes, respectively, than the nonrecombining portion of the Y chromosome (NRY) and the mitochondrial DNA molecule. Consequently, increased levels of population subdivision due to genetic drift are expected for these uniparentally inherited haploid regions of the genome. Until recently, the small number of known NRY polymorphisms has hindered a comprehensive assessment of the global structure of Y-chromosome diversity. Earlier studies indicated that Y-chromosome polymorphisms were geographically restricted and that _F_ST values for the NRY were higher than those for mtDNA (Jobling and Tyler-Smith 1995 ; Cavalli-Sforza and Minch 1997 ; Underhill et al. 1997 ; Hammer et al. 1998 ; Perez-Lezaun et al. 1999 ). Indeed, the higher observed _F_ST for the NRY compared with that for mtDNA led Seielstad, Minch, and Cavalli-Sforza (1998) to propose that females have had an eightfold higher migration rate than males. It is unclear, however, whether the suggested underlying cause of this higher mobility (i.e., local-scale patrilocality, defined anthropologically as the tendency for a wife to move into her husband's natal domicile) would lead to a higher global _F_ST for the Y chromosome (Stoneking 1998 ). On the other hand, contrasting signals in nested cladistic analyses of NRY and mtDNA data sets led Hammer et al. (1998) to hypothesize that male migration rates may have been higher than those for females at the intercontinental level. Despite these observations, most human population genetics models assume panmixia. This study was designed to measure the degree of Y-chromosome structure on a global scale (i.e., to test the assumption of panmixia) and to test the global applicability of the patrilocality hypothesis.
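The four- and threefold figures can be checked against the standard textbook expressions for effective size with separate numbers of breeding males and females. The sketch below is illustrative only: the function name and the census numbers are assumptions, not values from this study.

```python
from fractions import Fraction

def effective_sizes(n_males, n_females):
    """Textbook effective sizes for four genomic compartments.

    n_males and n_females are hypothetical numbers of breeding individuals.
    """
    nm, nf = Fraction(n_males), Fraction(n_females)
    return {
        "autosomes": 4 * nm * nf / (nm + nf),
        "X": 9 * nm * nf / (4 * nm + 2 * nf),
        "NRY": nm / 2,      # transmitted by males only, haploid
        "mtDNA": nf / 2,    # transmitted by females only, haploid
    }

# A 1:1 sex ratio, as assumed in the text (5,000 is an arbitrary choice).
sizes = effective_sizes(5000, 5000)
ratios = {name: size / sizes["NRY"] for name, size in sizes.items()}
print(ratios)  # autosomes 4, X 3, NRY 1, mtDNA 1
```

Any departure from a 1:1 breeding sex ratio changes these ratios, which is one reason the comparison between NRY and mtDNA differentiation is sensitive to sex-specific demography.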

Materials and Methods

Populations Sampled

We analyzed a total of 2,858 males from 50 populations (table 1 ). The sample was divided into the following 10 major geographic regions: sub-Saharan Africa (SAF), North Africa (NAF), the Middle East (MEA), Europe (EUR), South Asia (SAS), Central Asia (CAS), North Asia (NAS), East Asia (EAS), Oceania (OCE), and the Americas (AME). Many of the samples analyzed here were also included in our previous studies (Hammer et al. 1997, 1998, 2000 ; Karafet et al. 1999 ), although the exact number of subjects reported for each population sometimes differs. The 50 populations were distributed according to geographic region in the following way (see accompanying citation below for populations not described in the aforementioned references): SAF (n = 229) = 73 Khoisan, 55 East Bantus, 26 Pygmies, 26 Bagandans, and 49 Gambians; NAF (n = 131) = 51 Ethiopians, 50 Egyptians, and 30 Tunisians; MEA (n = 180) = 20 Saudi Arabians, 88 Syrians, and 72 Turks; EUR (n = 327) = 86 Greeks, 62 Italians, 58 Romanians (this study), 34 Germans, 44 Russians, and 43 British; SAS (n = 196) = 75 Sri Lankans, 59 Indians, and 62 Pakistanis (Qamar et al. 1999 ); CAS (n = 263) = 45 Turkmen (this study), 14 Tadjiks (this study), 77 Uzbeks (this study), 30 Kazakhs, 29 Altai, and 68 Uygurs (this study); NAS (n = 495) = 148 Mongolians, 81 Buryats, 122 Selkups, 27 Forest Nentsi, 95 Evenks, and 22 Siberian Eskimos; EAS (n = 461) = 70 Vietnamese, 58 Miao (this study), 84 Chinese Han, 52 Manchu (this study), 76 Koreans, and 121 Japanese; OCE (n = 303) = 60 East Indonesians, 47 Papua New Guineans, 50 Melanesians, 78 Australian Aboriginal people, 18 Micronesians, and 50 Polynesians (this study); and AME (n = 273) = 80 Navajos, 45 Cheyenne, 24 Pima, 72 Mayans (this study), 28 Mixtecs, and 24 Wayus. All sampling protocols were approved by the Human Subjects Committee at the University of Arizona.
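As a consistency check on the sampling scheme, the per-population counts listed above can be tallied by region. The counts in this sketch are transcribed from the text; the variable names are ours.

```python
# Per-population sample sizes, in the order given in the text.
samples = {
    "SAF": [73, 55, 26, 26, 49],
    "NAF": [51, 50, 30],
    "MEA": [20, 88, 72],
    "EUR": [86, 62, 58, 34, 44, 43],
    "SAS": [75, 59, 62],
    "CAS": [45, 14, 77, 30, 29, 68],
    "NAS": [148, 81, 122, 27, 95, 22],
    "EAS": [70, 58, 84, 52, 76, 121],
    "OCE": [60, 47, 50, 78, 18, 50],
    "AME": [80, 45, 24, 72, 28, 24],
}

region_totals = {region: sum(counts) for region, counts in samples.items()}
n_populations = sum(len(counts) for counts in samples.values())
n_males = sum(region_totals.values())

print(region_totals)
print(n_populations, n_males)  # 50 populations, 2858 males
```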

Mutation Detection

Mutation detection analysis was performed using single-stranded conformation polymorphism (SSCP) (Sheffield et al. 1993 ) and denaturing high-performance liquid chromatography (DHPLC) (Underhill et al. 1997 ). Two panels of DNA samples were employed to ascertain polymorphisms using the above methods. For SSCP, n = 20 (9 sub-Saharan Africans, 3 Asians, 3 Native Americans, 3 Europeans, and 2 Oceanians), and for DHPLC, n = 57 (17 sub-Saharan Africans, 15 Asians, 11 Europeans, 7 Native Americans, and 7 Oceanians).

The SSCP method was used to screen a set of 20 sequence-tagged sites (STSs). The DHPLC method was used to screen for mutations in the following set of three clones that were previously used as probes to detect restriction fragment length polymorphism (RFLP) variation on the NRYs of humans and great apes (Allen and Ostrer 1994 ): clone 4-1 (DYS188), clone 3-11 (DYS190), and clone 3-8 (DYS194). Mutational variation within four STSs (DYS221, DYS257, DYS199, and DYS211) and two clones (3-8 and 4-1) was previously reported (Karafet et al. 1997 ; Hammer et al. 1998, 2000 ). Additional variation was found at sites within three STSs (DYS7, DYS265, and DYS257b) and two clones (3-8 and 3-11). Finally, Ya5 Alu elements within the 16E4 and 486,O,2 clones (GenBank accession numbers AC003094 and AC002531, respectively; http://www.ncbi.nlm.nih.gov/Genbank/index.html), as well as a 683-bp region of an arylsulfatase pseudogene (ARSEP, GenBank accession number AC002992) were screened for polymorphisms using DHPLC.

The DYS7 (GenBank accession number G12023), DYS265 (GenBank accession number G12016), and DYS257 (GenBank accession number G38358) STSs were amplified using the conditions and primers reported by Vollrath et al. (1992) . The Y-specific clones 3-8 (DYS194) and 3-11 (DYS190) (Allen and Ostrer 1994 ) were sequenced by primer walking (GenBank accession numbers AF257064 and AF337053, respectively), and the sequence information was used to design primers to amplify shorter fragments for DHPLC analysis. DNA sequencing was performed by standard procedures to identify mutations which altered mobility on SSCP gels or DHPLC chromatograms.

As in previous mutation detection surveys (Underhill et al. 1997, 2000 ; Hammer et al. 1998 ; Karafet et al. 1999 ), we sequenced homologous DNA regions encompassing all sites found to be polymorphic on the human NRY in great ape species (e.g., one common chimpanzee, one bonobo, and one gorilla) to determine ancestral states, as well as the position of the root of the human NRY haplotype tree.

Allele-Specific Genotyping Assays

A total of 23 segregating sites were discovered using the two mutation detection methods. Of these 23, 15 were chosen for genotyping in the entire sample (10 new and 5 previously published polymorphisms). The other 8 polymorphisms were found to be so rare in a subset of the 2,858 chromosomes that they were excluded from subsequent analyses. Once the position of a sample on the haplotype tree had been determined, no further genotyping was undertaken for that sample. This hierarchical genotyping protocol means that not every individual was typed for every marker, and hence it is possible that some recurrent mutations remained undetected using this strategy (Underhill et al. 2000). Nevertheless, because the homoplasy rate for single-nucleotide polymorphisms (SNPs) on the NRY is so low (Underhill et al. 2000), it is unlikely that undetected multiple “hits” would seriously affect either our phylogenetic or our diversity analyses. The remaining 28 previously published polymorphisms (from the entire battery of 43 polymorphisms) were also genotyped for all 2,858 chromosomes, with the aforementioned caveats.
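The logic of a hierarchical genotyping protocol can be sketched as a walk down the haplotype tree in which only the markers on branches below the current node are assayed. The tree, marker names, and function below are hypothetical and do not reproduce the actual marker order used in this study.

```python
# Hypothetical toy tree: each internal node lists (marker, child) branches.
TREE = {
    "root": [("mA", "n1"), ("mB", "n2")],
    "n1": [("mC", "h1"), ("mD", "h2")],
    "n2": [("mE", "h3")],
}

def place_sample(derived_markers, tree=TREE, start="root"):
    """Place one chromosome on the tree, typing as few markers as possible.

    derived_markers stands in for the assay results: the set of markers at
    which the sample would show the derived state if it were typed.
    Returns the final node and the list of markers actually typed.
    """
    node, typed = start, []
    while node in tree:
        for marker, child in tree[node]:
            typed.append(marker)
            if marker in derived_markers:
                node = child
                break
        else:
            break  # no branch below this node is derived; stop here
    return node, typed

# A sample that carries a recurrent "hit" at mE on an unrelated branch.
node, typed = place_sample({"mA", "mD", "mE"})
print(node, typed)  # h2 ['mA', 'mC', 'mD']
```

In this toy case the derived state at mE is never observed because mE lies off the path taken, which is the sense in which recurrent mutations can remain undetected under such a protocol.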

Variation at all previously unpublished polymorphic sites (table 1: mutations 2, 6, 9, 10, 12, 13, 20, 23, 32, and 38) was genotyped using allele-specific PCR (Sommer, Groszbach, and Bottema 1992). The PCR conditions and primer sequences employed in these allele-specific genotyping assays were deposited in the National Center for Biotechnology Information (NCBI) dbSNP database (http://www.ncbi.nlm.nih.gov/SNP). Mutations numbered 3, 7, 14, 16–19, 22, 24, 25, and 40–43 in table 1 were genotyped according to methods reported by Hammer and Horai (1995), Hammer et al. (1998, 2000), and Karafet et al. (1999). Other previously published mutations included mutation 8 (Jobling et al. 1996); mutations 1, 4, 5, 15, 21, 26, 27, 36, 37, and 39 (Underhill et al. 1997); mutation 31 (Zerjal et al. 1997); mutations 34 and 35 (Shinka et al. 1999); mutations 28 and 33 (Su et al. 1999); mutations 11 and 30 (Bao et al. 2000); and mutation 29 (Santos et al. 2000).

Statistical Analyses

Parsimony analysis of NRY haplotypes was aided by the use of PAUP, version 4.0b4 (Swofford 2000), with outgroup rooting. Measures of haplotype diversity, including the number of haplotypes (k), Nei's (1987) heterozygosity (h), and the mean number of pairwise differences among haplotypes (p), were calculated using the software package ARLEQUIN (Schneider et al. 1998). We also used ARLEQUIN to perform analysis of molecular variance (AMOVA). AMOVA produces estimates of variance components and Φ statistics (F statistic analogs) reflecting the correlation of haplotypic diversity at different levels of hierarchical subdivision (Excoffier, Smouse, and Quattro 1992). Because the assumptions of random sampling, “pure” genetic drift, and no migration are likely to be violated in all human populations, caution is needed when interpreting Φ statistics. Nevertheless, according to Excoffier, Smouse, and Quattro (1992), the resulting variance components can be viewed as convenient summaries of the partitioning of genetic variation within and among populations. We performed multidimensional scaling (MDS) (Kruskal 1964) on the ΦST distances generated in ARLEQUIN using the software package NTSYS (Rohlf 1998). Nested cladistic analyses (NCAs) were carried out using GeoDis, version 2.0 (Posada, Crandall, and Templeton 2000). This novel method attempts to explain statistically significant associations between haplotypes and geography in terms of population history and/or population structure considerations. Population structure processes operate over short time intervals and tend to establish migration-drift equilibria, whereas population history events are considered to be nonrecurrent phenomena that disrupt equilibria. Three conditions underlie the general applicability of NCA and its ability to discriminate among the various population structure processes (i.e., recurrent gene flow restricted by isolation by distance vs. long-distance dispersal) and/or population history events (i.e., contiguous range expansion, long-distance colonization, or fragmentation). These conditions are (1) adequate sampling across the geographic range of the species, (2) temporal polarity of the haplotype network, and (3) mutational resolution in the haplotype tree. Because our cladogram was rooted by outgroup comparisons, we were able to infer the geographical polarity of several of the signals detected by the NCA by considering the distribution of interior and tip clades (e.g., directionality was assumed to go from interior to tip), especially in cases where there was a clear geographic pattern of separation between ancestral and derived haplotypes. For a more extensive explanation of the NCA method, consult Templeton, Routman, and Phillips (1995), Hammer et al. (1998), and Posada, Crandall, and Templeton (2000).
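For readers unfamiliar with the two diversity measures, the following minimal sketch computes h and p for a toy sample of haplotypes coded as strings of ancestral (0) and derived (1) states. It assumes the usual sample-size-corrected estimators; the exact formulas implemented in ARLEQUIN may differ in detail, and the toy data are invented.

```python
from collections import Counter
from itertools import combinations

def nei_h(haplotypes):
    """Nei's heterozygosity: n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))

def mean_pairwise_differences(haplotypes):
    """Average number of differing sites over all pairs of sampled chromosomes."""
    pairs = list(combinations(haplotypes, 2))
    total = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total / len(pairs)

# Invented sample: five chromosomes typed at three biallelic sites.
sample = ["000", "000", "001", "011", "111"]
print(nei_h(sample))                      # approximately 0.9
print(mean_pairwise_differences(sample))  # 1.6
```

The two statistics respond to different features of the data: h depends only on haplotype frequencies, whereas p also depends on how many mutational steps separate the haplotypes.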

Results

During the course of this research, the following 10 previously unpublished Y-specific polymorphisms were discovered: a C→T transition at position 20642 of 16E4; a G→A transition at position 905 of DYS190; a T→G transversion at position 922 of DYS190; a T→C transition at position 450 of DYS194; a T→C transition at position 1391 of DYS194; a C→A transversion at position 118453 of 486,O,2; a T→C transition and a T deletion at positions 71227 and 71228, respectively, of ARSEP; a C→T transition at position 108 of DYS7; a G→C transversion at position 83 of DYS265; and a G→A transition at position 162 of DYS257. Comparisons with the homologous sequences from one common chimpanzee, one bonobo, and one gorilla allowed us to infer the ancestral states at all of these sites except 16E4 (position 20642) and 486,O,2 (position 118453), which only occurred in human-specific Alu elements.
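The transition/transversion labels above follow the usual rule that a transition exchanges two purines or two pyrimidines. A short sketch, with the ten substitutions transcribed from the text (the single-base deletion in ARSEP is omitted), confirms that each label matches that rule; the function name is ours.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def substitution_class(ancestral, derived):
    """Classify a base substitution as a transition or a transversion."""
    pair = {ancestral, derived}
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"

# (locus, position, from, to, label as reported in the text)
reported = [
    ("16E4", 20642, "C", "T", "transition"),
    ("DYS190", 905, "G", "A", "transition"),
    ("DYS190", 922, "T", "G", "transversion"),
    ("DYS194", 450, "T", "C", "transition"),
    ("DYS194", 1391, "T", "C", "transition"),
    ("486,O,2", 118453, "C", "A", "transversion"),
    ("ARSEP", 71227, "T", "C", "transition"),
    ("DYS7", 108, "C", "T", "transition"),
    ("DYS265", 83, "G", "C", "transversion"),
    ("DYS257", 162, "G", "A", "transition"),
]

mismatches = [row for row in reported if substitution_class(row[2], row[3]) != row[4]]
print(len(reported), mismatches)  # 10 []
```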

In addition to these 10 new markers, we surveyed 33 previously published polymorphisms (table 1 ). Mutational events at two sites were recurrent (SRY10831 and MSY2). The character states at all 41 mutational sites give rise to 44 possible NRY haplotypes, of which 39 were present in this survey. The frequencies of these 39 haplotypes (h1–h39) in each regional group are reported in table 1 . Figure 1 displays a maximum-parsimony tree showing the evolutionary relationships of all 39 haplotypes. Haplotypes in figure 1 are color-coded by geography. The pie charts represent the frequencies of occurrence of the haplotypes within each of the 10 geographic regions listed in table 1 , and the overall size of each circle represents a global haplotype frequency.

The position of the root in figure 1 was determined by outgroup comparisons. In order to confirm the position of the root, a subset of chromosomes was genotyped at four sites (M91, M42, M94, and M139) that mark the two most basal lineages on the maximum-parsimony tree presented by Underhill et al. (2000) . Overall, the tree in figure 1 exhibits a high degree of underlying similarity with Underhill et al.'s (2000) maximum-parsimony tree (even though our analysis was based on ∼fourfold fewer markers, with only 15 common to both studies). In fact, our figure 1 and the Y-chromosome tree in Underhill et al. (2000) offer remarkably strong mutual confirmation. For instance, Underhill et al.'s (2000) haplogroup I is represented by our haplotypes 1–4. The rest of the correspondences are as follows, again with the Underhill et al. (2000) haplogroup designation first, followed by our matching haplotype number(s): haplogroup II—haplotypes 5–10; haplogroup III—haplotypes 13–15; haplogroup IV—haplotypes 11 and 12; haplogroup V—haplotypes 16–18; haplogroup VI—haplotypes 19–23; haplogroup VII—haplotypes 25–27 and 29–32; haplogroup VIII—haplotypes 24, 28, and 33–35; haplogroup IX—haplotypes 36–38; and haplogroup X—haplotype 39.
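The correspondence just described can be held in a small lookup table, which also makes it easy to confirm that every one of the 39 haplotypes is assigned to exactly one haplogroup. The mapping is transcribed from the text; the names `CORRESPONDENCE` and `haplogroup_of` are ours.

```python
# Underhill et al. (2000) haplogroup -> haplotype numbers in this survey.
CORRESPONDENCE = {
    "I": range(1, 5),
    "II": range(5, 11),
    "III": range(13, 16),
    "IV": range(11, 13),
    "V": range(16, 19),
    "VI": range(19, 24),
    "VII": [25, 26, 27, 29, 30, 31, 32],
    "VIII": [24, 28, 33, 34, 35],
    "IX": range(36, 39),
    "X": [39],
}

def haplogroup_of(haplotype):
    """Return the haplogroup label that contains the given haplotype number."""
    for group, members in CORRESPONDENCE.items():
        if haplotype in members:
            return group
    raise KeyError(haplotype)

# Every haplotype h1-h39 should appear once and only once.
assigned = sorted(h for members in CORRESPONDENCE.values() for h in members)
print(assigned == list(range(1, 40)))  # True
print(haplogroup_of(28))               # VIII
```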

Haplotype Diversity

Diversity statistics for the 10 regional and 5 continental groups are presented in table 2 . The number of regional haplotypes (k) ranged from 10 in South Asians to 18 in East Asians. Regional haplotype diversity values (h) ranged from 0.605 in Native Americans to 0.878 in Central Asians, while the mean number of pairwise differences (p) ranged from 1.31 in Native Americans to 3.93 in sub-Saharan Africans. Four distinct patterns appear when two diversity statistic values for the 10 regional groups in table 2 are compared: (1) low p/low h, (2) high p/high h, (3) high p/low h, and (4) moderate p/high h. The Americas represent the only region that exhibits the first pattern of concordantly low p and low h values. The low p value occurs here because 90% of Native American Y-chromosome lineages are one-step neighbors restricted to haplotypes 36–39 (magenta in fig. 1 ), while the low h value is due to the fact that 57% of the Native Americans in our study have haplotype 39 (table 1 ). The second pattern, where a high p value is accompanied by a concordantly high h value, is seen only in Europeans. This pattern reflects intermediate frequencies of relatively divergent haplotypes found in different parts of the tree (i.e., blue in fig. 1 ). The third pattern, where a high p value is discordantly combined with a low h value, characterizes both African regional groups. In sub-Saharan Africa, the extremely high p value is influenced by the marked divergence among the dark green haplotypes in figure 1 . The contrasting relatively low h value may occur because 45% of sub-Saharan African Y chromosomes exhibit a single haplotype (h15). Likewise, 50% of the North Africans have a single (but different) haplotype (h14), resulting in a low h value, while the rather high North African p value is associated with the occurrence of a diverse set of lineages (light green in fig. 1 ). Finally, Asian and Oceanian populations exhibit the fourth pattern, moderate p and high h values. 
Although their p values are moderate, the Central and East Asians have the highest h values among the 10 regions. These high h values are probably due to the lack of any predominant Central or East Asian haplotype (red and orange, respectively in fig. 1 ).
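The link between a predominant haplotype and a low h value can be made explicit. Assuming the usual estimator h = n/(n − 1) × (1 − Σ p_i²), a single haplotype at frequency f forces Σ p_i² to be at least f², so h cannot exceed n/(n − 1) × (1 − f²). The sketch below evaluates this ceiling for the three regions discussed above, using the frequencies and sample sizes given in the text; the function name is ours.

```python
def h_ceiling(dominant_freq, n):
    """Upper bound on Nei's h when one haplotype has frequency dominant_freq."""
    return n / (n - 1) * (1 - dominant_freq ** 2)

# (frequency of the most common haplotype, regional sample size)
regions = {
    "Americas": (0.57, 273),
    "sub-Saharan Africa": (0.45, 229),
    "North Africa": (0.50, 131),
}

ceilings = {name: h_ceiling(f, n) for name, (f, n) in regions.items()}
for name, value in ceilings.items():
    print(f"{name}: h <= {value:.3f}")
```

For the Americas the ceiling is about 0.68, consistent with the observed value of 0.605. Regions without a predominant haplotype face no comparable constraint.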

Analysis of Molecular Variance

According to Excoffier, Smouse, and Quattro (1992, p. 482), ΦST can be viewed as the correlation of random haplotypes within populations relative to that of random pairs of haplotypes drawn from the whole species, while ΦCT is the correlation of random haplotypes within a group of populations relative to that of random pairs of haplotypes drawn from the whole species, and ΦSC is the correlation of the molecular diversity of random haplotypes within populations relative to that of random haplotypes drawn from the region. Table 3 presents variance components and Φ statistics at five different grouping levels, which summarize the geographic partitioning of NRY diversity. When all 50 populations were combined into a global analysis, 64% of the variance was within populations, and thus ΦST = 0.36. The Φ statistics at the 10 regional and 5 continental levels were very similar. However, when the world was partitioned into three groups (Africa/Americas/“rest of the world”), all three Φ statistics increased noticeably in value. When each continental grouping was contrasted with the remainder of the world (table 3, analyses 5–9), the African/non-African comparison exhibited the highest ΦST and ΦCT values, but the lowest ΦSC value. In the American/non-American comparison, the ΦST and ΦCT values were second highest, while the ΦSC value was second lowest. In contrast, the three Φ statistics for the European/non-European, Asian/non-Asian, and Oceanian/non-Oceanian comparisons were nearly identical. Their ΦST values corresponded closely to the global and 10 regional ΦST values. However, their ΦCT values dropped ∼10-fold compared with the 10-, 5-, and 3-group partitionings, while their ΦSC values were the highest of any of the comparisons. The combined results suggest that while African and Native American populations were the most differentiated from the other groups, European, Asian, and Oceanian populations were practically indistinguishable from each other on the basis of all three Φ-statistic patterns.
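The relationship between the variance components and the three Φ statistics can be illustrated with a short calculation. The sketch assumes the standard three-level AMOVA decomposition of Excoffier, Smouse, and Quattro (1992); apart from the 64% within-populations share, the component values are invented for illustration and are not taken from table 3.

```python
def phi_statistics(among_groups, among_pops_within_groups, within_pops):
    """Phi statistics from the three AMOVA variance components."""
    total = among_groups + among_pops_within_groups + within_pops
    return {
        "phi_CT": among_groups / total,
        "phi_SC": among_pops_within_groups / (among_pops_within_groups + within_pops),
        "phi_ST": (among_groups + among_pops_within_groups) / total,
    }

# Hypothetical split of the 36% among-populations variance into two levels.
phi = phi_statistics(among_groups=0.20, among_pops_within_groups=0.16, within_pops=0.64)
print(phi)

# The three statistics are linked: (1 - phi_ST) = (1 - phi_SC) * (1 - phi_CT).
lhs = 1 - phi["phi_ST"]
rhs = (1 - phi["phi_SC"]) * (1 - phi["phi_CT"])
print(abs(lhs - rhs) < 1e-12)  # True
```

Because of this identity, regrouping the same populations shifts variance between the among-groups and within-groups levels without changing the within-populations share, which is why ΦCT and ΦSC move in opposite directions across the groupings compared above.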

Multidimensional Scaling

Figure 2 shows the results of multidimensional scaling based on ΦST genetic distances, as well as ΦCT values for the three intercluster comparisons. The correlation between the original ΦST molecular distance matrix and a Euclidean distance matrix derived from the two-dimensional plot was extremely high (r = 0.983). Africa, the Americas, and the 36 remaining populations formed three distinct clusters, paralleling the results of AMOVA.
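The goodness-of-fit figure quoted above is a correlation between two sets of pairwise distances: the original genetic distances and the straight-line distances between points in the two-dimensional plot. A minimal sketch of that comparison follows; the population labels, distances, and coordinates are invented, and the code does not perform the scaling itself.

```python
from itertools import combinations
from math import dist, sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))

def embedding_fit(distance, coords):
    """Correlate original pairwise distances with distances in the 2-D plot."""
    pairs = list(combinations(sorted(coords), 2))
    original = [distance[frozenset(pair)] for pair in pairs]
    embedded = [dist(coords[a], coords[b]) for a, b in pairs]
    return pearson(original, embedded)

# Invented example: four populations, their plot coordinates, and distances.
coords = {"A": (0.0, 0.0), "B": (3.0, 0.0), "C": (0.0, 4.0), "D": (3.0, 4.0)}
distance = {
    frozenset("AB"): 3.2, frozenset("AC"): 3.9, frozenset("AD"): 5.1,
    frozenset("BC"): 4.8, frozenset("BD"): 4.1, frozenset("CD"): 2.9,
}
print(round(embedding_fit(distance, coords), 3))
```

A value close to 1 indicates that little information is lost by displaying the distance matrix in two dimensions.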

Nested Cladistic Analysis

Figure 3 displays the nested cladogram for the 39 haplotypes (plus the five not found in our survey) generated by applying the nesting rules given in Templeton, Boerwinkle, and Sing (1987) and Templeton and Sing (1993). This nesting methodology produced 19 one-step clades, 9 two-step clades, 3 three-step clades, and a four-step clade that nested the entire cladogram. The NCA indicated highly statistically significant associations between clades and geographic locations for the entire cladogram (P < 0.0001). Out of a total of 32 nested clades, 21 exhibited statistically significant associations with geography (data not shown). When the null hypothesis of no association between haplotype and geography is rejected, the analysis continues by generating specific explanatory inferences involving population history and/or structure considerations. With the aid of the key published on the GeoDis 2.0 website (http://bioag.byu.edu/zoology/crandall_lab/geodis.htm) (Posada, Crandall, and Templeton 2000), we were able to infer the probable causes of these 21 patterns (table 4). Population structure inferences included both recurrent gene flow restricted by isolation by distance (RGF/IBD; n = 10 inferences), as well as long-distance dispersal processes (LDD; n = 1). Population history events included contiguous range expansions (CRE; n = 6) and long-distance colonizations (LDC; n = 4). Of these 21 inferences, 12 involved an intercontinental geographic pattern, while 9 were limited to a single continent. Some of the intercontinental signals affected more than two geographic regions, and some were redundant because they were detected at different nesting levels in the cladogram. As a result, the 12 intercontinental signals were deemed to represent only 11 separate entities (6 range expansions/long-distance colonization events and 5 recurrent gene flow/long-distance dispersal processes), all but 1 of which are illustrated by arrows in figure 4. Similarly, the nine intracontinental signals were deemed to represent only eight separate signals (shown as arrows within circles in fig. 4). The remaining intercontinental signal involving global gene flow (via isolation by distance) is not depicted in figure 4 because it had no inherent polarity. As in our earlier global nested cladistic analyses (Hammer et al. 1998; Karafet et al. 1999), a contiguous range expansion out of Africa was detected at the level of the total cladogram (denoted by the widest black arrow in fig. 4).

Discussion

Evolutionary Tree for NRY Haplotypes

The root of the gene tree in figure 1 fell between two sets of haplotypes (h1–h4 and h5–h10) that are entirely restricted to the continent of Africa (table 1 ), thereby supporting the hypothesis of an African origin of contemporary NRY lineages (Hammer et al. 1998 ; Underhill et al. 2000 ). Haplotype 1 represents the “ancestral” human NRY haplotype (previously designated haplotype 1A in Hammer et al. 1998 ), which is found at relatively high frequencies in Khoisan populations. Haplotype 2, a one-step neighbor of h1, was found in ∼14% of Ethiopian chromosomes in this study, as well as in 42% of Underhill et al.'s (1997) Sudanese samples. The predominantly East African distribution of h1 and h2 (data not shown) lends support to the hypothesis of a remnant archaic gene pool along the African rift and a possible wider East African range for the ancestors of the Khoisan (Scozzari et al. 1999 ). Other than the two aforementioned sets of haplotypes confined to Africa and a set of YAP+ haplotypes (h13–h15), all haplotypes in the remainder of the tree were absent or very rare in sub-Saharan African populations.

In contrast to other kinds of genetic data (Przeworski, Hudson, and Di Rienzo 2000) , both our present NRY tree and that of Underhill et al. (2000) clearly indicate that haplotypes found outside of Africa are not a subset of those found within Africa. However, the NRY tree does show a branching pattern similar to that seen in the gene trees of several other loci: African-specific branches are found on both sides of the root of the tree and are separated from the remaining sets of African and non-African branches (Labuda, Zietkiewicz, and Yotova 2000) .

Apportionment of NRY Biallelic Diversity

This study represents the most extensive Φ-statistic analysis utilizing Y-chromosome biallelic markers to date combining sample size, geographic coverage, and number of markers. Caution should be exercised when comparing variance partitions among studies because results will depend on which populations are sampled, how the populations are grouped (nested), and what underlying models of population structure are assumed (Urbanek, Goldman, and Long 1996 ). With this caveat in mind, a sample of previously reported total among-groups variation values (whether measured by ΦST, _F_ST, or _G_ST) for Y-specific RFLPs and/or SNPs was found to range from 0.230 to 0.645 (Hammer et al. 1997 ; Poloni et al. 1997 ; Seielstad, Minch, and Cavalli-Sforza 1998 ; Kittles et al. 1999 ; Jorde et al. 2000 ). Our global ΦST value of 0.360 is only slightly lower than the mean value (0.413) calculated from the five studies cited above. Poloni et al. (1997) cautioned that their _F_ST value of 0.230 may be an underestimate in part because of recurrent mutation acting on the p49a,f/_Taq_I polymorphic system. It is not clear why the _F_ST value of 0.645 reported by Seielstad, Minch, and Cavalli-Sforza (1998) based on the data presented in Underhill et al. (1997) is so much higher than the value reported here. When we analyzed the Underhill et al. (1997) data set using the 10 population groupings provided in their figure 2 , we obtained a ΦST value of 0.540 and an _F_ST value of 0.414.

In general, the within-populations variance component for Y-chromosome data is much smaller than the values reported for mtDNA (Excoffier, Smouse, and Quattro 1992 ; Seielstad, Minch, and Cavalli-Sforza 1998 ; Kittles et al. 1999 ; Jorde et al. 2000 ). On the other hand, the among-groups and the among-populations-within-groups component values for Y chromosomes usually exceed those for mtDNA. Seielstad, Minch, and Cavalli-Sforza (1998) appealed to a lower transgenerational migration rate for males as the major explanatory factor for why their Y-chromosome _F_ST value (0.645) was so much higher than the mtDNA-based value they recalculated from Excoffier, Smouse, and Quattro's (1992) data (0.186). Their estimated eightfold higher female migration rate was attributed to patrilocality operating primarily at the local and perhaps regional levels (Seielstad, Minch, and Cavalli-Sforza 1998 ). Although the majority of human societies practice patrilocality (Murdock 1967 ), it is unclear whether this effect extends to intercontinental and global levels (Stoneking 1998 ). In order to investigate this proposition, we analyzed our Y-chromosome data according to the grouping design presented in Excoffier, Smouse, and Quattro (1992) (albeit with different samples), thereby permitting a comparison of the Φ statistics generated from our Y-chromosome data with those derived from a global mtDNA RFLP data set. Although the “patrilocality effect” was extremely clear at the interregional level within continents (ΦSC ratio = 4.4), it was much less apparent at the intercontinental level (ΦCT ratio = 1.4) or at the global level (ΦST ratio = 1.7). It is even possible that the latter two ratios would be closer to one (or less than one) without the high mutation rate and accompanying homoplasy known to affect mtDNA restriction sites (Excoffier, Smouse, and Quattro 1992 ) and thought to depress _F_ST values (Jorde et al. 2000) . 
For instance, the at-least-10-fold-higher mtDNA mutation rate would be expected to increase the within-groups variance component and thus decrease the ΦST values for mtDNA relative to the NRY, thereby artificially inflating the corresponding NRY/mtDNA ratio (Jin and Chakraborty 1995 ). Numerous possible explanations exist for the discrepancy between the ΦSC ratio and the other two values. For instance, increased intercontinental male migration, decreased intercontinental female gene flow, and sex-specific demographic factors may all contribute to the Φ-statistic patterns (Stoneking 1998 ; Fix 1999 ; Karafet et al. 1999 ).
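The step from a difference in _F_ST to an "eightfold" difference in migration rate can be reconstructed under a simple assumption. The sketch below uses the haploid island-model equilibrium approximation, _F_ST ≈ 1/(1 + Nm), and assumes equal effective numbers of males and females; whether this is exactly the calculation performed by Seielstad, Minch, and Cavalli-Sforza (1998) is an assumption on our part. The two _F_ST values are those quoted above.

```python
def migrants_per_generation(fst):
    """Nm implied by FST under the haploid island-model approximation.

    Assumes FST = 1 / (1 + N*m) at migration-drift equilibrium.
    """
    return 1 / fst - 1

nm_mtdna = migrants_per_generation(0.186)  # mtDNA, maternally transmitted
nm_nry = migrants_per_generation(0.645)    # NRY, paternally transmitted

ratio = nm_mtdna / nm_nry
print(round(nm_mtdna, 2), round(nm_nry, 2), round(ratio, 1))
```

Under these assumptions the implied female-to-male ratio of Nm is close to eight. The result depends entirely on the island model and on the two _F_ST estimates being comparable, so any downward bias in the mtDNA value from recurrent mutation would inflate the ratio.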

Our AMOVA results support two tentative conclusions: (1) patrilocality effects are evident at local and regional scales rather than at the intercontinental and global levels of analysis, and (2) sole reliance on _F_ST values based on Wright's (1969) island model of population structure may result in distorted pictures of the geographic extent of the patrilocality effect and of the possible markedly different sex-specific migration rates noted above.

Tripartite Division of Global NRY Variation

The multidimensional scaling plot in figure 2 underscores the distinctiveness of Native American and African populations with respect to Eurasian and Oceanian populations seen in the AMOVA results (table 3). The pattern of low Native American NRY diversity is also concordant with the position of the Americas as an outlier in figure 2. These results fit a scenario whereby a combination of relatively recent colonization, repeated founder effects, small population sizes, and extensive intergenerational genetic drift is responsible for the distinctiveness of Native Americans with respect to their Asian forebears, as well as the remainder of the world (Karafet et al. 1999). As in many other genetic studies (Vigilant et al. 1991; Nei and Roychoudhury 1993; Cavalli-Sforza, Menozzi, and Piazza 1994), African populations occupy a distinct region of the multidimensional space in figure 2. However, unlike those of the Americas, sub-Saharan African populations are characterized by a diverse set of ancient haplotypes that are not shared globally (e.g., basal haplotypes h1–h10 in fig. 1) in combination with a set of more derived haplotypes that are widely shared within Africa and, again, are not shared globally (e.g., h13 and h15). Therefore, the distinctiveness of African populations better fits a scenario of African-specific lineage admixture (Labuda, Zietkiewicz, and Yotova 2000).
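For readers unfamiliar with ordination from a distance matrix, the following sketch shows how coordinates such as those in an MDS plot can be derived from pairwise genetic distances. It uses classical (Torgerson) metric scaling, which is a swapped-in technique: the reference list points to Kruskal's (1964) nonmetric procedure and NTSYS-pc, neither of which is reproduced here. The toy distance matrix is built from four made-up points and is not drawn from the ΦST distances of this study.

```python
import math
import random


def classical_mds(dist, dims=2, iters=500, seed=0):
    """Embed n items in `dims` dimensions from a symmetric n x n distance matrix."""
    n = len(dist)
    # Double-center the squared distances: B = -1/2 * J * D^2 * J.
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in sq]
    grand_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]

    def matvec(m, v):
        return [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]

    rng = random.Random(seed)
    coords = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        # Power iteration for the leading eigenvector of the (deflated) matrix.
        v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        length = math.sqrt(sum(x * x for x in v))
        v = [x / length for x in v]
        for _ in range(iters):
            w = matvec(b, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break  # nothing left to explain on this axis
            v = [x / norm for x in w]
        eigval = sum(x * y for x, y in zip(v, matvec(b, v)))  # Rayleigh quotient
        # Negative eigenvalues (non-Euclidean distances) contribute no axis.
        scale = math.sqrt(max(eigval, 0.0))
        for i in range(n):
            coords[i][d] = scale * v[i]
        # Deflate so the next pass converges on the next axis.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigval * v[i] * v[j]
    return coords


# Toy example (hypothetical, not study data): corners of a 3 x 4 rectangle.
toy_points = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0), (0.0, 4.0)]
toy_dist = [[math.dist(p, q) for q in toy_points] for p in toy_points]
coords = classical_mds(toy_dist)
```

The recovered coordinates match the input distances only up to rotation and reflection, which is why the orientation of an MDS plot carries no meaning and only the relative positions of populations should be interpreted.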

Although North Africa occupies a position relatively close to sub-Saharan Africa in figure 2, when traditional _F_ST and CHORD distance statistics were employed (data not shown), North Africa moved closer to the Middle Eastern and European portion of the central cluster, as might be expected from ethnohistoric connections between North Africa and the Middle East (Cavalli-Sforza, Menozzi, and Piazza 1994), thereby producing a pattern similar to that depicted in the maximum-likelihood network of Underhill et al. (2000).

What distinguishes our results in figure 2 from autosomal global genetic analyses is the particular subdivision pattern that emerges, wherein Africans and Native Americans occupy opposite ends of the plot, while populations from Europe, Asia, and Oceania form a large, central cluster. For instance, Cavalli-Sforza, Menozzi, and Piazza's (1994, p. 82) principal-components ordination shows Africa clearly differentiated from the rest of the world; however, the Americas fall within the northern Eurasian portion of their map, which is separated from a southern Asian/Oceanian cluster. In other surveys, Africa and Oceania are frequently positioned as the outliers (Nei and Roychoudhury 1993; Stoneking et al. 1997). One possible reason for the distinctiveness of the haploid NRY pattern compared with diploid autosomal patterns is the stronger effect of genetic drift because of the smaller effective population size of the NRY (i.e., NRY _N_e = 1/4 autosomal _N_e).
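The one-quarter relationship and its consequence for drift can be illustrated numerically. The sketch below assumes an idealized Wright-Fisher population with equal numbers of breeding males and females and no excess variance in reproductive success; the population size of 500 per sex and the 500-generation horizon are hypothetical values chosen only for illustration.

```python
def autosomal_ne(n_males: float, n_females: float) -> float:
    """Autosomal effective size with separate sexes: 4*Nm*Nf / (Nm + Nf)."""
    return 4.0 * n_males * n_females / (n_males + n_females)


def gene_copies(n_males: float, n_females: float):
    """Effective numbers of gene copies as (autosomal, NRY).

    Autosomes: two copies per individual, so 2 * Ne.
    NRY: one copy per male, none in females.
    """
    return 2.0 * autosomal_ne(n_males, n_females), float(n_males)


def diversity_retained(copies: float, generations: int) -> float:
    """Expected fraction of gene diversity left after drift alone."""
    return (1.0 - 1.0 / copies) ** generations


# Hypothetical population: 500 breeding males and 500 breeding females.
auto_copies, nry_copies = gene_copies(500, 500)
copy_ratio = nry_copies / auto_copies  # the 1/4 quoted in the text

h_auto = diversity_retained(auto_copies, 500)
h_nry = diversity_retained(nry_copies, 500)
print(f"NRY/autosomal copies = {copy_ratio}, "
      f"retained diversity: autosomal {h_auto:.3f}, NRY {h_nry:.3f}")
```

In this idealized setting the NRY loses diversity roughly four times as fast per generation as an autosomal locus. Departures from the assumptions, such as unequal numbers of breeding males and females or high variance in male reproductive success, would move the ratio away from one quarter.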

The very similar and extremely low ΦCT values for the European/non-European, Asian/non-Asian, and Oceanian/non-Oceanian comparisons in table 3, as well as the high ΦST and ΦCT values for the African/non-African, American/non-American, and Africa/Americas/“rest of the world” comparisons, coincide with the observed pattern of global Y-chromosome diversity portrayed in figure 2. When ΦCT values were calculated for the three intercluster comparisons in figure 2, the African/Native American comparison showed the largest between-groups differentiation (ΦCT = 0.549), while the central cluster was less differentiated from the Americas (ΦCT = 0.234) than from Africa (ΦCT = 0.350), in accord with ethnohistoric evidence (Cavalli-Sforza, Menozzi, and Piazza 1994; Hammer and Zegura 1996; Crawford 1998; Karafet et al. 1999; Cavalli-Sforza 2000).
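To make the relationship among the three Φ statistics concrete, the sketch below computes them from the three hierarchical variance components of an AMOVA, following the definitions of Excoffier, Smouse, and Quattro (1992). The component values are hypothetical and chosen only to show the arithmetic; they are not taken from table 3.

```python
def phi_statistics(var_among_groups: float,
                   var_among_pops_within_groups: float,
                   var_within_pops: float):
    """Return (phi_ct, phi_sc, phi_st) from AMOVA variance components."""
    a = var_among_groups
    b = var_among_pops_within_groups
    c = var_within_pops
    total = a + b + c
    phi_ct = a / total        # among groups, relative to the total
    phi_sc = b / (b + c)      # among populations, relative to their group
    phi_st = (a + b) / total  # among populations, relative to the total
    return phi_ct, phi_sc, phi_st


# Hypothetical variance components (not study data).
phi_ct, phi_sc, phi_st = phi_statistics(0.20, 0.10, 0.70)
print(f"PhiCT = {phi_ct:.3f}, PhiSC = {phi_sc:.3f}, PhiST = {phi_st:.3f}")

# The three statistics are linked: (1 - PhiST) = (1 - PhiSC) * (1 - PhiCT).
identity_gap = (1 - phi_st) - (1 - phi_sc) * (1 - phi_ct)
```

The identity in the last line explains why a grouping with a high ΦCT necessarily leaves less room for differentiation within groups for a given ΦST, which is the logic behind comparing alternative groupings of the same 50 populations.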

Nested Cladistic Analysis as a Synthetic Explanatory Tool

In order to understand the causal mechanisms underlying the pattern of NRY variation reflected in the MDS plot and AMOVA results, the spatial distribution of our global NRY database was investigated by NCA (Templeton, Routman, and Phillips 1995; Hammer et al. 1998). In figure 4, inter- and intracontinental population history events (contiguous range expansions and long-distance colonizations) are depicted by solid arrows, while population structure processes (recurrent gene flow restricted by isolation by distance and long-distance dispersals) are indicated by dashed arrows (and one dashed line between Asia and Oceania, for which polarity could not be inferred). It is clear from figure 4 that both population structure and history have played important roles in shaping patterns of global NRY variation.

One of the most notable findings from the NCA was the predominance of intercontinental signals emanating from Asia (fig. 4). These multiple out-of-Asia signals included gene flow episodes to Europe and the Americas, along with range expansions to Oceania, Africa, and Europe. In contrast, the NCA detected only two out-of-Africa signals. These NCA inferences help to explain the MDS plot (i.e., Asia's membership in the central cluster), the AMOVA results (i.e., lack of significant differentiation of Eurasian and Oceanian populations), and the diversity statistics (i.e., similar patterns of diversity in Asia and Oceania). Contrary to previously published studies of mtDNA (Redd et al. 1999) and autosomal markers (Harding et al. 1997; Stoneking et al. 1997), the NRY results suggest a strong affinity between mainland Asian and Oceanian populations. This different pattern may be due to either ascertainment bias in our NRY database or higher rates of male migration between Asia and Oceania. Support for the latter conjecture comes from the two long-distance colonization events, as well as a gene flow signal detected between Asia and Oceania in the NCA.

The fact that Europe is primarily a receiver rather than a sender of signals in figure 4 underscores the importance of gene flow/population movements into this continent. It also helps to explain Europe's central position in the MDS plot, its high h and p diversity statistic values, the concordant pattern of the three Φ statistics for the European/non-European and Asian/non-Asian comparisons, and the observation that Europe has the lowest continental ΦST value (see below). Interestingly, all incoming signals to Europe came from Asia. Two of these signals appear to have originated in Asia, one being a long-distance dispersal (from within nested clade 1-15) and the other being a contiguous range expansion (from within nested clade 1-11). The third was a gene flow signal that may have actually originated in Africa before moving to the Levant and eventually to Europe. This latter signal was postulated to result from the Neolithic demic diffusion of Levantine farmers into Europe (Hammer et al. 1998) and corresponds to Semino et al.'s (2000) Eu4 lineage. The former two signals may well correspond to the two proposed Paleolithic migratory episodes that contributed a major portion of the modern European paternal gene pool (Semino et al. 2000).

After an early out-of-Africa range expansion (widest arrow in fig. 4), the majority of signals involving Africa were intracontinental events and processes. To explain this in the context of the two different sets of NRY haplotypes in sub-Saharan Africa (i.e., an ancient set of haplotypes overlaid by derived shared haplotypes), one probably needs a layered temporal framework whereby, for instance, early subdivision with inferred gene flow between the Khoisan and Pygmies (e.g., table 4) is combined with later, more extensive gene flow and historical events such as the Bantu expansion (Cavalli-Sforza, Menozzi, and Piazza 1994). These inferences are compatible with the model put forward by Labuda, Zietkiewicz, and Yotova (2000) in which the gene pool of sub-Saharan Africans is seen to be composed of two clades that evolved separately and then eventually underwent hybridization.

Comparative Framework: NRY and mtDNA Patterns in Sub-Saharan Africa

Our results may help to inform the debate concerning the conflicting patterns observed in sub-Saharan African mitochondrial and nuclear DNA. A basic inconsistency has been noted concerning the relative branch lengths in population trees for sub-Saharan African and non-African populations (Jorde et al. 1995). In contrast to non-African populations, sub-Saharan African populations appear to be well differentiated in mtDNA-based trees, suggesting that they have been subdivided for an extended period (Mountain 1998). Population trees based on nuclear polymorphisms do not show this pattern: for example, sub-Saharan African populations appear to be more closely related, and non-African and sub-Saharan African branches are more comparable in length. This discrepancy has not been satisfactorily explained by models incorporating an ascertainment bias in nuclear polymorphisms, a higher substitution (and homoplasy) rate for mtDNA, limited sample sizes, or various population-level factors (e.g., size changes) (Jorde et al. 1995; Mountain 1998). Additional explanatory factors have been suggested, including lack of selective neutrality in mtDNA and differences in male versus female migration rates and/or effective sizes within sub-Saharan Africa (Jorde et al. 1995).

We examined this problem from the perspective of the NRY by undertaking two new analyses. We wanted to know (1) whether NRY-based genetic distances within sub-Saharan Africa were smaller than those for non-African locales, and (2) whether NRY data showed less differentiation among sub-Saharan African populations than did mtDNA data. First, ΦST genetic distances were calculated for each continent separately. Asia had the highest ΦST value (0.271), followed by Africa (0.222), the Americas (0.188), Oceania (0.133), and Europe (0.128). Thus, at least one non-African locale (Asia) had larger genetic distances than Africa. When sub-Saharan Africa was analyzed separately, its ΦST of 0.251 was still smaller than that of Asia. In contrast, mtDNA data typically show much greater among-groups variation for sub-Saharan African populations than for non-African groups. For instance, Melton et al. (1997) reported a ΦST value of 0.339 for sub-Saharan Africa, compared with values of 0.045 and 0.007 for Asian and European populations, respectively. The second new analysis consisted of an MDS plot for all 50 populations (fig. 5). Here, sub-Saharan African populations were more tightly clustered than they were in Excoffier et al.'s (1996) mtDNA-based plot, and the sub-Saharan African populations were also more tightly clustered than non-African populations, the exact opposite of the mtDNA pattern. Moreover, the overall pattern of sub-Saharan African NRY phylogeography is closer to other nuclear system results (Cavalli-Sforza, Menozzi, and Piazza 1994; Stoneking et al. 1997) than to those for its haploid mtDNA counterpart.

We favor an explanation for these contradictory mtDNA and NRY results that involves, at least in part, a higher male-versus-female migration rate in sub-Saharan Africa. Obviously, this suggestion conflicts with our previous statements about possible local and regional patterns of greater female migration for the rest of the world; however, it is concordant with the recent results of Carvajal-Carmona et al. (2000) and Mesa et al. (2000) on sex-specific gene flow patterns over the last few hundred years in Colombia, South America. The discordant NRY/mtDNA patterns also fit a model involving male-biased gene flow during the Bantu expansion, which in turn could have produced the widespread sub-Saharan African distribution of a single haplotype (h15) in clade 1-10 (previously called YAP+ haplotype 5 in Hammer et al. [1998]). The importance of African male migration is also underscored by the observation that four of the eight intracontinental signals detected in the NCA involve Africa. These gene flow episodes and two range expansions certainly add to the impression that male dispersal rates are quite high within Africa. It is also possible that other factors (such as selection; see next section) are responsible for producing the discordant picture of mtDNA and NRY results (Hey 1997).

Conclusions, Caveats, and Future Directions

Previous global nested cladistic analyses of human NRY variation (Hammer et al. 1998; Karafet et al. 1999) have demonstrated patterns of diversity unlike those provided by mtDNA (Templeton 1993, 1997, 1999) or autosomal systems (Harding et al. 1997). For instance, Templeton's (1993, 1997, 1999) nested cladistic analyses of human mtDNA data are all highlighted at the deepest level by pervasive gene flow restricted by isolation by distance throughout Africa and southern Eurasia for the entire time to the most recent common ancestor (TMRCA) of mtDNA. A similar extensive worldwide Late Pleistocene gene flow signal was detected at the β-globin locus (Harding et al. 1997; Templeton 1999), the only global autosomal data set analyzed by Templeton, Routman, and Phillips' (1995) nested cladistic procedures. In contrast, all three of our nested cladistic analyses detected a global contiguous range expansion out of Africa at the level of the entire cladogram. In the present NCA, the two deepest gene flow signals were only at the three-step level: one occurred globally, while the other was restricted to the continent of Africa. Our new results support a general scenario in which, after an early out-of-Africa range expansion, global-scale patterns of NRY variation were mainly influenced by migrations out of Asia. Moreover, the greater degree of contact detected by the NCA among Asia, Europe, and Oceania (via both population structure processes and population history events) helps to explain the observed pattern of global NRY diversity.

A major conclusion of the present work is that global human NRY variation is structured, with a significant amount of intergroup variation partitioned among African, Native American, and Eurasian/Oceanian populations. There was also a significant degree of among-populations variation at the intracontinental level; the degree of structure at lower levels of population subdivision remains to be determined.

It should be noted that the pattern of subdivision detected here could also be explained by models that involve natural selection or a combination of microevolutionary forces including selection, migration, genetic drift, and mutation. Additionally, various human social processes, such as polygyny and kin-structured migration, may affect variation on the NRY (Fix 1999). Support for a model involving selection comes from recent findings demonstrating an excess of rare alleles at sites on the NRY (Underhill et al. 1997; Pritchard et al. 1999; Shen et al. 2000; Thomson et al. 2000). Indeed, our mutation screening at the DYS188, DYS190, and DYS194 sites on a panel of 58 Y chromosomes from worldwide samples also yielded a significant excess of singleton polymorphisms. This excess of singletons (more than twofold greater than expected under the hypothesis of constant population size) resulted in a significantly negative Fu and Li's (1997) F* statistic of −2.67 (P < 0.05). These findings are consistent with models based on positive directional selection, expansion from a small population size, and/or ascertainment bias resulting from poor sampling of a subdivided population system. Obviously, more research needs to be focused on distinguishing the possible causes and implications of population subdivision in the human paternal gene pool.
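As a point of reference for the singleton excess, the expected share of singletons among segregating sites under a neutral, constant-size model can be computed directly. The sketch below assumes the standard infinite-sites result that the expected number of sites with i derived copies is θ/i, so that the expected number of segregating sites is θ·a_n with a_n = 1 + 1/2 + ... + 1/(n − 1). Only the sample size of 58 comes from the text; the observed singleton counts are not reported here, and the function names are illustrative.

```python
def a_n(n: int) -> float:
    """Sum of 1/i for i = 1 .. n-1 (the scaling constant for segregating sites)."""
    return sum(1.0 / i for i in range(1, n))


def expected_singleton_fraction(n: int, folded: bool = True) -> float:
    """Expected fraction of segregating sites that are singletons.

    folded=True counts any variant present in exactly one chromosome,
    whether ancestral or derived (expected number theta * n / (n - 1)).
    folded=False counts derived singletons only (expected number theta).
    theta cancels in the ratio.
    """
    if n < 3:
        raise ValueError("need at least three sampled chromosomes")
    singletons_per_theta = n / (n - 1.0) if folded else 1.0
    return singletons_per_theta / a_n(n)


N_CHROMOSOMES = 58  # panel size quoted in the text

frac_folded = expected_singleton_fraction(N_CHROMOSOMES, folded=True)
frac_derived = expected_singleton_fraction(N_CHROMOSOMES, folded=False)
print(f"expected singleton fraction: folded {frac_folded:.3f}, "
      f"derived only {frac_derived:.3f}")
```

Under these assumptions roughly one segregating site in five is expected to be a singleton in a sample of 58, so an observed proportion more than twice that size would represent a marked skew toward rare variants. The calculation does not distinguish among selection, population growth, and sampling effects, all of which can produce such a skew.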

It is important to note the limitations of the different methods employed in this study, as well as their complementary nature for inferring the underlying forces shaping NRY variation in human populations. The NCA is stronger at making inferences toward the interior of a cladogram and weaker at inferring processes/events at the tips. Therefore, as more polymorphisms are discovered (see, e.g., Underhill et al. 2000) and the NRY tree becomes more resolved, more inferences concerning regional variation will emerge. It is also possible that some of the inferences made here will change as more data are collected. Consequently, we have mainly focused on general patterns and have not tried to explain all of the specific signals detected by the NCA. Finally, these methods do not distinguish selection from demographic forces in shaping patterns of diversity.

Standard approaches for the description of population structure based on Wright's (1969) island model and/or F statistics (e.g., AMOVA) do not attempt to disentangle past events from contemporary processes and thus can be considered nonhistorical (Turner et al. 2000). The combination of nested cladistic and coalescence analyses can theoretically provide the temporal framework for making these crucial distinctions (Schaal and Olsen 2000). For instance, coalescence analysis could provide the missing dates needed to clarify the relative chronology of the many signals in figure 4. Our two previous coalescence analyses of Y-chromosome data (Hammer et al. 1998; Karafet et al. 1999) were performed without population growth or subdivision parameters in the model. Growth has been shown to decrease TMRCA estimates (Pritchard et al. 1999; Thomson et al. 2000). We are presently collaborating with R. C. Griffiths, who is developing coalescence analyses incorporating both population growth and subdivision. Population growth should decrease our previously published mutational ages and TMRCA estimates (Hammer et al. 1998; Karafet et al. 1999), while population subdivision should have the opposite effect.

Keith Crandall, Reviewing Editor

Keywords: subdivision, patrilocality, gene flow, male migrations

Address for correspondence and reprints: Michael F. Hammer, Laboratory of Molecular Systematics and Evolution, Biosciences West room 239, University of Arizona, Tucson, Arizona 85721. E-mail: mhammer@u.arizona.edu.

Table 1 NRY Haplotype Frequencies in 10 Regional Groups

Table 2 NRY Haplotype Diversity in 10 Regional and 5 Continental Groups

Table 3 Structure of Y-Chromosome Haplotype Variation for 50 Worldwide Populations

Table 4 Main Inferences from Results of Nested Cladistic Analysis

Fig. 1.—Evolutionary tree for 39 NRY haplotypes. The 43 mutational events listed in table 1 are shown by cross-hatches. Haplotypes numbered h1–h39 in circles correspond to designations in table 1 . The root of the genetree is denoted by an arrow. Haplotypes are color-coded by geography (see figure for color-coding key). The pie charts represent the frequencies of occurrence of the haplotypes within each of the 10 geographic regions listed in table 1 . The overall size of each pie chart corresponds to one of five frequency classes (see figure for frequency class key) and represents the frequency of that haplotype in the global sample of 2,858 chromosomes

Fig. 2.—MDS plot of 10 regional populations based on ΦST genetic distances, and ΦCT values for three intercluster comparisons. For three-letter population codes, see Materials and Methods.

Fig. 3.—Nested cladistic design for 39 NRY haplotypes. The 43 mutational events listed in table 1 are shown by cross-hatches. Haplotypes h1–h39 are named as shown in table 1 . The root of the cladogram is denoted by an arrow. Filled circles represent haplotypes that were missing in this sample of Y chromosomes. Ovals contain one-step clades which are designated 1-1 through 1-19. Rectangles contain two-step clades which are designated 2-1 through 2-9. Rounded rectangles contain three-step clades which are designated 3-1 through 3-3. A single four-step clade (4-1) encompasses the entire cladogram

Fig. 4.—Inferences from nested cladistic analysis of Y-chromosome data. Intercontinental signals are indicated by arrows between continent ideograms (note: arrows are not meant to indicate routes of migration), and intracontinental signals are shown by arrows within circles (empty circles for Europe and the Americas denote the absence of intracontinental signals). Solid arrows represent population history events (contiguous range expansions and long-distance colonizations), while population structure processes (recurrent gene flow restricted by isolation by distance and long-distance dispersals) are indicated with dashed arrows (and, in one instance, a dashed line between Asia and Oceania where no polarity could be inferred). The widest solid arrow denotes early range expansion out of Africa at the level of the total cladogram

Fig. 5.—MDS plot of 50 populations based on ΦST genetic distances

We gratefully acknowledge the excellent laboratory assistance of Roxane Bonner, Matthew Kaplan, Agnish Chakravarti, Christine Ponder, Ji Park, Abdel-Halim Salem, Hwayong Park, and Ammon Corl. We thank Elizabeth Wood, Tasha Altheide, Christopher Tillquist, Robert Griffiths, Alan Templeton, and Michael Nachman for helpful discussions and comments on earlier versions of the manuscript. We also thank our generous collaborators who provided DNA samples. This publication was made possible by grant GM-53566 from the National Institute of General Medical Sciences and grant OPP-9806759 from the National Science Foundation (to M.F.H.). Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the NIH or the NSF.

References

Allen B. S., H. Ostrer. 1994. Conservation of human Y chromosome sequences among male great apes: implications for the evolution of Y chromosomes. J. Mol. Evol. 39:13–21.

Bao W., S. Zhu, A. Pandya, T. Zerjal, J. Xu, Q. Shu, R. Du, H. Yang, C. Tyler-Smith. 2000. MSY2: a slowly evolving minisatellite on the human Y chromosome which provides a useful polymorphic marker in Chinese populations. Gene 244:29–33.

Barbujani G., A. Magagni, E. Minch, L. L. Cavalli-Sforza. 1997. An apportionment of human DNA diversity. Proc. Natl. Acad. Sci. USA 94:4516–4519.

Batzer M. A., M. Stoneking, M. Alegria-Hartman, et al. (11 co-authors). 1994. African origin of human-specific polymorphic Alu insertions. Proc. Natl. Acad. Sci. USA 91:12288–12292.

Bowcock A., L. L. Cavalli-Sforza. 1991. The study of variation in the human genome. Genomics 11:491–498.

Carvajal-Carmona L. G., I. D. Soto, N. Pineda, et al. (11 co-authors). 2000. Strong Amerind/White sex bias and a possible Sephardic contribution among the founders of a population in northwest Colombia. Am. J. Hum. Genet. 67:1287–1295.

Cavalli-Sforza L. L. 2000. Genes, peoples, and languages. North Point Press, New York.

Cavalli-Sforza L. L., P. Menozzi, A. Piazza. 1994. The history and geography of human genes. Princeton University Press, Princeton, N.J.

Cavalli-Sforza L. L., E. Minch. 1997. Paleolithic and Neolithic lineages in the European mitochondrial gene pool. Am. J. Hum. Genet. 61:247–254.

Crawford M. H. 1998. The origins of Native Americans: evidence from anthropological genetics. Cambridge University Press, Cambridge, England.

Deka R., M. D. Shriver, L. M. Yu, R. E. Ferrell, R. Chakraborty. 1995. Intra- and inter-population diversity at short tandem repeat loci in diverse populations of the world. Electrophoresis 16:1659–1664.

Excoffier L., E. S. Poloni, S. Santachiara-Benerecetti, O. Semino, A. Langaney. 1996. The molecular diversity of the Niokholo Mandenkalu from Eastern Senegal: an insight into West Africa genetic history. Pp. 141–155 in A. J. Boyce and C. G. N. Mascie-Taylor, eds. Molecular biology and human diversity. Cambridge University Press, Cambridge, England.

Excoffier L., P. E. Smouse, J. M. Quattro. 1992. Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data. Genetics 131:479–491.

Fix A. 1999. Migration and colonization in human microevolution. Cambridge University Press, New York.

Fu Y., W.-H. Li. 1997. Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics 147:915–925.

Hammer M. F. 1995. A recent common ancestry for human Y-chromosomes. Nature 378:376–378.

Hammer M. F., S. Horai. 1995. Y chromosomal DNA variation and the peopling of Japan. Am. J. Hum. Genet. 56:951–962.

Hammer M. F., T. Karafet, A. Rasanayagam, E. T. Wood, T. K. Altheide, T. Jenkins, R. C. Griffiths, A. R. Templeton, S. L. Zegura. 1998. Out of Africa and back again: nested cladistic analysis of human Y chromosome variation. Mol. Biol. Evol. 15:427–441.

Hammer M. F., A. J. Redd, E. T. Wood, et al. (12 co-authors). 2000. Jewish and middle eastern non-Jewish populations share a common pool of Y-chromosome biallelic haplotypes. Proc. Natl. Acad. Sci. USA 97:6769–6774.

Hammer M. F., A. B. Spurdle, T. Karafet, et al. (11 co-authors). 1997. The geographic distribution of human Y chromosome variation. Genetics 145:787–805.

Hammer M. F., S. L. Zegura. 1996. The role of the Y chromosome in human evolutionary studies. Evol. Anthropol. 5:116–134.

Harding R. M., S. M. Fullerton, R. C. Griffiths, J. Bond, M. J. Cox, J. A. Schneider, D. S. Moulin, J. B. Clegg. 1997. Archaic African and Asian lineages in the genetic ancestry of modern humans. Am. J. Hum. Genet. 60:772–789.

Hey J. 1997. Mitochondrial and nuclear genes present conflicting portraits of human origins. Mol. Biol. Evol. 14:166–172.

Jin L., R. Chakraborty. 1995. Population structure, stepwise mutations, heterozygote deficiency and their implications in DNA forensics. Heredity 74:274–285.

Jobling M. A., V. Samara, A. Pandya, et al. (16 co-authors). 1996. Recurrent duplication and deletion polymorphisms on the long arm of the Y chromosome in normal males. Hum. Mol. Genet. 5:1767–1775.

Jobling M. A., C. Tyler-Smith. 1995. Fathers and sons: the Y chromosome and human evolution. Trends Genet. 11:449–456.

Jorde L. B., M. J. Bamshad, W. S. Watkins, R. Zenger, A. E. Fraley, P. A. Krakowiak, K. D. Carpenter, H. Soodyall, T. Jenkins, A. R. Rogers. 1995. Origins and affinities of modern humans: a comparison of mitochondrial and nuclear genetic data. Am. J. Hum. Genet. 57:523–538.

Jorde L. B., W. S. Watkins, M. J. Bamshad, M. E. Dixon, C. E. Ricker, M. T. Seielstad, M. A. Batzer. 2000. The distribution of human genetic diversity: a comparison of mitochondrial, autosomal, and Y-chromosome data. Am. J. Hum. Genet. 66:979–988.

Karafet T. M., S. L. Zegura, O. Posukh, et al. (14 co-authors). 1999. Ancestral Asian source(s) of New World Y-chromosome founder haplotypes. Am. J. Hum. Genet. 64:817–831.

Karafet T., S. L. Zegura, J. Vuturo-Brady, et al. (14 co-authors). 1997. Y chromosome markers and trans-Bering Strait dispersals. Am. J. Phys. Anthropol. 102:301–314.

Kittles R. A., A. W. Bergen, M. Urbanek, M. Virkkunen, M. Linnoila, D. Goldman, J. C. Long. 1999. Autosomal, mitochondrial, and Y chromosome DNA variation in Finland: evidence for a male-specific bottleneck. Am. J. Phys. Anthropol. 108:381–399.

Kruskal J. B. 1964. Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika 29:1–27.

Labuda D., E. Zietkiewicz, V. Yotova. 2000. Archaic lineages in the history of modern humans. Genetics 156:799–808.

Latter B. D. H. 1980. Genetic differences within and between populations of the major human subgroups. Am. Nat. 116:220–237.

Lewontin R. C. 1972. The apportionment of human diversity. Evol. Biol. 6:381–398.

Melton T., C. Ginther, G. Sensabaugh, H. Soodyall, M. Stoneking. 1997. Extent of heterogeneity in mitochondrial DNA of sub-Saharan African populations. J. Forensic Sci. 42:582–592.

Mesa N. R., M. C. Mondragon, I. D. Soto, et al. (13 co-authors). 2000. Autosomal, mtDNA, and Y-chromosome diversity in Amerinds: pre- and post-Columbian patterns of gene flow in South America. Am. J. Hum. Genet. 67:1277–1286.

Mountain J. L. 1998. Molecular evolution and modern human origins. Evol. Anthropol. 7:21–37.

Murdock G. P. 1967. Ethnographic atlas. University of Pittsburgh Press, Pittsburgh, Pa.

Nei M. 1987. Molecular evolutionary genetics. Columbia University Press, New York.

Nei M., A. K. Roychoudhury. 1974. Genic variation within and between the three major races of man, Caucasoids, Negroids, and Mongoloids. Am. J. Hum. Genet. 26:421–443.

Nei M., A. K. Roychoudhury. 1993. Evolutionary relationships of human populations on a global scale. Mol. Biol. Evol. 10:927–943.

Perez-Lezaun A., F. Calafell, D. Comas, et al. (12 co-authors). 1999. Sex-specific migration patterns in Central Asian populations, revealed by analysis of Y-chromosome short tandem repeats and mtDNA. Am. J. Hum. Genet. 65:208–219.

Poloni E. S., O. Semino, G. Passarino, A. S. Santachiara-Benerecetti, I. Dupanloup, A. Langaney, L. Excoffier. 1997. Human genetic affinities for Y-chromosome P49a,f/_Taq_I haplotypes show strong correspondence with linguistics. Am. J. Hum. Genet. 61:1015–1035.

Posada D., K. A. Crandall, A. R. Templeton,

2000

GeoDis: a program for the cladistic nested analysis of the geographical distribution of genetic haplotypes

Mol. Ecol

9

:

487

-488

Pritchard J. K., M. T. Seielstad, A. Perez-Lezaun, M. W. Feldman,

1999

Population growth of human Y chromosomes: a study of Y chromosome microsatellites

Mol. Biol. Evol

16

:

1791

-1798

Przeworski M., R. R. Hudson, A. Di Rienzo,

2000

Adjusting the focus on human variation

Trends Genet

16

:

296

-302

Qamar R., Q. Ayub, S. Khaliq, A. Mansoor, T. Karafet, S. Q. Mehdi, M. F. Hammer,

1999

African and Levantine origins of Pakistani YAP+ Y chromosomes

Hum. Biol

71

:

745

-755

Redd A. J., M. Stoneking,

1999

Peopling of Sahul: mtDNA variation in aboriginal Australian and Papua New Guinean populations

Am. J. Hum. Genet

65

:

808

-828

Relethford J. H.,

1995

Genetics and modern human origins

Evol. Anthropol

4

:

53

-63

Relethford J. H., H. C. Harpending,

1994

Craniometric variation, genetic theory, and modern human origins

Am. J. Phys. Anthropol

95

:

249

-270

Rohlf F. J.,

1998

NTSYS-pc: numerical taxonomy and multivariate analysis system. Release 2.02H. Exeter Software, Setauket, N.Y

Santos F. R., A. Pandya, M. Kayser, et al. (13 co-authors)

2000

A polymorphic L1 retroposon insertion in the centromere of the human Y chromosome

Hum. Mol. Genet

9

:

421

-430

Schaal B. A., K. M. Olsen,

2000

Gene genealogies and population variation in plants

Proc. Natl. Acad. Sci. USA

97

:

7024

-7029

Schneider S., J.-M. Kueffer, D. Roessli, L. Excoffier,

1998

Arlequin: a software for population genetic analysis. Release 1.1. Genetics and Biometry Laboratory, University of Geneva, Geneva, Switzerland

Scozzari R., F. Cruciani, P. Santolamazza, et al. (17 co-authors)

1999

Combined use of biallelic and microsatellite Y-chromosome polymorphisms to infer affinities among African populations

Am. J. Hum. Genet

65

:

829

-846

Seielstad M. T., E. Minch, L. L. Cavalli-Sforza,

1998

Genetic evidence for a higher female migration rate in humans

Nat. Genet

20

:

278

-280

Semino O., G. Passarino, P. J. Oefner, et al. (17 co-authors)

2000

The genetic legacy of Paleolithic Homo sapiens sapiens in extant Europeans: a Y chromosome perspective

Science

290

:

1155

-1159

Sheffield V. C., J. S. Beck, A. E. Kwitek, D. W. Sandstrom, E. M. Stone,

1993

The sensitivity of single-strand conformation polymorphism analysis for the detection of single base changes

Genomics

16

:

325

-332

Shen P., F. Wang, P. A. Underhill, et al. (13 co-authors)

2000

Population genetic implications from sequence variation in four Y chromosome genes

Proc. Natl. Acad. Sci. USA

97

:

7354

-7359

Shinka T., K. Tomita, T. Toda, S. E. Kotliarova, J. Lee, Y. Kuroki, D. K. Jin, K. Tokunaga, H. Nakamura, Y. Nakahori,

1999

Genetic variations on the Y chromosome in the Japanese population and implications for modern human Y chromosome lineage

J. Hum. Genet

44

:

240

-245

Sommer S. S., A. R. Groszbach, C. D. Bottema,

1992

PCR amplification of specific alleles (PASA) is a general method for rapidly detecting known single-base changes

Biotechniques

12

:

82

-87

Stoneking M.,

1998

Women on the move

Nat. Genet

20

:

219

-220

Stoneking M., J. J. Fontius, S. L. Clifford, H. Soodyall, S. S. Arcot, N. Saha, T. Jenkins, M. A. Tahir, P. L. Deininger, M. A. Batzer,

1997

Alu insertion polymorphisms and human evolution: evidence for a larger population size in Africa

Genome Res

7

:

1061

-1071

Su B., J. Xiao, P. Underhill, et al. (21 co-authors)

1999

Y-Chromosome evidence for a northward migration of modern humans into Eastern Asia during the last Ice Age

Am. J. Hum. Genet

65

:

1718

-1724

Swofford D.,

2000

PAUP: phylogenetic analysis using parsimony. Release 4.0b4. Sinauer, Sunderland, Mass

Templeton A. R.,

1993

The “Eve” hypothesis: a genetic critique and reanalysis

Am. Anthropol

95

:

51

-72

———.

1997

Testing the out-of-Africa replacement hypothesis with mitochondrial DNA data. Pp. 329–360 in G. A. Clark and C. Willermet, eds. Conceptual issues in modern human origins research. Aldine de Gruyter, Amsterdam

———.

1999

Human races: a genetic and evolutionary perspective

Am. Anthropol

100

:

632

-650

Templeton A. R., E. Boerwinkle, C. F. Sing,

1987

A cladistic analysis of phenotypic associations with haplotypes inferred from restriction endonuclease mapping. I. Basic theory and an analysis of alcohol dehydrogenase activity in Drosophila.

Genetics

117

:

343

-351

Templeton A. R., E. Routman, C. A. Phillips,

1995

Separating population structure from population history: a cladistic analysis of the geographical distribution of mitochondrial DNA haplotypes in the tiger salamander, Ambystoma tigrinum.

Genetics

140

:

767

-782

Templeton A. R., C. F. Sing,

1993

A cladistic analysis of phenotypic associations with haplotypes inferred from restriction endonuclease mapping. IV. Nested analyses with cladogram uncertainty and recombination

Genetics

134

:

659

-669

Thomson R., J. K. Pritchard, P. Shen, P. J. Oefner, M. W. Feldman,

2000

Recent common ancestry of human Y chromosomes: evidence from DNA sequence data

Proc. Natl. Acad. Sci. USA

97

:

7360

-7365

Turner T. F., J. C. Trexler, J. L. Harris, J. L. Haynes,

2000

Nested cladistic analysis indicates population fragmentation shapes genetic diversity in a freshwater mussel

Genetics

154

:

777

-785

Underhill P. A., L. Jin, A. A. Lin, S. Q. Mehdi, T. Jenkins, D. Vollrath, R. W. Davis, L. L. Cavalli-Sforza, P. J. Oefner,

1997

Detection of numerous Y chromosome biallelic polymorphisms by denaturing high-performance liquid chromatography

Genome Res

7

:

996

-1005

Underhill P. A., P. Shen, A. A. Lin, et al. (21 co-authors)

2000

Y chromosome sequence variation and the history of human populations

Nat. Genet

26

:

358

-361

Urbanek M., D. Goldman, J. C. Long,

1996

The apportionment of dinucleotide repeat diversity in Native Americans and Europeans: a new approach to measuring gene identity reveals asymmetric patterns of divergence

Mol. Biol. Evol

13

:

943

-953

Vigilant L., M. Stoneking, H. Harpending, K. Hawkes, A. C. Wilson,

1991

African populations and the evolution of human mitochondrial DNA

Science

253

:

1503

-1507

Vollrath D., S. Foote, A. Hilton, L. G. Brown, P. Beer-Romero, J. S. Bogan, D. C. Page,

1992

The human Y chromosome: a 43-interval map based on naturally occurring deletions

Science

258

:

52

-59

Wright S.,

1969

Evolution and the genetics of populations. Vol. 2. The theory of gene frequencies. University of Chicago Press, Chicago

Zerjal T., B. Dashnyam, A. Pandya, et al. (18 co-authors)

1997

Genetic relationships of Asians and northern Europeans, revealed by Y-chromosome DNA analysis

Am. J. Hum. Genet

60

:

1174

-1183
