Origin, diffusion, and differentiation of Y-chromosome haplogroups E and J: inferences on the neolithization of Europe and later migratory events in the Mediterranean area

Comparative Study

Am J Hum Genet. 2004 May;74(5):1023-34.

doi: 10.1086/386295. Epub 2004 Apr 6.

Ornella Semino, Chiara Magri, Giorgia Benuzzi, Alice A Lin, Nadia Al-Zahery, Vincenza Battaglia, Liliana Maccioni, Costas Triantaphyllidis, Peidong Shen, Peter J Oefner, Lev A Zhivotovsky, Roy King, Antonio Torroni, L Luca Cavalli-Sforza, Peter A Underhill, A Silvana Santachiara-Benerecetti

Ornella Semino et al. Am J Hum Genet. 2004 May.

Abstract

The phylogeography of Y-chromosome haplogroups E (Hg E) and J (Hg J) was investigated in >2400 subjects from 29 populations, mainly from Europe and the Mediterranean area but also from Africa and Asia. The observed 501 Hg E and 445 Hg J samples were subtyped using 36 binary markers and eight microsatellite loci. Spatial patterns reveal that (1) the two sister clades, J-M267 and J-M172, are distributed differentially within the Near East, North Africa, and Europe; (2) J-M267 was spread by two temporally distinct migratory episodes, the most recent one probably associated with the diffusion of Arab people; (3) E-M81 is typical of Berbers, and its presence in Iberia and Sicily is due to recent gene flow from North Africa; (4) J-M172(xM12) distribution is consistent with a Levantine/Anatolian dispersal route to southeastern Europe and may reflect the spread of Anatolian farmers; and (5) E-M78 (for which microsatellite data suggest an eastern African origin) and, to a lesser extent, J-M12(M102) lineages would trace the subsequent diffusion of people from the southern Balkans to the west. A 7%-22% contribution of Y chromosomes from Greece to southern Italy was estimated by admixture analysis.

Figures

Figure 1

Phylogeny and frequency distributions of Hg E and its main subclades (panels A–G). The numbering of mutations is according to the Y Chromosome Consortium (YCC) (YCC ; Jobling and Tyler-Smith 2003). To the left of the phylogeny, the ages (in 1,000 years) of the boxed mutations are reported, with their SEs (Zhivotovsky et al. 2004). Because the procedure used is based on STR data, it actually estimates the ages of STR variation observed within the corresponding haplogroup in the studied populations. With the exception of the value for the SRY4064 mutation, which has been calculated as $T_D$ (with $V_0 = 0$) between the sister clades Hg E-P2 and Hg E-M33, the other values were estimated as the average squared difference (ASD) in the number of repeats between all current chromosomes of a sample and the founder haplotype, which has an expected value $\mu t$ for single-step mutations (Thomas et al. 1998) and $wt$ for a general mutation scheme, where $w$ is an average effective mutation rate at the loci, taken as $6.9 \times 10^{-4}$ per 25 years (Zhivotovsky et al. 2004) (microsatellite data available on request). In some cases, because of small sample sizes or the long time elapsed since the occurrence of the mutation, the founder haplotype could not be reliably estimated as a modal haplotype. Therefore, we constructed it from modal alleles at single loci, although this can underestimate the age if the candidate founder haplotype differs from the real one. To make the computation of the P2 and M35 ages independent from those of their most-represented subclades, only the STR variation observed in the “asterisk” lineages (e.g., E-P2*) has been used. The M35 estimate is in agreement with those of Bosch et al. (2001) and Cruciani et al. ([in this issue]), obtained with different methods. The YAP insertion was studied as an amplified fragment-length polymorphism (Hammer and Horai 1995). The other mutations were investigated in a hierarchical order by use of the denaturing high-performance liquid chromatography (DHPLC) methodology (Underhill et al. 2001). Subhaplogroups observed in this study are illustrated by continuous lines, whereas subhaplogroups discussed elsewhere are indicated by dotted lines. For simplicity, the prefix “M” was omitted from the name of the marker mutations. Haplogroup-frequency surfaces were reconstructed by computer following the Kriging procedure (Delfiner 1976) by use of the Surfer System (Golden Software) and the data reported in table 1.
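The ASD dating described in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the toy haplotypes are hypothetical, the ASD is assumed to be averaged over both chromosomes and loci, and the only value taken from the caption is the effective mutation rate of 6.9×10⁻⁴ per locus per 25 years. The founder is built from the modal allele at each locus, as the caption describes for cases where a modal haplotype could not be identified reliably.

```python
from collections import Counter

# Effective mutation rate per locus per 25 years, as quoted in the caption.
W_PER_STEP = 6.9e-4
YEARS_PER_STEP = 25


def modal_founder(haplotypes):
    """Candidate founder haplotype: the modal allele at each single locus."""
    columns = zip(*haplotypes)  # one tuple of repeat counts per locus
    return tuple(Counter(col).most_common(1)[0][0] for col in columns)


def asd_age_years(haplotypes, founder=None, w=W_PER_STEP):
    """Age estimate from ASD = w * t, assuming ASD is averaged over
    chromosomes and loci (an assumption of this sketch)."""
    if founder is None:
        founder = modal_founder(haplotypes)
    total = sum((allele - f) ** 2
                for hap in haplotypes
                for allele, f in zip(hap, founder))
    asd = total / (len(haplotypes) * len(founder))
    return asd / w * YEARS_PER_STEP


# Made-up repeat counts at two loci for four chromosomes (illustration only).
toy = [(14, 10), (14, 10), (15, 10), (14, 11)]
print(modal_founder(toy))
print(round(asd_age_years(toy)))
```

If the candidate founder differs from the true one, the squared differences are measured from the wrong origin, which is why the caption warns that the age can be underestimated.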

Figure 2

Phylogeny and frequency distributions of Hg J and its main subclades (panels A–F). The numbering of mutations is according to the YCC (YCC ; Jobling and Tyler-Smith 2003). To the left of the phylogeny, the ages (in 1,000 years) of the boxed mutations are reported, with their SEs (Zhivotovsky et al. 2004). With the exception of the age for the 12f2 mutation, which has been estimated as $T_D$ (with $V_0 = 0$) between the combined data of the two sister clades Hg J-M267 and Hg J-M172, the other values have been determined as ASD, as described in figure 1. The 12f2a marker was examined as an RFLP by Southern blotting (Passarino et al. 1998); the other mutations were investigated in hierarchical order by use of DHPLC methodology (Underhill et al. 2001). Three new mutations, M327, M280, and M390, were found in this study. M327 is a T→C transition at np 404 within the STS containing mutation M92, M280 is a G→A transition at np 330 within the STS containing the mutation M67, and M390 is an A insertion after nt 175 in the STS containing the M365 mutation. Conventions used are the same as for figure 1. The frequency surfaces were drawn using the data reported in table 2 and, for Hg J (panel A), also the data from Rosser et al. (2000), Quintana-Murci et al. (2001), and Scozzari et al. (2001).

Figure 3

Networks of the STR haplotypes of the main subhaplogroups of Hg E. These networks were obtained by the analysis of a subset of the samples for the following microsatellites: YCAIIa, YCAIIb (Mathias et al. 1994), DYS19, DYS389, DYS390, DYS391, and DYS392 (Roewer et al. 1996). The phylogenetic relationships between the microsatellite haplotypes were determined using the program NETWORK 2.0b (Fluxus Engineering). Networks were calculated by the median-joining method ($\varepsilon = 0$) (Bandelt et al. 1995), weighting the STR loci according to their relative variability in Hg E and, with the exception of E-M81, after having processed the data with the reduced-median method. Circles represent the microsatellite haplotypes. Unless otherwise indicated by a number on the pie chart, the area of the circles and the area of the sectors are proportional to the haplotype frequency in the haplogroup and in the geographic area indicated by the color. The smallest circle of each network corresponds to one Y chromosome. The shaded area in E-M78 indicates the branch characterized by the DYS392-12 allele.

Figure 4

Networks of the STR haplotypes of the main subhaplogroups of Hg J. These networks were obtained by the analysis of a subset of the samples for the following microsatellites: YCAIIa, YCAIIb (Mathias et al. 1994), DYS388 (Thomas et al. 1999), DYS19, DYS389, DYS390, DYS391, and DYS392 (Roewer et al. 1996), by the same procedures used for Hg E (fig. 3). Apart from the YCAII system in Hg J-M267, which was considered as a stable marker in this haplogroup (see text), the STR loci were weighted according to their relative variability in Hg J. The most complex networks, J-M267* and J-M172*, were calculated by the median-joining method ($\varepsilon = 0$) on data preprocessed with the reduced-median method; the other networks were calculated by using only the reduced-median algorithm. For the areas of the circles and the sectors, see figure 3. The shaded area in J-M267* indicates the branch characterized by the YCAIIa-22/YCAIIb-22 motif. The expansion time of this branch was calculated using $T_D$ (Zhivotovsky 2001), which gives 8.7 and 4.3 ky, respectively, for the earliest and the latest bounds of the expansion time. The former estimate was calculated by using the variance in the number of repeats of the remaining six loci, assuming a variance at the beginning of population separation ($V_0$) equal to zero, and thus gives an upper bound for $T_D$ (Zhivotovsky 2001). The latter assumes a linear approximation of the within-population variance in repeat scores as a function of time and takes a predicted value of $V_0$ prior to the population split; because linearity can be achieved only in the case of infinite population size, and because each surviving haplogroup started from one individual and could have remained small for a long time, the linear approximation overestimates $V_0$ and thus might be considered as a lower bound for divergence times (L.A.Z., unpublished method).
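The role of $V_0$ in the bounds above can be illustrated with a short sketch. It assumes that the estimator has the form $T_D = (D_1 - 2V_0)/(2w)$, where $D_1$ is the mean squared difference in repeat number between alleles drawn from the two groups being compared, averaged over loci; this form is stated here as an assumption about Zhivotovsky (2001), not quoted from the caption. The function names and the repeat counts are hypothetical, and the predicted $V_0$ of the unpublished lower-bound method is not reproduced.

```python
def mean_squared_difference(group_a, group_b):
    """D1: mean squared repeat difference between alleles drawn one from
    each group, averaged over all pairs and all loci."""
    n_loci = len(group_a[0])
    total = sum((x - y) ** 2
                for hap_a in group_a
                for hap_b in group_b
                for x, y in zip(hap_a, hap_b))
    return total / (len(group_a) * len(group_b) * n_loci)


def td_years(group_a, group_b, v0=0.0, w=6.9e-4, years_per_step=25):
    """Divergence time under the assumed form T_D = (D1 - 2*V0) / (2*w).

    v0 is the variance in repeat number at the start of the separation;
    v0 = 0 gives the upper bound mentioned in the caption.
    """
    d1 = mean_squared_difference(group_a, group_b)
    return (d1 - 2.0 * v0) / (2.0 * w) * years_per_step


# Made-up single-locus repeat counts for two groups (illustration only).
a = [(14,), (15,)]
b = [(16,), (16,)]
upper = td_years(a, b)            # V0 = 0
smaller = td_years(a, b, v0=0.5)  # any positive V0 shortens the estimate
print(upper > smaller)
```

Because $V_0$ enters with a negative sign, setting it to zero yields the largest possible estimate, and an overestimated $V_0$ yields a value that is too small, which matches the upper and lower bounds described in the caption.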

Electronic-Database Information

    1. Fluxus Engineering, http://www.fluxus-engineering.com (for NETWORK 2.0b)
    2. Y-Chromosome STR Database, http://www.cstl.nist.gov/biotech/strbase/y_strs.htm

References

    1. Al-Zahery N, Semino O, Benuzzi G, Magri C, Passarino G, Torroni A, Santachiara-Benerecetti AS (2003) Y-chromosome and mtDNA polymorphisms in Iraq, a crossroad of the early human dispersal and of post-Neolithic migrations. Mol Phylogenet Evol 28:458–472. doi:10.1016/S1055-7903(03)00039-3
    2. Ammerman AJ, Cavalli-Sforza LL (1984) Neolithic transition and the genetics of populations in Europe. Princeton University Press, Princeton, NJ
    3. Bandelt HJ, Forster P, Sykes BC, Richards MB (1995) Mitochondrial portraits of human populations using median networks. Genetics 141:743–753
    4. Bertorelle G, Excoffier L (1998) Inferring admixture proportions from molecular data. Mol Biol Evol 15:1298–1311
    5. Bosch E, Calafell F, Comas D, Oefner PJ, Underhill PA, Bertranpetit J (2001) High-resolution analysis of human Y-chromosome variation shows a sharp discontinuity and limited gene flow between northwestern Africa and the Iberian Peninsula. Am J Hum Genet 68:1019–1029
