Population genetic implications from sequence variation in four Y chromosome genes

Proc Natl Acad Sci U S A. 2000 Jun 20;97(13):7354-9.

doi: 10.1073/pnas.97.13.7354.

P Shen, F Wang, P A Underhill, C Franco, W H Yang, A Roxas, R Sung, A A Lin, R W Hyman, D Vollrath, R W Davis, L L Cavalli-Sforza, P J Oefner

P Shen et al. Proc Natl Acad Sci U S A. 2000.

Abstract

Some insight into human evolution has been gained from the sequencing of four Y chromosome genes. Primary genomic sequencing determined gene SMCY to be composed of 27 exons that comprise 4,620 bp of coding sequence. The unfinished sequencing of the 5' portion of gene UTY1 was completed by primer walking, and a total of 20 exons were found. By using denaturing HPLC, these two genes, as well as DBY and DFFRY, were screened for polymorphic sites in 53-72 representatives of the five continents. A total of 98 variants were found, yielding nucleotide diversity estimates of 2.45 × 10⁻⁵, 5.07 × 10⁻⁵, and 8.54 × 10⁻⁵ for the coding regions of SMCY, DFFRY, and UTY1, respectively, with no variant having been observed in DBY. In agreement with most autosomal genes, diversity estimates for the noncoding regions were about 2- to 3-fold higher and ranged from 9.16 × 10⁻⁵ to 14.2 × 10⁻⁵ for the four genes. Analysis of the frequencies of derived alleles for all four genes showed that they more closely fit the expectation of a Luria-Delbrück distribution than a distribution expected under a constant population size model, providing evidence for exponential population growth. Pairwise nucleotide mismatch distributions date the occurrence of population expansion to approximately 28,000 years ago. This estimate is in accord with the spread of Aurignacian technology and the disappearance of the Neanderthals.
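The nucleotide diversity estimates quoted in the abstract are per-site quantities. As a minimal sketch, assuming the common definition of nucleotide diversity as the mean number of pairwise differences per site among aligned sequences (the abstract does not state which estimator the authors used), the calculation could look like this in Python. The function name and the toy sequences are hypothetical and are not data from the paper.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Mean number of pairwise differences per site for aligned sequences.

    Assumption: this is the mean-pairwise-difference definition of
    nucleotide diversity; the source does not specify its estimator.
    """
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to equal length")

    pairs = list(combinations(sequences, 2))
    # Count mismatching positions for every unordered pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


# Toy alignment (hypothetical, for illustration only).
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
pi = nucleotide_diversity(toy)
print(pi)
```

In the toy alignment, two of the four sequences each carry one private variant, so six pairwise differences are spread over six pairs and ten sites. Real estimates on the order of 10⁻⁵ arise because tens of kilobases are screened and most sites are invariant.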

Figures

Figure 1

Genomic structure of SMCY and the nucleotide positions of 47 sequence variants detected in 53 individuals. SMCY consists of 27 exons spaced over 39.5 kb, 11.6% of which are coding. The AUG start codon is located in exon 2. Shadowed areas (35.9 kb) represent LINE elements that were excluded from the variation search. The numbers indicate the nucleotide positions of the polymorphic sites with respect to the published cDNA sequence (GenBank accession no. U52195).

Figure 2

Maximum likelihood networks of the SMCY haplotypes (A) and the major haplotypes constructed from the combined variant sites in DFFRY, DBY, and UTY1 (B). Haplotype numbers are from Tables 1 and 3. The representations of color are black, Africa; yellow, Asia; green, Europe; red, America; and blue, Oceania. The areas of the circles represent the number of individuals carrying each haplotype. The arrows indicate the location of the most likely root in the phylogenies. The occurrence of a single recurrent mutation (r) did not generate any ambiguity in the parsimonious mutational pathway when considered in the context of other polymorphisms.

Figure 3

Allele frequency spectra of the four Y chromosome genes. Shaded columns show the observed distribution of allele frequencies. Black columns depict the expected numbers of derived alleles under the expectation of the Luria–Delbrück/Lea–Coulson theory (16, 17), assuming that each nucleotide in the screened regions is analogous to a parallel, independent bacterial culture. The concordance of the observed and expected distributions is consistent with a significant population expansion. In contrast, there is strong incongruity with the expectation of constant population size (blank columns) with the distribution of alleles estimated by Watterson (18).
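For orientation, the constant-population-size comparison in Figure 3 can be approximated with the standard neutral expectation, in which the expected number of sites whose derived allele appears in i of n chromosomes is θ/i, with θ estimated from the number of segregating sites by Watterson's estimator. The sketch below is an assumption about the form of that expectation, not a reproduction of the authors' calculation, and it does not attempt the Luria-Delbrück/Lea-Coulson expectation. The function name is hypothetical. The inputs 53 and 47 are the SMCY sample size and variant count given in the Figure 1 caption.

```python
def expected_constant_size_spectrum(n_chromosomes, n_segregating):
    """Expected derived-allele counts under a constant-size neutral model.

    Assumption: E[sites with derived allele count i] = theta / i, where
    theta is Watterson's estimate S / a_n and a_n = sum_{i=1}^{n-1} 1/i.
    """
    a_n = sum(1.0 / i for i in range(1, n_chromosomes))
    theta_w = n_segregating / a_n
    # Index 0 holds the expected number of singletons, index 1 doubletons, ...
    return [theta_w / i for i in range(1, n_chromosomes)]


# SMCY: 47 variants detected in 53 individuals (Figure 1 caption).
spectrum = expected_constant_size_spectrum(53, 47)
print(round(spectrum[0], 2), round(spectrum[1], 2))
```

Under this model the expected count falls off as 1/i, so singletons are expected to be only twice as numerous as doubletons. An observed spectrum with a much larger excess of rare derived alleles is the kind of departure that the caption attributes to population expansion.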

Figure 4

Histograms display the numbers of sequence differences among all possible pairs of sequences in the set of 70 individuals for 51 segregating sites in DFFRY, DBY, and UTY1, over 41,016 bp (A) and 53 individuals for 47 segregating sites in SMCY over 39,931 bp (B). The mean values and standard errors (SE) for human expansion times were estimated as described in the text.
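A pairwise mismatch histogram of the kind described in Figure 4 can be sketched as follows, assuming each individual is represented by a string of allele states at the segregating sites. The function name and the toy haplotypes are hypothetical and are not taken from the paper's tables. Only the pair counts at the end use figures from the caption (70 and 53 individuals).

```python
from collections import Counter
from itertools import combinations
from math import comb


def mismatch_distribution(haplotypes):
    """Histogram of pairwise difference counts over all unordered pairs."""
    counts = Counter(
        sum(a != b for a, b in zip(h1, h2))
        for h1, h2 in combinations(haplotypes, 2)
    )
    # Return a plain dict sorted by number of differences.
    return dict(sorted(counts.items()))


# Toy haplotypes over four segregating sites (0 = ancestral, 1 = derived).
toy_haplotypes = ["0000", "0001", "0011", "0111"]
histogram = mismatch_distribution(toy_haplotypes)
print(histogram)

# Number of pairs behind each panel of the figure.
pairs_panel_a = comb(70, 2)  # 70 individuals, panel A
pairs_panel_b = comb(53, 2)  # 53 individuals, panel B
print(pairs_panel_a, pairs_panel_b)
```

The sketch stops at the histogram. Converting its shape into an expansion date requires a mutation rate and a demographic model, which the caption says are described in the text of the paper.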

References

    1. Cooke H J, Brown W R, Rappold G A. Nature (London) 1985;317:687–692.
    2. Sinclair A H, Berta P, Palmer M S, Hawkins J R, Griffiths B L, Smith M J, Foster J W, Frischauf A M, Lovell-Badge R, Goodfellow P N. Nature (London) 1990;346:240–244.
    3. Lahn B T, Page D C. Science. 1997;278:675–680.
    4. Vogt P H, Edelmann A, Kirsch S, Henegariu O, Hirschmann P, Kiesewetter F, Kohn F M, Schill W B, Farah S, Ramos C, et al. Hum Mol Genet. 1996;5:933–943.
    5. Kent-First M G, Maffitt M, Muallem A, Brisco P, Shultz J, Ekenberg S, Agulnik A I, Agulnik I, Shramm D, Bavister B, et al. Nat Genet. 1996;14:128–129.
