Genomic regions exhibiting positive selection identified from dense genotype data
Comparative Study
Christopher S Carlson et al. Genome Res. 2005 Nov.
Abstract
The allele frequency spectrum of polymorphisms in DNA sequences can be used to test for signatures of natural selection that depart from the expected frequency spectrum under the neutral theory. We observed a significant (P = 0.001) correlation between the Tajima's D test statistic in full resequencing data and Tajima's D in a dense, genome-wide data set of genotyped polymorphisms for a set of 179 genes. Based on this, we used a sliding window analysis of Tajima's D across the human genome to identify regions putatively subject to strong, recent, selective sweeps. This survey identified seven Contiguous Regions of Tajima's D Reduction (CRTRs) in an African-descent population (AD), 23 in a European-descent population (ED), and 29 in a Chinese-descent population (XD). Only four CRTRs overlapped between populations: three between ED and XD and one between AD and ED. Full resequencing of eight genes within six CRTRs demonstrated frequency spectra inconsistent with neutral expectations for at least one gene within each CRTR. Identification of the functional polymorphism (and/or haplotype) responsible for the selective sweeps within each CRTR may provide interesting insights into the strongest selective pressures experienced by the human genome over recent evolutionary history.
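As an illustration of the statistic the abstract relies on, the following is a minimal sketch of Tajima's D computed from per-site allele counts, using the standard formula from Tajima (1989). The function name `tajimas_d` and its inputs are hypothetical and are not taken from the authors' pipeline; the sketch also ignores the ascertainment bias that genotyped (rather than resequenced) polymorphisms can introduce.

```python
import math

def tajimas_d(allele_counts, n):
    """Tajima's D from per-site counts of one allele among n sampled chromosomes.

    allele_counts: one integer per site (hypothetical input format).
    Returns None when the statistic is undefined (n < 4 or no segregating sites).
    """
    # Keep only sites that are actually polymorphic in the sample.
    seg = [k for k in allele_counts if 0 < k < n]
    s = len(seg)
    if n < 4 or s == 0:
        return None

    # Mean number of pairwise differences (pi), summed over sites.
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in seg)

    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * s + e2 * s * (s - 1)
    if variance <= 0:
        return None
    # D contrasts pi with Watterson's estimator S / a1.
    return (pi - s / a1) / math.sqrt(variance)
```

In a sample of 10 chromosomes, five singleton sites give a negative D (an excess of rare alleles, the direction expected after a recent sweep), whereas five sites at 50% frequency give a positive D.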
Figures
Figure 1.
Comparison of Tajima's D between Perlegen and SeattleSNPs data sets. For each gene, Tajima's D was calculated from complete resequencing data in the SeattleSNPs data set, or from the region spanning 10 kb upstream of the transcript, the full transcript, and 10 kb downstream of the transcript in the Perlegen data. (A) Tajima's D from Perlegen vs. Tajima's D from SeattleSNPs for AD population. (B) Tajima's D from Perlegen vs. Tajima's D from SeattleSNPs for ED population. Genes previously resequenced by SeattleSNPs are shown in red, with a trend line representing a linear regression on the data. Genes resequenced as part of the present study are shown as purple dots, with filled circles indicating that the gene lay within a CRTR in the population being plotted. The seven SeattleSNPs genes with robust signatures of selection in SeattleSNPs data are shown in green (Akey et al. 2004).
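The comparison in this figure (a correlation between per-gene Tajima's D in two data sets, with a linear-regression trend line) could be sketched as below. This is an ordinary least-squares fit and Pearson correlation written from first principles; the function name `pearson_and_fit` is hypothetical, and no values from the study are reproduced.

```python
import math

def pearson_and_fit(x, y):
    """Return (r, slope, intercept) for paired per-gene values x and y.

    x, y: equal-length sequences, e.g. Tajima's D for the same genes in
    two data sets (hypothetical input format).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    r = sxy / math.sqrt(sxx * syy)      # Pearson correlation coefficient
    slope = sxy / sxx                   # least-squares slope of y on x
    intercept = mean_y - slope * mean_x
    return r, slope, intercept
```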
Figure 2.
A probability density plot of the distribution of Tajima's D in the sliding windows is shown for each population. All three distributions depart significantly from a normal distribution, most noticeably in the heavy tail at low values in each population.
Figure 3.
Tajima's D in 100-kbp sliding windows with 10-kbp steps is shown across the first 50 megabases of chromosome 1. Several CRTRs are visible, including a region near 35M in the ED population containing CLSPN (large blue arrowhead) and a region near 41M in the AD population spanning CTPS, FLJ23878, and SCMH1 (large green arrowhead). CRTRs at the less stringent 5% level are also indicated in the ED population as small blue arrowheads and in the XD population as small red arrowheads.
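The windowing scheme in this caption (100-kbp windows advanced in 10-kbp steps) and the idea of a contiguous run of low-valued windows can be sketched as follows. The function names, the threshold, and the minimum run length are all assumptions for illustration; this page does not state the exact criterion the authors used to call a CRTR.

```python
from bisect import bisect_left

def sliding_windows(positions, chrom_length, size=100_000, step=10_000):
    """Yield (start, end, lo, hi) for each window; the SNPs inside the
    half-open interval [start, end) are positions[lo:hi].

    positions: sorted SNP coordinates on one chromosome (hypothetical input).
    """
    start = 0
    while start + size <= chrom_length:
        end = start + size
        lo = bisect_left(positions, start)
        hi = bisect_left(positions, end)
        yield start, end, lo, hi
        start += step

def contiguous_low_runs(values, threshold, min_windows):
    """Return (first, last) window-index pairs for runs of at least
    min_windows consecutive values below threshold. None breaks a run."""
    runs = []
    run_start = None
    for i, v in enumerate(values):
        low = v is not None and v < threshold
        if low and run_start is None:
            run_start = i
        elif not low and run_start is not None:
            if i - run_start >= min_windows:
                runs.append((run_start, i - 1))
            run_start = None
    # Close a run that extends to the final window.
    if run_start is not None and len(values) - run_start >= min_windows:
        runs.append((run_start, len(values) - 1))
    return runs
```

A per-window statistic such as Tajima's D would be computed on `positions[lo:hi]` for each window, and the resulting list passed to `contiguous_low_runs` with a threshold taken from the lower tail of the genome-wide distribution.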
Figure 4.
(A) A visual genotype for 1.5 Mbp spanning the CLSPN CRTR in the Perlegen data. Each row corresponds to an individual, and each column corresponds to a polymorphic site, with genotypes color coded as follows: Common allele homozygotes are shown in blue, heterozygotes are shown in red, rare allele homozygotes are shown in yellow, and missing data are shown as gray. The top 24 samples are ED, the middle 23 samples are AD, and the bottom 24 samples are XD. Although nucleotide diversity is depressed across a large region, there is no clear minimum within the CRTR. Nucleotide diversity was relatively constant across the region, so CLSPN (shown as a black box) was selected as a target for resequencing because of interesting patterns of Fst between ED and XD, in addition to low nucleotide diversity. (B) A visual genotype of the resequencing results for the CLSPN gene. The top 24 samples are ED; the middle 24 samples are AD, and the bottom 24 samples are XD. As expected, a number of polymorphisms nearly fixated between ED and XD were observed. One of these SNPs (10710, red arrowhead) changes an amino acid (Ser525Asn), whereas the other three are intronic (green arrowheads).
Figure 5.
A close-up of the CLSPN CRTR from the UCSC genome browser is shown, with the Tajima's D tracks as well as a set of tracks showing the inferred relative recombination rate from LDhat for each population in grayscale (track label, LDhat log RR AD/ED/XD): Darker segments correspond to high inferred recombination rates. CLSPN is located at 35.9 Mbp. The left edge of the CLSPN CRTR (at ∼35 Mbp in the ED population) corresponds to a strong recombination hotspot observed in all three populations, but of greater interest are the hotspots spanned by the CRTR at ∼35.4 Mbp and ∼35.8 Mbp. Thus, although this CRTR does span a region with reduced recombination overall, there are several inferred hotspots within the CRTR that are shared between populations.
Similar articles
- Disentangling the effects of demography and selection in human history.
Stajich JE, Hahn MW. Mol Biol Evol. 2005 Jan;22(1):63-73. doi: 10.1093/molbev/msh252. Epub 2004 Sep 8. PMID: 15356276
- A whole genome long-range haplotype (WGLRH) test for detecting imprints of positive selection in human populations.
Zhang C, Bailey DK, Awad T, Liu G, Xing G, Cao M, Valmeekam V, Retief J, Matsuzaki H, Taub M, Seielstad M, Kennedy GC. Bioinformatics. 2006 Sep 1;22(17):2122-8. doi: 10.1093/bioinformatics/btl365. Epub 2006 Jul 15. PMID: 16845142
- Detecting directional selection in the presence of recent admixture in African-Americans.
Lohmueller KE, Bustamante CD, Clark AG. Genetics. 2011 Mar;187(3):823-35. doi: 10.1534/genetics.110.122739. Epub 2010 Dec 31. PMID: 21196524 Free PMC article.
- Scanning for genomic regions subject to selective sweeps using SNP-MaP strategy.
Deng L, Tang X, Chen W, Lin J, Lai Z, Liu Z, Zhang D. Genomics Proteomics Bioinformatics. 2010 Dec;8(4):256-61. doi: 10.1016/S1672-0229(10)60027-7. PMID: 21382594 Free PMC article.
- Molecular evolution of 5' flanking regions of 87 candidate genes for atherosclerotic cardiovascular disease.
Ding K, Kullo IJ. Genet Epidemiol. 2006 Nov;30(7):557-69. doi: 10.1002/gepi.20169. PMID: 16799961
Cited by
- Allelic variation in the autotetraploid potato: genes involved in starch and steroidal glycoalkaloid metabolism as a case study.
Li H, Brouwer M, Pup ED, van Lieshout N, Finkers R, Bachem CWB, Visser RGF. BMC Genomics. 2024 Mar 12;25(1):274. doi: 10.1186/s12864-024-10186-5. PMID: 38475714 Free PMC article.
- Harnessing γ-TMT Genetic Variations and Haplotypes for Vitamin E Diversity in the Korean Rice Collection.
Somsri A, Chu SH, Nawade B, Lee CY, Park YJ. Antioxidants (Basel). 2024 Feb 14;13(2):234. doi: 10.3390/antiox13020234. PMID: 38397832 Free PMC article.
- Understanding genetic diversity in drought-adaptive hybrid parental lines in pearl millet.
Kandarkar K, Palaniappan V, Satpathy S, Vemula A, Rajasekaran R, Jeyakumar P, Sevugaperumal N, Gupta SK. PLoS One. 2024 Feb 23;19(2):e0298636. doi: 10.1371/journal.pone.0298636. eCollection 2024. PMID: 38394324 Free PMC article.
- Conservation genetics and potential geographic distribution modeling of Corybas taliensis, a small 'sky Island' orchid species in China.
Liu Y, Wang H, Yang J, Dao Z, Sun W. BMC Plant Biol. 2024 Jan 2;24(1):11. doi: 10.1186/s12870-023-04693-y. PMID: 38163918 Free PMC article.
- Diaporthe Species on Palms: Molecular Re-Assessment and Species Boundaries Delimitation in the D. arecae Species Complex.
Pereira DS, Hilário S, Gonçalves MFM, Phillips AJL. Microorganisms. 2023 Nov 6;11(11):2717. doi: 10.3390/microorganisms11112717. PMID: 38004729 Free PMC article.
Web site references
- http://pga.gs.washington.edu; Seattle SNPs Web site.
- http://genome.perlegen.com/browser/download.html; Perlegen Web site.
- http://genome.ucsc.edu/cgi-bin/hgGateway; UCSC Genome Browser.
Grants and funding
- U01 HL066642/HL/NHLBI NIH HHS/United States
- HG02238/HG/NHGRI NIH HHS/United States
- P41 HG002371/HG/NHGRI NIH HHS/United States
- IP41HG02371/HG/NHGRI NIH HHS/United States
- HL66682/HL/NHLBI NIH HHS/United States
- HL66642/HL/NHLBI NIH HHS/United States
- U01 HL066682/HL/NHLBI NIH HHS/United States
- R01 HG002238/HG/NHGRI NIH HHS/United States