Statistical tests for detecting positive selection by utilizing high-frequency variants
Kai Zeng et al. Genetics. 2006 Nov.
Abstract
By comparing the low-, intermediate-, and high-frequency parts of the frequency spectrum, we gain information on the evolutionary forces that influence the pattern of polymorphism in population samples. We emphasize the high-frequency variants on which positive selection and negative (background) selection exhibit different effects. We propose a new estimator of θ (the product of effective population size and neutral mutation rate), θ_L, which is sensitive to the changes in high-frequency variants. The new θ_L allows us to revise Fay and Wu's H-test by normalization. To complement the existing statistics (the H-test and Tajima's D-test), we propose a new test, E, which relies on the difference between θ_L and Watterson's θ_W. We show that this test is most powerful in detecting the recovery phase after the loss of genetic diversity, which includes the postselective sweep phase. The sensitivities of these tests to (or robustness against) background selection and demographic changes are also considered. Overall, D and H in combination can be most effective in detecting positive selection while being insensitive to other perturbations. We thus propose a joint test, referred to as the DH test. Simulations indicate that DH is indeed sensitive primarily to directional selection and no other driving forces.
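The estimators named in the abstract can be illustrated with a minimal sketch. It assumes an unfolded site frequency spectrum in which xi_i is the number of segregating sites whose derived allele appears i times in a sample of n sequences, and it uses the standard definitions of Watterson's estimator, nucleotide diversity, Fay and Wu's estimator, and the new high-frequency-weighted estimator. The function and key names are hypothetical. Only the unnormalized differences behind D, H, and E are computed; the variance terms needed to normalize them into test statistics are not given on this page and are omitted.

```python
from math import comb


def theta_estimators(sfs):
    """Estimate theta from an unfolded site frequency spectrum.

    sfs[i - 1] holds xi_i, the number of sites whose derived allele
    is carried by exactly i of the n sampled sequences (i = 1..n-1).
    """
    n = len(sfs) + 1
    counts = range(1, n)
    a_n = sum(1.0 / i for i in counts)   # harmonic number used by Watterson's estimator
    pairs = comb(n, 2)                   # number of pairwise comparisons

    theta_w = sum(sfs) / a_n
    theta_pi = sum(i * (n - i) * x for i, x in zip(counts, sfs)) / pairs
    theta_h = sum(i * i * x for i, x in zip(counts, sfs)) / pairs
    theta_l = sum(i * x for i, x in zip(counts, sfs)) / (n - 1)
    return {"W": theta_w, "pi": theta_pi, "H": theta_h, "L": theta_l}


def raw_contrasts(est):
    """Unnormalized numerators of the D, H and E tests (no variance scaling)."""
    return {
        "D": est["pi"] - est["W"],   # Tajima's D numerator
        "H": est["pi"] - est["L"],   # numerator of the revised H
        "E": est["L"] - est["W"],    # numerator of E
    }


if __name__ == "__main__":
    # Neutral expectation: E[xi_i] = theta / i, so every estimator returns theta.
    n, theta = 50, 5.0
    neutral = [theta / i for i in range(1, n)]
    print(theta_estimators(neutral))

    # Toy spectrum with only high-frequency derived variants (n = 5).
    skewed = [0, 0, 0, 3]
    print(raw_contrasts(theta_estimators(skewed)))
```

Under the neutral expectation all four estimators coincide, so each contrast is zero. An excess of high-frequency derived variants inflates the high-frequency-weighted estimators relative to nucleotide diversity, which drives the H numerator negative.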
Figures
Figure 1.—
Variance of the five estimators of θ. Sample size (n) is 50.
Figure 2.—
(A) Changes in at the linked neutral locus as the advantageous mutation increases in frequency (f). (B) Changes in R(i) at different times τ (measured in units of 4N generations) after fixation of the advantageous mutation. In all simulations, the parameters are defined as follows: θ = 4Nμ, where μ is the mutation rate for the linked neutral locus; s is the selective coefficient of the advantageous mutation and c is the recombination distance (between the neutral variation under investigation and the advantageous mutation nearby), which is usually scaled by the selective coefficient. The parameter values are θ = 5, s = 0.001, c/s = 0.02, and sample size (n) is 50. In the simulation for hitchhiking, we also incorporated intragenic recombination among the neutral variants under investigation. The intragenic recombination rate of the neutral locus, multiplied by 4N, is 25 here and in Figure 3. The values of θ and intragenic recombination rate were chosen to reflect the reality of D. melanogaster; i.e., the scaled local recombination rate is about fivefold as large as the local population mutation rate. Intragenic recombination in other cases has a negligible effect on the results and was not incorporated.
Figure 3.—
Power of the tests before and after hitchhiking is completed. The x-axis on the left represents the increase in the frequency of the advantageous mutation; on the right is the time after fixation (measured in units of 4N generations). All parameter values are the same as those of Figure 2. All tests were one-sided; values falling into the lower 5% tail of the null distribution were considered significant. Results shown in Figures 4–6 were produced by the same method. (A) c/s = ; (B) c/s = 0.02.
Figure 4.—
Sensitivity (or power) of the tests to population expansion. We assume that the effective population size increases 10-fold instantaneously at time 0 to θ = 5. Sample size (n) is 50. Time is measured in units of 4N generations.
Figure 5.—
Sensitivity (or power) of the tests to population shrinkage. We assume that the effective population size decreases 10-fold instantaneously at time 0 to θ = 2. Sample size (n) is 50. Time is measured in units of 4N generations.
Figure 6.—
Sensitivity (or power) of the tests to population subdivision. A symmetric two-deme model with θ = 2 per deme (2N genes per deme) was simulated. Populations are assumed to be in drift–migration equilibrium with symmetric migration at a rate of m, which is the fraction of new migrants each generation. Sample size (n) is 50. (A) Sensitivity as a function of the degree of population subdivision, expressed as 4Nm on the x-axis. All genes were sampled from one subpopulation. (B) Sensitivity as a function of the sampling skewness; for example, 5/45 means 5 genes are sampled from one subpopulation and 45 from the other. In this case, 4Nm = 0.1, a value at which the tests show sensitivity to population subdivision in A.
Grants and funding
- R01 GM050428/GM/NIGMS NIH HHS/United States
- R01 GM060777/GM/NIGMS NIH HHS/United States
- GM 60777/GM/NIGMS NIH HHS/United States
- GM50428/GM/NIGMS NIH HHS/United States
- R29 GM050428/GM/NIGMS NIH HHS/United States