Measuring the rates of spontaneous mutation from deep and large-scale polymorphism data

Philipp W Messer. Genetics. 2009 Aug.

Abstract

The rates and patterns of spontaneous mutation are fundamental parameters of molecular evolution. Current methodology either tries to measure such rates and patterns directly in mutation-accumulation experiments or tries to infer them indirectly from levels of divergence or polymorphism. While experimental approaches are constrained by the low rate at which new mutations occur, indirect approaches suffer from their underlying assumption that mutations are effectively neutral. Here I present a maximum-likelihood approach to estimate mutation rates from large-scale polymorphism data. It is demonstrated that the method is not sensitive to demography and the distribution of selection coefficients among mutations when applied to mutations at sufficiently low population frequencies. With the many large-scale sequencing projects currently underway, for instance, the 1000 genomes project in humans, plenty of the required low-frequency polymorphism data will shortly become available. My method will allow for an accurate and unbiased inference of mutation rates and patterns from such data sets at high spatial resolution. I discuss how the assessment of several long-standing problems of evolutionary biology would benefit from the availability of accurate mutation rate estimates.
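The idea of fitting a mutation rate to low-frequency polymorphism counts by maximum likelihood can be illustrated with a deliberately simplified stand-in. This is not the estimator from the paper. It assumes that all mutations are neutral, that the number of sites carrying k derived copies in the sample is Poisson distributed with mean L·θ/k (the standard neutral expectation), and that only the classes k = 1, ..., k_max are used. The names `neutral_theta_mle`, `poisson_loglik`, `counts`, and `n_sites` are hypothetical.

```python
import math


def poisson_loglik(theta, counts, n_sites):
    """Log-likelihood of theta under the assumed neutral Poisson model.

    counts[k - 1] is the number of sites with k derived copies in the sample;
    the expected value of that count is assumed to be n_sites * theta / k.
    """
    total = 0.0
    for k, m in enumerate(counts, start=1):
        lam = n_sites * theta / k
        total += m * math.log(lam) - lam - math.lgamma(m + 1)
    return total


def neutral_theta_mle(counts, n_sites):
    """Closed-form maximizer of poisson_loglik.

    Setting the derivative with respect to theta to zero gives
    theta_hat = sum_k m_k / (n_sites * sum_k 1/k).
    """
    harmonic = sum(1.0 / k for k in range(1, len(counts) + 1))
    return sum(counts) / (n_sites * harmonic)


# Illustrative (made-up) counts for k = 1, 2, 3 over 100,000 sites.
example_counts = [1000, 500, 333]
theta_hat = neutral_theta_mle(example_counts, n_sites=100_000)
print(f"theta_hat = {theta_hat:.5f}")
```

Restricting the fit to small k mirrors the abstract's point that low-frequency mutations are the least affected by selection and demography; the sketch itself does not model either effect.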

Figures

Figure 1. Expected number of mutant alleles present at frequency x in a population. Distributions g_γ(x) are shown for several different selection classes, always using μ_γ = 1. The solid line is the neutral asymptotics, g_0(x) = 2/x, which appears as a straight line with slope −1 in the double-logarithmic plot. Compared with neutral mutations, deleterious mutations (γ < 0) are systematically suppressed from reaching higher frequencies in the population, and beneficial mutations (γ > 0) are enriched at high frequencies. In the low-frequency limit all distributions converge to the neutral SFS, although convergence occurs substantially faster for beneficial than for deleterious mutations.
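The qualitative behavior described in this caption can be checked numerically. The sketch below assumes the standard diffusion-theory form g_γ(x) = 2μ_γ (1 − e^{−2γ(1−x)}) / ((1 − e^{−2γ}) x (1 − x)), which reduces to 2μ_γ/x as γ → 0. The exact scaling of γ used in the paper is not given in this excerpt, so the scaling here is an assumption, and `expected_sfs` is a hypothetical name.

```python
import math


def expected_sfs(x, gamma, mu=1.0):
    """Expected density of mutant alleles at population frequency x.

    Assumed form: 2*mu*(1 - exp(-2*gamma*(1 - x))) / ((1 - exp(-2*gamma))*x*(1 - x)).
    For gamma == 0 the neutral limit 2*mu/x is returned directly.
    """
    if abs(gamma) < 1e-12:
        return 2.0 * mu / x
    # -expm1(-a) equals 1 - exp(-a) and stays accurate for small |a|.
    numerator = -math.expm1(-2.0 * gamma * (1.0 - x))
    denominator = -math.expm1(-2.0 * gamma)
    return 2.0 * mu * numerator / (denominator * x * (1.0 - x))


# Ratio to the neutral expectation at an intermediate and a low frequency.
for x in (0.5, 0.01):
    for gamma in (-5.0, 0.0, 5.0):
        ratio = expected_sfs(x, gamma) / expected_sfs(x, 0.0)
        print(f"x={x:<5} gamma={gamma:+.0f}  g_gamma/g_0 = {ratio:.4f}")
```

Under this assumed form, the ratio at x = 0.01 is much closer to 1 for γ = +5 than for γ = −5, in line with the caption's remark that convergence to the neutral SFS is faster for beneficial mutations.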

Figure 2. Probability density Pr(x | k, n) that a neutral mutation has population frequency x in a population of size N = 10^4 if it is observed in k of n = 1000 genotyped sequences.
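One way to obtain a distribution of this kind is Bayes' rule: weight the binomial probability of seeing k copies in n samples by the neutral expectation g_0(x) ∝ 1/x and normalize over the possible population frequencies. The sketch below does this on the grid x = i/N for i = 1, ..., N − 1. Treating the population as N haploid frequency classes and using binomial sampling are assumptions, the result is a probability mass over grid points rather than a continuous density, and `posterior_frequency` is a hypothetical name.

```python
import math


def posterior_frequency(k, n, pop_size):
    """Posterior over population frequencies x = i/pop_size for a neutral mutation.

    Weight of each x: x**k * (1 - x)**(n - k) * (1 / x); the binomial
    coefficient is constant in x and cancels on normalization.
    """
    freqs = [i / pop_size for i in range(1, pop_size)]
    # Work in log space so that tiny weights underflow to 0.0 harmlessly.
    log_w = [(k - 1) * math.log(x) + (n - k) * math.log1p(-x) for x in freqs]
    peak = max(log_w)
    weights = [math.exp(lw - peak) for lw in log_w]
    total = sum(weights)
    return freqs, [w / total for w in weights]


freqs, probs = posterior_frequency(k=5, n=1000, pop_size=10_000)
mean = sum(x * p for x, p in zip(freqs, probs))
mode = freqs[probs.index(max(probs))]
print(f"posterior mean = {mean:.5f}, posterior mode = {mode:.5f}")
```

In the continuous limit these weights correspond to a Beta(k, n − k + 1) distribution, so the posterior mean is close to k/(n + 1) and the mode is close to (k − 1)/(n − 1).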

Figure 3. Expected relative errors [formula image] according to Equation 24 for nonneutral mutations as a function of γ, for three values of k (2, 5, and 10) in a sample of n = 1000 genotyped sequences.

Figure 4. Neutral-likelihood curves [formula image] for several mutation scenarios. The mutation rate is always μ = 0.005. The distribution of selection coefficients for a particular scenario is specified by the parameters ω_γ. The expected counts of mutations to be observed in k of n = 1000 samples, [formula image], were estimated from Equation 14 for each mutation scenario. Likelihoods [formula image] were calculated according to Equation 13. The size of the source population was N = 10^4.

Figure 5. Analysis of the expected errors [formula image] for three exemplary demographic scenarios. The time arrows in the three scenarios (A–C) go from the past to the present (tips of the arrows). Intervals of constant population size are specified by their respective population sizes N_i and durations τ_i. Population size changes instantaneously between intervals. Dotted lines specify ancient points in time at which an equilibrium SFS was assumed. The sample size in D was n = 1000.

Figure 6. Example of [formula image] estimated for three different classes of sites. All classes have θ = 1, but selection coefficients differ between classes. For simplicity, all mutations within a class are modeled to have the same selection coefficient. [formula image] was then calculated by [formula image]. The sample size was n = 1000.
