Probability of Fixation of an Advantageous Mutant in a Viral Quasispecies


Digital Life Laboratory, Caltech, Pasadena, California 91125

Address for correspondence: Digital Life Laboratory 136-93, Caltech, Pasadena, CA 91125. E-mail: wilke@caltech.edu


Accepted: 01 November 2002

Published: 01 February 2003


Abstract

The probability that an advantageous mutant rises to fixation in a viral quasispecies is investigated in the framework of multitype branching processes. Whether fixation is possible depends on the overall growth rate of the quasispecies that will form if invasion is successful rather than on the individual fitness of the invading mutant. The exact fixation probability can be calculated only if the fitnesses of all potential members of the invading quasispecies are known. Quasispecies fixation has two important characteristics: First, a sequence with negative selection coefficient has a positive fixation probability as long as it has the potential to grow into a quasispecies with an overall growth rate that exceeds that of the established quasispecies. Second, the fixation probabilities of sequences with identical fitnesses can nevertheless vary over many orders of magnitude. Two approximations for the probability of fixation are introduced. Both approximations require only partial knowledge about the potential members of the invading quasispecies. The performance of these two approximations is compared to the exact fixation probability on a network of RNA sequences with identical secondary structure.

ONE of the most remarkable aspects of the dynamics of RNA viruses is the high rate at which mutant variants are produced. At mutation rates close to one substitution per genome per generation (Drake 1993; Drake and Holland 1999), a virus population forms a highly diverse cloud of mutants (Domingo et al. 1976, 1978; Holland et al. 1982; Steinhauer et al. 1989; Biebricher and Luce 1993; Burch and Chao 2000), a so-called quasispecies (Eigen and Schuster 1979; Nowak 1992; Domingo and Holland 1997; Domingo et al. 2001). At the same time, the sequence space is so large that even for population sizes up to $10^{12}$, there is a constant stream of new mutants that have never existed before. Most of these mutants have impaired fitness, but occasionally, a new mutant will fare better than all currently existing virions, for example, by presenting an epitope that the immune system fails to recognize. With a certain probability, this mutant will rise to fixation, where fixation is understood in the sense that the mutant becomes the ancestor of a new quasispecies, which completely replaces the currently existing one.

The problem of the fixation of an advantageous mutant is an old one, with a long history of investigations in classical population genetics, reaching back to Haldane and Fisher (Fisher 1922, 1930; Haldane 1927; Kimura 1957, 1964, 1970; Ewens 1967; Kimura and King 1979; Barton 1995; Bürger and Ewens 1995; Otto and Barton 1997; Pollak 2000). However, these investigations differ from the quasispecies case in one important aspect: the mutation rates considered. In classical population genetics, the usual assumption is that mutations are rare events, such that an invading mutant will not mutate again while it is moving toward either fixation or extinction. In the quasispecies setting, on the other hand, most of the immediate offspring of a mutant will have further mutations, and their offspring will as well, and so on. As a consequence, the fitness of a prospective invading quasispecies is not given by the fitness of the initial mutant, but rather by the average fitness of the offspring mutant cloud that will form eventually. One of the more surprising results of these dynamics is that a mutant with the ability to replace the currently existing quasispecies may actually have a reduced replication rate, if at the same time its robustness against further mutations is increased (Schuster and Swetina 1988; Wilke 2001b; Wilke et al. 2001; Krakauer and Plotkin 2002).

Quasispecies theory in its original formulation by Eigen and Schuster (1979) is based on deterministic differential equations and as such cannot deal with the fluctuations that are responsible for fixation or extinction of individual mutants. Within the more general mathematical framework of multitype branching processes, it is possible to describe both the deterministic aspects of large populations and the fluctuations inherent in the dynamics of small and very small populations (Demetrius et al. 1985; Hofbauer and Sigmund 1988; Hermisson et al. 2002). An expression for the probability of fixation follows naturally from branching process theory. We discuss how this expression relates to the predictions of the deterministic quasispecies equations, as well as to the results of classical population genetics. The remainder of the article is organized as follows.

First, we derive a general expression for the probability of fixation in an arbitrary fitness landscape. Then, we discuss the special case of fixation on a neutral network, that is, the case in which all sequences of the invading quasispecies have the same fitness, and derive two approximations for the fixation probability that can be evaluated without knowledge of the full fitness landscape. To give a concrete example, we apply both the exact expression and the approximations to a known network of >50,000 RNA sequences. For this neutral network, we also discuss how the fixation probability changes if multiple sequences invade at the same time.

THEORETICAL FRAMEWORK

For a population evolving under high mutational pressure, we have to understand fixation in the sense that a mutant is fixed once it has become a common ancestor of the whole population. The more traditional definition of fixation, which is to regard a mutation as fixed if all sequences in the population carry it, is not applicable: The mutational pressure constantly creates new deleterious mutants, which may not carry a particular mutation although their ancestors did so. If we understand fixation as the process by which a mutant becomes a common ancestor of the whole population, then the probability that a mutant is fixed is given by the probability that the cascade of further mutated offspring of the invading mutant does not come to a halt. We can calculate this probability from the theory of multitype branching processes.

The general setting to which our theory applies is as follows. Consider a viral quasispecies in mutation-selection balance, with an average fitness $\langle w \rangle$. If generations are discrete and nonoverlapping, and the population size $N$ is constant, then the probability that a virion $i$ produces $k$ offspring in one generation is given by Wright-Fisher sampling,

$$P(k \mid i) = \binom{N}{k}\,\xi_i^{k}\,(1-\xi_i)^{N-k}, \tag{1}$$

with $\xi_i = w_i/(\langle w \rangle N)$, where $w_i$ is the fitness of virion $i$.

Assume that a rare mutation leads to the emergence of a virion with the potential to form a new quasispecies and to replace the already established one in the process. This new quasispecies (in the following also called the invading quasispecies) may consist of sequences of type $1, 2, \ldots, n$, with replication rates $w_i$. Let the probability that a sequence $j$ produces an erroneous copy $i$ be given by $Q_{ij}$. As long as the total abundance of the invading quasispecies is small compared to the established quasispecies, we can assume that $\langle w \rangle$ is not affected by the presence of the invading quasispecies. Then, the probability that a single sequence of type $i$ generates $(k_1, \ldots, k_n)$ offspring of types $1, \ldots, n$ can be expressed as

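$$P(k_1,\ldots,k_n \mid i) = \frac{N!}{\left(N-\sum_r k_r\right)!\,\prod_r k_r!} \times \prod_{r=1}^{n}\left(M_{ir}/N\right)^{k_r}\left(1-\sum_{r=1}^{n} M_{ir}/N\right)^{N-\sum_r k_r} \tag{2}$$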

(see appendix), with $M_{ij} = w_i Q_{ji}/\langle w \rangle$. The matrix elements $M_{ij}$ give the expected number of offspring of type $j$ from sequences of type $i$ in one generation. In the following, we assume that the population size is so large that we can approximate $P(k_1, \ldots, k_n \mid i)$ by its limit for an infinitely large population. This limit is a multivariate Poisson distribution:

$$P(k_1,\ldots,k_n \mid i) = \prod_{r=1}^{n}\left(\frac{1}{k_r!}\,M_{ir}^{k_r}\right) e^{-\sum_r M_{ir}}. \tag{3}$$
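The large-population limit can be checked numerically for a single offspring type. The sketch below compares the Wright-Fisher (binomial) offspring distribution with success probability $m/N$ against a Poisson distribution with mean $m$. The function names and the value `m = 1.05` are illustrative assumptions, not taken from the article.

```python
import math

def binomial_pmf(k, N, m):
    """P(k offspring) when each of N slots is filled with probability m/N."""
    p = m / N
    return math.comb(N, k) * p**k * (1.0 - p) ** (N - k)

def poisson_pmf(k, m):
    """Poisson probability with mean m (the large-N limit)."""
    return math.exp(-m) * m**k / math.factorial(k)

def total_variation(N, m, kmax=40):
    """Half the summed absolute difference of the two pmfs for k = 0..kmax."""
    return 0.5 * sum(abs(binomial_pmf(k, N, m) - poisson_pmf(k, m))
                     for k in range(kmax + 1))

m = 1.05  # illustrative expected number of offspring (assumption)
for N in (10, 100, 10_000):
    print(N, total_variation(N, m))
```

The distance between the two distributions shrinks roughly in proportion to $1/N$, which is why the Poisson form is a reasonable stand-in for viral population sizes.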

By using the theory of branching processes and by assuming an infinite population size in Equation 3, we restrict the applicability of our theory to certain scenarios. We can apply our theory only to those types of fixation events that increase the average fitness of the population. The situation of genetic drift, whereby a neutral or deleterious mutant is fixed because of stochastic fluctuations in a small population (Kimura 1970; Kimura and King 1979), is not covered by our theory. This latter type of fixation event reduces the average fitness or leaves it unaltered.

Let $x_i$ be the probability that the offspring cascade spawned by a sequence $i$ goes extinct after a finite number of generations. From the theory of multitype branching processes (Harris 1963), we know that the vector of extinction probabilities $\mathbf{x} = (x_1, \ldots, x_n)$ satisfies $\mathbf{x} = \mathbf{f}(\mathbf{x})$, where $\mathbf{f}(\mathbf{z}) = (f_1(\mathbf{z}), \ldots, f_n(\mathbf{z}))$ is the probability-generating function of the distribution of offspring probabilities $P(k_1, \ldots, k_n \mid i)$. The probability-generating function is defined as

$$f_i(\mathbf{z}) = \sum_{k_1,\ldots,k_n} P(k_1,\ldots,k_n \mid i)\, z_1^{k_1} \cdots z_n^{k_n}. \tag{4}$$

After inserting Equation 3 into Equation 4, we obtain $f_i(\mathbf{z}) = e^{\sum_r M_{ir}(z_r - 1)}$. With the convention $e^{\mathbf{x}} = (e^{x_1}, \ldots, e^{x_n})$, we can rewrite this expression as

$$\mathbf{f}(\mathbf{z}) = e^{M(\mathbf{z} - \mathbf{1})}, \tag{5}$$

where $\mathbf{1} = (1, \ldots, 1)$.

Since the probability of fixation $\pi_i$ of a sequence $i$ is given by the probability that the offspring cascade spawned by $i$ does not go extinct, we have $\pi_i = 1 - x_i$. The vector of fixation probabilities therefore satisfies $\mathbf{1} - \boldsymbol{\pi} = \mathbf{f}(\mathbf{1} - \boldsymbol{\pi})$. With Equation 5, we find

$$\mathbf{1} - \boldsymbol{\pi} = e^{-M\boldsymbol{\pi}}. \tag{6}$$

This equation has exactly one solution with $0 < \pi_i < 1$ for all $i$ if the spectral radius $\rho_M$ of $M$ exceeds 1 (Harris 1963; for the matrices $M$ we are considering here, the spectral radius coincides with the largest positive eigenvalue of $M$, by virtue of the Frobenius-Perron theorem). Otherwise, $\pi_i = 0$ for all $i$.
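One simple way to solve for the fixation probabilities numerically is fixed-point iteration. The sketch below assumes that Equation 6 has the form $\mathbf{1} - \boldsymbol{\pi} = e^{-M\boldsymbol{\pi}}$, which is the form that Equation 7 expands. The helper name `fixation_probabilities` and the 2x2 matrix are hypothetical, and the iteration is only one of several possible solvers, not necessarily the method used for the article's figures.

```python
import numpy as np

def fixation_probabilities(M, tol=1e-12, max_iter=100_000):
    """Iterate pi <- 1 - exp(-M @ pi), starting from pi = 1.

    In terms of x = 1 - pi this is x <- f(x) started at x = 0, which
    converges to the extinction probabilities of the branching process.
    """
    M = np.asarray(M, dtype=float)
    pi = np.ones(M.shape[0])
    for _ in range(max_iter):
        new = 1.0 - np.exp(-M @ pi)
        if np.max(np.abs(new - pi)) < tol:
            return new
        pi = new
    return pi

# Toy example (assumption): both types have M_ii = 0.9, i.e. s_i = -0.1,
# but mutational coupling lifts the spectral radius above 1.
M = np.array([[0.9, 0.3],
              [0.3, 0.9]])
rho = max(abs(np.linalg.eigvals(M)))
pi = fixation_probabilities(M)
print("spectral radius:", rho)
print("fixation probabilities:", pi)
```

In this toy case each type on its own would be subcritical, yet both obtain a positive fixation probability because the spectral radius of `M` is 1.2.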

To compare Equation 6 to the result of Haldane (1927), we take the logarithm on both sides of Equation 6 and expand to second order:

$$\log(1-\pi_i) \approx -\pi_i - \pi_i^2/2 = -\sum_k M_{ik}\pi_k. \tag{7}$$

With $s_i = M_{ii} - 1$, this simplifies to

$$\pi_i^2/2 = s_i\pi_i + \sum_{k \neq i} M_{ik}\pi_k. \tag{8}$$

If $s_i > 0$ and all off-diagonal elements of $M$ are 0, then Equation 8 reduces to Haldane's result $\pi_i = 2s_i$; that is, the fixation probability of a sequence is twice its selective advantage. If the off-diagonal elements are nonzero, then the fixation probability is increased, because the invading sequence gets support from its mutational neighbors. In particular, even if some $s_i < 0$, the corresponding $\pi_i$ are positive as long as $\rho_M > 1$. This means that in quasispecies fixation, sequences that by themselves reproduce too slowly to outcompete the currently established quasispecies can nevertheless found a new quasispecies that grows fast enough to overtake the population.
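The quality of the second-order expansion behind Haldane's result can be gauged numerically. The sketch below treats the decoupled case (all off-diagonal elements of $M$ equal to 0) and assumes the single-type fixation equation $1 - \pi = e^{-(1+s)\pi}$. It solves this by bisection and compares the root with $2s$. The function name and the chosen values of $s$ are illustrative.

```python
import math

def fixation_probability(a, iters=200):
    """Positive root of 1 - pi = exp(-a * pi) by bisection; 0 if a <= 1."""
    if a <= 1.0:
        return 0.0
    # g(p) = 1 - exp(-a p) - p is positive below the root, negative above it.
    g = lambda p: -math.expm1(-a * p) - p
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for s in (0.001, 0.01, 0.05, 0.2):
    pi = fixation_probability(1.0 + s)
    print(f"s={s:<6} exact={pi:.5f}  Haldane 2s={2 * s:.5f}")
```

For small $s$ the two values agree closely, while for larger $s$ the approximation $2s$ increasingly overshoots the root of the full equation.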

For simplicity, we have considered only discrete, nonoverlapping generations. Generalization to continuous time is straightforward (see, e.g., Harris 1963; Hermisson et al. 2002). In the continuous-time case, the vector of fixation probabilities $\boldsymbol{\pi}$ is again determined by an equation of the form $\mathbf{1} - \boldsymbol{\pi} = \mathbf{f}(\mathbf{1} - \boldsymbol{\pi})$. However, the generating function $\mathbf{f}(\mathbf{z})$ is in general not given by Equation 5. Its functional form depends on the details of the continuous-time process that is being modeled. For example, if reproduction occurs through binary fission, $\mathbf{f}(\mathbf{z})$ will be quadratic in the variables $z_1, \ldots, z_n$.

FIXATION ON A NEUTRAL NETWORK

Exact expressions and estimates: So far, we have made no assumptions about the structure and fitness distribution of the invading quasispecies. This has led to a general equation for the vector of fixation probabilities $\boldsymbol{\pi}$, but not much further analysis is possible without a concrete model for the fitness landscape of the invading quasispecies (we do not have to make any further assumptions about the established quasispecies, since it enters the equations only through its average fitness $\langle w \rangle$). The concrete fitness landscape we study is that of a neutral network (Huynen et al. 1996; Bornberg-Bauer 1997) of related sequences with identical replication rate $\sigma$. All sequences that are not part of the neutral network are assumed to have a vanishing replication rate. Mutations occur as random substitutions of single bases, and we allow for at most one substitution per replication event, similar to the approach of van Nimwegen et al. (1999). The probability of a substitution is given by $\mu$. The restriction to at most a single substitution is a technicality that simplifies the analysis. Generalization to more elaborate mutation schemes is possible along the lines of Wilke (2001a).

We denote the sequence length by $L$ and the number of different bases by $\kappa$ ($\kappa = 4$ for RNA/DNA). For the matrix $M$, we have to take into account only the sequences belonging to the neutral network. It is useful to introduce the connection graph $G = (G_{ij})$. The elements $G_{ij}$ are 1 if and only if two sequences $i$ and $j$ are exactly one mutation apart. In all other cases, $G_{ij} = 0$. We can express $M$ in terms of $G$ as

$$M = (1 + s)\,\mathbb{1} + \beta G, \tag{9}$$

where $s = \sigma(1-\mu)/\langle w \rangle - 1$, $\beta = \sigma\mu/[\langle w \rangle L(\kappa - 1)]$, and $\mathbb{1}$ is the identity matrix. We restrict our analysis to primitive connection graphs, in which case the spectral radius $\rho_G$ of $G$ is given by the unique positive eigenvalue of largest modulus of $G$ (Varga 2000). (Irreducibility, which is often assumed in similar contexts, is not sufficient, since complex eigenvalues of modulus $\rho_G$ may exist if $G$ is not primitive. Irreducible undirected connection graphs of the kind we are considering here are primitive if they contain at least one cycle of odd length.)

The spectral radius of $M$ is given in terms of the spectral radius of the connection graph $\rho_G$ as

$$\rho_M = 1 + s + \beta\rho_G. \tag{10}$$

This implies that fixation can occur as long as $s$ exceeds $-\beta\rho_G$.
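The relation between the two spectral radii is easy to verify on a small example. The sketch below builds $M = (1+s)\,\mathbb{1} + \beta G$ from the definitions of $s$ and $\beta$ given above and checks that its largest eigenvalue equals $1 + s + \beta\rho_G$. The four-node graph, the helper name `offspring_matrix`, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical toy neutral network: a triangle (0, 1, 2) with a pendant node 3.
# The odd cycle makes the connection graph primitive.
G = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

def offspring_matrix(G, sigma, mu, w_avg, L, kappa=4):
    """Return M = (1 + s) * I + beta * G together with s and beta."""
    s = sigma * (1.0 - mu) / w_avg - 1.0
    beta = sigma * mu / (w_avg * L * (kappa - 1))
    M = (1.0 + s) * np.eye(G.shape[0]) + beta * G
    return M, s, beta

sigma, mu, w_avg, L = 1.05, 0.2, 0.85, 4  # illustrative values (assumption)
M, s, beta = offspring_matrix(G, sigma, mu, w_avg, L)

# G and M are symmetric and nonnegative, so the largest eigenvalue
# returned by eigvalsh is the spectral radius.
rho_G = np.linalg.eigvalsh(G).max()
rho_M = np.linalg.eigvalsh(M).max()
print("s =", s, " beta * rho_G =", beta * rho_G)
print("rho_M =", rho_M, " 1 + s + beta * rho_G =", 1.0 + s + beta * rho_G)
```

With these values $s$ is negative while $\rho_M$ is larger than 1, so fixation is possible even though each sequence alone replicates more slowly than the established quasispecies.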

In an experimental setting, we cannot expect to have knowledge of the complete connection graph $G$. Therefore, it is important to have approximations for the fixation probability $\pi_i$. We consider two alternative methods. Both are based on replacing the matrix $M$ in Equation 6 by a suitable diagonal matrix. This replacement leads to a decoupling of the equations for different $\pi_i$.

The quantity that is easiest to obtain experimentally is the growth rate of the invading quasispecies relative to the established quasispecies, when initially both are present in large and equal amounts. From the definition of $M$, we see that this relative growth rate corresponds to the spectral radius $\rho_M$ of $M$. If we assume that every mutant present in the invading quasispecies has an expectation of $\rho_M$ offspring per generation, then we can replace $M$ in Equation 6 with a matrix that has entries $\rho_M$ on the diagonal, while all off-diagonal elements are zero. Then, Equation 6 simplifies to $1 - \pi_i = e^{-\rho_M\pi_i}$ for all $i$. Clearly, this approximation will overestimate the $\pi_i$ for some mutants (mostly those that produce on average fewer than $\rho_M$ offspring) and underestimate it for others (mostly those that produce on average more than $\rho_M$ offspring). In the following, we refer to this estimate as the deterministic growth estimate, because it is based on the assumption that the invading quasispecies grows according to the deterministic equations from the outset.

The alternative method of estimating $\pi_i$ is as follows. It is reasonable to assume that the first couple of replication cycles mostly determine fixation or extinction for an invading sequence. During these initial generations, the subpopulation descending from the invading sequence cannot explore the full neutral network if the network is large. Therefore, the major contribution to the fixation probability comes from the connection matrix of the local genetic neighborhood of the invading sequence, and sequences farther away on the neutral network are relatively unimportant. The idea behind the second approximation is therefore to calculate the fixation probability on the basis of a small area of genotype space surrounding the invading sequence. In the simplest case, we consider only the invading sequence and its immediate mutational neighbors. Assume sequence $i$ has $\nu_i$ neutral neighbors, i.e., $\sum_j G_{ij} = \nu_i$. Then the total expected number of offspring of sequence $i$ is $\sum_j M_{ij} = s + 1 + \beta\nu_i$. Under the assumption that all offspring of $i$ have the same expected number of further offspring, the probability of fixation satisfies the equation $1 - \pi_i = e^{-(s+1+\beta\nu_i)\pi_i}$. We call the solution to this equation the neutrality estimate. As in the case of the deterministic growth estimate, it will overestimate the true fixation probability for some sequences and underestimate it for others.
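Both estimates amount to solving the fixation equation with a diagonal matrix, so they can be computed with the same solver as the exact probabilities. The sketch below does this on a hypothetical five-node graph, assuming the fixation equation $\mathbf{1} - \boldsymbol{\pi} = e^{-M\boldsymbol{\pi}}$ and $M = (1+s)\,\mathbb{1} + \beta G$. The graph, the helper `solve_pi`, and the values of `s` and `beta` are assumptions made for illustration.

```python
import numpy as np

def solve_pi(M, tol=1e-13, max_iter=200_000):
    """Fixed-point iteration pi <- 1 - exp(-M @ pi), started from pi = 1."""
    pi = np.ones(M.shape[0])
    for _ in range(max_iter):
        new = 1.0 - np.exp(-M @ pi)
        if np.max(np.abs(new - pi)) < tol:
            break
        pi = new
    return new

# Toy graph (hypothetical): triangle 0-1-2 with a tail 2-3-4.
G = np.zeros((5, 5))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]:
    G[i, j] = G[j, i] = 1.0

s, beta = 0.0, 0.05  # illustrative values, not taken from the article
M = (1.0 + s) * np.eye(5) + beta * G

nu = G.sum(axis=1)                   # number of neutral neighbors nu_i
rho_M = np.linalg.eigvalsh(M).max()  # M is symmetric and nonnegative

exact = solve_pi(M)
growth = solve_pi(np.diag(np.full(5, rho_M)))        # same value for all i
neutrality = solve_pi(np.diag(1.0 + s + beta * nu))  # depends only on nu_i

for i in range(5):
    print(i, int(nu[i]), exact[i], growth[i], neutrality[i])
```

Sequences 0 and 3 both have two neutral neighbors and therefore share one neutrality estimate, although their exact fixation probabilities differ because sequence 0 sits in the more densely connected part of the graph.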

Fixation on an RNA neutral network: We compared the two estimates to the exact fixation probabilities on a neutral network of RNA sequences. The network of 51,028 sequences of length $L = 18$ was found through exhaustive enumeration by van Nimwegen et al. (1999). The spectral radius of the network's connection graph is $\rho_G = 15.7$. To calculate fixation probabilities on this neutral network, we have to make an assumption about the average fitness $\langle w \rangle$ of the established quasispecies. We assume $\langle w \rangle = 1 - \mu[1 - \rho_G/(3L)]$, in which case the relative growth rate of the invading quasispecies (at macroscopic concentration) with respect to the established quasispecies follows from Equation 10 as $\rho_M = \sigma$, independent of the mutation rate.
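The step from the choice of $\langle w \rangle$ to $\rho_M = \sigma$ can be written out explicitly. The derivation below assumes that Equation 10 reads $\rho_M = 1 + s + \beta\rho_G$, as the definitions of $s$ and $\beta$ imply, and uses $\kappa = 4$.

```latex
\begin{align*}
\rho_M &= 1 + s + \beta\rho_G
        = \frac{\sigma(1-\mu)}{\langle w \rangle}
          + \frac{\sigma\mu\,\rho_G}{\langle w \rangle\, L(\kappa-1)} \\
       &= \frac{\sigma}{\langle w \rangle}
          \left[\,1 - \mu + \frac{\mu\,\rho_G}{3L}\right]
        = \frac{\sigma}{\langle w \rangle}
          \left\{1 - \mu\left[1 - \frac{\rho_G}{3L}\right]\right\}
        = \sigma .
\end{align*}
```

The mutation rate cancels because the assumed $\langle w \rangle$ is exactly the bracketed factor.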

Figure 1 displays the exact fixation probabilities (obtained numerically from Equation 6) and the two estimates as functions of the mutation rate. We have shown the average fixation probability $\bar{\pi} = \sum_i \pi_i/n$, the minimum probability $\pi_{\min} = \min_i\{\pi_i\}$, and the maximum probability $\pi_{\max} = \max_i\{\pi_i\}$. Since we chose $\langle w \rangle$ such that $\rho_M$ is independent of $\mu$, the deterministic growth estimate is independent of $\mu$. We observe that the deterministic growth estimate lies consistently above the average $\bar{\pi}$, but below the maximum $\pi_{\max}$. The neutrality estimate underestimates the smallest fixation probabilities and overestimates the largest ones. Its average lies slightly below $\bar{\pi}$ for small mutation rates and above $\bar{\pi}$ for large mutation rates. A more detailed plot of the fixation probabilities at a fixed mutation rate of $\mu = 0.5$ is given in Figure 2. There, we display the fixation probability vs. the neutrality (number of neutral neighbors) of the invading sequence. The spread in the fixation probabilities is remarkable. For sequences with a given neutrality, the fixation probabilities vary over up to seven orders of magnitude. This demonstrates the important influence of not only the nearest neighbors but also the wider genetic neighborhood on the fate of a single sequence in quasispecies evolution. The neutrality estimate substantially underestimates the fixation probabilities of those sequences that have only few immediate neutral neighbors, but are otherwise located in a region of the genotype space where the density of neutral sequences is high. In principle, we could improve the neutrality estimate by taking into account all neutral sequences up to some distance $d$, but in practice this method quickly becomes as unwieldy as calculating the exact fixation probabilities.

Figure 1. Fixation probability vs. mutation rate in a neutral network of 51,028 RNA sequences taken from van Nimwegen et al. (1999). Solid lines correspond to the solution of the full equations, dashed lines correspond to the neutrality estimate, and the dotted line indicates the deterministic growth estimate. $\sigma = 1.05$, $L = 18$, $\rho_G = 15.7$, $\langle w \rangle = 1 - \mu[1 - \rho_G/(3L)]$.

Multiple invading sequences: The above considerations address only the case of a single invading sequence. The generalization to more than one invading sequence is straightforward. Assume that a set $S$ of $N$ sequences, with $S = \{i_1, \ldots, i_N\}$, invades an established quasispecies. The probability that this invasion is successful is given by $1 - \prod_{i \in S}(1 - \pi_i)$, where $\pi_i$ are the fixation probabilities of the individual sequences. The probability of successful invasion of $N$ sequences can be used as an indicator for the population size at which the deterministic quasispecies equations capture the relevant dynamics of a finite population. The fluctuations distinguishing the stochastic process of a finite population from the deterministic description can be neglected if the invasion probability is close to one. In Figure 3, the fixation probability on the same neutral network of RNA sequences that we have used before is displayed against the size of the invading population. The individual data points are averaged over 1000 independent trials, where for each trial the $N$ starting sequences were chosen at random. As before, $\langle w \rangle$ is chosen such that $\sigma$ is the average number of offspring of the invading quasispecies in the deterministic limit.
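The invasion probability for a set of sequences is a one-line computation once the individual $\pi_i$ are known. The sketch below evaluates $1 - \prod_{i \in S}(1 - \pi_i)$, which treats the lineages founded by different invaders as independent, as in a branching process. The function name and the example probabilities are hypothetical.

```python
import math

def invasion_probability(pis):
    """1 - prod(1 - pi_i): chance that at least one lineage avoids extinction."""
    return 1.0 - math.prod(1.0 - p for p in pis)

# Hypothetical per-sequence fixation probabilities
print(invasion_probability([0.02, 0.05, 0.001]))

# If all N invaders share the same pi, this reduces to 1 - (1 - pi)^N
pi, N = 0.02, 200
print(invasion_probability([pi] * N))
```

Even when each individual $\pi_i$ is small, the product term decays geometrically with the number of invaders, which is why moderately sized invading populations fix almost surely.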

Figure 2. Fixation probability vs. neutrality $\nu$ of the invading sequence in a neutral network of 51,028 RNA sequences taken from van Nimwegen et al. (1999). The dots stem from the exact numerical solution, the dashed line corresponds to the neutrality estimate, and the dotted line indicates the deterministic growth estimate. The inset shows the distribution of neutralities in the network. $\mu = 0.5$, $\sigma = 1.05$, $L = 18$, $\rho_G = 15.7$, $\langle w \rangle = 1 - \mu[1 - \rho_G/(3L)]$.

Figure 3 shows that the population need not cover the relevant sequence space to behave as predicted by the deterministic equations. On a neutral network of >50,000 sequences, a population of ∼1000 behaves deterministically at an advantage in growth rate of only 1%. It is important to note that this advantage has been calculated under the assumption of an infinite population and that sufficiently small populations will grow substantially slower (van Nimwegen et al. 1999). Apparently, here a population that covers only 2% of the neutral network is not sufficiently small to experience this reduction in growth rate.

DISCUSSION

The exact expression for the probability of fixation in the quasispecies context is easy to evaluate numerically if the fitnesses of all relevant sequences are known. However, these data are normally not available for experimental systems, and approximations must be used. What is most easily available experimentally is the relative rate of growth of the two quasispecies at macroscopic concentrations, which is the basis of the deterministic growth estimate. Since this estimate gives only a single number, independently of the sequence actually seeding the invading quasispecies, it does not reflect local variations in the density of viable sequences around the invading sequence. The neutrality estimate does not suffer from this shortcoming. However, it requires the knowledge of the fitnesses of the immediate neighbors of the invading sequence. Although experimentally tedious, these fitnesses can be measured in principle. For example, Elena and Lenski (1997) generated 225 mutant strains of the bacterium Escherichia coli (each mutant differed from the wild type by one, two, or three mutations) and measured the relative fitnesses of the mutant strains to the wild type. The mutant neighborhood of an RNA virus can conceivably be measured in a similar manner.

Figure 3. Fixation probability $\pi$ vs. size of the invading population $N$ in a neutral network of 51,028 RNA sequences. The fixation probability is averaged over 1000 independent sets of invading sequences, chosen at random. The error bars indicate the standard deviation. Lines are meant as a guide to the eye. $\mu = 0.2$, $L = 18$, $\rho_G = 15.7$, $\langle w \rangle = 1 - \mu[1 - \rho_G/(3L)]$.

The predictive power of both the deterministic growth estimate and the neutrality estimate depends strongly on the distribution of neutral sequences in sequence space. For example, both estimates become exact for the case of a uniform neutral lattice, in which all sequences have exactly the same neutrality. Furthermore, we expect the neutrality estimate to perform particularly well in networks in which a sequence's neutrality is strongly correlated to the neutralities of its immediate and more distant neutral neighbors. The deterministic growth estimate, on the other hand, will yield the best results if the neutral network does not decompose into areas that are substantially more densely or less densely connected than other areas. However, to what extent these conditions are met in natural systems is questionable. As we have seen in this article, the connection graph of a comparatively simple neutral network, consisting of RNA sequences that are only 18 bp long, is already so heterogeneous that both estimates fail to give an accurate prediction of the fixation probability for a substantial fraction of sequences on that network. It is reasonable to assume that the distribution of high-fitness sequences in sequence space for an RNA virus that consists of several thousand bases is at least as heterogeneous as the one in our toy RNA network, probably more so.

In this work, we have considered only the fate of a single invading quasispecies. However, while an invading quasispecies is moving toward fixation or extinction, another mutant, one that belongs to a quasispecies of even higher mean fitness, may appear. The fixation probability of the first invader will then be modulated by the dynamics of the second one and vice versa, an effect commonly referred to as “clonal interference” (Gerrish and Lenski 1998). Clonal interference has been reported in experiments with vesicular stomatitis virus (Miralles et al. 1999, 2000) and with the bacterium E. coli (de Visser et al. 1999). Currently, an accurate mathematical description of clonal interference for the quasispecies case is not available.

The approach we have followed in this work cannot be directly generalized to include clonal interference, because the assumption of a constant background average fitness $\langle w \rangle$ is not justified in the context of two (or more) competing branching processes. A second problem that we have to solve in a theory of quasispecies clonal interference is the identification of advantageous mutants. Throughout this article, we have used the definition that an advantageous mutant is one that can grow into a quasispecies with higher average fitness than that of the currently established quasispecies. To use this definition in the context of clonal interference, we need to have a priori knowledge about how to best subdivide the sequence space into independent quasispecies. Only with this knowledge can we decide whether a particular new mutant is part of the parent quasispecies or rather the founding member of a new quasispecies. A possible way to study clonal interference in future work will be to consider a particular fitness landscape (for example, a set of intertwined neutral networks at different fitness levels) for which the a priori separation into distinct quasispecies is possible. For such a landscape, numerical studies of clonal interference will be straightforward, and an analytic description should be possible as well. For landscapes that are a priori unknown, even the numerical investigation of clonal interference will remain difficult until a workable method for the identification of advantageous mutants has been found.

Recently, Jenkins et al. (2001) and Holmes and Moya (2002) expressed doubts regarding the relevance of the quasispecies model for virology (but see Domingo 2002). They argued that there is no unequivocal experimental evidence for the quasispecies nature of RNA viruses and that the deterministic quasispecies equations are potentially not applicable to viral evolution on theoretical grounds, due to the immense size of the sequence space. The results of this article show that the second concern is not entirely justified. A single sequence has a positive probability to rise to fixation if and only if the average fitness of the quasispecies that will form eventually exceeds the average fitness of the currently established quasispecies. The individual fitness of the invading sequence has some influence on the exact value of that probability, but does not affect whether fixation is possible at all. Moreover, when the population size reaches several hundred, with probability of almost one the population will, for reasonable choices of the parameters, behave as predicted by the deterministic equations. A similar result has been obtained by van Nimwegen et al. (1999) for flow reactor simulations, where, on the same neutral network of RNA sequences that we have studied here, quasispecies effects started to become important when the product of population size and mutation rate $N\mu$ exceeded the value 10 (see Figure 3 of van Nimwegen et al. 1999).

Wilke (2001b) studied the probability of fixation for RNA sequences in a simulated flow reactor. The measured fixation probability was compared to an expression equivalent to the deterministic growth estimate of this work (since continuous-time simulations were used to generate the data, the exact expressions differ from those given here). The analytic expression correctly predicted the parameter regions for which fixation was possible. In particular, the mutation rate at which a slower replicator with better mutational support could successfully invade a quasispecies consisting of sequences with higher individual fitnesses was determined accurately. However, the exact fixation probabilities seemed to be slightly overestimated. (Within the statistical accuracy of the data, a definite decision on this issue could not be made. While the data were in agreement with the model according to a $\chi^2$ test, they were not in agreement according to a nonparametric test based on how often the data points fell above or below the predicted value.)

The probability of fixation of advantageous mutants is obviously of tremendous importance for disease dynamics and vaccines. For example, live vaccines of attenuated poliovirus can contain small amounts of virulent poliovirus variants (Chumakov et al. 1991), the reason being that attenuated and virulent virus variants are often separated by only one or a few mutations. In experiments, small amounts of highly virulent virus typically remain suppressed by the less virulent virus, but once a threshold concentration of the highly virulent virus variant is reached, infection occurs (de la Torre and Holland 1990; Chumakov et al. 1991; Teng et al. 1996). The apparent existence of such a threshold may well be a result of insufficient resolution of the experiments. Whether the highly virulent strain will grow is determined by stochastic fluctuations, and, as we have seen in Figure 3, the probability of fixation decays quickly with shrinking initial concentration of the virulent strain. If such a strain in a vaccine has a 1% chance to cause infection, then ≫100 replicates of the appropriate assay are necessary to observe at least one infection with certainty. Probabilities of this magnitude or lower can easily be missed at low numbers of replicates, so that the virulent strain appears to be safely suppressed.
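The replicate argument can be made concrete with elementary probability. The sketch below assumes independent replicates that each show infection with probability $p$, so that the chance of at least one infection in $n$ replicates is $1 - (1-p)^n$. The function names and the 95% confidence target are illustrative assumptions, not values from the article.

```python
import math

def prob_at_least_one(p, n):
    """Chance that at least one of n independent replicates shows infection."""
    return 1.0 - (1.0 - p) ** n

def replicates_needed(p, confidence):
    """Smallest n with prob_at_least_one(p, n) >= confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))

p = 0.01  # the 1% chance used as an example in the text
print(prob_at_least_one(p, 100))   # about 0.63
print(replicates_needed(p, 0.95))  # 299
```

With 100 replicates there is still roughly a one-in-three chance of seeing no infection at all, which illustrates how an apparent threshold can arise from limited experimental resolution.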

APPENDIX

We consider a model with discrete, nonoverlapping generations and a constant population size $N$. Under the assumption that the reproductive success of a sequence $i$ is proportional to its fitness $w_i$, the probability that a randomly chosen sequence in the next generation is offspring of sequence $i$ is given by $\xi = w_i/(\langle w\rangle N)$, where $\langle w\rangle$ is the average fitness in the population. Since there are $N$ sequences in the population, the probability that $k$ of them are offspring of sequence $i$ is binomial, $P(k\mid i)=\binom{N}{k}\xi^{k}(1-\xi)^{N-k}$. Now consider a sequence of type $r$ in the offspring generation. For the probability that the parent of sequence $r$ is a particular sequence $i$ of the previous generation, we find $\xi_r = Q_{ri}\xi = w_iQ_{ri}/(\langle w\rangle N)$, because only a fraction $Q_{ri}$ of the total offspring of $i$ will be of type $r$. Following the previous argument, we find for the probability that sequence $i$ leaves $k_r$ offspring of type $r$: $P(k_r\mid i)=\binom{N}{k_r}\xi_r^{k_r}(1-\xi_r)^{N-k_r}$.
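The binomial offspring distribution can be checked numerically. The sketch below is an illustration only: the function names and the parameter values in the example are assumptions, not part of the original derivation. It evaluates the probability that sequence i leaves k offspring of type r and confirms that the distribution is normalized, with mean equal to the fitness ratio times the fraction of offspring of type r.

```python
from math import comb

def xi(w_i, mean_w, N):
    """Probability that a randomly chosen sequence of the next generation
    is offspring of sequence i: w_i / (<w> N)."""
    return w_i / (mean_w * N)

def offspring_pmf(k, w_i, mean_w, N, Q_ri=1.0):
    """Binomial probability that sequence i leaves k offspring of type r.
    Q_ri is the fraction of i's offspring that are of type r; with the
    default Q_ri = 1 this reduces to the total offspring distribution."""
    x = Q_ri * xi(w_i, mean_w, N)
    return comb(N, k) * x**k * (1.0 - x) ** (N - k)

if __name__ == "__main__":
    # Illustrative values only.
    N, w_i, mean_w, Q_ri = 50, 1.2, 1.0, 0.9
    pmf = [offspring_pmf(k, w_i, mean_w, N, Q_ri) for k in range(N + 1)]
    print(sum(pmf))                               # normalization
    print(sum(k * p for k, p in enumerate(pmf)))  # mean number of type-r offspring
```

The mean of this distribution is the product of N and the per-sequence probability, that is, the fitness of i times the fraction of type r offspring divided by the average fitness, which does not depend on N.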

We can extend the above argument to sequences of two types, $r$ and $s$. The probability that sequence $i$ leaves $k_r$ offspring sequences of type $r$ and $k_s$ offspring sequences of type $s$ is the probability that $k_r$ offspring are of type $r$, $\xi_r^{k_r}$, times the probability that $k_s$ offspring are of type $s$, $\xi_s^{k_s}$, times the probability that the remaining offspring either are of different types or have different parent sequences, $(1-\xi_r-\xi_s)^{N-k_r-k_s}$, times the number of possible ways in which $k_r$ and $k_s$ sequences can be chosen out of the total of $N$ sequences in the population. This latter number is a multinomial coefficient, $N!/[k_r!\,k_s!\,(N-k_r-k_s)!]$. Putting everything together, we find

$$P(k_r,k_s\mid i)=\frac{N!}{k_r!\,k_s!\,(N-k_r-k_s)!}\,\xi_r^{k_r}\,\xi_s^{k_s}\,(1-\xi_r-\xi_s)^{N-k_r-k_s}. \tag{A1}$$

By repeating this argument for $n$ different sequence types, and with the definition $M_{ij} := N\xi_j = w_iQ_{ji}/\langle w\rangle$, we arrive at Equation 3.
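The two-type distribution of Equation A1 and the matrix of mean offspring numbers can be sketched in a few lines. This is an illustration under stated assumptions: the function names, the example fitnesses, the mutation matrix, and the choice of average fitness equal to 1 are all hypothetical. The mutation matrix is stored so that `Q[j][i]` is the fraction of offspring of type i that are of type j.

```python
from math import factorial

def two_type_pmf(k_r, k_s, xi_r, xi_s, N):
    """Equation A1: probability that sequence i leaves k_r offspring of type r
    and k_s offspring of type s in a population of constant size N."""
    rest = N - k_r - k_s
    if k_r < 0 or k_s < 0 or rest < 0:
        return 0.0
    coeff = factorial(N) // (factorial(k_r) * factorial(k_s) * factorial(rest))
    return coeff * xi_r**k_r * xi_s**k_s * (1.0 - xi_r - xi_s) ** rest

def mean_matrix(w, Q, mean_w):
    """Matrix of mean offspring numbers, M[i][j] = w[i] * Q[j][i] / mean_w,
    i.e., the expected number of type-j offspring of one sequence of type i."""
    n = len(w)
    return [[w[i] * Q[j][i] / mean_w for j in range(n)] for i in range(n)]

if __name__ == "__main__":
    # Illustrative values only.
    N = 20
    w = [1.0, 1.5]
    Q = [[0.9, 0.2],
         [0.1, 0.8]]  # each column sums to 1
    M = mean_matrix(w, Q, mean_w=1.0)
    xi_r, xi_s = M[1][0] / N, M[1][1] / N  # offspring of a type-1 parent
    total = sum(two_type_pmf(a, b, xi_r, xi_s, N)
                for a in range(N + 1) for b in range(N + 1 - a))
    print(M, total)
```

Dividing a row of the mean matrix by N recovers the per-type probabilities that enter Equation A1, so the mean number of type r offspring under A1 equals the corresponding matrix entry.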

Footnotes

Communicating editor: M. W. Feldman

Note added in proof: After acceptance of this manuscript, I became aware of a recent paper by T. Johnson and N. H. Barton (2002, The effect of deleterious alleles on adaptation in asexual populations. Genetics 162: 395–411). Using the theory of multitype branching processes, Johnson and Barton studied in this article the probability of fixation of a mutant that suffers additional deleterious mutations while going to fixation.

Acknowledgement

I thank C. Adami for extensive discussions and encouragement and E. C. Holmes for commenting on an earlier version of this manuscript. Moreover, I thank M. Huynen for providing the neutral network data from van Nimwegen et al. (1999). Financial support by the National Science Foundation under contract DEB-9981397 is gratefully acknowledged.

LITERATURE CITED

Barton, N. H., 1995 Linkage and the limits to natural selection. Genetics 140: 821–841.

Biebricher, C. K., and R. Luce, 1993 Sequence analysis of RNA species synthesized by Qβ replicase without template. Biochemistry 32: 4848–4854.

Bornberg-Bauer, E., 1997 How are model protein structures distributed in sequence space? Biophys. J. 73: 2393–2403.

Burch, C. L., and L. Chao, 2000 Evolvability of an RNA virus is determined by its mutational neighbourhood. Nature 406: 625–628.

Bürger, R., and W. J. Ewens, 1995 Fixation probabilities of additive alleles in diploid populations. J. Math. Biol. 33: 557–575.

Chumakov, K. M., L. B. Powers, K. E. Noonan, I. B. Roninson and I. S. Levenbook, 1991 Correlation between amount of virus with altered nucleotide sequence and the monkey test for acceptability of oral poliovirus vaccine. Proc. Natl. Acad. Sci. USA 88: 199–203.

de la Torre, J. C., and J. J. Holland, 1990 RNA virus quasispecies populations can suppress vastly superior mutant progeny. J. Virol. 64: 6278–6281.

de Visser, J. A. G. M., C. W. Zeyl, P. J. Gerrish, J. L. Blanchard and R. E. Lenski, 1999 Diminishing returns from mutation supply rate in asexual populations. Science 283: 404–406.

Demetrius, L., P. Schuster and K. Sigmund, 1985 Polynucleotide evolution and branching processes. Bull. Math. Biol. 47: 239–262.

Domingo, E., 2002 Quasispecies theory in virology. J. Virol. 76: 463–465.

Domingo, E., and J. J. Holland, 1997 RNA virus mutations and fitness for survival. Annu. Rev. Microbiol. 51: 151–178.

Domingo, E., R. A. Flavell and C. Weissmann, 1976 In vitro site directed mutagenesis: generation and properties of an infectious extracistronic mutant of bacteriophage Qβ. Gene 1: 3–25.

Domingo, E., D. Sabo, T. Taniguchi and C. Weissmann, 1978 Nucleotide sequence heterogeneity of an RNA phage population. Cell 13: 735–744.

Domingo, E., C. K. Biebricher, M. Eigen and J. J. Holland, 2001 Quasispecies and RNA Virus Evolution: Principles and Consequences. Landes Bioscience, Georgetown, TX.

Drake, J. W., 1993 Rates of spontaneous mutation among RNA viruses. Proc. Natl. Acad. Sci. USA 90: 4171–4175.

Drake, J. W., and J. J. Holland, 1999 Mutation rates among RNA viruses. Proc. Natl. Acad. Sci. USA 96: 13910–13913.

Eigen, M., and P. Schuster, 1979 The Hypercycle: A Principle of Natural Self-Organization. Springer-Verlag, Berlin.

Elena, S. F., and R. E. Lenski, 1997 Test of synergistic interactions among deleterious mutations in bacteria. Nature 390: 395–398.

Ewens, W. J., 1967 The probability of fixation of a mutant: the two-locus case. Evolution 21: 532–540.

Fisher, R. A., 1922 On the dominance ratio. Proc. R. Soc. Edinb. 42: 321–341.

Fisher, R. A., 1930 The distribution of gene ratios for rare mutations. Proc. R. Soc. Edinb. 50: 204–219.

Gerrish, P. J., and R. E. Lenski, 1998 The fate of competing beneficial mutations in an asexual population. Genetica 102/103: 127–144.

Haldane, J. B. S., 1927 A mathematical theory of natural and artificial selection. Part V: selection and mutation. Proc. Camb. Philos. Soc. 23: 838–844.

Harris, T. E., 1963 The Theory of Branching Processes. Springer, Berlin.

Hermisson, J., O. Redner, H. Wagner and E. Baake, 2002 Mutation-selection balance: ancestry, load, and maximum principle. Theor. Popul. Biol. 62: 9–46.

Hofbauer, J., and K. Sigmund, 1988 The Theory of Evolution and Dynamical Systems. Cambridge University Press, Cambridge, UK.

Holland, J., K. Spindler, F. Horodyski, E. Grabau, S. Nichol et al., 1982 Rapid evolution of RNA genomes. Science 215: 1577–1585.

Holmes, E. C., and A. Moya, 2002 Is the quasispecies concept relevant to RNA viruses? J. Virol. 76: 460–462.

Huynen, M. A., P. F. Stadler and W. Fontana, 1996 Smoothness within ruggedness: the role of neutrality in adaptation. Proc. Natl. Acad. Sci. USA 93: 397–401.

Jenkins, G. M., M. Worobey, C. H. Woelk and E. C. Holmes, 2001 Evidence for the non-quasispecies evolution of RNA viruses. Mol. Biol. Evol. 18: 987–994.

Kimura, M., 1957 Some problems of stochastic processes in genetics. Ann. Math. Stat. 28: 882–901.

Kimura, M., 1964 Diffusion models in population genetics. J. Appl. Prob. 1: 177–232.

Kimura, M., 1970 The length of time required for a selectively neutral mutant to reach fixation through random frequency drift in a finite population. Genet. Res. 15: 131–133.

Kimura, M., and J. L. King, 1979 Fixation of a deleterious allele at one of two “duplicate” loci by mutation pressure and random drift. Proc. Natl. Acad. Sci. USA 76: 2858–2861.

Krakauer, D. C., and J. B. Plotkin, 2002 Redundancy, antiredundancy, and the robustness of genomes. Proc. Natl. Acad. Sci. USA 99: 1405–1409.

Miralles, R., P. J. Gerrish, A. Moya and S. F. Elena, 1999 Clonal interference and the evolution of RNA viruses. Science 285: 1745–1747.

Miralles, R., A. Moya and S. F. Elena, 2000 Diminishing returns of population size in the rate of RNA virus adaptation. J. Virol. 74: 3566–3571.

Nowak, M. A., 1992 What is a quasispecies? TREE 7: 118–121.

Otto, S. P., and N. H. Barton, 1997 The evolution of recombination: removing the limits to natural selection. Genetics 147: 879–906.

Pollak, E., 2000 Fixation probabilities when the population size undergoes cyclic fluctuations. Theor. Popul. Biol. 57: 51–58.

Schuster, P., and J. Swetina, 1988 Stationary mutant distributions and evolutionary optimization. Bull. Math. Biol. 50: 635–660.

Steinhauer, D. A., J. C. de la Torre, E. Meier and J. J. Holland, 1989 Extreme heterogeneity in populations of vesicular stomatitis virus. J. Virol. 63: 2072–2080.

Teng, M. N., M. B. A. Oldstone and J. C. de la Torre, 1996 Suppression of lymphocytic choriomeningitis virus-induced growth hormone deficiency syndrome by disease-negative virus variants. Virology 223: 113–119.

van Nimwegen, E., J. P. Crutchfield and M. Huynen, 1999 Neutral evolution of mutational robustness. Proc. Natl. Acad. Sci. USA 96: 9716–9720.

Varga, R. S., 2000 Matrix Iterative Analysis, Ed. 2. Springer-Verlag, New York.

Wilke, C. O., 2001a Adaptive evolution on neutral networks. Bull. Math. Biol. 63: 715–730.

Wilke, C. O., 2001b Selection for fitness versus selection for robustness in RNA secondary structure folding. Evolution 55: 2412–2420.

Wilke, C. O., J. L. Wang, C. Ofria, R. E. Lenski and C. Adami, 2001 Evolution of digital organisms at high mutation rate leads to survival of the flattest. Nature 412: 331–333.

© Genetics 2003
