Genome-wide mutational diversity in an evolving population of Escherichia coli

J E Barrick et al. Cold Spring Harb Symp Quant Biol. 2009.

Abstract

The level of genetic variation in a population is the result of a dynamic tension between evolutionary forces. Mutations create variation, certain frequency-dependent interactions may preserve diversity, and natural selection purges variation. New sequencing technologies offer unprecedented opportunities to discover and characterize the diversity present in evolving microbial populations on a whole-genome scale. By sequencing mixed-population samples, we have identified single-nucleotide polymorphisms (SNPs) present at various points in the history of an Escherichia coli population that has evolved for almost 20 years from a founding clone. With 50-fold genome coverage, we were able to catch beneficial mutations as they swept to fixation, discover contending beneficial alleles that were eliminated by clonal interference, and detect other minor variants possibly adapted to a new ecological niche. Additionally, there was a dramatic increase in genetic diversity late in the experiment after a mutator phenotype evolved. Still finer-resolution details of the structure of genetic variation and how it changes over time in microbial evolution experiments will enable new applications and quantitative tests of population genetic theory.

Figures

Figure 1. Expected dynamics in an evolving bacterial population

Lineages with new beneficial mutations are depicted as shaded wedges that originate in a previous genetic background and rise in frequency as they outcompete their ancestor and other lineages (Muller 1932). The same shading indicates lineages have equivalent fitnesses, and the path to the final dominant genotype containing five mutations is highlighted by the light gray curve. This figure was produced using a simulation with population size and mutation parameters meant to model the first 600 generations of the E. coli long-term evolution experiment (Woods 2005). Notice how the level of genetic diversity changes over time. Early on, a new beneficial mutation sweeps to fixation and the population has little diversity (a). Later, four lineages with different mutations coexist at appreciable frequencies for a time (b) before the descendants of one lineage become a majority (c).
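The rise and displacement of competing lineages described in this caption can be illustrated with a minimal deterministic sketch. This is not the simulation used to produce the figure: it ignores new mutations and genetic drift, and the starting frequencies and fitness values below are hypothetical numbers chosen only for illustration.

```python
def compete(freqs, fitnesses, generations):
    """Deterministic selection among asexual lineages.

    Each generation, every lineage's frequency is reweighted by its
    fitness relative to the population mean. Returns the frequency
    trajectory, one list of lineage frequencies per generation.
    """
    history = [list(freqs)]
    for _ in range(generations):
        mean_w = sum(x * w for x, w in zip(freqs, fitnesses))
        freqs = [x * w / mean_w for x, w in zip(freqs, fitnesses)]
        history.append(freqs)
    return history


def lineages_above(freqs, threshold):
    """Count lineages whose frequency exceeds a threshold."""
    return sum(1 for x in freqs if x > threshold)


# Hypothetical example: an ancestor and two beneficial lineages.
# The fitter lineage eventually displaces the other (clonal interference).
history = compete([0.98, 0.01, 0.01], [1.00, 1.10, 1.12], 300)
diversity = [lineages_above(f, 0.04) for f in history]
```

In this toy run the less fit beneficial lineage first rises and then declines as the fitter one takes over, so the count of lineages at appreciable frequency goes up and back down, as in panels (b) and (c).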

Figure 2. Example coverage distribution and base error rates

The 2K mixed-population sample is displayed as representative of the sequence datasets. (a) The distribution of the number of ancestral genomic positions with a given read coverage depth (open circles) is over-dispersed relative to a Poisson model (dashed line) but is fit reasonably well by a negative binomial model (solid line). Repeat regions were excluded from this analysis. (b) The probability of a base error at a given quality score, estimated from the number of observed mismatches in reads aligned to the reference genome, generally decreases as the quality score assigned to a base increases. Bases assigned a quality score of 10 had an anomalously high error rate in this dataset. The accompanying histogram shows that most bases in the dataset had high quality scores. Bases assigned a quality score of 40 do not appear on the log scale because they had zero errors.
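The comparison in panel (a) can be sketched as follows. This is an illustrative sketch only: it fits the negative binomial by the method of moments, which is an assumption made here and not necessarily how the authors fit their model, and the function names and the example depths are hypothetical.

```python
import math


def dispersion(depths):
    """Return (mean, sample variance, variance/mean) of coverage depths.

    A variance-to-mean ratio well above 1 indicates over-dispersion
    relative to a Poisson model, for which the ratio is 1.
    """
    n = len(depths)
    mean = sum(depths) / n
    var = sum((d - mean) ** 2 for d in depths) / (n - 1)
    return mean, var, var / mean


def poisson_pmf(k, mean):
    """Poisson probability of observing depth k."""
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))


def neg_binomial_pmf(k, mean, var):
    """Negative binomial probability of depth k, matched by moments.

    Requires var > mean. The size parameter r and success probability p
    are chosen so that the distribution has the given mean and variance.
    """
    r = mean * mean / (var - mean)
    p = r / (r + mean)
    log_pmf = (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
               + r * math.log(p) + k * math.log(1.0 - p))
    return math.exp(log_pmf)


# Hypothetical per-site depths, for illustration only.
mean, var, ratio = dispersion([30, 45, 50, 55, 70])
```

With the same mean, the negative binomial places more probability on unusually low and unusually high depths than the Poisson does, which is the pattern an over-dispersed coverage distribution shows.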

Figure 3. Sensitivity of SNP prediction procedure

(a) Estimates of the probability that our statistical procedure would detect SNPs present at various frequencies in a mixed-population sample at different E-value cutoffs. For these calculations, the coverage and quality score distributions were those of the mixed-population 2K sample. (b) Estimates of sensitivity improvements possible by increasing sequencing coverage and by reducing the rate of base errors. For these calculations all sites had uniform coverage and the same error rate for all bases.
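The qualitative trends in panel (b) can be reproduced with a simplified calculation under uniform coverage and a uniform error rate. This sketch is a stand-in, not the authors' statistical procedure: it assumes a simple count threshold on reads carrying one specific alternative base, chosen so that the expected number of false positives across all sites stays below the E-value cutoff. The function names, the genome size of 4.6 million sites, and the example parameter values are assumptions for illustration.

```python
import math


def binom_tail(k_min, n, p):
    """P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)
               for k in range(k_min, n + 1))


def min_variant_reads(coverage, error_rate, n_sites, e_value_cutoff):
    """Smallest variant-read count unlikely to arise from errors alone.

    Assumes a base error yields each of the three other bases with
    equal probability. Returns None if no count meets the cutoff.
    """
    p_err = error_rate / 3.0
    for k in range(coverage + 1):
        if n_sites * binom_tail(k, coverage, p_err) <= e_value_cutoff:
            return k
    return None


def sensitivity(freq, coverage, error_rate, n_sites, e_value_cutoff):
    """Probability that a SNP at frequency freq reaches the threshold."""
    k_min = min_variant_reads(coverage, error_rate, n_sites, e_value_cutoff)
    if k_min is None:
        return 0.0
    # Variant base is seen from true variant reads read correctly,
    # or from reference reads miscalled as that specific base.
    p_obs = freq * (1.0 - error_rate) + (1.0 - freq) * error_rate / 3.0
    return binom_tail(k_min, coverage, p_obs)


# Assumed values: 4.6 million sites, E-value cutoff of 0.01.
base = sensitivity(0.05, 50, 0.01, 4_600_000, 0.01)
deeper = sensitivity(0.05, 500, 0.01, 4_600_000, 0.01)
cleaner = sensitivity(0.05, 50, 0.001, 4_600_000, 0.01)
```

Under these assumptions, both higher coverage and a lower base error rate increase the chance of detecting a SNP at 5% frequency, in line with the direction of the improvements the caption describes.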

Figure 4. SNP predictions

The cumulative distributions of predictions below a given E-value threshold that also passed the bias filtering step are plotted for each dataset. Each panel contains a generation-paired mixed-population sample (squares and solid lines) and clone (circles and dashed lines), except there is only a clone at 0K and only a mixed population at 30K.

Figure 5. Mutational diversity in an evolving E. coli population

(a) Origin and eventual fate of point mutations in the 2K to 40K mixed-population samples. New mutations that first appear as SNPs or fixed alleles are shown as asterisks along the bottom or top, respectively, with arrows leading to the corresponding pools of SNPs and fixed mutations. Transient SNPs that were lost from the population are shown by descending lines ending in closed circles. Note that we only detect SNPs when they are between roughly 4% and 96% frequency in the population, and that we only recover approximately 50% of the SNPs at 5% frequency. Only the 49 SNP predictions in Table 2 were included for the 2K to 20K samples. (b) Stylized summary of the mixed-population SNP analysis. Shaded wedges represent subpopulations containing new mutations relative to the previous genetic background. Mutations are grouped to highlight their eventual fates, but we do not always have linkage information to resolve which SNPs occurred together. Labeled features are explained in the text.
