Approximating the coalescent with recombination

Comparative Study

Gilean A T McVean et al. Philos Trans R Soc Lond B Biol Sci. 2005.

Abstract

The coalescent with recombination describes the distribution of genealogical histories and resulting patterns of genetic variation in samples of DNA sequences from natural populations. However, using the model as the basis for inference is currently severely restricted by the computational challenge of estimating the likelihood. We discuss why the coalescent with recombination is so challenging to work with and explore whether simpler models, under which inference is more tractable, may prove useful for genealogy-based inference. We introduce a simplification of the coalescent process in which coalescence between lineages with no overlapping ancestral material is banned. The resulting process has a simple Markovian structure when generating genealogies sequentially along a sequence, yet has very similar properties to the full model, both in terms of describing patterns of genetic variation and as the basis for statistical inference.

Figures

Figure 1

The ratio of the average number of recombination events in the ARG for the standard coalescent to the average number of recombination events in the SMC model for _n_=2. The average number of recombination events in the SMC is equal to ρ.
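The SMC value quoted in this caption can be checked with a short calculation. This is a sketch under the usual scaling, which this record does not state: ρ = 4N_e r, with time measured in units of 2N_e generations, so that each lineage recombines at rate ρ/2 and a pair of lineages coalesces at rate 1. The symbols R_SMC (number of recombination events), L (total branch length of a marginal tree) and T (its height) are introduced here for illustration only.

$$
E[R_{\mathrm{SMC}}] \;=\; \frac{\rho}{2}\,E[L] \;=\; \frac{\rho}{2}\cdot 2\,E[T] \;=\; \rho, \qquad T \sim \mathrm{Exp}(1) \ \text{for } n = 2 .
$$

The first equality uses the fact that, in the SMC, recombination events fall only on the marginal tree at each position, and that each marginal tree keeps the standard coalescent distribution.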

Figure 2

The sequentially Markov coalescent with recombination. The point of the recombination event (indicated by a crossmark) is placed uniformly on the tree. The branch above it is removed and the lineage coalesces back to the remaining tree at a rate proportional to the number of lineages present.
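The update described in this caption can be sketched for the simplest case, n=2. The code below is a minimal illustration and not the authors' implementation. It assumes a sequence of unit length, time in coalescent units (pairwise coalescence rate 1), and a recombination rate of ρ/2 per lineage. The function name `simulate_smc_pair` and its return format are hypothetical.

```python
import random


def simulate_smc_pair(rho, rng):
    """Sketch of the SMC for two sequences along a unit-length region.

    Returns (n_events, segments), where segments is a list of
    (start_position, tmrca) pairs, one per marginal tree.
    """
    position = 0.0
    tmrca = rng.expovariate(1.0)  # first tree: pair coalesces at rate 1
    segments = [(position, tmrca)]
    n_events = 0
    while True:
        # Two branches of length tmrca, each recombining at rate rho/2,
        # give a total rate of rho * tmrca per unit of sequence.
        rate = rho * tmrca
        if rate <= 0.0:
            break  # no recombination possible (rho = 0 or degenerate tree)
        position += rng.expovariate(rate)
        if position >= 1.0:
            break
        n_events += 1
        # Place the recombination point uniformly on the tree; by symmetry
        # only its height matters when n = 2.
        cut = rng.uniform(0.0, tmrca)
        # Remove the branch above the cut. The floating lineage coalesces
        # back to the single remaining lineage at rate 1.
        tmrca = cut + rng.expovariate(1.0)
        segments.append((position, tmrca))
    return n_events, segments


if __name__ == "__main__":
    rng = random.Random(42)
    counts = [simulate_smc_pair(5.0, rng)[0] for _ in range(2000)]
    print("mean number of recombination events:", sum(counts) / len(counts))
```

Because the next tree depends only on the current one, the genealogy can be generated left to right without tracking the full ancestral recombination graph. Averaged over many replicates, the event count should be close to ρ, in line with the Figure 1 caption.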

Figure 3

The decay of LD as a function of genetic distance (ρ), as approximated by $\sigma_d^2$, under the standard coalescent process (black) and the sequentially Markov version (grey).

Figure 4

Log likelihood surfaces for θ and ρ under the standard coalescent and sequentially Markov processes. The maximum likelihood estimates of both parameters are very similar under the two models, although the likelihood surfaces are very flat.

Figure 5

Expected values of the number of recombination events and TMRCA at each position under the standard coalescent (black) and sequentially Markov (grey) processes. Maximum likelihood estimates of θ and ρ were used for each. Open circles mark the positions of the mutations in the sequences.
