A new method for modeling coalescent processes with recombination

Ying Wang et al. BMC Bioinformatics. 2014.

Abstract

Background: Recombination plays an important role in maintaining genetic diversity in many types of organisms, especially diploid eukaryotes. Recombination can be studied and used to map diseases. However, it adds a great deal of complexity to genetic information, which makes the estimation of evolutionary parameters more difficult. After the coalescent process was formulated, models capable of describing recombination using graphs, such as the ancestral recombination graph (ARG), were also developed. ARGs are typically simulated under one of two kinds of model: back-in-time models, such as the one used by ms, and spatial models, including Wiuf and Hein's algorithm, SMC, SMC', and MaCS.

Results: In this study, a new method of modeling coalescence with recombination, the Spatial Coalescent simulator (SC), was developed; it considerably improves on the algorithm described by Wiuf and Hein. The algorithm constructs the ARG spatially along the sequence but does not produce the redundant branches that are inevitable in Wiuf and Hein's algorithm. The distribution of ARGs generated by the new algorithm is identical to that generated by the typical back-in-time model adopted by ms, a program commonly used to model coalescence. It is demonstrated here that existing approximate methods, namely the sequentially Markov coalescent (SMC), the related method SMC', and the Markovian coalescent simulator (MaCS), can be viewed as special cases of the present method. In simulations, the time to the most recent common ancestor (TMRCA) of the local trees of ARGs generated by the present algorithm was closer to that produced by ms than was the TMRCA produced by MaCS. Sample-consistent ARGs can also be generated using the present method, which may significantly reduce the computational burden.
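As a point of reference for the TMRCA comparison, the height of a single local tree under the standard coalescent without recombination can be simulated directly. The following Python sketch is not the SC algorithm. It assumes time is measured in units in which each pair of lineages coalesces at rate 1, and the names `simulate_tmrca` and `expected_tmrca` are illustrative rather than taken from the paper.

```python
import random


def simulate_tmrca(n, rng):
    """Height of one coalescent tree for n samples, ignoring recombination.

    Assumption: while k lineages remain, the waiting time to the next
    coalescence is exponential with rate k * (k - 1) / 2.
    """
    t = 0.0
    k = n
    while k > 1:
        rate = k * (k - 1) / 2.0
        t += rng.expovariate(rate)  # waiting time until two lineages merge
        k -= 1                      # one fewer lineage after each merger
    return t


def expected_tmrca(n):
    """Closed-form mean tree height, 2 * (1 - 1/n), in the same time units."""
    return 2.0 * (1.0 - 1.0 / n)


if __name__ == "__main__":
    rng = random.Random(2014)
    heights = [simulate_tmrca(10, rng) for _ in range(20000)]
    print(sum(heights) / len(heights), expected_tmrca(10))
```

Comparing the simulated mean with the closed form gives a quick sanity check of the kind that a comparison against ms would use for the first local tree of an ARG.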

Conclusion: In summary, the present method and algorithm may facilitate the estimation and description of recombination in population genomics and evolutionary biology.

Figures

Figure 1

Five types of recombination in the history of a population.

Figure 2

An example of a current ARG. The graph displays the [0, 0.5] part of an ARG. The black lines represent the branches constituting the current local tree, and the gray lines represent all old branches. The numbers in brackets are intervals denoting the ancestral material carried by the nearby branches; the numbers without brackets denote the recombination rates occurring at the underlying nodes.

Figure 3

Generation of the ARG along the sequence. $X_s$ denotes the $[0, s]$ segment of the ARG, $S_i$ denotes the $i$th type 1 recombination breakpoint, and $Z_i$ denotes the branches added to $X_{S_{i-1}}$. $X_{S_i}$ is the collection of $Z_i$ and $X_{S_{i-1}}$.

Figure 4

Comparison of the differences in the mean and variance of the height of the first 100 local trees between SC and MaCS, using ms as a control. In each boxplot, the top and bottom borders of the box are the 75% and 25% quantiles, respectively.

Figure 5

Ratio of strictly sample-consistent ARGs to all ARGs. An ARG is considered strictly sample-consistent only if it is sample-consistent and its number of type 1 recombinations is within 10% of the estimate using 100,000 simulations. Four cases are shown. Case 1: ρ = 10, μ = 10 with SC-sample. Case 2: ρ = 50, μ = 50 with SC-sample. Case 3: ρ = 16.9, μ = 7.5 with SC-sample, which uses the human mutation and recombination rates. Case 4: ρ = 10, μ = 10 with SC.
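The strictness criterion in this caption can be written as a small predicate. The sketch below is one reading of it, not the authors' code: it assumes "within 10%" means a relative difference of at most 10% from the reference count, and the function names, the `tolerance` parameter, and the `(sample_consistent, n_type1)` tuple layout are hypothetical.

```python
def is_strictly_sample_consistent(sample_consistent, n_type1,
                                  expected_type1, tolerance=0.10):
    """Apply the two conditions from the caption to a single ARG.

    Assumption: "within 10%" is a relative difference measured against
    the reference number of type 1 recombinations.
    """
    if not sample_consistent:
        return False
    return abs(n_type1 - expected_type1) <= tolerance * expected_type1


def strict_ratio(args, expected_type1, tolerance=0.10):
    """Fraction of ARGs that pass the strict criterion.

    `args` is a list of (sample_consistent, n_type1) pairs, one per ARG.
    """
    if not args:
        return 0.0
    passed = sum(
        is_strictly_sample_consistent(ok, count, expected_type1, tolerance)
        for ok, count in args
    )
    return passed / len(args)
```

For example, `strict_ratio([(True, 95), (True, 120), (False, 100), (True, 100)], 100)` counts only the first and last ARG as strictly sample-consistent.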

Figure 6

Distribution of the number of type 1 recombinations in ARGs generated with SC-sample. The red vertical line indicates the expected number of type 1 recombinations in a particular scenario.

Figure 7

Update steps used by SC to generate a new ARG from the current ARG. The same step numbers and case numbers as in the Methods section are used here. In step 3, a new type 1 recombination is created on the current ARG. In step 4, the new branch coalesces into either an old branch or a branch on the current tree. If the new branch coalesces into the current tree, a new ARG has been constructed. If the new branch coalesces into an old branch, there are two cases, case 5.1 and case 5.2. In case 5.2, a new ARG is generated. In case 5.1, a new branch is generated, which is handled again by step 4. When a new ARG is generated, it becomes the current ARG and a new round begins. For more details, see the Methods section.
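The control flow in this caption can be traced with a short Python sketch. It models only the branching between steps and cases, not the ARG itself: the function name `run_update_round` and its two arguments are hypothetical stand-ins for the random draws that the actual algorithm makes, whose probabilities are given in the Methods section and are not reproduced here.

```python
def run_update_round(coalescence_targets, old_branch_cases):
    """Trace one round of the update loop described in the caption.

    coalescence_targets: iterable of "current" or "old", one per visit
        to step 4 (hypothetical stand-in for the coalescence draw).
    old_branch_cases: iterable of "5.1" or "5.2", one per coalescence
        into an old branch (hypothetical stand-in for the case draw).
    """
    targets = iter(coalescence_targets)
    cases = iter(old_branch_cases)
    trace = ["step 3: new type 1 recombination"]
    while True:
        target = next(targets)
        if target not in ("current", "old"):
            raise ValueError(f"unknown coalescence target: {target!r}")
        trace.append(f"step 4: coalesce into {target} branch")
        if target == "current":
            break  # coalescing into the current tree completes the ARG
        case = next(cases)
        if case not in ("5.1", "5.2"):
            raise ValueError(f"unknown case: {case!r}")
        trace.append(f"case {case}")
        if case == "5.2":
            break  # case 5.2 also completes the ARG
        # case 5.1 yields another new branch, so step 4 runs again
    trace.append("new ARG becomes current ARG")
    return trace
```

Calling `run_update_round(["old", "current"], ["5.1"])` walks through step 4 twice, which is the loop that the approximate methods in Figure 8 avoid by finishing after the first coalescence.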

Figure 8

A schematic diagram of the update steps of SMC, SMC′, and MaCS under our framework. Regardless of which branch the new branch coalesces into in Step 4, a new ARG is constructed.

Figure 9

Generation of a sample-consistent ARG with SC-sample. A) Generate a binary tree that is consistent with the leftmost site. B) Determine whether the current local tree is consistent with the second site; it is, since there is only one mutant edge. C) Determine whether the current local tree is consistent with the third site; it is not, because there are two mutant edges, so P1 = 0.5. D) Generate the next recombination point, which is uniform on [0, P1], and obtain P2 = 0.4. The dotted lines are the branches onto which the new branches are supposed to coalesce. E) The new branch coalesces and the [0, 0.4] part of the ARG is simulated.
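The consistency test in panels B and C can be illustrated with a short Python sketch. It reflects one interpretation of "mutant edge", namely an edge above a maximal clade whose leaves all carry the mutant allele, and it represents a tree as nested tuples with leaf labels; both the representation and the function names are assumptions made for illustration.

```python
def count_mutant_edges(tree, mutants):
    """Minimum number of edges needed to explain the mutant leaves.

    `tree` is a nested tuple whose leaves are sample labels, and
    `mutants` is the set of labels carrying the mutant allele at a site.
    """
    def visit(node):
        # Returns (all leaves below are mutant, number of mutant edges).
        if not isinstance(node, tuple):
            is_mutant = node in mutants
            return is_mutant, 1 if is_mutant else 0
        results = [visit(child) for child in node]
        if all(flag for flag, _ in results):
            return True, 1  # one edge above this whole clade suffices
        return False, sum(count for _, count in results)

    return visit(tree)[1]


def is_site_consistent(tree, mutants):
    """A site fits the tree if at most one mutant edge is needed.

    Assumption: a site with no mutant samples places no constraint.
    """
    return count_mutant_edges(tree, mutants) <= 1


if __name__ == "__main__":
    tree = ((1, 2), (3, (4, 5)))
    print(is_site_consistent(tree, {1, 2}))  # mutants form a single clade
    print(is_site_consistent(tree, {1, 3}))  # mutants need two edges
```

In the second call the mutant samples sit in different clades, which corresponds to the situation in panel C where a recombination point must be placed before the inconsistent site.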
