Forward-time simulations of human populations with complex diseases

Bo Peng et al. PLoS Genet. 2007.

Abstract

Due to the increasing power of personal computers, as well as the availability of flexible forward-time simulation programs like simuPOP, it is now possible to simulate the evolution of complex human diseases using a forward-time approach. This approach is potentially more powerful than the coalescent approach because it allows simulation of more than one disease susceptibility locus under almost arbitrary genetic and demographic models. However, the application of such simulations has been deterred by the lack of a suitable simulation framework. For example, it is not clear when and how to introduce disease mutants (especially those under purifying selection) into an evolving population, or how to control the disease allele frequencies at the last generation. In this paper, we introduce a forward-time simulation framework that allows us to generate large multi-generation populations with complex diseases caused by unlinked disease susceptibility loci, according to specified demographic and evolutionary properties. Unrelated individuals and small or large pedigrees can be drawn from the resulting population to provide samples for a wide range of study designs and ascertainment methods. We demonstrate our simulation framework using three examples that map genes associated with affection status, a quantitative trait, and the age of onset of a hypothetical cancer, respectively. Nonadditive fitness models, population structure, and gene-gene interactions are simulated. Case-control, sibpair, and large pedigree samples are drawn from the simulated populations and are examined by a variety of gene-mapping methods.

Conflict of interest statement

Competing interests. The authors have declared that no competing interests exist.

Figures

Figure 1. Examples of Simulated Trajectories

Trajectories of simulated allele frequency under different selection models. For each selection model, 100 replicates are simulated, and three trajectories corresponding to the 5%, 50%, and 95% quantiles of the trajectory length are plotted. The selection models are neutral (top left, s_1 = s_2 = 0), advantageous (top right, s_1 = 0.001 and s_2 = 0.002), deleterious (bottom left, s_1 = −0.001 and s_2 = −0.002), and a mixed-selection model (bottom right) in which the disease allele is advantageous until 2,000 generations ago (s_1 = 0.001 and s_2 = 0.002) and is under purifying selection in the most recent 2,000 generations (s_1 = −0.001 and s_2 = −0.002). In all cases, the current allele frequency is 10%. The population size is given by a function N(t) (equation not recovered from the source) such that N(10,000) = 10^6. Note that one of the trajectories in the bottom left panel is longer than 20,000 generations, and its allele frequency is more than 0.5 before generation 10,000.
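To make the selection coefficients in this caption concrete, the following is a minimal sketch of one generation of a forward-time Wright-Fisher model with selection. It is not the backward-time trajectory simulation used for the figure. It assumes genotype fitnesses of 1, 1 + s_1, and 1 + s_2 for AA, Aa, and aa, by analogy with the fitnesses quoted for Figure 2. The function names and the Bernoulli-sum sampling are illustrative choices only.

```python
import random


def expected_next_freq(p, s1, s2):
    """Expected frequency of allele a after one round of selection.

    Assumed fitnesses (not stated in the caption): AA = 1, Aa = 1 + s1,
    aa = 1 + s2, with random mating (Hardy-Weinberg genotype proportions).
    """
    q = 1.0 - p
    w_AA, w_Aa, w_aa = 1.0, 1.0 + s1, 1.0 + s2
    mean_w = q * q * w_AA + 2.0 * p * q * w_Aa + p * p * w_aa
    # Each aa individual carries two copies of a, each Aa individual one copy;
    # dividing by the mean fitness renormalizes the allele frequency.
    return (p * p * w_aa + p * q * w_Aa) / mean_w


def wright_fisher_step(p, n_diploid, s1, s2, rng):
    """One generation: selection, then random sampling of 2N allele copies."""
    p_sel = expected_next_freq(p, s1, s2)
    copies = sum(rng.random() < p_sel for _ in range(2 * n_diploid))
    return copies / (2 * n_diploid)


if __name__ == "__main__":
    rng = random.Random(2007)  # fixed seed so the run is repeatable
    p = 0.10
    for _ in range(50):
        # Small N keeps the pure-Python sampling fast; the figure uses far larger N.
        p = wright_fisher_step(p, 500, 0.001, 0.002, rng)
    print(f"allele frequency after 50 generations: {p:.4f}")
```

With s_1 = s_2 = 0 the expected frequency is unchanged, so any movement in the neutral case comes from the sampling step alone.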

Figure 2. Illustration of the Evolutionary Scenario

Illustration of the evolutionary scenario of a simulation with three disease susceptibility loci (DSL). Demographic model (left axis): The population starts at size 10,000 and begins to grow exponentially at generation 4,000. The population is split into five equal-sized subpopulations at generation 7,500 (with subpopulations separated by solid lines) and reaches size 2 × 10^5 at generation 10,000. Migration is allowed from generation 9,500 to 10,000 (with subpopulations separated by dashed lines). Disease allele frequencies (right axis): The DSL are under advantageous selection pressure, with fitnesses of 1, 1.0001, and 1.0002 for genotypes AA, Aa, and aa, respectively, where a is the disease susceptibility allele. The present disease allele frequencies are 0.01, 0.02, and 0.03, respectively. The trajectories simulated backward in time are plotted as solid lines with different colors. The trajectories obtained during forward-time controlled random mating are plotted as dotted lines, which are indistinguishable from the simulated trajectories.
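The demographic model in this caption can be written as a small population-size function. This is a sketch under stated assumptions, not the authors' code. It assumes a constant exponential growth rate between generations 4,000 and 10,000, a split that leaves the total size unchanged, and inclusive generation boundaries. All names are hypothetical.

```python
import math

N_START = 10_000        # initial population size
N_END = 200_000         # size at the last generation (2 x 10^5)
GROWTH_START = 4_000    # generation at which exponential growth begins
SPLIT_GEN = 7_500       # generation at which the population is split
MIGRATION_START = 9_500
LAST_GEN = 10_000
N_SUBPOPS = 5

# Assumed constant per-generation growth rate that reaches N_END at LAST_GEN.
RATE = math.log(N_END / N_START) / (LAST_GEN - GROWTH_START)


def total_size(gen):
    """Total population size at a given generation."""
    if gen <= GROWTH_START:
        return N_START
    return round(N_START * math.exp(RATE * (gen - GROWTH_START)))


def subpop_sizes(gen):
    """Subpopulation sizes: one population before the split, five after."""
    total = total_size(gen)
    if gen < SPLIT_GEN:
        return [total]
    base, extra = divmod(total, N_SUBPOPS)
    # Spread any remainder so the sizes differ by at most one individual.
    return [base + (1 if i < extra else 0) for i in range(N_SUBPOPS)]


def migration_allowed(gen):
    """Migration is allowed only during the final 500 generations."""
    return MIGRATION_START <= gen <= LAST_GEN


if __name__ == "__main__":
    for g in (0, 4_000, 7_500, 9_500, 10_000):
        print(g, total_size(g), subpop_sizes(g), migration_allowed(g))
```

Under these assumptions the growth rate is ln(20) / 6,000, about 0.0005 per generation.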

Figure 3. Validation of Trajectory Lengths

Mean and 5% and 95% quantiles of the length of trajectories of a mutant under different selection pressures. The mutant starts at allele frequency 0.1, evolves backward in time in a constant population of size N = 5,000, and is subjected to constant selection pressure with a selection coefficient s of −0.001 to 0.001, until it is lost or fixed. The smooth red curve represents theoretical estimates of the mean number of generations before this mutant is lost or fixed. Note that the simulated trajectories that are fixed or have more than one mutant at generation 1 are also accepted, in accordance with the theoretical estimates.
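For a sense of scale, the neutral point of such a curve (s = 0) can be approximated with the standard diffusion result for the mean time to loss or fixation, t(p) = −4N[p ln p + (1 − p) ln(1 − p)] generations for a diploid population of size N. The sketch below evaluates only this neutral special case. The caption does not state which theoretical estimate was used for the curve, so applying this formula here is an assumption, and the function name is hypothetical.

```python
import math


def mean_absorption_time_neutral(n_diploid, p):
    """Diffusion approximation to the mean number of generations until a
    neutral allele at frequency p is lost or fixed (diploid size n_diploid).

    Covers s = 0 only; selection requires a different expression.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return -4.0 * n_diploid * (p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


if __name__ == "__main__":
    # Parameters quoted in the caption: N = 5,000 and starting frequency 0.1.
    t = mean_absorption_time_neutral(5_000, 0.1)
    print(f"mean absorption time (neutral): {t:.0f} generations")
```

With the caption's parameters this evaluates to roughly 6,500 generations.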
