A survey of genetic simulation software for population and epidemiological studies

Human Genomics volume 3, Article number: 79 (2008)

Abstract

A number of programs have been developed for simulating population genetic and genetic epidemiological data conforming to one of three main algorithmic approaches: 'forwards', 'backwards' and 'sideways'. This review aims to make the reader aware of the range of options currently available to them. While no one program emerges as the best choice in all circumstances, we nominate a set of those which currently appear most promising.

Introduction

The two main reasons for wanting to simulate genetic data are, first, to gain insight into the effects that underlying demographic and mutational parameters may have on the genetic data one sees, and, secondly, to create test datasets for assessing the power of alternative genetic analysis methods. Ways of tackling the first goal range from informal approaches, which aim at getting a 'feel' for how altering different parameters affects the output data, to more formal methods based on matching many simulated datasets to an observed dataset (eg approximate Bayesian computation [1]). To tackle the second goal (and particularly for genetic epidemiology methods), an additional 'ascertainment' modelling element is often required to allow the simulation of disease-affecting loci within the context of a given study design (such as a case-control study).

The key challenges that all simulation algorithms face are: (1) speed -- typically one wants to do lots of simulations, so they need to be fast; (2) scalability -- with the advent of genome-wide genotyping and large-scale sequencing, there is a need for simulation programs to match; and (3) flexibility -- can the program cope with different demographic histories, population structure, recombination, selection, mutation models and disease models?

There are three main approaches to dealing with these challenges, here termed 'backwards', 'forwards' and 'sideways'. 'Backwards' (or coalescent) simulations start with the sample of individuals that will form your simulated dataset, then work backwards in time to construct the ancestral tree or graph of genealogical relationships that connects them all. Neutral mutations can subsequently be placed on this structure to create the simulated dataset. The simulation algorithm does not actually have to work backwards in time to achieve this, but this is a technical detail. The important point is that by restricting attention just to the genealogical structure relevant to the sample in question, a large computational saving is generally achieved relative to the 'forwards-in-time' approach. Still greater efficiency is afforded by the classic coalescent approach, which employs a continuous-time approximation to effectively skip over the intermediate generations between important tree-generating events. 'Forwards' simulations start with the entire population of individuals -- typically, many thousands -- and then follow how all the genetic data in question are passed on from one generation to the next. One usually needs to simulate over many thousands of generations in order to arrive at an equilibrium in which the genetic characteristics of the population are independent of the original starting conditions. Finally, 'sideways' simulations start with a collection of real present-day genetic data, and use these as a template for generating new simulated data with similar properties. 'Sideways' algorithms can also be coalescent-based (and thus fit into both 'backwards' and 'sideways' categories) but some adopt simpler resampling strategies that do not explicitly consider changes over generational time in either direction.
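The backwards approach can be illustrated with a minimal sketch of the standard neutral coalescent without recombination. It is not taken from any of the programs surveyed here. It draws the continuous-time waiting times between coalescent events for a sample, then places neutral mutations on the resulting tree as a separate step. The function names, the time scale (units of 2N generations) and the scaled mutation parameter `theta` (taken here to be 4Nμ) are assumptions made for the illustration.

```python
import random


def simulate_coalescent_times(n_samples, rng):
    """Waiting times while k = n, n-1, ..., 2 lineages remain.

    Time is measured in units of 2N generations (an assumption of this
    sketch), so k lineages coalesce at rate k(k-1)/2.
    """
    times = []
    for k in range(n_samples, 1, -1):
        rate = k * (k - 1) / 2.0  # number of lineage pairs that could merge
        times.append(rng.expovariate(rate))
    return times


def total_branch_length(times, n_samples):
    """Sum of all branch lengths: k branches each grow during interval k."""
    return sum(k * t for k, t in zip(range(n_samples, 1, -1), times))


def poisson(mean, rng):
    """Poisson draw: count unit-rate arrivals that fall before `mean`."""
    count = 0
    elapsed = rng.expovariate(1.0)
    while elapsed < mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def simulate_segregating_sites(n_samples, theta, rng):
    """Build the tree first, then drop neutral mutations onto it."""
    times = simulate_coalescent_times(n_samples, rng)
    length = total_branch_length(times, n_samples)
    # Mutations fall on branches at rate theta/2 per unit of scaled time.
    return poisson(theta / 2.0 * length, rng)


if __name__ == "__main__":
    rng = random.Random(42)
    print(simulate_segregating_sites(n_samples=10, theta=5.0, rng=rng))
```

Only n - 1 exponential draws are needed per tree, however large the underlying population, which is the source of the computational saving described above.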

Backwards simulators

Table 1 lists all programs that the authors were able to source via PubMed and other internet-based searches. A list maintained by Heng Li [35] was also helpful. Backwards (coalescent) approaches form the largest part of Table 1, reflecting the inherent attractiveness and computational efficiency of simulating just that part of the genealogy needed to produce the data in the simulated sample. Richard Hudson's _ms_ program [8] remains one of the most popular for straightforward problems. _msHOT_ [9], _SNPsim_ [17] and _COSI_ [3] extend the algorithm to allow variable recombination rates along the DNA sequence, and _msHOT_, _COSI_, _CoaSim_ [2] and _newgenecoal_ [10] also allow (allelic) gene conversion in addition to crossovers as recombination events. _SIMCOAL_ [15] introduces complex demographic models, _SIMCOAL2_ extends this to variable recombination, _Serial SIMCOAL_ [14] allows sampling at multiple time points and _MODELER4SIMCOAL2_ [36, 37] provides a handy graphical user interface. _SelSim_ [13] implements a single-locus selection model. Flexible, but not necessarily easy to implement, coalescent simulators are provided by _CoaSim_ [2], _mlcoalsim_ [7], _SARG_ [12] and _GeneArtisan_ [5].

For neutral loci, the tree or graph-generating step can be conveniently decoupled from the mutation-generating step, and the latter can be run via a separate program such as Andy Rambaut's _SeqGen_ program [38] to produce a wide range of different types of genetic data from a range of different mutational models. It is also possible to decouple the sampling ascertainment process (eg to get case-control data) by applying this as an additional step to unascertained simulated data. Currently, however, there are no easy ways of doing this, as additional user coding would be needed to adapt the sampling algorithms available in, for example, _CoaSim_ [2], _simuPOP_ [28–30] or _FREGENE_ [23]. Furthermore, there are as yet no completely flexible ascertainment options that would allow, for example, simulation of cases from models with more than one partially linked disease locus, or from more general causal models that have incorporated additional covariates.
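As a rough illustration of applying ascertainment as a separate step, the sketch below takes unascertained genotypes at a single disease locus, assigns affection status from a penetrance table, and then draws a fixed number of cases and controls. It is a hypothetical example, not the sampling algorithm of CoaSim, simuPOP or FREGENE; the function names, the single-locus model and the penetrance values in the usage example are all assumptions.

```python
import random


def assign_status(risk_allele_counts, penetrance, rng):
    """Return True (affected) or False for each individual.

    risk_allele_counts: list of 0/1/2 risk-allele counts at the disease locus.
    penetrance: dict mapping 0, 1, 2 to the probability of disease.
    """
    return [rng.random() < penetrance[g] for g in risk_allele_counts]


def sample_case_control(risk_allele_counts, penetrance, n_cases, n_controls, rng):
    """Ascertain a case-control sample from an unascertained population.

    Returns two lists of indices into the population (cases, controls).
    """
    status = assign_status(risk_allele_counts, penetrance, rng)
    cases = [i for i, affected in enumerate(status) if affected]
    controls = [i for i, affected in enumerate(status) if not affected]
    if len(cases) < n_cases or len(controls) < n_controls:
        raise ValueError("simulated population too small for requested sample")
    return rng.sample(cases, n_cases), rng.sample(controls, n_controls)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical unascertained population: risk allele frequency 0.2.
    population = [(rng.random() < 0.2) + (rng.random() < 0.2) for _ in range(20000)]
    penetrance = {0: 0.02, 1: 0.10, 2: 0.40}  # illustrative values only
    cases, controls = sample_case_control(population, penetrance, 500, 500, rng)
    print(len(cases), len(controls))
```

The returned indices can then be used to pull the full simulated haplotypes of the selected individuals. When the disease is rare, the unascertained population has to be much larger than the final sample, which is one reason this step is costly in practice.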

Conventional coalescent algorithms break down for very large DNA regions such as whole chromosomes. This is because recombination gives rise to complex ancestral recombination graphs (ARGs) rather than simple binary genealogical trees, and more recombination leads to ever larger and more complex ARGs. The _FastCoal_ [4] and _GENOME_ [6] simulators employ approximations to the real coalescent-with-recombination that lead to simpler ARGs and thence to feasible genome-wide simulations. _MaCs_, a recent update to _FastCoal_ which uses an improved approximation to the coalescent-with-recombination, is available on request from Jeff Wall (wallj@humgen.ucsf.edu). _FastCoal_ is reported to be able to generate 2,000 50-megabase (Mb) diploid samples in two minutes on a standard workstation, and _GENOME_ to generate 600 150 Mb diploid samples in 66 minutes.

Forwards simulators

Forwards-in-time simulators are more naturally capable of coping with complex modelling scenarios, at the expense of decreased computational efficiency. Of these, the FREGENE [23] and GenomePop [24] programs make the biggest effort at maintaining speed, and, of these, only FREGENE allows for ascertained disease-gene sampling. A useful scaling option in both programs allows one to simulate a smaller population over a smaller number of generations and then use these results to approximate a larger population over more generations. Unfortunately, only diallelic SNP data can be simulated fast enough to cover large genomic regions. At smaller genomic scales, more complex nucleotide and codon models can be simulated by GenomePop, while copy number variation (CNV) and microsatellite data can be simulated by simuPOP [28–30] and Nemo [27]. GenomeSIM [25] claims to be able to generate genome-wide SNP data by forwards simulation, but only achieves this by simulating over a very limited ten or so generations, far fewer than are needed to achieve proper genetic equilibrium. Indeed, FREGENE and GenomePop could also generate genome-wide datasets in this way, and presumably could do so with greater computational efficiency.
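The forwards approach and the scaling idea can be sketched for a single diallelic locus under a Wright-Fisher model with symmetric mutation. This is an illustrative toy, not the algorithm used by FREGENE or GenomePop. The `rescale` helper assumes the common convention of dividing population size and generation count by a factor while multiplying the per-generation rate by the same factor; that convention, and every function name here, is an assumption of the sketch.

```python
import random


def wright_fisher_step(count, n_chrom, mut_rate, rng):
    """Draw the next generation's allele count by binomial sampling."""
    freq = count / n_chrom
    # Symmetric mutation between the two alleles before sampling.
    p = freq * (1.0 - mut_rate) + (1.0 - freq) * mut_rate
    return sum(1 for _ in range(n_chrom) if rng.random() < p)


def simulate_forwards(n_diploid, generations, mut_rate, rng, start_freq=0.5):
    """Follow the allele count in the whole population, one generation at a time."""
    n_chrom = 2 * n_diploid
    count = round(start_freq * n_chrom)
    trajectory = [count]
    for _ in range(generations):
        count = wright_fisher_step(count, n_chrom, mut_rate, rng)
        trajectory.append(count)
    return trajectory


def rescale(n_diploid, generations, mut_rate, factor):
    """Shrink the population and run length, raising the rate to compensate.

    Keeps the product of population size and mutation rate unchanged
    (an assumed convention, not a documented option of any program).
    """
    return n_diploid // factor, generations // factor, mut_rate * factor


if __name__ == "__main__":
    rng = random.Random(1)
    n, g, mu = rescale(n_diploid=10000, generations=100000, mut_rate=1e-8, factor=100)
    trajectory = simulate_forwards(n, g, mu, rng)
    print(trajectory[-1] / (2 * n))
```

Every chromosome in the population is touched in every generation, which makes the cost proportional to population size times run length and explains why the scaled-down run is so much cheaper.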

Table 1 List of population genetic simulation software

Sideways simulators

Sideways simulators can, to some extent, side-step the whole issue of model complexity by relying on real data 'as is' to guide the simulation process. Simple bootstrap resampling breaks down for longer regions because the genetic diversity seen in the reference sample (usually the 270 individuals in HapMap) is not adequate to capture the full diversity among all humans. The situation will improve with the '1,000 genomes' project [39] and also with the steady increase in publicly available genome-wide SNP data, but it still seems sensible to apply an additional method to perturb the simulated data away from the narrow range seen in the real data. Dudbridge [40] proposed forming random diploid chromosomes from phased HapMap data followed by a single round of artificial meiosis, governed by empirical recombination rates also estimated from HapMap. This idea has been put to use in the _HAP-SAMPLE_ software [34], with an additional option to boost the baseline recombination rate (×100 recommended) to reduce long-range linkage disequilibrium. Durrant _et al_. [41] proposed an alternative idea based on sliding windows for introducing new variations into simulated data. This method has been implemented in the _GWA simulator_ software [32], and an improved extension to this idea, which allows a variable sliding window size, has been implemented in the _gs_ software [31]. Jonathan Marchini's _hapgen_ software [33], based on the same underlying principles as his genotype imputation software _impute_, applies an approximation to the coalescent-with-recombination to generate new simulated data from existing phased HapMap data, but is slower than the other two sideways simulators. _HAP-SAMPLE_ is reported to be able to generate 2,000 samples of a 100,000 genome-wide SNP chip in a few minutes on a standard workstation, and _gs_ to generate 2,000 samples of chromosome 6 (36,000 SNPs) in 140 minutes.
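The resampling idea attributed to Dudbridge above can be sketched as follows: pick two phased reference haplotypes at random to form a diploid, then produce one new haplotype by a single round of artificial meiosis, switching between the two parental haplotypes according to per-interval recombination probabilities. This is a simplified illustration and not the HAP-SAMPLE implementation. The function names, the data layout and the `rate_boost` multiplier (loosely modelled on the boosted recombination rate option described in the text) are assumptions.

```python
import random


def artificial_meiosis(hap_a, hap_b, recomb_probs, rng, rate_boost=1.0):
    """Produce one gamete from a diploid formed by hap_a and hap_b.

    recomb_probs[i] is the crossover probability between site i and
    site i + 1, so it has one fewer entry than the haplotypes.
    """
    # Start on a randomly chosen parental haplotype.
    current, other = (hap_a, hap_b) if rng.random() < 0.5 else (hap_b, hap_a)
    gamete = [current[0]]
    for i, prob in enumerate(recomb_probs):
        # Boosted probability is capped at 1 so it stays a valid probability.
        if rng.random() < min(1.0, prob * rate_boost):
            current, other = other, current  # crossover: switch template
        gamete.append(current[i + 1])
    return gamete


def simulate_haplotype(reference_haps, recomb_probs, rng, rate_boost=1.0):
    """Resample two distinct reference haplotypes and recombine them once."""
    hap_a, hap_b = rng.sample(reference_haps, 2)
    return artificial_meiosis(hap_a, hap_b, recomb_probs, rng, rate_boost)


if __name__ == "__main__":
    rng = random.Random(5)
    # Hypothetical phased reference panel of four-SNP haplotypes.
    panel = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 1, 0, 1]]
    probs = [0.001, 0.002, 0.001]  # illustrative per-interval probabilities
    print(simulate_haplotype(panel, probs, rng, rate_boost=100.0))
```

With unboosted rates most simulated haplotypes are exact copies of a reference haplotype, which shows why some additional perturbation is needed to move the simulated data away from the reference sample.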

Conclusions

In summary, no one program is capable of doing everything, but there exist some useful applications from all three main simulation approaches. For genome-wide SNP data, the main contenders are FastCoal [4], GENOME [6], HAP-SAMPLE [34] and gs [31]. For high model flexibility and sampling ascertainment at the 10 Mb scale or less (not whole-genome but still enough for many purposes), FREGENE [23] is recommended. Simulation of copy number variation and/or microsatellite data at larger genomic scales, and of more complex disease models allowing covariates and linked loci, remain areas for future program development.

References

  1. Beaumont MA, Zhang W, Balding DJ: Approximate Bayesian computation in population genetics. Genetics. 2002, 162: 2025-2035.
  2. Mailund T, Schierup MH, Pedersen CNS, et al: CoaSim: A flexible environment for simulating genetic data under coalescent models. BMC Bioinformatics. 2005, 6: 252. 10.1186/1471-2105-6-252.
  3. Schaffner SF, Foo C, Gabriel S: Calibrating a coalescent simulation of human genome sequence variation. Genome Res. 2005, 15: 1576-1583. 10.1101/gr.3709305.
  4. Marjoram P, Wall JD: Fast "coalescent" simulation. BMC Genet. 2006, 7: 16.
  5. Wang Y, Rannala B: In silico analysis of disease-association mapping strategies using the coalescent process and incorporating ascertainment and selection. Am J Hum Genet. 2005, 76: 1066-1073. 10.1086/430472.
  6. Liang L, Zöllner S, Abecasis GR: GENOME: A rapid coalescent-based whole genome simulator. Bioinformatics. 2007, 23: 1565-1567. 10.1093/bioinformatics/btm138.
  7. Ramos-Onsins SE, Mitchell-Olds T: Mlcoalsim: Multilocus coalescent simulations. Evol Bioinformatics. 2007, 2: 41-44.
  8. Hudson RR: Generating samples under a Wright-Fisher neutral model of genetic variation. Bioinformatics. 2002, 18: 337-338.
  9. Hellenthal G, Stephens M: msHOT: modifying Hudson's ms simulator to incorporate crossover and gene conversion hotspots. Bioinformatics. 2007, 23: 520-521. 10.1093/bioinformatics/btl622.
  10. Thornton KR: The neutral coalescent process for recent gene duplications and copy-number variants. Genetics. 2007, 177: 987-1000. 10.1534/genetics.107.074948.
  11. Arenas M, Posada D: Recodon: Coalescent simulation of coding DNA sequences with recombination, migration and demography. BMC Bioinformatics. 2007, 8: 458. 10.1186/1471-2105-8-458.
  12. Nordborg M, Innan H: The genealogy of sequences containing multiple sites subject to strong selection in a subdivided population. Genetics. 2003, 163: 1201-1213.
  13. Spencer CC, Coop G: SelSim: A program to simulate population genetic data with natural selection and recombination. Bioinformatics. 2004, 20: 3673-3675. 10.1093/bioinformatics/bth417.
  14. Anderson CN, Ramakrishnan U, Chan YL, Hadly EA: Serial SimCoal: A population genetics model for data from multiple populations and points in time. Bioinformatics. 2005, 21: 1733-1734. 10.1093/bioinformatics/bti154.
  15. Excoffier L, Novembre J, Schneider S: SIMCOAL: A general coalescent program for the simulation of molecular data in interconnected populations with arbitrary demography. J Hered. 2000, 91: 506-509. 10.1093/jhered/91.6.506.
  16. Laval G, Excoffier L: SIMCOAL 2.0: A program to simulate genomic diversity over large recombining regions in a subdivided population with a complex history. Bioinformatics. 2004, 20: 2485-2487. 10.1093/bioinformatics/bth264.
  17. Posada D, Wiuf C: Simulating haplotype blocks in the human genome. Bioinformatics. 2003, 19: 289-290. 10.1093/bioinformatics/19.2.289.
  18. Currat M, Ray N, Excoffier L: SPLATCHE: A program to simulate genetic diversity taking into account environmental heterogeneity. Mol Ecol Notes. 2004, 4: 139-142. 10.1046/j.1471-8286.2003.00582.x.
  19. Grassly NC, Harvey PH, Holmes EC: Population dynamics of HIV-1 inferred from gene sequences. Genetics. 1999, 151: 427-438.
  20. Kuo C-H, Janzen FJ: BOTTLESIM: A bottleneck simulation program for long-lived species with overlapping generations. Mol Ecol Notes. 2003, 3: 669-673. 10.1046/j.1471-8286.2003.00532.x.
  21. Balloux F: EASYPOP (Version 1.7): A computer program for population genetics simulations. J Hered. 2001, 92: 301-302. 10.1093/jhered/92.3.301.
  22. Padhukasahasram B, Marjoram P, Wall JD, et al: Exploring population genetic models with recombination using efficient forward-time simulations. Genetics. 2008, 178: 2417-2427. 10.1534/genetics.107.085332.
  23. Hoggart CJ, Chadeau-Hyam M, Clark TG, et al: Sequence-level population simulations over large genomic regions. Genetics. 2007, 177: 1725-1731. 10.1534/genetics.106.069088.
  24. Carvajal-Rodriguez A: GENOMEPOP: A program to simulate genomes in populations. BMC Bioinformatics. 2008, 9: 223. 10.1186/1471-2105-9-223.
  25. Dudek SM, Motsinger AA, Velez DR, et al: Data simulation software for whole-genome association and other studies in human genetics. Pac Symp Biocomput. 2006, 11: 499-510.
  26. Sanford J, Baumgardner J, Brewer W, et al: Mendel's Accountant: A biologically realistic forward-time population genetics program. SCPE. 2007, 8: 147-165. Available at: http://www.scpe.org/
  27. Guillaume F, Rougemont J: Nemo: An evolutionary and population genetics programming framework. Bioinformatics. 2006, 22: 2556-2557. 10.1093/bioinformatics/btl415.
  28. Peng B, Amos CI, Kimmel M: Forward-time simulations of human populations with complex diseases. PLoS Genet. 2007, 3: e47. 10.1371/journal.pgen.0030047.
  29. Peng B, Kimmel M: simuPOP: A forward-time population genetics simulation environment. Bioinformatics. 2005, 21: 3686-3687. 10.1093/bioinformatics/bti584.
  30. Peng B, Amos CI: Forward-time simulations of non-random mating populations using simuPOP. Bioinformatics. 2008, 24: 1408-1409. 10.1093/bioinformatics/btn179.
  31. Li J, Chen Y: Generating samples for association studies based on HapMap data. BMC Bioinformatics. 2008, 9: 44. 10.1186/1471-2105-9-44.
  32. Li C, Li M: GWAsimulator: A rapid whole-genome simulation program. Bioinformatics. 2008, 24: 140-142. 10.1093/bioinformatics/btm549.
  33. Marchini J, Howie B, Myers S, et al: A new multipoint method for genome-wide association studies by imputation of genotypes. Nat Genet. 2007, 39: 906-913. 10.1038/ng2088.
  34. Wright FA, Huang H, Guan X: Simulating association studies: A data-based resampling method for candidate regions or whole genome scans. Bioinformatics. 2007, 23: 2581-2588. 10.1093/bioinformatics/btm386.
  35. [http://www.sanger.ac.uk/Users/lh3/coal-simu.html]
  36. Antao T, Beja-Pereira A, Luikart G: MODELER4SIMCOAL2: A user-friendly, extensible modeler of demography and linked loci for coalescent simulations. Bioinformatics. 2007, 23: 1848-1850. 10.1093/bioinformatics/btm243.
  37. [http://popgen.eu/soft/m4s2/]
  38. [http://tree.bio.ed.ac.uk/software/seqgen/]
  39. [http://www.1000genomes.org/]
  40. Dudbridge F: A note on permutation tests in multistage association scans. Am J Hum Genet. 2006, 78: 1094-1095. 10.1086/504527.
  41. Durrant C, Zondervan KT, Cardon LR, et al: Linkage disequilibrium mapping via cladistic analysis of single-nucleotide polymorphism haplotypes. Am J Hum Genet. 2004, 75: 35-43. 10.1086/422174.

Author information

Authors and Affiliations

  1. Bioinformatics Research Center, North Carolina State University, Campus Box 7566, Raleigh, NC, 27695-7566, USA
    Youfang Liu
  2. Facultat de Biologia, Departament de Biologia Animal, Facultat de Biologia, Universitat de Barcelona, Av. Diagonal 645, 08028, Barcelona, Spain
    Georgios Athanasiadis
  3. Department of Medical and Molecular Genetics, King's College London, Guy's Hospital, 8th Floor, Tower Wing, London, SE1 9RT, UK
    Michael E. Weale

Corresponding author

Correspondence to Youfang Liu.

Liu, Y., Athanasiadis, G. & Weale, M.E. A survey of genetic simulation software for population and epidemiological studies. Hum Genomics 3, 79 (2008). https://doi.org/10.1186/1479-7364-3-1-79
