phenosim--A software to simulate phenotypes for testing in genome-wide association studies

Torsten Günther et al. BMC Bioinformatics. 2011.

Abstract

Background: There is a great interest in understanding the genetic architecture of complex traits in natural populations. Genome-wide association studies (GWAS) are becoming routine in human, animal and plant genetics to understand the connection between naturally occurring genotypic and phenotypic variation. Coalescent simulations are commonly used in population genetics to simulate genotypes under different parameters and demographic models.

Results: Here, we present phenosim, a software to add a phenotype to genotypes generated in time-efficient coalescent simulations. Both qualitative and quantitative phenotypes can be generated and it is possible to partition phenotypic variation between additive effects and epistatic interactions between causal variants. The output formats of phenosim are directly usable as input for different GWAS tools. The applicability of phenosim is shown by simulating a genome-wide association study in Arabidopsis thaliana.
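The partitioning of phenotypic variation described above can be illustrated with a minimal sketch. This is not phenosim's own code or algorithm; the function names (`scale_to_variance`, `simulate_phenotype`), the 0/1 coding of inbred genotypes, and the use of the genotype product as the epistatic term are all assumptions made for illustration. Each component is centered and rescaled to a target share of a total variance of 1, and Gaussian noise fills the remainder.

```python
import random
import statistics


def scale_to_variance(values, target_var):
    """Center values and rescale them so their population variance equals target_var."""
    mean = statistics.fmean(values)
    var = statistics.pvariance(values)
    if var == 0:
        raise ValueError("component has no variation; cannot scale")
    factor = (target_var / var) ** 0.5
    return [(v - mean) * factor for v in values]


def simulate_phenotype(g1, g2, add_prop, epi_prop, rng):
    """Build a quantitative phenotype from two causal variants (hypothetical helper).

    g1, g2   : 0/1 genotypes of the two causal variants, one entry per individual
    add_prop : variance share assigned to each variant's additive effect
    epi_prop : variance share assigned to the pairwise interaction
    rng      : a random.Random instance used for the residual noise
    """
    noise_var = 1.0 - 2 * add_prop - epi_prop
    if noise_var <= 0:
        raise ValueError("variance shares must sum to less than 1")
    n = len(g1)
    a1 = scale_to_variance(g1, add_prop)
    a2 = scale_to_variance(g2, add_prop)
    # Assumption: the interaction is modeled as the product of the two genotypes.
    epi = scale_to_variance([x * y for x, y in zip(g1, g2)], epi_prop)
    noise = scale_to_variance([rng.gauss(0, 1) for _ in range(n)], noise_var)
    pheno = [sum(parts) for parts in zip(a1, a2, epi, noise)]
    components = {"qtn1": a1, "qtn2": a2, "epistasis": epi, "residual": noise}
    return pheno, components


if __name__ == "__main__":
    rng = random.Random(42)
    g1 = [rng.randint(0, 1) for _ in range(200)]
    g2 = [rng.randint(0, 1) for _ in range(200)]
    pheno, parts = simulate_phenotype(g1, g2, add_prop=0.01, epi_prop=0.08, rng=rng)
    for name, values in parts.items():
        print(name, round(statistics.pvariance(values), 4))
```

Because the product term is correlated with the individual genotypes, the components are not orthogonal, so the variance of the summed phenotype will generally differ somewhat from 1 even though each component matches its target share exactly.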

Conclusions: By using the coalescent approach to generate genotypes and phenosim to add phenotypes, the data sets can be used to assess the influence of various factors such as demography, genetic architecture or selection on the statistical power of association methods to detect causal genetic variants under a wide variety of population genetic scenarios. phenosim is freely available from the authors' website http://evoplant.uni-hohenheim.de.

Figures

Figure 1

Work flow. Flowchart of the phenosim pipeline.

Figure 2

Example. Proportion of significant marker-trait associations (Bonferroni adjusted significance threshold) found at different distances from the causal marker. For each QTN, the distance to the next significant association is shown. The bars on the left show the proportion of simulated data sets for which no significant association was found. The simulated models include: 'additive' - 2 QTNs randomly distributed with a variance proportion of 0.05 each; 'epi_rand' - 2 QTNs randomly distributed with a variance proportion of 0.01 each and 0.08 epistatic effect; 'epi_hapl' - 2 QTNs located on a common haploblock with a variance proportion of 0.01 each and 0.08 epistatic effect. Each model was simulated 1000 times.
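The evaluation summarized in this caption can be sketched roughly as follows. The code is an illustrative assumption, not the authors' analysis script: the function names, the 0.05 family-wise significance level, and the choice to measure distance in the same units as the marker positions are all hypothetical. It computes a Bonferroni-adjusted threshold and, for one QTN, the distance to the nearest significant marker, returning `None` when no marker passes the threshold.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that controls the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def nearest_significant_distance(qtn_pos, positions, pvalues, threshold):
    """Distance from a QTN to the closest marker with p-value <= threshold.

    Returns None if no marker is significant (the 'no association found' case).
    """
    distances = [
        abs(pos - qtn_pos)
        for pos, p in zip(positions, pvalues)
        if p <= threshold
    ]
    return min(distances) if distances else None


if __name__ == "__main__":
    # Toy data: four markers with made-up positions and p-values.
    positions = [100, 200, 300, 400]
    pvalues = [0.5, 0.001, 0.2, 0.011]
    threshold = bonferroni_threshold(0.05, len(pvalues))
    print(nearest_significant_distance(320, positions, pvalues, threshold))
```

Repeating this over many simulated data sets and tabulating the returned distances, together with the fraction of `None` results, would give a summary of the kind shown in the figure.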
