Methods to impute missing genotypes for population data - PubMed
Hum Genet. 2007 Dec;122(5):495-504.
doi: 10.1007/s00439-007-0427-y. Epub 2007 Sep 13.
- PMID: 17851696
- DOI: 10.1007/s00439-007-0427-y
Zhaoxia Yu et al. Hum Genet. 2007 Dec.
Abstract
In large-scale genotyping studies, it is common for most subjects to have some missing genetic markers, even if the missing rate per marker is low. This compromises association analyses, because different numbers of subjects contribute to different single-marker or multi-marker analyses. In this paper, we consider eight methods to infer missing genotypes: two haplotype reconstruction methods (local expectation maximization, EM, and fastPHASE), two k-nearest neighbor methods (the original k-nearest neighbor, KNN, and a weighted k-nearest neighbor, wtKNN), three linear regression methods (backward variable selection, LM.back; least angle regression, LM.lars; and singular value decomposition, LM.svd), and a regression tree, Rtree. We evaluate their accuracy using single nucleotide polymorphism (SNP) data from the HapMap project under a variety of conditions and parameters. We find that fastPHASE has the lowest error rates across different analysis panels and marker densities. LM.lars gives slightly less accurate estimates of missing genotypes than fastPHASE but performs better than the other methods.
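To make the k-nearest neighbor idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes genotypes are coded as minor-allele counts (0, 1, 2) with `None` for a missing call, measures distance as the mean absolute genotype difference over markers observed in both subjects, and imputes by majority vote among the k closest subjects. The names `distance` and `knn_impute`, the coding scheme, the distance measure, and the inverse-distance weighting used for the weighted variant are all illustrative assumptions; the abstract does not specify these details.

```python
from collections import defaultdict
from typing import Dict, List, Optional

Genotype = Optional[int]  # assumed coding: 0, 1, 2 = minor-allele count; None = missing


def distance(a: List[Genotype], b: List[Genotype]) -> float:
    """Mean absolute genotype difference over markers observed in both subjects."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return float("inf")  # no shared markers, so the subjects are not comparable
    return sum(abs(x - y) for x, y in shared) / len(shared)


def knn_impute(
    data: List[List[Genotype]], k: int = 3, weighted: bool = False
) -> List[List[Genotype]]:
    """Return a copy of `data` with missing genotypes filled in by a KNN vote.

    With weighted=True, each neighbor's vote is scaled by 1 / (distance + 1e-6),
    an assumed weighting chosen only for illustration.
    """
    imputed = [row[:] for row in data]  # leave the input untouched
    for i, row in enumerate(data):
        for j, genotype in enumerate(row):
            if genotype is not None:
                continue
            # Donors are other subjects with marker j observed, nearest first.
            donors = sorted(
                (distance(row, other), idx)
                for idx, other in enumerate(data)
                if idx != i and other[j] is not None
            )[:k]
            votes: Dict[int, float] = defaultdict(float)
            for d, idx in donors:
                if d == float("inf"):
                    continue  # skip donors that share no observed markers
                votes[data[idx][j]] += 1.0 / (d + 1e-6) if weighted else 1.0
            if votes:
                # Ties go to the smallest genotype code.
                imputed[i][j] = max(sorted(votes), key=lambda g: votes[g])
    return imputed


if __name__ == "__main__":
    # Toy panel: 5 subjects x 4 SNPs; subject 1 is missing SNP 2.
    panel = [
        [0, 1, 2, 0],
        [0, 1, None, 0],
        [2, 1, 0, 2],
        [0, 1, 2, 1],
        [2, 2, 0, 2],
    ]
    print(knn_impute(panel, k=2)[1][2])
```

In the toy panel, the two subjects closest to subject 1 both carry genotype 2 at the missing marker, so that value is imputed. A genotype is left as `None` when no comparable donor exists.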
Similar articles
- The impact of missing and erroneous genotypes on tagging SNP selection and power of subsequent association tests.
Liu W, Zhao W, Chase GA. Hum Hered. 2006;61(1):31-44. doi: 10.1159/000092141. Epub 2006 Mar 23. PMID: 16557026
- The use of family relationships and linkage disequilibrium to impute phase and missing genotypes in up to whole-genome sequence density genotypic data.
Meuwissen T, Goddard M. Genetics. 2010 Aug;185(4):1441-9. doi: 10.1534/genetics.110.113936. Epub 2010 May 17. PMID: 20479147 Free PMC article.
- Quantifying the amount of missing information in genetic association studies.
Nicolae DL. Genet Epidemiol. 2006 Dec;30(8):703-17. doi: 10.1002/gepi.20181. PMID: 16986163
- Evaluating associations of haplotypes with traits.
Schaid DJ. Genet Epidemiol. 2004 Dec;27(4):348-64. doi: 10.1002/gepi.20037. PMID: 15543638 Review.
- Algorithms for inferring haplotypes.
Niu T. Genet Epidemiol. 2004 Dec;27(4):334-47. doi: 10.1002/gepi.20024. PMID: 15368348 Review.
Cited by
- An empirical evaluation of imputation accuracy for association statistics reveals increased type-I error rates in genome-wide associations.
Almeida MA, Oliveira PS, Pereira TV, Krieger JE, Pereira AC. BMC Genet. 2011 Jan 20;12:10. doi: 10.1186/1471-2156-12-10. PMID: 21251252 Free PMC article.
- Comparison of different imputation methods from low- to high-density panels using Chinese Holstein cattle.
Weng Z, Zhang Z, Zhang Q, Fu W, He S, Ding X. Animal. 2013 May;7(5):729-35. doi: 10.1017/S1751731112002224. Epub 2012 Dec 11. PMID: 23228675 Free PMC article.
- Genotype imputation via matrix completion.
Chi EC, Zhou H, Chen GK, Del Vecchyo DO, Lange K. Genome Res. 2013 Mar;23(3):509-18. doi: 10.1101/gr.145821.112. Epub 2012 Dec 10. PMID: 23233546 Free PMC article.
- Genotype determination for polymorphisms in linkage disequilibrium.
Yu Z, Garner C, Ziogas A, Anton-Culver H, Schaid DJ. BMC Bioinformatics. 2009 Feb 20;10:63. doi: 10.1186/1471-2105-10-63. PMID: 19228433 Free PMC article.
- Effects of missing marker and segregation distortion on QTL mapping in F2 populations.
Zhang L, Wang S, Li H, Deng Q, Zheng A, Li S, Li P, Li Z, Wang J. Theor Appl Genet. 2010 Oct;121(6):1071-82. doi: 10.1007/s00122-010-1372-z. Epub 2010 Jun 10. PMID: 20535442