Relative performance of Bayesian clustering software for inferring population substructure and individual assignment at low levels of population differentiation
Abstract
Traditional methods for characterizing genetic differentiation among populations rely on a priori grouping of individuals. Bayesian clustering methods avoid this limitation by using linkage and Hardy–Weinberg disequilibrium to decompose a sample of individuals into genetically distinct groups. There are several software programs available for Bayesian clustering analyses, all of which describe a decrease in the ability to detect distinct clusters as levels of genetic differentiation among populations decrease. However, no study has yet compared the performance of such methods at low levels of population differentiation, which may be common in species where populations have experienced recent separation or high levels of gene flow. We used simulated data to evaluate the performance of three Bayesian clustering software programs, PARTITION, STRUCTURE, and BAPS, at levels of population differentiation below F_ST = 0.1. PARTITION was unable to correctly identify the number of subpopulations until levels of F_ST reached around 0.09. Both STRUCTURE and BAPS performed very well at low levels of population differentiation, and were able to correctly identify the number of subpopulations at F_ST around 0.03. The average proportion of an individual's genome assigned to its true population of origin increased with increasing F_ST for both programs, reaching over 92% at an F_ST of 0.05. The average number of misassignments (assignments to the incorrect subpopulation) continued to decrease as F_ST increased, and when F_ST was 0.05, fewer than 3% of individuals were misassigned using either program. Both STRUCTURE and BAPS worked extremely well for inferring the number of clusters when clusters were not well-differentiated (F_ST = 0.02–0.03), but our results suggest that F_ST must be at least 0.05 to reach an assignment accuracy of greater than 97%.
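To make the quantities in the abstract concrete, the following is a minimal sketch, not the authors' simulation or analysis code. It assumes F_ST is computed in the heterozygosity form (H_T - H_S) / H_T for equally sized subpopulations, and that a clustering program returns per-individual membership coefficients whose cluster labels have already been matched to the true populations. The function names and the rule that an individual counts as misassigned when its largest coefficient falls in the wrong cluster are illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def expected_heterozygosity(freqs: Sequence[float]) -> float:
    """Expected heterozygosity at one locus: 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def fst_one_locus(subpop_freqs: List[Sequence[float]]) -> float:
    """F_ST = (H_T - H_S) / H_T for one locus.

    Assumption: subpopulations are equally sized, so plain averages are used.
    """
    n = len(subpop_freqs)
    # Mean within-subpopulation heterozygosity (H_S).
    h_s = sum(expected_heterozygosity(f) for f in subpop_freqs) / n
    # Heterozygosity of the pooled population (H_T), from mean allele frequencies.
    n_alleles = len(subpop_freqs[0])
    mean_freqs = [sum(f[a] for f in subpop_freqs) / n for a in range(n_alleles)]
    h_t = expected_heterozygosity(mean_freqs)
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t


def assignment_summary(
    q_matrix: List[Sequence[float]], true_pop: Sequence[int]
) -> Tuple[float, float]:
    """Return (mean membership in the true population, fraction misassigned).

    Assumption: cluster k corresponds to true population k (labels aligned),
    and an individual is misassigned when its largest coefficient is elsewhere.
    """
    n = len(q_matrix)
    mean_true_q = sum(q[t] for q, t in zip(q_matrix, true_pop)) / n
    misassigned = sum(
        1
        for q, t in zip(q_matrix, true_pop)
        if max(range(len(q)), key=q.__getitem__) != t
    )
    return mean_true_q, misassigned / n


if __name__ == "__main__":
    # Two subpopulations, one biallelic locus with modestly different frequencies.
    print(fst_one_locus([[0.6, 0.4], [0.4, 0.6]]))
    # Three individuals, two clusters; the third is placed in the wrong cluster.
    print(assignment_summary([[0.9, 0.1], [0.2, 0.8], [0.4, 0.6]], [0, 1, 0]))
```

In practice F_ST would be averaged over many loci and the membership coefficients would come from the output of a program such as STRUCTURE or BAPS; both steps are left out of this sketch.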
Acknowledgements
We would like to thank Jukka Corander for providing technical advice and an advance copy of his manuscript for BAPS 3.1, and Khalid Belkhir for offering assistance regarding PARTITION. Funding was provided by Purdue University.
Author information
Authors and Affiliations
- Department of Forestry and Natural Resources, Purdue University, West Lafayette, Indiana, 47907, USA
Emily K. Latch, Guha Dharmarajan & Olin E. Rhodes Jr.
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, Wisconsin, 53706, USA
Jeffrey C. Glaubitz
Authors
- Emily K. Latch
- Guha Dharmarajan
- Jeffrey C. Glaubitz
- Olin E. Rhodes Jr.
Corresponding author
Correspondence to Emily K. Latch.
About this article
Cite this article
Latch, E.K., Dharmarajan, G., Glaubitz, J.C. et al. Relative performance of Bayesian clustering software for inferring population substructure and individual assignment at low levels of population differentiation. Conserv Genet 7, 295–302 (2006). https://doi.org/10.1007/s10592-005-9098-1
- Received: 13 May 2005
- Accepted: 09 November 2005
- Published: 21 February 2006
- Issue Date: April 2006
- DOI: https://doi.org/10.1007/s10592-005-9098-1