The Use of Microsatellite Variation to Infer Population Structure and Demographic History in a Natural Model System

Journal Article

David B. Goldstein, Department of Zoology, University of Oxford, Oxford OX1 3PS, United Kingdom
Gary W. Roemer, Department of Biology, University of California, Los Angeles, California 90007
Deborah A. Smith, Department of Biology, University of California, Los Angeles, California 90007
David E. Reich, Department of Zoology, University of Oxford, Oxford OX1 3PS, United Kingdom
Aviv Bergman, Interval Research, Palo Alto, California 94303
Robert K. Wayne, Department of Biology, University of California, Los Angeles, California 90007

Corresponding author: David B. Goldstein, Department of Zoology, South Parks Rd., University of Oxford, Oxford, OX1 3PS, United Kingdom. E-mail: david.goldstein@zoo.ox.ac.uk

Received: 10 September 1997
Accepted: 30 October 1998
Published: 01 February 1999

Cite: David B Goldstein, Gary W Roemer, Deborah A Smith, David E Reich, Aviv Bergman, Robert K Wayne, The Use of Microsatellite Variation to Infer Population Structure and Demographic History in a Natural Model System, Genetics, Volume 151, Issue 2, 1 February 1999, Pages 797–801, https://doi.org/10.1093/genetics/151.2.797

Abstract

To assess the reliability of genetic markers it is important to compare inferences that are based on them to a priori expectations. In this article we present an analysis of microsatellite variation within and among populations of island foxes (Urocyon littoralis) on California's Channel Islands. We first show that microsatellite variation at a moderate number of loci (19) can provide an essentially perfect description of the boundaries between populations and an accurate representation of their historical relationships. We also show that the pattern of variation across unlinked microsatellite loci can be used to test whether population size has been constant or increasing. Application of these approaches to the island fox system indicates that microsatellite variation may carry considerably more information about population history than is currently being used.

The island fox is a diminutive form of the mainland gray fox, Urocyon cinereoargenteus, which colonized the Channel Islands 10,400–16,000 years before present (YBP). It currently occupies six of California's Channel Islands, the southern set having been derived from northern populations relatively recently (Gilbert et al. 1990; Collins 1991; Wayne et al. 1991). The oldest fossils of the island fox were found on Santa Rosa Island and have been dated to 10,400–16,000 YBP, at which time the three northern islands were a single land mass (Orr 1968). About 11,500 YBP, due to rising sea levels, Santa Cruz separated from Santa Rosa and San Miguel, followed by the separation of the latter two ∼9500 YBP. The southern islands were never connected to the northern islands or to the mainland. Foxes probably colonized the northern islands by rafting or swimming but may have been transported to the southern islands by Native Americans who arrived in the Channel Islands ∼9000–10,000 YBP (Collins 1991). Fox remains recovered from archaeological sites on the southern islands have been dated at 3400 YBP for San Clemente, 800–3880 YBP for Santa Catalina, and 2200 YBP for San Nicolas (Collins 1991).

Here we assess the ability of microsatellite analyses to detect differentiation over a range of time scales while avoiding the assumptions of computer simulations (Goldstein et al. 1995a; Takezaki and Nei 1996). In particular, we determine: (1) whether the population structure imposed by isolation among the islands is accurately recorded in microsatellite variation and (2) whether genetic differences among islands are consistent with the pattern of colonization suggested by fossil and geological evidence. We also show that the expected magnitude of the variation in diversity measures across unlinked microsatellite markers depends on whether a population has been growing or stationary.

EXPERIMENTAL METHODS

Sample collection: Tissue or blood samples were obtained from 15 gray foxes from three counties (Santa Barbara, Ventura, and Los Angeles) in Southern California. Samples of 171 island foxes were obtained from Santa Cruz (n = 29), Santa Rosa (n = 30), San Miguel (n = 22), Santa Catalina (n = 30), San Clemente (n = 30), and San Nicolas (n = 30) Islands.

DNA extraction: DNA was extracted from tissue or white blood cells using a standard proteinase K digestion followed by phenol/chloroform/isoamyl alcohol extraction (Sambrook et al. 1989). The extracted DNA was vacuum dried, ethanol precipitated, and resuspended in 1× TE (pH 7.0).

Primer selection and PCR amplification: A total of 21 samples, 3 from each of the gray and island fox populations, were screened for variation at 66 unlinked dinucleotide repeat loci using primers developed from analysis of a domestic dog genomic library (Ostrander et al. 1993). Nineteen loci were selected for typing. These were predominantly simple (CA)n repeats that consistently yielded PCR products, produced minimal stutter, and were polymorphic. The designations of the 19 microsatellite primer pairs are CXX.109, CXX.147, CXX.155, CXX.173, CXX.204, CXX.231, CXX.225, CXX.250, CXX.263, CXX.279 and 366, 377, 383, 410, 431, 442, 453, 606, and 671 (Ostrander et al. 1993 and listing at http://tiberius.fhcrc.org/markers/table1.html/).

Twenty picomoles of one primer was end-labeled by incubating at 37° for 40 min with 2 μCi [γ-32P]dATP, 2.5 μl of T4 polynucleotide kinase buffer, and 1.2 μl of diluted (1:4) T4 polynucleotide kinase in a total volume of 25 μl (Sambrook et al. 1989). The enzyme was then denatured by heating the mixture to 94° for 5 min. The labeled and unlabeled primers (20 pmol) were mixed with 50 ng of genomic DNA, 0.2 mM dNTP, 2 mM MgCl2, 1× Thermo DNA reaction buffer, and 0.8 units of Taq DNA polymerase in a reaction volume of 25 μl. Twenty-five to 32 cycles of amplification were conducted in a Perkin Elmer-Cetus (Norwalk, CT) 9600 DNA thermal cycler using the following protocol: denaturation at 94° for 5 min, followed by cycling of denaturation at 94° for 45 sec, annealing at 45–55° for 45 sec, and polymerization at 72° for 1 min. Annealing temperatures were optimized for each primer pair, and the last cycle was followed by an additional extension period at 72° for 5 min.

After amplification, 3 μl of the reaction mix was mixed with 2 μl of formamide loading dye and heat denatured at 94° for 5 min. Three microliters of the denatured product was then loaded onto a 6% polyacrylamide premixed sequencing gel with TBE buffer (Sequagel; National Diagnostic, Atlanta), and electrophoresed for 2 to 4 hr at 55 W. A nonrecombinant M13 control sequence was run on each gel as an absolute size standard allowing comparisons between samples run on different gels. Afterward, gels were fixed to Whatman paper by drying under vacuum for ∼1.5 hr at 80°. Microsatellite alleles were visualized by exposure to autoradiographic film for 12 to 24 hr.

Figure 1. UPGMA tree of individuals (Bowcock et al. 1994) based on multilocus genotypes of 183 foxes using the allele-sharing genetic distance (Goldstein et al. 1995a) in which taxa have been rotated, with no changes in the clusters, to allow placement on the islands.

STATISTICAL METHODS AND RESULTS

Population structure: Phylogenetic classification of individuals (Figure 1), based on the allele-sharing genetic distance, demonstrates the striking resolution of population structure provided by microsatellite data. In the resulting tree of individuals 181 out of 183 foxes were correctly assigned to their geographic origins. Population trees based on the stepwise genetic distance (δμ)² (Goldstein et al. 1995b; Goldstein and Pollock 1997) also topologically reflect the presumed pattern of island separation and island founding times (Figure 2). The nonstepwise allele-sharing distance (Goldstein and Pollock 1997), in contrast, results in a tree less consistent with the presumed history, most notably in not placing the gray fox as an outgroup to the rest (tree not shown). The independent evidence of island fox monophyly comes from morphology and archaeology (Collins 1991), as well as from mtDNA, allozyme, and fingerprinting data (Wayne et al. 1991). Inferred separation times (Figure 2), however, are not in complete agreement with the fossil record and the history of sea-level changes. The estimated separation times between Santa Rosa and San Miguel and between Santa Cruz and the pair Santa Rosa and San Miguel both appear shorter than would be indicated by the date of sea-level-induced isolation, the former especially so. Estimated separation times, however, are based on the oldest recorded fossil and may therefore be underestimates (see Figure 2 legend). A low level of gene flow among the islands may also reduce the estimated separation times. San Miguel and Santa Rosa, for example, are <4 km apart, and Native Americans may have transported foxes between other pairs of islands (Collins 1991).
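As an illustration of the allele-sharing distance used for the tree of individuals, the following is a minimal sketch, assuming diploid genotypes coded as pairs of repeat counts and a distance defined as one minus the proportion of alleles shared, averaged over loci. The function names and the three genotypes are hypothetical and are not taken from the fox data.

```python
from itertools import combinations


def shared_alleles(g1, g2):
    """Count alleles shared by two diploid genotypes at one locus (0, 1, or 2)."""
    remaining = list(g2)
    shared = 0
    for allele in g1:
        if allele in remaining:
            remaining.remove(allele)  # each allele copy can be matched only once
            shared += 1
    return shared


def allele_sharing_distance(ind1, ind2):
    """One minus the proportion of shared alleles, averaged over loci (assumed form)."""
    total = sum(shared_alleles(a, b) for a, b in zip(ind1, ind2))
    return 1.0 - total / (2 * len(ind1))


# Hypothetical genotypes: three loci per individual, alleles coded as repeat counts.
foxes = {
    "island_A1": [(12, 12), (8, 9), (15, 15)],
    "island_A2": [(12, 12), (8, 8), (15, 16)],
    "island_B1": [(10, 11), (7, 7), (18, 18)],
}

distances = {
    (a, b): allele_sharing_distance(foxes[a], foxes[b])
    for a, b in combinations(foxes, 2)
}
for pair, d in distances.items():
    print(pair, round(d, 3))
```

A pairwise matrix of this kind would then be passed to a UPGMA (average-linkage) clustering routine to obtain a tree of individuals such as the one in Figure 1.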

Figure 2. UPGMA bootstrap consensus tree of five island fox and gray fox populations based on the (δμ)² genetic distance (Goldstein et al. 1995b). Percentages appearing alongside nodes indicate the proportion of bootstrap replications in which the figured grouping appears. San Nicolas has been excluded from the analysis because this population has undergone tremendous genetic drift, as evidenced by the nearly complete lack of variation on this island (Figure 1). The separation times among the populations were estimated on the basis of the time of introduction to the islands as 16,000 YBP and assuming the genetic distance underlying the tree has been linear with time (Goldstein et al. 1995b). If the calibration is based on the more reliable date of separation between Santa Cruz and the then-joined pair Santa Rosa and San Miguel, the inferred time of appearance of foxes would be pushed back to >20,000 YBP, a plausible result given the expected gap between founding time and the first fossil record. The separation between Santa Rosa and San Miguel, however, still remains far less than expected.
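The calibration described in the Figure 2 legend can be sketched as follows, assuming (δμ)² is computed as the squared difference in mean repeat score between two populations, averaged over loci, and that distance is rescaled to years by simple proportionality to one reference date. The populations, repeat scores, and function names below are hypothetical; only the 16,000 YBP reference date comes from the text.

```python
from statistics import mean


def delta_mu_squared(pop_a, pop_b):
    """Average over loci of the squared difference in mean repeat score."""
    return mean((mean(a) - mean(b)) ** 2 for a, b in zip(pop_a, pop_b))


def calibrate_time(distance, ref_distance, ref_time):
    """Convert a distance to years, assuming distance grows linearly with time."""
    return ref_time * distance / ref_distance


# Hypothetical data: each population is a list of loci,
# and each locus is a list of sampled repeat scores.
mainland = [[10, 12, 14], [20, 22]]
island_x = [[8, 8, 8], [19, 19]]
island_y = [[9, 9, 9], [20, 20]]

# Reference pair: mainland vs. island, dated to the assumed introduction time.
ref = delta_mu_squared(mainland, island_x)
xy = delta_mu_squared(island_x, island_y)
print(calibrate_time(xy, ref_distance=ref, ref_time=16000))
```

Choosing a different reference pair and date, as the legend does with the Santa Cruz separation, rescales every inferred time by the same factor.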

Test of demographic history: The motivation for the test of demographic history can be seen most readily by reference to the expected topologies of gene genealogies in different demographic scenarios. It is well known that the structure of a gene genealogy at a single locus is sensitive to the pattern of change in population size over time (Slatkin and Hudson 1991; Rogers and Harpending 1992; Nee et al. 1995; Donnelly 1996). For example, two extreme models produce gene genealogies with dramatically different structures (Donnelly 1996). At equilibrium in a population of constant size genealogies tend to have a single deep split and a variety of more shallow splits, while under exponential growth genealogies are comblike, with all branches of similar length. Such features of single-locus genealogies have been exploited in the past to develop tests of demographic models and have been principally applied to the analysis of mitochondrial DNA (Slatkin and Hudson 1991; Rogers and Harpending 1992). Microsatellite analyses, however, routinely include markers from unlinked regions and therefore contain multiple observations of the genealogies expected from the relevant demographic conditions. As we shall see, the differences among such independent genealogies are, if anything, even more diagnostic of demographic history than the shape of a single realized genealogy. In particular, in a constant population the distribution of internode lengths differs greatly from genealogy to genealogy (that is, from one unlinked locus to another or, equivalently, over replications of the evolutionary process). Under rapid exponential growth, however, allelic lineages tend to coalesce at or near the time at which exponential growth began (Donnelly 1996). Thus the total length of the branches separating alleles is more similar across loci under exponential growth than it is in a population of constant size. 
To make use of this difference, we note that the total genealogical branch length separating a sampled pair of alleles is linearly related to the expected squared difference in repeat score between them (Slatkin 1995), which in turn determines the variance in repeat scores, Vr. This measure of variability, therefore, will tend to be more similar across loci under exponential growth than in a population of constant size. Specifically, the conjecture is that the variance of the variance in repeat scores, denoted Vl[Vr], will tend to be reduced by rapid population growth and can form the basis of a test of demographic history. Simulation of a growing population confirms our expectation that exponential growth reduces Vl[Vr] (Figure 3). To formalize the test, we note that at mutation-drift equilibrium in a population of constant size, the drift variance of Vr is denoted Vg[Vr] and is given in Zhivotovsky and Feldman (1995) and in Roe (1992) as

$$V_g[V_r] = \frac{4}{3}V_r^2 + \frac{1}{6}V_r \qquad (1)$$

Because the covariance of Vr at unlinked microsatellite loci is negligible (Zhivotovsky and Feldman 1995), we may take this as the prediction for the variance of Vr across unlinked loci in a single population of constant size. This expectation only applies for a set of loci with a constant mutation rate. Variation in the mutation rate will inflate the observed value.
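The quantities entering the test can be sketched in a few lines. This is an illustration only, assuming the equilibrium expectation takes the form Vg[Vr] = (4/3)Vr² + (1/6)Vr, which reproduces the Vg[Vr] column of Table 1 from the Vr column, and assuming unbiased sample variances throughout. The function names and the three example loci are hypothetical.

```python
from statistics import mean, variance


def expected_variance_of_variance(vr):
    """Assumed constant-size expectation for the variance of Vr across loci."""
    return (4.0 / 3.0) * vr ** 2 + vr / 6.0


def interlocus_test_quantities(loci):
    """Return (Vr, Vl[Vr], Vg[Vr], g) for a list of per-locus repeat scores."""
    per_locus = [variance(scores) for scores in loci]  # variance in repeat score at each locus
    vr = mean(per_locus)                               # average variance over loci
    vl = variance(per_locus)                           # observed variance of the variance
    vg = expected_variance_of_variance(vr)             # expectation under constant size
    return vr, vl, vg, vl / vg


# Check against Table 1: the mainland Vr of 4.47 gives Vg[Vr] of about 27.4.
print(round(expected_variance_of_variance(4.47), 1))

# Hypothetical repeat scores at three loci.
loci = [[10, 10, 11, 11], [5, 7, 5, 7], [20, 20, 20, 22]]
vr, vl, vg, g = interlocus_test_quantities(loci)
print(vr, vl, vg, g)
```

Values of g well below 1 point toward growth, while mutation-rate variation among loci pushes g upward.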

Figure 3. Predicted and observed variances as a function of the number of generations since the beginning of exponential growth in 500 replicate simulated populations of 10 loci undergoing stepwise mutations. The solid line shows the predicted variance among loci of the variance in repeat score (denoted Vg[Vr] in the text) and the observed value (denoted Vl[Vr]) as a function of the number of generations since the beginning of exponential growth. The predicted value is generated using Equation 1 and the observed variance, Vr, and is what would be expected in a population of constant size. Because the test uses the observed Vr to calculate an expected variance across loci it is not sensitive to the magnitude of Vr.

A test of whether an observed reduction is significant can be performed in one of two ways. An analytic test of significance can be developed from the suggestion in Goldstein et al. (1996) that the distribution of the Log[Vr] is approximately normal. Because the distribution of the variances drawn from a unit normal distribution is chi-square, it is possible to develop an analytic expression for the confidence interval using approximate methods to determine the variance of the Log[Vr] from Vg[Vr] (Goldstein et al. 1996). In our case, however, many loci are monomorphic (in certain islands), making it impossible to apply the log transformation. We also note that the approximate methods for determining the variance of the Log[Vr]'s are not very accurate. For these reasons we prefer to assess significance through coalescent-based computer simulations (Hudson 1990). To carry out the simulations it is necessary to use Vr to estimate θ (two times the product of the population size and the mutation rate; Moran 1975), which could introduce an additional source of error. To reduce this source of error we base the test on the ratio of the observed to expected variance of the variance that we denote as g = Vl[Vr]/Vg[Vr]. Simulations demonstrate that for all θ > 1.0, the distribution of g is approximately independent of θ.
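A simulation-based assessment of this kind might look like the sketch below. It is not the authors' program: it assumes a standard constant-size coalescent with time scaled so that each pair of lineages coalesces at rate 1, a symmetric single-step mutation model with mutations arriving at rate θ/2 per lineage, the expectation Vg[Vr] = (4/3)Vr² + (1/6)Vr, and a lower-tail empirical P value. All function names and the value of θ are hypothetical.

```python
import random
from statistics import mean, variance


def poisson(lam):
    """Draw a Poisson(lam) count by accumulating unit-rate exponential gaps."""
    count, total = 0, random.expovariate(1.0)
    while total < lam:
        count += 1
        total += random.expovariate(1.0)
    return count


def simulate_locus(n_alleles, theta):
    """Repeat scores at one locus under a constant-size coalescent."""
    scores = [0] * n_alleles
    lineages = [[i] for i in range(n_alleles)]  # each lineage lists its sampled descendants
    while len(lineages) > 1:
        k = len(lineages)
        t = random.expovariate(k * (k - 1) / 2.0)  # waiting time to the next coalescence
        for members in lineages:
            n_mut = poisson(theta * t / 2.0)
            steps = sum(random.choice((-1, 1)) for _ in range(n_mut))
            for i in members:  # a mutation on a branch shifts all of its descendants
                scores[i] += steps
        a, b = random.sample(range(k), 2)
        merged = lineages[a] + lineages[b]
        lineages = [m for j, m in enumerate(lineages) if j not in (a, b)] + [merged]
    return scores


def g_statistic(loci):
    """Ratio of observed to expected variance of Vr across loci."""
    per_locus = [variance(s) for s in loci]
    vr = mean(per_locus)
    return variance(per_locus) / ((4.0 / 3.0) * vr ** 2 + vr / 6.0)


def simulate_g(n_reps, n_loci, n_alleles, theta):
    """Null distribution of g; replicates with no variation at any locus are redrawn."""
    values = []
    while len(values) < n_reps:
        loci = [simulate_locus(n_alleles, theta) for _ in range(n_loci)]
        if any(len(set(s)) > 1 for s in loci):
            values.append(g_statistic(loci))
    return values


def lower_tail_p(simulated, observed):
    """Fraction of simulated values at or below the observed value."""
    return sum(g <= observed for g in simulated) / len(simulated)


random.seed(1)
null_g = simulate_g(n_reps=100, n_loci=19, n_alleles=30, theta=9.0)  # theta is illustrative
print(lower_tail_p(null_g, 0.36))
```

Working with the ratio g rather than Vl[Vr] itself keeps the comparison from depending strongly on the value of θ supplied to the simulation, which is the motivation given in the text.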

For all of the island populations, the values of Vl[Vr] are greater than the expected values (Table 1), providing no suggestion of population growth. This accords with expectation given that the islands were first colonized thousands of generations ago and effective population sizes remain on the order of hundreds of individuals (Wayne et al. 1991). In populations of this size mutation-drift equilibrium is reached very rapidly. The obvious trend toward higher than expected ratios probably stems from two artifacts. First, variation in the mutation rate across loci will inflate the ratio. An additional contribution is probably made by intermittent migration among the islands, which might differentially affect the loci and therefore contribute to differences among them. The case for the mainland, however, is markedly different. The observed value of 9.86 is only about one-third of the expected value, suggesting some form of population increase. To assess the significance of this reduction we performed 1000 simulations and created an ordered list of the simulated ratios of the statistic g = Vl[Vr]/Vg[Vr]. The observed value of g = 0.36 falls at the 120th ordered observation, indicating that P = 0.12 in a one-tailed test. For a sample size of 15, the 0.05 cutoff occurs at g = 0.28. If we consider that the principal known artifact (variation in the mutation rate across loci) inflates g, the observed value of 0.36 is a strong indication of growth, despite falling short of significance.

TABLE 1

Observed and predicted variances of the variance of repeat scores across loci for the mainland gray fox and five island fox populations used in the interlocus variance test of demographic history

Population        Vr       Vl[Vr]    Vg[Vr]    g
Mainland          4.47     9.86      27.4      0.36
Santa Cruz        1.12     7.86      1.86      4.23
San Miguel        0.618    3.67      0.61      6.09
Santa Rosa        0.97     2.45      1.42      1.73
Santa Catalina    2.06     22.8      6.0       3.81
San Clemente      2.15     26.36     6.54      4.03

CONCLUSIONS

It is now clear that microsatellite markers are substantially more complicated than assumed in the various methods used to analyze them (Goldstein and Pollock 1997). It is therefore important to test the performance of the methods in model systems in which inferences can be checked against a priori expectations. We have shown that even with a modest number of microsatellite loci, nodes in trees of individuals have an almost perfect correspondence with those predicted from geographic boundaries. Because the tree is well resolved despite very recent and incomplete isolation, we may take this as an indication that negative results in such analyses reflect the real population structure as opposed to limitations of the marker.

We have also applied a new statistical approach for assessing demographic history on the basis of the pattern of variation across unlinked microsatellite loci. While a great deal of effort has been devoted to making inferences from single genomic regions, our results demonstrate that genome-wide analyses open a new and highly informative window on the demographic histories of natural populations. While our analysis of demographic history falls short of statistical significance, the existence of a trend indicating growth is encouraging given that known artifacts are conservative. We suspect that the approach will often allow discrimination of growing and stationary populations using only a moderate number of markers. Taken collectively, our analyses indicate that whatever the complexities of microsatellite mutations and evolution, they do not appear to prohibit the estimation of subtle aspects of population structure and demographic history.

Footnotes

Communicating editor: M. K. Uyenoyama

LITERATURE CITED

Bowcock A M, Ruiz Linares A, Tomfohrde J, Minch E, Kidd J R et al., 1994 High resolution of human evolutionary trees with polymorphic microsatellites. Nature 368: 455–457.

Collins P W, 1991 Interaction between island foxes (Urocyon littoralis) and Indians on islands off the coast of Southern California: I. Morphologic and archaeological evidence of human assisted dispersal. J. Ethnobiol. 11(1): 51–81.

Donnelly P, 1996 Interpreting genetic-variability–the effects of shared evolutionary history. Ciba Found. Symp. 197: 25–40.

Gilbert D A, Lehman N, O'Brien S J, Wayne R K, 1990 Genetic fingerprinting reflects population differentiation in the California channel-island fox. Nature 344: 764–767.

Goldstein D B, Pollock D D, 1997 Launching microsatellites: a review of mutation processes and methods of phylogenetic inference. J. Hered. 88: 335–342.

Goldstein D B, Ruiz Linares A, Cavalli-Sforza L L, Feldman M W, 1995a An evaluation of genetic distances for use with microsatellite loci. Genetics 139: 463–471.

Goldstein D B, Ruiz Linares A, Cavalli-Sforza L L, Feldman M W, 1995b Genetic absolute dating based on microsatellites and the origin of modern humans. Proc. Natl. Acad. Sci. USA 92: 6723–6727.

Goldstein D B, Zhivotovsky L A, Nayar K, Ruiz Linares A, Cavalli-Sforza L L et al., 1996 Statistical properties of the variation at linked microsatellite loci–implications for the history of human Y-chromosomes. Mol. Biol. Evol. 13: 1213–1218.

Hudson R R, 1990 Gene genealogies and the coalescent process. Oxford Surv. Evol. Biol. 7: 1–44.

Moran P A P, 1975 Wandering distributions and the electrophoretic profile. Theor. Popul. Biol. 8: 318–330.

Nee S, Holmes E C, Rambaut A, Harvey P H, 1995 Inferring population history from molecular phylogenies. Philos. Trans. R. Soc. Lond. B Biol. Sci. 349: 25–31.

Orr P C, 1968 Prehistory of Santa Rosa Island. Santa Barbara Museum of Natural History, Santa Barbara, CA.

Ostrander E A, Sprague G F, Rine J, 1993 Identification and characterization of dinucleotide repeat (CA)n markers for genetic-mapping in dog. Genomics 16: 207–213.

Roe A, 1992 Correlations and interactions in random walks and population genetics. Ph.D. Thesis, University of London.

Rogers A R, Harpending H, 1992 Population-growth makes waves in the distribution of pairwise genetic-differences. Mol. Biol. Evol. 9: 552–569.

Sambrook J, Fritsch E F, Maniatis T, 1989 Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.

Slatkin M, 1995 A measure of population subdivision based on microsatellite allele frequencies. Genetics 139: 457–462.

Slatkin M, Hudson R R, 1991 Pairwise comparisons of mitochondrial-DNA sequences in stable and exponentially growing populations. Genetics 129: 555–562.

Takezaki N, Nei M, 1996 Genetic distances and reconstruction of phylogenetic trees from microsatellite DNA. Genetics 144: 389–399.

Wayne R K, George S B, Gilbert D, Collins P W, Kovach S D et al., 1991 A morphological and genetic study of the island fox, Urocyon littoralis. Evolution 45: 1849–1868.

Zhivotovsky L A, Feldman M W, 1995 Microsatellite variability and genetic distances. Proc. Natl. Acad. Sci. USA 92: 11549–11552.

© Genetics 1999
