Segmental phylogenetic relationships of inbred mouse strains revealed by fine-scale analysis of sequence variation across 4.6 mb of mouse genome

Kelly A Frazer et al. Genome Res. 2004 Aug.

Abstract

High-density SNP screening of panels of inbred mouse strains has been proposed as a method to accelerate the identification of genes associated with complex biomedical phenotypes. To evaluate the potential of these studies, a more detailed understanding of the fine structure of sequence variation across inbred mouse strains is needed. Here, we use high-density oligonucleotide arrays to discover an extremely dense set of SNPs in 13 classical and two wild-derived inbred strains in five genomic intervals totaling 4.6 Mb of DNA sequence, and then analyze the segmental haplotype structure defined by these high-density SNPs. This analysis reveals segments ranging from 12 to 608 kb in length within which the inbred strains have a simple and distinct phylogenetic relationship, with typically two or three clades accounting for the 13 classical strains examined. The phylogenetic relationships among strains change abruptly and unpredictably from segment to segment, and are distinct in each of the five genomic regions examined. The data suggest that at least 12 strains would need to be resequenced for exhaustive SNP discovery in every region of the mouse genome, and that approximately 97% of the variation among inbred strains is ancestral (between clades) and approximately 3% private (within clades). They also provide critical insights into the proposed use of panels of inbred strains to identify genes underlying quantitative trait loci.

Copyright 2004 Cold Spring Harbor Laboratory Press

Figures

Figure 1

Phylogenetic relationships of the 13 inbred classical strains and the two wild-derived strains. (A–E) For each of the five sequence contigs, the phylogenetic relationships within adjacent segments are shown. (Top and middle panels) The locations of unique sequences represented on the high-density arrays and identified SNPs, respectively. (Third panel) The segmental block locations (numbered), the boundaries of which were determined by analyzing transition points between high SNP and low SNP blocks for all 78 pairwise sequence comparisons of the 13 inbred classical strains. The phylogenetic relationships of the 15 inbred mouse strains (color-coded squares), determined using the Kimura 2-parameter model, are shown for each segment. For the wild-derived inbred strains, CAST/EiJ and SPRET/EiJ, uneven amounts of missing data may affect the relative branch lengths of these outgroups in some trees as a result of unbalanced loss of CAST- or SPRET-specific polymorphisms. (F) For the five sequence contigs, classical inbred strains that were in the same phylogenetic group for every segment in the contig are shown together. Missing SNP data are ignored for this analysis. Related strains are depicted as solid and hatched squares of the same color (A/J and A/HeJ, BALB/cByJ and BALB/cJ, C57BL/6J and B10.D2-H2d, NZB/BlNJ and NZW/LacJ).
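As an illustration of the distance measure named in the Figure 1 legend, the following is a minimal sketch of the Kimura 2-parameter distance, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions among compared sites. The function name `k2p_distance`, the treatment of any non-ACGT character as missing data, and the toy sequences are assumptions made for this example; they are not taken from the study, which may have handled missing calls differently.

```python
import math
from itertools import combinations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned sequences.

    Sites where either sequence has a non-ACGT character are skipped
    (an assumption of this sketch, standing in for missing SNP calls).
    """
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # treat as missing data
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the model (argument <= 0).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# 13 classical strains give 13 * 12 / 2 = 78 unordered pairs,
# matching the number of pairwise comparisons in the legend.
n_pairs = len(list(combinations(range(13), 2)))
```

With one transition in ten compared sites, for example, the distance is -1/2 ln(0.8), slightly larger than the raw mismatch proportion of 0.1 because the model corrects for multiple substitutions at the same site.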

Figure 2

The number of strains required to capture >95% of the observed variation was calculated both by determining how many of the 13 inbred classical strains were needed to detect >95% of the SNPs (A,B) and by determining how many of the 78 pairwise comparisons of the inbred classical strains were needed to observe >95% of the segmental block boundaries (C). (A,B) The percent of SNPs observed versus the number of inbred classical strains examined was determined for each contig individually as well as the average across all five contigs. (A) Directed selection of strains based on their level of divergence (based on the Kimura 2-parameter distance), starting with the most divergent. (B) The selection of inbred strains was random (nondirected selection), with the process being repeated 100 times, and the average results for each contig and across all five contigs shown. (C) The percent of segmental boundaries observed versus the number of inbred classical pairwise comparisons examined is shown for each contig individually as well as the average across all five contigs. Intervals containing relatively few segmental blocks, such as NT_014989, require fewer pairs of strains, and intervals containing an above-average number of segmental blocks, such as NT_026540, require more pairs of strains.
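The curves described in panels A and B can be sketched as follows. This is a hypothetical illustration, not the authors' procedure: it assumes that a SNP counts as "observed" once the chosen subset of strains shows at least two distinct alleles at that site, that "N" or `None` marks a missing call, and that the directed ordering (most divergent first) is supplied by the caller. All function names and the toy genotypes are invented for the example.

```python
import random

MISSING = {None, "N"}


def observed_fraction(genotypes, strains):
    """Fraction of all polymorphic sites that are polymorphic within `strains`."""
    n_sites = len(next(iter(genotypes.values())))

    def polymorphic(site, subset):
        alleles = {genotypes[s][site] for s in subset} - MISSING
        return len(alleles) > 1

    all_snps = [i for i in range(n_sites) if polymorphic(i, genotypes)]
    if not all_snps:
        return 0.0
    return sum(polymorphic(i, strains) for i in all_snps) / len(all_snps)


def coverage_curve(genotypes, order):
    """Fraction of SNPs observed after adding strains one by one in `order`."""
    return [observed_fraction(genotypes, order[:k])
            for k in range(1, len(order) + 1)]


def random_average_curve(genotypes, n_repeats=100, seed=0):
    """Average coverage curve over repeated random strain orderings."""
    rng = random.Random(seed)
    names = list(genotypes)
    sums = [0.0] * len(names)
    for _ in range(n_repeats):
        order = rng.sample(names, len(names))
        for k, value in enumerate(coverage_curve(genotypes, order)):
            sums[k] += value
    return [s / n_repeats for s in sums]


def strains_needed(curve, threshold=0.95):
    """Smallest number of strains whose coverage exceeds `threshold`, or None."""
    for k, value in enumerate(curve, start=1):
        if value > threshold:
            return k
    return None


# Toy data: four hypothetical strains genotyped at four SNP sites.
toy = {"s1": "AAAA", "s2": "AAAT", "s3": "GCAT", "s4": "GCTT"}
directed = coverage_curve(toy, ["s1", "s3", "s2", "s4"])
averaged = random_average_curve(toy, n_repeats=100, seed=0)
```

On the toy data the directed curve reaches full coverage only when the fourth strain is added, because one site is private to that strain. The same bookkeeping would apply to panel C if strain pairs and segmental block boundaries were substituted for strains and SNPs.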
