Sequence-based characterization of structural variation in the mouse genome

Nature volume 477, pages 326–329 (2011)

Abstract

Structural variation is widespread in mammalian genomes [1,2] and is an important cause of disease [3], but just how abundant and important structural variants (SVs) are in shaping phenotypic variation remains unclear [4,5]. Without knowing how many SVs there are, and how they arise, it is difficult to discover what they do. Combining experimental with automated analyses, we identified 711,920 SVs at 281,243 sites in the genomes of thirteen classical and four wild-derived inbred mouse strains. The majority of SVs are less than 1 kilobase in size and 98% are deletions or insertions. The breakpoints of 160,000 SVs were mapped to base pair resolution, allowing us to infer that insertion of retrotransposons causes more than half of SVs. Yet, despite their prevalence, SVs are less likely than other sequence variants to cause gene expression or quantitative phenotypic variation. We identified 24 SVs that disrupt coding exons, acting as rare variants of large effect on gene function. One-third of the genes so affected have immunological functions.

References

  1. Mills, R. E. et al. Mapping copy number variation by population-scale genome sequencing. Nature 470, 59–65 (2011)
  2. Quinlan, A. R. et al. Genome-wide mapping and assembly of structural variant breakpoints in the mouse genome. Genome Res. 20, 623–635 (2010)
  3. Zhang, F., Gu, W., Hurles, M. E. & Lupski, J. R. Copy number variation in human health, disease, and evolution. Annu. Rev. Genomics Hum. Genet. 10, 451–481 (2009)
  4. Conrad, D. F. et al. Origins and functional impact of copy number variation in the human genome. Nature 464, 704–712 (2010)
  5. Stranger, B. E. et al. Population genomics of human gene expression. Nature Genet. 39, 1217–1224 (2007)
  6. Agam, A. et al. Elusive copy number variation in the mouse genome. PLoS ONE 5, e12839 (2010)
  7. Cahan, P., Li, Y., Izumi, M. & Graubert, T. A. The impact of copy number variation on local gene expression in mouse hematopoietic stem and progenitor cells. Nature Genet. 41, 430–437 (2009)
  8. Henrichsen, C. N. et al. Segmental copy number variation shapes tissue transcriptomes. Nature Genet. 41, 424–429 (2009)
  9. Schadt, E. E. et al. An integrative genomics approach to infer causal associations between gene expression and disease. Nature Genet. 37, 710–717 (2005)
  10. Zhang, F. et al. The DNA replication FoSTeS/MMBIR mechanism can generate genomic, genic and exonic complex rearrangements in humans. Nature Genet. 41, 849–853 (2009)
  11. Ma, J. L., Kim, E. M., Haber, J. E. & Lee, S. E. Yeast Mre11 and Rad1 proteins define a Ku-independent mechanism to repair double-strand breaks lacking overlapping end sequences. Mol. Cell. Biol. 23, 8820–8828 (2003)
  12. Stankiewicz, P. & Lupski, J. R. Structural variation in the human genome and its role in disease. Annu. Rev. Med. 61, 437–455 (2010)
  13. Stankiewicz, P. & Lupski, J. R. Genome architecture, rearrangements and genomic disorders. Trends Genet. 18, 74–82 (2002)
  14. Hastings, P. J., Ira, G. & Lupski, J. R. A microhomology-mediated break-induced replication model for the origin of human copy number variation. PLoS Genet. 5, e1000327 (2009)
  15. Huang, G. J. et al. High resolution mapping of expression QTLs in heterogeneous stock mice in multiple tissues. Genome Res. 19, 1133–1140 (2009)
  16. Yalcin, B., Flint, J. & Mott, R. Using progenitor strain information to identify quantitative trait nucleotides in outbred mice. Genetics 171, 673–681 (2005)
  17. Valdar, W. et al. Genome-wide genetic association of complex traits in heterogeneous stock mice. Nature Genet. 38, 879–887 (2006)
  18. Keane, T. M. et al. Mouse genomic variation and its effect on phenotypes and gene regulation. Nature doi:10.1038/nature10413 (this issue).
  19. Yalcin, B. et al. Commercially available outbred mice for genome-wide association studies. PLoS Genet. 6, e1001085 (2010)
  20. Best, S., Le Tissier, P., Towers, G. & Stoye, J. P. Positional cloning of the mouse retrovirus restriction gene Fv1. Nature 382, 826–829 (1996)
  21. Boyden, L. M. et al. Skint1, the prototype of a newly identified immunoglobulin superfamily gene cluster, positively selects epidermal γδ T cells. Nature Genet. 40, 656–662 (2008)
  22. Nelson, T. M., Munger, S. D. & Boughter, J. D. Jr. Haplotypes at the Tas2r locus on distal chromosome 6 vary with quinine taste sensitivity in inbred mice. BMC Genet. 6, 32 (2005)
  23. Persson, K., Heby, O. & Berger, F. G. The functional intronless S-adenosylmethionine decarboxylase gene of the mouse (Amd-2) is linked to the ornithine decarboxylase gene (Odc) on chromosome 12 and is present in distantly related species of the genus Mus. Mamm. Genome 10, 784–788 (1999)
  24. Wu, B. et al. Mutations in sterol O-acyltransferase 1 (Soat1) result in hair interior defects in AKR/J mice. J. Invest. Dermatol. 130, 2666–2668 (2010)
  25. Tareen, S. U., Sawyer, S. L., Malik, H. S. & Emerman, M. An expanded clade of rodent Trim5 genes. Virology 385, 473–483 (2009)
  26. Taylor, K. et al. Defensin-related peptide 1 (Defr1) is allelic to Defb8 and chemoattracts immature DC and CD4+ T cells independently of CCR6. Eur. J. Immunol. 39, 1353–1360 (2009)
  27. Ye, K., Schulz, M. H., Long, Q., Apweiler, R. & Ning, Z. Pindel: a pattern growth approach to detect break points of large deletions and medium sized insertions from paired-end short reads. Bioinformatics 25, 2865–2871 (2009)
  28. Chen, K. et al. BreakDancer: an algorithm for high-resolution mapping of genomic structural variation. Nature Methods 6, 677–681 (2009)
  29. Simpson, J. T., McIntyre, R. E., Adams, D. J. & Durbin, R. Copy number variant detection in inbred strains from short read sequence data. Bioinformatics 26, 565–567 (2010)
  30. Manske, H. M. & Kwiatkowski, D. P. LookSeq: a browser-based viewer for deep sequencing data. Genome Res. 19, 2125–2132 (2009)

Acknowledgements

We thank A. Whitley, G. Durrant, A. M. Hammond, D. J. Fabrigar, L. Chen, M. Johannesson, E. Cong and G. Blázquez for helping B.Y. with various laboratory-based work. We also thank C. P. Ponting for comments on the manuscript. This project was supported by The Medical Research Council, UK, and the Wellcome Trust. D.J.A. is supported by Cancer Research UK.

Author information

Author notes

  1. Binnaz Yalcin, Kim Wong, Avigail Agam and Martin Goodson: These authors contributed equally to this work.

Authors and Affiliations

  1. The Wellcome Trust Centre for Human Genetics, Roosevelt Drive, Oxford OX3 7BN, UK,
    Binnaz Yalcin, Avigail Agam, Martin Goodson, Xiangchao Gan, Leo Goodstadt, Jérôme Nicod, Amarjit Bhomra, Polinka Hernandez-Pliego, Helen Whitley, James Cleak, Rebekah Dutton, Deborah Janowitz, Richard Mott & Jonathan Flint
  2. The Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1HH, UK,
    Kim Wong, Thomas M. Keane & David J. Adams
  3. Department of Physiology, MRC Functional Genomics Unit, Anatomy and Genetics, University of Oxford, South Parks Road, Oxford OX1 3QX, UK,
    Avigail Agam & Christoffer Nellåker
  4. Department of Psychiatry and Psychotherapy, Ernst-Moritz-Arndt-Universität Greifswald Klinikum der Hansestadt Stralsund, Rostocker Chaussee 70, 18437 Stralsund, Germany,
    Deborah Janowitz

Authors

  1. Binnaz Yalcin
  2. Kim Wong
  3. Avigail Agam
  4. Martin Goodson
  5. Thomas M. Keane
  6. Xiangchao Gan
  7. Christoffer Nellåker
  8. Leo Goodstadt
  9. Jérôme Nicod
  10. Amarjit Bhomra
  11. Polinka Hernandez-Pliego
  12. Helen Whitley
  13. James Cleak
  14. Rebekah Dutton
  15. Deborah Janowitz
  16. Richard Mott
  17. David J. Adams
  18. Jonathan Flint

Contributions

D.J.A. and J.F. conceived the study and directed the research. J.F. wrote the core of the paper. K.W. and T.K. performed the genome-wide SV discovery and local assembly for SV breakpoint resolution. K.W. carried out the sensitivity and specificity analyses. K.W. and B.Y. liaised regularly to integrate experimental work into genome-wide SV discovery pipeline. This resulted in a highly accurate map of SV across the mouse genome, essential to downstream analyses. A.B., P.H.P., H.W., J.C., R.D. and D.J. carried out experimental work, led by B.Y. A.B. and B.Y. analysed Sanger-based sequencing data, resolved SV breakpoints at nucleotide-level resolution and inferred mechanism of SV formation. M.G. performed the genome-wide SV mechanism of formation and outgroup analysis, with contributions from A.A. and B.Y.; J.F. and A.A. analysed functional impact of SVs on expression and phenotypes. C.N., L.G., J.N., A.A. and R.M. carried out additional analyses. B.Y. characterized function of individual SV examples.

Corresponding authors

Correspondence to David J. Adams or Jonathan Flint.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Information

This file contains Supplementary Figures 1-2 with legends, Supplementary Methods, Supplementary References and Supplementary Tables 1-5. (PDF 722 kb)

About this article

Cite this article

Yalcin, B., Wong, K., Agam, A. et al. Sequence-based characterization of structural variation in the mouse genome. Nature 477, 326–329 (2011). https://doi.org/10.1038/nature10432

Editorial Summary

Variation in the mouse genome

The laboratory mouse has become the workhorse of biomedical research. The draft sequence of the mouse reference genome was published in 2002, but some forms of variation are still poorly documented. Two papers in this issue go a long way towards filling the gaps. The generation and analysis of sequence from 17 key mouse genomes, including most of the commonly used inbred strains and their progenitors, reveal extensive genetic variation and provide insights into the molecular nature of functional variants as well as the phylogenetic history of the lab mouse. The data will be an important resource for a new era of functional analysis. The second paper describes the landscape of structural variants in the genomes of 13 classical and four wild-derived inbred mouse strains, mapping many of them to base-pair resolution. Despite their prevalence, structural variants are shown to have a relatively small impact on phenotypic variation.