Comparative analyses of multi-species sequences from targeted genomic regions
- Letter
- Published: 14 August 2003
- J. W. Thomas1,
- J. W. Touchman1,2,13,
- R. W. Blakesley1,2,
- G. G. Bouffard1,2,
- S. M. Beckstrom-Sternberg1,2,
- E. H. Margulies1,
- M. Blanchette3,
- A. C. Siepel3,
- P. J. Thomas2,
- J. C. McDowell2,
- B. Maskeri2,
- N. F. Hansen2,
- M. S. Schwartz3,
- R. J. Weber3,
- W. J. Kent3,
- D. Karolchik3,
- T. C. Bruen3,
- R. Bevan3,
- D. J. Cutler4,
- S. Schwartz5,
- L. Elnitski5,
- J. R. Idol1,
- A. B. Prasad1,
- S.-Q. Lee-Lin1,
- V. V. B. Maduro1,
- T. J. Summers1,
- M. E. Portnoy1,
- N. L. Dietrich2,
- N. Akhter2,
- K. Ayele2,
- B. Benjamin2,
- K. Cariaga2,
- C. P. Brinkley2,
- S. Y. Brooks2,
- S. Granite2,
- X. Guan2,
- J. Gupta2,
- P. Haghighi2,
- S.-L. Ho2,
- M. C. Huang2,
- E. Karlins2,
- P. L. Laric2,
- R. Legaspi2,
- M. J. Lim2,
- Q. L. Maduro2,
- C. A. Masiello2,
- S. D. Mastrian2,
- J. C. McCloskey2,
- R. Pearson2,
- S. Stantripop2,
- E. E. Tiongson2,
- J. T. Tran2,
- C. Tsurgeon2,
- J. L. Vogt2,
- M. A. Walker2,
- K. D. Wetherby2,
- L. S. Wiggins2,
- A. C. Young2,
- L.-H. Zhang2,
- K. Osoegawa6,
- B. Zhu6,
- B. Zhao6,
- C. L. Shu6,
- P. J. De Jong6,
- C. E. Lawrence7,
- A. F. Smit8,
- A. Chakravarti4,
- D. Haussler3,9,
- P. Green10,
- W. Miller5 &
- …
- E. D. Green1,2
Nature volume 424, pages 788–793 (2003)
Abstract
The systematic comparison of genomic sequences from different organisms represents a central focus of contemporary genome analysis. Comparative analyses of vertebrate sequences can identify coding1,2,3,4,5,6 and conserved non-coding4,6,7 regions, including regulatory elements8,9,10, and provide insight into the forces that have rendered modern-day genomes6. As a complement to whole-genome sequencing efforts3,5,6, we are sequencing and comparing targeted genomic regions in multiple, evolutionarily diverse vertebrates. Here we report the generation and analysis of over 12 megabases (Mb) of sequence from 12 species, all derived from the genomic region orthologous to a segment of about 1.8 Mb on human chromosome 7 containing ten genes, including the gene mutated in cystic fibrosis. These sequences show conservation reflecting both functional constraints and the neutral mutational events that shaped this genomic region. In particular, we identify substantial numbers of conserved non-coding segments beyond those previously identified experimentally, most of which are not detectable by pair-wise sequence comparisons alone. Analysis of transposable element insertions highlights the variation in genome dynamics among these species and confirms the placement of rodents as a sister group to the primates.
References
1. Batzoglou, S., Pachter, L., Mesirov, J. P., Berger, B. & Lander, E. S. Human and mouse gene structure: comparative analysis and application to exon prediction. Genome Res. 10, 950–958 (2000)
2. Roest Crollius, H. et al. Estimate of human gene number provided by genome-wide analysis using Tetraodon nigroviridis DNA sequence. Nature Genet. 25, 235–238 (2000)
3. International Human Genome Sequencing Consortium. Initial sequencing and analysis of the human genome. Nature 409, 860–921 (2001)
4. Chen, R., Bouck, J. B., Weinstock, G. M. & Gibbs, R. A. Comparing vertebrate whole-genome shotgun reads to the human genome. Genome Res. 11, 1807–1816 (2001)
5. Aparicio, S. et al. Whole-genome shotgun assembly and analysis of the genome of Fugu rubripes. Science 297, 1301–1310 (2002)
6. Mouse Genome Sequencing Consortium. Initial sequencing and comparative analysis of the mouse genome. Nature 420, 520–562 (2002)
7. Dubchak, I. et al. Active conservation of noncoding sequences revealed by three-way species comparisons. Genome Res. 10, 1304–1306 (2000)
8. Gottgens, B. et al. Analysis of vertebrate SCL loci identifies conserved enhancers. Nature Biotechnol. 18, 181–186 (2000)
9. Hardison, R. C. Conserved noncoding sequences are reliable guides to regulatory elements. Trends Genet. 16, 369–372 (2000)
10. Pennacchio, L. A. & Rubin, E. M. Genomic strategies to identify mammalian regulatory sequences. Nature Rev. Genet. 2, 100–109 (2001)
11. Rommens, J. M. et al. Identification of the cystic fibrosis gene: chromosome walking and jumping. Science 245, 1059–1065 (1989)
12. Felsenfeld, A., Peterson, J., Schloss, J. & Guyer, M. Assessing the quality of the DNA sequence from The Human Genome Project. Genome Res. 9, 1–4 (1999)
13. Schwartz, S. et al. Human–mouse alignments with BLASTZ. Genome Res. 13, 103–107 (2003)
14. Schwartz, S. et al. MultiPipMaker and supporting tools: alignments and analysis of multiple genomic DNA sequences. Nucleic Acids Res. 31, 3518–3524 (2003)
15. Murphy, W. J. et al. Resolution of the early placental mammal radiation using Bayesian phylogenetics. Science 294, 2348–2351 (2001)
16. Poux, C., Van Rheede, T., Madsen, O. & de Jong, W. W. Sequence gaps join mice and men: phylogenetic evidence from deletions in two proteins. Mol. Biol. Evol. 19, 2035–2037 (2002)
17. Huelsenbeck, J. P., Larget, B. & Swofford, D. A compound Poisson process for relaxing the molecular clock. Genetics 154, 1879–1892 (2000)
18. Cooper, G. M. et al. Quantitative estimates of sequence divergence for comparative analyses of mammalian genomes. Genome Res. 13, 813–820 (2003)
19. Siepel, A. & Haussler, D. Proc. 7th Annual Int. Conf. Research in Computational Molecular Biology (ACM, New York, 2003)
20. Hardison, R. C. et al. Covariation in frequencies of substitution, deletion, transposition, and recombination during eutherian evolution. Genome Res. 13, 13–26 (2003)
21. Green, P. et al. Transcription-associated mutational asymmetry in mammalian evolution. Nature Genet. 33, 514–517 (2003)
22. Frazer, K. A. et al. Genomic DNA insertions and deletions occur frequently between humans and nonhuman primates. Genome Res. 13, 341–346 (2003)
23. Britten, R. J. Divergence between samples of chimpanzee and human DNA sequences is 5%, counting indels. Proc. Natl Acad. Sci. USA 99, 13633–13635 (2002)
24. Springer, M. S., Murphy, W. J., Eizirik, E. & O'Brien, S. J. Placental mammal diversification and the Cretaceous/Tertiary boundary. Proc. Natl Acad. Sci. USA 100, 1056–1061 (2003)
25. Li, W. H., Ellsworth, D. L., Krushkal, J., Chang, B. H. & Hewett-Emmett, D. Rates of nucleotide substitution in primates and rodents and the generation-time effect hypothesis. Mol. Phylogenet. Evol. 5, 182–187 (1996)
26. Kumar, S. & Subramanian, S. Mutation rates in mammalian genomes. Proc. Natl Acad. Sci. USA 99, 803–808 (2002)
27. Shizuya, H. et al. Cloning and stable maintenance of 300-kilobase-pair fragments of human DNA in Escherichia coli using an F-factor-based vector. Proc. Natl Acad. Sci. USA 89, 8794–8797 (1992)
28. Thomas, J. W. et al. Parallel construction of orthologous sequence-ready clone contig maps in multiple species. Genome Res. 12, 1277–1285 (2002)
29. Burge, C. & Karlin, S. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 268, 78–94 (1997)
30. Kent, W. J. et al. The human genome browser at UCSC. Genome Res. 12, 996–1006 (2002)
Acknowledgements
We thank J. Weissenbach and H. Roest Crollius for Tetraodon BACs; M. Diekhans for computational expertise; N. Goldman and Z. Yang for advice on phylogenetic analyses; and F. Collins and J. Mullikin for critically reading the manuscript. We acknowledge the support of the National Human Genome Research Institute (National Institutes of Health) and the Howard Hughes Medical Institute.
Author information
Author notes
- J. W. Thomas
Present address: Department of Human Genetics, Emory University School of Medicine, Atlanta, Georgia, 30322, USA
- J. W. Touchman
Present address: Translational Genomics Research Institute, Phoenix, Arizona, 85004
Authors and Affiliations
- Genome Technology Branch, National Human Genome Research Institute,
J. W. Thomas, J. W. Touchman, R. W. Blakesley, G. G. Bouffard, S. M. Beckstrom-Sternberg, E. H. Margulies, J. R. Idol, A. B. Prasad, S.-Q. Lee-Lin, V. V. B. Maduro, T. J. Summers, M. E. Portnoy & E. D. Green
- NIH Intramural Sequencing Center, National Institutes of Health, Bethesda, Maryland, 20892, USA
J. W. Touchman, R. W. Blakesley, G. G. Bouffard, S. M. Beckstrom-Sternberg, P. J. Thomas, J. C. McDowell, B. Maskeri, N. F. Hansen, N. L. Dietrich, N. Akhter, K. Ayele, B. Benjamin, K. Cariaga, C. P. Brinkley, S. Y. Brooks, S. Granite, X. Guan, J. Gupta, P. Haghighi, S.-L. Ho, M. C. Huang, E. Karlins, P. L. Laric, R. Legaspi, M. J. Lim, Q. L. Maduro, C. A. Masiello, S. D. Mastrian, J. C. McCloskey, R. Pearson, S. Stantripop, E. E. Tiongson, J. T. Tran, C. Tsurgeon, J. L. Vogt, M. A. Walker, K. D. Wetherby, L. S. Wiggins, A. C. Young, L.-H. Zhang & E. D. Green
- Center for Biomolecular Science and Engineering, University of California, Santa Cruz, California, 95064, USA
M. Blanchette, A. C. Siepel, M. S. Schwartz, R. J. Weber, W. J. Kent, D. Karolchik, T. C. Bruen, R. Bevan & D. Haussler
- Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, 21287, USA
D. J. Cutler & A. Chakravarti
- Department of Computer Science and Engineering, The Pennsylvania State University, University Park, Pennsylvania, 16802, USA
S. Schwartz, L. Elnitski & W. Miller
- Children's Hospital Oakland Research Institute, Oakland, California, 94609, USA
K. Osoegawa, B. Zhu, B. Zhao, C. L. Shu & P. J. De Jong
- The Wadsworth Center for Laboratories and Research, New York State Department of Health, Albany, New York, 12201, USA
C. E. Lawrence
- The Institute for Systems Biology, Seattle, Washington, 98103, USA
A. F. Smit
- Howard Hughes Medical Institute, University of California, Santa Cruz, California, 95064, USA
D. Haussler
- Howard Hughes Medical Institute and Department of Genome Sciences, University of Washington, Seattle, Washington, 98195, USA
P. Green
- Department of Biology, Arizona State University, Tempe, Arizona, 85287, USA
J. W. Touchman
Corresponding author
Correspondence to E. D. Green.
Ethics declarations
Competing interests
The authors declare that they have no competing financial interests.
Cite this article
Thomas, J., Touchman, J., Blakesley, R. et al. Comparative analyses of multi-species sequences from targeted genomic regions. Nature 424, 788–793 (2003). https://doi.org/10.1038/nature01858
- Received: 11 April 2003
- Accepted: 16 June 2003
- Issue Date: 14 August 2003
- DOI: https://doi.org/10.1038/nature01858