Open-reading-frame sequence tags (OSTs) support the existence of at least 17,300 genes in C. elegans
- Letter
- Published: March 2001
- Jérôme Reboul1 na1,
- Philippe Vaglio1 na1,
- Nia Tzellas1,
- Nicolas Thierry-Mieg1,2,
- Troy Moore3,
- Cindy Jackson3,
- Tadasu Shin-i4,
- Yuji Kohara4,
- Danielle Thierry-Mieg5,
- Jean Thierry-Mieg5,
- Hongmei Lee6,
- Joseph Hitti6,
- Lynn Doucette-Stamm6,
- James L. Hartley7,
- Gary F. Temple7,
- Michael A. Brasch7,
- Jean Vandenhaute8,
- Philippe E. Lamesch1,8,
- David E. Hill1 &
- Marc Vidal1
Nature Genetics volume 27, pages 332–336 (2001)
Abstract
The genome sequences of Caenorhabditis elegans, Drosophila melanogaster and Arabidopsis thaliana have been predicted to contain 19,000, 13,600 and 25,500 genes, respectively [1,2,3]. Before this information can be fully used for evolutionary and functional studies, several issues need to be addressed. First, the gene number estimates obtained in silico and not yet supported by any experimental data need to be verified. For example, it seems biologically paradoxical that C. elegans would have 50% more genes than Drosophila. Second, intron/exon predictions need to be tested experimentally. Third, complete sets of open reading frames (ORFs), or “ORFeomes” [4], need to be cloned into various expression vectors. To address these issues simultaneously, we have designed and applied to C. elegans the following strategy. Predicted ORFs are amplified by PCR from a highly representative cDNA library [4] using ORF-specific primers, cloned by Gateway recombination cloning [4,5,6] and then sequenced to generate ORF sequence tags (OSTs) as a way to verify identity and splicing. In a sample (n = 1,222) of the nearly 10,000 genes predicted ab initio (that is, for which no expressed sequence tag (EST) is available so far), at least 70% were verified by OSTs. We also observed that 27% of these experimentally confirmed genes have a structure different from that predicted by GeneFinder. We now have experimental evidence that supports the existence of at least 17,300 genes in C. elegans. Hence we suggest that gene counts based primarily on ESTs may underestimate the number of genes in human and in other organisms.
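The extrapolation behind a gene-count floor of this kind can be illustrated with a short calculation. The sketch below is only a rough illustration built from the rounded figures quoted in the abstract; it is not the authors' accounting. The function name `supported_gene_floor` is hypothetical, and two assumptions are made that the abstract does not state outright: that every predicted gene outside the ab initio class counts as EST-supported, and that the verification rate seen in the sample carries over to the whole ab initio class.

```python
# Rounded figures taken from the abstract; the paper's exact counts are not given there.
PREDICTED_TOTAL = 19_000   # genes predicted in silico for C. elegans
AB_INITIO = 10_000         # "nearly 10,000" predictions with no EST available
VERIFIED_PERCENT = 70      # "at least 70%" of sampled ab initio genes verified by OSTs


def supported_gene_floor(predicted_total: int, ab_initio: int, verified_percent: int) -> int:
    """Return a rough lower bound on the number of experimentally supported genes.

    Assumption (not from the source): all predictions outside the ab initio
    class are treated as EST-supported, and the sampled OST verification
    rate is extrapolated to the entire ab initio class.
    """
    est_supported = predicted_total - ab_initio
    # Integer arithmetic avoids floating-point rounding in the percentage step.
    ost_supported = ab_initio * verified_percent // 100
    return est_supported + ost_supported


floor = supported_gene_floor(PREDICTED_TOTAL, AB_INITIO, VERIFIED_PERCENT)
print(f"Rough floor on supported genes: {floor:,}")
```

With these rounded inputs the sketch yields 16,000, which is lower than the 17,300 reported. A gap is expected, because the abstract gives only approximations ("nearly 10,000", "at least 70%") and the exact counts used in the paper are not reproduced here.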
References
- The C. elegans Sequencing Consortium. Genome sequence of the nematode C. elegans: a platform for investigating biology. Science 282, 2012–2018 (1998).
- Adams, M.D. et al. The genome sequence of Drosophila melanogaster. Science 287, 2185–2195 (2000).
- The Arabidopsis Genome Initiative. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408, 796–815 (2000).
- Walhout, A.J.M. et al. Gateway recombinational cloning: application to the cloning of large numbers of open reading frames, or ORFeomes. Methods Enzymol. 328, 575–592 (2000).
- Walhout, A.J.M. et al. Protein interaction mapping in C. elegans using proteins involved in vulval development. Science 287, 116–122 (2000).
- Hartley, J.L., Temple, G.F. & Brasch, M.A. DNA cloning using in vitro site-specific recombination. Genome Res. 10, 1788–1795 (2000).
- Hill, A.A., Hunter, C.P., Tsung, B.T., Tucker-Kellogg, G. & Brown, E.L. Genomic analysis of gene expression in C. elegans. Science 290, 809–812 (2000).
- Gopal, S. et al. Homology-based annotation yields 1,042 new candidate genes in the Drosophila melanogaster genome. Nature Genet. 27, 337–340 (2001).
- Ewing, B. & Green, P. Analysis of expressed sequence tags indicates 35,000 human genes. Nature Genet. 25, 232–234 (2000).
- Dunham, I. et al. The DNA sequence of human chromosome 22. Nature 402, 489–495 (1999).
- Roest Crollius, H. et al. Estimate of human gene number provided by genome-wide analysis using Tetraodon nigroviridis DNA sequence. Nature Genet. 25, 235–238 (2000).
- Liang, F. et al. Gene Index analysis of the human genome estimates approximately 120,000 genes. Nature Genet. 25, 239–240 (2000); correction: 26, 501 (2000).
Acknowledgements
We thank S. Boulton, L. Matthews, J. Polanowska, M. Tewari and A.J.M. Walhout for comments on the manuscript and discussions; and L. Hillier and P. Green for the primer design program (OSP). This work was supported by grants from CREST, Japan Science and Technology Corporation and Grant-in-Aid for Scientific Research on Priority Areas from the Ministry of Education, Science, Sports and Culture of Japan (to Y.K.), and by grants 1 RO1 HG01715-01 from the National Human Genome Research Institute, 1 R21 CA81658 A 01 from the National Cancer Institute and 128 from the Merck Genome Research Institute (to M.V.).
Author information
Author notes
- Jérôme Reboul and Philippe Vaglio: These authors contributed equally to this work.
Authors and Affiliations
- Dana-Farber Cancer Institute and Department of Genetics, Harvard Medical School, Boston, Massachusetts, USA
Jérôme Reboul, Philippe Vaglio, Nia Tzellas, Nicolas Thierry-Mieg, Philippe E. Lamesch, David E. Hill & Marc Vidal
- Laboratoire LSR-IMAG, St-Martin D'Heres, France
Nicolas Thierry-Mieg
- Research Genetics, Huntsville, Alabama, USA
Troy Moore & Cindy Jackson
- Genome Biology Laboratory, National Institute of Genetics, Mishima, Japan
Tadasu Shin-i & Yuji Kohara
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
Danielle Thierry-Mieg & Jean Thierry-Mieg
- Genome Therapeutics Corp., Waltham, Massachusetts, USA
Hongmei Lee, Joseph Hitti & Lynn Doucette-Stamm
- Life Technologies Inc., Rockville, Maryland, USA
James L. Hartley, Gary F. Temple & Michael A. Brasch
- Département de Biologie, Facultés Universitaires Notre-Dame de la Paix, Namur, Belgium
Jean Vandenhaute & Philippe E. Lamesch
Authors
- Jérôme Reboul
- Philippe Vaglio
- Nia Tzellas
- Nicolas Thierry-Mieg
- Troy Moore
- Cindy Jackson
- Tadasu Shin-i
- Yuji Kohara
- Danielle Thierry-Mieg
- Jean Thierry-Mieg
- Hongmei Lee
- Joseph Hitti
- Lynn Doucette-Stamm
- James L. Hartley
- Gary F. Temple
- Michael A. Brasch
- Jean Vandenhaute
- Philippe E. Lamesch
- David E. Hill
- Marc Vidal
Corresponding author
Correspondence to Marc Vidal.
About this article
Cite this article
Reboul, J., Vaglio, P., Tzellas, N. et al. Open-reading-frame sequence tags (OSTs) support the existence of at least 17,300 genes in C. elegans. Nat Genet 27, 332–336 (2001). https://doi.org/10.1038/85913
- Received: 28 November 2000
- Accepted: 06 February 2001
- Issue Date: March 2001
- DOI: https://doi.org/10.1038/85913