Functional annotation of a full-length mouse cDNA collection

Acknowledgements

We thank the following (in alphabetical order) for discussion, encouragement and technical assistance: R. Abagyan, T. Akimura, K. Arakawa, M. Boguski, L. Corbani, T. A. Dragani, J. T. Eppig, S. Fujimori, G. Grillo, T. Haga, T. Hanagaki, S. Hanaoka, S. Hatta, N. Hayatsu, K. Hiramoto, T. Hiraoka, T. Hirozane, Y. Hodoyama, F. Hori, T. Hubbard, R. Hynes, K. Ikeda, K. Ikeo, C. Imamura, K. Imotani, S. Inoue, H. Kato, N. Kikuchi, Y. Kojima, A. Konagaya, M. Kouda, S. Koya, M. Kubota, S. Kumagai, C. Kurihara, M. Kusakabe, F. Licciulli, S. Liuni, L. Maltais, T. Matsuyama, L. McKenzie, A. Miyazaki, K. Mori, M. Muramatsu, M. Nakamura, K. Nomura, N. Nukina, K. Numata, R. Numazaki, M. Ohno, Y. Okuma, H. Ono, C. Owa, Y. Ozawa, G. Pertea, S. Ramachandran, E. M. Rubin, N. Saga, H. Saitou, H. Sakai, C. Sakai, A. Sakurai, H. Sano, D. Sasaki, L. Sato, C. Schneider, J. Schug, T. Shiraki, M. B. Soares, Y. Sogabe, C. Stoeckert, H. Sugawara, R. Sultana, H. Suzuki, M. Tagami, A. Tagawa, F. Takahashi, S. Takaku-Akahira, M. Takeuchi, T. Tanaka, Y. Tateno, Y. Tejima, J. Todd, A. Tomaru, S. Tonegawa, T. Toya, A. Wada, L. Wagner, A. Watahiki, T. Yamamura, T. Yamashita, T. Yao, A. Yasunishi, T. Yokota, S. Yokoyama, A. Yoshiki and K. Yotsutani. We also thank N. Kazuta, Y. Sigemoto, H. Torigoe and T. Washida for secretarial assistance. This study has been mainly supported by a grant for the RIKEN Genome Exploration Research Project and CREST (Core Research for Evolutional Science and Technology) to Y.H. Further support came from ACT-JST (Research and Development for Applying Advanced Computational Science and Technology) of Japan Science and Technology Corporation (JST) to Y.H. and H.M., and the Science and Technology Agency of the Japanese Government to Y.H. and Y.O. (All funds from the Science Technology Agency of the Japanese Government.) 
This work was also supported by a Grant-in-Aid for Scientific Research on Priority Areas and Human Genome Program, from the Ministry of Education, Science and Culture, and by a Grant-in-Aid for a Second Term Comprehensive 10-Year Strategy for Cancer Control from the Ministry of Health and Welfare to Y.H.

Authors’ contributions: J. Kawai and Y. Okazaki contributed as organizers of the Phase II team and FANTOM, respectively. A. Shinagawa and H. Bono contributed as managers of the sequence data production system and the computing system, respectively. J. Quackenbush, P. Carninci, M. J. Brownstein, D. A. Hume, C. Schönbach, H. Suzuki and C. Weitz acted as senior managers of the annotation project.

Author information

Authors and Affiliations

  1. Laboratory for Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), Yokohama Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, 230-0045, Kanagawa, Japan
    J. Kawai, A. Shinagawa, K. Shibata, M. Yoshino, M. Itoh, Y. Ishii, T. Arakawa, A. Hara, Y. Fukunishi, H. Konno, J. Adachi, S. Fukuda, K. Aizawa, M. Izawa, K. Nishi, H. Kiyosawa, S. Kondo, I. Yamanaka, T. Saito, Y. Okazaki, H. Bono, R. Saito, K. Kadota, K. Sakai, T. Okido, M. Furuno, H. Aono, P. Carninci, M. Kamiya, K. Sato, Y. Shibata, H. Suzuki, K. Yoshida & Y. Hayashizaki
  2. CREST, JST, 3-1-1 Koyadai, Tsukuba, 305-0074, Ibaraki, Japan
    J. Kawai, K. Shibata, M. Itoh, Y. Fukunishi, H. Konno, S. Fukuda, K. Aizawa, M. Kamiya & Y. Hayashizaki
  3. Center for Information Biology, National Institute of Genetics, 1111 Yata, Mishima, 411-8540, Shizuoka, Japan
    T. Gojobori & J. Mashima
  4. NTT Software Corporation, 223-1 Yamashita-cho, Naka-ku, Yokohama, 231-8554, Kanagawa, Japan
    T. Kasukawa, Y. Hasegawa, H. Kawaji & S. Kohtsuki
  5. Osaka University, 1-3 Machikaneyama, Toyonaka, 560-8531, Osaka, Japan
    H. Matsuda & H. Kawaji
  6. European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, Cambridge, UK
    M. Ashburner & W. Fleischmann
  7. Genomics Institute of the Novartis Research Foundation, 3115 Merryfield Row, San Diego, 92121, California, USA
    S. Batalov & C. Fletcher
  8. The Coordinated Laboratory for Computational Genomics, University of Iowa, Iowa City, 52242, Iowa, USA
    T. Casavant
  9. The Rockefeller University, 1230 York Avenue, New York, 10021-6399, New York, USA
    T. Gaasterland
  10. Dipartimento di Fisiologia e Biochimica Generali, Universita di Milano, Via Celoria, 26, Milano, 20133, Italy
    C. Gissi & G. Pesole
  11. Mouse Genome Informatics, The Jackson Laboratory, 600 Main Street, Bar Harbor, 04609, Maine, USA
    B. King, R. Baldarelli, J. Blake, C. Bult, D. Hill & M. Ringwald
  12. Laboratory for Bioinformatics, Faculty of Environmental Information, Keio University, 5322 Endoh, Fujisawa, 252-0816, Kanagawa, Japan
    H. Kochiwa, R. Suzuki, M. Tomita & T. Washio
  13. Department of Molecular & Cell Biology, University of Maryland at Baltimore, Baltimore, 20201, Maryland, USA
    P. Kuehl
  14. Department of Molecular & Cell Biology, University of California, Berkeley, 142 Life Sciences Addition #3200, Berkeley, 94720-3200, California, USA
    S. Lewis
  15. Computational Proteomics Team, Bioinformatics Group, RIKEN Genomic Sciences Center (GSC), Yokohama Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, 230-0045, Kanagawa, Japan
    Y. Matsuo
  16. Tokai University, Graduate School of Marine Science and Technology, 3-20-1 Orido, Shimizu, 424-8610, Shizuoka, Japan
    I. Nikaido
  17. The Institute for Genomic Research, 9712 Medical Center Dr., Rockville, 20850, Maryland, USA
    J. Quackenbush & N. H. Lee
  18. National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, Room 8N805, Bethesda, 20894, Maryland, USA
    L. M. Schriml & L. Wagner
  19. LION Bioscience AG, Im Neuenheimer Feld 515-519, Heidelberg, D-69120, Germany
    F. Staubli, N. Bojunga & M. Hofmann
  20. Stanford University School of Medicine, Beckman Centre B271A, Stanford, 94305-5428, California, USA
    G. Barsh
  21. Lawrence Berkeley Laboratory, 1 Cyclotron Rd, MS84-255, Berkeley, 94710, California, USA
    D. Boffelli
  22. Department of Pediatrics, The University of Iowa, 200 Hawkins Drive 440B EMRB, Iowa City, 52242-1009, Iowa, USA
    M. F. de Bonaldo
  23. Laboratory of Genetics, NIMH/NHGRI, National Institutes of Health, Building 36, Room 3D06, Bethesda, 20892, Maryland, USA
    M. J. Brownstein
  24. Graduate School of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, 113-8655, Tokyo, Japan
    M. Fujita
  25. Istituto Tumori Milano, Via Venezian, 1, Milano, 120133, Italy
    M. Gariboldi
  26. Department of Neurobiology, Harvard Medical School, 220 Longwood Ave., Boston, 02115, Massachusetts, USA
    S. Gustincich
  27. Institute for Molecular Bioscience, University of Queensland, Brisbane, 4072, Queensland, Australia
    D. A. Hume
  28. Department of Medical Genetics, Wellcome Trust Centre for Molecular Mechanisms in Disease, University of Cambridge, Wellcome Trust/MRC building, Addenbrookes Hospital, Cambridge, CB2 2XY, UK
    P. Lyons
  29. LNCIB c/o AREA Science Park, Padriciano 99, Trieste, 34012, Italy
    L. Marchionni
  30. Computational and Bioinformatics Laboratory, Center for Bioinformatics, University of Pennsylvania, 1313 Blockley Hall, 418 Guardian Drive, Philadelphia, 19104-6021, Pennsylvania, USA
    J. Mazzarelli
  31. Vertebrate Developmental Neurogenetics, The Rockefeller University, 1230 York Avenue, Box 242, New York, 10021-6399, New York, USA
    P. Mombaerts & I. Rodriguez
  32. University at Buffalo/Roswell Park Cancer Institute, 120 Meyers Rd.#615, Amherst, 14226, New York, USA
    P. Nordone
  33. Department of Genetics, Stanford University, Beckman Centre B281, Stanford, 94305, California, USA
    B. Ring
  34. RIKEN Brain Science Institute, 2-1 Hirosawa, Wako, 351-0198, Saitama, Japan
    N. Sakamoto
  35. National Cancer Research Institute, 1-1 Tsukiji, Chuo-ku, 104-0045, Tokyo, Japan
    H. Sasaki
  36. Computational Genomics Team, Bioinformatics Group, RIKEN Genomic Sciences Center (GSC), Yokohama Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, 230-0045, Kanagawa, Japan
    C. Schönbach
  37. Osaka Medical Center for Cancer, Nakamichi 1-3-3, Higashinari-ku, 537-8511, Osaka, Japan
    T. Seya
  38. Department of Neurobiology, Harvard Medical School, 220 Longwood Ave., Boston, 02115, Massachusetts, USA
    K.-F. Storch & C. Weitz
  39. Department of Pediatrics, University of California, San Diego, School of Medicine, 9500 Gilman Dr., Medical Teaching Facility 253, La Jolla, 92093-0627, California, USA
    K. Toyo-oka
  40. E17-353, Center for Learning and Memory, Massachusetts Institute of Technology, 77 Massachusetts Ave., Cambridge, 02139, Massachusetts, USA
    K. H. Wang
  41. Massachusetts Institute of Technology, MIT CCR, 77 Massachusetts Avenue 17-230, Cambridge, 02139, Massachusetts, USA
    C. Whittaker
  42. Sanger Centre, Wellcome Trust Genome Campus, Hinxton, CB10 1SA, Cambridgeshire, UK
    L. Wilming
  43. University of California San Diego School of Medicine, 9500 Gilman Dr., Medical Teaching Facility, Room 252, La Jolla, 92093-0627, California, USA
    A. Wynshaw-Boris
  44. Tsukuba University, 1-1-1 Tennodai, Tsukuba, 305-8577, Ibaraki, Japan
    K. Sato & Y. Hayashizaki

Consortia

The RIKEN Genome Exploration Research Group Phase II Team and the FANTOM Consortium

Corresponding author

Correspondence to Y. Hayashizaki.

Cite this article

The RIKEN Genome Exploration Research Group Phase II Team and the FANTOM Consortium. Functional annotation of a full-length mouse cDNA collection. Nature 409, 685–690 (2001). https://doi.org/10.1038/35055500