Functional annotation of a full-length mouse cDNA collection

References

Roest Crollius, H. et al. Estimate of human gene number provided by genome-wide analysis using Tetraodon nigroviridis DNA sequence. Nature Genet. 25, 235–238 (2000).
Ewing, B. & Green, P. Analysis of expressed sequence tags indicates 35,000 human genes. Nature Genet. 25, 232–234 (2000).
Liang, F. et al. Gene index analysis of the human genome estimates approximately 120,000 genes. Nature Genet. 25, 239–240 (2000).
Carninci, P. & Hayashizaki, Y. High-efficiency full-length cDNA cloning. Methods Enzymol. 303, 19–44 (1999).
Carninci, P. et al. High-efficiency full-length cDNA cloning by biotinylated CAP trapper. Genomics 37, 327–336 (1996).
Carninci, P. et al. Normalization and subtraction of cap-trapper-selected cDNAs to prepare full-length cDNA libraries for rapid discovery of new genes. Genome Res. 10, 1617–1630 (2000).
Ewing, B., Hillier, L., Wendl, M. C. & Green, P. Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res. 8, 175–185 (1998).
Ewing, B. & Green, P. Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res. 8, 186–194 (1998).
Gautheret, D., Poirot, O., Lopez, F., Audic, S. & Claverie, J. M. Alternate polyadenylation in human mRNAs: a large-scale analysis by EST clustering. Genome Res. 8, 524–530 (1998).
Huang, X., Adams, M. D., Zhou, H. & Kerlavage, A. R. A tool for analyzing and annotating genomic sequences. Genomics 46, 37–45 (1997).
Huang, X. & Madan, A. CAP3: A DNA sequence assembly program. Genome Res. 9, 868–877 (1999).
Thompson, J. D., Higgins, D. G. & Gibson, T. J. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680 (1994).
Ashburner, M. et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nature Genet. 25, 25–29 (2000).
Croft, L. et al. ISIS, the intron information system, reveals the high frequency of alternative splicing in the human genome. Nature Genet. 24, 340–341 (2000).
Hanke, J. et al. Alternative splicing of human genes: more the rule than the exception? Trends Genet. 15, 389–390 (1999).
Rubin, G. M. et al. Comparative genomics of the eukaryotes. Science 287, 2204–2215 (2000).
Adams, M. D. et al. The genome sequence of Drosophila melanogaster. Science 287, 2185–2195 (2000).
Aravind, L. & Koonin, E. V. SAP- a putative DNA-binding motif involved in chromosomal organization. Trends Biochem. Sci. 25, 112–114 (2000).
Matsuda, H. Detection of conserved domains in protein sequences using a maximum-density subgraph algorithm. IEICE Trans. Fundamentals Electron. Commun. Comput. Sci. E83-A, 713–721 (2000).
Pesole, G., Liuni, S. & D'Souza, M. PatSearch: a pattern matcher software that finds functional elements in nucleotide and protein sequences and assesses their statistical significance. Bioinformatics 16, 439–450 (2000).
Carninci, P. et al. Thermostabilization and thermoactivation of thermolabile enzymes by trehalose and its application for the synthesis of full length cDNA. Proc. Natl Acad. Sci. USA 95, 520–524 (1998).
Batzoglou, S., Pachter, L., Mesirov, J. P., Berger, B. & Lander, E. S. Human and mouse gene structure: comparative analysis and application to exon prediction. Genome Res. 10, 950–958 (2000).
Itoh, M. et al. Automated filtration-based high-throughput plasmid preparation system. Genome Res. 9, 463–470 (1999).
Shibata, K. et al. RIKEN integrated sequence analysis (RISA) system-384-format sequencing pipeline with 384 multicapillary sequencer. Genome Res. 10, 1757–1771 (2000).
Gordon, D., Abajian, C. & Green, P. Consed: a graphical tool for sequence finishing. Genome Res. 8, 195–202 (1998).
Fukunishi, Y. & Hayashizaki, Y. Amino-acid translation program for full-length cDNA sequences with frame-shift error. Physiol. Genomics. (in the press).

Acknowledgements

We thank the following (in alphabetical order) for discussion, encouragement and technical assistance: R. Abagyan, T. Akimura, K. Arakawa, M. Boguski, L. Corbani, T. A. Dragani, J. T. Eppig, S. Fujimori, G. Grillo, T. Haga, T. Hanagaki, S. Hanaoka, S. Hatta, N. Hayatsu, K. Hiramoto, T. Hiraoka, T. Hirozane, Y. Hodoyama, F. Hori, T. Hubbard, R. Hynes, K. Ikeda, K. Ikeo, C. Imamura, K. Imotani, S. Inoue, H. Kato, N. Kikuchi, Y. Kojima, A. Konagaya, M. Kouda, S. Koya, M. Kubota, S. Kumagai, C. Kurihara, M. Kusakabe, F. Licciulli, S. Liuni, L. Maltais, T. Matsuyama, L. McKenzie, A. Miyazaki, K. Mori, M. Muramatsu, M. Nakamura, K. Nomura, N. Nukina, K. Numata, R. Numazaki, M. Ohno, Y. Okuma, H. Ono, C. Owa, Y. Ozawa, G. Pertea, S. Ramachandran, E. M. Rubin, N. Saga, H. Saitou, H. Sakai, C. Sakai, A. Sakurai, H. Sano, D. Sasaki, L. Sato, C. Schneider, J. Schug, T. Shiraki, M. B. Soares, Y. Sogabe, C. Stoeckert, H. Sugawara, R. Sultana, H. Suzuki, M. Tagami, A. Tagawa, F. Takahashi, S. Takaku-Akahira, M. Takeuchi, T. Tanaka, Y. Tateno, Y. Tejima, J. Todd, A. Tomaru, S. Tonegawa, T. Toya, A. Wada, L. Wagner, A. Watahiki, T. Yamamura, T. Yamashita, T. Yao, A. Yasunishi, T. Yokota, S. Yokoyama, A. Yoshiki and K. Yotsutani. We also thank N. Kazuta, Y. Sigemoto, H. Torigoe and T. Washida for secretarial assistance. This study has been mainly supported by a grant for the RIKEN Genome Exploration Research Project and CREST (Core Research for Evolutional Science and Technology) to Y.H. Further support came from ACT-JST (Research and Development for Applying Advanced Computational Science and Technology) of Japan Science and Technology Corporation (JST) to Y.H. and H.M., and the Science and Technology Agency of the Japanese Government to Y.H. and Y.O. (All funds from the Science Technology Agency of the Japanese Government.) 
This work was also supported by a Grant-in-Aid for Scientific Research on Priority Areas and the Human Genome Program from the Ministry of Education, Science and Culture, and by a Grant-in-Aid for a Second Term Comprehensive 10-Year Strategy for Cancer Control from the Ministry of Health and Welfare to Y.H.
Authors’ contributions: J. Kawai and Y. Okazaki served as organizers of the Phase II team and FANTOM, respectively. A. Shinagawa and H. Bono served as managers of the sequence data production system and the computing system, respectively. J. Quackenbush, P. Carninci, M. J. Brownstein, D. A. Hume, C. Schönbach, H. Suzuki and C. Weitz acted as senior managers of the annotation project.

Author information

Authors and Affiliations

Laboratory for Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), Yokohama Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, 230-0045, Kanagawa, Japan
J. Kawai, A. Shinagawa, K. Shibata, M. Yoshino, M. Itoh, Y. Ishii, T. Arakawa, A. Hara, Y. Fukunishi, H. Konno, J. Adachi, S. Fukuda, K. Aizawa, M. Izawa, K. Nishi, H. Kiyosawa, S. Kondo, I. Yamanaka, T. Saito, Y. Okazaki, H. Bono, R. Saito, K. Kadota, K. Sakai, T. Okido, M. Furuno, H. Aono, P. Carninci, M. Kamiya, K. Sato, Y. Shibata, H. Suzuki, K. Yoshida & Y. Hayashizaki
CREST, JST, 3-1-1 Koyadai, Tsukuba, 305-0074, Ibaraki, Japan
J. Kawai, K. Shibata, M. Itoh, Y. Fukunishi, H. Konno, S. Fukuda, K. Aizawa, M. Kamiya & Y. Hayashizaki
Center for Information Biology, National Institute of Genetics, 1111 Yata, Mishima, 411-8540, Shizuoka, Japan
T. Gojobori & J. Mashima
NTT Software Corporation, 223-1 Yamashita-cho, Naka-ku, Yokohama, 231-8554, Kanagawa, Japan
T. Kasukawa, Y. Hasegawa, H. Kawaji & S. Kohtsuki
Osaka University, 1-3 Machikaneyama, Toyonaka, 560-8531, Osaka, Japan
H. Matsuda & H. Kawaji
European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, Cambridge, UK
M. Ashburner & W. Fleischmann
Genomics Institute of the Novartis Research Foundation, 3115 Merryfield Row, San Diego, 92121, California, USA
S. Batalov & C. Fletcher
The Coordinated Laboratory for Computational Genomics, University of Iowa, Iowa City, 52242, Iowa, USA
T. Casavant
The Rockefeller University, 1230 York Avenue, New York, 10021-6399, New York, USA
T. Gaasterland
Dipartimento di Fisiologia e Biochimica Generali, Università di Milano, Via Celoria 26, Milano, 20133, Italy
C. Gissi & G. Pesole
Mouse Genome Informatics, The Jackson Laboratory, 600 Main Street, Bar Harbor, 04609, Maine, USA
B. King, R. Baldarelli, J. Blake, C. Bult, D. Hill & M. Ringwald
Laboratory for Bioinformatics, Faculty of Environmental Information, Keio University, 5322 Endoh, Fujisawa, 252-0816, Kanagawa, Japan
H. Kochiwa, R. Suzuki, M. Tomita & T. Washio
Department of Molecular & Cell Biology, University of Maryland at Baltimore, Baltimore, 20201, Maryland, USA
P. Kuehl
Department of Molecular & Cell Biology, University of California, Berkeley, 142 Life Sciences Addition #3200, Berkeley, 94720-3200, California, USA
S. Lewis
Computational Proteomics Team, Bioinformatics Group, RIKEN Genomic Sciences Center (GSC), Yokohama Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, 230-0045, Kanagawa, Japan
Y. Matsuo
Tokai University, Graduate School of Marine Science and Technology, 3-20-1 Orido, Shimizu, 424-8610, Shizuoka, Japan
I. Nikaido
The Institute for Genomic Research, 9712 Medical Center Dr., Rockville, 20850, Maryland, USA
J. Quackenbush & N. H. Lee
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, Room 8N805, Bethesda, 20894, Maryland, USA
L. M. Schriml & L. Wagner
LION Bioscience AG, Im Neuenheimer Feld 515-519, Heidelberg, D-69120, Germany
F. Staubli, N. Bojunga & M. Hofmann
Stanford University School of Medicine, Beckman Centre B271A, Stanford, 94305-5428, California, USA
G. Barsh
Lawrence Berkeley Laboratory, 1 Cyclotron Rd, MS84-255, Berkeley, 94710, California, USA
D. Boffelli
Department of Pediatrics, The University of Iowa, 200 Hawkins Drive 440B EMRB, Iowa City, 52242-1009, Iowa, USA
M. F. de Bonaldo
Laboratory of Genetics, NIMH/NHGRI, National Institutes of Health, Building 36, Room 3D06, Bethesda, 20892, Maryland, USA
M. J. Brownstein
Graduate School of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, 113-8655, Tokyo, Japan
M. Fujita
Istituto Tumori Milano, Via Venezian 1, Milano, 20133, Italy
M. Gariboldi
Department of Neurobiology, Harvard Medical School, 220 Longwood Ave., Boston, 02115, Massachusetts, USA
S. Gustincich
Institute for Molecular Bioscience, University of Queensland, Brisbane, 4072, Queensland, Australia
D. A. Hume
Department of Medical Genetics, Wellcome Trust Centre for Molecular Mechanisms in Disease, University of Cambridge, Wellcome Trust/MRC building, Addenbrookes Hospital, Cambridge, CB2 2XY, UK
P. Lyons
LNCIB c/o AREA Science Park, Padriciano 99, Trieste, 34012, Italy
L. Marchionni
Computational and Bioinformatics Laboratory, Center for Bioinformatics, University of Pennsylvania, 1313 Blockley Hall, 418 Guardian Drive, Philadelphia, 19104-6021, Pennsylvania, USA
J. Mazzarelli
Vertebrate Developmental Neurogenetics, The Rockefeller University, 1230 York Avenue, Box 242, New York, 10021-6399, New York, USA
P. Mombaerts & I. Rodriguez
University at Buffalo/Roswell Park Cancer Institute, 120 Meyers Rd.#615, Amherst, 14226, New York, USA
P. Nordone
Department of Genetics, Stanford University, Beckman Centre B281, Stanford, 94305, California, USA
B. Ring
RIKEN Brain Science Institute, 2-1 Hirosawa, Wako, 351-0198, Saitama, Japan
N. Sakamoto
National Cancer Research Institute, 1-1 Tsukiji, Chuo-ku, 104-0045, Tokyo, Japan
H. Sasaki
Computational Genomics Team, Bioinformatics Group, RIKEN Genomic Sciences Center (GSC), Yokohama Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, 230-0045, Kanagawa, Japan
C. Schönbach
Osaka Medical Center for Cancer, Nakamichi 1-3-3, Higashinari-ku, 537-8511, Osaka, Japan
T. Seya
Department of Neurobiology, Harvard Medical School, 220 Longwood Ave., Boston, 02115, Massachusetts, USA
K.-F. Storch & C. Weitz
Department of Pediatrics, University of California, San Diego, School of Medicine, 9500 Gilman Dr., Medical Teaching Facility 253, La Jolla, 92093-0627, California, USA
K. Toyo-oka
E17-353, Center for Learning and Memory, Massachusetts Institute of Technology, 77 Massachusetts Ave., Cambridge, 02139, Massachusetts, USA
K. H. Wang
Massachusetts Institute of Technology, MIT CCR, 77 Massachusetts Avenue 17-230, Cambridge, 02139, Massachusetts, USA
C. Whittaker
Sanger Centre, Wellcome Trust Genome Campus, Hinxton, CB10 1SA, Cambridgeshire, UK
L. Wilming
University of California San Diego School of Medicine, 9500 Gilman Dr., Medical Teaching Facility, Room 252, La Jolla, 92093-0627, California, USA
A. Wynshaw-Boris
Tsukuba University, 1-1-1 Tennodai, Tsukuba, 305-8577, Ibaraki, Japan
K. Sato & Y. Hayashizaki

Consortia

The RIKEN Genome Exploration Research Group Phase II Team and the FANTOM Consortium

The RIKEN Genome Exploration Research Group Phase II Team
J. Kawai, A. Shinagawa, K. Shibata, M. Yoshino, M. Itoh, Y. Ishii, T. Arakawa, A. Hara, Y. Fukunishi, H. Konno, J. Adachi, S. Fukuda, K. Aizawa, M. Izawa, K. Nishi, H. Kiyosawa, S. Kondo, I. Yamanaka & T. Saito
FANTOM Consortium
Y. Okazaki, T. Gojobori, H. Bono, T. Kasukawa, R. Saito, K. Kadota, H. Matsuda, M. Ashburner, S. Batalov, T. Casavant, W. Fleischmann, T. Gaasterland, C. Gissi, B. King, H. Kochiwa, P. Kuehl, S. Lewis, Y. Matsuo, I. Nikaido, G. Pesole, J. Quackenbush, L. M. Schriml, F. Staubli, R. Suzuki, M. Tomita, L. Wagner, T. Washio, K. Sakai, T. Okido, M. Furuno, H. Aono, R. Baldarelli, G. Barsh, J. Blake, D. Boffelli, N. Bojunga, P. Carninci, M. F. de Bonaldo, M. J. Brownstein, C. Bult, C. Fletcher, M. Fujita, M. Gariboldi, S. Gustincich, D. Hill, M. Hofmann, D. A. Hume, M. Kamiya, N. H. Lee, P. Lyons, L. Marchionni, J. Mashima, J. Mazzarelli, P. Mombaerts, P. Nordone, B. Ring, M. Ringwald, I. Rodriguez, N. Sakamoto, H. Sasaki, K. Sato, C. Schönbach, T. Seya, Y. Shibata, K.-F. Storch, H. Suzuki, K. Toyo-oka, K. H. Wang, C. Weitz, C. Whittaker, L. Wilming, A. Wynshaw-Boris, K. Yoshida, Y. Hasegawa, H. Kawaji & S. Kohtsuki
General organizer
Y. Hayashizaki

Corresponding author

Correspondence to Y. Hayashizaki.

About this article

Cite this article

The RIKEN Genome Exploration Research Group Phase II Team and the FANTOM Consortium. Functional annotation of a full-length mouse cDNA collection. Nature 409, 685–690 (2001). https://doi.org/10.1038/35055500

Received: 06 November 2000
Accepted: 29 December 2000
Issue Date: 08 February 2001
DOI: https://doi.org/10.1038/35055500