Genome-wide analysis of mouse transcripts using exon microarrays and factor graphs
- Letter
- Published: 28 August 2005
- Brendan J Frey1,2 na1,
- Naveed Mohammad2 na1,
- Quaid D Morris1,2 na1,
- Wen Zhang2,3 na1,
- Mark D Robinson1,2,
- Sanie Mnaimneh2,
- Richard Chang2,
- Qun Pan2,
- Eric Sat4,
- Janet Rossant3,4,
- Benoit G Bruneau3,5,
- Jane E Aubin3,
- Benjamin J Blencowe2,3 &
- …
- Timothy R Hughes2,3
Nature Genetics volume 37, pages 991–996 (2005)
Abstract
Recent mammalian microarray experiments detected widespread transcription and indicated that there may be many undiscovered multiple-exon protein-coding genes. To explore this possibility, we hybridized cDNA from unamplified, polyadenylation-selected RNA samples from 37 mouse tissues to microarrays encompassing 1.14 million exon probes. We analyzed these data using GenRate, a Bayesian algorithm that uses a genome-wide scoring function in a factor graph to infer genes. At a stringent exon false detection rate of 2.7%, GenRate detected 12,145 gene-length transcripts and confirmed 81% of the 10,000 most highly expressed known genes. Notably, our analysis showed that most of the 155,839 exons detected by GenRate were associated with known genes, providing microarray-based evidence that most multiple-exon genes have already been identified. GenRate also detected tens of thousands of potential new exons and reconciled discrepancies in current cDNA databases by 'stitching' new transcribed regions into previously annotated genes.
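As a rough reading aid, the percentages quoted in the abstract can be turned into approximate counts. The sketch below is not part of GenRate. It assumes that the 2.7% exon false detection rate is the fraction of detected exons that are false positives and that it applies to the full set of 155,839 detected exons; both assumptions are ours, and the function and constant names are hypothetical.

```python
# Figures quoted in the abstract.
EXONS_DETECTED = 155_839
EXON_FALSE_DETECTION_RATE = 0.027
TOP_KNOWN_GENES = 10_000
FRACTION_CONFIRMED = 0.81


def implied_counts():
    """Return (expected false exon detections, confirmed known genes).

    Assumption (not stated in the abstract): the false detection rate is
    the fraction of detected exons that are false positives, and it
    applies to all 155,839 detected exons.
    """
    expected_false_exons = round(EXONS_DETECTED * EXON_FALSE_DETECTION_RATE)
    confirmed_known_genes = round(TOP_KNOWN_GENES * FRACTION_CONFIRMED)
    return expected_false_exons, confirmed_known_genes


if __name__ == "__main__":
    false_exons, confirmed = implied_counts()
    print(f"Expected false exon detections: about {false_exons}")
    print(f"Highly expressed known genes confirmed: about {confirmed}")
```

Under these assumptions, roughly 4,200 of the detected exons would be expected to be false detections, and about 8,100 of the 10,000 most highly expressed known genes were confirmed.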
Accession codes
Accessions
Gene Expression Omnibus
Acknowledgements
We thank G.E. Hinton for conversations and C. Boone and B. Andrews for their support. This work was supported by grants from the Canadian Institutes of Health Research, the Natural Sciences and Engineering Research Council of Canada and the Canadian Foundation for Innovation (to T.R.H., B.J.F. and B.J.B.), by a PREA award (to B.J.F.) and by a Natural Sciences and Engineering Research Council of Canada postdoctoral fellowship (to Q.D.M.).
Author information
Author notes
- Brendan J Frey, Naveed Mohammad, Quaid D Morris and Wen Zhang: These authors contributed equally to this work.
Authors and Affiliations
- Electrical and Computer Engineering, University of Toronto, 10 King's College Rd., Toronto, M5S 3G4, Ontario, Canada
Brendan J Frey, Quaid D Morris & Mark D Robinson
- Banting and Best Department of Medical Research, University of Toronto, 112 College St., Toronto, M5G 1L6, Ontario, Canada
Brendan J Frey, Naveed Mohammad, Quaid D Morris, Wen Zhang, Mark D Robinson, Sanie Mnaimneh, Richard Chang, Qun Pan, Benjamin J Blencowe & Timothy R Hughes
- Medical Genetics and Microbiology, University of Toronto, 1 King's College Ct., Toronto, M5S 3G4, Ontario, Canada
Wen Zhang, Janet Rossant, Benoit G Bruneau, Jane E Aubin, Benjamin J Blencowe & Timothy R Hughes
- Samuel Lunenfeld Research Institute, Mount Sinai Hospital, 600 University Avenue, Toronto, M5G 1X5, Ontario, Canada
Eric Sat & Janet Rossant
- The Hospital for Sick Children, 555 University Ave., Toronto, M5G 1X8, Ontario, Canada
Benoit G Bruneau
Authors
- Brendan J Frey
- Naveed Mohammad
- Quaid D Morris
- Wen Zhang
- Mark D Robinson
- Sanie Mnaimneh
- Richard Chang
- Qun Pan
- Eric Sat
- Janet Rossant
- Benoit G Bruneau
- Jane E Aubin
- Benjamin J Blencowe
- Timothy R Hughes
Corresponding authors
Correspondence to Benjamin J Blencowe or Timothy R Hughes.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
About this article
Cite this article
Frey, B., Mohammad, N., Morris, Q. et al. Genome-wide analysis of mouse transcripts using exon microarrays and factor graphs. Nat Genet 37, 991–996 (2005). https://doi.org/10.1038/ng1630
- Received: 17 June 2005
- Accepted: 28 July 2005
- Published: 28 August 2005
- Issue Date: 01 September 2005
- DOI: https://doi.org/10.1038/ng1630