The transcriptional landscape of the yeast genome defined by RNA sequencing
Ugrappa Nagalakshmi et al. Science. 2008.
Abstract
The identification of untranslated regions, introns, and coding regions within an organism remains challenging. We developed a quantitative sequencing-based method called RNA-Seq for mapping transcribed regions, in which complementary DNA fragments are subjected to high-throughput sequencing and mapped to the genome. We applied RNA-Seq to generate a high-resolution transcriptome map of the yeast genome and demonstrated that most (74.5%) of the nonrepetitive sequence of the yeast genome is transcribed. We confirmed many known and predicted introns and demonstrated that others are not actively used. Alternative initiation codons and upstream open reading frames also were identified for many yeast genes. We also found unexpected 3'-end heterogeneity and the presence of many overlapping genes. These results indicate that the yeast transcriptome is more complex than previously appreciated.
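The "fraction of the genome transcribed" idea in the abstract can be illustrated with a minimal sketch. It assumes reads have already been aligned and are supplied as hypothetical 0-based, half-open `(start, end)` intervals on a toy sequence. The function names, the `min_reads` threshold, and the `log2(count + 1)` pseudocount are illustrative assumptions, not the authors' pipeline.

```python
import math


def per_base_coverage(genome_length, reads):
    """Count how many reads overlap each base.

    `reads` holds hypothetical 0-based, half-open (start, end) alignments.
    """
    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (genome_length + 1)
    for start, end in reads:
        start = max(0, start)
        end = min(genome_length, end)
        if start < end:
            diff[start] += 1
            diff[end] -= 1

    # A running sum of the difference array gives the per-base read count.
    coverage = []
    running = 0
    for i in range(genome_length):
        running += diff[i]
        coverage.append(running)
    return coverage


def fraction_transcribed(coverage, min_reads=1):
    """Fraction of bases covered by at least `min_reads` reads (assumed threshold)."""
    if not coverage:
        return 0.0
    covered = sum(1 for count in coverage if count >= min_reads)
    return covered / len(coverage)


def log2_tag_counts(coverage):
    """Per-base log2 signal; the +1 pseudocount is an assumption to handle zeros."""
    return [math.log2(count + 1) for count in coverage]


if __name__ == "__main__":
    toy_reads = [(0, 4), (2, 6), (8, 10)]  # made-up alignments on a 10-base sequence
    coverage = per_base_coverage(10, toy_reads)
    print(coverage)
    print(fraction_transcribed(coverage))
```

A real analysis would also need to mask repetitive sequence and choose a coverage threshold that separates signal from background; neither step is modelled here.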
Figures
Figure 1. Flowchart of the experimental and informatics pipelines of the RNA-Seq method
A) RNA-Seq experimental pipeline. B) Informatics pipeline. C) A snapshot of the mapped RNA-Seq reads showing no expression in a deleted gene (LEU2) and an expressed neighboring gene (YCL017C).
Figure 2. Extensive expression of the yeast genome revealed by RNA-Seq
A) The genome distribution of transcribed regions. Colors represent different transcription levels for each base (log2 tag count). B) Distribution of transcribed regions on chromosome VI. C) Histogram of transcribed bases. D) A summary of the transcription level of the transcriptome.
Figure 3. Analyses and mapping of 5′ and 3′ gene boundaries
A) Size differences of the 5′-UTR between RNA-Seq and our RACE data (top left), and of the 3′-UTR between RNA-Seq and cDNA sequencing data (7) (bottom left). Distributions of the sizes of the 5′-UTR (top right) and 3′-UTR (bottom right) are also shown. B) A comparison of the 5′-UTR determined by RNA-Seq and by 5′-RACE for gene YKL004W. C) 3′-UTRs determined by RNA-Seq based on end tags for genes YDR460W, YDR004W, and YDR461-C; the 3′-UTR of YDR004W was also determined by cDNA sequencing (7). Endtag_W and Endtag_C represent RNA-Seq reads that contain polyA tails on the Watson and Crick strands, respectively. D) 3′-UTR determined by RNA-Seq based on a sharp decrease in expression, compared with cDNA data (7). End-tag information was not used in this case because of low scores. UTR, untranslated region; RACE, rapid amplification of cDNA ends.
Figure 4. Precise annotation of UTRs using RNA-Seq
New annotations of the UTRs in a previously well-annotated region on chrVI (A) and a relatively poorly annotated region on the same chromosome (B). In the new annotation, ORFs are denoted by dotted lines, and arrows denote transcription direction. UTRs are denoted by green shaded boxes flanking the ORFs. cDNA transcripts in red are high-confidence and those in blue are low-confidence (7).
Figure 5. Annotation of upstream ATG, uORF and novel transcribed regions
A) RNA-Seq reveals genes that may have an upstream start codon (uATG, in red) relative to the existing annotated ATG (blue). B) Some genes have ORFs (uORFs) upstream of the major annotated ORF. GO analysis revealed that they are significantly enriched in DNA binding (molecular function) and anatomical structure and development (biological process). P-values are False Discovery Rate adjusted. C) An example of a uORF (boxed and in red). D) Size distribution of novel transcribed regions. E) Percentage of novel transcribed regions that have been covered by cDNA sequencing (7). F) An example of a novel transcribed region with a polyA signal (shaded in red).
Figure 6. Comparison of RNA-Seq data with qPCR, tiling array, and gene expression microarray data
A) Comparison of the transcription level for 34 ORFs determined by RNA-Seq or quantitative PCR (qPCR). B) Comparison of the transcription level for 4,846 ORFs determined by RNA-Seq with published tiling array (16). C) Comparison of the transcription level for 4,422 ORFs determined by RNA-Seq with the published gene expression microarrays (15). Pearson linear correlation coefficients (corr) are shown in A–C. D) Transcription level distribution for 5,099 ORFs by RNA-Seq.
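The Pearson linear correlation coefficient reported in panels A to C can be computed as in the following minimal sketch. The expression values are made-up toy numbers, not data from the paper, and the caption does not state whether values were log-transformed before correlation, so no transformation is applied here.

```python
import math


def pearson(xs, ys):
    """Pearson linear correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centered cross-products and squares (the 1/n factors cancel).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical per-ORF expression estimates from two platforms.
    rnaseq_levels = [1.0, 2.0, 3.0, 4.0]
    qpcr_levels = [2.1, 3.9, 6.2, 7.8]
    print(pearson(rnaseq_levels, qpcr_levels))
```

Pearson correlation measures linear agreement only, so expression comparisons spanning several orders of magnitude are often made on log-scaled values.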
Similar articles
- Genome-wide analysis of mRNA lengths in Saccharomyces cerevisiae.
Hurowitz EH, Brown PO. Genome Biol. 2003;5(1):R2. doi: 10.1186/gb-2003-5-1-r2. Epub 2003 Dec 22. PMID: 14709174. Free PMC article.
- Evidence for abundant transcription of non-coding regions in the Saccharomyces cerevisiae genome.
Havilio M, Levanon EY, Lerman G, Kupiec M, Eisenberg E. BMC Genomics. 2005 Jun 16;6:93. doi: 10.1186/1471-2164-6-93. PMID: 15960846. Free PMC article.
- Genome-wide profiling of untranslated regions by paired-end ditag sequencing reveals unexpected transcriptome complexity in yeast.
Kang YN, Lai DP, Ooi HS, Shen TT, Kou Y, Tian J, Czajkowsky DM, Shao Z, Zhao X. Mol Genet Genomics. 2015 Feb;290(1):217-24. doi: 10.1007/s00438-014-0913-6. Epub 2014 Sep 12. PMID: 25213602.
- Life with 6000 genes.
Goffeau A, Barrell BG, Bussey H, Davis RW, Dujon B, Feldmann H, Galibert F, Hoheisel JD, Jacq C, Johnston M, Louis EJ, Mewes HW, Murakami Y, Philippsen P, Tettelin H, Oliver SG. Science. 1996 Oct 25;274(5287):546, 563-7. doi: 10.1126/science.274.5287.546. PMID: 8849441. Review.
- Non-coding transcription by RNA polymerase II in yeast: Hasard or nécessité?
Tudek A, Candelli T, Libri D. Biochimie. 2015 Oct;117:28-36. doi: 10.1016/j.biochi.2015.04.020. Epub 2015 May 6. PMID: 25956976. Review.
Cited by
- RNA Sequencing Analysis of Patients with Chronic Hepatitis B Treated Using PEGylated Interferon.
Chen SL, Shen YJ, Chen GZ. Int J Gen Med. 2024 Oct 1;17:4465-4474. doi: 10.2147/IJGM.S474284. eCollection 2024. PMID: 39372134. Free PMC article.
- Body fluid multiomics in 3PM-guided ischemic stroke management: health risk assessment, targeted protection against health-to-disease transition, and cost-effective personalized approach are envisaged.
Chen R, Wang X, Li N, Golubnitschaja O, Zhan X. EPMA J. 2024 Aug 29;15(3):415-452. doi: 10.1007/s13167-024-00376-2. eCollection 2024 Sep. PMID: 39239108. Free PMC article. Review.
- Pleiotropic effects of PAB1 deletion: Extensive changes in the yeast proteome, transcriptome, and translatome.
Mangkalaphiban K, Ganesan R, Jacobson A. PLoS Genet. 2024 Sep 5;20(9):e1011392. doi: 10.1371/journal.pgen.1011392. eCollection 2024 Sep. PMID: 39236083. Free PMC article.
- CleanUpRNAseq: An R/Bioconductor Package for Detecting and Correcting DNA Contamination in RNA-Seq Data.
Liu H, Hu K, O'Connor K, Kelliher MA, Zhu LJ. BioTech (Basel). 2024 Aug 3;13(3):30. doi: 10.3390/biotech13030030. PMID: 39189209. Free PMC article.
- Translation efficiency covariation across cell types is a conserved organizing principle of mammalian transcriptomes.
Liu Y, Hoskins I, Geng M, Zhao Q, Chacko J, Qi K, Persyn L, Wang J, Zheng D, Zhong Y, Rao S, Park D, Cenik ES, Agarwal V, Ozadam H, Cenik C. bioRxiv [Preprint]. 2024 Aug 11:2024.08.11.607360. doi: 10.1101/2024.08.11.607360. PMID: 39149359. Free PMC article. Preprint.
References
- Snyder M, Gerstein M. Science. 2003;300:258.
- Gerstein MB, et al. Genome Res. 2007;17:669.
- Adams MD, et al. Nature. 1995;377:3.
- Kapranov P, et al. Science. 2002;296:916.
- Bertone P, et al. Science. 2004;306:2242.
Grants and funding
- P50 HG002357/HG/NHGRI NIH HHS/United States
- P50 HG002357-10/HG/NHGRI NIH HHS/United States
- R01 CA077808/CA/NCI NIH HHS/United States
- R01 CA077808-12/CA/NCI NIH HHS/United States