A large-scale full-length cDNA analysis to explore the budding yeast transcriptome

Fumihito Miura et al. Proc Natl Acad Sci U S A. 2006.

Abstract

We performed a large-scale cDNA analysis to explore the transcriptome of the budding yeast Saccharomyces cerevisiae. We sequenced two cDNA libraries, one from cells growing exponentially in a minimal medium and the other from meiotic cells. Both libraries were generated by using a vector-capping method that allows the accurate mapping of transcription start sites (TSSs). Consequently, we identified 11,575 TSSs associated with 3,638 annotated genomic features, including 3,599 ORFs, suggesting that most yeast genes have two or more TSSs. In addition, we identified 45 previously undescribed introns, including some that affect current ORF annotations and some that are alternatively spliced. Furthermore, the analysis revealed 667 transcription units in the intergenic regions and transcripts derived from the antisense strands of 367 known features. We also found that 348 ORFs carry TSSs in their 3′-halves, generating sense transcripts that start inside the ORFs. These results indicate that the budding yeast transcriptome is considerably more complex than previously thought and that it shares many recently revealed characteristics with the transcriptomes of mammals and other higher eukaryotes. Thus, the genome-wide active transcription that generates novel classes of transcripts appears to be an intrinsic feature of eukaryotic cells. The budding yeast will serve as a versatile model for studies on these aspects of the transcriptome, and the full-length cDNA clones can function as an invaluable resource in such studies.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.

Data of the large-scale cDNA analysis. (A) A screenshot of the UT Genome Browser depicting a region including SDT1, COG1, and EDC1 (coordinates 78,801–80,800 of chromosome 7). Each bar in the GCAP track indicates the 5′-single-pass sequence of a cDNA clone, whose ID is shown to the left of the bar. Blue and orange indicate Watson and Crick strands, respectively. The sharp (#) at the end of a line indicates that the clone is full-length, with a G-cap-derived nucleotide addition. This screen shows four full-length clones for SDT1 and one for COG1. In addition, it displays five full-length clones for antisense transcripts of COG1, one for a transcript starting within the COG1 ORF, and one for a transcription unit lying between COG1 and EDC1 (TU #257 in Table 8). (B) Breakdown of the cDNA data. Starting from 51,026 clones, 13,159 TSSs were identified (see text for details).

Fig. 2.

Transcription start sites. (A) Distribution of TSSs around the presumptive initiation codon. (B) Two typical patterns of TSS distribution.

Fig. 3.

Alternative splicing events. In each image, the bold line with squares at the top, labeled with ORF/gene names, indicates the genome map, and the arrows below the map indicate the transcripts. (A) YNL194C and YNL195C share the same promoter to generate three mRNAs (Left), including a potential bicistronic one detected as a ≈2,000-nt band in Northern blot analysis of SK1 cells (Right). IVT, in vitro transcription. (B) YMR148W has two distinct promoters, and YMR147W may be an upstream exon of YMR148W used in the transcript starting from the distal promoter. (C) A case of transcription-induced chimera between NCE101 and YJL206C.

Fig. 4.

Transcription units in intergenic regions. (A) Examples of isolated transcription units. Boxes and arrows indicate annotated ORFs and transcripts, respectively. Red and blue indicate features on Watson and Crick strands, respectively. (B) The size of the longest ORF plotted against the size of each transcription unit. (C) Conservation of the ORF encoded by transcription unit no. 633. Identical (∗), conservative (:), and semiconservative (.) residues are indicated.

Fig. 5.

Antisense transcription. (A) A screenshot of the UT Genome Browser depicting the GAL1-GAL10 locus. Two cDNA clones (blue lines) were isolated for the antisense transcript of GAL10, which had been hit by a SAGE tag (ref. 17). Note that GAL10 also has an internally primed sense transcript (orange lines), which was also detected by the tiling array study (segment ID 1028; ref. 19). (B) The expression of GAL10 is suppressed in the presence of dextrose (Dex) and induced in the presence of galactose (Gal). Coinduction of the sense and antisense transcripts was observed upon addition of Gal.

References

    1. Dolinski K, Botstein D. Genome Res. 2005;15:1611–1619.
    2. Giaever G, Chu AM, Ni L, Connelly C, Riles L, Veronneau S, Dow S, Lucau-Danila A, Anderson K, Andre B, et al. Nature. 2002;418:387–391.
    3. Tong AH, Lesage G, Bader GD, Ding H, Xu H, Xin X, Young J, Berriz GF, Brost RL, Chang M, et al. Science. 2004;303:808–813.
    4. Hughes TR, Marton MJ, Jones AR, Roberts CJ, Stoughton R, Armour CD, Bennett HA, Coffey E, Dai H, He YD, et al. Cell. 2000;102:109–126.
    5. Lee TI, Rinaldi NJ, Robert F, Odom DT, Bar-Joseph Z, Gerber GK, Hannett NM, Harbison CT, Thompson CM, Simon I, et al. Science. 2002;298:799–804.
