A paired-end sequencing strategy to map the complex landscape of transcription initiation
Ting Ni et al. Nat Methods. 2010 Jul.
Abstract
Recent studies using high-throughput sequencing protocols have uncovered the complexity of mammalian transcription by RNA polymerase II, helping to define several initiation patterns in which transcription start sites (TSSs) cluster in both narrow and broad genomic windows. Here we describe a paired-end sequencing strategy, which enables more robust mapping and characterization of capped transcripts. We used this strategy to explore the transcription initiation landscape in the Drosophila melanogaster embryo. Extending the previous findings in mammals, we found that fly promoters exhibited distinct initiation patterns, which were linked to specific promoter sequence motifs. Furthermore, we identified many 5′ capped transcripts originating from coding exons; our analyses suggest that they are unlikely to be the result of alternative TSSs and are instead the product of post-transcriptional modifications. We demonstrated paired-end TSS analysis to be a powerful method to uncover the transcriptional complexity of eukaryotic genomes.
Figures
Figure 1. Paired-End Analysis of Transcriptional start sites (PEAT)
(a) Schematic outline of the PEAT strategy. The RNA fragment is shown as an arrowed line (red); the two Mme I sites introduced at the oligo-capping and reverse transcription (RT) steps are shown in green and purple, respectively. (b) Mapping efficiency of the reads that have built-in linker sequences, combined from two technical replicates. (c) The distribution of uniquely mapped 5′ and 3′ reads relative to known TSSs and other genomic regions. (d) Comparison between PEAT and microarray expression data. The 10,101 genes plotted had at least 1 mapped read-pair and were included in the microarray data. For the array data, expression level is the mean of simple background subtraction values across 3 replicates from mixed stage 0-11 D. melanogaster embryos. To estimate the expression level using paired-end sequencing data, we used the counts of 3′ tags that map to a transcribed region. The correlation coefficient was determined by Pearson correlation.
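The comparison in panel (d) can be illustrated with a minimal Python sketch. It assumes per-gene 3′ tag counts and three array replicate values are already available as plain lists; the function names and the toy numbers are hypothetical, and the legend does not say whether values were transformed (for example, log-scaled) before correlation, so none is applied here.

```python
import math


def replicate_mean(replicates):
    """Mean of background-subtracted array values across replicates for one gene."""
    if not replicates:
        raise ValueError("need at least one replicate value")
    return sum(replicates) / len(replicates)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Toy data (hypothetical): 3' tag counts per gene and three array replicates per gene.
    tag_counts = [12, 150, 40, 900]
    array_replicates = [[1.0, 1.2, 0.8], [5.1, 4.9, 5.0], [2.0, 2.2, 1.8], [9.0, 9.5, 8.5]]
    array_means = [replicate_mean(r) for r in array_replicates]
    print(pearson(tag_counts, array_means))
```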
Figure 2. TSS clusters and initiation patterns identified in the Drosophila embryo
(a) The approach for identifying TSS clusters. A representative example (Chr. 2: 14516000-14516600) is shown. In essence, a smoothed density estimate of 5′ TSS tags was computed (blue line). Cluster boundaries were then set where the density exceeded a baseline score estimated on a genomic background (red line). TSS clusters were further condensed to the shortest distance containing 95% of the reads (dark shaded area). (b) The genomic locations of all clusters that contain ≥ 100 reads. Clusters overlapping an annotated TSS in FlyBase were classified as FlyBase TSS. For the remaining clusters, classifications were based on the mode of each given cluster and its location relative to annotated transcripts. (c) Size distribution of all clusters with ≥ 100 reads. Cluster sizes are similar to previous reports for mammals, with the majority of clusters shorter than 120 nt in length. (d) Definition of initiation patterns.
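The three steps in panel (a) can be sketched in Python as follows. This is only an illustration under stated assumptions: the moving-average kernel, the `half_width` parameter, and the way the baseline is supplied are placeholders, since the legend does not specify the density estimator or how the genomic background score was computed. Only the "shortest span holding 95% of the reads" step follows directly from the text.

```python
from collections import Counter


def smooth(counts, half_width=2):
    """Moving-average density of per-position tag counts (assumed kernel)."""
    n = len(counts)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half_width), min(n, i + half_width + 1)
        out.append(sum(counts[lo:hi]) / (hi - lo))
    return out


def call_clusters(density, baseline):
    """Inclusive (start, end) index runs where the density exceeds the baseline."""
    clusters, start = [], None
    for i, d in enumerate(density):
        if d > baseline and start is None:
            start = i
        elif d <= baseline and start is not None:
            clusters.append((start, i - 1))
            start = None
    if start is not None:
        clusters.append((start, len(density) - 1))
    return clusters


def shortest_interval(read_positions, percent=95):
    """Shortest inclusive (start, end) span holding at least `percent` % of the reads."""
    if not read_positions:
        raise ValueError("no reads in cluster")
    counts = Counter(read_positions)
    sites = sorted(counts)
    # Integer ceiling of percent/100 * number of reads, avoiding float rounding.
    need = (percent * len(read_positions) + 99) // 100
    best, left, running = None, 0, 0
    for site in sites:
        running += counts[site]
        # Drop leftmost sites while the window still holds enough reads.
        while running - counts[sites[left]] >= need:
            running -= counts[sites[left]]
            left += 1
        if running >= need:
            if best is None or site - sites[left] < best[1] - best[0]:
                best = (sites[left], site)
    return best


if __name__ == "__main__":
    reads = [100] * 10 + [101] * 8 + [102] + [150]
    print(shortest_interval(reads))
```

In this toy cluster of 20 reads, 19 fall within positions 100 to 102, so the single outlying read at position 150 is excluded from the condensed cluster.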
Figure 3. Promoter motifs associated with distinct promoter types
(a) The three initiation patterns, NP, BP and WP, are each represented by a candidate locus. The graphs show the relative percentage of 5′ reads that are mapped within a 100-nt window. (b) Sequence landscape in the promoter region of each pattern. The mode location of each cluster is set as reference point ‘+1’. Sequence logos of a 100-nt window are shown. (c) The core promoter motifs overrepresented for each initiation pattern. Significant motifs were identified in 200-nt core promoter sequences and binned into 5-nt intervals; only the 100-nt region surrounding the TSS is shown because no motifs were enriched outside this window. All bins with normalized motif occurrences enriched 5-fold or more are shown. The percentage of sequences containing at least one high-stringency instance of each motif in its preferred location is listed on the left side of the heat map.
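The positional binning in panel (c) can be sketched in Python as follows. The legend does not state how motif occurrences were normalized, so this sketch assumes a simple normalization against the mean count per bin; the function names, the window coordinates, and the toy positions are all hypothetical.

```python
def bin_positions(positions, window_start=-100, window_end=100, bin_size=5):
    """Count motif hits per bin; positions are relative to the cluster mode (+1)."""
    n_bins = (window_end - window_start) // bin_size
    counts = [0] * n_bins
    for p in positions:
        if window_start <= p < window_end:
            counts[(p - window_start) // bin_size] += 1
    return counts


def fold_enrichment(counts):
    """Per-bin count divided by the mean count per bin (assumed normalization)."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    expected = total / len(counts)
    return [c / expected for c in counts]


def enriched_bins(counts, min_fold=5.0, window_start=-100, bin_size=5):
    """Start coordinates of bins whose fold enrichment is at least `min_fold`."""
    folds = fold_enrichment(counts)
    return [window_start + i * bin_size for i, f in enumerate(folds) if f >= min_fold]


if __name__ == "__main__":
    # Toy motif hit positions (hypothetical), concentrated around -30.
    hits = [-30] * 8 + [-29] * 2 + [10, 50]
    counts = bin_positions(hits)
    print(enriched_bins(counts))
```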
Figure 4. A distinct sequence motif identified for internally capped transcripts
(a-b) The gene structures of the PROD and RNPS1 loci indicating exons (thick bar) and introns (thin bar) from FlyBase are shown. A thick grey bar represents the UTR region. Grey areas highlight read clusters (≥ 100 reads/cluster). Green arrows denote primer locations for RT-PCR validation. A junction primer, which spans the linker and 5′ gene specific sequence at the cluster mode, together with a downstream primer (100-200 bp distance) were used to carry out RT-PCR. For each locus, cDNAs derived from RNA samples with (+) or without (−) linker ligation were used as template. The DNA ladder (M) is shown in the left lane. Sanger sequencing results show the correct position of the mode of the called TSS cluster for (a) a capped 5′ read cluster in the middle of a coding region; and (b) an example of a capped 5′ read cluster near the end of the coding region. (c) Sequence logo of a 100 nt window around the mode location (identified as ‘+1’) of all clusters containing more than 100 reads and mapping to a coding region.