A correlation with exon expression approach to identify cis-regulatory elements for tissue-specific alternative splicing

Debopriya Das et al. Nucleic Acids Res. 2007.

Abstract

Correlation of motif occurrences with gene expression intensity is an effective strategy for elucidating transcriptional cis-regulatory logic. Here we demonstrate that this approach can also identify cis-regulatory elements for alternative pre-mRNA splicing. Using data from a human exon microarray, we identified 56 cassette exons that exhibited higher transcript-normalized expression in muscle than in other normal adult tissues. Intron sequences flanking these exons were then analyzed to identify candidate regulatory motifs for muscle-specific alternative splicing. Correlation of motif parameters with gene-normalized exon expression levels was examined using linear regression and linear splines on RNA words and degenerate weight matrices, respectively. Our unbiased analysis uncovered multiple candidate regulatory motifs for muscle-specific splicing, many of which are phylogenetically conserved among vertebrate genomes. The most prominent downstream motifs were binding sites for Fox1- and CELF-related splicing factors, and a branchpoint-like element acuaac; pyrimidine-rich elements resembling PTB-binding sites were most significant in upstream introns. Intriguingly, our systematic study indicates a paucity of novel muscle-specific elements that are dominant in short proximal intronic regions. We propose that Fox and CELF proteins play major roles in enforcing the muscle-specific alternative splicing program, facilitating expression of unique isoforms of cytoskeletal proteins critical to muscle cell function.
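The word-based part of this strategy can be illustrated with a minimal sketch: count a candidate RNA word in each flanking intron, then fit a straight line between the counts and an expression measure. The sequences and expression values below are made-up toy inputs, the function names are hypothetical, and the ordinary least-squares fit is a generic stand-in rather than the authors' actual pipeline (which also used linear splines on degenerate weight matrices).

```python
from math import sqrt

def count_word(seq, word):
    """Count overlapping occurrences of `word` in `seq` (case-insensitive)."""
    seq, word = seq.lower(), word.lower()
    return sum(1 for i in range(len(seq) - len(word) + 1)
               if seq.startswith(word, i))

def linear_fit(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, Pearson r)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    r = sxy / sqrt(sxx * syy)
    return a, b, r

# Toy inputs (assumptions, not data from the paper): one downstream
# intron fragment per exon and one expression value per exon.
introns = ["ugcaugaaugcaug", "aaaaaaaa", "ccugcaugcc", "gggggggg"]
expression = [2.0, 0.0, 1.0, 0.0]

counts = [count_word(s, "ugcaug") for s in introns]
a, b, r = linear_fit(counts, expression)
print(counts, a, b, r)
```

A positive slope `b` with a strong correlation `r` would flag the word as a candidate enhancer-like element; assessing significance (the P-values reported in the paper) requires a statistical test that this sketch omits.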

Figures

Figure 1.

Validation of microarray predictions of muscle-enriched alternative exons. RT-PCR confirmation of muscle-enriched alternative exon expression. Amplifications were performed using primers in the flanking constitutive exons. RNA sources used for amplification by lane number: 1, brain; 2, kidney; 3, liver; 4, stomach; 5, bone marrow; 6, testis; 7, heart; 8, skeletal muscle. Arrowheads indicate positions of the alternative exon inclusion products that are most enriched in heart and skeletal muscle.

Figure 2.

Conservation of intron sequences flanking muscle-enriched exons. Representative VISTA genome alignments of exon and flanking intron sequences from the mouse, chicken and frog genomes with the prototypical muscle-enriched exon from human. Exon boundaries are indicated by vertical lines. Shaded regions indicate sequences that exceed 75% identity, while curves above baseline indicate regions with >50% identity to the human sequence.

Figure 3.

Correlation with exon expression for the ugcaug regulatory element. (A) Linear fit between ratios of gene-normalized exon expression levels and counts of ugcaug in 200 nt of downstream intronic sequence across 356 exons (56 muscle-specific exons and 300 randomly selected exons) (P = 4.6E–06). (B) Dependence of correlational _P_-values of ugcaug count with distance. Filled circles indicate correlation with total motif count, while unfilled boxes indicate correlation with bin-wise count (bin size = 100 nt). E.g. for 300 nt, the filled circle reflects the strength of correlation with the count in 1–300 nt of intron, while the unfilled box reflects the correlation with the count in 201–300 nt of intron. Minus sign indicates upstream intron. (C) Contrast scores of ugcaug in upstream and downstream introns of human, mouse, chicken and frog. (‘E’ in Figure 3B and 3C indicates position of the exon).
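The distinction in panel B between the total count and the bin-wise count can be made concrete with a short sketch. The helper names are hypothetical, the intron is a synthetic toy sequence, and the rule that an occurrence must lie entirely inside the window is an assumption, since the caption does not say how motifs spanning a bin boundary were treated.

```python
def count_in_window(intron, word, start, end):
    """Count occurrences of `word` lying entirely within the 1-based,
    inclusive positions start..end of `intron`."""
    window = intron[start - 1:end].lower()
    word = word.lower()
    return sum(1 for i in range(len(window) - len(word) + 1)
               if window.startswith(word, i))

def total_and_bin_counts(intron, word, distance, bin_size=100):
    """Return (count in 1..distance, count in the last bin ending at distance)."""
    total = count_in_window(intron, word, 1, distance)
    binned = count_in_window(intron, word, distance - bin_size + 1, distance)
    return total, binned

# Synthetic 300-nt intron with ugcaug starting at positions 11 and 251.
intron = "a" * 10 + "ugcaug" + "a" * 234 + "ugcaug" + "a" * 44

for d in (100, 200, 300):
    print(d, total_and_bin_counts(intron, "ugcaug", d))
```

For a distance of 300 nt, the first number corresponds to the filled circle (positions 1–300) and the second to the unfilled box (positions 201–300).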

Figure 4.

Phylogenetic conservation of regulatory motifs in the proximal intron sequences. Bar graphs show the over-representation of the indicated _cis_-regulatory motifs in the proximal intron sequences for muscle-enriched exons in four vertebrate species. (A) Enrichment of selected regulatory motifs in 1-kb flanking intron regions. The highest abundance of ugcaug, ugugug and acuaac elements is consistently within the proximal downstream region of ∼200 nt, while cucucu and ucuu elements were enriched in the proximal upstream intron. The representative hnRNP A1-binding site uaggg was not over-represented near muscle-enriched exons. (B) Analysis of the putative CELF-binding site ugc in 0.5-kb flanking introns. ugc is enriched in the D50 region of all four species. Vertical axis, contrast score, i.e. difference in motif frequency between muscle datasets and control datasets of constitutive exons, occurrences/nt × 10^3; horizontal axis, nt range relative to the alternative exon; E indicates position of the muscle-enriched exon.
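The contrast score on the vertical axis can be sketched as follows. This is a minimal illustration under stated assumptions: motif frequency is taken as pooled occurrences per nucleotide across each dataset (per-exon averaging would be an equally plausible reading of the caption), overlapping occurrences are counted, and the sequences are toy examples rather than data from the study.

```python
def motif_frequency(seqs, word):
    """Pooled motif frequency: overlapping occurrences per nucleotide."""
    word = word.lower()
    hits = 0
    total_nt = 0
    for s in seqs:
        s = s.lower()
        total_nt += len(s)
        hits += sum(1 for i in range(len(s) - len(word) + 1)
                    if s.startswith(word, i))
    return hits / total_nt

def contrast_score(muscle_seqs, control_seqs, word, scale=1000):
    """Difference in motif frequency (muscle minus control), times `scale`."""
    return scale * (motif_frequency(muscle_seqs, word)
                    - motif_frequency(control_seqs, word))

# Toy intron windows (assumptions): 20 nt per dataset.
muscle = ["ugcugcaaaa", "aaaaugcaaa"]    # 3 occurrences of ugc
control = ["aaaaaaaaaa", "aaugcaaaaa"]   # 1 occurrence of ugc

score = contrast_score(muscle, control, "ugc")
print(round(score, 6))
```

A positive score indicates that the motif is more frequent near muscle-enriched exons than near the control constitutive exons in the same window.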

Figure 5.

A candidate model showing splicing factors implicated in regulation of conserved muscle-enriched alternative exons. Based on the conserved distribution of splicing factor binding sites across multiple vertebrate orders, the positive correlation with muscle-specific splicing and the high absolute abundance of ugcaug motifs, Fox proteins are proposed to play a major role in promoting inclusion of muscle-enriched exons. The distribution of binding motifs among individual introns suggests that CELF proteins and KH-type splicing factor(s) function independently in some cases, and together with Fox proteins in others, to specify muscle-enriched splicing. In contrast, the enrichment of candidate PTB-binding sites in the proximal upstream intron suggests a role in preventing inappropriate inclusion of muscle-specific exons in other cell types.
