A novel approach to describe a U1 snRNA binding site

Nucleic Acids Res. 2003 Dec 1;31(23):6963-75.

doi: 10.1093/nar/gkg901.

Marcel Freund, Corinna Asang, Susanne Kammler, Carolin Konermann, Jörg Krummheuer, Marianne Hipp, Imke Meyer, Wolfram Gierling, Stephan Theiss, Thorsten Preuss, Detlev Schindler, Jørgen Kjems, Heiner Schaal

Marcel Freund et al. Nucleic Acids Res. 2003.

Abstract

RNA duplex formation between U1 snRNA and a splice donor (SD) site can protect pre-mRNA from degradation prior to splicing and initiates formation of the spliceosome. This process was monitored, using sub-genomic HIV-1 expression vectors, by expression analysis of the glycoprotein env, whose formation critically depends on functional SD4. From transient transfections of HeLa-T4+ cells with vectors carrying mutated 5′ splice sites, we systematically derived a hydrogen bond model for the complementarity between the free 5′ end of U1 snRNA and 5′ splice sites, covering numerous mutations. The resulting model takes into account the number, interdependence and neighborhood relationships of predicted hydrogen bonds in a region spanning the three most 3′ base pairs of the exon (-3 to -1) and the eight most 5′ base pairs of the intron (+1 to +8). The model is represented by an algorithm that classifies U1 snRNA binding sites as able or unable to functionally substitute for SD4 with respect to Rev-mediated env expression. In a data set of 5′ splice site mutations of the human ATM gene we found a significant correlation between the algorithmic classification and exon skipping (P = 0.018, χ² test), showing that the applicability of the proposed model reaches far beyond HIV-1 splicing. However, the algorithmic classification must not be taken as an absolute measure of SD usage, as it may be modified by upstream sequence elements. Upstream of SD4 we identified a fragment supporting ASF/SF2 binding. Mutating GAR nucleotide repeats within this site decreased the SD4-dependent Rev-mediated env expression, which could be balanced simply by artificially increasing the complementarity of SD4.

Figures

Figure 1

Mutated SD4 transcripts essentially fail to bind U1 snRNA. (A) Schematic diagram of the HIV-1NL genome. Depicted below is the env expression vector SV E/X tat– rev–. The subgenomic region delimited by the dashed lines was cloned into the SV40 early expression vector pSVT7 (see Materials and Methods). The open reading frames for tat and rev are translationally silent due to mutations of their translational start codons. (B) Pull-down assay of U1 and U2 snRNP with HIV-1 env transcripts containing different mutant SD4 RNAs. Biotinylated in vitro transcripts carrying mutations of the tat/rev 5′ splice site were incubated with HeLa nuclear extracts. RNA was extracted from the formed complexes with streptavidin beads. Bound U1 and U2 snRNAs were detected by [32P]UTP-labeled antisense transcripts. U2 snRNA served as a loading control for the assay. (C) Immunoblot analysis of Env glycoproteins (gp120 and its precursor gp160) expressed in HeLa-T4+ cells transfected with SVcrev, pGL3-control and env expression vectors carrying mutations of the tat/rev 5′ splice site SD4. For each plasmid, transfection using FuGENE™ 6 was done with 1 µg of the env expression vector, 1 µg of SVcrev and 1 µg of pGL3-control. The hydrogen bonding patterns (middle) are given as sums of hydrogen bonds within each continuous stretch of complementary nucleotides according to the algorithmic calculation. A mismatch is indicated by a vertical line.

Figure 2

Contributions of G:U base pairs to U1 snRNA binding. Immunoblot analysis (left) of Env glycoproteins (gp120 and its precursor gp160) expressed in HeLa-T4+ cells transfected with 1 µg of env expression vector, 1 µg of SVcrev and 1 µg of pGL3-control. The different env expression vectors carried predicted U1 snRNA binding sites in the position of the tat/rev 5′ splice site SD4. A cellular protein that cross-reacts with the anti-gp120 mAb is marked by an asterisk. Sequences of the predicted U1 snRNA binding sites and their hydrogen bonding patterns are depicted on the right. The three columns indicate U1 snRNA binding site hydrogen bonding patterns on the assumption that different numbers of hydrogen bonds are possible for a G-U base pair (underlined within the sequence): two [G:U (2)], one [G:U (1)] or no [G:U(0)] hydrogen bond(s) are scored per pair. A mismatch is indicated by a vertical line.
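The three columns described in this caption can be illustrated with a small calculation. The following is only a sketch, not the authors' implementation: it assumes the free 5′ end of U1 snRNA is 5′-AUACUUACCUG-3′ (pseudouridines written as U), that G:C and A:U pairs count three and two hydrogen bonds, and that a G:U pair scored with zero bonds behaves like a mismatch. The function name `stretch_sums` and the parameter `gu_weight` are hypothetical.

```python
# Free 5' end of U1 snRNA, 5'->3' (assumption: pseudouridines written as U).
U1_5P_END = "AUACUUACCUG"

# Hydrogen bonds assumed for Watson-Crick pairs (site base, U1 base).
WATSON_CRICK = {("G", "C"): 3, ("C", "G"): 3, ("A", "U"): 2, ("U", "A"): 2}
WOBBLE = {("G", "U"), ("U", "G")}


def stretch_sums(site, gu_weight):
    """Sum hydrogen bonds per continuous stretch of paired nucleotides.

    site: 11 nucleotides covering positions -3 to +8 of the 5' splice site.
    gu_weight: bonds scored per G:U pair (2, 1 or 0, as in the three columns).
    """
    site = site.upper().replace("T", "U")
    if len(site) != len(U1_5P_END):
        raise ValueError("expected 11 nucleotides covering positions -3 to +8")
    # Position -3 of the site faces nucleotide 11 of U1, so reverse U1 to align.
    partners = U1_5P_END[::-1]
    sums, current, in_stretch = [], 0, False
    for pair in zip(site, partners):
        if pair in WATSON_CRICK:
            current += WATSON_CRICK[pair]
            in_stretch = True
        elif pair in WOBBLE and gu_weight > 0:
            current += gu_weight
            in_stretch = True
        else:
            # Mismatch (or unscored G:U) closes the current stretch.
            if in_stretch:
                sums.append(current)
            current, in_stretch = 0, False
    if in_stretch:
        sums.append(current)
    return sums


# Fully complementary site, then the same site with G at position +4.
print(stretch_sums("CAGGUAAGUAU", 1))
for weight in (2, 1, 0):
    print(weight, stretch_sums("CAGGUAGGUAU", weight))
```

Under these assumptions a single G:U in the middle of the duplex either keeps one long stretch (weights 2 and 1) or splits it into two shorter ones (weight 0), which is the distinction the three columns make visible.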

Figure 3

Predicted U1 binding site 18 supports env expression with wild-type U1 snRNA, whereas 15 does so only with site-mutated U1 snRNA 5C10C11C. Immunoblot analysis (top) of Env glycoproteins (gp120 and its precursor gp160) expressed in HeLa-T4+ cells transfected with 1 µg of SVcrev, 1 µg of pGL3-control and 1 µg of env expression vector carrying the predicted U1 snRNA binding site 15 or 18 in the position of the tat/rev 5′ splice site SD4. An expression plasmid coding for wild-type U1 snRNA (U1) or one of three site-mutated U1 snRNAs was cotransfected. Nucleotides of the U1 snRNAs that hypothetically might base pair with the corresponding nucleotide of the predicted binding site SD 156266, i.e. A-U, G-C and G-U, are given in upper case letters, mismatches in lower case letters (left, designations and sequences). Mutated nucleotides of the U1 snRNAs are in bold letters. + and – represent the transfection scheme (right).

Figure 4

A G:U base pair in position +7 does not contribute to U1 snRNA binding without sufficient flanking base pairs. Immunoblot analysis (left) of Env glycoproteins (gp120 and its precursor gp160) expressed in HeLa-T4+ cells transfected with 1 µg of SVcrev, 1 µg of pGL3-control and 1 µg of env expression vector carrying alternatives of the tat/rev 5′ splice site SD4, i.e. the predicted U1 binding site 17 (top) or its site-mutated variants, –1G5U or –1G5U8U (right). In the scheme (center), hydrogen bonds that are predicted to form with U1 snRNA are shown as circles (1, one hydrogen bond; 2, two hydrogen bonds; 3, three hydrogen bonds; mismatch, empty circle) assuming one hydrogen bond for G:U wobbles. The right column [G:U (1)] shows the number of hydrogen bonds on either side of the unpaired nucleotide in position +5. A mismatch is indicated by a vertical line.

Figure 5

A G:U wobble, like any mismatch, in the center of the complementary RNA sequences of the 5′ splice site SD4 and U1 snRNA abolishes duplex formation. Immunoblot analysis (left) of Env glycoproteins (gp120 and its precursor gp160) expressed in HeLa-T4+ cells that were transfected with 1 µg of SVcrev, 1 µg of pGL3-control and 0.2 or 1 µg, respectively, of the env expression vector carrying the tat/rev 5′ splice site SD4 (4A) or its counterpart with any possible mutation site-directed to position +4. Within the mutated SD4 sequences of 4G, 4C and 4U (center), wobbles and mismatches to U1 (top) are shown in lower case. env was expressed efficiently only with SD4, and only very weakly with 4G. Of note, in the transfection experiment 4A only one fifth of the amount of the reference construct SV E/X tat– rev– (4A) was used, as compared with the experiments 4G, 4C and 4U.

Figure 6

Base substitutions in U1 snRNA that reinstate hydrogen bonds to the SD4 mutations 4C and 4U (Fig. 5) restore efficient env expression. Immunoblot analysis (center) of Env glycoproteins (gp120 and its precursor gp160) expressed in HeLa-T4+ cells that were transfected with 1 µg of SVcrev, 1 µg of pGL3-control, 1 µg of env expression vectors SV E/X tat– rev– 4C or SV E/X tat– rev– 4U and an expression plasmid coding for wild-type U1 snRNA (U1) or site-mutated U1 snRNA (U1 5G and U1 5A) as indicated (+/–) by the scheme (top). The sequences of the free 5′ end of the U1 snRNAs and the mutant splice sites are given at the bottom of the figure. Mismatches within the RNA duplexes are given in lower case letters of the respective U1 snRNA sequence.

Figure 7

A terminal G:U wobble does not contribute to RNA duplex formation and env expression. Immunoblot analysis (left) of Env glycoproteins (gp120 and its precursor gp160) expressed in HeLa-T4+ cells transfected with 1 µg of SVcrev, 1 µg of pGL3-control and 1 µg of env expression vectors carrying mutations in the tat/rev 5′ splice site SD4. The sequences of the splice sites are depicted to the right of the plasmid designations. The nomenclature refers to the contiguous stretch (cs) of hydrogen bonds starting at the indicated position of the cs within the 5′ splice site and to the number of hydrogen bonds given by the superscript (e.g. cs –3¹⁴), or to mutated nucleotides of SD4 (e.g. 3U indicates that position +3 of SD4 is mutated from A to T). The hydrogen bonding patterns are given on the right of the figure for different possible weights attributed to single hydrogen bonds. First column, every G:U base pair contributes one hydrogen bond; second column, same as the first column, but terminal G:U base pairs contribute no hydrogen bonds; third column, same as the second column, but a G:C base pair in position –3 contributes two hydrogen bonds. Note that detectability of env expression is consistent with the algorithm classifying the predicted hydrogen bonding pattern as HC or LC with the given threshold values. Calculated free energy of the RNA duplexes using (1) Dynalign (http://rna.chem.rochester.edu/dynalign.html) (46) and (2) HyTher (http://ozone2.chem.wayne.edu/Hyther/hytherm1main.html) (47) is shown on the far right.

Figure 8

A terminal A:U at position +8 contributes only one hydrogen bond. Immunoblot analysis (left) of Env glycoproteins (gp120 and its precursor gp160) expressed in HeLa-T4+ cells transfected with 1 µg of SVcrev, 1 µg of pGL3-control and 1 µg of env expression vectors carrying mutations in the tat/rev 5′ splice site SD4. The sequences of the splice sites are depicted right of the plasmid designations. Mismatches within the RNA duplexes are given in lower case letters of the respective splice site sequence. The hydrogen bonding patterns are given on the right of the figure for different possible weights attributed to single hydrogen bonds. The scoring rules of Figure 7, columns 1–3, have already been incorporated here. In addition, in the first column every A:U base pair contributes two hydrogen bonds, the second column is the same as the first, but an A:U base pair in position +8 contributes one hydrogen bond. Note that detectability of env expression is consistent with the algorithm classifying the predicted hydrogen bonding pattern as HC.
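The scoring rules accumulated in Figures 7 and 8 can be combined into one short calculation. This is a hedged sketch rather than the published algorithm: it assumes the U1 snRNA 5′ end is 5′-AUACUUACCUG-3′ (pseudouridines written as U), and it reads "terminal" as the first or last pair of a contiguous stretch, which is an interpretation not confirmed by the captions. The names `bonds` and `hydrogen_bond_pattern` are hypothetical, and the HC/LC threshold values are not stated here, so the sketch stops at the hydrogen bonding pattern.

```python
# Free 5' end of U1 snRNA, 5'->3' (assumption: pseudouridines written as U).
U1_5P_END = "AUACUUACCUG"
# Splice site positions covered by the model: exon -3..-1, intron +1..+8.
POSITIONS = [-3, -2, -1, 1, 2, 3, 4, 5, 6, 7, 8]
PAIR_KINDS = {frozenset("GC"): "GC", frozenset("AU"): "AU", frozenset("GU"): "GU"}


def bonds(kind, position, terminal):
    """Hydrogen bonds for one pair under the rules of Figures 7 and 8."""
    if kind == "GU":
        return 0 if terminal else 1      # terminal G:U contributes nothing
    if kind == "GC":
        return 2 if position == -3 else 3  # G:C at -3 counts two bonds
    return 1 if position == 8 else 2     # A:U at +8 counts one bond


def hydrogen_bond_pattern(site):
    """Return the hydrogen bond sum of each contiguous stretch, 5' to 3'."""
    site = site.upper().replace("T", "U")
    if len(site) != len(POSITIONS):
        raise ValueError("expected 11 nucleotides covering positions -3 to +8")
    # Position -3 faces U1 nucleotide 11, so pair the site with reversed U1.
    kinds = [PAIR_KINDS.get(frozenset((s, u)))
             for s, u in zip(site, reversed(U1_5P_END))]
    sums, stretch = [], []
    for i, kind in enumerate(kinds + [None]):  # trailing None closes last stretch
        if kind is not None:
            stretch.append(i)
        elif stretch:
            ends = {stretch[0], stretch[-1]}   # assumed meaning of "terminal"
            sums.append(sum(bonds(kinds[j], POSITIONS[j], j in ends)
                            for j in stretch))
            stretch = []
    return sums


print(hydrogen_bond_pattern("CAGGUAAGUAU"))  # fully complementary site
print(hydrogen_bond_pattern("CAGGUACGUAU"))  # mismatch at position +4
```

With these rules a mismatch at +4 in an otherwise complementary site leaves an upstream stretch of 14 hydrogen bonds starting at –3, which matches the form of the designation cs –3 with superscript 14 used in Figure 7.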

Figure 9

5′ Splice sites graded as HC succeed in normal splicing, whereas those classified as LC show impaired exon recognition. (A) Schematic diagram of the pSV-1-env construct. pSV-1-env contains the HIV-1 genome of pNL4-3 but lacks the U3 region of the 5′ LTR, the non-coding region of the 3′ LTR and the region from downstream of the gag start codon almost up to the first coding exon of Rev. The start codons for expression of gag and rev are mutated. The positions of the primers used for RT–PCR are indicated by arrows. GAR, GAR repeats; SV40, SV40 early promoter; p(A), SV40 polyadenylation signal. (B) RT–PCR assay. HeLa-T4+ cells were transfected with 1 µg of pSV-1-env carrying substitutions of SD4 and cotransfected with 1 µg of pXGH5 expressing the hGH mRNA as an internal control. Size marker (M), pSV-1-env SD2 (lane 1), pSV-1-env –1G3U (lane 2), pSV-1-env 4G7C (lane 3) or pSV-1-env 4C8U (lane 4). (C) Determination of the linear amplification range for pSV-1-env and hGH RT–PCR assays. For pSV-1-env RT–PCR, relative amounts over the number of cycles were assessed for the strongest (1–5 and 4–7) and the shortest (1–7) PCR products. (D) Immunoblot analysis of Env glycoproteins (gp120 and its precursor gp160) expressed in HeLa-T4+ cells transfected with 1 µg of SVcrev, 1 µg of pGL3-control and 1 µg of env expression vectors carrying substitutions of the tat/rev 5′ splice site SD4. SV E/X tat– rev– SD2 (lane 1), SV E/X tat– rev– –1G3U (lane 2), SV E/X tat– rev– 4G7C (lane 3), SV E/X tat– rev– 4C8U (lane 4).

Figure 10

Increased complementarity between SD4 and U1 snRNA compensates for a lack of GAR-mediated enhancement. Immunoblot analysis of Env glycoproteins (gp120 and its precursor gp160) expressed in HeLa-T4+ cells transfected with 1 µg of SVcrev, 1 µg of pGL3-control and 1 µg of env expression vectors SV E/X tat– rev– GAR or SV E/X tat– rev– GAR–, and an expression plasmid coding for wild-type U1 snRNA (U1) or a 9U/10G/11C site-mutated U1 snRNA (U1 9U/10G/11C) as indicated (A and C). The numbers above the immunoblot analysis shown in (C) indicate the numbers of hydrogen bonds in the continuous stretch of the 5′ splice sites analyzed (16 for SD4, 20 for mutant cs –2²⁰). The sequences of the free 5′ end of the U1 snRNA and the 5′ splice site SD4 are given at the bottom of the figure (B). Mismatches within the RNA duplexes are given in lower case letters of the respective U1 snRNA sequence.

References

    1. Lu,X.B., Heimer,J., Rekosh,D. and Hammarskjold,M.L. (1990) U1 small nuclear RNA plays a direct role in the formation of a rev-regulated human immunodeficiency virus env mRNA that remains unspliced. Proc. Natl Acad. Sci. USA, 87, 7598–7602.
    2. Barrett,N.L., Li,X. and Carmichael,G.G. (1995) The sequence and context of the 5′ splice site govern the nuclear stability of polyoma virus late RNAs. Nucleic Acids Res., 23, 4812–4817.
    3. Kammler,S., Leurs,C., Freund,M., Krummheuer,J., Seidel,K., Tange,T.O., Lund,M.K., Kjems,J., Scheid,A. and Schaal,H. (2001) The sequence complementarity between HIV-1 5′ splice site SD4 and U1 snRNA determines the steady-state level of an unstable env pre-mRNA. RNA, 7, 421–434.
    4. Lund,M. and Kjems,J. (2002) Defining a 5′ splice site by functional selection in the presence and absence of U1 snRNA 5′ end. RNA, 8, 166–179.
    5. Zhuang,Y. and Weiner,A.M. (1986) A compensatory base change in U1 snRNA suppresses a 5′ splice site mutation. Cell, 46, 827–835.
