Making (anti)sense of non-coding sequence conservation

Journal Article

National Center for Biotechnology Information, National Library of Medicine, NIH, Building 38A 8N803, 8600 Rockville Pike, Bethesda, MD 20894, USA

Published: 01 September 1997

Abstract

A substantial fraction of vertebrate mRNAs contain long conserved blocks in their untranslated regions as well as long blocks without silent changes in their protein coding regions. These conserved blocks are composed largely of unique sequence within the genome, leaving us with an important puzzle regarding their function. A large body of experimental data shows that these regions are associated with regulation of mRNA stability. Combining this information with the rapidly accumulating data on endogenous antisense transcripts, we propose that the conserved sequences form long perfect duplexes with antisense transcripts. The formation of such duplexes may be essential for recognition by post-transcriptional regulatory systems. The conservation may then be explained by selection against the dominant negative effect of allelic divergence.

Since the early 1980s many studies on particular genes have noted sequence conservation in the 3′ untranslated regions (UTRs) of vertebrate mRNAs (1–3). Duret et al. (4) estimated that >30% of vertebrate mRNAs had conserved regions in their 3′ UTRs, defined as sharing at least 70% identity over >100 nucleotides between corresponding homologous genes (orthologs). They also noted the less frequent but still significant conservation in 5′ UTRs. We have recently observed long stretches of protein coding regions without silent changes in a substantial fraction of vertebrate mRNAs; most of these contain unusually conserved blocks both in the coding regions and in 5′ or 3′ UTRs (H.Sicotte and D.Lipman, unpublished data). A representative sample from a comparison of human and mouse orthologs is shown in Table 2. These conserved sequences are essentially unique in the genome and thus match only to corresponding regions of orthologous mRNAs in other species. The observed level of conservation is far greater than expected for non-coding regions or synonymous sites in coding regions on the basis of known evolutionary rates and divergence times (5).
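The criterion attributed to Duret et al. above (at least 70% identity over more than 100 nucleotides) can be illustrated with a sliding-window scan. The following is a minimal sketch, not the procedure used in (4): it assumes the two orthologous sequences have already been aligned to equal length with `-` marking gaps, and the function names and the 101-nucleotide default window are illustrative choices.

```python
def window_identities(seq_a, seq_b, window):
    """Fraction of identical positions in every window of two pre-aligned sequences.

    Assumption: seq_a and seq_b are already aligned to the same length,
    with '-' marking gaps. Gap columns never count as matches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if window <= 0 or window > len(seq_a):
        return []
    matches = [
        1 if a == b and a != "-" else 0
        for a, b in zip(seq_a.upper(), seq_b.upper())
    ]
    # Running count of matches, updated as the window slides by one column.
    count = sum(matches[:window])
    fractions = [count / window]
    for i in range(window, len(matches)):
        count += matches[i] - matches[i - window]
        fractions.append(count / window)
    return fractions


def has_conserved_block(seq_a, seq_b, window=101, min_identity=0.70):
    """True if any window reaches the identity threshold.

    The defaults mirror the quoted definition (>100 nt, at least 70% identity);
    using exactly 101 nt as the window is an assumption of this sketch.
    """
    return any(f >= min_identity for f in window_identities(seq_a, seq_b, window))
```

A pair of UTRs would count as conserved in this sketch if `has_conserved_block(utr_human, utr_mouse)` returns `True`, where `utr_human` and `utr_mouse` are hypothetical aligned strings.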

What function constrains these regions? Sequence-specific recognition, e.g., by RNA binding proteins, is an unlikely explanation because of the length of the conserved sequences. Furthermore, because so many different mRNAs contain these conserved regions, which are unique for each set of orthologs, sequence-specific recognition would lead into an almost infinite regress. If >30% of the genes contain these unique conserved regions, then another 30% of the genes would be needed to code for the binding proteins, not to mention the proteins regulating these binding proteins, and so on. One might posit that many of these different sequences share a common RNA secondary structure, thus reducing the number of different binding proteins, but the sequence conservation would remain a mystery. It has been shown that short AU-rich motifs promote mRNA degradation (6). Such motifs are often seen in the conserved portions of 3′ UTRs, but these cannot explain the striking conservation between orthologs either. Another possibility would be that the conservation is due to the encoding of a protein on the complementary strand. Extensive database searches using translations of the complementary strand to these conserved regions did not reveal homologies to known proteins which could explain this conservation (results not shown).

A number of studies provide evidence that the conserved regions in 3′ UTRs are required for the regulation of mRNA stability (7). Typically, deletion of these regions renders the mRNA unresponsive to regulatory signals which normally lead to destabilization (8–10). Conversely, introduction of these regions into reporter mRNAs makes them responsive to regulated destabilization (11–13). Conserved regions in 5′ UTRs (14) and coding regions (15–17) have also been implicated in regulation of mRNA stability.

The large number of bases in conserved blocks suggests a base-pairing interaction between mRNA and another nucleic acid. Over the last several years there has been an increasing number of reports of antisense RNA transcripts encoded by the complementary strand of a gene (18–22). Although most reported examples do not show evidence of coding regions, in some cases these countertranscripts encode expressed proteins (23,24). These countertranscripts are sometimes found in different tissues or developmental stages than their corresponding sense mRNA, and thus a regulatory role for endogenous antisense has been proposed (25–28). Examples of regulation of gene expression by endogenous antisense have also been described for nematodes (29), Dictyostelium (30) and prokaryotes (31).

Why would the antisense-based regulatory mechanism require sequence conservation? If cells have a destabilization/degradation system which specifically recognizes long, nearly perfect RNA duplexes, then mutations in a region corresponding to a duplex will be selected against because of their mismatch with the other allele (Fig. 1). Consider, for example, the developmental expression pattern for Hoxa 11 sense and antisense transcripts (27): where sense transcripts are at high levels, antisense transcripts are at low levels, and vice versa. When the Hoxa 11 antisense is abundant, most sense transcripts will be duplexed. Assuming the rate of transcription for the two alleles is roughly equal, a mutation in a region corresponding to a duplex would result in approximately half the sense transcripts forming mismatched duplexes. Let us further assume that the half life of a sense transcript is 12 h and the half life of a perfectly matching sense/antisense duplex is 12 min. When most of the sense transcripts are in perfect duplexes, the drop in mRNA levels could therefore be an order of magnitude or more. However, a mutation leading to allelic divergence in a complementary region could lead to defective recognition of approximately half of the sense/antisense duplexes; thus, half the sense transcripts would have a half life of 12 min and half would have a half life approaching 12 h. The endogenous antisense mechanism would then only be able to reduce mRNA levels by a factor of two. Thus, the conserved regions in mRNAs will be maintained through selection against allelic divergence. In the three cases where the endogenous antisense has been sequenced and the corresponding orthologous mRNA sequences are also available, there is a strong correlation of complementary segments and sequence conservation. For example, in the BFGF gene, there is a single silent change between human and rat sequences in the 280 bases of the coding region which overlap the antisense transcript (unpublished observations).
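The arithmetic behind the 12 h versus 12 min example can be checked with a small steady-state calculation. This is only a sketch under stated assumptions that go beyond the text: first-order decay at steady state (so the level is proportional to transcription rate times half life), a constant transcription rate, and mismatched duplexes decaying with the full 12 h half life of free sense mRNA. The function names are illustrative.

```python
import math


def steady_state_level(rate, half_life_h):
    """Steady-state abundance for first-order decay: rate * half_life / ln 2."""
    return rate * half_life_h / math.log(2)


def fold_reduction(free_half_life_h, duplex_half_life_h, fraction_degraded):
    """Fold drop in sense mRNA relative to the no-antisense baseline.

    fraction_degraded is the share of sense transcripts that sit in
    perfectly matched duplexes and decay with duplex_half_life_h; the
    remainder is assumed (for this sketch) to decay like free mRNA.
    """
    baseline = steady_state_level(1.0, free_half_life_h)
    with_antisense = (
        fraction_degraded * steady_state_level(1.0, duplex_half_life_h)
        + (1.0 - fraction_degraded) * steady_state_level(1.0, free_half_life_h)
    )
    return baseline / with_antisense


# Half lives from the text: 12 h for free sense mRNA, 12 min (0.2 h) for a perfect duplex.
all_matched = fold_reduction(12.0, 0.2, 1.0)   # every sense transcript in a perfect duplex
half_matched = fold_reduction(12.0, 0.2, 0.5)  # allelic mismatch spares half of them
```

With every transcript in a perfect duplex the reduction is 12/0.2, roughly 60-fold; with half of the duplexes mismatched it is 12/6.1, just under two-fold, which matches the factor of two given in the text.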

Table 2

Examples of conserved blocks in human/mouse orthologous mRNAs

With this hypothesis, one would predict that a chromosomal translocation in a region corresponding to a duplex would lead to upregulation of the product of the normal allele. An interesting example of this is the bcl-2/IgH translocation seen in B-cell lymphomas which is associated with increased levels of bcl-2 mRNA and bcl-2 protein as well as detectable levels of a bcl-2/IgH antisense transcript (32). Note that the translocation occurs within the 3′ UTR which contains a number of conserved blocks on either side of the breakpoint. Oligonucleotides complementary specifically to this chimeric antisense downregulate the bcl-2 gene product leading to apoptosis while oligonucleotides complementary to the bcl-2/IgH sense transcript have no effect (32,33). Presumably the chimeric antisense binds to the normal bcl-2 sense mRNA but is not efficiently recognized by the destabilization/degradation system and thus it acts as a competitive inhibitor of the normal bcl-2 antisense transcript.

Figure 1

Endogenous antisense and sequence conservation. Sense and antisense transcription: s1 is the sense transcript from Allele 1, a1 is the antisense transcript from Allele 1, and s2 and a2 are the transcripts from Allele 2. Sequence conservation is observed in the blackened region in both alleles, but Allele 2 has a mutation shown in red. RNA duplex formation: the sense and antisense transcripts are complementary in the paired region (corresponding to the black bar in the above line). The s1/a1 and s2/a2 duplexes match perfectly, but the s1/a2 and s2/a1 duplexes contain a mismatch. mRNA degradation or stabilization: the yellow box represents the destabilization/degradation protein(s). The perfect duplexes are efficiently processed (dashed and diagonal lines) while the mismatched duplexes remain intact.

Double-stranded RNA adenosine deaminase (DSRAD) has been implicated in the destabilization of the BFGF mRNA base-paired with its countertranscript (34). The modification efficiency of DSRAD has also been shown to decrease exponentially as the length of the RNA duplex drops below 100 bp (35). More recently, several additional DSRAD-related proteins have been sequenced (36,37), and other regulatory proteins specific for double-stranded RNA have also been characterized, including the interferon-inducible protein kinase PKR (38) and the dsRNA-dependent RNase L (39,40).

Whatever the proteins involved in destabilizing sense/antisense duplexes, if recognition is duplex-specific and not sequence specific one escapes the infinite regress trap. Perhaps even more importantly, strict specificity for near-perfect duplexes appears to be essential for the function of the postulated regulatory system, as otherwise recognition and degradation of other cellular RNA duplexes such as structural RNA would be catastrophic. The antiviral role of proteins recognizing long RNA duplexes (41,42) may be a serendipitous benefit of this regulatory system.

Studies on the regulation of the antisense transcripts of Wilms tumor suppressor (43), eif2-α (44) and myc (45) show that when the sense transcript is upregulated, the antisense transcription decreases, and when the sense is downregulated, the antisense transcription increases. Thus upregulating the gene increases the sense/antisense ratio and downregulating the gene decreases this ratio. If the sense/antisense duplexes are rapidly degraded a model with direct coupling of transcriptional regulation and mRNA stability appears straightforward.

For example, consider a gene where transcription of the sense message is 10-fold greater than the antisense transcription. In this context, the rapidly degraded duplexes are of little consequence. But decreasing sense transcription by a factor of only two or three and concomitantly increasing antisense transcription by the same magnitude would have a dramatic effect on overall mRNA stability and thus on mRNA levels. Such a model would help explain the drop in mRNA stability seen in a wide variety of systems, including differentiation of MEL cells (46,47). For mRNAs with short half-lives, such as VEGF, the sense/antisense transcription ratio may be closer to one. The increase in VEGF mRNA levels with hypoxic induction is a result of a relatively small increase in transcription coupled with a significant increase in mRNA stability (48,49). Recent results by Kumar and Carmichael (50) show that polyoma virus sense/antisense duplexes, modified by DSRAD, are blocked from transport out of the nucleus, and this, rather than a drop in mRNA stability, accounts for the decreased levels of sense transcript. Their results suggest that the nucleus is the primary site of action for this proposed post-transcriptional regulation. Possible interference with the double-stranded RNA antiviral response provides additional support for this hypothesis. Whether stability, transport or both mechanisms are involved, the key for the model proposed here is that the specificity of recognition be conferred by long, near-perfect RNA duplexes.
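The coupling between transcription ratio and mRNA level described at the start of this paragraph can be made concrete with a toy calculation. Everything beyond the 10-fold starting ratio is an assumption of this sketch rather than part of the model as published: antisense is taken to pair one-to-one with sense transcripts (so the duplexed fraction is the antisense/sense rate ratio, capped at 1), and the 12 h and 12 min half lives are borrowed from the earlier Hoxa 11 illustration. Levels are in arbitrary units.

```python
def duplexed_fraction(sense_rate, antisense_rate):
    """Assumed titration: each antisense transcript pairs with one sense transcript."""
    if sense_rate <= 0:
        raise ValueError("sense_rate must be positive")
    return min(1.0, antisense_rate / sense_rate)


def relative_mrna_level(sense_rate, antisense_rate,
                        free_half_life_h=12.0, duplex_half_life_h=0.2):
    """Steady-state sense mRNA level in arbitrary units (constant ln 2 factor dropped)."""
    f = duplexed_fraction(sense_rate, antisense_rate)
    mean_half_life = f * duplex_half_life_h + (1.0 - f) * free_half_life_h
    return sense_rate * mean_half_life


def fold_drop(sense_rate, antisense_rate, shift):
    """Fold drop in mRNA when sense falls and antisense rises by the same factor."""
    before = relative_mrna_level(sense_rate, antisense_rate)
    after = relative_mrna_level(sense_rate / shift, antisense_rate * shift)
    return before / after


# Sense transcription starts 10-fold above antisense, as in the text.
drop_for_threefold_shift = fold_drop(10.0, 1.0, 3.0)
```

Under these assumptions a threefold shift in each direction takes the duplexed fraction from 10% to 90% and lowers the mRNA level about 23-fold, far more than the threefold drop expected from the change in sense transcription alone.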

Additional evidence for this model comes from experiments where treatment with oligonucleotides unexpectedly stabilized mRNA levels or upregulated a gene product. An oligonucleotide antisense to the start codon for myc (51) stabilized the mRNA level and blocked apoptosis, while an analogous one for CD23 (52) increased the level of the CD23 gene product. Oligonucleotides in the sense orientation to the start codon (used as controls in experiments with antisense oligonucleotides) for the IGF-I receptor (53) and NF-κB (54) unexpectedly upregulated the respective gene products. In all four cases, the oligonucleotide was in a conserved region, suggesting that it in some way interrupted a duplex, thus inhibiting the endogenous destabilization/degradation system. These results suggest a simple approach for testing the model and perhaps modulating gene expression.

Other than coding for proteins or structural RNAs, the extensive ortholog-specific conservation in vertebrate mRNAs is perhaps the most pervasive functional constraint on the genome, as evidenced by sequence conservation. Any explanation for this conservation must deal with the problem of recognizing a unique signal for ∼30 000 different mRNAs. The model of mRNA stability regulation by countertranscripts proposed here handles this infinite regress by positing recognition of nearly perfect, long duplexes, which depends not on a unique signal for each mRNA but still results in sequence conservation. The direct coupling of transcriptional regulation and post-transcriptional regulation of mRNA stability inherent in this model could be important in development, cellular differentiation, stress response, or any other situation of coordinated regulation of multiple genes. If correct, the mechanism proposed here may be modulated for therapeutic benefit.

Acknowledgements

I would like to thank S. Brenner for initial discussions on non-coding conservation; K. Katz, J. Hensold, A. Pause, R. Klausner, D. Botstein and R. Roberts for helpful discussions; A. Krainer for pointing out the potential effect of endogenous antisense on the induction of interferon and the antiviral response; G. Schuler, J. Wootton, R. Tatusov, S. Altschul and D. Landsman for initial analyses of the data and discussion; and E. Koonin for all of the above and help with the manuscript. G. Schuler contributed the schematic.

References

1. Science, 1980, vol. 210 (pg. 1360-1363)
2. DNA, 1981, vol. 1 (pg. 11-18)
3. Nucleic Acids Res., 1985, vol. 13 (pg. 3723-3737)
4. Nucleic Acids Res., 1993, vol. 21 (pg. 2315-2322)
5. Molecular Evolutionary Genetics, 1985, New York, Plenum Press (pg. 1-94)
6. Mol. Cell. Biol., 1995, vol. 15 (pg. 2219-2230)
7. Trends Genet., 1996, vol. 12 (pg. 171-175)
8. J. Biol. Chem., 1995, vol. 270 (pg. 10084-10090)
9. J. Neurosci., 1997, vol. 17 (pg. 1950-1958)
10. Nucleic Acids Res., 1991, vol. 19 (pg. 2387-2394)
11. J. Biol. Chem., 1997, vol. 272 (pg. 1331-1337)
12. Cell, 1986, vol. 46 (pg. 659-667)
13. Mol. Cell. Biol., 1988, vol. 8 (pg. 5521-5527)
14. Nucleic Acids Res., 1992, vol. 20 (pg. 5753-5762)
15. Mol. Cell. Biol., 1993, vol. 13 (pg. 5034-5042)
16. Genes Dev., 1991, vol. 5 (pg. 232-243)
17. Mol. Cell. Biol., 1997, vol. 17 (pg. 1075-1083)
18. J. Mol. Endocrinol., 1991, vol. 7 (pg. 145-154)
19. Mol. Reprod. Dev., 1993, vol. 35 (pg. 394-397)
20. Oncogene, 1994, vol. 9 (pg. 583-595)
21. J. Biol. Chem., 1992, vol. 267 (pg. 9738-9742)
22. Genomics, 1996, vol. 35 (pg. 473-485)
23. EMBO J., 1989, vol. 8 (pg. 2983-2988)
24. Biochem. Biophys. Res. Commun., 1996, vol. 227 (pg. 70-76)
25. Mol. Cell. Endocrinol., 1996, vol. 118 (pg. 113-123)
26. Biochem. Biophys. Res. Commun., 1994, vol. 205 (pg. 577-583)
27. Development, 1995, vol. 121 (pg. 1373-1385)
28. Brain Res. Mol. Brain Res., 1996, vol. 37 (pg. 85-95)
29. Cell, 1993, vol. 75 (pg. 843-854)
30. Cell, 1992, vol. 69 (pg. 197-204)
31. Annu. Rev. Microbiol., 1994, vol. 48 (pg. 713-742)
32. Oncogene, 1996, vol. 13 (pg. 105-115)
33. Proc. Natl. Acad. Sci. USA, 1997, vol. 94 (pg. 8150-8155)
34. Cell, 1989, vol. 59 (pg. 687-696)
35. EMBO J., 1991, vol. 10 (pg. 3523-3532)
36. J. Biol. Chem., 1996, vol. 271 (pg. 31795-31798)
37. Nature, 1996, vol. 379 (pg. 460-464)
38. Trends Biochem. Sci., 1995, vol. 20 (pg. 241-246)
39. J. Interferon Res., 1994, vol. 14 (pg. 101-104)
40. J. Biol. Chem., 1994, vol. 269 (pg. 14153-14158)
41. Virology, 1996, vol. 219 (pg. 339-349)
42. Virology, 1993, vol. 193 (pg. 1037-1041)
43. Oncogene, 1995, vol. 11 (pg. 1589-1595)
44. J. Biol. Chem., 1994, vol. 269 (pg. 29161-29167)
45. Oncogene, 1991, vol. 6 (pg. 1979-1982)
46. Nucleic Acids Res., 1986, vol. 14 (pg. 9653-9666)
47. J. Biol. Chem., 1996, vol. 271 (pg. 3385-3391)
48. J. Biol. Chem., 1995, vol. 270 (pg. 13333-13340)
49. FEBS Lett., 1995, vol. 370 (pg. 203-208)
50. Proc. Natl. Acad. Sci. USA, 1997, vol. 94 (pg. 3542-3547)
51. J. Exp. Med., 1994, vol. 179 (pg. 221-228)
52. Blood, 1994, vol. 84 (pg. 1881-1886)
53. J. Biol. Chem., 1995, vol. 270 (pg. 14383-14388)
54. Antisense Res. Dev., 1993, vol. 3 (pg. 309-322)


© 1997 Oxford University Press
