Identification, characterization and molecular phylogeny of U12-dependent introns in the Arabidopsis thaliana genome
Wei Zhu et al. Nucleic Acids Res. 2003.
Abstract
U12-dependent introns are spliced by the minor U12-type spliceosome and occur in a variety of eukaryotic organisms, including Arabidopsis. In this study, a set of putative U12-dependent introns was compiled from a large collection of cDNA/EST-confirmed introns in the Arabidopsis thaliana genome by means of high-throughput bioinformatic analysis combined with manual scrutiny. A total of 165 U12-type introns were identified based upon stringent criteria. This number of sequences well exceeds the total number of U12-type introns previously reported for plants and allows a more thorough statistical analysis of U12-type signals. Of particular note is the discovery that the distance between the branch site adenosine and the acceptor site ranges from 10 to 39 nt, significantly longer than the previously postulated limit of 21 bp. Further analysis indicates that, in addition to the spacing constraint, the sequence context of the potential acceptor site may have an important role in 3' splice site selection. Several alternative splicing events involving U12-type introns were also captured in this study, providing evidence that U12-dependent acceptor sites can also be recognized by the U2-type spliceosome. Furthermore, phylogenetic analysis suggests that both U12-type AT-AC and U12-type GT-AG introns occurred in Na+/H+ antiporters in a progenitor of animals and plants.
Figures
Figure 1
Identification of U12-type introns. Each transcript-confirmed intron is represented by a point at coordinates (S_d, S_b), where S_d and S_b are the statistical scores for the donor site and branch site, respectively. The yellow rectangle identifies introns that were empirically classified as U12 type. In addition, a yellow arrow indicates a likely U12-type GT-AG intron not included in the selection (see text for details).
Figure 2
Histogram of branch site to acceptor site distances (DistBA) of U12-type introns. The distances were compiled from 51 U12-type AT-AC or AT-AA introns (black bars) and 106 U12-type GT-AG introns (gray bars) listed in Supplementary Material. Note that 12 U12-type introns (five AT-AC introns and seven GT-AG introns) have branch site to 3′ss distances that are longer than 21 bp.
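The DistBA quantity in this caption can be illustrated with a minimal Python sketch. The counting convention used here (number of positions separating the branch adenosine from the last intron nucleotide) is an assumption, since the caption does not state how the paper counts the distance. The function names, the toy intron, and the toy distance list are hypothetical and are not data from the study.

```python
def dist_ba(intron: str, branch_index: int) -> int:
    """Distance in nt from the branch-site adenosine to the 3' end of the intron.

    Assumed convention (not taken from the paper): the number of positions
    separating the branch adenosine from the last intron nucleotide.
    branch_index is the 0-based position of the branch adenosine.
    """
    if intron[branch_index].upper() != "A":
        raise ValueError("branch_index does not point at an adenosine")
    return len(intron) - 1 - branch_index


def count_longer_than(distances, limit=21):
    """Count how many distances exceed the given limit (21 by default)."""
    return sum(1 for d in distances if d > limit)


# Toy AT-AC intron (hypothetical sequence, for illustration only).
toy_intron = "ATATCCTT" + "TCCTTAAC" + "T" * 10 + "AC"
toy_distance = dist_ba(toy_intron, 14)

# Toy list of distances (hypothetical values, not the paper's data).
toy_over_limit = count_longer_than([10, 21, 22, 39])
```

Applied to a real list of DistBA values, `count_longer_than` would give the kind of tally the caption reports for introns beyond the 21 bp limit.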
Figure 3
Length distribution of the U12- and U2-type introns. The histogram for the U2-type introns (plotted as a green line) was derived from 70,189 transcript-confirmed Arabidopsis introns. The histogram for U12-type introns (filled columns) is based on 145 sequences.
Figure 4
Analysis of Na+/H+ antiporters. The sources of the Na+/H+ antiporter protein sequences are as follows (GenBank accession nos in parentheses): AtNHX1 (AAD16946), AtNHX2 (AAM08403), AtNHX3 (AAO41905), AtNHX4 (AAM08405), AtNHX5 (AAM08406), AtNHX6 (AAM08407) and AtSOS1 (AAF76139) from A. thaliana; InNHX1 (BAB60899) from Ipomoea nil; OsNHX1 (BAA83337) from rice; AgNHX1 (BAB11940) from Atriplex gmelini; ScNHX1 (NP_010744) from yeast; HsNHE1 (P19634), HsNHE2 (AAD41635), HsNHE3 (P48764), HsNHE6 (Q92581) and HsNHE7 (NP_115980) from human; EcNhaA (P13738) and EcNhaB (P27377) from E. coli. (A) The region [361, 420] of the multiple alignment of the Na+/H+ antiporters. Residues in each column of the alignments are shaded in black or gray if >70% of residues in the column are identical or similar. A phase-0 (i.e. between codons) U12-type AT-AC intron is marked by an upside-down green triangle for AtNHX5, AtNHX6 and HsNHE6. Correspondingly, a phase-0 U2-type GT-AG intron is marked by a red triangle in HsNHE7. (B) The nucleotide sequences around the termini of the U2-type GT-AG intron marked by the red triangle in HsNHE7 in (A) (where | represents an exon–intron junction) and the sequence of the translation product. The intact U12-type splice signals are marked by shading. The potential U12-dependent splicing would replace the NAN tripeptide in the translation of the transcript resulting from U2-type splicing with the tetrapeptide VTAL, matching the sequence in HsNHE6. (C) The region [481, 540] of the multiple alignment of the Na+/H+ antiporters. A phase-0 U12-type GT-AG intron is marked by an upside-down yellow triangle for AtNHX5, AtNHX6, HsNHE6 and HsNHE7. (D) Neighbor-joining tree derived from the multiple alignment of the Na+/H+ antiporters using MEGA2 (31).
The numbers on the tree branches are bootstrap values, and branch lengths are proportional to the pairwise p-distances as indicated by the scale bar in the lower left (see Materials and Methods for details). The branches are colored green, yellow and red corresponding to the occurrences of U12-type AT-AC, U12-type GT-AG and U2-type GT-AG introns, respectively.
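As a rough illustration of the pairwise p-distance mentioned in this caption, the sketch below computes the proportion of differing sites between two aligned sequences. Skipping any column that contains a gap (pairwise deletion) is an assumption; the MEGA2 settings actually used are described in the paper's Materials and Methods, which are not reproduced here. The function name and toy sequences are hypothetical.

```python
def p_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap are skipped (pairwise deletion,
    an assumption made for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    different = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # ignore gapped columns
        compared += 1
        if a != b:
            different += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return different / compared


# Toy aligned peptides (hypothetical): 5 comparable sites, 1 difference.
toy_p = p_distance("MKV-LA", "MRVTLA")
```

A matrix of such pairwise values is the kind of input a neighbor-joining method takes to build a tree.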
Figure 5
Putative drought-induced proteins. (A) Alignment of the protein sequences from Arabidopsis and rice. A phase-0 intron occupies a conserved position immediately after the green-colored column K103 in all of the genes. At that location, the rice gene AAO33770 has a U2-type GT-AG intron and the Arabidopsis gene At4g02200 has a U12-type GT-AG intron, whereas the remaining six genes each have a U12-type AT-AC intron. (B) Neighbor-joining tree derived from the alignment in (A). The figure scheme follows Figure 4D. (C) Alignment of the U12-type intron sequences in the genes At4g02200, At1g02750 and At3g05700. Only the terminal regions of the alignment are displayed, and splicing signals are indicated by shading.
Figure 6
Dinucleotide relative abundances in the proximity of the 3′ss of U12- and U2-type introns. The dinucleotide relative abundances (see Materials and Methods for definition) between the BSS and the acceptor site versus the equivalent-size region immediately following the acceptor site were plotted for U12-type AT-AM introns (red underlined font), U12-type GT-AG introns (green underlined font) and U2-type GT-AG introns (blue font).
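The paper's own definition of dinucleotide relative abundance is in its Materials and Methods and is not reproduced here. The sketch below uses the commonly cited odds-ratio form, rho_xy = f_xy / (f_x * f_y), purely as an assumption for illustration. The function name and the toy sequence are hypothetical.

```python
from collections import Counter


def dinucleotide_relative_abundance(seq: str) -> dict:
    """Return rho_xy = f_xy / (f_x * f_y) for each dinucleotide observed in seq.

    f_xy is the frequency of dinucleotide xy among overlapping dinucleotides;
    f_x and f_y are mononucleotide frequencies. This odds-ratio form is an
    assumed definition, not one confirmed by the source.
    """
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two nucleotides")
    mono = Counter(seq)
    di = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    n_mono = len(seq)
    n_di = len(seq) - 1
    rho = {}
    for pair, count in di.items():
        f_x = mono[pair[0]] / n_mono
        f_y = mono[pair[1]] / n_mono
        rho[pair] = (count / n_di) / (f_x * f_y)
    return rho


# Toy sequence (hypothetical); values above 1 indicate over-representation.
toy_rho = dinucleotide_relative_abundance("ATAT")
```

For a comparison like the one plotted in the figure, the same function would be applied separately to the region between the branch site and the acceptor site and to the equal-length region after the acceptor site.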
Similar articles
- Evolutionary conservation of minor U12-type spliceosome between plants and humans. Lorkovic ZJ, Lehner R, Forstner C, Barta A. RNA. 2005 Jul;11(7):1095-107. doi: 10.1261/rna.2440305. PMID: 15987817. Free PMC article.
- A computational scan for U12-dependent introns in the human genome sequence. Levine A, Durbin R. Nucleic Acids Res. 2001 Oct 1;29(19):4006-13. doi: 10.1093/nar/29.19.4006. PMID: 11574683. Free PMC article.
- Evolutionary fates and origins of U12-type introns. Burge CB, Padgett RA, Sharp PA. Mol Cell. 1998 Dec;2(6):773-85. doi: 10.1016/s1097-2765(00)80292-0. PMID: 9885565.
- U12-dependent intron splicing in plants. Simpson CG, Brown JW. Curr Top Microbiol Immunol. 2008;326:61-82. doi: 10.1007/978-3-540-76776-3_4. PMID: 18630747. Review.
- Alternative splicing of U12-type introns. Chang WC, Chen HH, Tarn WY. Front Biosci. 2008 Jan 1;13:1681-90. doi: 10.2741/2791. PMID: 17981659. Review.
Cited by
- An updated database of virus circular RNAs provides new insights into the biogenesis mechanism of the molecule. Fu P, Cai Z, Zhang Z, Meng X, Peng Y. Emerg Microbes Infect. 2023 Dec;12(2):2261558. doi: 10.1080/22221751.2023.2261558. Epub 2023 Oct 5. PMID: 37725485. Free PMC article.
- Consideration of non-canonical splice sites improves gene prediction on the Arabidopsis thaliana Niederzenz-1 genome sequence. Pucker B, Holtgräwe D, Weisshaar B. BMC Res Notes. 2017 Dec 4;10(1):667. doi: 10.1186/s13104-017-2985-y. PMID: 29202864. Free PMC article.
- quatre-quart1 is an indispensable U12 intron-containing gene that plays a crucial role in Arabidopsis development. Kwak KJ, Kim BM, Lee K, Kang H. J Exp Bot. 2017 May 17;68(11):2731-2739. doi: 10.1093/jxb/erx138. PMID: 28475733. Free PMC article.
- Full-length sequence assembly reveals circular RNAs with diverse non-GT/AG splicing signals in rice. Ye CY, Zhang X, Chu Q, Liu C, Yu Y, Jiang W, Zhu QH, Fan L, Guo L. RNA Biol. 2017 Aug 3;14(8):1055-1063. doi: 10.1080/15476286.2016.1245268. Epub 2016 Oct 14. PMID: 27739910. Free PMC article.
- Structural features important for the U12 snRNA binding and minor spliceosome assembly of Arabidopsis U11/U12-small nuclear ribonucleoproteins. Park SJ, Jung HJ, Nguyen Dinh S, Kang H. RNA Biol. 2016 Jul 2;13(7):670-9. doi: 10.1080/15476286.2016.1191736. Epub 2016 May 27. PMID: 27232356. Free PMC article.
References
- Hall,S.L. and Padgett,R.A. (1994) Conserved sequences in a class of rare eukaryotic nuclear introns with non-consensus splice sites. J. Mol. Biol., 239, 357–365.
- Burge,C.B., Padgett,R.A. and Sharp,P.A. (1998) Evolutionary fates and origins of U12-type introns. Mol. Cell, 2, 773–785.
- Burge,C.B., Tuschl,T. and Sharp,P.A. (1999) Splicing of precursors to mRNAs by the spliceosomes. In Gesteland,R.F., Cech,T. and Atkins,J.F. (eds), The RNA World II. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, pp. 525–560.