Identification, characterization and molecular phylogeny of U12-dependent introns in the Arabidopsis thaliana genome

Wei Zhu et al. Nucleic Acids Res. 2003.

Abstract

U12-dependent introns are spliced by the minor U12-type spliceosome and occur in a variety of eukaryotic organisms, including Arabidopsis. In this study, a set of putative U12-dependent introns was compiled from a large collection of cDNA/EST-confirmed introns in the Arabidopsis thaliana genome by means of high-throughput bioinformatic analysis combined with manual scrutiny. A total of 165 U12-type introns were identified based upon stringent criteria. This number of sequences well exceeds the total number of U12-type introns previously reported for plants and allows a more thorough statistical analysis of U12-type signals. Of particular note is the discovery that the distance between the branch site adenosine and the acceptor site ranges from 10 to 39 nt, extending well beyond the previously postulated limit of 21 bp. Further analysis indicates that, in addition to the spacing constraint, the sequence context of the potential acceptor site may have an important role in 3' splice site selection. Several alternative splicing events involving U12-type introns were also captured in this study, providing evidence that U12-dependent acceptor sites can also be recognized by the U2-type spliceosome. Furthermore, phylogenetic analysis suggests that both U12-type AT-AC and U12-type GT-AG introns occurred in Na+/H+ antiporters in a progenitor of animals and plants.

Figures

Figure 1

Identification of U12-type introns. Each transcript-confirmed intron is represented by a point at coordinates (_S_d, _S_b), where _S_d and _S_b are the statistical scores for the donor site and the branch site, respectively. The yellow rectangle identifies introns that were empirically classified as U12 type. In addition, a yellow arrow indicates a likely U12-type GT-AG intron that was not included in the selection (see text for details).

Figure 2

Histogram of branch site to acceptor site distances (DistBA) of U12-type introns. The distances were compiled from 51 U12-type AT-AC or AT-AA introns (black bars) and 106 U12-type GT-AG introns (gray bars) listed in the Supplementary Material. Note that 12 U12-type introns (five AT-AC introns and seven GT-AG introns) have branch site to 3′ss distances longer than 21 bp.
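As a rough illustration of how a histogram like this could be tabulated, the sketch below computes a DistBA value per intron and counts how many exceed the 21 bp limit. The counting convention (number of intron nucleotides downstream of the branch site adenosine), the function names `dist_ba` and `summarize`, and the toy coordinates are all assumptions made for this example; they are not taken from the paper or its Supplementary Material.

```python
from collections import Counter


def dist_ba(intron_length: int, branch_a_index: int) -> int:
    """Distance from the branch site adenosine to the intron's 3' end.

    Assumed convention: the number of intron nucleotides that lie
    downstream of the adenosine, where branch_a_index is its 0-based
    position within the intron. The paper may count differently.
    """
    if not 0 <= branch_a_index < intron_length:
        raise ValueError("branch site must lie inside the intron")
    return intron_length - 1 - branch_a_index


def summarize(distances, limit=21):
    """Return (histogram as a sorted dict, count of distances above limit)."""
    hist = dict(sorted(Counter(distances).items()))
    n_long = sum(1 for d in distances if d > limit)
    return hist, n_long


# Hypothetical (intron_length, branch_a_index) pairs, not real data.
toy_introns = [(100, 87), (120, 107), (95, 70), (150, 120)]
distances = [dist_ba(length, idx) for length, idx in toy_introns]
hist, n_long = summarize(distances)
print(hist, n_long)
```

Keeping the limit as a parameter makes it easy to compare the 21 bp threshold with other candidate cut-offs on the same set of distances.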

Figure 3

Length distribution of the U12- and U2-type introns. The histogram for the U2-type introns (green line) was derived from 70,189 transcript-confirmed Arabidopsis introns. The histogram for the U12-type introns (filled columns) is based on 145 sequences.

Figure 4

Analysis of Na+/H+ antiporters. The sources of the Na+/H+ antiporter protein sequences are as follows (GenBank accession numbers in parentheses): AtNHX1 (AAD16946), AtNHX2 (AAM08403), AtNHX3 (AAO41905), AtNHX4 (AAM08405), AtNHX5 (AAM08406), AtNHX6 (AAM08407) and AtSOS1 (AAF76139) from A. thaliana; InNHX1 (BAB60899) from Ipomoea nil; OsNHX1 (BAA83337) from rice; AgNHX1 (BAB11940) from Atriplex gmelini; ScNHX1 (NP_010744) from yeast; HsNHE1 (P19634), HsNHE2 (AAD41635), HsNHE3 (P48764), HsNHE6 (Q92581) and HsNHE7 (NP_115980) from human; EcNhaA (P13738) and EcNhaB (P27377) from E. coli. (A) The region [361, 420] of the multiple alignment of the Na+/H+ antiporters. Residues in each column of the alignment are shaded in black or gray if >70% of the residues in the column are identical or similar. A phase-0 (i.e. between codons) U12-type AT-AC intron is marked by an inverted green triangle for AtNHX5, AtNHX6 and HsNHE6. Correspondingly, a phase-0 U2-type GT-AG intron is marked by a red triangle in HsNHE7. (B) The nucleotide sequences around the termini of the U2-type GT-AG intron marked by the red triangle in HsNHE7 in (A) (where | represents an exon–intron junction) and the sequence of the translation product. The intact U12-type splice signals are marked by shading. Potential U12-dependent splicing would replace the NAN tripeptide in the translation of the transcript resulting from U2-type splicing with the tetrapeptide VTAL, identical to the sequence in HsNHE6. (C) The region [481, 540] of the multiple alignment of the Na+/H+ antiporters. A phase-0 U12-type GT-AG intron is marked by an inverted yellow triangle for AtNHX5, AtNHX6, HsNHE6 and HsNHE7. (D) Neighbor-joining tree derived from the multiple alignment of the Na+/H+ antiporters using MEGA2 (31). The numbers on the tree branches are bootstrap values, and branch lengths are proportional to the pairwise _p_-distances as indicated by the scale bar in the lower left (see Materials and Methods for details). The branches are colored green, yellow and red corresponding to the occurrences of U12-type AT-AC, U12-type GT-AG and U2-type GT-AG introns, respectively.

Figure 5

Putative drought-induced proteins. (A) Alignment of the protein sequences from Arabidopsis and rice. A phase-0 intron is located at a conserved position immediately after the green-colored column K103 in all of the genes. At that location, the rice gene AAO33770 has a U2-type GT-AG intron and the Arabidopsis gene At4g02200 has a U12-type GT-AG intron, whereas the remaining six genes each have a U12-type AT-AC intron. (B) Neighbor-joining tree derived from the alignment in (A). The figure scheme follows Figure 4D. (C) Alignment of the U12-type intron sequences in the genes At4g02200, At1g02750 and At3g05700. Only the terminal regions of the alignment are displayed, and splicing signals are indicated by shading.

Figure 6

Dinucleotide relative abundances in the proximity of the 3′ss of U12- and U2-type introns. The dinucleotide relative abundances (see Materials and Methods for definition) in the region between the BSS and the acceptor site were plotted against those in the region of equivalent size immediately downstream of the acceptor site for U12-type AT-AM introns (red underlined text), U12-type GT-AG introns (green underlined text) and U2-type GT-AG introns (blue text).
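The definition of dinucleotide relative abundance used in the paper is given in its Materials and Methods, which is not reproduced here. The sketch below assumes the commonly used odds-ratio form, rho_xy = f_xy / (f_x * f_y), where f_xy is the dinucleotide frequency and f_x, f_y are the mononucleotide frequencies; that formula, the function name `relative_abundance` and the toy sequences are assumptions for illustration only.

```python
from collections import Counter
from itertools import product


def relative_abundance(seqs):
    """Dinucleotide odds ratios rho_xy = f_xy / (f_x * f_y) for a set of sequences.

    Assumed definition (not confirmed by the source text). Dinucleotides
    whose component bases are absent from the input are omitted.
    """
    mono, di = Counter(), Counter()
    for s in seqs:
        s = s.upper()
        mono.update(s)                                    # single-base counts
        di.update(s[i:i + 2] for i in range(len(s) - 1))  # overlapping pairs
    n_mono, n_di = sum(mono.values()), sum(di.values())
    if n_mono == 0 or n_di == 0:
        return {}
    rho = {}
    for x, y in product("ACGT", repeat=2):
        fx, fy = mono[x] / n_mono, mono[y] / n_mono
        if fx == 0 or fy == 0:
            continue
        rho[x + y] = (di[x + y] / n_di) / (fx * fy)
    return rho


# Hypothetical regions: upstream of the acceptor site (branch site to 3'ss)
# and the same-size stretch just downstream of it. Not real data.
upstream = ["TTTTCTTTTGCAC", "TCCTTTTTTACAG"]
downstream = ["GTTGCAGAAGCTG", "ATGGCTGAAGTTC"]
rho_up = relative_abundance(upstream)
rho_down = relative_abundance(downstream)
for dinuc in sorted(rho_up.keys() & rho_down.keys()):
    print(dinuc, round(rho_up[dinuc], 2), round(rho_down[dinuc], 2))
```

Plotting each dinucleotide at (upstream value, downstream value) would give a scatter of the kind the caption describes, with points away from the diagonal marking dinucleotides that differ between the two regions.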

References

    1. Jackson, I.J. (1991) A reappraisal of non-consensus mRNA splice sites. Nucleic Acids Res., 19, 3795–3798.
    2. Hall, S.L. and Padgett, R.A. (1994) Conserved sequences in a class of rare eukaryotic nuclear introns with non-consensus splice sites. J. Mol. Biol., 239, 357–365.
    3. Burge, C.B., Padgett, R.A. and Sharp, P.A. (1998) Evolutionary fates and origins of U12-type introns. Mol. Cell, 2, 773–785.
    4. Burge, C.B., Tuschl, T. and Sharp, P.A. (1999) Splicing of precursors to mRNAs by the spliceosomes. In Gesteland, R.F., Cech, T. and Atkins, J.F. (eds), The RNA World II. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, pp. 525–560.
    5. Wu, Q. and Krainer, A.R. (1999) AT-AC pre-mRNA splicing mechanisms and conservation of minor introns in voltage-gated ion channel genes. Mol. Cell. Biol., 19, 3225–3236.
