Functional noncoding sequences derived from SINEs in the mammalian genome

Hidenori Nishihara et al. Genome Res. 2006 Jul.

Abstract

Recent comparative analyses of mammalian sequences have revealed that a large number of nonprotein-coding genomic regions are under strong selective constraint. Here, we report that some of these loci have been derived from a newly defined family of ancient SINEs (short interspersed repetitive elements). This is a surprising result, as SINEs and other transposable elements are commonly thought to be genomic parasites. We named the ancient SINE family AmnSINE1, for Amniota SINE1, because we found it to be present in mammals as well as in birds, and some copies predate the mammalian-bird split 310 million years ago (Mya). AmnSINE1 has a chimeric structure of a 5S rRNA and a tRNA-derived SINE, and is related to five tRNA-derived SINE families that we characterized here in the coelacanth, dogfish shark, hagfish, and amphioxus genomes. All of the newly described SINE families have a common central domain that is also shared by zebrafish SINE3, and we collectively name them the DeuSINE (Deuterostomia SINE) superfamily. Notably, of the approximately 1000 still identifiable copies of AmnSINE1 in the human genome, 105 correspond to loci phylogenetically highly conserved among mammalian orthologs. The conservation is strongest over the central domain. Thus, AmnSINE1 appears to be the best example of a transposable element of which a significant fraction of the copies have acquired genomic functionality.

Figures

Figure 1.

The structure and distribution of DeuSINEs. (A) Schematic representation of DeuSINEs and the known phylogenetic relationships between the host species. All of these SINEs were newly characterized in this study except for zebrafish SINE3 and sea urchin SINE2-3_SP. Green boxes represent common sequences among DeuSINEs, and an alignment of the region is shown in B. Yellow and red boxes denote promoter regions derived from tRNA (see Fig. 2A) and 5S rRNA (Fig. 2B), respectively. The regions denoted by yellow boxes in 5S rRNA-derived SINEs (AmnSINE1, SINE3, SINE3_IP and OS-SINE1) are similar to tRNA-derived promoter regions of LmeSINE1 (see Fig. 3 for details). Blue boxes represent 3′-tails of SINEs that are similar to that of zebrafish CR1-4_DR LINE (Fig. 4A), whereas the purple box is similar to that of the rainbow trout RSg-1 LINE (Fig. 4B). The 3′-tail sequences of EbuSINE1, EbuSINE2, BflSINE1, and SINE2-3_SP are distinct and of unknown origin. (B) Alignment of the common central DeuSINE sequences (Deu-domain; green boxes in A) in different host organisms. The SINE3-1 and the SINE3-2a sequences represent two subfamilies of the zebrafish SINE3 family. The top sequence is a consensus of the DeuSINE domain. Dots indicate nucleotides identical to those in the consensus sequence, and dashes indicate gaps inserted to improve the alignment.

Figure 2.

Characterization of promoter regions of DeuSINEs. (A) Six tRNA-like structures of the promoter regions of tRNA-derived DeuSINEs: (a) LmeSINE1a, (b) LmeSINE1b, (c) SacSINE1, (d) EbuSINE1, (e) EbuSINE2, and (f) BflSINE1. Standard base pairs and G-T wobble pairs are shown as black dashes and dots, respectively. Nucleotides conserved in functional tRNAs (8T, 14A, 15G, 18G, 32C, 33T, and 37R) are circled, and the Box B promoter sequences are boxed. The numbering system corresponds with that of general tRNA (Gauss et al. 1979). (B) An alignment of consensus sequences of the 5S rRNA-related regions (red boxes) of AmnSINE1, SINE3, SINE3_IP, and OS-SINE1. Zebrafish 5S rRNA gene sequences were obtained from Kapitonov and Jurka (2003). The 5S rRNA sequences of human and rainbow trout were obtained from GenBank (accession nos. X51545 and J01861, respectively). Box A, Box C, and Intermediate Element of pol III promoters are denoted with black lines. Nucleotides shaded in black are conserved across sequences.

Figure 3.

Chimeric structure of 5S rRNA and tRNA-derived DeuSINEs. (A) Comparison of the tRNA-derived regions of LmeSINE1a and LmeSINE1b with a part of the 5S rRNA-derived SINEs, AmnSINE1, SINE3, SINE3_IP, and OS-SINE1. Box A and Box B are the pol III promoter sequences of the two LmeSINEs. (B) A possible scheme for the structural evolution of the 5S rRNA-derived SINE families. The green boxes and the blue boxes denote the Deu-domain and 3′-tail, respectively. A 5S rRNA sequence (red boxes) became joined with a tRNA-derived SINE, with subsequent partial deletion of the original tRNA-derived promoter region (yellow boxes).

Figure 4.

Alignment of the consensus 3′-tail sequences of DeuSINEs with that of the corresponding LINEs. “RTase” denotes the reverse-transcriptase encoded by LINEs. (A) The 3′-tail sequences of AmnSINE1, LmeSINE1a, LmeSINE1b, SINE3, SINE3_IP, and SacSINE1 (blue boxes) are similar to that of zebrafish CR1-4_DR LINEs. (B) The 3′-tail of OS-SINE1 (purple box) is similar to rainbow trout RSg-1 LINE. Both CR1-4_DR and RSg-1 LINE sequences were obtained from Repbase Update database (Jurka 2000).

Figure 5.

Conservation of Deu-domain sequences (black box) among AmnSINE1 copies in human and chicken genomes. This graph shows the number of copies that include each nucleotide position of the AmnSINE1 in human (the bold line) and chicken (the dotted line). The number of copies of AmnSINE1 analyzed is 380 and 742 for human and chicken, respectively.
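The per-position copy count plotted in this figure can be illustrated with a minimal sketch. It assumes each copy has already been mapped to consensus coordinates as a 1-based inclusive (start, end) interval; the function name `copies_per_position` and the toy intervals are hypothetical and do not come from the paper.

```python
def copies_per_position(intervals, consensus_length):
    """Count how many copies cover each consensus position.

    intervals: iterable of (start, end) pairs, 1-based and inclusive,
    giving the part of the consensus that each copy spans (assumed input).
    Returns a list whose element i is the count for position i + 1.
    """
    # Difference array: +1 where a copy starts, -1 just past where it ends.
    diff = [0] * (consensus_length + 1)
    for start, end in intervals:
        s = max(1, start)
        e = min(consensus_length, end)
        if s > e:
            continue  # interval lies entirely outside the consensus
        diff[s - 1] += 1
        diff[e] -= 1

    # The running sum of the difference array is the coverage.
    counts = []
    running = 0
    for delta in diff[:consensus_length]:
        running += delta
        counts.append(running)
    return counts


# Toy example with three hypothetical copies on a 5-nt consensus.
print(copies_per_position([(1, 3), (2, 5), (4, 4)], 5))
```

A peak of such counts over the central region would correspond to the pattern the caption describes for the Deu-domain.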

Figure 6.

Evidence for purifying selection on AmnSINE1 in mammals. (A) An example of a conserved AmnSINE1 locus (the window is chr15:59,354,815-59,356,314, in which the AmnSINE1 is located at chr15:59,355,417-59,355,712) obtained from the UCSC Genome Bioinformatics Web site. Note that the AmnSINE1 sequence is conserved in all mammals including opossum. (B) The location and conservation of 10 representative AmnSINE1 loci in human. The position information in human was determined from genomic sequence data in UCSC Genome Bioinformatics (ver. hg17). (Chr) Chromosome number. PhastCons conservation scores of the 1.5-kbp region around and including each AmnSINE1 were obtained by comparing human, chimpanzee, mouse, rat, and dog sequences, and the graphs are displayed for each locus. In each graph, the black region denotes the AmnSINE1 sequence and the gray represents the flanking region corresponding to the given position. Detailed information for the 10 loci is available in Supplemental Figure 3.

Figure 7.

A schematic representation of the method used in this study to find novel SINEs from genomic sequences of the coelacanth. This algorithm consists of the following five steps: (1) detection of Box B-like sequences; (2) extraction of their flanking sequences (exemplified as Seq1–8); (3) an all-against-all BLAST search among the extracted sequences to find similar sequences; (4) collection of sequences that are recognized as similar to each other (E-value < 10^−50); (5) alignment of the sequences within each group.
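The five steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the Box B pattern is the commonly cited tRNA consensus GGTTCGANNCC (the exact pattern used in the study is not given here), the flank widths are arbitrary, single-linkage grouping is one possible reading of step 4, and the pairwise hits are assumed to have been parsed from an external BLAST run. The names `extract_flanks` and `group_by_hits` are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: commonly cited tRNA Box B consensus (GGTTCGANNCC).
BOX_B = re.compile(r"GGTTCGA[ACGT]{2}CC")


def extract_flanks(seq, upstream=100, downstream=300):
    """Steps 1-2: locate Box B-like matches and cut out flanking sequence.

    Returns a list of (match_start, fragment) tuples. The flank widths
    are illustrative defaults, not values taken from the paper.
    """
    seq = seq.upper()
    fragments = []
    for match in BOX_B.finditer(seq):
        lo = max(0, match.start() - upstream)
        hi = min(len(seq), match.end() + downstream)
        fragments.append((match.start(), seq[lo:hi]))
    return fragments


def group_by_hits(names, hits, max_evalue=1e-50):
    """Step 4: group sequences linked by hits with E-value < max_evalue.

    hits: (query, subject, evalue) tuples, assumed to come from the
    all-against-all BLAST search of step 3. Grouping is single-linkage
    (union-find), which is an assumption about how groups are formed.
    """
    parent = {name: name for name in names}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for query, subject, evalue in hits:
        if query != subject and evalue < max_evalue:
            parent[find(query)] = find(subject)

    groups = defaultdict(list)
    for name in names:
        groups[find(name)].append(name)
    # Keep only groups with at least two members; singletons are not repeats.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


# Toy example: Seq1-3 are linked by strong hits; Seq4 only by a weak one.
names = ["Seq1", "Seq2", "Seq3", "Seq4"]
hits = [("Seq1", "Seq2", 1e-60), ("Seq2", "Seq3", 1e-80), ("Seq3", "Seq4", 1e-10)]
print(group_by_hits(names, hits))
```

Step 5 would then pass each group to a multiple-alignment tool of choice; that step is left out because it depends on external software.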

References

    1. Altschul S.F., Gish W., Miller W., Myers E.W., Lipman D.J. Basic local alignment search tool. J. Mol. Biol. 1990;215:403–410.
    2. Bejerano G., Pheasant M., Makunin I., Stephen S., Kent W.J., Mattick J.S., Haussler D. Ultraconserved elements in the human genome. Science. 2004a;304:1321–1325.
    3. Bejerano G., Haussler D., Blanchette M. Into the heart of darkness: Large-scale clustering of human non-coding DNA. Bioinformatics. 2004b;20(Suppl 1):I40–I48.
    4. Bejerano G., Lowe C., Ahituv N., King B., Siepel A., Salama S., Rubin E.M., Kent W.J., Haussler D. A distal enhancer and ultraconserved exon are derived from a novel retroposon. Nature. 2006;441:87–90.
    5. Benton M.J. Vertebrate paleontology. Chapman & Hall; New York: 1997.
