Evolutionary dynamics of transposable elements in the short-tailed opossum Monodelphis domestica

Andrew J Gentles et al. Genome Res. 2007 Jul.

Abstract

The genome of the gray short-tailed opossum Monodelphis domestica is notable for its large size (approximately 3.6 Gb). We characterized nearly 500 families of interspersed repeats from Monodelphis. They cover approximately 52% of the genome, higher than in any other amniotic lineage studied to date, and may account for the unusually large genome size. In comparison to other mammals, Monodelphis is significantly rich in non-LTR retrotransposons from the LINE-1, CR1, and RTE families, with >29% of the genome sequence composed of copies of these elements. Monodelphis has at least four families of RTE, and we report support for horizontal transfer of this non-LTR retrotransposon. In addition to short interspersed elements (SINEs) mobilized by L1, we found several families of SINEs that appear to use RTE elements for mobilization. In contrast to L1-mobilized SINEs, the RTE-mobilized SINEs in Monodelphis appear to shift from G+C-rich to G+C-low regions with time. Endogenous retroviruses have colonized approximately 10% of the opossum genome. We found that their density is enhanced in centromeric and/or telomeric regions of most Monodelphis chromosomes. We identified 83 new families of ancient repeats that are highly conserved across amniotic lineages, including 14 LINE-derived repeats and a novel SINE element, MER131, that may have been exapted as a highly conserved functional noncoding RNA and whose emergence dates back to approximately 300 million years ago. Many of these conserved repeats are also present in human and are highly over-represented in predicted cis-regulatory modules. Seventy-six of the 83 families are present in chicken in addition to mammals.

Figures

Figure 1.

(A) Age distribution of RTE-3 and MAR1. RTE-3 and MAR1 insertions were separately split into groups according to their similarity to consensus, in bins of width 2% (horizontal axis). The vertical axis shows the proportion of RTE-3 (MAR1) elements of that age, calculated as the number of base pairs of sequence covered by elements in that similarity range divided by the total genome base pairs covered by RTE-3 (MAR1). (B) Distribution of target site duplication lengths of RTE-3 and MAR1. Length of target site duplication is shown on the horizontal axis. The vertical axis shows the frequency of TSDs of that length for RTE-3 and MAR1.
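The binning described for panel A can be illustrated with a minimal Python sketch. It assumes each insertion is available as a pair of (percent similarity to consensus, length in bp); the function name `age_distribution` and the input layout are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def age_distribution(insertions, bin_width=2.0):
    """Fraction of element-covered base pairs falling in each similarity bin.

    insertions: iterable of (similarity_percent, length_bp) pairs
                (hypothetical input layout, assumed for illustration).
    Returns a dict mapping the lower edge of each bin to its share of bp.
    """
    covered = defaultdict(int)
    top_bin = int(100 // bin_width) - 1
    for similarity, length_bp in insertions:
        # Index of the similarity bin; 100% similarity is folded into the top bin.
        index = min(int(similarity // bin_width), top_bin)
        covered[index * bin_width] += length_bp
    total = sum(covered.values())
    return {edge: bp / total for edge, bp in sorted(covered.items())}


example = [(99.1, 300), (98.5, 100), (91.0, 600)]
distribution = age_distribution(example)
```

Weighting by base pairs rather than by copy number follows the caption, which divides the sequence covered in each similarity range by the total sequence covered by that family.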

Figure 2.

Phylogenetic relationship between reconstructed RTE consensus sequences. The tree was reconstructed using MrBayes, as described in Methods. Numbers at nodes indicate bootstrap support for that node (%); only support of >70% is shown.

Figure 3.

Density of ERV insertions across Monodelphis chromosomes. The density shown is the percentage of sequence that is identified as internal ERV or LTR sequence in 100-kb segments spanning each chromosome. Centromere positions (determined from FISH data, see text) are indicated by a gray circle on the horizontal axis. Position along chromosomes is shown in megabases. The gray dots are values for each individual 100-kb segment. Black lines are a smoothed running mean. Peaks in ERV density on chromosomes 1 and 2 correspond to centromere locations. Prominent peaks are also found on chromosomes 3–5, but do not correspond to centromeric regions in the genome assembly (Mikkelsen et al. 2007); however, they are roughly consistent with locations of cytologically determined centromere activity reported in the literature (Rens et al. 2003).
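As a rough illustration of the quantities plotted here, the following Python sketch computes percent coverage in fixed 100-kb windows and a centered running mean. The interval format (half-open, non-overlapping start/end coordinates), the function names, and the smoothing span are assumptions; the caption does not specify how the running mean was computed.

```python
def window_density(intervals, chrom_length, window=100_000):
    """Percent of each window covered by annotated ERV/LTR sequence.

    intervals: iterable of (start, end) half-open coordinates, assumed
               non-overlapping (hypothetical input layout).
    """
    n_windows = -(-chrom_length // window)  # ceiling division
    covered = [0] * n_windows
    for start, end in intervals:
        pos = start
        while pos < end:
            idx = pos // window
            # Split the interval at window boundaries.
            stop = min((idx + 1) * window, end)
            covered[idx] += stop - pos
            pos = stop
    # The last window may be shorter than the others.
    sizes = [min(window, chrom_length - i * window) for i in range(n_windows)]
    return [100.0 * c / s for c, s in zip(covered, sizes)]


def running_mean(values, span=5):
    """Centered running mean; the span is an assumed smoothing parameter."""
    half = span // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


densities = window_density([(50_000, 150_000)], chrom_length=300_000)
smoothed = running_mean(densities, span=3)
```

The unsmoothed values correspond to the gray dots in the figure and the smoothed series to the black line.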

Figure 4.

Distributions of the RTE-mobilized SINE MAR1 across G+C ranges in Monodelphis. Distribution across G+C regions of the Monodelphis genome of MAR1 (putatively RTE-3-mobilized). The horizontal axis shows G+C content in 5% bins, while the vertical axis shows the normalized densities of the TEs in that bin. For each TE, we categorized elements by age according to their similarity to their consensus sequence (“RSIM” in the legend) and plotted the distribution separately for each. RSIM = 70% indicates similarity to the consensus of 70%–75%, RSIM = 75% indicates 75%–80%, etc. Normalization of TE densities is described in Methods.
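The two classifications used in this figure, RSIM age classes and 5% G+C bins, can be sketched in Python as follows. The function names and the idea of passing a local sequence string per element are assumptions made for illustration. The normalization step is described only in the paper's Methods and is deliberately left out, so the sketch yields raw counts.

```python
from collections import Counter


def rsim_class(similarity_percent, width=5):
    """Lower edge of the similarity class, e.g. 72.4 -> 70 (meaning 70%-75%)."""
    return int(similarity_percent // width) * width


def gc_bin(sequence, width=5):
    """Lower edge of the G+C bin for a sequence, ignoring ambiguous bases."""
    seq = sequence.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no unambiguous bases")
    gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / acgt
    # A G+C content of exactly 100% is folded into the top bin.
    return min(int(gc_percent // width) * width, 100 - width)


def tabulate(elements):
    """Raw counts per (RSIM class, G+C bin); no normalization is applied.

    elements: iterable of (similarity_percent, local_sequence) pairs
              (hypothetical input layout).
    """
    return Counter((rsim_class(sim), gc_bin(seq)) for sim, seq in elements)


counts = tabulate([
    (72.4, "GGCCAATTAA"),
    (74.9, "GCGCATATAT"),
    (81.0, "GGGG"),
])
```

Plotting one curve per RSIM class from such a table, after the normalization the authors describe, would give the age-stratified distributions shown in the figure.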

Figure 5.

Sequence and conservation of the exapted SINE element MER131. (Top left) The MER131 consensus sequence. The putative Box-B promoter and poly(A) tail are highlighted in bold. (Top right) The distribution of pairwise similarities of the 200 most conserved MER131 sequences, both within Monodelphis and in syntenic regions of _Monodelphis_-human. (Bottom) A MER131 insertion on chromosome 2, with 100-bp flanking sequence on either side and degree of conservation across Monodelphis, human, mouse, rat, and chicken (the region shown is chr2: 359,497,570–359,498,703 from the UCSC genome browser Opossum January 2006 assembly). The MultiZ alignment score across all species is shown in black. Gray shaded areas are phastCons scores between Monodelphis and the individual species. The blocks labeled “Most Conserved” are predicted by phastCons (Siepel et al. 2005).

Figure 6.

Interspersed repetitive elements in _cis_-regulatory modules (CRMs) and evolutionarily conserved regions. The _Y_-axis shows the percentage of 77 human interspersed repeats (listed below the _X_-axis) in CRMs (black diamonds/line), compared with normalized proportions of the same repeats (gray line) in evolutionarily conserved regions (Siepel et al. 2005).
