Origin of sphinx, a young chimeric RNA gene in Drosophila melanogaster

Wen Wang et al. Proc Natl Acad Sci U S A. 2002.

Abstract

Non-protein-coding RNA genes play an important role in various biological processes. How new RNA genes originate, and whether their origin is governed by evolutionary mechanisms similar to those that give rise to protein-coding genes, remains unclear. A young chimeric RNA gene that we term sphinx (spx) provides the first insight into the early stage of evolution of RNA genes. spx originated as an insertion of a retroposed sequence of the ATP synthase chain F gene (located at cytological region 60DB) after the divergence of Drosophila melanogaster from its sibling species 2-3 million years ago. This retrosequence, which is located at 102F on the fourth chromosome, recruited a nearby exon and intron, thereby evolving a chimeric gene structure. This molecular process suggests that exon shuffling, a mechanism known to generate protein-coding genes, also plays a role in the origin of RNA genes. The subsequent evolution of spx has been associated with a high nucleotide substitution rate, possibly driven by continuous positive Darwinian selection for a novel function, as suggested by its sex- and development-specific alternative splicing. To test whether spx has adapted to different environments, we investigated its population genetic structure in the unique "Evolution Canyon" in Israel. This revealed a similar haplotype structure in spx across environments, implying that similar evolutionary forces operate on spx in each.

Figures

Figure 1

(a) FISH results showing two red signals (arrows) in D. melanogaster. (b) Southern hybridization results also show that D. melanogaster has two copies homologous to the probe, which is the cDNA of the ATP synthase chain F gene, whereas the other species have only one. (c) Phylogenetic tree of the D. melanogaster subgroup (36). The divergence times of some nodes and the emergence time of sphinx are indicated.

Figure 2

Alignment of the sphinx locus sequence of D. melanogaster with the sequences of the corresponding region of D. sechellia, the CG4692 cDNA of D. melanogaster, and the deduced partial cDNA of D. simulans. An asterisk indicates the start base of the spx transcripts, dashes indicate deletions, and dots show identical bases. The two exon sequences of the male-specific transcript (sphinx-m) are in capital letters. The two alternative polyadenylation signals (AATAAA) are in bold. All splicing sites (gt or cag) are indicated by boldface and vertical arrows. The retroposed sequence is flanked by short direct repeats of TTCG, which are double underlined and indicated by horizontal arrows. The region homologous to the ATP synthase gene is underlined. The remaining retroposed sequence is homologous to the terminal inverted repeat of the S element; the recipient splicing site was therefore provided by the S element sequence.

Figure 3

(a) Gene structure of sphinx and its parental gene, ATP synthase chain F. Blank blocks represent the exon of the ATP synthase gene. Black blocks represent the terminal inverted repeat sequence of the S element, a small part (27 bp) of which is recruited into the second exon of sphinx. The other striped blocks are exon sequences that are endogenous to the fourth chromosome. Some important changes in the retroposed ATP synthase sequence are indicated, including the change of the start codon into GTG, the introduction of a stop codon, and three deletions (d1 = 42 bp, d2 = 1 bp, d3 = 4 bp). The positions of the two primers used for RT-PCR are indicated by arrows. Splicing sites are marked by GT or CAG. Polyadenylation sites are marked by AATAAA. (b) mRNA species resulting from alternative polyadenylation and splicing. sphinx-m is detected in males and eggs, sphinx-f in females and eggs, sphinx-s in both females and males, and the unspliced transcript in males, larvae, and pupae. (c) RT-PCR results. “−” indicates the RT-PCR-negative controls, which are identical to the positive (+) reactions except that reverse transcriptase is omitted. The last lane is a positive control using genomic DNA as the template. The primer locations on the gene are shown in a.

References

    1. Gesteland R F, Cech T R, Atkins J F. The RNA World. 2nd Ed. Plainview, NY: Cold Spring Harbor Lab. Press; 1999.
    2. Eddy S R. Curr Opin Genet Dev. 1999;9:695–699.
    3. Eddy S R. Nat Rev Genet. 2001;2:919–929.
    4. Erdmann V A, Szymanski M, Hochberg A, de Groot N, Barciszewski J. Nucleic Acids Res. 2000;28:197–200.
    5. Huttenhofer A, Kiefmann M, Meier-Ewert S, O'Brien J, Lehrach H, Bachellerie J P, Brosius J. EMBO J. 2001;20:2943–2953.
