Irina Yushenova | Marine Biological Laboratory
Papers by Irina Yushenova
Biochemistry (Moscow), Oct 31, 2023
Reverse transcriptases (RT), or RNA-dependent DNA polymerases, are unorthodox enzymes that originally added a new angle to the conventional view of the unidirectional flow of genetic information in the cell from DNA to RNA to protein. First discovered in vertebrate retroviruses, RTs were since rediscovered in most eukaryotes, bacteria, and archaea, spanning essentially all domains of life. For retroviruses, RTs provide the ability to copy the RNA genome into DNA for subsequent incorporation into the host genome, which is essential for their replication and survival. In cellular organisms, most RT sequences originate from retrotransposons, the type of self-replicating genetic elements that rely on reverse transcription to copy and paste their sequences into new genomic locations. Some retroelements, however, can undergo domestication, eventually becoming a valuable addition to the overall repertoire of cellular enzymes. They can be beneficial yet accessory, like the diversity-generating elements, or even essential, like the telomerase reverse transcriptases. Nowadays, ever-increasing numbers of domesticated RT-carrying genetic elements are being discovered. It may be argued that domesticated RTs and reverse transcription in general are more widespread in cellular organisms than previously thought, and that many important cellular functions, such as chromosome end maintenance, may evolve from an originally selfish process of converting RNA into DNA.
Nature Communications, Feb 28, 2022
DNA modifications are used to regulate gene expression and defend against invading genetic elements. In eukaryotes, modifications predominantly involve C5-methylcytosine (5mC) and occasionally N6-methyladenine (6mA), while bacteria frequently use N4-methylcytosine (4mC) in addition to 5mC and 6mA. Here we report that 4mC can serve as an epigenetic mark in eukaryotes. Bdelloid rotifers, tiny freshwater invertebrates with transposon-poor genomes rich in foreign genes, lack canonical eukaryotic C5-methyltransferases for 5mC addition, but encode an amino-methyltransferase, N4CMT, captured from bacteria >60 Mya. N4CMT deposits 4mC at active transposons and certain tandem repeats, and fusion to a chromodomain shapes its "histone-read-DNA-write" architecture recognizing silent chromatin marks. Furthermore, amplification of SETDB1 H3K9me3 histone methyltransferases yields variants preferentially binding 4mC-DNA, suggesting "DNA-read-histone-write" partnership to maintain chromatin-based silencing. Our results show how non-native DNA methyl groups can reshape epigenetic systems to silence transposons and demonstrate the potential of horizontal gene transfer to drive regulatory innovation in eukaryotes.
Retrotransposons and Human Disease
Insect Molecular Biology, 2016
This dataset includes:
1) Diploid genome assembly generated for the A. vaga L1 genome with SPAdes (version 3.6.0) from MiSeq reads (file 'Adineta.vaga.L1.contigs.fasta').
2) Haploid sub-assembly of the A. vaga L1 genome (file 'Adineta.vaga.L1.haploid.segments.fasta'). This is a haploid representation of the L1 genome containing mostly non-redundant genomic segments extracted from the L1 diploid contigs.
3) Annotation of protein-coding genes produced for the A. vaga L1 diploid contigs in the GTF format (78,303 gene models from 75,877 loci; file 'Adineta.vaga.L1.contigs.annotation.gtf').
4) List of gene models from the filtered gene set (file 'Adineta.vaga.L1.contigs.annotation.transcripts.filtered.txt'). This set includes 61,531 gene models retained after filtering aimed at removing pseudogenes and putative annotation errors. For loci with more than one gene mo...
Alignments of HiSeq reads for A. vaga individuals L1-L11 to A. vaga mitochondrial contigs in the BAM format. HiSeq reads for A. vaga individuals L1-L11 were aligned to the mitochondrial contig of individual L4, used as a reference. The sequence of the L4 mitochondrial contig in the FASTA format is also uploaded (file 'L4.mito.contigs.fasta'). For individual L1, which carries a divergent mitochondrial haplotype, we also provide alignments of L1 HiSeq reads to L1 mitochondrial contigs (file 'L1.vs.L1.mito.contigs.bam'). L1 mitochondrial contigs were recovered from the L1 diploid assembly generated using MiSeq reads and can be found in the file 'L1.mito.contigs.fasta'. Alignments were obtained using Bowtie 2 (version 2.3.2) with the parameters "--no-mixed --no-discordant", specifying a maximum insert size of 800 base pairs, with only a single best alignment of each read pair reported. BAM alignments were ...
Sequences of mitochondrial contigs for A. vaga individuals L1-L11 extracted from whole-genome assemblies using the mitochondrial contig from the first published A. vaga genome assembly (Flot et al., 2013) as the query for blastn search. For L1, mitochondrial contigs were extracted from the MiSeq-based L1 diploid assembly (N50=22,073 bp). For individuals L2-L11, highly fragmented assemblies (with N50 in the range 1,600-3,400 bp) were generated with SPAdes (version 3.6.0) from HiSeq reads. Mitochondrial sequences for each individual, L1-L11, are in separate FASTA files.
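The N50 values quoted above summarize assembly contiguity: N50 is the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches half of the total assembly size. The following is a minimal sketch of that standard definition, not the authors' pipeline; the function name `n50` and the example lengths are illustrative assumptions.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths (in bp).

    Standard definition: sort contigs from longest to shortest and return
    the length of the contig at which the cumulative length first reaches
    at least half of the total assembly length.
    """
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Hypothetical contig lengths, for illustration only.
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

A fragmented assembly with many short contigs yields a low N50, which is consistent with the description of the L2-L11 HiSeq assemblies as highly fragmented relative to the MiSeq-based L1 assembly.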
This dataset includes SNPs detected among 11 A. vaga individuals L1-L11 (SNP dataset I). SNPs were called with SAMtools/BCFtools (v.1.4.1) from alignments of HiSeq reads to the L1 haploid sub-assembly. File 'A.vaga.L1_L11.SNP.dataset.I.raw.vcf' contains raw variant calls (raw INDEL entries are also present). Filtered SNPs (stringent SNP dataset I) are contained in the file 'A.vaga.L1_L11.SNP.dataset.I.filtered.vcf'. Data are in the VCF format.
retrotransposons in the bdelloid rotifer Adineta vaga exhibit unusual structural features and play a role in expansion of host gene families. Arkhipova et al. Mobile DNA 2013, 4:19
In eukaryotes, 5-methylcytosine is the predominant DNA base modification, followed by N6-methyladenine. However, N4-methylcytosine (4mC) is confined to bacteria. Here we report that 4mC can serve as an epigenetic mark in eukaryotes. Bdelloid rotifers, freshwater invertebrates with transposon-poor genomes that are rich in foreign genes, lack C5-methyltransferases but encode an amino-methyltransferase, N4CMT, captured from bacteria >60 Mya. N4CMT introduces 4mC into DNA, and its chromodomain shapes the "histone-read-DNA-write" architecture together with a "DNA-read-histone-write" SETDB1/eggless H3K9me3 histone methyltransferase variant preferentially binding 4mC-DNA, to maintain 4mC and silent chromatin at transposons and tandem repeats. Our results bring the third base modification into the eukaryotic repertoire, demonstrate how non-native DNA methyl groups can reshape complex epigenetic systems to suppress transposon proliferation, and establish horizontal gen...
Molecular Biology and Evolution, 2021
Penelope-like elements (PLEs) are an enigmatic clade of retrotransposons whose reverse transcriptases (RTs) share a most recent common ancestor with telomerase RTs. The single ORF of canonical endonuclease (EN)+ PLEs encodes RT and a C-terminal GIY–YIG EN that enables intrachromosomal integration, whereas EN− PLEs lack EN and are generally restricted to chromosome termini. EN+ PLEs have only been found in animals, except for one case of horizontal transfer to conifers, whereas EN− PLEs occur in several kingdoms. Here, we report a new, deep-branching PLE clade with a permuted domain order, whereby an N-terminal GIY–YIG EN is linked to a C-terminal RT by a short domain with a characteristic CxC motif. These N-terminal EN+ PLEs share a structural organization, including pseudo-LTRs and complex tandem/inverted insertions, with canonical EN+ PLEs from Penelope/Poseidon, Neptune, and Nematis clades, and show insertion bias for microsatellites, but lack canonical hammerhead ribozyme motifs...
Nature Communications, 2020
Sexual reproduction is almost ubiquitous among extant eukaryotes. As most asexual lineages are short-lived, abandoning sex is commonly regarded as an evolutionary dead end. Still, putative anciently asexual lineages challenge this view. One of the most striking examples is the bdelloid rotifers, microscopic freshwater invertebrates believed to have completely abandoned sexual reproduction tens of Myr ago. Here, we compare whole genomes of 11 wild-caught individuals of the bdelloid rotifer Adineta vaga and present evidence that some patterns in its genetic variation are incompatible with strict clonality and lack of genetic exchange. These patterns include genotype proportions close to Hardy-Weinberg expectations within loci, lack of linkage disequilibrium between distant loci, incongruent haplotype phylogenies across the genome, and evidence for hybridization between divergent lineages. Analysis of triallelic sites independently corroborates these findings. Our results provide evidence f...
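For readers unfamiliar with the first of these patterns, Hardy-Weinberg expectations at a biallelic locus follow from the allele frequencies alone: with allele frequencies p and q = 1 - p, the expected genotype proportions are p², 2pq, and q². The sketch below illustrates that textbook calculation only; it is not the analysis used in the paper, and the function name and genotype counts are hypothetical.

```python
def hardy_weinberg_expected(n_aa, n_ab, n_bb):
    """Expected genotype counts (AA, AB, BB) under Hardy-Weinberg equilibrium.

    Allele frequencies are estimated from the observed genotype counts at a
    single biallelic locus; the expected proportions p^2, 2pq, q^2 are then
    scaled by the number of individuals.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotyped individual is required")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    return (p * p * n, 2 * p * q * n, q * q * n)


if __name__ == "__main__":
    # Hypothetical counts for 11 individuals, for illustration only.
    observed = (5, 4, 2)
    expected = hardy_weinberg_expected(*observed)
    print("observed:", observed)
    print("expected:", tuple(round(x, 2) for x in expected))
```

Under strict clonality, heterozygosity is expected to depart from these proportions, which is why genotype counts close to the expectation are informative.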
Genome Biology and Evolution, 2019
Transposable elements (TEs) are ubiquitous in both prokaryotes and eukaryotes, and the dynamic character of their interaction with host genomes brings about numerous evolutionary innovations and shapes genome structure and function in a multitude of ways. In traditional classification systems, TEs are often depicted in simplistic ways, based primarily on the key enzymes required for transposition, such as transposases/recombinases and reverse transcriptases. Recent progress in whole-genome sequencing and long-read assembly, combined with expansion of the familiar range of model organisms, resulted in identification of unprecedentedly long transposable units spanning dozens or even hundreds of kilobases, initially in prokaryotic and more recently in eukaryotic systems. Here, we focus on such oversized eukaryotic TEs, including retrotransposons and DNA transposons, outline their complex and often combinatorial nature and closely intertwined relationship with viruses, and discuss their potential for participating in transfer of long stretches of DNA in eukaryotes.
Sexual reproduction, which involves alternation of meiosis and syngamy, is the ancestral condition of extant eukaryotes. Transitions to asexual reproduction were numerous, but most of the resulting eukaryotic lineages are rather short-lived. Still, there are several exceptions to this rule, including darwinulid ostracods [1,2] and Timema stick insects [3]. The most striking of them are bdelloid rotifers [4–6], microscopic freshwater invertebrates which underwent an extensive adaptive radiation after apparently losing meiosis over 10 Mya. Indeed, both the lack of males in numerous bdelloid species and the lack of proper homology between chromosomes [6] rule out ordinary sex. However, this does not exclude the possibility of some other mode of interindividual genetic exchange and recombination in their populations [7]. Recent analyses based on a few loci suggested genetic exchanges in this group [8,9], although this has been controversial [10]. Here, we compare complete genomes of 11 individuals from the wild p...