Ignacio Marín | CSIC (Consejo Superior de Investigaciones Científicas-Spanish National Research Council)
Papers by Ignacio Marín
Biology
This study establishes the origin and evolutionary history of the synuclein genes. A combination of phylogenetic analyses of the synucleins from twenty-two model species, characterization of local synteny similarities among humans, sharks and lampreys, and statistical comparisons among lamprey and human chromosomes, provides conclusive evidence for the current diversity of synuclein genes arising from the whole-genome duplications (WGDs) that occurred in vertebrates. An ancestral synuclein gene was duplicated in a first WGD, predating the diversification of all living vertebrates. The two resulting genes are still present in agnathan vertebrates. The second WGD, specific to the gnathostome lineage, led to the emergence of the three classical synuclein genes, SNCA, SNCB and SNCG, which are present in all jawed vertebrate lineages. Additional WGDs have added new genes in both agnathans and gnathostomes, while some gene losses have occurred in particular species. The emergence of synuc...
H the elements were obtained. Sometimes, letters have been added at the end to name different elements present in the same sequence. Numbers in the branches refer to bootstrap support (in percentages) for two different methods, neighbor-joining (NJ, top) and maximum parsimony (MP, bottom). The results of both methods were, in this case and the ones in the next two figures, almost identical, so they can be shown in a single tree. Arrows point to the ten elements without frameshifts or stop codons in their ORFs (discussed in the text). Copyright information: Taken from "How Athila retrotransposons survive in the genome", http://www.biomedcentral.com/1471-2164/9/219, BMC Genomics 2008;9:219. Published online 14 May 2008. PMCID: PMC2410132.
E 1. Copyright information: Taken from "How Athila retrotransposons survive in the genome", http://www.biomedcentral.com/1471-2164/9/219, BMC Genomics 2008;9:219. Published online 14 May 2008. PMCID: PMC2410132.
Biology, 12:1053, 2023
Simple Summary: Alpha-synuclein has been thoroughly analyzed due to its relevance to familial Parkinson disease and other synucleinopathies. In this study, I determine the origin of the synuclein genes in all vertebrates. Contrary to previous assumptions, these genes are not the result of individual gene duplications. They are ohnologs that emerged in several whole-genome duplications that occurred throughout vertebrate history.
Abstract: This study establishes the origin and evolutionary history of the synuclein genes. A combination of phylogenetic analyses of the synucleins from twenty-two model species, characterization of local synteny similarities among humans, sharks, and lampreys, and statistical comparisons among lamprey and human chromosomes, provides conclusive evidence for the current diversity of synuclein genes arising from the whole-genome duplications (WGDs) that occurred in vertebrates. An ancestral synuclein gene was duplicated in a first WGD, predating the diversification of all living vertebrates. The two resulting genes are still present in agnathan vertebrates. The second WGD, specific to the gnathostome lineage, led to the emergence of the three classical synuclein genes: SNCA, SNCB, and SNCG, which are present in all jawed vertebrate lineages. Additional WGDs have added new genes in both agnathans and gnathostomes, while some gene losses have occurred in particular species. The emergence of synucleins through WGDs prevented these genes from experiencing dosage effects, thus avoiding the potential detrimental effects associated with individual duplications of genes that encode proteins prone to aggregation. Additional insights into the structural and functional features of synucleins are gained through the analysis of the highly divergent synuclein proteins present in chondrichthyans and agnathans.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License
Genetics, 1996
Several estimators have been developed for assessing the number of sterility factors in a chromosome based on the sizes of fertile and sterile introgressed fragments. Assuming that two factors are required for producing sterility, simulations show that one of these, twice the inverse of the relative size of the largest fertile fragment, provides good average approximations when as few as five fertile fragments are analyzed. The estimators have been used for deducing the number of factors from previous data on several pairs of species. A particular result contrasts with the authors' interpretations: instead of the high number of sterility factors suggested, only a few per autosome are estimated in both reciprocal crosses involving Drosophila buzzatii and D. koepferae. It has been possible to map these factors, between three and six per chromosome, in the autosomes 3 and 4 of these species. Out of 203 introgressions of different fragments or combinations of fragments, the outcome o...
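The estimator singled out in this abstract, twice the inverse of the relative size of the largest fertile fragment, can be written as a short calculation. The following is only a sketch under stated assumptions: the function name, the argument names and the example fragment sizes are hypothetical, and fragment sizes are assumed to be expressed in the same units as the chromosome length. It is not the authors' code.

```python
def estimate_sterility_factors(fertile_fragment_sizes, chromosome_length=1.0):
    """Estimate the number of sterility factors in a chromosome.

    Implements the estimator described in the abstract:
    2 / (largest fertile fragment size / chromosome length).
    All names and units here are illustrative assumptions.
    """
    if not fertile_fragment_sizes:
        raise ValueError("at least one fertile fragment is required")
    largest = max(fertile_fragment_sizes)
    if largest <= 0 or largest > chromosome_length:
        raise ValueError("fragment sizes must be in (0, chromosome_length]")
    relative_size = largest / chromosome_length
    return 2.0 / relative_size


# Hypothetical example: five fertile fragments, sizes given as
# fractions of the chromosome; the largest covers half of it.
example_sizes = [0.1, 0.25, 0.5, 0.3, 0.2]
estimate = estimate_sterility_factors(example_sizes)
```

With the hypothetical sizes above, the largest fertile fragment covers half of the chromosome, so the estimate is 2 / 0.5 = 4 factors.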
Journal of Heredity, 1998
Molecular Biology and Evolution, 2006
The DJ-1 gene is extensively studied because of its involvement in familial Parkinson disease. DJ-1 belongs to a complex superfamily of genes that includes both prokaryotic and eukaryotic representatives. We determine that many prokaryotic groups, such as proteobacteria, cyanobacteria, spirochaetes, firmicutes, or fusobacteria, have genes, often incorrectly called "Thij," that are very close relatives of DJ-1, to the point that they cannot be clearly separated from the eukaryotic DJ-1 genes by phylogenetic analyses of their sequences. In addition, and contrary to a previous study that suggested that DJ-1 genes were animal specific, we show that DJ-1 genes are found in at least 5 of the 6 main eukaryotic groups: opisthokonta (both animals and fungi), plantae, chromalveolata, excavata, and amoebozoa. Our results thus provide strong evidence for DJ-1 genes originating before the origin of eukaryotes. Interestingly, we found that some fungal species, among them the model yeast Schizosaccharomyces pombe, have DJ-1-like genes, most likely orthologous to the animal genes. This finding opens new ways for the analysis of the functions of this group of genes.
Molecular Biology and Evolution, 2010
In a previous work, we characterized a gene, called Gypsy Integrase 1 (GIN1), which encodes a protein very similar to the integrase domains present in Gypsy/Ty3 retrotransposons. I describe here a paralog of GIN1, GIN2, and show that both genes are present in multiple vertebrates and that a likely homolog is found in urochordates. Surprisingly, phylogenetic and structural analyses support the counterintuitive idea that the GIN genes did not directly derive from retrotransposons but from a novel type of animal-specific DNA transposons, the GIN elements. These elements, described for the first time in this study, are characterized by containing a gene that encodes a protein that is also very similar to Gypsy/Ty3 integrases. It turns out that the sequences of the integrases encoded by GIN1 and GIN2 are more similar to those found in GIN elements than to those detected in retrotransposons. Moreover, several introns are in the same positions in the integrase-encoding genes of some GIN elements, GIN1 and GIN2. The simplest explanation for these results is that GIN elements appeared early in animal evolution by co-option of the integrase of a retrotransposon, they later expanded in multiple animal lineages, and, eventually, gave rise to the GIN genes. In summary, GIN transposons may be the "missing link" that explains how GIN genes evolved from retrotransposons. GIN1 and GIN2 may have contributed to controlling the expansion of GIN elements and Gypsy/Ty3 retrotransposons in chordates.
Molecular Biology and Evolution, 2000
Dosage compensation in Drosophila is mediated by genes known as "male-specific lethals" (msls). Several msls, including male-specific lethal-3 (msl-3), encode proteins of unknown function. We cloned the Drosophila virilis msl-3 gene. Using the information provided by the sequences of the Drosophila melanogaster and D. virilis genes, we found that sequences of other species can be aligned along their entire lengths with msl-3. Among them, there are genes in yeasts (the Schizosaccharomyces pombe Alp13 gene, as well as a putative Alp13 homolog, found in Saccharomyces cerevisiae) and in mammals (MRG15 and MSL3L1 and their relatives) plus uncharacterized sequences of the nematode Caenorhabditis elegans and the plants Arabidopsis thaliana, Lycopersicon esculentum, and Zea mays. A second Drosophila gene of this family has also been found. It is thus likely that msl-3-like genes are present in all eukaryotes. Phylogenetic analyses suggest that msl-3 is orthologous to the mammalian MSL3L1 genes, while the second Drosophila melanogaster gene (which we have called Dm MRG15) is orthologous to mammalian MRG15. These analyses also suggest that the msl-3/MRG15 duplication occurred after the fungus/animal split, while an independent duplication occurred in plants. The proteins encoded by these genes have similar structures, including a putative chromodomain close to their N-terminal end and a putative leucine zipper at their C-terminus. The possible functional roles of these proteins are discussed.
Molecular Biology and Evolution, 2000
We performed a comprehensive analysis of the evolution of the Ty3/Gypsy group of long-terminal-repeat retrotransposons (also known as Metaviridae). Exhaustive database searches allowed us to detect novel elements of this group. In particular, the Arabidopsis thaliana and Drosophila melanogaster genome sequencing projects have recently disclosed a large number of new Ty3/Gypsy sequences. So far, elements of three different Ty3/Gypsy lineages had been described for A. thaliana. Here, we describe six new lineages, which we have called Tit-for-tat1, Tit-for-tat2, Gimli, Gloin, Legolas, and Little Athila. We confirm that plant Ty3/Gypsy elements form two main monophyletic groups. Moreover, our results suggest that at least four independent ancestral lineages existed before the monocot-dicot split, about 200 MYA. Twelve sequences from D. melanogaster that may correspond to new elements are also described. Some of these sequences are similar to those of Osvaldo and Ulysses, two elements of the Osvaldo clade that had never before been described for D. melanogaster. Comparative analyses of multiple organisms, some of them with completely sequenced genomes, show that the number of lineages of Ty3/Gypsy elements is very variable. Thus, while only 1 lineage is present in Saccharomyces cerevisiae, at least 6 exist in Caenorhabditis elegans, at least 9 are present in A. thaliana, and perhaps 20 are present in D. melanogaster. Finally, we suggest that the presence of a chromodomain-containing integrase, a feature of some closely related Ty3/Gypsy elements of fungi, plants, and animals, may be used to define a new Metaviridae genus.
Molecular Biology and Evolution, 1998
We took advantage of the massive amount of sequence information generated by the Caenorhabditis elegans genome project to perform a comprehensive analysis of a group of over 100 related sequences that has allowed us to describe two new C. elegans non-LTR retrotransposons. We named them Sam and Frodo. We also determined that several highly divergent subfamilies of both elements exist in C. elegans. It is likely that several master copies have been active at the same time in C. elegans, although only a few copies of both Sam and Frodo have characteristics that are compatible with them being active today. We discuss whether it is more appropriate under these circumstances to define only 2 elements corresponding to the most divergent groups of sequences or up to 16, considering each subfamily a different element. The C. elegans elements are related to other previously described non-LTR retrotransposons (CR1, found in different vertebrates; SR1, from the trematode Schistosoma; Q and T1, from the mosquito Anopheles). All of these elements, according to the analysis of their reverse transcriptases, form a monophyletic cluster that we call the "T1/CR1 subgroup." Elements of this subgroup are thus ancient components of the genome of animal species. However, we discuss the possibility that these elements may occasionally be horizontally transmitted.
Molecular Biology and Evolution, 2001
Dosage compensation in Drosophila is mediated by a complex of proteins and RNAs called the "compensasome." Two of the genes that encode proteins of the complex, maleless (mle) and males-absent-on-the-first (mof), respectively, belong to the DEAH helicase and MYST acetyltransferase gene families. We performed comprehensive phylogenetic and structural analyses to determine the evolutionary histories of these two gene families and thus to better understand the origin of the compensasome. All of the members of the DEAH and MYST families of the completely sequenced Saccharomyces cerevisiae and Caenorhabditis elegans genomes, as well as those so far (June 2000) found in Drosophila melanogaster (for which the euchromatic part of the genome has also been fully sequenced) and Homo sapiens, were analyzed. We describe a total of 39 DEAH helicases in these four species. Almost all of them can be grouped in just three main branches. The first branch includes the yeast PRP2, PRP16, PRP22, and PRP43 splicing factors and their orthologs in animal species. Each PRP gene has a single ortholog in metazoans. The second branch includes just four genes, found in yeast (Ecm16) and Drosophila (kurz) and their orthologs in humans and Caenorhabditis. The third branch includes (1) a single yeast gene (YLR419w); (2) six Drosophila genes, including maleless and spindle-E/homeless; (3) four human genes, among them the ortholog of maleless, which encodes RNA helicase A; and (4) three C. elegans genes, including orthologs of maleless and spindle-E. Thus, this branch has largely expanded in metazoans. We also show that, for the whole DEAH family, only MLE and its metazoan orthologs have acquired new protein domains since the fungi/animals split. We found a total of 17 MYST family proteins in the four analyzed species. We determined putative orthologs of mof in both C. elegans and H. sapiens, and we show that the most likely ortholog in yeast is the Sas2 gene. Moreover, a paralog of mof exists in Drosophila. All of these results, together with those found for a third member of the compensasome, msl-3, suggest that this complex emerged after the fungi/animals split and that it may be present in mammalian species. Both gene duplication and the acquisition of new protein modules may have played important roles in the origin of the compensasome.
Journal of Theoretical Biology, 1991
Several simple models are developed to calculate expected mating frequencies in ethological isolation experiments. They take into account the effect that the peculiar sexual behavior of Drosophila species can have in multiple-choice experiments. These models depend on only three basic parameters: male competitive ability (C), female receptivity (R) and the coefficient of female acceptance (A). Two types of model can be distinguished: (1) models with discrete preferences, in which A is a measure of the percentage of females accepting a particular kind of male and (2) models with continuous preferences, in which A represents the probability of acceptance for each courtship. It is demonstrated that the information rendered by just one experiment, although effective for determining whether sexual isolation exists, is insufficient to estimate its degree or to demonstrate that it is asymmetrical. Further developments of the models under more complex conditions as well as their implications for reinforcement and founder effect theories are discussed.
Journal of Theoretical Biology, 1997
It has been generally assumed that "choice experiments" are useful to measure sexual isolation between Drosophila strains or species. Theoretical models have demonstrated, however, that the results obtained using one of these designs, namely multiple-choice experiments, are insufficient to determine the degree of isolation, even under very favorable assumptions. In this work, a simple behavioral model is developed to test whether male-choice experiments can be used to measure sexual isolation in Drosophila. This model shows that, although the outcome of male-choice experiments is affected by differences in female receptivities, a procedure to estimate the minimum degree of isolation using this experimental design can be established. The application of the methods derived from the theoretical model to previously reported experimental data demonstrates that a substantial degree of isolation frequently exists intraspecifically, while isolation is far from complete interspecifically. These results have important implications for discussions based on the comparative analysis of Drosophila behavior, both intra- and interspecifically. Most especially, they are in contradiction with the expectations of the Recognition concept of species.
Bioinformatics, 2004
Motivation: Generation of fast tools of hierarchical clustering to be applied when distances among elements of a set are constrained, causing frequent distance ties, as happens in protein interaction data. Results: We present in this work the program UVCLUSTER, that iteratively explores distance datasets using hierarchical clustering. Once the user selects a group of proteins, UVCLUSTER converts the set of primary distances among them (i.e. the minimum number of steps, or interactions, required to connect two proteins) into secondary distances that measure the strength of the connection between each pair of proteins when the interactions for all the proteins in the group are considered. We show that this novel strategy has advantages over conventional clustering methods to explore protein–protein interaction data. UVCLUSTER easily incorporates the information of the largest available interaction datasets to generate comprehensive primary distance tables. The versatility, simplicity ...
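The primary distance defined in this abstract (the minimum number of interactions needed to connect two proteins) is a shortest-path length in an unweighted interaction graph, which a breadth-first search can compute. The sketch below illustrates only that definition. It is not UVCLUSTER's implementation, it does not attempt the secondary distances, and the function name and the protein labels are hypothetical.

```python
from collections import deque


def primary_distances(interactions, source):
    """Return the minimum number of interactions separating `source`
    from every protein reachable from it.

    `interactions` is an iterable of (protein_a, protein_b) pairs,
    treated as undirected edges. Names are illustrative assumptions.
    """
    # Build an undirected adjacency map from the interaction pairs.
    graph = {}
    for a, b in interactions:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    if source not in graph:
        raise KeyError(f"{source!r} has no recorded interactions")

    # Breadth-first search: the first visit to a node is via a shortest path.
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


# Hypothetical interaction list with made-up protein labels.
example_interactions = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E")]
example_distances = primary_distances(example_interactions, "A")
```

Because every interaction counts as one step, many protein pairs end up at the same distance, which is the source of the frequent distance ties mentioned in the abstract.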
Genome Biology and Evolution, 2020
The evolution of the tumor necrosis factor superfamily (TNFSF) in early vertebrates is inferred by comparing the TNFSF genes found in humans and nine fishes: three agnathans, two chondrichthyans, three actinopterygians, and the sarcopterygian Latimeria chalumnae. By combining phylogenetic and synteny analyses, the TNFSF sequences detected are classified into five clusters of genes and 24 orthology groups. A model for their evolution since the origin of vertebrates is proposed. Fifteen TNFSF genes emerged from just three progenitors due to the whole-genome duplications (WGDs) that occurred before the agnathan/gnathostome split. Later, gnathostomes not only kept most of the genes emerged in the WGDs but soon added several tandem duplicates. More recently, complex, lineage-specific patterns of duplications and losses occurred in different gnathostome lineages. In agnathan species only seven to eight TNFSF genes are detected, because this lineage soon lost six of the genes emerged in th...
Proceedings International Parallel and Distributed Processing Symposium
We have developed a new algorithm that allows the exhaustive determination of words of up to 12 nucleotides in DNA sequences. It is fast enough to be used at a genomic scale on a standard personal computer. As an example, we apply the algorithm to compare the counts of all 12-nucleotide words in human chromosomes 21 and 22, each of them more than 33 million nucleotides long. Chromosome-specific sequences are detected in less than 2 minutes, with any pair of chromosomes being analyzed at a rate of 45 million nucleotides (45 Mb) per minute. The words are long enough to allow further analyses of all significant sequences using conventional database searches. This makes it simple to establish the location and, often, the biological meaning of the selected words. As an example, we show here, for the comparison between human chromosomes 21 and 22, that all the sequences that are found at least 40 times in one chromosome but are absent in the other belong to just two different classes, namely tandem repeats or genes with characteristic, internally repetitive, coding regions. Other available versions of this program and further applications are discussed.
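The comparison described in this abstract (words found at least 40 times in one sequence but absent from the other) can be illustrated with a naive dictionary-based word count. The abstract does not describe the algorithm's internals, so this sketch is a swapped-in, much slower technique, not the authors' method. The function names, the handling of ambiguous bases and the toy sequences are assumptions.

```python
from collections import Counter


def count_words(sequence, k=12):
    """Count every overlapping k-nucleotide word in a DNA sequence.

    Words containing characters other than A, C, G, T are skipped
    (an assumption made here; the abstract does not say how such
    positions are handled).
    """
    sequence = sequence.upper()
    valid = set("ACGT")
    counts = Counter()
    for start in range(len(sequence) - k + 1):
        word = sequence[start:start + k]
        if set(word) <= valid:
            counts[word] += 1
    return counts


def specific_words(seq_a, seq_b, k=12, min_count=40):
    """Return words seen at least `min_count` times in seq_a and never in seq_b."""
    counts_a = count_words(seq_a, k)
    counts_b = count_words(seq_b, k)
    return {word: n for word, n in counts_a.items()
            if n >= min_count and word not in counts_b}


# Toy example with short words and a low threshold, for illustration only.
toy_a = "ACGT" * 5
toy_b = "TTTTGGGG"
toy_result = specific_words(toy_a, toy_b, k=4, min_count=5)
```

In the toy example, ACGT occurs five times in the first sequence and never in the second, so it is the only word reported. At chromosome scale this approach would need far more memory and time than the figures quoted in the abstract.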
Biology
This study establishes the origin and evolutionary history of the synuclein genes. A combination ... more This study establishes the origin and evolutionary history of the synuclein genes. A combination of phylogenetic analyses of the synucleins from twenty-two model species, characterization of local synteny similarities among humans, sharks and lampreys, and statistical comparisons among lamprey and human chromosomes, provides conclusive evidence for the current diversity of synuclein genes arising from the whole-genome duplications (WGDs) that occurred in vertebrates. An ancestral synuclein gene was duplicated in a first WGD, predating the diversification of all living vertebrates. The two resulting genes are still present in agnathan vertebrates. The second WGD, specific to the gnathostome lineage, led to the emergence of the three classical synuclein genes, SNCA, SNCB and SNCG, which are present in all jawed vertebrate lineages. Additional WGDs have added new genes in both agnathans and gnathostomes, while some gene losses have occurred in particular species. The emergence of synuc...
H the elements were obtained. Some times, letters have been added at the end to name different el... more H the elements were obtained. Some times, letters have been added at the end to name different elements present in the same sequence. Numbers in the branches refer to bootstrap support (in percentages) for two different methods, neighbor-joining (NJ, top) and maximum parsimony (MP, bottom). The results of both methods were, in this case and the ones in the next two figures, almost identical, so they can be shown in a single tree. Arrows points to the ten elements without frameshifts or stop codons in their ORFs (discussed in the text).<b>Copyright information:</b>Taken from "How Athila retrotransposons survive in the genome"http://www.biomedcentral.com/1471-2164/9/219BMC Genomics 2008;9():219-219.Published online 14 May 2008PMCID:PMC2410132.
E 1.<b>Copyright information:</b>Taken from "How Athila retrotransposons survive... more E 1.<b>Copyright information:</b>Taken from "How Athila retrotransposons survive in the genome"http://www.biomedcentral.com/1471-2164/9/219BMC Genomics 2008;9():219-219.Published online 14 May 2008PMCID:PMC2410132.
Biology, 12:1053, 2023
Simple Summary: Alpha-synuclein has been thoroughly analyzed due to its relevance to famili-al Pa... more Simple Summary: Alpha-synuclein has been thoroughly analyzed due to its relevance to famili-al Parkinson disease and other synucleopathies. In this study, I determine the origin of the synu-clein genes in all vertebrates. Contrary to previous assumptions, these genes are not the result of individual gene duplications. They are ohnologs that emerged in several whole-genome dupli-cations that occurred throughout vertebrate history.
Abstract: This study establishes the origin and evolutionary history of the synuclein genes. A combination of phylogenetic analyses of the synucleins from twenty-two model species, charac-terization of local synteny similarities among humans, sharks, and lampreys, and statistical comparisons among lamprey and human chromosomes, provides conclusive evidence for the current diversity of synuclein genes arising from the whole-genome duplications (WGDs) that occurred in vertebrates. An ancestral synuclein gene was duplicated in a first WGD, predating the diversification of all living vertebrates. The two resulting genes are still present in agnathan vertebrates. The second WGD, specific to the gnathostome lineage, led to the emergence of the three classical synuclein genes: SNCA, SNCB, and SNCG, which are present in all jawed verte-brate lineages. Additional WGDs have added new genes in both agnathans and gnathostomes, while some gene losses have occurred in particular species. The emergence of synucleins through WGDs prevented these genes from experiencing dosage effects, thus avoiding the po-tential detrimental effects associated with individual duplications of genes that encode proteins prone to aggregation. Additional insights into the structural and functional features of synucle-ins are gained through the analysis of the highly divergent synuclein proteins present in chon-drichthyans and agnathans.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Li... more This is an Open Access article distributed under the terms of the Creative Commons Attribution License
Genetics, 1996
Several estimators have been developed for assesing the number of sterility factors in a chromoso... more Several estimators have been developed for assesing the number of sterility factors in a chromosome based on the sizes of fertile and sterile introgressed fragments. Assuming that two factors are required for producing sterility, simulations show that one of these, twice the inverse of the relative size of the largest fertile fragment, provides good average approximations when as few as five fertile fragments are analyzed. The estimators have been used for deducing the number of factors from previous data on several pairs of species. A particular result contrasts with the authors' interpretations: instead of the high number of sterility factors suggested, only a few per autosome are estimated in both reciprocal crosses involving Drosophila buzzatii and D. koepferae. It has been possible to map these factors, between three and six per chromosome, in the autosomes 3 and 4 of these species. Out of 203 introgressions of different fragments or combinations of fragments, the outcome o...
Journal of Heredity, 1998
Molecular Biology and Evolution, 2006
The DJ-1 gene is extensively studied because of its involvement in familial Parkinson disease. DJ... more The DJ-1 gene is extensively studied because of its involvement in familial Parkinson disease. DJ-1 belongs to a complex superfamily of genes that includes both prokaryotic and eukaryotic representatives. We determine that many prokaryotic groups, such as proteobacteria, cyanobacteria, spirochaetes, firmicutes, or fusobacteria, have genes, often incorrectly called ''Thij,'' that are very close relatives of DJ-1, to the point that they cannot be clearly separated from the eukaryotic DJ-1 genes by phylogenetic analyses of their sequences. In addition, and contrary to a previous study that suggested that DJ-1 genes were animal specific, we show that DJ-1 genes are found in at least 5 of the 6 main eukaryotic groups: opisthokonta (both animals and fungi), plantae, chromalveolata, excavata, and amoebozoa. Our results thus provide strong evidence for DJ-1 genes originating before the origin of eukaryotes. Interestingly, we found that some fungal species, among them the model yeast Schizosaccharomyces pombe, have DJ-1-like genes, most likely orthologous to the animal genes. This finding opens new ways for the analysis of the functions of this group of genes.
Molecular Biology and Evolution, 2010
In a previous work, we characterized a gene, called Gypsy Integrase 1 (GIN1), which encodes a pro... more In a previous work, we characterized a gene, called Gypsy Integrase 1 (GIN1), which encodes a protein very similar to the integrase domains present in Gypsy/Ty3 retrotransposons. I describe here a paralog of GIN1 and GIN2 and show that both genes are present in multiple vertebrates and that a likely homolog is found in urochordates. Surprisingly, phylogenetic and structural analyses support the counterintuitive idea that the GIN genes did not directly derive from retrotransposons but from a novel type of animal-specific DNA transposons, the GIN elements. These elements, described for the first time in this study, are characterized by containing a gene that encodes a protein that is also very similar to Gypsy/Ty3 integrases. It turns out that the sequences of the integrases encoded by GIN1 and GIN2 are more similar to those found in GIN elements than to those detected in retrotransposons. Moreover, several introns are in the same positions in the integrase-encoding genes of some GIN elements, GIN1 and GIN2. The simplest explanation for these results is that GIN elements appeared early in animal evolution by co-option of the integrase of a retrotransposon, they later expanded in multiple animal lineages, and, eventually, gave rise to the GIN genes. In summary, GIN transposons may be the ''missing link'' that explain how GIN genes evolved from retrotransposons. GIN1 and GIN2 may have contributed to control the expansion of GIN elements and Gypsy/Ty3 retrotransposons in chordates.
Molecular Biology and Evolution, 2000
Dosage compensation in Drosophila is mediated by genes known as ''male-specific lethals'' (msls).... more Dosage compensation in Drosophila is mediated by genes known as ''male-specific lethals'' (msls). Several msls, including male-specific lethal-3 (msl-3), encode proteins of unknown function. We cloned the Drosophila virilis msl-3 gene. Using the information provided by the sequences of the Drosophila melanogaster and D. virilis genes, we found that sequences of other species can be aligned along their entire lengths with msl-3. Among them, there are genes in yeasts (the Schizosaccharomyces pombe Alp13 gene, as well as a putative Alp13 homolog, found in Saccharomyces cerevisae) and in mammals (MRG15 and MSL3L1 and their relatives) plus uncharacterized sequences of the nematode Caenorhabditis elegans and the plants Arabidopsis thaliana, Lycopersicon esculentum, and Zea mays. A second Drosophila gene of this family has also been found. It is thus likely that msl-3-like genes are present in all eukaryotes. Phylogenetic analyses suggest that msl-3 is orthologous to the mammalian MSL3L1 genes, while the second Drosophila melanogaster gene (which we have called Dm MRG15) is orthologous to mammalian MRG15. These analyses also suggest that the msl-3/MRG15 duplication occurred after the fungus/animal split, while an independent duplication occurred in plants. The proteins encoded by these genes have similar structures, including a putative chromodomain close to their N-terminal end and a putative leucine zipper at their C-terminus. The possible functional roles of these proteins are discussed.
Molecular Biology and Evolution, 2000
We performed a comprehensive analysis of the evolution of the Ty3/Gypsy group of long-terminal-repeat retrotransposons (also known as Metaviridae). Exhaustive database searches allowed us to detect novel elements of this group. In particular, the Arabidopsis thaliana and Drosophila melanogaster genome sequencing projects have recently disclosed a large number of new Ty3/Gypsy sequences. So far, elements of three different Ty3/Gypsy lineages had been described for A. thaliana. Here, we describe six new lineages, which we have called Tit-for-tat1, Tit-for-tat2, Gimli, Gloin, Legolas, and Little Athila. We confirm that plant Ty3/Gypsy elements form two main monophyletic groups. Moreover, our results suggest that at least four independent ancestral lineages existed before the monocot-dicot split, about 200 MYA. Twelve sequences from D. melanogaster that may correspond to new elements are also described. Some of these sequences are similar to those of Osvaldo and Ulysses, two elements of the Osvaldo clade that had never before been described for D. melanogaster. Comparative analyses of multiple organisms, some of them with completely sequenced genomes, show that the number of lineages of Ty3/Gypsy elements is very variable. Thus, while only 1 lineage is present in Saccharomyces cerevisiae, at least 6 exist in Caenorhabditis elegans, at least 9 are present in A. thaliana, and perhaps 20 are present in D. melanogaster. Finally, we suggest that the presence of a chromodomain-containing integrase, a feature of some closely related Ty3/Gypsy elements of fungi, plants, and animals, may be used to define a new Metaviridae genus.
Molecular Biology and Evolution, 1998
We took advantage of the massive amount of sequence information generated by the Caenorhabditis elegans genome project to perform a comprehensive analysis of a group of over 100 related sequences that has allowed us to describe two new C. elegans non-LTR retrotransposons. We named them Sam and Frodo. We also determined that several highly divergent subfamilies of both elements exist in C. elegans. It is likely that several master copies have been active at the same time in C. elegans, although only a few copies of both Sam and Frodo have characteristics that are compatible with them being active today. We discuss whether it is more appropriate under these circumstances to define only 2 elements corresponding to the most divergent groups of sequences or up to 16, considering each subfamily a different element. The C. elegans elements are related to other previously described non-LTR retrotransposons (CR1, found in different vertebrates; SR1, from the trematode Schistosoma; Q and T1, from the mosquito Anopheles). All of these elements, according to the analysis of their reverse transcriptases, form a monophyletic cluster that we call the "T1/CR1 subgroup." Elements of this subgroup are thus ancient components of the genome of animal species. However, we discuss the possibility that these elements may occasionally be horizontally transmitted.
Molecular Biology and Evolution, 2001
Dosage compensation in Drosophila is mediated by a complex of proteins and RNAs called the "compensasome." Two of the genes that encode proteins of the complex, maleless (mle) and males-absent-on-the-first (mof), belong to the DEAH helicase and MYST acetyltransferase gene families, respectively. We performed comprehensive phylogenetic and structural analyses to determine the evolutionary histories of these two gene families and thus to better understand the origin of the compensasome. All of the members of the DEAH and MYST families of the completely sequenced Saccharomyces cerevisiae and Caenorhabditis elegans genomes, as well as those so far (June 2000) found in Drosophila melanogaster (for which the euchromatic part of the genome has also been fully sequenced) and Homo sapiens, were analyzed. We describe a total of 39 DEAH helicases in these four species. Almost all of them can be grouped in just three main branches. The first branch includes the yeast PRP2, PRP16, PRP22, and PRP43 splicing factors and their orthologs in animal species. Each PRP gene has a single ortholog in metazoans. The second branch includes just four genes, found in yeast (Ecm16) and Drosophila (kurz) and their orthologs in humans and Caenorhabditis. The third branch includes (1) a single yeast gene (YLR419w); (2) six Drosophila genes, including maleless and spindle-E/homeless; (3) four human genes, among them the ortholog of maleless, which encodes RNA helicase A; and (4) three C. elegans genes, including orthologs of maleless and spindle-E. Thus, this branch has largely expanded in metazoans. We also show that, for the whole DEAH family, only MLE and its metazoan orthologs have acquired new protein domains since the fungi/animals split. We found a total of 17 MYST family proteins in the four analyzed species. We determined putative orthologs of mof in both C. elegans and H. sapiens, and we show that the most likely ortholog in yeast is the Sas2 gene. Moreover, a paralog of mof exists in Drosophila. All of these results, together with those found for a third member of the compensasome, msl-3, suggest that this complex emerged after the fungi/animals split and that it may be present in mammalian species. Both gene duplication and the acquisition of new protein modules may have played important roles in the origin of the compensasome.
Journal of Theoretical Biology, 1991
Several simple models are developed to calculate expected mating frequencies in ethological isolation experiments. They take into account the effect that the peculiar sexual behavior of Drosophila species can have in multiple-choice experiments. These models depend on only three basic parameters: male competitive ability (C), female receptivity (R) and the coefficient of female acceptance (A). Two types of model can be distinguished: (1) models with discrete preferences, in which A is a measure of the percentage of females accepting a particular kind of male and (2) models with continuous preferences, in which A represents the probability of acceptance for each courtship. It is demonstrated that the information rendered by just one experiment, although effective for determining whether sexual isolation exists, is insufficient to estimate its degree or to demonstrate that it is asymmetrical. Further developments of the models under more complex conditions as well as their implications for reinforcement and founder effect theories are discussed.
Journal of Theoretical Biology, 1997
It has been generally assumed that "choice experiments" are useful to measure sexual isolation between Drosophila strains or species. Theoretical models have demonstrated, however, that the results obtained using one of these designs, namely multiple-choice experiments, are insufficient to determine the degree of isolation, even under very favorable assumptions. In this work, a simple behavioral model is developed to test whether male-choice experiments can be used to measure sexual isolation in Drosophila. This model shows that, although the outcome of male-choice experiments is affected by differences in female receptivities, a procedure to estimate the minimum degree of isolation using this experimental design can be established. The application of the methods derived from the theoretical model to previously reported experimental data demonstrates that a substantial degree of isolation frequently exists intraspecifically, while isolation is far from complete interspecifically. These results have important implications for discussions based on the comparative analysis of Drosophila behavior, both intra- and interspecifically. Most especially, they contradict the expectations of the Recognition concept of species.
Bioinformatics, 2004
Motivation: Generation of fast tools of hierarchical clustering to be applied when distances among elements of a set are constrained, causing frequent distance ties, as happens in protein interaction data. Results: We present in this work the program UVCLUSTER, which iteratively explores distance datasets using hierarchical clustering. Once the user selects a group of proteins, UVCLUSTER converts the set of primary distances among them (i.e. the minimum number of steps, or interactions, required to connect two proteins) into secondary distances that measure the strength of the connection between each pair of proteins when the interactions for all the proteins in the group are considered. We show that this novel strategy has advantages over conventional clustering methods to explore protein–protein interaction data. UVCLUSTER easily incorporates the information of the largest available interaction datasets to generate comprehensive primary distance tables. The versatility, simplicity ...
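The "primary distance" defined in the abstract above (the minimum number of interactions needed to connect two proteins) can be illustrated with a breadth-first search over an interaction graph. The following is only a minimal sketch of that definition, not UVCLUSTER's code: the function name `primary_distances` and the toy interaction list are hypothetical, and the conversion into secondary distances is not attempted because the abstract does not specify it.

```python
from collections import deque


def primary_distances(interactions):
    """Return {(a, b): steps} for every pair of connected proteins.

    `interactions` is an iterable of undirected (protein, protein) pairs.
    The distance is the minimum number of interactions linking a to b.
    """
    # Build an undirected adjacency map.
    graph = {}
    for a, b in interactions:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    distances = {}
    for source in graph:
        # Breadth-first search gives shortest paths in an unweighted graph.
        seen = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen[neighbor] = seen[node] + 1
                    queue.append(neighbor)
        for target, steps in seen.items():
            if target != source:
                distances[(source, target)] = steps
    return distances


# Hypothetical toy network: A-B, B-C, C-D, B-E.
toy = [("A", "B"), ("B", "C"), ("C", "D"), ("B", "E")]
dist = primary_distances(toy)
```

Because the distances are small integers, many pairs share the same value even in this toy network, which is the kind of distance tie the abstract mentions as a difficulty for conventional hierarchical clustering.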
Genome Biology and Evolution, 2020
The evolution of the tumor necrosis factor superfamily (TNFSF) in early vertebrates is inferred by comparing the TNFSF genes found in humans and nine fishes: three agnathans, two chondrichthyans, three actinopterygians, and the sarcopterygian Latimeria chalumnae. By combining phylogenetic and synteny analyses, the TNFSF sequences detected are classified into five clusters of genes and 24 orthology groups. A model for their evolution since the origin of vertebrates is proposed. Fifteen TNFSF genes emerged from just three progenitors due to the whole-genome duplications (WGDs) that occurred before the agnathan/gnathostome split. Later, gnathostomes not only kept most of the genes that emerged in the WGDs but soon added several tandem duplicates. More recently, complex, lineage-specific patterns of duplications and losses occurred in different gnathostome lineages. In agnathan species only seven to eight TNFSF genes are detected, because this lineage soon lost six of the genes emerged in th...
Proceedings International Parallel and Distributed Processing Symposium
We have developed a new algorithm that allows the exhaustive determination of words of up to 12 nucleotides in DNA sequences. It is fast enough to be used at a genomic scale on a standard personal computer. As an example, we apply the algorithm to compare the number of all 12-nucleotide-long words in human chromosomes 21 and 22, each of them more than 33 million nucleotides long. Chromosome-specific sequences are detected in less than 2 minutes, and any pair of chromosomes is analyzed at a rate of 45 million nucleotides (45 Mb) per minute. The words are long enough to allow further analyses of all significant sequences using conventional database searches. This makes it simple to establish the location and, often, the biological meaning of the selected words. As an example, we show here, for the comparison between human chromosomes 21 and 22, that all the sequences that are found at least 40 times in one chromosome but are absent in the other belong to just two different classes, namely tandem repeats or genes with characteristic, internally repetitive, coding regions. Other available versions of this program and further applications are discussed.
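The comparison described in the abstract above (words frequent in one sequence but absent from another) can be sketched with a rolling 2-bit encoding of nucleotides. This is an illustrative assumption about one way to count words exhaustively, not the algorithm of the paper: the names `count_words`, `decode` and `specific_words` are hypothetical, and a dictionary-based counter in Python will not approach the speed reported in the abstract.

```python
from collections import Counter

# Each nucleotide is packed into 2 bits, so a 12-nucleotide word fits in 24 bits.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def count_words(seq, k=12):
    """Count every k-nucleotide word in seq, keyed by its packed integer."""
    mask = (1 << (2 * k)) - 1
    counts = Counter()
    word = 0
    valid = 0  # consecutive unambiguous bases ending at the current position
    for base in seq.upper():
        code = CODE.get(base)
        if code is None:  # N or another ambiguity code: restart the window
            word = 0
            valid = 0
            continue
        word = ((word << 2) | code) & mask  # slide the window by one base
        valid += 1
        if valid >= k:
            counts[word] += 1
    return counts


def decode(word, k=12):
    """Turn a packed integer back into its nucleotide string."""
    return "".join("ACGT"[(word >> (2 * (k - 1 - i))) & 3] for i in range(k))


def specific_words(seq_a, seq_b, k=12, min_count=40):
    """Words found at least min_count times in seq_a and never in seq_b."""
    counts_a = count_words(seq_a, k)
    counts_b = count_words(seq_b, k)
    return sorted(
        decode(w, k)
        for w, n in counts_a.items()
        if n >= min_count and w not in counts_b
    )


# Toy usage with short words and a low threshold.
hits = specific_words("ACG" * 5, "CGATT", k=3, min_count=4)
```

The defaults `k=12` and `min_count=40` mirror the word length and threshold quoted in the abstract. With 4^12 possible words, a flat array indexed by the packed integer would be a natural alternative to the dictionary used here.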