Odile Lecompte - Academia.edu

Papers by Odile Lecompte

Ten Years of Collaborative Progress in the Quest for Orthologs

Molecular Biology and Evolution

Accurate determination of the evolutionary relationships between genes is a foundational challenge in biology. Homology (evolutionary relatedness) is in many cases readily determined based on sequence similarity analysis. By contrast, whether or not two genes directly descended from a common ancestor by a speciation event (orthologs) or duplication event (paralogs) is more challenging, yet provides critical information on the history of a gene. Since 2009, this task has been the focus of the Quest for Orthologs (QFO) Consortium. The sixth QFO meeting took place in Okazaki, Japan in conjunction with the 67th National Institute for Basic Biology conference. Here, we report recent advances, applications, and oncoming challenges that were discussed during the conference. Steady progress has been made toward standardization and scalability of new and existing tools. A feature of the conference was the presentation of a panel of accessible tools for phylogenetic profiling and several develo...

Novel Approach Combining Transcriptional and Evolutionary Signatures to Identify New Multiciliation Genes

Genes

Multiciliogenesis is a complex process that allows the generation of hundreds of motile cilia on the surface of specialized cells, to create fluid flow across epithelial surfaces. Dysfunction of human multiciliated cells is associated with diseases of the brain, airway and reproductive tracts. Despite recent efforts to characterize the transcriptional events responsible for the differentiation of multiciliated cells, many actors remain to be identified. In this work, we capitalize on the ever-growing quantity of high-throughput data to search for new candidate genes involved in multiciliation. After performing a large-scale screening using 10 transcriptomics datasets dedicated to multiciliation, we established a specific evolutionary signature involving Otomorpha fish to use as a criterion to select the most likely targets. Combining both approaches highlighted a list of 114 potential multiciliated candidates. We characterized these genes first by generating protein interaction ...

PROBE: analysis and visualization of protein block-level evolution

Bioinformatics (Oxford, England), Jan 7, 2018

Comparative studies of protein sequences are widely used in evolutionary and comparative genomics studies, but there is a lack of efficient tools to identify conserved regions ab initio within a protein multiple alignment. PROBE provides a fully automatic analysis of protein family conservation, to identify conserved regions, or 'blocks', that may correspond to structural/functional domains or motifs. Conserved blocks are identified at two different levels: (i) family level blocks indicate sites that are probably of central importance to the protein's structure or function, and (ii) sub-family level blocks highlight regions that may signify functional specialization, such as binding partners, etc. All conserved blocks are mapped onto a phylogenetic tree and can also be visualized in the context of the multiple sequence alignment. PROBE thus facilitates in-depth studies of sequence-structure-function-evolution relationships, and opens the way to block-level phylogenetic...

MyGeneFriends: A Social Network Linking Genes, Genetic Diseases, and Researchers

Journal of Medical Internet Research, Jun 16, 2017

The constant and massive increase of biological data offers unprecedented opportunities to decipher the function and evolution of genes and their roles in human diseases. However, the multiplicity of sources and flow of data mean that efficient access to useful information and knowledge production has become a major challenge. This challenge can be addressed by taking inspiration from Web 2.0 and particularly social networks, which are at the forefront of big data exploration and human-data interaction. MyGeneFriends is a Web platform inspired by social networks, devoted to genetic disease analysis, and organized around three types of proactive agents: genes, humans, and genetic diseases. The aim of this study was to improve exploration and exploitation of biological, postgenomic era big data. MyGeneFriends leverages conventions popularized by top social networks (Facebook, LinkedIn, etc.), such as networks of friends, profile pages, friendship recommendations, affinity scores, news ...

Proteome Evolution of Deep-Sea Hydrothermal Vent Alvinellid Polychaetes Supports the Ancestry of Thermophily and Subsequent Adaptation to Cold in Some Lineages

Genome Biology and Evolution, Feb 1, 2017

Temperature, perhaps more than any other environmental factor, is likely to influence the evolution of all organisms. It is also a very interesting factor for understanding how genomes are shaped by selection over evolutionary timescales, as it potentially affects the whole genome. Among thermophilic prokaryotes, temperature affects both codon usage and protein composition to increase the stability of the transcriptional/translational machinery, and the resulting proteins need to be functional at high temperatures. Among eukaryotes less is known about genome evolution, and the tube-dwelling worms of the family Alvinellidae represent an excellent opportunity to test hypotheses about the emergence of thermophily in ectothermic metazoans. The Alvinellidae are a group of worms that experience varying thermal regimes, presumably having evolved into these niches over evolutionary times. Here we analyzed 423 putative orthologous loci derived from 6 alvinellid species including the thermophilic...

Insights into ciliary genes and evolution from multi-level phylogenetic profiling

Molecular Biology and Evolution, Aug 28, 2017

Cilia (flagella) are important eukaryotic organelles, present in the Last Eukaryotic Common Ancestor, and are involved in cell motility and integration of extracellular signals. Ciliary dysfunction causes a class of genetic diseases known as ciliopathies; however, current knowledge of the underlying mechanisms is still limited and a better characterization of genes is needed. As cilia have been lost independently several times during evolution and they are subject to important functional variation between species, ciliary genes can be investigated through comparative genomics.

Standardized benchmarking in the quest for orthologs

Nature Methods, May 4, 2016

Achieving high accuracy in orthology inference is essential for many comparative, evolutionary and functional genomic analyses, yet the true evolutionary history of genes is generally unknown and orthologs are used for very different applications across phyla, requiring different precision-recall trade-offs. As a result, it is difficult to assess the performance of orthology inference methods. Here, we present a community effort to establish standards and an automated web-based service to facilitate orthology benchmarking. Using this service, we characterize 15 well-established inference methods and resources on a battery of 20 different benchmarks. Standardized benchmarking provides a way for users to identify the most effective methods for the problem at hand, sets a minimum requirement for new tools and resources, and guides the development of more accurate orthology inference methods.

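De l'analyse du génome de Pyrococcus abyssi à l'étude de protéines informationnelles : Stratégies de validation et d'exploitation des données en génomique comparative [From the analysis of the Pyrococcus abyssi genome to the study of informational proteins: strategies for validating and exploiting data in comparative genomics]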

PtdIns5P regulation through evolution: roles in membrane trafficking?

Trends in Biochemical Sciences, 2008

Température et plasticité du chromatisme de la coquille chez le mollusque pulmoné Helix aspersa [Temperature and plasticity of shell coloration in the pulmonate mollusc Helix aspersa]

C R Acad Sci Ser III Vie, 1998

Tex19 and Sectm1 concordant molecular phylogenies support co-evolution of both eutherian-specific genes

BMC Evolutionary Biology, 2015

Background: Transposable elements (TE) have attracted much attention since they shape the genome and contribute to species evolution. Organisms have evolved mechanisms to control TE activity. Testis expressed 19 (Tex19) represses TE expression in mouse testis and placenta. In the human and mouse genomes, Tex19 and Secreted and transmembrane 1 (Sectm1) are neighbors but are not homologs. Sectm1 is involved in immunity and its molecular phylogeny is unknown. Methods: Using multiple alignments of complete protein sequences (MACS), we inferred Tex19 and Sectm1 molecular phylogenies. Protein conserved regions were identified and folds were predicted. Finally, expression patterns were studied across tissues and species using RNA-seq public data and RT-PCR. Results: We present 2 high-quality alignments of 58 Tex19 and 58 Sectm1 protein sequences from 48 organisms. First, both genes are eutherian-specific, i.e., exclusively present in mammals except monotremes (platypus) and marsupials. Second, Tex19 and Sectm1 have both duplicated in Sciurognathi and Bovidae while they have remained as single copy genes in all other placental mammals. Phylogenetic concordance between both genes was significant (p-value < 0.05) and supported co-evolution and functional relationship. At the protein level, Tex19 exhibits 3 conserved regions and 4 invariant cysteines. In particular, a CXXC motif is present in the N-terminal conserved region. Sectm1 exhibits 2 invariant cysteines and an Ig-like domain. Strikingly, the Tex19 C-terminal conserved region was lost in Haplorrhini primates while a Sectm1 C-terminal extra domain was acquired. Finally, we have determined that Tex19 and Sectm1 expression levels anti-correlate across the testis of several primates (ρ = −0.72), which supports anti-regulation.
Conclusions: Tex19 and Sectm1 co-evolution and anti-regulated expression support a strong functional relationship between both genes. Since Tex19 exerts control over TE and Sectm1 plays a role in immunity, Tex19 might suppress an immune response directed against cells that show TE activity in eutherian reproductive tissues.

Genome-wide evidence for an essential role of the human Staf/ZNF143 transcription factor in bidirectional transcription

Nucleic Acids Research, 2011

In the human genome, ~10% of the genes are arranged head to head so that their transcription start sites reside within <1 kbp on opposite strands. In this configuration, a bidirectional promoter generally drives expression of the two genes. How bidirectional expression is performed from these particular promoters constitutes a puzzling question. Here, by a combination of in silico and biochemical approaches, we demonstrate that hStaf/ZNF143 is involved in controlling expression from a subset of divergent gene pairs. The binding sites for hStaf/ZNF143 (SBS) are overrepresented in bidirectional versus unidirectional promoters. Chromatin immunoprecipitation assays with a significant set of bidirectional promoters containing putative SBS revealed that 93% of them are associated with hStaf/ZNF143. Expression of dual reporter genes directed by bidirectional promoters is dependent on SBS integrity and requires hStaf/ZNF143. Furthermore, in some cases, functional SBS are located in bidirectional promoters of gene pairs encoding a noncoding RNA and a protein gene. Remarkably, hStaf/ZNF143 per se exhibits an inherently bidirectional transcription activity, and together our data provide the demonstration that hStaf/ZNF143 is indeed a transcription factor controlling the expression of divergent protein-protein and protein-non-coding RNA gene pairs.

KD4v: comprehensible knowledge discovery system for missense variant

Nucleic Acids Research, 2012

A major challenge in the post-genomic era is a better understanding of how human genetic alterations involved in disease affect the gene products. The KD4v server characterizes and predicts the phenotypic effects (deleterious/neutral) of missense variants. It relies on 16 predicates annotated by the MSV3d database: conservation, physico-chemical, functional and 3D structure. The server provides a set of rules learned by Inductive Logic Programming. These rules are interpretable by non-expert humans and are used to accurately predict the deleterious/neutral status of an unknown mutation.

Retinoic Acid Receptor Subtype-Specific Transcriptotypes in the Early Zebrafish Embryo

Molecular Endocrinology, 2014

Retinoic acid (RA) controls many aspects of embryonic development by binding to specific receptors (retinoic acid receptors [RARs]) that regulate complex transcriptional networks. Three different RAR subtypes are present in vertebrates and play both common and specific roles in transducing RA signaling. Specific activities of each receptor subtype can be correlated with its exclusive expression pattern, whereas shared activities between different subtypes are generally assimilated to functional redundancy. However, the question remains whether some subtype-specific activity still exists in regions or organs coexpressing multiple RAR subtypes. We tackled this issue at the transcriptional level using the early zebrafish embryo as a model. Using morpholino knockdown, we specifically invalidated the zebrafish endogenous RAR subtypes in an in vivo context. After building up a list of RA-responsive genes in the zebrafish gastrula through a whole-transcriptome analysis, we compared this panel of genes with those that still respond to RA in embryos lacking one or another RAR subtype. Our work reveals that RAR subtypes do not have fully redundant functions at the transcriptional level but can transduce RA signal in a subtype-specific fashion. As a result, we define RAR subtype-specific transcriptotypes that correspond to repertoires of genes activated by different RAR subtypes. Finally, we found genes of the RA pathway (cyp26a1, raraa) the regulation of which by RA is highly robust and can even resist the knockdown of all RARs. This suggests that RA-responsive genes are differentially sensitive to alterations in the RA pathway and, in particular, cyp26a1 and raraa are under a high pressure to maintain signaling integrity.

Detection and Characterisation of Mutations Responsible for Allele-Specific Protein Thermostabilities at the Mn-Superoxide Dismutase Gene in the Deep-Sea Hydrothermal Vent Polychaete Alvinella pompejana

Journal of Molecular Evolution, 2013

Alvinella pompejana (Polychaeta, Alvinellidae) is one of the most thermotolerant marine eukaryotes known to date. It inhabits chimney walls of deep-sea hydrothermal vents along the East Pacific Rise (EPR) and is exposed to various challenging conditions (e.g. high temperature, hypoxia and the presence of sulphides, heavy metals and radiations), which increase the production of dangerous reactive oxygen species (ROS). Two different allelic forms of a manganese-superoxide dismutase involved in ROS detoxification, ApMnSOD1 and ApMnSOD2, differing only by two substitutions (M110L and A138G), were identified in an A. pompejana cDNA library. RFLP screening of 60 individuals from different localities along the EPR showed that ApMnSOD2 was rare (2%) and only found in the heterozygous state. Dynamic light scattering measurements and residual enzymatic activity experiments showed that the most frequent form (ApMnSOD1) was the most resistant to temperature. Their half-lives were similarly long at 65°C (>110 min) but exhibited a twofold difference at 80°C (20.8 vs 9.8 min). Those properties are likely to be explained by the occurrence of an additional sulphur-containing hydrogen bond involving the M110 residue and the effect of the A138 residue on the backbone entropy. Our results confirm the thermophily of A. pompejana and suggest that this locus is a good model to study how the extreme thermal heterogeneity of the vent conditions may help to maintain old rare variants in those populations.

Genome-wide in Silico Identification of New Conserved and Functional Retinoic Acid Receptor Response Elements (Direct Repeats Separated by 5 bp)

Journal of Biological Chemistry, 2011

Background: Retinoic acid (RA) receptors regulate gene expression by binding to specific response elements (RAREs). Results: A collection of new DR5 RAREs located ±10 kb from TSSs and conserved among 6 or more vertebrate species has been amassed. Conclusion: We provide a wider knowledge base for analyzing RA target genes. Significance: The RA response of the conserved target genes differs between species and tissues.

Functional insights into the core-TFIIH from a comparative survey

Genomics, 2013

TFIIH is a eukaryotic complex composed of two subcomplexes, the CAK (Cdk activating kinase) and the core-TFIIH. The core-TFIIH, composed of seven subunits (XPB, XPD, P62, P52, P44, P34, and P8), plays a crucial role in transcription and repair. Here, we performed an extended sequence analysis to establish the accurate phylogenetic distribution of the core-TFIIH in 63 eukaryotic organisms. In spite of the high conservation of the seven subunits at the sequence and genomic levels, the non-enzymatic P8, P34, P52 and P62 are absent from one or a few unicellular species. To gain insight into their respective roles, we undertook a comparative genomic analysis of the whole proteome to identify the gene sets sharing similar presence/absence patterns. While little information was inferred for P8 and P62, our studies confirm the known role of P52 in repair and suggest for the first time the implication of the core TFIIH in mRNA splicing via P34.
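The idea of searching for genes that share a similar presence/absence pattern can be illustrated with a minimal sketch. It assumes each gene is encoded as a tuple of 0/1 values (absent/present) over the same ordered list of species and uses a plain Hamming distance; the abstract does not state which similarity measure the authors used, and the function names, gene names and toy profiles below are hypothetical.

```python
def hamming(p, q):
    """Count the species in which two presence/absence profiles differ."""
    if len(p) != len(q):
        raise ValueError("profiles must cover the same species")
    return sum(a != b for a, b in zip(p, q))


def similar_profiles(query, profiles, max_mismatches=1):
    """Return gene names whose profile differs from `query` in at most
    `max_mismatches` species, closest first (ties broken by name)."""
    hits = [(hamming(query, prof), name) for name, prof in profiles.items()]
    return [name for dist, name in sorted(hits) if dist <= max_mismatches]


# Hypothetical toy data: four species, 1 = gene present, 0 = gene absent.
toy_profiles = {
    "geneA": (1, 1, 0, 1),
    "geneB": (1, 1, 1, 1),
    "geneC": (0, 0, 1, 0),
}
candidates = similar_profiles((1, 1, 0, 1), toy_profiles)
```

A real analysis over 63 organisms would likely need a measure that accounts for phylogenetic relatedness between species, which this sketch ignores.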

Conjugating effects of symbionts and environmental factors on gene expression in deep-sea hydrothermal vent mussels

BMC Genomics, 2011

Background: The deep-sea hydrothermal vent mussel Bathymodiolus azoricus harbors thiotrophic and methanotrophic symbiotic bacteria in its gills. While the symbiotic relationship between this hydrothermal mussel and these chemoautotrophic bacteria has been described, the molecular processes involved in the cross-talking between symbionts and host, in the maintenance of the symbiosis, in the influence of environmental parameters on gene expression, and in transcriptome variation across individuals remain poorly understood. In an attempt to understand how, and to what extent, this double symbiosis affects host gene expression, we used a transcriptomic approach to identify genes potentially regulated by symbiont characteristics, environmental conditions or both. This study was done on mussels from two contrasting populations. Results: Subtractive libraries allowed the identification of about 1000 genes putatively regulated by symbiosis and/or environmental factors. Microarray analysis showed that 120 genes (3.5% of all genes) were differentially expressed between the Menez Gwen (MG) and Rainbow (Rb) vent fields. The total number of regulated genes in mussels harboring a high versus a low symbiont content did not differ significantly. With regard to the impact of symbiont content, only 1% of all genes were regulated by thiotrophic (SOX) and methanotrophic (MOX) bacteria content in MG mussels whereas 5.6% were regulated in mussels collected at Rb. MOX symbionts also impacted a higher proportion of genes than SOX in both vent fields. When host transcriptome expression was analyzed with respect to symbiont gene expression, it was related to symbiont quantity in each field. Conclusions: Our study has produced a preliminary description of a transcriptomic response in a hydrothermal vent mussel host of both thiotrophic and methanotrophic symbiotic bacteria.
This model can help to identify genes involved in the maintenance of symbiosis or regulated by environmental parameters. Our results provide evidence of symbiont effect on transcriptome regulation, with differences related to type of symbiont, even though the relative percentage of genes involved remains limited. Differences observed between the vent sites indicate that environment strongly influences transcriptome regulation and impacts both activity and relative abundance of each symbiont. Among all these genes, those participating in recognition, the immune system, oxidative stress, and energy metabolism constitute new promising targets for extended studies on symbiosis and the effect of environmental parameters on the symbiotic relationships in B. azoricus.

PARSEC: PAtteRn SEarch and Contextualization

Bioinformatics, 2013

We present PARSEC (PAtteRn Search and Contextualization), a new open source platform for guided discovery, allowing localization and biological characterization of short genomic sites in entire eukaryotic genomes. PARSEC can search for a sequence or a degenerated pattern. The retrieved set of genomic sites can be characterized in terms of (i) conservation in model organisms, (ii) genomic context (proximity to genes) and (iii) function of neighboring genes. These modules allow the user to explore, visualize, filter and extract biological knowledge from a set of short genomic regions such as transcription factor binding sites. Availability: Web site implemented in Java, JavaScript and C++, with all major browsers supported. Freely available at lbgi.fr/parsec. Source code is freely available at sourceforge.net/projects/genomicparsec.
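To make the notion of searching for a degenerate pattern concrete, here is a minimal sketch that is not PARSEC's own code or interface. It assumes the pattern is written with standard IUPAC nucleotide ambiguity codes, scans only the forward strand of a single sequence, and reports 0-based start positions; the function name and the example sequence are hypothetical.

```python
import re

# IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_sites(pattern, sequence):
    """Return 0-based start positions of all matches of a degenerate
    pattern on the forward strand, including overlapping matches."""
    regex = "".join(IUPAC[base] for base in pattern.upper())
    # A zero-width lookahead lets overlapping sites be reported.
    return [m.start() for m in re.finditer(f"(?=({regex}))", sequence.upper())]


# Hypothetical example: look for RGKTCA half-sites in a short sequence.
positions = find_sites("RGKTCA", "TTAGGTCAAAGGGTTCACC")
```

Scanning the reverse strand would require running the same search on the reverse complement, which is left out here to keep the sketch short.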

Whole-Exome Sequencing Identifies LRIT3 Mutations as a Cause of Autosomal-Recessive Complete Congenital Stationary Night Blindness

The American Journal of Human Genetics, 2013

Congenital stationary night blindness (CSNB) is a clinically and genetically heterogeneous retinal disorder. Two forms can be distinguished clinically: complete CSNB (cCSNB) and incomplete CSNB. Individuals with cCSNB have visual impairment under low-light conditions and show a characteristic electroretinogram (ERG). The b-wave amplitude is severely reduced in the dark-adapted state of the ERG, representing abnormal function of ON bipolar cells. Furthermore, individuals with cCSNB can show other ocular features such as nystagmus, myopia, and strabismus and can have reduced visual acuity and abnormalities of the cone ERG waveform. The mode of inheritance of this form can be X-linked or autosomal recessive, and the dysfunction of four genes (NYX, GRM6, TRPM1, and GPR179) has been described so far. Whole-exome sequencing in one simplex cCSNB case lacking mutations in the known genes led to the identification of a missense mutation (c.983G>A [p.Cys328Tyr]) and a nonsense mutation (c.1318C>T [p.Arg440*]) in LRIT3, encoding leucine-rich-repeat (LRR), immunoglobulin-like, and transmembrane-domain 3 (LRIT3). Subsequent Sanger sequencing of 89 individuals with CSNB identified another cCSNB case harboring a nonsense mutation (c.1151C>G [p.Ser384*]) and a deletion predicted to lead to a premature stop codon (c.1538_1539del [p.Ser513Cysfs*59]) in the same gene. Human LRIT3 antibody staining revealed in the outer plexiform layer of the human retina a punctate-labeling pattern resembling the dendritic tips of bipolar cells; similar patterns have been observed for other proteins implicated in cCSNB. The exact role of this LRR protein in cCSNB remains to be elucidated.

Research paper thumbnail of Ten Years of Collaborative Progress in the Quest for Orthologs

Molecular Biology and Evolution

Accurate determination of the evolutionary relationships between genes is a foundational challeng... more Accurate determination of the evolutionary relationships between genes is a foundational challenge in biology. Homology—evolutionary relatedness—is in many cases readily determined based on sequence similarity analysis. By contrast, whether or not two genes directly descended from a common ancestor by a speciation event (orthologs) or duplication event (paralogs) is more challenging, yet provides critical information on the history of a gene. Since 2009, this task has been the focus of the Quest for Orthologs (QFO) Consortium. The sixth QFO meeting took place in Okazaki, Japan in conjunction with the 67th National Institute for Basic Biology conference. Here, we report recent advances, applications, and oncoming challenges that were discussed during the conference. Steady progress has been made toward standardization and scalability of new and existing tools. A feature of the conference was the presentation of a panel of accessible tools for phylogenetic profiling and several develo...

Research paper thumbnail of Novel Approach Combining Transcriptional and Evolutionary Signatures to Identify New Multiciliation Genes

Genes

Multiciliogenesis is a complex process that allows the generation of hundreds of motile cilia on ... more Multiciliogenesis is a complex process that allows the generation of hundreds of motile cilia on the surface of specialized cells, to create fluid flow across epithelial surfaces. Dysfunction of human multiciliated cells is associated with diseases of the brain, airway and reproductive tracts. Despite recent efforts to characterize the transcriptional events responsible for the differentiation of multiciliated cells, a lot of actors remain to be identified. In this work, we capitalize on the ever-growing quantity of high-throughput data to search for new candidate genes involved in multiciliation. After performing a large-scale screening using 10 transcriptomics datasets dedicated to multiciliation, we established a specific evolutionary signature involving Otomorpha fish to use as a criterion to select the most likely targets. Combining both approaches highlighted a list of 114 potential multiciliated candidates. We characterized these genes first by generating protein interaction ...

Research paper thumbnail of PROBE: analysis and visualization of protein block-level evolution

Bioinformatics (Oxford, England), Jan 7, 2018

Comparative studies of protein sequences are widely used in evolutionary and comparative genomics... more Comparative studies of protein sequences are widely used in evolutionary and comparative genomics studies, but there is a lack of efficient tools to identify conserved regions ab initio within a protein multiple alignment. PROBE provides a fully automatic analysis of protein family conservation, to identify conserved regions, or 'blocks', that may correspond to structural / functional domains or motifs. Conserved blocks are identified at two different levels: (i) family level blocks indicate sites that are probably of central importance to the protein's structure or function, and (ii) sub-family level blocks highlight regions that may signify functional specialization, such as binding partners, etc. All conserved blocks are mapped onto a phylogenetic tree and can also be visualized in the context of the multiple sequence alignment. PROBE thus facilitates in-depth studies of sequence-structure-function-evolution relationships, and opens the way to block-level phylogenetic...

Research paper thumbnail of MyGeneFriends: A Social Network Linking Genes, Genetic Diseases, and Researchers

Journal of medical Internet research, Jun 16, 2017

The constant and massive increase of biological data offers unprecedented opportunities to deciph... more The constant and massive increase of biological data offers unprecedented opportunities to decipher the function and evolution of genes and their roles in human diseases. However, the multiplicity of sources and flow of data mean that efficient access to useful information and knowledge production has become a major challenge. This challenge can be addressed by taking inspiration from Web 2.0 and particularly social networks, which are at the forefront of big data exploration and human-data interaction. MyGeneFriends is a Web platform inspired by social networks, devoted to genetic disease analysis, and organized around three types of proactive agents: genes, humans, and genetic diseases. The aim of this study was to improve exploration and exploitation of biological, postgenomic era big data. MyGeneFriends leverages conventions popularized by top social networks (Facebook, LinkedIn, etc), such as networks of friends, profile pages, friendship recommendations, affinity scores, news ...

Research paper thumbnail of Proteome Evolution of Deep-Sea Hydrothermal Vent Alvinellid Polychaetes Supports the Ancestry of Thermophily and Subsequent Adaptation to Cold in Some Lineages

Genome biology and evolution, Feb 1, 2017

Temperature, perhaps more than any other environmental factor, is likely to influence the evoluti... more Temperature, perhaps more than any other environmental factor, is likely to influence the evolution of all organisms. It is also a very interesting factor to understand how genomes are shaped by selection over evolutionary timescales, as it potentially affects the whole genome. Among thermophilic prokaryotes, temperature affects both codon usage and protein composition to increase the stability of the transcriptional/translational machinery, and the resulting proteins need to be functional at high temperatures. Among eukaryotes less is known about genome evolution, and the tube-dwelling worms of the family Alvinellidae represent an excellent opportunity to test hypotheses about the emergence of thermophily in ectothermic metazoans. The Alvinellidae are a group of worms that experience varying thermal regimes, presumably having evolved into these niches over evolutionary times. Here we analyzed 423 putative orthologous loci derived from 6 alvinellid species including the thermophilic...

Research paper thumbnail of Insights into ciliary genes and evolution from multi-level phylogenetic profiling

Molecular biology and evolution, Aug 28, 2017

Cilia (flagella) are important eukaryotic organelles, present in the Last Eukaryotic Common Ances... more Cilia (flagella) are important eukaryotic organelles, present in the Last Eukaryotic Common Ancestor, and are involved in cell motility and integration of extracellular signals. Ciliary dysfunction causes a class of genetic diseases, known as ciliopathies, however current knowledge of the underlying mechanisms is still limited and a better characterization of genes is needed. As cilia have been lost independently several times during evolution and they are subject to important functional variation between species, ciliary genes can be investigated through comparative genomics.

Standardized benchmarking in the quest for orthologs

Nature Methods, May 4, 2016

Achieving high accuracy in orthology inference is essential for many comparative, evolutionary and functional genomic analyses, yet the true evolutionary history of genes is generally unknown and orthologs are used for very different applications across phyla, requiring different precision-recall trade-offs. As a result, it is difficult to assess the performance of orthology inference methods. Here, we present a community effort to establish standards and an automated web-based service to facilitate orthology benchmarking. Using this service, we characterize 15 well-established inference methods and resources on a battery of 20 different benchmarks. Standardized benchmarking provides a way for users to identify the most effective methods for the problem at hand, sets a minimum requirement for new tools and resources, and guides the development of more accurate orthology inference methods.
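
The precision-recall trade-off mentioned in this abstract can be illustrated with a minimal sketch. It assumes predicted and reference orthologs are given as unordered gene pairs; the function name `precision_recall` and the gene identifiers are hypothetical, and this is not the procedure used by the benchmarking service itself.

```python
def precision_recall(predicted, reference):
    """Compare predicted ortholog pairs against a reference set of pairs.

    Pairs are treated as unordered, so ("a1", "b1") equals ("b1", "a1").
    """
    pred = {frozenset(pair) for pair in predicted}
    ref = {frozenset(pair) for pair in reference}
    true_positives = len(pred & ref)
    # Precision: fraction of predicted pairs that are in the reference.
    precision = true_positives / len(pred) if pred else 0.0
    # Recall: fraction of reference pairs that were recovered.
    recall = true_positives / len(ref) if ref else 0.0
    return precision, recall


# Hypothetical toy data, for illustration only.
reference = [("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4")]
predicted = [("b1", "a1"), ("a2", "b2"), ("a3", "b9")]

precision, recall = precision_recall(predicted, reference)
```

A method that predicts few, well-supported pairs tends toward high precision and low recall; a permissive method shows the opposite pattern, which is why a single score rarely suits every application.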

De l'analyse du génome de Pyrococcus abyssi à l'étude de protéines informationnelles : Stratégies de validation et d'exploitation des données en génomique comparative

PtdIns5P regulation through evolution: roles in membrane trafficking?

Trends in Biochemical Sciences, 2008

Température et plasticité du chromatisme de la coquille chez le mollusque pulmoné Helix aspersa

C R Acad Sci Ser III Vie, 1998

Tex19 and Sectm1 concordant molecular phylogenies support co-evolution of both eutherian-specific genes

BMC Evolutionary Biology, 2015

Background: Transposable elements (TE) have attracted much attention since they shape the genome and contribute to species evolution. Organisms have evolved mechanisms to control TE activity. Testis expressed 19 (Tex19) represses TE expression in mouse testis and placenta. In the human and mouse genomes, Tex19 and Secreted and transmembrane 1 (Sectm1) are neighbors but are not homologs. Sectm1 is involved in immunity and its molecular phylogeny is unknown. Methods: Using multiple alignments of complete protein sequences (MACS), we inferred Tex19 and Sectm1 molecular phylogenies. Protein conserved regions were identified and folds were predicted. Finally, expression patterns were studied across tissues and species using RNA-seq public data and RT-PCR. Results: We present 2 high quality alignments of 58 Tex19 and 58 Sectm1 protein sequences from 48 organisms. First, both genes are eutherian-specific, i.e., exclusively present in mammals except monotremes (platypus) and marsupials. Second, Tex19 and Sectm1 have both duplicated in Sciurognathi and Bovidae while they have remained as single copy genes in all further placental mammals. Phylogenetic concordance between both genes was significant (p-value < 0.05) and supported co-evolution and functional relationship. At the protein level, Tex19 exhibits 3 conserved regions and 4 invariant cysteines. In particular, a CXXC motif is present in the N-terminal conserved region. Sectm1 exhibits 2 invariant cysteines and an Ig-like domain. Strikingly, Tex19 C-terminal conserved region was lost in Haplorrhini primates while a Sectm1 C-terminal extra domain was acquired. Finally, we have determined that Tex19 and Sectm1 expression levels anti-correlate across the testis of several primates (ρ = −0.72) which supports anti-regulation.
Conclusions: Tex19 and Sectm1 co-evolution and anti-regulated expressions support a strong functional relationship between both genes. Since Tex19 operates a control on TE and Sectm1 plays a role in immunity, Tex19 might suppress an immune response directed against cells that show TE activity in eutherian reproductive tissues.

Genome-wide evidence for an essential role of the human Staf/ZNF143 transcription factor in bidirectional transcription

Nucleic Acids Research, 2011

In the human genome, ~10% of the genes are arranged head to head so that their transcription start sites reside within <1 kbp on opposite strands. In this configuration, a bidirectional promoter generally drives expression of the two genes. How bidirectional expression is performed from these particular promoters constitutes a puzzling question. Here, by a combination of in silico and biochemical approaches, we demonstrate that hStaf/ZNF143 is involved in controlling expression from a subset of divergent gene pairs. The binding sites for hStaf/ZNF143 (SBS) are overrepresented in bidirectional versus unidirectional promoters. Chromatin immunoprecipitation assays with a significant set of bidirectional promoters containing putative SBS revealed that 93% of them are associated with hStaf/ZNF143. Expression of dual reporter genes directed by bidirectional promoters is dependent on the SBS integrity and requires hStaf/ZNF143. Furthermore, in some cases, functional SBS are located in bidirectional promoters of gene pairs encoding a noncoding RNA and a protein gene. Remarkably, hStaf/ZNF143 per se exhibits an inherently bidirectional transcription activity, and together our data provide the demonstration that hStaf/ZNF143 is indeed a transcription factor controlling the expression of divergent protein-protein and protein-non-coding RNA gene pairs.

KD4v: comprehensible knowledge discovery system for missense variant

Nucleic Acids Research, 2012

A major challenge in the post-genomic era is a better understanding of how human genetic alterations involved in disease affect the gene products. The KD4v server makes it possible to characterize and predict the phenotypic effects (deleterious/neutral) of missense variants. It relies on 16 predicates annotated by the MSV3d database: conservation, physico-chemical, functional and 3D structure. The server provides a set of rules learned by Inductive Logic Programming. These rules are interpretable by non-expert humans and are used to accurately predict the deleterious/neutral status of an unknown mutation.

Retinoic Acid Receptor Subtype-Specific Transcriptotypes in the Early Zebrafish Embryo

Molecular Endocrinology, 2014

Retinoic acid (RA) controls many aspects of embryonic development by binding to specific receptors (retinoic acid receptors [RARs]) that regulate complex transcriptional networks. Three different RAR subtypes are present in vertebrates and play both common and specific roles in transducing RA signaling. Specific activities of each receptor subtype can be correlated with its exclusive expression pattern, whereas shared activities between different subtypes are generally assimilated to functional redundancy. However, the question remains whether some subtype-specific activity still exists in regions or organs coexpressing multiple RAR subtypes. We tackled this issue at the transcriptional level using the early zebrafish embryo as a model. Using morpholino knockdown, we specifically invalidated the zebrafish endogenous RAR subtypes in an in vivo context. After building up a list of RA-responsive genes in the zebrafish gastrula through a whole-transcriptome analysis, we compared this panel of genes with those that still respond to RA in embryos lacking one or another RAR subtype. Our work reveals that RAR subtypes do not have fully redundant functions at the transcriptional level but can transduce RA signal in a subtype-specific fashion. As a result, we define RAR subtype-specific transcriptotypes that correspond to repertoires of genes activated by different RAR subtypes. Finally, we found genes of the RA pathway (cyp26a1, raraa) the regulation of which by RA is highly robust and can even resist the knockdown of all RARs. This suggests that RA-responsive genes are differentially sensitive to alterations in the RA pathway and, in particular, cyp26a1 and raraa are under a high pressure to maintain signaling integrity.

Detection and Characterisation of Mutations Responsible for Allele-Specific Protein Thermostabilities at the Mn-Superoxide Dismutase Gene in the Deep-Sea Hydrothermal Vent Polychaete Alvinella pompejana

Journal of Molecular Evolution, 2013

Alvinella pompejana (Polychaeta, Alvinellidae) is one of the most thermotolerant marine eukaryotes known to date. It inhabits chimney walls of deep-sea hydrothermal vents along the East Pacific Rise (EPR) and is exposed to various challenging conditions (e.g. high temperature, hypoxia and the presence of sulphides, heavy metals and radiations), which increase the production of dangerous reactive oxygen species (ROS). Two different allelic forms of a manganese-superoxide dismutase involved in ROS detoxification, ApMnSOD1 and ApMnSOD2, and differing only by two substitutions (M110L and A138G) were identified in an A. pompejana cDNA library. RFLP screening of 60 individuals from different localities along the EPR showed that ApMnSOD2 was rare (2%) and only found in the heterozygous state. Dynamic light scattering measurements and residual enzymatic activity experiments showed that the most frequent form (ApMnSOD1) was the most resistant to temperature. Their half-lives were similarly long at 65°C (>110 min) but exhibited a twofold difference at 80°C (20.8 vs 9.8 min). Those properties are likely to be explained by the occurrence of an additional sulphur-containing hydrogen bond involving the M110 residue and the effect of the A138 residue on the backbone entropy. Our results confirm the thermophily of A. pompejana and suggest that this locus is a good model to study how the extreme thermal heterogeneity of the vent conditions may help to maintain old rare variants in those populations.

Genome-wide in Silico Identification of New Conserved and Functional Retinoic Acid Receptor Response Elements (Direct Repeats Separated by 5 bp)

Journal of Biological Chemistry, 2011

Background: Retinoic acid (RA) receptors regulate gene expression through binding specific response elements (RAREs). Results: A collection of new DR5 RAREs located ±10 kb from TSSs and conserved among 6 vertebrate species or more has been amassed. Conclusion: We provide a wider knowledge base for analyzing RA target genes. Significance: The RA response of the conserved target genes differs between species and tissues.

Functional insights into the core-TFIIH from a comparative survey

Genomics, 2013

TFIIH is a eukaryotic complex composed of two subcomplexes, the CAK (Cdk activating kinase) and the core-TFIIH. The core-TFIIH, composed of seven subunits (XPB, XPD, P62, P52, P44, P34, and P8), plays a crucial role in transcription and repair. Here, we performed an extended sequence analysis to establish the accurate phylogenetic distribution of the core-TFIIH in 63 eukaryotic organisms. In spite of the high conservation of the seven subunits at the sequence and genomic levels, the non-enzymatic P8, P34, P52 and P62 are absent from one or a few unicellular species. To gain insight into their respective roles, we undertook a comparative genomic analysis of the whole proteome to identify the gene sets sharing similar presence/absence patterns. While little information was inferred for P8 and P62, our studies confirm the known role of P52 in repair and suggest for the first time the implication of the core-TFIIH in mRNA splicing via P34.
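
The idea of finding genes that share a presence/absence pattern can be sketched as follows. This is only an illustration under stated assumptions: profiles are tuples of 0/1 calls with one position per species, similarity is a simple count of mismatches, and the gene names `geneA`, `geneB`, `geneC` and all values are invented. The abstract does not describe the scoring actually used in the study.

```python
def hamming(profile_a, profile_b):
    """Number of species in which two presence/absence calls differ."""
    return sum(x != y for x, y in zip(profile_a, profile_b))


def similar_profiles(query, profiles, max_diff=0):
    """Genes whose profile differs from the query's in at most max_diff species."""
    target = profiles[query]
    return sorted(
        gene
        for gene, profile in profiles.items()
        if gene != query and hamming(target, profile) <= max_diff
    )


# Hypothetical toy data: 1 = present, 0 = absent, one position per species.
profiles = {
    "P34": (1, 1, 0, 1, 0, 1),
    "geneA": (1, 1, 0, 1, 0, 1),
    "geneB": (1, 1, 1, 1, 0, 1),
    "geneC": (0, 0, 1, 0, 1, 0),
}

exact = similar_profiles("P34", profiles)
near = similar_profiles("P34", profiles, max_diff=1)
```

Allowing a small `max_diff` tolerates isolated annotation gaps, at the cost of admitting genes whose distribution only loosely follows the query.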

Conjugating effects of symbionts and environmental factors on gene expression in deep-sea hydrothermal vent mussels

BMC Genomics, 2011

Background: The deep-sea hydrothermal vent mussel Bathymodiolus azoricus harbors thiotrophic and methanotrophic symbiotic bacteria in its gills. While the symbiotic relationship between this hydrothermal mussel and these chemoautotrophic bacteria has been described, the molecular processes involved in the cross-talking between symbionts and host, in the maintenance of the symbiosis, in the influence of environmental parameters on gene expression, and in transcriptome variation across individuals remain poorly understood. In an attempt to understand how, and to what extent, this double symbiosis affects host gene expression, we used a transcriptomic approach to identify genes potentially regulated by symbiont characteristics, environmental conditions or both. This study was done on mussels from two contrasting populations. Results: Subtractive libraries allowed the identification of about 1000 genes putatively regulated by symbiosis and/or environmental factors. Microarray analysis showed that 120 genes (3.5% of all genes) were differentially expressed between the Menez Gwen (MG) and Rainbow (Rb) vent fields. The total number of regulated genes in mussels harboring a high versus a low symbiont content did not differ significantly. With regard to the impact of symbiont content, only 1% of all genes were regulated by thiotrophic (SOX) and methanotrophic (MOX) bacteria content in MG mussels whereas 5.6% were regulated in mussels collected at Rb. MOX symbionts also impacted a higher proportion of genes than SOX in both vent fields. When host transcriptome expression was analyzed with respect to symbiont gene expression, it was related to symbiont quantity in each field. Conclusions: Our study has produced a preliminary description of a transcriptomic response in a hydrothermal vent mussel host of both thiotrophic and methanotrophic symbiotic bacteria.
This model can help to identify genes involved in the maintenance of symbiosis or regulated by environmental parameters. Our results provide evidence of symbiont effect on transcriptome regulation, with differences related to type of symbiont, even though the relative percentage of genes involved remains limited. Differences observed between the vent sites indicate that environment strongly influences transcriptome regulation and impacts both activity and relative abundance of each symbiont. Among all these genes, those participating in recognition, the immune system, oxidative stress, and energy metabolism constitute new promising targets for extended studies on symbiosis and the effect of environmental parameters on the symbiotic relationships in B. azoricus.

PARSEC: PAtteRn SEarch and Contextualization

Bioinformatics, 2013

We present PARSEC (PAtteRn Search and Contextualization), a new open source platform for guided discovery, allowing localization and biological characterization of short genomic sites in entire eukaryotic genomes. PARSEC can search for a sequence or a degenerate pattern. The retrieved set of genomic sites can be characterized in terms of (i) conservation in model organisms, (ii) genomic context (proximity to genes) and (iii) function of neighboring genes. These modules allow the user to explore, visualize, filter and extract biological knowledge from a set of short genomic regions such as transcription factor binding sites. Availability: Web site implemented in Java, JavaScript and C++, with all major browsers supported. Freely available at lbgi.fr/parsec. Source code is freely available at sourceforge.net/projects/genomicparsec.
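
Searching for a degenerate pattern, as mentioned in this abstract, can be sketched by translating IUPAC ambiguity codes into a regular expression. This is a minimal illustration and not PARSEC's implementation: the function name `find_sites`, the example pattern and the example sequence are all hypothetical, and only the given strand is scanned (no reverse complement).

```python
import re

# Standard IUPAC nucleotide ambiguity codes as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def find_sites(pattern, sequence):
    """Return (start, matched_text) for every site, overlapping ones included."""
    regex = "".join(IUPAC[base] for base in pattern.upper())
    # Wrapping the pattern in a lookahead lets finditer report overlapping hits.
    scanner = re.compile("(?=(" + regex + "))")
    return [(m.start(), m.group(1)) for m in scanner.finditer(sequence.upper())]


# Hypothetical pattern and sequence; positions are 0-based.
sites = find_sites("RGKTCA", "ccAGGTCAttGGTTCAaa")
```

Here `sites` holds two hits, `AGGTCA` at position 2 and `GGTTCA` at position 10. A genome-scale search would also need to handle the reverse strand and much larger inputs, which this sketch leaves out.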

Whole-Exome Sequencing Identifies LRIT3 Mutations as a Cause of Autosomal-Recessive Complete Congenital Stationary Night Blindness

The American Journal of Human Genetics, 2013

Congenital stationary night blindness (CSNB) is a clinically and genetically heterogeneous retinal disorder. Two forms can be distinguished clinically: complete CSNB (cCSNB) and incomplete CSNB. Individuals with cCSNB have visual impairment under low-light conditions and show a characteristic electroretinogram (ERG). The b-wave amplitude is severely reduced in the dark-adapted state of the ERG, representing abnormal function of ON bipolar cells. Furthermore, individuals with cCSNB can show other ocular features such as nystagmus, myopia, and strabismus and can have reduced visual acuity and abnormalities of the cone ERG waveform. The mode of inheritance of this form can be X-linked or autosomal recessive, and the dysfunction of four genes (NYX, GRM6, TRPM1, and GPR179) has been described so far. Whole-exome sequencing in one simplex cCSNB case lacking mutations in the known genes led to the identification of a missense mutation (c.983G>A [p.Cys328Tyr]) and a nonsense mutation (c.1318C>T [p.Arg440*]) in LRIT3, encoding leucine-rich-repeat (LRR), immunoglobulin-like, and transmembrane-domain 3 (LRIT3). Subsequent Sanger sequencing of 89 individuals with CSNB identified another cCSNB case harboring a nonsense mutation (c.1151C>G [p.Ser384*]) and a deletion predicted to lead to a premature stop codon (c.1538_1539del [p.Ser513Cysfs*59]) in the same gene. Human LRIT3 antibody staining revealed in the outer plexiform layer of the human retina a punctate-labeling pattern resembling the dendritic tips of bipolar cells; similar patterns have been observed for other proteins implicated in cCSNB. The exact role of this LRR protein in cCSNB remains to be elucidated.