Jean-François Dufayard - Academia.edu
Papers by Jean-François Dufayard
Frontiers in Plant Science
Regulation of Sorghum Stem Composition unveiled sorghum MYB and NAC that have not been identified to date as being involved in cell wall regulation. Although specific validation of the MYB and NAC genes uncovered in this study is needed, we provide a network of sorghum genes involved in SCW both at the structural and regulatory levels.
PeerJ
Background. Small RNAs modulate plant gene expression at both the transcriptional and post-transcriptional level, mostly through the induction of either targeted DNA methylation or transcript cleavage, respectively. Small RNA networks are involved in specific plant developmental processes, in signaling pathways triggered by various abiotic stresses and in interactions between the plant and viral and non-viral pathogens. They are also involved in silencing maintenance of transposable elements and endogenous viral elements. Alteration in small RNA production in response to various environmental stresses can affect all the above-mentioned processes. In rubber trees, changes observed in small RNA populations in response to trees affected by tapping panel dryness, in comparison to healthy ones, suggest a shift from a transcriptional to a post-transcriptional regulatory pathway. This is the first attempt to characterise small RNAs involved in post-transcriptional silencing and their target...
Background. Non-specific Lipid Transfer Proteins (nsLTPs) are widely distributed in the plant kingdom and constitute a superfamily of related proteins. More than 800 different sequences have been characterized so far, but their biological functions remain unclear. It has been clear for years that they present a certain interest for agronomic and nutritional issues. Deciphering their functions means collecting and analyzing a variety of data from gene sequence to protein structure, from cellular localization to the physiological role. As a huge and growing number of new protein sequences are available nowadays, extracting meaningful knowledge from sequence-structure-function relationships calls for the development of new tools and approaches. As nsLTPs show high evolutionary divergence, but a conserved common right-handed superhelix structural fold, and as they are involved in a large number of key roles in plant development and defense, they are a stimulating case study for validati...
Frontiers in plant science, 2017
Leucine-Rich Repeats Receptor-Like Kinase (LRR-RLK) genes represent a large and complex gene family in plants, mainly involved in development and stress responses. These receptors are composed of an LRR-containing extracellular domain (ECD), a transmembrane domain (TM) and an intracellular kinase domain (KD). To provide new perspectives on functional analyses of these genes in model and non-model plant species, we performed a phylogenetic analysis on 8,360 LRR-RLK receptors in 31 angiosperm genomes (8 monocots and 23 dicots). We identified 101 orthologous groups (OGs) of genes being conserved among almost all monocot and dicot species analyzed. We observed that more than 10% of these OGs are absent in the Brassicaceae species studied. We show that the ECD structural features are not always conserved among orthologs, suggesting that functions may have diverged in some OG sets. Moreover, we looked at targets of positive selection footprints in 12 pairs of OGs and noticed that dependin...
Frontiers in plant science, 2017
Plant physiology, Jan 15, 2016
Gene duplications are an important factor in plant evolution and lineage specific expanded (LSE) genes are of particular interest. Receptor-like kinases (RLK) expanded massively in land plants and Leucine-Rich Repeat (LRR)-RLKs constitute the largest RLK family. Based on the phylogeny of 7,554 LRR-RLK genes from 31 fully sequenced flowering plant genomes, the complex evolutionary dynamics of this family was characterized in depth. We studied the involvement of selection during the expansion of this family among angiosperms. LRR-RLK subgroups harbor extremely contrasted rates of duplication, retention or loss and LSE copies are predominantly found in subgroups involved in environmental interactions. Expansion rates also differ significantly depending on the time when rounds of expansion or loss occurred on the angiosperm phylogenetic tree. Finally, using a dN/dS-based test in a phylogenetic framework, we searched for selection footprints on LSE and single-copy LRR-RLK genes. Selectiv...
With the increasing number of plant genomes being sequenced, a major challenge is to accurately transfer annotation from well characterized genomes to newly obtained sequences. GreenPhylDB is a database designed for comparative and functional genomics based on complete genome-derived gene sequences (Conte et al, 2008, Rouard M, Guignon V et al, 2011). The database currently includes gene sequences from 22 plant species, including Musa (representative of bananas and plantains). Genes from all these species are organized in clusters based on sequence similarity. The clusters (or families) are manually annotated (i.e. properly named and classified) and sequences included in each cluster are characterized by phylogenetic analysis in order to elucidate evolutionary relationships (e.g. orthologs, super-orthologs, in/out-paralogs) among genes. GreenPhyl provides a reliable (Martinez, 2011) and stable catalog of gene families useful for annotation on new genome sequences in plants. GreenPhy...
The study of plant tolerance to stress is a crucial issue for crop improvement and stability in constrained environments. Gene family analysis is an important way to understand the complex processes underlying stress response in crops. Several tools exist to study families and propose automatically clustered families or curated published families. We observed that automatic clustering, efficient for global analyses, is rarely sufficient for precise studies: i) families can be spread across several clusters, or ii) intrusive sequences are present in clusters of interest. That is why biologists most of the time need to constitute their families manually. In response to this need, we propose to develop an integrative system that will allow users to gather sequences from different sources for a customized family. This system will integrate several tools used for family construction and analysis. Currently, the prototype allows users to query an in-house Chado database (banana, coffee) and to import person...
The South Green platform (http://www.southgreen.fr/) is a local network of scientists gathering bioinformatics skills, based on the Agropolis campus, which hosts research institutes such as CIRAD, IRD, INRA, SupAgro and Bioversity International. Based on this strong local community in the field of agriculture, food and biodiversity, various bioinformatics applications and resources dedicated to the genomics of tropical and Mediterranean plants have been developed and published. The objectives of South Green are to promote these original tools as well as their interoperability. Exchange and collaborative developments are also fostered through regular hands-on sessions on synergistic themes such as Galaxy, genome annotation or next generation genotyping. Finally, we provide access to computing facilities and hands-on training for both users and developers engaged in the network. The South Green web portal currently contains 20 information systems and tools and targets about 30 plants. As a pr...
Phylogenetic tree databases, such as HOBACGEN or HOVERGEN, are often explored manually in order to retrieve genes using phylogenetic criteria. This type of database may contain several thousands of trees, so this search process is time-consuming and error-prone. An algorithm for unordered tree pattern matching has been developed, in order to automatically solve this type of request. This
Nucleic acids research, Jan 3, 2015
SNiPlay is a web-based tool for detection, management and analysis of genetic variants including both single nucleotide polymorphisms (SNPs) and InDels. Version 3 now extends functionalities in order to easily manage and exploit SNPs derived from next generation sequencing technologies, such as GBS (genotyping by sequencing), WGRS (whole genome re-sequencing) and RNA-Seq technologies. Based on the standard VCF (variant call format), the application offers an intuitive interface for filtering and comparing polymorphisms using user-defined sets of individuals and then establishing a reliable genotyping data matrix for further analyses. Namely, in addition to the various scaled-up analyses allowed by the application (genomic annotation of SNP, diversity analysis, haplotype reconstruction and network, linkage disequilibrium), SNiPlay3 proposes new modules for GWAS (genome-wide association studies), population stratification, distance tree analysis and visualization of SNP density. Addi...
PloS one, 2015
The chaperone/usher (CU) assembly pathway is used by a wide range of Enterobacteriaceae to assemble adhesive surface structures called pili or fimbriae that play a role in bacteria-host cell interactions. In silico analysis revealed that the genome of Klebsiella pneumoniae LM21 harbors eight chromosomal CU loci belonging to γκπ and σ clusters. Of these, only two correspond to previously described operons, namely type 1 and type 3-encoding operons. Isogenic usher deletion mutants of K. pneumoniae LM21 were constructed for each locus and their role in adhesion to animal (Intestine 407) and plant (Arabidopsis thaliana) cells, biofilm formation and murine intestinal colonization was investigated. Type 3 pili usher deleted mutant was impaired in all assays, whereas type 1 pili usher deleted mutant only showed attenuation in adhesion to plant cells and in intestinal colonization. The LM21ΔkpjC mutant was impaired in its capacity to adhere to Arabidopsis cells and to colonize the murine intest...
Comparison of homologous sequences is an essential step for many studies related to molecular biology and evolution: to identify important regions in genomic sequences, to study evolution at the molecular level, to determine the phylogeny of species or to predict the function of a new gene. In this view, databases of homologous genes are very useful. We present the second release of HOGENOM, an extended gene family database which contains homologous gene families from 182 complete genomes of eukarya, bacteria, and archaea. HOGENOM can be queried according to several criteria, such as phylogeny, orthology/paralogy relationship, sequence data and bibliographic information.
Methods in Molecular Biology, 2009
Our understanding of the origins, the functions and/or the structures of biological sequences strongly depends on our ability to decipher the mechanisms of molecular evolution. These complex processes can be described through the comparison of homologous sequences in a phylogenetic framework. Moreover, phylogenetic inference provides sound statistical tools to exhibit the main features of molecular evolution from the analysis of actual sequences. This chapter focuses on phylogenetic tree estimation under the maximum likelihood (ML) principle. Phylogenies inferred under this probabilistic criterion are usually reliable and important biological hypotheses can be tested through the comparison of different models. Estimating ML phylogenies is computationally demanding, and careful examination of the results is warranted. This chapter focuses on PhyML, a software that implements recent ML phylogenetic methods and algorithms. We illustrate the strengths and pitfalls of this program through the analysis of a real data set. PhyML v3.
Nature genetics, 2015
Orchidaceae, renowned for its spectacular flowers and other reproductive and ecological adaptations, is one of the most diverse plant families. Here we present the genome sequence of the tropical epiphytic orchid Phalaenopsis equestris, a frequently used parent species for orchid breeding. P. equestris is the first plant with crassulacean acid metabolism (CAM) for which the genome has been sequenced. Our assembled genome contains 29,431 predicted protein-coding genes. We find that contigs likely to be underassembled, owing to heterozygosity, are enriched for genes that might be involved in self-incompatibility pathways. We find evidence for an orchid-specific paleopolyploidy event that preceded the radiation of most orchid clades, and our results suggest that gene duplication might have contributed to the evolution of CAM photosynthesis in P. equestris. Finally, we find expanded and diversified families of MADS-box C/D-class, B-class AP3 and AGL6-class genes, which might contribute ...
Nature genetics, Jan 28, 2015