Lorenzo Cerutti - Academia.edu
Papers by Lorenzo Cerutti
Additional file 1: Fig. S1. Full genome genetic distance comparisons between HIV-1 subtype A sub-subtypes according to our classification proposal. X-axis scale lines indicate genetic distance thresholds allowing, in our alignment and model conditions, for group, subtype and sub-subtype identification. Fig. S2. Phylogenetic tree of sub-type A obtained with pol gene. Sequences from A1, A2, A3, A4 and A6 clades have been collapsed for readability. One pol sequence was identified in the A7 clade in addition to the two full genome sequences and is highlighted by an arrow. The tree has been obtained with PhyML 3.0, using GTR-G nucleotide substitution model and branch support obtained by bootstrap method is given for each node. Several sequences, depicted in black, clustered outside the defined clades but cannot be retained in the classification proposal because of poor branch support values or absence of available full genome sequences. Fig. S3. Phylogenetic tree of sub-type A obtained w...
PROSITE (http://prosite.expasy.org/) consists of documentation entries describing protein domains, families and functional sites, as well as associated patterns and profiles to identify them. It is complemented by ProRule, a collection of rules that increases the discriminatory power of these profiles and patterns by providing additional information about functionally and/or structurally critical amino acids. PROSITE signatures, together with ProRule, are used for the annotation of domains and features of UniProtKB/Swiss-Prot entries. Here, we describe recent developments that allow users to perform whole-proteome annotation as well as a number of filtering options that can be combined to perform powerful targeted searches for biological discovery. The latest version of PROSITE (release 20.85, of 30 August 2012) contains 1308 patterns, 1039 profiles and 1041 ProRules.
Summary: The PROSITE resource provides a rich and well annotated source of signatures in the form of generalized profiles that allow protein domain detection and functional annotation. One of the major limiting factors in the application of PROSITE in genome and metagenome annotation pipelines is the time required to search protein sequence databases for putative matches. We describe an improved and optimized implementation of the PROSITE search tool pfsearch that, combined with a newly developed heuristic, addresses this limitation. On a modern x86_64 hyper-threaded quad-core desktop computer, the new pfsearchV3 is two orders of magnitude faster than the original algorithm. Availability: Source code and binaries of pfsearchV3 are freely available for download at
In yeast, microtubules are dynamic filaments necessary for spindle and nucleus positioning, as well as for proper chromosome segregation. We identify a function for the yeast gene BER1 (Benomyl REsistant 1) in microtubule stability. BER1 belongs to an evolutionarily conserved gene family whose founding member, Sensitivity to Red light Reduced, is involved in red-light perception and circadian rhythms in Arabidopsis. Here, we present data showing that the ber1 mutant is affected in microtubule stability, particularly in the presence of microtubule-depolymerising drugs. The pattern of synthetic lethal interactions obtained with the ber1 mutant suggests that Ber1 may function in N-terminal protein acetylation. Our work thus suggests that microtubule stability might be regulated through this post-translational modification on yet-to-be determined proteins.
PROSITE consists of documentation entries describing protein domains, families and functional sites, as well as associated patterns and profiles to identify them. It is complemented by ProRule, a collection of rules based on profiles and patterns, which increases the discriminatory power of profiles and patterns by providing additional information about functionally and/or structurally critical amino acids. In this article, we describe the implementation of a new method to assign a status to pattern matches, the new PROSITE web page and a new approach to improve the specificity and sensitivity of PROSITE methods. The latest version of PROSITE
Frontiers in Microbiology
Candida albicans causes life-threatening systemic infections in immunosuppressed patients. These infections are commonly treated with fluconazole, an antifungal agent targeting the ergosterol biosynthesis pathway. Current Antifungal Susceptibility Testing (AFST) methods are time-consuming and often subjective. Moreover, they cannot reliably detect the tolerance phenomenon, a breeding ground for resistance. An alternative to the classical AFST methods could use Matrix-Assisted Laser Desorption/Ionization Time-of-Flight (MALDI-TOF) Mass spectrometry (MS). This tool, already used in clinical microbiology for microbial species identification, has offered promising results in detecting antifungal resistance in non-azole-tolerant yeasts. Here, we propose a machine-learning approach, adapted to MALDI-TOF MS data, to qualitatively detect fluconazole resistance in the azole-tolerant species C. albicans. MALDI-TOF MS spectra were acquired from 33 C. albicans clinical strains isolated from 15 patients. Those strains were exposed for 3 h to 3 fluconazole concentrations (256, 16, 0 µg/mL), with (5 µg/mL) or without cyclosporin A, an azole tolerance inhibitor, leading to six different experimental conditions. We then optimized a protein extraction protocol allowing the acquisition of high-quality spectra, which were further filtered through two quality controls. The first one consisted of discarding unidentified spectra and the second one selected only the most similar spectra among replicates. Quality-controlled spectra were divided into six sets, following the sample preparation protocols. Each set was then processed through an R-based script using pre-defined housekeeping peaks allowing peak spectra positioning. Finally, 32 machine-learning algorithms applied on the six sets of spectra were compared, leading to 192 different pipelines of analysis.
We selected the most robust pipeline with the best accuracy. This LDA model applied to the samples prepared in presence of tolerance inhibitor
Journal of Clinical Microbiology
Objective: This first pilot on external quality assessment (EQA) of SARS-CoV-2 whole genome sequencing, initiated by the ESCMID Study Group for Genomic and Molecular Diagnostics (ESGMD) and Swiss Society for Microbiology (SSM), aims to build a framework between laboratories in order to improve pathogen surveillance sequencing. Methods: Ten samples with varying viral loads were sent out to 15 clinical laboratories, which had free choice of sequencing methods and bioinformatic analyses. The key aspects on which the individual centres were compared were identification of 1) SNPs and indels, 2) Pango lineages, and 3) clusters between samples. Results: The participating laboratories used a wide array of methods and analysis pipelines. Most were able to generate whole genomes for all samples. Genomes were sequenced to varying depth (up to 100-fold difference across centres). There was a very good consensus regarding the majority of reporting criteria, but there were a few discrepancies in...
Clinical Microbiology Reviews
This review provides a state-of-the-art description of the performance of Sanger cycle sequencing of the 16S rRNA gene for routine identification of bacteria in the clinical microbiology laboratory. A detailed description of the technology and current methodology is outlined with a major focus on proper data analyses and interpretation of sequences. The remainder of the article is focused on a comprehensive evaluation of the application of this method for identification of bacterial pathogens based on analyses of 16S multialignment sequences.
Journal of Cell Science, 1999
Nucleic Acids Research, 2014
The mission of the Universal Protein Resource (UniProt) (http://www.uniprot.org) is to provide the scientific community with a comprehensive, high-quality and freely accessible resource of protein sequences and functional annotation. It integrates, interprets and standardizes data from literature and numerous resources to achieve the most comprehensive catalog possible of protein information. The central activities are the biocuration of the UniProt Knowledgebase and the dissemination of these data through our Web site and web services. UniProt is produced by the UniProt Consortium, which consists of groups from the European Bioinformatics Institute (EBI), the SIB Swiss Institute of Bioinformatics (SIB) and the Protein Information Resource (PIR). UniProt is updated and distributed every 4 weeks and can be accessed online for searches or downloads.
Retrovirology
Background: The large and constantly evolving HIV-1 pandemic has led to an increasingly complex diversity. Because of some taxonomic difficulties among the most diverse HIV-1 subtypes, and taking advantage of the large amount of sequence data generated in recent years, we investigated novel lineage patterns among the main HIV-1 subtypes. Results: All HIV full-length genomes available in public databases were analysed (n = 2017). Maximum likelihood phylogenies and pairwise genetic distances were obtained. Clustering patterns and mean distributions of genetic distances were compared within and across the current groups, subtypes and sub-subtypes of HIV-1 to detect and analyse any divergent lineages within previously defined HIV lineages. The level of genetic similarity observed between most HIV clades was deeply consistent with the current classification. However, both subtypes A and D showed evidence of further intra-subtype diversification not fully described by the nomenclature system at the time and could be divided into several distinct sub-subtypes. Conclusions: With this work, we propose an updated nomenclature of sub-types A and D better reflecting their current genetic diversity and evolutionary patterns. Allowing a more accurate nomenclature and classification system is a necessary step for easier subtyping of HIV strains and a better detection or follow-up of viral epidemiology shifts.
Research in Microbiology, Aug 1, 2009
Conservation of the function of open reading frames recently identified in fungal genome projects can be assessed by complementation of deletion mutants of putative Saccharomyces cerevisiae orthologs. A parallel complementation assay expressing the homologous wild type S. cerevisiae gene is generally performed as a positive control. However, we and others have found that failure of complementation can occur in this case. We investigated the specific cases of S. cerevisiae TBF1 and TIM54 essential genes. Heterologous complementation with Candida glabrata TBF1 or TIM54 gene was successful using the constitutive promoters TDH3 and TEF. In contrast, homologous complementation with S. cerevisiae TBF1 or TIM54 genes failed using these promoters, and was successful only using the natural promoters of these genes. The reduced growth rate of S. cerevisiae complemented with C. glabrata TBF1 or TIM54 suggested a diminished functionality of the heterologous proteins compared to the homologous proteins. The requirement of the homologous gene for the natural promoter was alleviated for TBF1 when complementation was assayed in the absence of sporulation and germination, and for TIM54 when two regions of the protein presumably responsible for a unique translocation pathway of the TIM54 protein into the mitochondrial membrane were deleted. Our results demonstrate that the use of different promoters may prove necessary to obtain successful complementation, with use of the natural promoter being the best approach for homologous complementation.
2006 Second IEEE International Conference on e-Science and Grid Computing (e-Science'06), 2006
Bioinformatics algorithms such as sequence alignment methods based on profile-HMM (Hidden Markov Model) are popular but CPU-intensive. If large amounts of data are processed, a single computer often runs for many hours or even days. High performance infrastructures such as clusters or computational Grids provide the techniques to speed up the process by distributing the workload to remote nodes, running parts of the workload in parallel. Biologists often do not have access to such hardware systems. Therefore, we propose a new system using a modern Grid approach to optimise an embarrassingly parallel problem. We achieve speed-ups of at least two orders of magnitude given that we can use a powerful, worldwide distributed Grid infrastructure. For large-scale problems our method can outperform algorithms designed for mid-size clusters even considering additional latencies imposed by Grid infrastructures.
The PROSITE resource provides a rich and well annotated source of signatures in the form of generalized profiles that allow protein domain detection and functional annotation. One of the major limiting factors in the application of PROSITE in genome and metagenome annotation pipelines is the time required to search protein sequence databases for putative matches. We describe an improved and optimized implementation of the PROSITE search tool pfsearch that, combined with a newly developed heuristic, addresses this limitation. On a modern x86_64 hyper-threaded quad-core desktop computer, the new pfsearchV3 is two orders of magnitude faster than the original algorithm. Availability and implementation: Source code and binaries of pfsearchV3 are freely available for download at http://web.expasy.org/pftools/#pfsearchV3, implemented in C and supported on Linux. PROSITE generalized profiles including the heuristic cut-off scores are available at the same address. Contact: pftools@isb-sib.ch