Rolf Kaas | Technical University of Denmark (DTU)

Papers by Rolf Kaas

Solving the Problem of Comparing Whole Bacterial Genomes across Different Sequencing Platforms

Whole genome sequencing (WGS) shows great potential for real-time monitoring and identification of infectious disease outbreaks. However, rapid and reliable comparison of data generated in multiple laboratories and using multiple technologies is essential. So far, studies have focused on using one technology because each technology has a systematic bias, making integration of data generated from different platforms difficult. We developed two different procedures for identifying variable sites and inferring phylogenies in WGS data across multiple platforms. The methods were evaluated on three bacterial data sets sequenced on three different platforms (Illumina, 454, Ion Torrent). We show that the methods are able to overcome the systematic biases caused by the sequencers and infer the expected phylogenies. We conclude that the procedures succeed because all informative sites included in the analysis are validated. The procedures are available as web tools.

Identification and assembly of genomes and genetic elements in complex metagenomic samples without using reference genomes

Nature Biotechnology, 2014

Most current approaches for analyzing metagenomic data rely on comparisons to reference genomes, but the microbial diversity of many environments extends far beyond what is covered by reference databases. De novo segregation of complex metagenomic data into specific biological entities, such as particular bacterial strains or viruses, remains a largely unsolved problem. Here we present a method, based on binning co-abundant genes across a series of metagenomic samples, that enables comprehensive discovery of new microbial organisms, viruses and co-inherited genetic entities and aids assembly of microbial genomes without the need for reference sequences. We demonstrate the method on data from 396 human gut microbiome samples and identify 7,381 co-abundance gene groups (CAGs), including 741 metagenomic species (MGS). We use these to assemble 238 high-quality microbial genomes and identify affiliations between MGS and hundreds of viruses or genetic entities. Our method provides the mea...
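The idea of binning co-abundant genes can be illustrated with a small sketch. The abstract does not say how co-abundance is measured or how groups are formed, so everything below is an assumption made for illustration: Pearson correlation as the similarity measure, a greedy seed-based grouping, the `min_corr` threshold, and the toy gene names and abundance values. It is not the published method.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length abundance profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-abundance signal
    return cov / sqrt(var_x * var_y)


def bin_coabundant_genes(abundance, min_corr=0.9):
    """Greedy binning (illustrative assumption): each unassigned gene seeds
    a group and collects every remaining gene correlated with the seed."""
    unassigned = list(abundance)
    groups = []
    while unassigned:
        seed = unassigned.pop(0)
        group, remaining = [seed], []
        for gene in unassigned:
            if pearson(abundance[seed], abundance[gene]) >= min_corr:
                group.append(gene)
            else:
                remaining.append(gene)
        unassigned = remaining
        groups.append(group)
    return groups


# Hypothetical toy data: one abundance profile per gene, one value per sample.
abundance = {
    "geneA": [10.0, 0.0, 5.0, 20.0],
    "geneB": [20.0, 1.0, 11.0, 41.0],
    "geneC": [0.0, 8.0, 3.0, 1.0],
    "geneD": [1.0, 16.0, 7.0, 2.0],
}
groups = bin_coabundant_genes(abundance)
print(groups)
```

Genes whose abundance rises and falls together across samples end up in the same group, which is the intuition behind treating such a group as a candidate biological entity.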

Genomic Dissection of Travel-Associated Extended-Spectrum-Beta-Lactamase-Producing Salmonella enterica Serovar Typhi Isolates Originating from the Philippines: a One-Off Occurrence or a Threat to Effective Treatment of Typhoid Fever?

Journal of Clinical Microbiology, 2014

One unreported case of extended-spectrum-beta-lactamase (ESBL)-producing Salmonella enterica serovar Typhi was identified, subjected to whole-genome sequence typing among other analyses, and compared to other available genomes of S. Typhi. The reported strain was similar to a previously published strain harboring blaSHV-12 from the Philippines and was likely part of an undetected outbreak, the first of ESBL-producing S. Typhi. Aarestrup FM, Hasman H. 2015. Genomic dissection of travel-associated extended-spectrum-beta-lactamase-producing Salmonella enterica serovar Typhi isolates originating from the Philippines: a one-off occurrence or a threat to effective treatment of typhoid fever? J Clin Microbiol 53:000-000.

Genomic Signature of Multidrug-Resistant Salmonella enterica Serovar Typhi Isolates Related to a Massive Outbreak in Zambia between 2010 and 2012

Journal of Clinical Microbiology, 2014

Retrospectively, we investigated the epidemiology of a massive Salmonella enterica serovar Typhi outbreak in Zambia during 2010 to 2012. Ninety-four isolates were susceptibility tested by MIC determinations. Whole-genome sequence typing (WGST) of 33 isolates and bioinformatic analysis identified the multilocus sequence type (MLST), haplotype, plasmid replicon, antimicrobial resistance genes, and genetic relatedness by single nucleotide polymorphism (SNP) analysis and genomic deletions. The outbreak affected 2,040 patients, with a fatality rate of 0.5%. Most (83.0%) isolates were multidrug resistant (MDR). The isolates belonged to MLST ST1 and a new variant of the haplotype, H58B. Most isolates contained a chromosomally translocated region containing seven antimicrobial resistance genes, catA1, blaTEM-1, dfrA7, sul1, sul2, strA, and strB, and fragments of the incompatibility group Q1 (IncQ1) plasmid replicon, the class 1 integron, and the mer operon. The genomic analysis revealed 415 SNP differences overall and 35 deletions among 33 of the isolates subjected to whole-genome sequencing. In comparison with other genomes of H58, the Zambian isolates separated from genomes from Central Africa and India by 34 and 52 SNPs, respectively. The phylogenetic analysis indicates that 32 of the 33 isolates sequenced belonged to a tight clonal group distinct from other H58 genomes included in the study. The small numbers of SNPs identified within this group are consistent with the short-term transmission that can be expected over a period of 2 years. The phylogenetic analysis and deletions suggest that a single MDR clone was responsible for the outbreak, during which occasional other S. Typhi lineages, including sensitive ones, continued to cocirculate. The common view is that the emerging global S. Typhi haplotype, H58B, containing the MDR IncHI1 plasmid is responsible for the majority of typhoid infections in Asia and sub-Saharan Africa; we found that a new variant of the haplotype harboring a chromosomally translocated region containing the MDR islands of IncHI1 plasmid has emerged in Zambia. This could change the perception of the term "classical MDR typhoid" currently being solely associated with the IncHI1 plasmid. It might be more common than presently thought that S. Typhi haplotype H58B harbors the IncHI1 plasmid or a chromosomally translocated MDR region or both.

Evaluation of Whole Genome Sequencing for Outbreak Detection of Salmonella enterica

PLoS ONE, 2014

Salmonella enterica is a common cause of minor and large foodborne outbreaks. To achieve successful and nearly 'real-time' monitoring and identification of outbreaks, reliable sub-typing is essential. Whole genome sequencing (WGS) shows great promise as a routine epidemiological typing tool. Here we evaluate WGS for typing of S. Typhimurium, including different approaches for analyzing and comparing the data. A collection of 34 S. Typhimurium isolates was sequenced. This consisted of 18 isolates from six outbreaks and 16 epidemiologically unrelated background strains. In addition, 8 S. Enteritidis and 5 S. Derby isolates were sequenced and used for comparison. A number of different bioinformatics approaches were applied to the data, including pan-genome tree, k-mer tree, nucleotide difference tree and SNP tree. The outcome of each approach was evaluated in relation to the association of the isolates to specific outbreaks. The pan-genome tree clustered 65% of the S. Typhimurium isolates according to the pre-defined epidemiology, the k-mer tree 88%, the nucleotide difference tree 100% and the SNP tree 100% of the strains within S. Typhimurium. The outcomes of the four phylogenetic analyses were also compared to PFGE, revealing that WGS typing performed better than the traditional method. In conclusion, for S. Typhimurium, SNP analysis and the nucleotide difference approach seem to be the superior methods for epidemiological typing compared to other phylogenetic analytic approaches that may be used on WGS data. These approaches were also superior to the more classical typing method, PFGE. Our study also indicates that WGS alone is insufficient to determine whether strains are related or unrelated to outbreaks. This still requires the combination of epidemiological data and whole genome sequencing results.
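The abstract names a k-mer tree among the compared approaches but does not describe how it was built. As a generic illustration only, one common way to compare sequences without alignment is the Jaccard distance between their k-mer sets; a matrix of such distances could then be passed to a tree-building method. The function names, the choice of `k`, and the toy sequences below are all assumptions, not the study's procedure.

```python
def kmers(sequence, k):
    """Return the set of all substrings of length k in the sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def jaccard_distance(seq_a, seq_b, k=4):
    """1 minus the fraction of k-mers shared between two sequences."""
    a, b = kmers(seq_a, k), kmers(seq_b, k)
    union = a | b
    if not union:
        return 0.0  # both sequences are shorter than k
    return 1.0 - len(a & b) / len(union)


# Hypothetical toy sequences; real genomes would use a much larger k.
seq1 = "ACGTACGTAC"
seq2 = "ACGTACGTTT"
seq3 = "TTTTGGGGCC"

d12 = jaccard_distance(seq1, seq2)
d13 = jaccard_distance(seq1, seq3)
print(d12, d13)
```

Closely related sequences share most of their k-mers and get a distance near 0, while unrelated sequences approach 1.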

Genome-Wide High-Throughput Screening to Investigate Essential Genes Involved in Methicillin-Resistant Staphylococcus aureus Sequence Type 398 Survival

PLoS ONE, 2014

Livestock-associated methicillin-resistant Staphylococcus aureus (LA-MRSA) Sequence Type 398 (ST398) is an opportunistic pathogen that is able to colonize and cause disease in several animal species including humans. To better understand the adaptation, evolution, transmission and pathogenic capacity, further investigations into the importance of the different genes harboured by LA-MRSA ST398 are required. In this study we generated a genome-wide transposon mutant library in an LA-MRSA ST398 isolate to evaluate genes important for bacterial survival in laboratory and host-specific environments. The transposon mutant library consisted of approximately 1 million mutants with around 140,000 unique insertion sites and an average number of unique inserts per gene of 44.8. We identified LA-MRSA ST398 essential genes comparable to other high-throughput S. aureus essential gene studies. As ST398 is the most common MRSA isolated from pigs, the transposon mutant library was screened in whole porcine blood. Twenty-four genes were specifically identified as important for bacterial survival in porcine blood. Mutations in 23 of these genes resulted in attenuated bacterial fitness. Seven of the 23 genes were of unknown function, whereas 16 genes were annotated with functions predominantly related to carbon metabolism, pH shock and a variety of regulations and only indirectly to virulence factors. Mutations in one gene of unknown function resulted in a hypercompetitive mutant. Further evaluation of these genes is required to determine their specific relevance in blood survival.

Real-Time Whole-Genome Sequencing for Routine Typing, Surveillance, and Outbreak Detection of Verotoxigenic Escherichia coli

Journal of Clinical Microbiology, 2014

Fast and accurate identification and typing of pathogens are essential for effective surveillance and outbreak detection. The current routine procedure is based on a variety of techniques, making the procedure laborious, time-consuming, and expensive. With whole-genome sequencing (WGS) becoming cheaper, it has huge potential in both diagnostics and routine surveillance. The aim of this study was to perform a real-time evaluation of WGS for routine typing and surveillance of verocytotoxin-producing Escherichia coli (VTEC). In Denmark, the Statens Serum Institut (SSI) routinely receives all suspected VTEC isolates. During a 7-week period in the fall of 2012, all incoming isolates were concurrently subjected to WGS using IonTorrent PGM. Real-time bioinformatics analysis was performed using web-tools (www.genomicepidemiology.org) for species determination, multilocus sequence type (MLST) typing, and determination of phylogenetic relationship, and a specific VirulenceFinder for detection of E. coli virulence genes was developed as part of this study. In total, 46 suspected VTEC isolates were characterized in parallel during the study. VirulenceFinder proved successful in detecting virulence genes included in routine typing, explicitly verocytotoxin 1 (vtx1), verocytotoxin 2 (vtx2), and intimin (eae), and also detected additional virulence genes. VirulenceFinder is also a robust method for assigning verocytotoxin (vtx) subtypes. A real-time clustering of isolates in agreement with the epidemiology was established from WGS, enabling discrimination between sporadic and outbreak isolates. Overall, WGS typing produced results faster and at a lower cost than the current routine. Therefore, WGS typing is a superior alternative to conventional typing strategies. This approach may also be applied to typing and surveillance of other pathogens.

Draft Genome Sequence of the Yeast Pachysolen tannophilus CBS 4044/NRRL Y-2460

Eukaryotic Cell, 2012

A draft genome sequence of the yeast Pachysolen tannophilus CBS 4044/NRRL Y-2460 is presented. The organism has the potential to be developed as a cell factory for biorefineries due to its ability to utilize waste feedstocks. The sequenced genome size was 12,238,196 bp, consisting of 34 scaffolds. A total of 4,463 genes from 5,346 predicted open reading frames were annotated with function.

snpTree - a web-server to identify and construct SNP trees from whole genome sequence data

BMC Genomics, 2012

Background: Advances in whole genome sequencing (WGS) and its decreasing cost will soon make this technology available for routine infectious disease epidemiology. In epidemiological studies, outbreak isolates have very little diversity and require extensive genomic analysis to differentiate and classify isolates. One of the successful and broadly used methods is analysis of single nucleotide polymorphisms (SNPs). Currently, there are different tools and methods to identify SNPs, including various options and cut-off values. Furthermore, all current methods require bioinformatic skills. Thus, we lack a standard and simple automatic tool to determine SNPs and construct a phylogenetic tree from WGS data. Results: Here we introduce snpTree, a server for automatic online SNP analysis. This tool is composed of different SNP analysis suites and Perl and Python scripts. snpTree can identify SNPs and construct phylogenetic trees from WGS data as well as from assembled genomes or contigs. WGS data in fastq format are aligned to reference genomes by BWA, while contigs in fasta format are processed by Nucmer. SNPs are concatenated based on their position on the reference genome, and a tree is constructed from the concatenated SNPs using FastTree and a Perl script. The online server was implemented using HTML, Java and Python scripts. The server was evaluated using four published bacterial WGS data sets (V. cholerae, S. aureus CC398, S. Typhimurium and M. tuberculosis). The evaluation results for the first three cases were consistent and concordant for both raw reads and assembled genomes. In the last case, the original publication involved extensive filtering of SNPs, which could not be repeated using snpTree.
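The step of concatenating SNPs by their position on the reference genome can be sketched as follows. This is a minimal illustration, not snpTree's code: the data layout (dictionaries of position to base), the rule of filling uncalled positions with the reference base, the function names, and the toy isolates are all assumptions. The resulting pseudo-sequences could be written out as an alignment for a tree builder; the pairwise counts are included only to show what the concatenated columns encode.

```python
from itertools import combinations


def concatenate_snps(reference, calls):
    """Build one pseudo-sequence per isolate from all variable sites.

    reference: dict mapping position -> reference base
    calls: dict mapping isolate -> dict of position -> called base
    Positions without a call are filled with the reference base
    (an assumption made for this sketch).
    """
    positions = sorted({pos for snps in calls.values() for pos in snps})
    return {
        isolate: "".join(snps.get(pos, reference[pos]) for pos in positions)
        for isolate, snps in calls.items()
    }


def pairwise_differences(alignment):
    """Count differing columns for every pair of pseudo-sequences."""
    return {
        (a, b): sum(x != y for x, y in zip(alignment[a], alignment[b]))
        for a, b in combinations(sorted(alignment), 2)
    }


# Hypothetical toy data: three variable sites on a reference genome.
reference = {101: "A", 250: "G", 977: "T"}
calls = {
    "isolate1": {101: "C"},
    "isolate2": {101: "C", 977: "G"},
    "isolate3": {250: "A"},
}
alignment = concatenate_snps(reference, calls)
distances = pairwise_differences(alignment)
print(alignment)
print(distances)
```

Sorting the positions before joining keeps every column of the pseudo-alignment tied to the same reference coordinate across isolates, which is what makes the columns comparable.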

Veillonella, Firmicutes: Microbes disguised as Gram negatives

Standards in Genomic Sciences, 2013

The Firmicutes represent a major component of the intestinal microflora. The intestinal Firmicutes are a large, diverse group of organisms, many of which are poorly characterized due to their anaerobic growth requirements. Although most Firmicutes are Gram positive, members of the class Negativicutes, including the genus Veillonella, stain Gram negative. Veillonella are among the most abundant organisms of the oral and intestinal microflora of animals and humans, in spite of being strict anaerobes. In this work, the genomes of 24 Negativicutes, including eight Veillonella spp., are compared to 20 other Firmicutes genomes; a further 101 prokaryotic genomes were included, covering 26 phyla. Thus a total of 145 prokaryotic genomes were analyzed by various methods to investigate the apparent conflict of the Veillonella Gram stain and their taxonomic position within the Firmicutes. Comparison of the genome sequences confirms that the Negativicutes are distantly related to Clostridium spp., based on 16S rRNA, complete genomic DNA sequences, and a consensus tree based on conserved proteins. The genus Veillonella is relatively homogeneous: inter-genus pairwise comparison identifies at least 1,350 shared proteins, although less than half of these are found in any given Clostridium genome. Only 27 proteins are found conserved in all analyzed prokaryote genomes. Veillonella has distinct metabolic properties, and significant similarities to genomes of Proteobacteria are not detected, with the exception of a shared LPS biosynthesis pathway. The clade within the class Negativicutes to which the genus Veillonella belongs exhibits unique properties, most of which are in common with Gram-positives and some with Gram negatives. They are only distantly related to Clostridia, but are even less closely related to Gram-negative species. Though the Negativicutes stain Gram-negative and possess two membranes, the genome and proteome analysis presented here confirm their place within the (mainly) Gram positive phylum of the Firmicutes. Further studies are required to unveil the evolutionary history of the Veillonella and other Negativicutes.

Population Genetics of Vibrio cholerae from Nepal in 2010: Evidence on the Origin of the Haitian Outbreak

mBio, 2011

Cholera continues to be an important cause of human infections, and outbreaks are often observed after natural disasters, such as the one following the 2010 earthquake in Haiti. Once the cholera outbreak was confirmed, rumors spread that the disease was brought to Haiti by a battalion of Nepalese soldiers serving as United Nations peacekeepers. This possible connection has never been confirmed. We used whole-genome sequence typing (WGST), pulsed-field gel electrophoresis (PFGE), and antimicrobial susceptibility testing to characterize 24 recent Vibrio cholerae isolates from Nepal and evaluate the suggested epidemiological link with the Haitian outbreak. The isolates were obtained from 30 July to 1 November 2010 from five different districts in Nepal. We compared the 24 genomes to 10 previously sequenced V. cholerae isolates, including 3 from the Haitian outbreak (began July 2010). Antimicrobial susceptibility and PFGE patterns were consistent with an epidemiological link between the isolates from Nepal and Haiti. WGST showed that all 24 V. cholerae isolates from Nepal belonged to a single monophyletic group that also contained isolates from Bangladesh and Haiti. The Nepalese isolates were divided into four closely related clusters. One cluster contained three Nepalese isolates and three Haitian isolates that were almost identical, with only 1- or 2-bp differences.

Research paper thumbnail of Solving the Problem of Comparing Whole Bacterial Genomes across Different Sequencing Platforms

Whole genome sequencing (WGS) shows great potential for real-time monitoring and identification o... more Whole genome sequencing (WGS) shows great potential for real-time monitoring and identification of infectious disease outbreaks. However, rapid and reliable comparison of data generated in multiple laboratories and using multiple technologies is essential. So far studies have focused on using one technology because each technology has a systematic bias making integration of data generated from different platforms difficult. We developed two different procedures for identifying variable sites and inferring phylogenies in WGS data across multiple platforms. The methods were evaluated on three bacterial data sets and sequenced on three different platforms (Illumina, 454, Ion Torrent). We show that the methods are able to overcome the systematic biases caused by the sequencers and infer the expected phylogenies. It is concluded that the cause of the success of these new procedures is due to a validation of all informative sites that are included in the analysis. The procedures are available as web tools.

Research paper thumbnail of Identification and assembly of genomes and genetic elements in complex metagenomic samples without using reference genomes

Nature biotechnology, 2014

Most current approaches for analyzing metagenomic data rely on comparisons to reference genomes, ... more Most current approaches for analyzing metagenomic data rely on comparisons to reference genomes, but the microbial diversity of many environments extends far beyond what is covered by reference databases. De novo segregation of complex metagenomic data into specific biological entities, such as particular bacterial strains or viruses, remains a largely unsolved problem. Here we present a method, based on binning co-abundant genes across a series of metagenomic samples, that enables comprehensive discovery of new microbial organisms, viruses and co-inherited genetic entities and aids assembly of microbial genomes without the need for reference sequences. We demonstrate the method on data from 396 human gut microbiome samples and identify 7,381 co-abundance gene groups (CAGs), including 741 metagenomic species (MGS). We use these to assemble 238 high-quality microbial genomes and identify affiliations between MGS and hundreds of viruses or genetic entities. Our method provides the mea...

Research paper thumbnail of Genomic Dissection of Travel-Associated Extended-Spectrum-Beta-Lactamase-Producing Salmonella enterica Serovar Typhi Isolates Originating from the Philippines: a One-Off Occurrence or a Threat to Effective Treatment of Typhoid Fever?

Journal of Clinical Microbiology, 2014

One unreported case of extended-spectrum-beta-lactamase (ESBL)-producing Salmonella enterica sero... more One unreported case of extended-spectrum-beta-lactamase (ESBL)-producing Salmonella enterica serovar Typhi was identified, whole-genome sequence typed, among other analyses, and compared to other available genomes of S. Typhi. The reported strain was similar to a previously published strain harboring bla SHV-12 from the Philippines and likely part of an undetected outbreak, the first of ESBL-producing S. Typhi. Aarestrup FM, Hasman H. 2015. Genomic dissection of travel-associated extended-spectrum-beta-lactamase-producing Salmonella enterica serovar Typhi isolates originating from the Philippines: a one-off occurrence or a threat to effective treatment of typhoid fever? J Clin Microbiol 53:000 -000.

Research paper thumbnail of Genomic Signature of Multidrug-Resistant Salmonella enterica Serovar Typhi Isolates Related to a Massive Outbreak in Zambia between 2010 and 2012

Journal of Clinical Microbiology, 2014

Retrospectively, we investigated the epidemiology of a massive Salmonella enterica serovar Typhi ... more Retrospectively, we investigated the epidemiology of a massive Salmonella enterica serovar Typhi outbreak in Zambia during 2010 to 2012. Ninety-four isolates were susceptibility tested by MIC determinations. Whole-genome sequence typing (WGST) of 33 isolates and bioinformatic analysis identified the multilocus sequence type (MLST), haplotype, plasmid replicon, antimicrobial resistance genes, and genetic relatedness by single nucleotide polymorphism (SNP) analysis and genomic deletions. The outbreak affected 2,040 patients, with a fatality rate of 0.5%. Most (83.0%) isolates were multidrug resistant (MDR). The isolates belonged to MLST ST1 and a new variant of the haplotype, H58B. Most isolates contained a chromosomally translocated region containing seven antimicrobial resistance genes, catA1, bla TEM-1 , dfrA7, sul1, sul2, strA, and strB, and fragments of the incompatibility group Q1 (IncQ1) plasmid replicon, the class 1 integron, and the mer operon. The genomic analysis revealed 415 SNP differences overall and 35 deletions among 33 of the isolates subjected to whole-genome sequencing. In comparison with other genomes of H58, the Zambian isolates separated from genomes from Central Africa and India by 34 and 52 SNPs, respectively. The phylogenetic analysis indicates that 32 of the 33 isolates sequenced belonged to a tight clonal group distinct from other H58 genomes included in the study. The small numbers of SNPs identified within this group are consistent with the short-term transmission that can be expected over a period of 2 years. The phylogenetic analysis and deletions suggest that a single MDR clone was responsible for the outbreak, during which occasional other S. Typhi lineages, including sensitive ones, continued to cocirculate. The common view is that the emerging global S. 
Typhi haplotype, H58B, containing the MDR IncHI1 plasmid is responsible for the majority of typhoid infections in Asia and sub-Saharan Africa; we found that a new variant of the haplotype harboring a chromosomally translocated region containing the MDR islands of IncHI1 plasmid has emerged in Zambia. This could change the perception of the term "classical MDR typhoid" currently being solely associated with the IncHI1 plasmid. It might be more common than presently thought that S. Typhi haplotype H58B harbors the IncHI1 plasmid or a chromosomally translocated MDR region or both.

Research paper thumbnail of Evaluation of Whole Genome Sequencing for Outbreak Detection of Salmonella enterica

PLoS ONE, 2014

Salmonella enterica is a common cause of minor and large food borne outbreaks. To achieve success... more Salmonella enterica is a common cause of minor and large food borne outbreaks. To achieve successful and nearly 'real-time' monitoring and identification of outbreaks, reliable sub-typing is essential. Whole genome sequencing (WGS) shows great promises for using as a routine epidemiological typing tool. Here we evaluate WGS for typing of S. Typhimurium including different approaches for analyzing and comparing the data. A collection of 34 S. Typhimurium isolates was sequenced. This consisted of 18 isolates from six outbreaks and 16 epidemiologically unrelated background strains. In addition, 8 S. Enteritidis and 5 S. Derby were also sequenced and used for comparison. A number of different bioinformatics approaches were applied on the data; including pan-genome tree, k-mer tree, nucleotide difference tree and SNP tree. The outcome of each approach was evaluated in relation to the association of the isolates to specific outbreaks. The pan-genome tree clustered 65% of the S. Typhimurium isolates according to the pre-defined epidemiology, the k-mer tree 88%, the nucleotide difference tree 100% and the SNP tree 100% of the strains within S. Typhimurium. The resulting outcome of the four phylogenetic analyses were also compared to PFGE reveling that WGS typing achieved the greater performance than the traditional method. In conclusion, for S. Typhimurium, SNP analysis and nucleotide difference approach of WGS data seem to be the superior methods for epidemiological typing compared to other phylogenetic analytic approaches that may be used on WGS. These approaches were also superior to the more classical typing method, PFGE. Our study also indicates that WGS alone is insufficient to determine whether strains are related or un-related to outbreaks. This still requires the combination of epidemiological data and whole genome sequencing results.

Research paper thumbnail of Genome-Wide High-Throughput Screening to Investigate Essential Genes Involved in Methicillin-Resistant Staphylococcus aureus Sequence Type 398 Survival

PLoS ONE, 2014

Livestock-associated methicillin-resistant Staphylococcus aureus (LA-MRSA) Sequence Type 398 (ST3... more Livestock-associated methicillin-resistant Staphylococcus aureus (LA-MRSA) Sequence Type 398 (ST398) is an opportunistic pathogen that is able to colonize and cause disease in several animal species including humans. To better understand the adaptation, evolution, transmission and pathogenic capacity, further investigations into the importance of the different genes harboured by LA-MRSA ST398 are required. In this study we generated a genome-wide transposon mutant library in an LA-MRSA ST398 isolate to evaluate genes important for bacterial survival in laboratory and host-specific environments. The transposon mutant library consisted of approximately 1 million mutants with around 140,000 unique insertion sites and an average number of unique inserts per gene of 44.8. We identified LA-MRSA ST398 essential genes comparable to other high-throughput S. aureus essential gene studies. As ST398 is the most common MRSA isolated from pigs, the transposon mutant library was screened in whole porcine blood. Twenty-four genes were specifically identified as important for bacterial survival in porcine blood. Mutations in 23 of these genes resulted in attenuated bacterial fitness. Seven of the 23 genes were of unknown function, whereas 16 genes were annotated with functions predominantly related to carbon metabolism, pH shock and a variety of regulations and only indirectly to virulence factors. Mutations in one gene of unknown function resulted in a hypercompetitive mutant. Further evaluation of these genes is required to determine their specific relevance in blood survival.

Research paper thumbnail of Real-Time Whole-Genome Sequencing for Routine Typing, Surveillance, and Outbreak Detection of Verotoxigenic Escherichia coli

Journal of Clinical Microbiology, 2014

Fast and accurate identification and typing of pathogens are essential for effective surveillance... more Fast and accurate identification and typing of pathogens are essential for effective surveillance and outbreak detection. The current routine procedure is based on a variety of techniques, making the procedure laborious, time-consuming, and expensive. With whole-genome sequencing (WGS) becoming cheaper, it has huge potential in both diagnostics and routine surveillance. The aim of this study was to perform a real-time evaluation of WGS for routine typing and surveillance of verocytotoxin-producing Escherichia coli (VTEC). In Denmark, the Statens Serum Institut (SSI) routinely receives all suspected VTEC isolates. During a 7-week period in the fall of 2012, all incoming isolates were concurrently subjected to WGS using IonTorrent PGM. Real-time bioinformatics analysis was performed using web-tools (www.genomicepidemiology.org) for species determination, multilocus sequence type (MLST) typing, and determination of phylogenetic relationship, and a specific VirulenceFinder for detection of E. coli virulence genes was developed as part of this study. In total, 46 suspected VTEC isolates were characterized in parallel during the study. VirulenceFinder proved successful in detecting virulence genes included in routine typing, explicitly verocytotoxin 1 (vtx1), verocytotoxin 2 (vtx2), and intimin (eae), and also detected additional virulence genes. VirulenceFinder is also a robust method for assigning verocytotoxin (vtx) subtypes. A real-time clustering of isolates in agreement with the epidemiology was established from WGS, enabling discrimination between sporadic and outbreak isolates. Overall, WGS typing produced results faster and at a lower cost than the current routine. Therefore, WGS typing is a superior alternative to conventional typing strategies. This approach may also be applied to typing and surveillance of other pathogens.

Research paper thumbnail of Draft Genome Sequence of the Yeast Pachysolen tannophilus CBS 4044/NRRL Y-2460

Eukaryotic Cell, 2012

A draft genome sequence of the yeast Pachysolen tannophilus CBS 4044/NRRL Y-2460 is presented. Th... more A draft genome sequence of the yeast Pachysolen tannophilus CBS 4044/NRRL Y-2460 is presented. The organism has the potential to be developed as a cell factory for biorefineries due to its ability to utilize waste feedstocks. The sequenced genome size was 12,238,196 bp, consisting of 34 scaffolds. A total of 4,463 genes from 5,346 predicted open reading frames were annotated with function.

snpTree - a web-server to identify and construct SNP trees from whole genome sequence data

BMC Genomics, 2012

Background: Advances in whole genome sequencing (WGS) and its decreasing cost will soon make this technology available for routine infectious disease epidemiology. In epidemiological studies, outbreak isolates have very little diversity and require extensive genomic analysis to differentiate and classify isolates. One of the most successful and widely used methods is analysis of single nucleotide polymorphisms (SNPs). Currently, there are different tools and methods to identify SNPs, each with various options and cut-off values. Furthermore, all current methods require bioinformatic skills. Thus, we lack a standard and simple automatic tool to determine SNPs and construct phylogenetic trees from WGS data. Results: Here we introduce snpTree, a server for automated online SNP analysis. The tool is composed of different SNP analysis suites and Perl and Python scripts. snpTree can identify SNPs and construct phylogenetic trees from WGS data as well as from assembled genomes or contigs. WGS data in FASTQ format are aligned to reference genomes by BWA, while contigs in FASTA format are processed by Nucmer. SNPs are concatenated based on their position on the reference genome, and a tree is constructed from the concatenated SNPs using FastTree and a Perl script. The online server was implemented using HTML, Java and Python scripts. The server was evaluated using four published bacterial WGS data sets (V. cholerae, S. aureus CC398, S. Typhimurium and M. tuberculosis). The evaluation results for the first three cases were consistent and concordant for both raw reads and assembled genomes. In the last case, the original publication involved extensive filtering of SNPs, which could not be repeated using snpTree.
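
The concatenation step described in the snpTree abstract can be illustrated with a minimal Python sketch. This is not the snpTree implementation: the function names, the input layout (a dictionary of per-isolate SNP calls keyed by 0-based reference position), and the rule that an isolate without a call carries the reference base are all assumptions made for illustration, and the data are toy values.

```python
from typing import Dict


def concatenate_snps(reference: str, snp_calls: Dict[str, Dict[int, str]]) -> Dict[str, str]:
    """Build one pseudo-sequence per isolate from the variable reference positions.

    snp_calls maps isolate name -> {0-based reference position: called base}.
    Assumption: an isolate with no call at a position carries the reference base.
    """
    # Collect every position where at least one isolate has a SNP call,
    # ordered by position on the reference genome.
    positions = sorted({pos for calls in snp_calls.values() for pos in calls})
    return {
        name: "".join(calls.get(pos, reference[pos]) for pos in positions)
        for name, calls in snp_calls.items()
    }


def to_fasta(alignment: Dict[str, str]) -> str:
    """Render the concatenated SNP sequences as an aligned FASTA string."""
    return "".join(f">{name}\n{seq}\n" for name, seq in alignment.items())


# Toy data (hypothetical reference and isolates).
reference = "ACGTACGTAC"
snp_calls = {
    "isolate_A": {1: "T", 7: "A"},
    "isolate_B": {1: "T"},
    "isolate_C": {4: "G"},
}

alignment = concatenate_snps(reference, snp_calls)
fasta_text = to_fasta(alignment)
```

All sequences in the result have the same length, so the FASTA text is an alignment of variable sites only, which is the kind of input a tree-building program expects.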

Veillonella, Firmicutes: Microbes disguised as Gram negatives

Standards in Genomic Sciences, 2013

The Firmicutes represent a major component of the intestinal microflora. The intestinal Firmicutes are a large, diverse group of organisms, many of which are poorly characterized due to their anaerobic growth requirements. Although most Firmicutes are Gram positive, members of the class Negativicutes, including the genus Veillonella, stain Gram negative. Veillonella are among the most abundant organisms of the oral and intestinal microflora of animals and humans, in spite of being strict anaerobes. In this work, the genomes of 24 Negativicutes, including eight Veillonella spp., are compared to 20 other Firmicutes genomes; a further 101 prokaryotic genomes were included, covering 26 phyla. Thus a total of 145 prokaryotic genomes were analyzed by various methods to investigate the apparent conflict of the Veillonella Gram stain and their taxonomic position within the Firmicutes. Comparison of the genome sequences confirms that the Negativicutes are distantly related to Clostridium spp., based on 16S rRNA, complete genomic DNA sequences, and a consensus tree based on conserved proteins. The genus Veillonella is relatively homogeneous: inter-genus pairwise comparison identifies at least 1,350 shared proteins, although fewer than half of these are found in any given Clostridium genome. Only 27 proteins are found conserved in all analyzed prokaryote genomes. Veillonella has distinct metabolic properties, and significant similarities to genomes of Proteobacteria are not detected, with the exception of a shared LPS biosynthesis pathway. The clade within the class Negativicutes to which the genus Veillonella belongs exhibits unique properties, most of which are in common with Gram positives and some with Gram negatives. They are only distantly related to Clostridia, but are even less closely related to Gram-negative species.
Though the Negativicutes stain Gram negative and possess two membranes, the genome and proteome analyses presented here confirm their place within the (mainly) Gram-positive phylum of the Firmicutes. Further studies are required to unveil the evolutionary history of the Veillonella and other Negativicutes.

Population Genetics of Vibrio cholerae from Nepal in 2010: Evidence on the Origin of the Haitian Outbreak

mBio, 2011

Cholera continues to be an important cause of human infections, and outbreaks are often observed after natural disasters, such as the one following the 2010 earthquake in Haiti. Once the cholera outbreak was confirmed, rumors spread that the disease was brought to Haiti by a battalion of Nepalese soldiers serving as United Nations peacekeepers. This possible connection has never been confirmed. We used whole-genome sequence typing (WGST), pulsed-field gel electrophoresis (PFGE), and antimicrobial susceptibility testing to characterize 24 recent Vibrio cholerae isolates from Nepal and evaluate the suggested epidemiological link with the Haitian outbreak. The isolates were obtained from 30 July to 1 November 2010 from five different districts in Nepal. We compared the 24 genomes to 10 previously sequenced V. cholerae isolates, including 3 from the Haitian outbreak (began July 2010). Antimicrobial susceptibility and PFGE patterns were consistent with an epidemiological link between the isolates from Nepal and Haiti. WGST showed that all 24 V. cholerae isolates from Nepal belonged to a single monophyletic group that also contained isolates from Bangladesh and Haiti. The Nepalese isolates were divided into four closely related clusters. One cluster contained three Nepalese isolates and three Haitian isolates that were almost identical, with only 1- or 2-bp differences.