Arthur Gruber - Academia.edu

Papers by Arthur Gruber

Characterization of a Novel Mitovirus of the Sand Fly Lutzomyia longipalpis Using Genomic and Virus–Host Interaction Signatures

Viruses

Hematophagous insects act as the major reservoirs of infectious agents due to their intimate contact with a large variety of vertebrate hosts. Lutzomyia longipalpis is the main vector of Leishmania chagasi in the New World, but its role as a host of viruses is poorly understood. In this work, Lu. longipalpis RNA libraries were subjected to progressive assembly using viral profile HMMs as seeds. A sequence phylogenetically related to fungal viruses of the genus Mitovirus was identified and this novel virus was named Lul-MV-1. The 2697-base genome presents a single gene coding for an RNA-directed RNA polymerase with an organellar genetic code. To determine the possible host of Lul-MV-1, we analyzed the molecular characteristics of the viral genome. Dinucleotide composition and codon usage showed profiles similar to mitochondrial DNA of invertebrate hosts. Also, the virus-derived small RNA profile was consistent with the activation of the siRNA pathway, with size distribution and 5′ ba...

Description and characterization of the Amazonian entomopathogenic bacterium Photorhabdus luminescens MN7

Many isolates of the genus Photorhabdus have been reported around the world. Here we describe the first Brazilian Photorhabdus isolate, found in association with the entomopathogenic nematode Heterorhabditis baujardi LPP7, from the Amazonian forest in Monte Negro (RO, Brazil). The new isolate can be grouped with the Hb-Hm clade of P. luminescens subsp. luminescens, close to the new subspecies P. luminescens subsp. sonorensis. P. luminescens MN7 has several characteristics expected of variant form I cells, such as the presence of intracellular crystals, secretion of hydrolytic enzymes (lipases and proteases) and bioluminescence. Although H. baujardi LPP7 is not prolific when compared to H. bacteriophora HP88, P. luminescens MN7 is clearly pathogenic and probably secretes the same toxins as P. luminescens subsp. luminescens W14, when fed to larvae of the greater wax moth Galleria mellonella. This behavior is different from what is found in Photorhabdus luminescens subsp. laumondii HP8...

Bioinformatics Meets Virology: The European Virus Bioinformatics Center's Second Annual Meeting

Viruses, May 14, 2018

The Second Annual Meeting of the European Virus Bioinformatics Center (EVBC), held in Utrecht, Netherlands, focused on computational approaches in virology, with topics including (but not limited to) virus discovery, diagnostics, (meta-)genomics, modeling, epidemiology, molecular structure, evolution, and viral ecology. The goals of the Second Annual Meeting were threefold: (i) to bring together virologists and bioinformaticians from across the academic, industrial, professional, and training sectors to share best practice; (ii) to provide a meaningful and interactive scientific environment to promote discussion and collaboration between students, postdoctoral fellows, and both new and established investigators; (iii) to inspire and suggest new research directions and questions. Approximately 120 researchers from around the world attended the Second Annual Meeting of the EVBC this year, including 15 renowned international speakers. This report presents an overview of new development...

Use of profile hidden Markov models in viral discovery: current insights

Advances in Genomics and Genetics

Sequence similarity searches are the bioinformatic cornerstone of molecular sequence analysis for all domains of life. However, large amounts of divergence between organisms, such as those seen among viruses, can significantly hamper analyses. Profile hidden Markov models (profile HMMs) are among the most successful approaches for dealing with this problem and represent an invaluable tool for viral identification efforts. Profile HMMs are statistical models that convert information from a multiple sequence alignment into a set of probability values that reflect position-specific variation levels in all members of evolutionarily related sequences. Since profile HMMs represent a wide spectrum of variation, these models show higher sensitivity than conventional similarity methods such as BLAST for the detection of remote homologs. In recent years, there has been an effort to compile viral sequences from different viral taxonomic groups into integrated databases, such as Prokaryotic Virus Orthologous Groups (pVOGs) and the vFam database of profile HMMs, which provide functional annotation, multiple sequence alignments, and profile HMMs. Since these databases rely on viral sequences collected from GenBank and RefSeq, they suffer to varying extents from uneven taxonomic sampling, with low sequence representation of many viral groups, which affects the efficacy of the models. One of the interesting applications of viral profile HMMs is the detection and sequence reconstruction of specific viral genomes from metagenomic data. In fact, several DNA assembly programs that use profile HMMs as seeds have been developed to identify and build gene-sized assemblies or viral genome sequences of unrestrained length, using conventional and progressive assembly approaches, respectively. In this review, we address these aspects and cover some up-to-date information on viral genomics that should be considered in the choice of molecular markers for viral discovery. Finally, we propose a roadmap for rational development of viral profile HMMs and discuss the main challenges associated with this task.
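
The idea of turning a multiple sequence alignment into position-specific probabilities can be illustrated with a small Python sketch. It is deliberately simplified and is not taken from the review: it models only per-column emission probabilities (no insert or delete states, so it is closer to a position-specific scoring matrix than to a full profile HMM), and the function names `match_emissions` and `log_odds`, the pseudocount of 1.0, the uniform background, and the toy alignment are all assumptions made for illustration.

```python
from collections import Counter
import math

ALPHABET = "ACGT"


def match_emissions(alignment, pseudocount=1.0):
    """Estimate per-column emission probabilities from aligned sequences.

    Gap characters (anything outside ALPHABET) are ignored when counting,
    and additive smoothing keeps unseen residues at a non-zero probability.
    """
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned rows must have the same length")
    profile = []
    for col in range(length):
        counts = Counter(row[col] for row in alignment if row[col] in ALPHABET)
        total = sum(counts.values()) + pseudocount * len(ALPHABET)
        profile.append({s: (counts[s] + pseudocount) / total for s in ALPHABET})
    return profile


def log_odds(profile, sequence, background=0.25):
    """Score an ungapped sequence as the sum of log2(emission / background)."""
    if len(sequence) != len(profile):
        raise ValueError("sequence length must match the number of columns")
    return sum(math.log2(col[s] / background) for col, s in zip(profile, sequence))


if __name__ == "__main__":
    toy_alignment = ["ACGT", "ACGA", "ACTT", "A-GT"]  # hypothetical data
    toy_profile = match_emissions(toy_alignment)
    for candidate in ("ACGT", "TTTT"):
        print(candidate, round(log_odds(toy_profile, candidate), 3))
```

A sequence that follows the conserved columns scores above zero, while an unrelated one scores below it. Real profile HMM software additionally models insertions and deletions and weights the input sequences.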

Draft Genome Sequence of Curtobacterium sp. Strain ER1/6, an Endophytic Strain Isolated from Citrus sinensis with Potential To Be Used as a Biocontrol Agent

Genome Announcements, 2016

Herein, we report a draft genome sequence of the endophytic Curtobacterium sp. strain ER1/6, which was isolated from a surface-sterilized Citrus sinensis branch and showed the capability to control phytopathogens. Functional annotation of the ~3.4-Mb genome revealed 3,100 protein-coding genes, with many products related to known ecological and biotechnological aspects of this bacterium.

Pilot survey of expressed sequence tags (ESTs)

Bioinformatics in Tropical Disease Research

They were designed as crash courses for biological researchers, trying to cover in 2 weeks enough information to enable biologists to continue their bioinformatics training independently. During these courses, it became clear that, in spite of the growing number of books in bioinformatics, most of the books covered the issues either at a theoretical level that was too deep for beginners or at a level that was too superficial to convey the complexity of the issues. Additionally, we were not aware of any book that, for each theoretical subject, provided a detailed tutorial of free software that could be used in day-to-day research. Finally, in our experience, most of the researchers in biological subject areas feel intimidated when confronted with bioinformatics programs.

Hernando del Portillo, the initial course coordinator, then had the idea of creating such a book, taking advantage of the great human resources that were involved in the lectures from the various modules of the several WHO-TDR courses held in Brazil and other countries. He was also the person who got the group of editors together. Hernando, unfortunately, was able to be involved in only the first part of the editorial process, having to leave the book project before the last phase of the work. However, his contribution remains in the effort to select the authors, advocating for the case studies part, making the initial evaluations of the chapters, and co-authoring one of the chapters.

All of the editors of this book have extensive experience in both teaching and organizing bioinformatics courses. Arthur Gruber was part of the organizing committee of all the Brazilian courses, co-Primary Investigator (PI) of the last one, and instructor in short-term Bioinformatics courses in Brazil, South Africa, Peru, and Colombia. Alan Durham was part of the organizing committee of the first course in Brazil, co-PI of the second and third courses in Brazil, and PI of the 2006 course in Brazil; he was also part of the organizing committees and teaching staff in short-term courses in Brazil, South Africa, Thailand, and India. Chuong Huynh has been on the coordinating committee for all of the WHO-TDR courses around the world (five in Brazil, four in South Africa, four in India, and seven in Thailand) and has also been an instructor in all of them. Hernando del Portillo was the PI of the first three Brazilian courses. Finally, Alan, Arthur, and Hernando were part of the group that organized in 2002 one of the first PhD Bioinformatics Programs in Brazil.

This book is intended to serve both as a textbook for short bioinformatics courses and as a base for a self-teaching endeavor. It is divided into two parts: A. Bioinformatics Techniques and B. Case Studies. Each chapter of the first part addresses a specific problem in bioinformatics and consists of a theoretical part and of a detailed tutorial with practical applications of that theory using software freely available on the Internet. All of the authors who were selected for this part of the book have extensive experience in teaching Bioinformatics, either at WHO-TDR and other short-term courses, at universities around the world, or at both. In the second part, we invited renowned researchers to write chapters that represent up-to-date reviews of particular human diseases, including biological aspects and bioinformatics approaches that helped to solve specific problems. The book is intended to be a continuous project, and we expect it to be regularly expanded and updated in the future. This characteristic and the desire to make the book widely available, particularly to researchers in developing countries, were key points that led to the decision of open web publishing, as opposed to a paper version.

We feel extremely fortunate with our choice of authors and expect this book to be very useful for both teachers and individual researchers. We thank all of the authors for their great work and for maintaining the motivation in spite of the delays during these last 2 years. Finally, we are deeply grateful to Belinda Beck and Laura Dean for their extremely competent and comprehensive editorial work.

Prevalência de Campylobacter jejuni e C. coli em fezes normais e diarréicas de cães em São Paulo (Prevalence of Campylobacter jejuni and C. coli in normal and diarrheic feces of dogs in São Paulo)

Revista De Microbiologia, Dec 1, 1985

Estudo da doença de Chagas: abordagem molecular (Study of Chagas disease: a molecular approach)

GenSeed-HMM: A Tool for Progressive Assembly Using Profile HMMs as Seeds and its Application in Alpavirinae Viral Discovery from Metagenomic Data

Frontiers in Microbiology, 2016

This work reports the development of GenSeed-HMM, a program that implements seed-driven progressive assembly, an approach to reconstruct specific sequences from unassembled data, starting from short nucleotide or protein seed sequences or profile Hidden Markov Models (HMM). The program can use any one of a number of sequence assemblers. Assembly is performed in multiple steps and relatively few reads are used in each cycle; consequently, the program demands low computational resources. As a proof-of-concept and to demonstrate the power of HMM-driven progressive assemblies, GenSeed-HMM was applied to metagenomic datasets in the search for diverse ssDNA bacteriophages from the recently described Alpavirinae subfamily. Profile HMMs were built using Alpavirinae-specific regions from multiple sequence alignments (MSA) using either the viral protein 1 (VP1; major capsid protein) or VP4 (genome replication initiation protein). These profile HMMs were used by GenSeed-HMM (running Newbler assembler) as seeds to reconstruct viral genomes from sequencing datasets of human fecal samples. All contigs obtained were annotated and taxonomically classified using similarity searches and phylogenetic analyses. The most specific profile HMM seed enabled the reconstruction of 45 partial or complete Alpavirinae genomic sequences. A comparison with conventional (global) assembly of the same original dataset, using Newbler in a standalone execution, revealed that GenSeed-HMM outperformed global genomic assembly in several metrics employed. This approach is capable of detecting organisms that have not been used in the construction of the profile HMM, which opens up the possibility of diagnosing novel viruses, without previous specific information, constituting a de novo diagnosis. Additional applications include, but are not limited to, the specific assembly of extrachromosomal elements such as plastid and mitochondrial genomes from metagenomic data. Profile HMM seeds can also be used to reconstruct specific protein coding genes for gene diversity studies, and to determine all possible gene variants present in a metagenomic sample. Such surveys could be useful to detect the emergence of drug-resistance variants in sensitive environments such as hospitals and animal production facilities, where antibiotics are regularly used. Finally, GenSeed-HMM can be used as an adjunct for gap closure on assembly finishing projects, by using multiple contig ends as anchored seeds.
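
The cycle described in this abstract (start from a seed, recruit reads that match the growing sequence, extend, repeat until nothing changes) can be sketched as a toy in Python. This is not GenSeed-HMM's implementation: the sketch assumes error-free reads on a single strand, uses exact suffix/prefix overlaps in place of similarity searches and a real assembler, and scans every read in each cycle. The names `extend_right` and `progressive_assembly` and the parameters `min_overlap` and `max_cycles` are hypothetical.

```python
def extend_right(contig, reads, min_overlap):
    """Extend the contig with the read whose prefix best overlaps its 3' end.

    Only overlaps shorter than the read are considered, so a matching read
    always contributes at least one new base.
    """
    best, best_overlap = contig, 0
    for read in reads:
        max_overlap = min(len(contig), len(read) - 1)
        for overlap in range(max_overlap, min_overlap - 1, -1):
            if contig.endswith(read[:overlap]):
                if overlap > best_overlap:
                    best, best_overlap = contig + read[overlap:], overlap
                break  # longest overlap for this read found
    return best


def progressive_assembly(seed, reads, min_overlap=6, max_cycles=100):
    """Grow a contig from a seed, one extension per end per cycle."""
    reversed_reads = [read[::-1] for read in reads]
    contig = seed
    for _ in range(max_cycles):
        grown = extend_right(contig, reads, min_overlap)
        # Left extension is a right extension on the reversed strings.
        grown = extend_right(grown[::-1], reversed_reads, min_overlap)[::-1]
        if grown == contig:  # no read extends either end: stop
            break
        contig = grown
    return contig


if __name__ == "__main__":
    genome = "ATGGCGTACGTTAGCCTAGGATCCGATTACA"  # hypothetical target sequence
    reads = [genome[i:i + 12] for i in range(len(genome) - 11)]
    print(progressive_assembly(genome[12:20], reads))
```

The `max_cycles` cap is a safeguard against endless growth on repetitive sequence, which a toy overlap rule like this one cannot resolve.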

Trypanosoma cruzi defined antigens in the serological evaluation of an outbreak of acute Chagas disease in Brazil (Catolé do Rocha, Paraíba)

Memorias Do Instituto Oswaldo Cruz, Feb 1, 1996

Immunoglobulin G and M humoral response to recombinant protein B13 and glycoconjugate LPPG Trypanosoma cruzi defined antigens was evaluated by ELISA in 18 patients in the acute phase of Chagas disease, who were contaminated on the same occasion. LPPG showed 100% positivity detecting both IgM and IgG antibodies, while positivity of 55-65% was observed for B13. An epimastigote alkaline extract (EPI) also showed high sensitivity for acute IgM (100%) and IgG (90%) antibodies. However LPPG had better discriminatory reactivity since with EPI two patients showed negative IgG, and several other sera presented OD values for IgG and IgM antibodies very close to the cutoff. Thus, it is suggested that detection of IgM antibodies by LPPG may be used for diagnosis of the acute phase of Chagas disease. An intense decline of IgG and IgM antibodies to the three antigens was observed in response to anti-T. cruzi chemotherapy in all acute phase patients. After treatment, six (30%) individuals maintained IgG positivity to EPI, LPPG, and B13 with lower reactivity than that measured at the acute phase. For comparison, serology of a group of 22 patients in the chronic phase of Chagas disease and also submitted to chemotherapy was determined. Positive IgM antibodies to EPI, LPPG and B13 were detected in only 5-9% cases. In all chronic-phase patients IgG antibodies highly reactive to the three antigens were present and no significant decrease resulted after benznidazole administration. These observations reinforce previous reports that treatment in the acute phase may reduce or eliminate the parasite.

The pangenome of the Anticarsia gemmatalis multiple nucleopolyhedrovirus (AgMNPV)

Genome biology and evolution, Jan 27, 2015

The alphabaculovirus Anticarsia gemmatalis multiple nucleopolyhedrovirus (AgMNPV) is the world's most successful viral bioinsecticide. Through the 1980s and 1990s, this virus was extensively used for biological control of populations of Anticarsia gemmatalis (Velvetbean caterpillar) in soybean crops. During this period, genetic studies identified several variable loci in the AgMNPV; however, most of them were not characterized at the sequence level. In this study we report a full genome comparison among 17 wild-type isolates of AgMNPV. We found the pangenome of this virus to contain at least 167 hypothetical genes, 151 of which are shared by all genomes. The gene bro-a that might be involved in host specificity and carrying transporter, is absent in some genomes, and new hypothetical genes were observed. Among these genes there is a unique rnf12-like gene, probably implicated in ubiquitination. Events of gene fission and fusion are common, since four genes have been ob...

Caracterização de dois genes contíguos de Trypanosoma cruzi que codificam antígenos com repetições de epitopos imunodominantes (Characterization of two contiguous Trypanosoma cruzi genes encoding antigens with repeats of immunodominant epitopes)

To Jorge Yanovsky (Polychaco 8 S.A.I.C-Argentina), for the data from the ELISA experiments performed with sera from the Instituto Fatala Chaben. To Egler Chiari (UFMG) and João Carlos Pinto Dias (FIOCRUZ-MG), for the human sera from the Bambuí region. To professors Hamza F. A. El Dorry, Suely L. Gomes and Antonio G. Bianchi, for allowing the use of reagents and equipment in their laboratories and for the countless suggestions they always gave me. To the colleagues in our laboratory: Jesus,

E-Gene - A modular and configurable pipeline system for automated DNA sequence analysis

DNA reads generated by large-scale sequencing projects have to be processed before further analyses in order to perform vector/primer masking, low-quality trimming and contaminant removal. This sequential processing involves several steps and the use of different computer programs, each one following its own calling convention and input/output formats. As a consequence, building a sequence processing pipeline is generally a tedious and exhaustive task. In addition, the currently available pipelines are often poorly documented, requiring careful examination of Perl, Python or shell script code before they can be properly configured. Some others are more specialized to perform determined functions [1,2], not allowing the incorporation of new modules in an integrated way. This paper reports the development of E-Gene, an integrated system that makes pipeline construction a modular job. We have created a unified view of the building blocks of the pipeline and of the integration pattern o...
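
To make the notion of a modular read-processing pipeline concrete, the following Python sketch chains the three steps named in the abstract (vector masking, low-quality trimming, contaminant removal) as interchangeable functions. It does not reproduce E-Gene's design or interfaces, which the truncated abstract does not detail: the `Read` record, the step factories, the exact-match rules and `run_pipeline` are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Read:
    name: str
    seq: str
    quals: List[int]  # one quality score per base


# A step takes a read and returns a (possibly modified) read, or None to drop it.
Step = Callable[[Read], Optional[Read]]


def mask_vector(vector: str) -> Step:
    """Replace exact occurrences of a vector sequence with X characters."""
    def step(read: Read) -> Optional[Read]:
        masked = read.seq.replace(vector, "X" * len(vector))
        return Read(read.name, masked, read.quals)
    return step


def trim_low_quality(min_qual: int) -> Step:
    """Trim bases from the 3' end while their quality is below min_qual."""
    def step(read: Read) -> Optional[Read]:
        end = len(read.seq)
        while end > 0 and read.quals[end - 1] < min_qual:
            end -= 1
        return Read(read.name, read.seq[:end], read.quals[:end])
    return step


def drop_contaminants(contaminants: List[str]) -> Step:
    """Discard reads containing any of the given contaminant sequences."""
    def step(read: Read) -> Optional[Read]:
        if any(c in read.seq for c in contaminants):
            return None
        return read
    return step


def run_pipeline(reads: List[Read], steps: List[Step]) -> List[Read]:
    """Apply the steps in order to each read, keeping reads that survive all."""
    kept = []
    for read in reads:
        current: Optional[Read] = read
        for step in steps:
            current = step(current)
            if current is None:
                break
        if current is not None:
            kept.append(current)
    return kept
```

Because every step shares one signature, steps can be reordered, removed or added by editing the list passed to `run_pipeline`, which is the property the abstract emphasizes.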

One-base double-stranded DNA sequencing reaction using commercial kits

Molecular approaches to diagnosis of Chagas' disease: use of defined antigens and a target ribosomal RNA sequence

Biological research, 1993

We evaluated the performance of two defined antigens in the serological diagnosis of Chagas' disease. One of them is a recombinant protein named B13 isolated from a genomic library of Trypanosoma cruzi in the expression vector lambda gt11. We show that the gene corresponding to B13 is conserved in the evolutive stages of the two "polar" strains of T. cruzi. The protein epitopes cloned in B13 are represented in 140 kDa, 116 kDa and 35 kDa polypeptides of trypomastigotes. The other antigen chosen for serodiagnosis is a lipopeptidophosphoglycan (LPPG). This glycoconjugate is also widely distributed in T. cruzi strains. The use of a rabbit serum to LPPG allowed the demonstration that this molecule bears epitopes in common to LPPG-like components and to 80-90 kDa glycoproteins of trypomastigotes. Both B13 and LPPG were evaluated in serodiagnosis by ELISA and RIA using a panel of normal human, Chagasic and Leishmaniasis sera. It was observed that B13 presents high sensitivit...

Proceedings of the IXth International Coccidiosis Conference, Foz de Iguassu, Parana, Brazil, 19-23 September, 2005

Long, P. L. (1969). Observations on the growth of Eimeria tenella in cultured cells from the parasitized chorioallantoic membranes of the developing chick embryo.

A Selective Review of Advances in Coccidiosis Research

Advances in Parasitology, 2013

Coccidiosis is a widespread and economically significant disease of livestock caused by protozoan parasites of the genus Eimeria. This disease is worldwide in occurrence and costs the animal agricultural industry many millions of dollars to control. In recent years, the modern tools of molecular biology, biochemistry, cell biology and immunology have been used to expand greatly our knowledge of these parasites and the disease they cause. Such studies are essential if we are to develop new means for the control of coccidiosis. In this chapter, selective aspects of the biology of these organisms, with emphasis on recent research in poultry, are reviewed. Topics considered include taxonomy, systematics, genetics, genomics, transcriptomics, proteomics, transfection, oocyst biogenesis, host cell invasion, immunobiology, diagnostics and control.

Pilot survey of expressed sequence tags (ESTs) from the asexual blood stages of Plasmodium vivax in human patients

Malaria Journal, 2003

Background: Plasmodium vivax is the most widely distributed human malaria, responsible for 70-80 million clinical cases each year and large socio-economical burdens for countries such as Brazil where it is the most prevalent species. Unfortunately, due to the impossibility of growing this parasite in continuous in vitro culture, research on P. vivax remains largely neglected. Methods: A pilot survey of expressed sequence tags (ESTs) from the asexual blood stages of P. vivax was performed. To do so, 1,184 clones from a cDNA library constructed with parasites obtained from 10 different human patients in the Brazilian Amazon were sequenced. Sequences were automatically processed to remove contaminants and low quality reads. A total of 806 sequences with an average length of 586 bp met such criteria and their clustering revealed 666 distinct events. The consensus sequence of each cluster and the unique sequences of the singlets were used in similarity searches against different databases that included P. vivax, Plasmodium falciparum, Plasmodium yoelii, Plasmodium knowlesi, Apicomplexa and the GenBank non-redundant database. An E-value of <10^-30 was used to define a significant database match. ESTs were manually assigned a gene ontology (GO) terminology. Results: A total of 769 ESTs could be assigned a putative identity based upon sequence similarity to known proteins in GenBank. Moreover, 292 ESTs were annotated and a GO terminology was assigned to 164 of them. Conclusion: These are the first ESTs reported for P. vivax and, as such, they represent a valuable resource to assist in the annotation of the P. vivax genome currently being sequenced. Moreover, since the GC-content of the P. vivax genome is strikingly different from that of P. falciparum, these ESTs will help in the validation of gene predictions for P. vivax and to create a gene index of this malaria parasite.

Research paper thumbnail of Characterization of a Novel Mitovirus of the Sand Fly Lutzomyia longipalpis Using Genomic and Virus–Host Interaction Signatures

Viruses

Hematophagous insects act as the major reservoirs of infectious agents due to their intimate cont... more Hematophagous insects act as the major reservoirs of infectious agents due to their intimate contact with a large variety of vertebrate hosts. Lutzomyia longipalpis is the main vector of Leishmania chagasi in the New World, but its role as a host of viruses is poorly understood. In this work, Lu. longipalpis RNA libraries were subjected to progressive assembly using viral profile HMMs as seeds. A sequence phylogenetically related to fungal viruses of the genus Mitovirus was identified and this novel virus was named Lul-MV-1. The 2697-base genome presents a single gene coding for an RNA-directed RNA polymerase with an organellar genetic code. To determine the possible host of Lul-MV-1, we analyzed the molecular characteristics of the viral genome. Dinucleotide composition and codon usage showed profiles similar to mitochondrial DNA of invertebrate hosts. Also, the virus-derived small RNA profile was consistent with the activation of the siRNA pathway, with size distribution and 5′ ba...

Research paper thumbnail of Description and charactrization of the Amazonian entomopathogenic bacterium Photorhabdus luminescens MN7

Many isolates of the genus Photorhabdus have been reported around the world. Here we describe the... more Many isolates of the genus Photorhabdus have been reported around the world. Here we describe the first Brazilian Photorhabdus isolate, found in association with the entomopathogenic nematode Heterorhabditis baujardi LPP7, from the Amazonian forest in Monte Negro (RO, Brazil). The new isolate can be grouped with the Hb-Hm clade of P. luminescens subsp. luminescens, close to the new subspecies P. luminescens subsp. sonorensis. P. luminescens MN7 has several characteristics expected of variant form I cells, such as the presence of intracellular crystals, secretion of hydrolytic enzymes (lipases and proteases) and bioluminescence. Although H. baujardi LPP7 is not prolific when compared to H. bacteriophora HP88, P. luminescens MN7 is clearly pathogenic and probably secretes the same toxins as P. luminescens subsp. luminescens W14, when fed to larvae of the greater wax moth Galleria mellonella. This behavior is different from what is found in Photorhabdus luminescens subsp. laumondii HP8...

Research paper thumbnail of Bioinformatics Meets Virology: The European Virus Bioinformatics Center's Second Annual Meeting

Viruses, May 14, 2018

The Second Annual Meeting of the European Virus Bioinformatics Center (EVBC), held in Utrecht, Ne... more The Second Annual Meeting of the European Virus Bioinformatics Center (EVBC), held in Utrecht, Netherlands, focused on computational approaches in virology, with topics including (but not limited to) virus discovery, diagnostics, (meta-)genomics, modeling, epidemiology, molecular structure, evolution, and viral ecology. The goals of the Second Annual Meeting were threefold: (i) to bring together virologists and bioinformaticians from across the academic, industrial, professional, and training sectors to share best practice; (ii) to provide a meaningful and interactive scientific environment to promote discussion and collaboration between students, postdoctoral fellows, and both new and established investigators; (iii) to inspire and suggest new research directions and questions. Approximately 120 researchers from around the world attended the Second Annual Meeting of the EVBC this year, including 15 renowned international speakers. This report presents an overview of new development...

Research paper thumbnail of Use of profile hidden Markov models in viral discovery: current insights

Advances in Genomics and Genetics

Sequence similarity searches are the bioinformatic cornerstone of molecular sequence analysis for... more Sequence similarity searches are the bioinformatic cornerstone of molecular sequence analysis for all domains of life. However, large amounts of divergence between organisms, such as those seen among viruses, can significantly hamper analyses. Profile hidden Markov models (profile HMMs) are among the most successful approaches for dealing with this problem, which represent an invaluable tool for viral identification efforts. Profile HMMs are statistical models that convert information from a multiple sequence alignment into a set of probability values that reflect positionspecific variation levels in all members of evolutionarily related sequences. Since profile HMMs represent a wide spectrum of variation, these models show higher sensitivity than conventional similarity methods such as BLAST for the detection of remote homologs. In recent years, there has been an effort to compile viral sequences from different viral taxonomic groups into integrated databases, such as Prokaryotic Virus Orthlogous Groups (pVOGs) and database of profile HMMs (vFam) database, which provide functional annotation, multiple sequence alignments, and profile HMMs. Since these databases rely on viral sequences collected from GenBank and RefSeq, they suffer in variable extent from uneven taxonomic sampling, with low sequence representation of many viral groups, which affects the efficacy of the models. One of the interesting applications of viral profile HMMs is the detection and sequence reconstruction of specific viral genomes from metagenomic data. In fact, several DNA assembly programs that use profile HMMs as seeds have been developed to identify and build gene-sized assemblies or viral genome sequences of unrestrained length, using conventional and progressive assembly approaches, respectively. 
In this review, we address these aspects and cover some up-to-date information on viral genomics that should be considered in the choice of molecular markers for viral discovery. Finally, we propose a roadmap for rational development of viral profile HMMs and discuss the main challenges associated with this task.

Research paper thumbnail of Draft Genome Sequence of Curtobacterium sp. Strain ER1/6, an Endophytic Strain Isolated from Citrus sinensis with Potential To Be Used as a Biocontrol Agent

Genome Announcements, 2016

Herein, we report a draft genome sequence of the endophytic Curtobacterium sp. strain ER1/6, isol... more Herein, we report a draft genome sequence of the endophytic Curtobacterium sp. strain ER1/6, isolated from a surface-sterilized Citrus sinensis branch, and it presented the capability to control phytopathogens. Functional annotation of the ~3.4-Mb genome revealed 3,100 protein-coding genes, with many products related to known ecological and biotechnological aspects of this bacterium.

Research paper thumbnail of Pilot survey of expressed sequence tags (ESTs)

Research paper thumbnail of Bioinformatics in Tropical Disease Research

They were designed as crash courses for biological researchers, trying to cover in 2 weeks enough... more They were designed as crash courses for biological researchers, trying to cover in 2 weeks enough information to enable biologists to continue their bioinformatics training independently. During these courses, it became clear that, in spite of the growing number of books in bioinformatics, most of the books covered the issues either at a theoretical level that was too deep for beginners or at a level that was too superficial to convey the complexity of the issues. Additionally, we were not aware of any book that, for each theoretical subject, provided a detailed tutorial of free software that could be used in day-today research. Finally, in our experience, most of the researchers in biological subject areas feel intimidated when confronted with bioinformatics programs. Hernando del Portillo, the initial course coordinator, then had the idea of creating such a book, taking advantage of the great human resources that were involved in the lectures from the various modules of the several WHO-TDR courses held in Brazil and other countries. He was also the person who got the group of editors together. Hernando, unfortunately, was able to be involved in only the first part of the editorial process, having to leave the book project before the last phase of the work. However, his contribution remains in the effort to select the authors, advocating for the case studies part, making the initial evaluations of the chapters, and co-authoring one of the chapters. All of the editors of this book have extensive experience in both teaching and organizing bioinformatics courses. Arthur Gruber was part of the organizing committee of all the Brazilian courses, co-Primary Investigator (PI) of the last one, and instructor in short-term Bioinformatics courses in Brazil, South Africa, Peru, and Colombia. 
Alan Durham was part of the organizing committee of the first course in Brazil, co-PI of the second and third courses in Brazil, and PI of the 2006 course in Brazil; he was also part of the organizing committees and teaching staff in short-term courses in Brazil, South Africa, Thailand, and India. Chuong Huynh has been on the coordinating committee for all of the WHO-TDR courses around the world (five in Brazil, four in South Africa, four in India, and seven in Thailand) and has also been an instructor in all of them. Hernando del Portillo was the PI of the first three Brazilian courses. Finally, Alan, Arthur, and Hernando were part of the group that organized in 2002 one of the first PhD Bioinformatics Programs in Brazil. This book is intended to serve both as a textbook for short bioinformatics courses and as a base for a self-teaching endeavor. It is divided into two parts: A. Bioinformatics Techniques and B. Case Studies. Each chapter of the first part addresses a specific problem in bioinformatics and consists of a theoretical part and of a detailed tutorial with practical applications of that theory using software freely available on the Internet. All of the authors who were selected for this part of the book have extensive experience in teaching Bioinformatics, either at WHO-TDR and other short-term courses, at universities around the world, or at both. In the second part, we invited renowned researchers to write chapters that represent up-to-date reviews of particular human diseases, including biological aspects and bioinformatics approaches that helped to solve specific problems. The book is intended to be a continuous project, and we expect it to be regularly expanded and updated in the future. This characteristic and the desire to make the book widely available, particularly to researchers in developing countries, were key points that led to the decision of open web publishing, as opposed to a paper version. 
We feel extremely fortunate with our choice of authors and expect this book to be very useful for both teachers and individual researchers. We thank all of the authors for their great work and for maintaining the motivation in spite of the delays during these last 2 years. Finally, we are deeply grateful to Belinda Beck and Laura Dean for their extremely competent and comprehensive editorial work.

Research paper thumbnail of Prevalência de Campylobacter jejuni e C. coli em fezes normais e diarréicas de cães em São Paulo

Revista De Microbiologia, Dec 1, 1985

Research paper thumbnail of Estudo da doença de Chagas: abordagem molecular

Research paper thumbnail of GenSeed-HMM: A Tool for Progressive Assembly Using Profile HMMs as Seeds and its Application in Alpavirinae Viral Discovery from Metagenomic Data

Frontiers in Microbiology, 2016

This work reports the development of GenSeed-HMM, a program that implements seed-driven progressive assembly, an approach to reconstruct specific sequences from unassembled data, starting from short nucleotide or protein seed sequences or profile Hidden Markov Models (HMM). The program can use any one of a number of sequence assemblers. Assembly is performed in multiple steps and relatively few reads are used in each cycle; consequently, the program demands low computational resources. As a proof-of-concept and to demonstrate the power of HMM-driven progressive assemblies, GenSeed-HMM was applied to metagenomic datasets in the search for diverse ssDNA bacteriophages from the recently described Alpavirinae subfamily. Profile HMMs were built using Alpavirinae-specific regions from multiple sequence alignments (MSA) using either the viral protein 1 (VP1; major capsid protein) or VP4 (genome replication initiation protein). These profile HMMs were used by GenSeed-HMM (running Newbler assembler) as seeds to reconstruct viral genomes from sequencing datasets of human fecal samples. All contigs obtained were annotated and taxonomically classified using similarity searches and phylogenetic analyses. The most specific profile HMM seed enabled the reconstruction of 45 partial or complete Alpavirinae genomic sequences. A comparison with conventional (global) assembly of the same original dataset, using Newbler in a standalone execution, revealed that GenSeed-HMM outperformed global genomic assembly in several metrics employed. This approach is capable of detecting organisms that have not been used in the construction of the profile HMM, which opens up the possibility of diagnosing novel viruses, without previous specific information, constituting a de novo diagnosis. 
Additional applications include, but are not limited to, the specific assembly of extrachromosomal elements such as plastid and mitochondrial genomes from metagenomic data. Profile HMM seeds can also be used to reconstruct specific protein coding genes for gene diversity studies, and to determine all possible gene variants present in a metagenomic sample. Such surveys could be useful to detect the emergence of drug-resistance variants in sensitive environments such as hospitals and animal production facilities, where antibiotics are regularly used. Finally, GenSeed-HMM can be used as an adjunct for gap closure on assembly finishing projects, by using multiple contig ends as anchored seeds.
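The idea of seed-driven progressive assembly described in the abstract above can be illustrated with a toy sketch. It is not the GenSeed-HMM implementation: the abstract states that the program recruits reads using sequence or profile HMM seeds and delegates assembly to a third-party assembler, whereas this sketch substitutes exact suffix/prefix overlap matching, ignores the reverse strand and sequencing errors, and uses hypothetical function names and a made-up 24-base sequence. It only shows the cycle of growing a contig outward from a seed while touching few reads at each step.

```python
def extend_right(contig, reads, min_overlap):
    """Extend contig to the right with the read that overlaps its end the most."""
    best, best_k = contig, 0
    for read in reads:
        if read in contig:  # read is already fully incorporated
            continue
        # The overlap must leave at least one new base to add.
        max_k = min(len(contig), len(read) - 1)
        for k in range(max_k, min_overlap - 1, -1):
            if contig.endswith(read[:k]):
                if k > best_k:
                    best, best_k = contig + read[k:], k
                break  # longest overlap for this read found
    return best


def extend_left(contig, reads, min_overlap):
    """Mirror image of extend_right, obtained by reversing every string."""
    reversed_reads = [r[::-1] for r in reads]
    return extend_right(contig[::-1], reversed_reads, min_overlap)[::-1]


def progressive_assembly(seed, reads, min_overlap=4, max_cycles=100):
    """Grow a contig from a seed, one cycle at a time, until nothing extends it."""
    contig = seed
    for _ in range(max_cycles):
        extended = extend_right(contig, reads, min_overlap)
        extended = extend_left(extended, reads, min_overlap)
        if extended == contig:  # no read overlaps either end any more
            break
        contig = extended
    return contig


# Toy data (hypothetical): overlapping 9-base reads tiled over a short sequence.
GENOME = "ACGTTGCATGGACCTAAGTCGGAT"
READS = [GENOME[i:i + 9] for i in range(0, len(GENOME) - 8, 3)]
SEED = GENOME[9:15]  # a short internal fragment used as the starting seed

if __name__ == "__main__":
    print(progressive_assembly(SEED, READS))
```

In this toy setting the seed plays the role of the profile HMM hit, and each cycle stands in for one recruit-and-assemble round; a real tool would use similarity searches to recruit reads and would have to cope with repeats, errors and both strands.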

Research paper thumbnail of Trypanosoma cruzi defined antigens in the serological evaluation of an outbreak of acute Chagas disease in Brazil (Catolé do Rocha, Paraíba)

Memorias Do Instituto Oswaldo Cruz, Feb 1, 1996

Immunoglobulin G and M humoral response to recombinant protein B13 and glycoconjugate LPPG Trypanosoma cruzi defined antigens was evaluated by ELISA in 18 patients in the acute phase of Chagas disease, who were infected on the same occasion. LPPG showed 100% positivity detecting both IgM and IgG antibodies, while positivity of 55-65% was observed for B13. An epimastigote alkaline extract (EPI) also showed high sensitivity for acute IgM (100%) and IgG (90%) antibodies. However, LPPG had better discriminatory reactivity, since with EPI two patients showed negative IgG, and several other sera presented OD values for IgG and IgM antibodies very close to the cutoff. Thus, it is suggested that detection of IgM antibodies by LPPG may be used for diagnosis of the acute phase of Chagas disease. An intense decline of IgG and IgM antibodies to the three antigens was observed in response to anti-T. cruzi chemotherapy in all acute phase patients. After treatment, six (30%) individuals maintained IgG positivity to EPI, LPPG, and B13 with lower reactivity than that measured at the acute phase. For comparison, serology of a group of 22 patients in the chronic phase of Chagas disease and also submitted to chemotherapy was determined. Positive IgM antibodies to EPI, LPPG and B13 were detected in only 5-9% of cases. In all chronic-phase patients IgG antibodies highly reactive to the three antigens were present and no significant decrease resulted after benznidazole administration. These observations reinforce previous reports that treatment in the acute phase may reduce or eliminate the parasite.

Research paper thumbnail of The pangenome of the Anticarsia gemmatalis multiple nucleopolyhedrovirus (AgMNPV)

Genome biology and evolution, Jan 27, 2015

The alphabaculovirus Anticarsia gemmatalis multiple nucleopolyhedrovirus (AgMNPV) is the world's most successful viral bioinsecticide. Through the 1980s and 1990s, this virus was extensively used for biological control of populations of Anticarsia gemmatalis (Velvetbean caterpillar) in soybean crops. During this period, genetic studies identified several variable loci in AgMNPV; however, most of them were not characterized at the sequence level. In this study we report a full genome comparison among 17 wild-type isolates of AgMNPV. We found the pangenome of this virus to contain at least 167 hypothetical genes, 151 of which are shared by all genomes. The gene bro-a that might be involved in host specificity and carrying transporter, is absent in some genomes, and new hypothetical genes were observed. Among these genes there is a unique rnf12-like gene, probably implicated in ubiquitination. Events of gene fission and fusion are common, since four genes have been ob...

Research paper thumbnail of Caracterização de dois genes contíguos de Trypanosoma cruzi que codificam antígenos com repetições de epitopos imunodominantes

To Jorge Yanovsky (Polychaco 8 S.A.I.C-Argentina), for the data from the ELISA experiments performed with sera from the Instituto Fatala Chaben. To Egler Chiari (UFMG) and João Carlos Pinto Dias (FIOCRUZ-MG), for the human sera from the Bambuí region. To professors Hamza F. A. El Dorry, Suely L. Gomes and Antonio G. Bianchi, for allowing the use of reagents and equipment in their laboratories and for the countless suggestions they always gave me. To the colleagues in our laboratory: Jesus,

Research paper thumbnail of E-Gene - A modular and configurable pipeline system for automated DNA sequence analysis

DNA reads generated by large-scale sequencing projects have to be processed before further analyses in order to perform vector/primer masking, low-quality trimming and contaminant removal. This sequential processing involves several steps and the use of different computer programs, each one following its own calling convention and input/output formats. As a consequence, building a sequence processing pipeline is generally a tedious and exhausting task. In addition, the currently available pipelines are often poorly documented, requiring careful examination of Perl, Python or shell script code before they can be properly configured. Some others are more specialized to perform determined functions [1,2], not allowing the incorporation of new modules in an integrated way. This paper reports the development of E-Gene, an integrated system that makes pipeline construction a modular job. We have created a unified view of the building blocks of the pipeline and of the integration pattern o...
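The modular idea in the abstract above, where every processing step shares one calling convention so that steps can be chained or swapped freely, can be sketched as follows. This is an illustration only and not E-Gene's actual design or code: the `Read` type, the step names and the thresholds are all hypothetical, and real vector masking and quality trimming are delegated to dedicated programs rather than done with exact string replacement.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Read:
    name: str
    seq: str
    quals: List[int]  # one quality score per base


# Shared calling convention: a step takes a read and returns a read,
# or None when the read should be discarded.
Step = Callable[[Read], Optional[Read]]


def mask_vector(vector: str) -> Step:
    """Replace exact occurrences of a vector sequence with X characters."""
    def step(read: Read) -> Optional[Read]:
        masked = read.seq.replace(vector, "X" * len(vector))
        return Read(read.name, masked, read.quals)
    return step


def trim_low_quality(min_q: int) -> Step:
    """Trim trailing bases whose quality score is below min_q."""
    def step(read: Read) -> Optional[Read]:
        end = len(read.seq)
        while end > 0 and read.quals[end - 1] < min_q:
            end -= 1
        return Read(read.name, read.seq[:end], read.quals[:end])
    return step


def drop_short(min_len: int) -> Step:
    """Discard reads shorter than min_len after earlier steps."""
    def step(read: Read) -> Optional[Read]:
        return read if len(read.seq) >= min_len else None
    return step


def run_pipeline(reads: List[Read], steps: List[Step]) -> List[Read]:
    """Apply each step in order; stop processing a read once a step drops it."""
    kept = []
    for read in reads:
        current: Optional[Read] = read
        for step in steps:
            current = step(current)
            if current is None:
                break
        if current is not None:
            kept.append(current)
    return kept


if __name__ == "__main__":
    reads = [
        Read("r1", "GATTACAACGTACGT", [30] * 12 + [5] * 3),
        Read("r2", "ACGTAC", [30, 30, 5, 5, 5, 5]),
    ]
    steps = [mask_vector("GATTACA"), trim_low_quality(20), drop_short(5)]
    for read in run_pipeline(reads, steps):
        print(read.name, read.seq)
```

Because every step has the same signature, reordering the list or adding a contaminant filter requires no change to `run_pipeline`, which is the kind of modularity the abstract argues for.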

Research paper thumbnail of One-base double-stranded DNA sequencing reaction using commercial kits

Research paper thumbnail of Molecular approaches to diagnosis of Chagas' disease: use of defined antigens and a target ribosomal RNA sequence

Biological research, 1993

We evaluated the performance of two defined antigens in the serological diagnosis of Chagas' disease. One of them is a recombinant protein named B13 isolated from a genomic library of Trypanosoma cruzi in the expression vector lambda gt11. We show that the gene corresponding to B13 is conserved in the evolutive stages of the two "polar" strains of T. cruzi. The protein epitopes cloned in B13 are represented in 140 kDa, 116 kDa and 35 kDa polypeptides of trypomastigotes. The other antigen chosen for serodiagnosis is a lipopeptidophosphoglycan (LPPG). This glycoconjugate is also widely distributed in T. cruzi strains. The use of a rabbit serum to LPPG allowed the demonstration that this molecule bears epitopes in common to LPPG-like components and to 80-90 kDa glycoproteins of trypomastigotes. Both B13 and LPPG were evaluated in serodiagnosis by ELISA and RIA using a panel of normal human, Chagasic and Leishmaniasis sera. It was observed that B13 presents high sensitivit...

Research paper thumbnail of Proceedings of the IXth International Coccidiosis Conference, Foz de Iguassu, Parana, Brazil, 19-23 September, 2005

Long, P. L. (1969). Observations on the growth of Eimeria tenella in cultured cells from the parasitized chorioallantoic membranes of the developing chick embryo.

Research paper thumbnail of A Selective Review of Advances in Coccidiosis Research

Advances in Parasitology, 2013

Coccidiosis is a widespread and economically significant disease of livestock caused by protozoan parasites of the genus Eimeria. This disease is worldwide in occurrence and costs the animal agricultural industry many millions of dollars to control. In recent years, the modern tools of molecular biology, biochemistry, cell biology and immunology have been used to expand greatly our knowledge of these parasites and the disease they cause. Such studies are essential if we are to develop new means for the control of coccidiosis. In this chapter, selective aspects of the biology of these organisms, with emphasis on recent research in poultry, are reviewed. Topics considered include taxonomy, systematics, genetics, genomics, transcriptomics, proteomics, transfection, oocyst biogenesis, host cell invasion, immunobiology, diagnostics and control.

Research paper thumbnail of Pilot survey of expressed sequence tags (ESTs) from the asexual blood stages of Plasmodium vivax in human patients

Malaria Journal, 2003

Background: Plasmodium vivax is the most widely distributed human malaria, responsible for 70-80 million clinical cases each year and large socio-economical burdens for countries such as Brazil where it is the most prevalent species. Unfortunately, due to the impossibility of growing this parasite in continuous in vitro culture, research on P. vivax remains largely neglected. Methods: A pilot survey of expressed sequence tags (ESTs) from the asexual blood stages of P. vivax was performed. To do so, 1,184 clones from a cDNA library constructed with parasites obtained from 10 different human patients in the Brazilian Amazon were sequenced. Sequences were automatically processed to remove contaminants and low quality reads. A total of 806 sequences with an average length of 586 bp met such criteria and their clustering revealed 666 distinct events. The consensus sequence of each cluster and the unique sequences of the singlets were used in similarity searches against different databases that included P. vivax, Plasmodium falciparum, Plasmodium yoelii, Plasmodium knowlesi, Apicomplexa and the GenBank non-redundant database. An E-value of <10^-30 was used to define a significant database match. ESTs were manually assigned a gene ontology (GO) terminology. Results: A total of 769 ESTs could be assigned a putative identity based upon sequence similarity to known proteins in GenBank. Moreover, 292 ESTs were annotated and a GO terminology was assigned to 164 of them. Conclusion: These are the first ESTs reported for P. vivax and, as such, they represent a valuable resource to assist in the annotation of the P. vivax genome currently being sequenced. Moreover, since the GC-content of the P. vivax genome is strikingly different from that of P. falciparum, these ESTs will help in the validation of gene predictions for P. vivax and to create a gene index of this malaria parasite.