A random sequencing approach for the analysis of the Trypanosoma cruzi genome: general structure, large gene and repetitive DNA families, and gene discovery

Biology of the Trypanosoma cruzi Genome

Biology of Trypanosoma cruzi

The genome of Trypanosoma cruzi was first made available in 2005, but the intrinsic complexity of this parasite's genome has hindered high-quality assembly and annotation. Recent developments in long-read sequencing have made it possible to circumvent this problem, revealing notable features of the genome architecture of T. cruzi and allowing accurate estimates of gene copy numbers, of the abundance and distribution of repetitive sequences (including satellites and retroelements), and of the complexity of the multigene families implicated in host-parasite interactions. The genome of T. cruzi is composed of a "core compartment" and a "disruptive compartment", which exhibit opposite GC content and gene composition and differ markedly in their regulatory regions. The novel tandem and dispersed repetitive sequences identified, together with recombination events, support the conclusion that genome plasticity is a key survival strategy during the parasite's complex life cycle.

Gene Survey of the Pathogenic Protozoan Trypanosoma cruzi

2000

We have performed a survey of the active genes in the important human pathogen Trypanosoma cruzi by analyzing 5013 expressed sequence tags (ESTs) generated from a normalized epimastigote cDNA library. Clustering of all sequences resulted in 771 clusters, comprising 54% of the ESTs. In total, the ESTs corresponded to 3054 transcripts that might represent one-fourth of the total gene repertoire in T. cruzi. About 33% of the T. cruzi transcripts showed similarity to sequences in the public databases, and a large number of hitherto undiscovered genes predicted to be involved in transcription, cell cycle control, cell division, signal transduction, secretion, and metabolism were identified. More than 140 full-length gene sequences were derived from the ESTs. Comparisons with all open reading frames in yeast and in Caenorhabditis elegans showed that only 12% of the T. cruzi transcripts were shared among diverse eukaryotic organisms. Comparison with other kinetoplastid sequences identified 237 orthologous genes that are shared between these evolutionarily divergent organisms. The generated data are a useful resource for further studies of the biology of the parasite and for development of new means to combat Chagas' disease.
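The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes that every transcript is either one multi-EST cluster or one unclustered (singleton) EST; that reading is an interpretation of the abstract, not something it states. The implied gene count is only an extrapolation from "might represent one-fourth" and is not a number reported in the source.

```python
# Figures quoted in the abstract.
total_ests = 5013
clusters = 771
transcripts = 3054

# Assumption: transcripts = multi-EST clusters + singleton ESTs.
singletons = transcripts - clusters

# ESTs that must then sit inside clusters, and their share of all ESTs.
clustered_ests = total_ests - singletons
clustered_fraction = clustered_ests / total_ests

# Rough extrapolation: if 3054 transcripts are about one-fourth of the
# gene repertoire, the repertoire is on the order of four times that.
implied_gene_repertoire = transcripts * 4

print(f"singletons: {singletons}")
print(f"clustered ESTs: {clustered_ests} ({clustered_fraction:.1%})")
print(f"implied gene repertoire: ~{implied_gene_repertoire}")
```

Under that assumption the clustered share comes out at about 54%, in line with the percentage given in the abstract.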

Database of Trypanosoma cruzi repeated genes: 20 000 additional gene variants

BMC Genomics, 2007

Repeats are present in all genomes, and often have important functions. However, in large genome sequencing projects, many repetitive regions remain uncharacterized. The genome of the protozoan parasite Trypanosoma cruzi consists of more than 50% repeats. These repeats include surface molecule genes, and several other gene families. In the T. cruzi genome sequencing project, it was clear that not all copies of repetitive genes were present in the assembly, due to collapse of nearly identical repeats. However, at the time of publication of the T. cruzi genome, it was not clear to what extent this had occurred.

The Trypanosoma cruzi genome; conserved core genes and extremely variable surface molecule families

Research in Microbiology, 2011

The protozoan parasite Trypanosoma cruzi is an important but neglected pathogen that causes Chagas disease, which affects millions of people, mainly in Latin America. The population structure and epidemiology of the parasite are complex, with much variability among strains. The genome sequence of a reference strain, CL Brener, was published in 2005, and the availability of this sequence has both revealed the complexity of the parasite genome and greatly facilitated research into parasite biology and pathogenesis, by making the sequences of more than 8000 core genes available. The T. cruzi genome is highly repetitive, which has resulted in inaccuracies in the genome sequence, and attempts have been made to provide a deeper analysis of repeated genes as a complement to the genome sequence. The genome was found to be organized in stable core regions containing housekeeping and other genes, surrounded by highly repetitive, often sub-telomeric, highly variable regions containing multiple members of large families of surface molecule genes. Comparative sequencing of T. cruzi strains has been initiated, and the results show that the core gene content of the parasite is highly conserved but that much sequence variability is present: on average, a 3–4% difference at the DNA level between strains in coding regions. The additional genomes will improve the understanding of parasite biology and epidemiology.

Parasite genome projects and the Trypanosoma cruzi genome initiative

Memórias do Instituto …, 1997

Since the start of the human genome project, a great number of genome projects on other "model" organisms have been initiated, some of them already completed. Several initiatives have also been started on parasite genomes, mainly through support from WHO/TDR, involving North-South and South-South collaborations, and there are great hopes that these initiatives will lead to new tools for disease control and prevention, as well as to the establishment of genomic research technology in developing countries. The Trypanosoma cruzi genome project, using the clone CL-Brener as a starting point, has made considerable progress through the concerted action of more than 20 laboratories, most of them in the South. A brief overview of the current state of the project is given.