Long-read sequencing reveals a 4.4 kb tandem repeat region in the mitogenome of Echinococcus granulosus (sensu stricto) genotype G1
Liina Kinkar et al. Parasit Vectors. 2019.
Abstract
Background: Echinococcus tapeworms cause a severe helminthic zoonosis called echinococcosis. The genus comprises various species and genotypes, of which E. granulosus (sensu stricto) represents a significant global public health and socioeconomic burden. Mitochondrial (mt) genomes have provided useful genetic markers to explore the nature and extent of genetic diversity within Echinococcus and have underpinned phylogenetic and population structure analyses of this genus. Our recent work indicated a sequence gap (> 1 kb) in the mt genomes of E. granulosus genotype G1, which could not be determined by PCR-based Sanger sequencing. The aim of the present study was to define the complete mt genome, irrespective of structural complexities, using a long-read sequencing method.
Methods: Using a conventional method, we extracted high molecular weight genomic DNA from protoscoleces of a single cyst of E. granulosus genotype G1 collected from a sheep in Australia. We sequenced this DNA using PacBio Sequel (long-read) technology, complemented by BGISEQ-500 short-read sequencing. Sequence data obtained were assembled using a recently developed workflow.
Results: We assembled a complete mt genome sequence of 17,675 bp, which is > 4 kb larger than the complete mt genomes known for E. granulosus genotype G1. This assembly includes a previously elusive tandem repeat region, which is 4417 bp long and consists of ten near-identical repeat units of 441–445 bp, each harbouring a 184 bp non-coding region and adjacent regions. We also identified a short non-coding region of 183 bp, which includes an inverted repeat.
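The reported sizes can be cross-checked with a little arithmetic. The sketch below assumes, based on the Fig. 1 caption, that repeat units 2–9 share a 441 bp base length, that unit 1 carries the 3-nucleotide insertion and that unit 10 carries the 4-nucleotide insertion; the variable names are illustrative and not taken from the study.

```python
# Sizes reported in the abstract.
ASSEMBLY_BP = 17675   # complete mt genome assembly
REGION_BP = 4417      # tandem repeat region
REPEAT_UNITS = 10

# Assumption: 441 bp is the base unit length (units 2-9); unit 1 has a
# 3 nt insertion (TTT) and unit 10 has a 4 nt insertion (TTTT).
BASE_UNIT_BP = 441
unit_lengths = [BASE_UNIT_BP + 3] + [BASE_UNIT_BP] * 8 + [BASE_UNIT_BP + 4]

# Sum of the ten unit lengths and the share of the assembly they occupy.
region_total = sum(unit_lengths)
region_fraction = region_total / ASSEMBLY_BP

print(f"units: {len(unit_lengths)}, total: {region_total} bp")
print(f"share of assembly: {region_fraction:.1%}")
```

Under these assumptions the ten unit lengths sum to the reported 4417 bp and fall within the stated 441–445 bp range, so the repeat region accounts for roughly a quarter of the assembled mt genome.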
Conclusions: We report what we consider to be the first complete mt genome of E. granulosus genotype G1 and characterise all repeat regions in this genome. The numbers, sizes, sequences and functions of tandem repeat regions remain to be studied in different isolates of genotype G1 and in other genotypes and species. The discovery of such 'new' repeat elements in the mt genome of genotype G1 by PacBio sequencing raises a question about the completeness of some published genomes of taeniid cestodes assembled from conventional or short-read sequence datasets. This study shows that long-read sequencing readily overcomes the challenges of assembling repeat elements to achieve improved genomes.
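One way to see why read length matters here: a single read can only fix the copy number of a tandem array if it covers the whole array plus some unique sequence on both sides. The following is a minimal sketch of that condition, not the assembly workflow used in the study; the function name, the 50 bp anchor and the example read lengths are hypothetical values chosen for illustration.

```python
REGION_BP = 4417  # tandem repeat region length reported in the abstract


def read_spans_region(read_len: int, region_len: int = REGION_BP,
                      anchor: int = 50) -> bool:
    """Return True if a read is long enough to cover the whole repeat
    region plus `anchor` bp of unique flanking sequence on each side.

    `anchor` is a hypothetical value, not a parameter from the study.
    """
    return read_len >= region_len + 2 * anchor


# Hypothetical read lengths, for illustration only.
for read_len in (100, 1000, 10000):
    print(read_len, read_spans_region(read_len))
```

Reads shorter than the array fall entirely inside near-identical units or anchor on only one side, which leaves the number of units undetermined.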
Keywords: Complete mitochondrial (mt) genome; Echinococcus; Genotype G1; PacBio sequencing; Repetitive DNA.
Conflict of interest statement
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Figures
Fig. 1
The complete mitochondrial genome of Echinococcus granulosus (sensu stricto) genotype G1. The 12 protein-encoding genes, 2 rRNAs and 21 tRNAs (except tRNA-Gly) are depicted in light grey; the non-coding region NR1 is in darker grey. Transfer RNAs are designated by one-letter amino acid abbreviations; gene designations follow Le et al. [83]. The tandem repeat region (in four shades of brown) spans 4417 bp and includes 10 repeat units. Each unit contains the 3′-end of _nad_5, the non-coding region NR2, tRNA-Gly (proposed pseudo-tRNAs in repeat units 1–9), a 3 bp intergenic region (not shown in the figure) and the 5′-end of _cox_3. Repeat units 2–9 are identical, whereas units 1 and 10 each have a 3–4 nucleotide insertion, marked by an asterisk. The TTT insertion occurs in repeat unit 1, at the 3′-end of _nad_5; the TTTT insertion occurs in repeat unit 10, in tRNA-Gly. Secondary structures of NR1 (a) and NR2 (b) are shown at the top; parts of these structures are predicted to form hairpin loops with no mismatches, depicted in green (stem) and yellow (loop); mismatches are boxed.
References
- World Health Organization. Fourth WHO report on neglected tropical diseases. Integrating neglected tropical diseases into global health and development. Geneva: World Health Organization; 2017.
- Kinkar L, Laurimäe T, Sharbatkhori M, Mirhendi H, Kia EB, Ponce-Gordo F, et al. New mitogenome and nuclear evidence on the phylogeny and taxonomy of the highly zoonotic tapeworm Echinococcus granulosus sensu stricto. Infect Genet Evol. 2017;52:52–58. doi: 10.1016/j.meegid.2017.04.023.