A cost-effective straightforward protocol for shotgun Illumina libraries designed to assemble complete mitogenomes from non-model species
Related papers
Nucleic acids research, 2014
The advent of high-throughput sequencing (HTS) technologies has revolutionized conventional biodiversity research by enabling parallel capture of DNA sequences possessing species-level diagnosis. However, polymerase chain reaction (PCR)-based implementation is biased by the efficiency of primer binding across lineages of organisms. A PCR-free HTS approach will alleviate this artefact and significantly improve upon the multi-locus method utilizing full mitogenomes. Here we developed a novel multiplex sequencing and assembly pipeline allowing for simultaneous acquisition of full mitogenomes from pooled animals without DNA enrichment or amplification. By concatenating assemblies from three de novo assemblers, we obtained high-quality mitogenomes for all 49 pooled taxa, with 36 species >15 kb and the remaining >10 kb, including 20 complete mitogenomes and nearly all protein-coding genes (99.6%). The assembly quality was carefully validated with Sanger sequences, reference genomes ...
Molecular ecology resources, 2015
Biodiversity analyses based on Next Generation Sequencing (NGS) platforms have developed by leaps and bounds in recent years. A PCR-free strategy, which can alleviate taxonomic bias, has been considered a promising approach to delivering reliable species compositions of targeted environments. The major impediment to such a method is the lack of appropriate mitochondrial DNA enrichment methods. Because mitochondrial genomes (mitogenomes) make up only a small proportion of total DNA, PCR-free methods will inevitably result in a huge excess of data (> 99%). Furthermore, the massive volume of sequence data is highly demanding on computing resources. Here, we present a mitogenome enrichment pipeline via a gene capture chip that was designed using the mitogenome sequences of the 1000 Insect Transcriptome Evolution project (1KITE, www.1kite.org). A mock sample containing 49 species was used to evaluate the efficiency of the mitogenome capture method. We demonstrate that the proport...
PeerJ
DNA barcoding is critical to conservation and biodiversity research, yet public reference databases are incomplete. Existing barcode databases are biased toward cytochrome oxidase subunit I (COI) and frequently lack associated voucher specimens or geospatial metadata, which can hinder reliable species assignments. The emergence of metabarcoding approaches such as environmental DNA (eDNA) has necessitated multiple marker techniques combined with barcode reference databases backed by voucher specimens. Reference barcodes have traditionally been generated by Sanger sequencing; however, sequencing multiple markers is costly for large numbers of specimens, requires multiple separate PCR reactions, and limits resulting sequences to targeted regions. High-throughput sequencing techniques such as genome skimming enable assembly of complete mitogenomes, which contain the most commonly used barcoding loci (e.g., COI, 12S, 16S), as well as nuclear ribosomal repeat regions (e.g., ITS1&2, 18S). W...
bioRxiv (Cold Spring Harbor Laboratory), 2016
Hybridization capture is considered a very cost- and time-effective method for enriching a large number of target loci dispersed across a whole genome. However, divergent loci are difficult to enrich because of sequence mismatches between probes and target DNA. After analyzing the distribution of divergent loci in mitochondrial genomes (mitogenomes), we noticed that the relatively variable regions are interspersed with relatively conserved regions. We propose extending the length of library fragments to overcome this problem. Using a home-made probe set to bait amphibian mitogenome DNA, we demonstrate that 2 kb DNA libraries generate higher sequence coverage in the highly variable regions than 400 bp DNA libraries. This suggests that longer fragments in the library generally contain both relatively variable and relatively conserved regions, so that the divergent part of the DNA is captured along with the conserved part during hybridization. We present a protocol that allows users to overcome the gap problem for highly divergent gene capture.
Leaf-nosed bats (Phyllostomidae) are one of the most studied groups within the order Chiroptera, mainly because of their outstanding species richness and diversity in morphological and ecological traits. Rapid diversification and multiple homoplasies have made the phylogeny of the family difficult to resolve using morphological characters. Molecular data have helped shed light on the evolutionary history of phyllostomid bats, yet several relationships remain unresolved at the intra-familial level. Complete mitochondrial genomes have proven useful for dealing with this kind of situation in other groups of mammals by providing access to a large number of molecular characters. At present, there are only two mitogenomes available for phyllostomid bats, hinting at the need for further exploration of the mitogenomic approach in this group. We used both standard Sanger sequencing of PCR products and next-generation sequencing (NGS) of shotgun genomic DNA to obtain new complete mitochondrial genomes from 10 species of phyllostomid bats, including representatives of major subfamilies, plus one outgroup belonging to the closely related mormoopids. We then evaluated the contribution of mitogenomics to the resolution of the phylogeny of leaf-nosed bats and compared the results to those based on mitochondrial genes and the RAG2 and VWF nuclear markers. Our results demonstrate the advantages of the Illumina NGS approach to efficiently obtain mitogenomes of phyllostomid bats. The phylogenetic signal provided by entire mitogenomes is highly comparable to that of a concatenation of individual mitochondrial and nuclear markers, and allows increasing both resolution and statistical support for several clades. This enhanced phylogenetic signal is the result of combining markers with heterogeneous evolutionary rates representing a large number of nucleotide sites.
Our results illustrate the potential of the NGS mitogenomic approach for resolving the evolutionary history of phyllostomid bats based on a denser species sampling.
Conservation Genetics Resources
The reconstruction of complete mitochondrial genomes (mitogenomes) has considerable potential to clarify species relationships in cases where morphological analysis and DNA sequencing of individual genes are inconclusive. However, the trend to use only mitogenomes for the phylogenies presented in mitogenome announcements carries the inherent risk that the study species’ taxonomy is incorrect because no mitogenomes have yet been reconstructed for its sister species. Here, I illustrate this problem using the mitogenomes of two seahorses, Hippocampus capensis and H. queenslandicus. Both specimens used for mitogenome reconstruction originated from traditional Chinese medicine markets rather than native habitats. Although mitogenome phylogenies placed these specimens correctly among the seahorses from which mitogenomes were available at the time, incorporating single-marker sequence from closely related species into the phylogenies revealed that both mitogenomes are problematic. The mito...
Mitochondrial metagenomics: letting the genes out of the bottle
'Mitochondrial metagenomics' (MMG) is a methodology for shotgun sequencing of total DNA from specimen mixtures and subsequent bioinformatic extraction of mitochondrial sequences. The approach can be applied to phylogenetic analysis of taxonomically selected taxa, as an economical alternative to mitogenome sequencing from individual species, or to environmental samples of mixed specimens, such as from mass trapping of invertebrates. The routine generation of mitochondrial genome sequences has great potential both for systematics and community phylogenetics. Mapping of reads from low-coverage shotgun sequencing of environmental samples also makes it possible to obtain data on spatial and temporal turnover in whole-community phylogenetic and species composition, even in complex ecosystems where species-level taxonomy and biodiversity patterns are poorly known. In addition, read mapping can produce information on species biomass, and potentially allows quantification of within-species genetic variation. The success of MMG relies on the formation of numerous mitochondrial genome contigs, achievable with standard genome assemblers, but various challenges for the efficiency of assembly remain, particularly in the face of variable relative species abundance and intra-specific genetic variation. Nevertheless, several studies have demonstrated the power of mitogenomes from MMG for accurate phylogenetic placement, evolutionary analysis of species traits, biodiversity discovery and the establishment of species distribution patterns; it offers a promising avenue for unifying the ecological and evolutionary understanding of species diversity.
Complete vertebrate mitogenomes reveal widespread repeats and gene duplications
Genome Biology
Background: Modern sequencing technologies should make the assembly of the relatively small mitochondrial genomes an easy undertaking. However, few tools exist that address mitochondrial assembly directly. Results: As part of the Vertebrate Genomes Project (VGP) we develop mitoVGP, a fully automated pipeline for similarity-based identification of mitochondrial reads and de novo assembly of mitochondrial genomes that incorporates both long (> 10 kbp, PacBio or Nanopore) and short (100–300 bp, Illumina) reads. Our pipeline leads to successful complete mitogenome assemblies of 100 vertebrate species of the VGP. We observe that tissue type and library size selection have considerable impact on mitogenome sequencing and assembly. Comparing our assemblies to purportedly complete reference mitogenomes based on short-read sequencing, we identify errors, missing sequences, and incomplete genes in those references, particularly in repetitive regions. Our assemblies also identify novel gene r...
An integrated pipeline for next-generation sequencing and annotation of mitochondrial genomes
Nucleic Acids Research, 2010
Mitochondrial (mt) genomics represents an understudied but important field of molecular biology. Increasingly, mt dysfunction is being linked to a range of human diseases, including neurodegenerative disorders, diabetes and impairment of childhood development. In addition, mt genomes provide important markers for systematic, evolutionary and population genetic studies. Some technological limitations have prevented the expanded generation and utilization of mt genomic data for some groups of organisms. These obstacles most acutely impede, but are not limited to, studies requiring the determination of complete mt genomic data from minute amounts of material (e.g. biopsy samples or microscopic organisms). Furthermore, post-sequencing bioinformatic annotation and analyses of mt genomes are time consuming and inefficient. Herein, we describe a high-throughput sequencing and bioinformatic pipeline for mt genomics, which will have implications for the annotation and analysis of other organellar (e.g. plastid or apicoplast genomes) and virus genomes as well as long, contiguous regions in nuclear genomes. We utilize this pipeline to sequence and annotate the complete mt genomes of 12 species of parasitic nematode (order Strongylida) simultaneously, each from an individual organism. These mt genomic data provide a rich source of markers for studies of the systematics and population genetics of a group of socioeconomically important pathogens of humans and other animals.