SIGNIFICANCE OF MASSIVELY PARALLEL SEQUENCING STRATEGIES AND DE NOVO ASSEMBLY ALGORITHMS IN WHOLE GENOME SEQUENCING
Massively parallel sequencing (MPS) is a novel approach to genome sequencing that provides significantly higher throughput than conventional sequencing platforms. It has therefore become a better solution for obtaining the genome sequences of particular organisms with greater accuracy and precision. Genome assembly is the step that immediately follows genome sequencing, and if the genome of an organism has not been sequenced previously, de novo assembly is the only way to obtain the complete genome. When considering a plausible model organism, it is crucial to obtain the complete genome sequence in order to gain better biological insight into that organism. This article therefore first discusses the significance of acquiring the complete genome sequence of a plausible model organism, and then the major approaches available for genome sequencing, showing the suitability of massively parallel sequencing for this purpose. It then compares next-generation sequencing platforms and sequence assembly algorithms, showing the importance of the de novo assembly approach in genome sequencing, while highlighting the importance of quality assessment and validation procedures for sequenced genomes. Finally, the challenges in whole genome sequencing of model organisms and their countermeasures are addressed.

Introduction

Genome sequencing is the process of determining the nucleotide order of a genome in terms of the four nucleotides adenine, guanine, thymine, and cytosine. Acquiring the complete genome sequence of an organism has numerous advantages, especially for research purposes, since it provides comprehensive biological and biochemical insight into the organism of interest [1]. As a result, most biological and biochemical research projects based on animals or plants demand sequence information in order to ascertain the biological processes or biochemical pathways taking place within them.
Related papers
Bioinformatics-Sequence_and_Genome_Analysis.(MAZEN SAEED)
Chapter contents: The first sequences to be collected were those of proteins; DNA sequence databases; Sequence retrieval from public databases; Sequence analysis programs; The dot matrix or diagram method for comparing sequences; Alignment of sequences by dynamic programming; Finding local alignments between sequences; Multiple sequence alignment; Prediction of RNA secondary structure; Discovery of evolutionary relationships using sequences; Importance of database searches for similar sequences; The FASTA and BLAST methods for database searches; Predicting the sequence of a protein by translation of DNA sequences; Predicting protein secondary structure; The first complete genome sequence; ACEDB, the first genome database; References. Many types of sequence databases are described in the first annual issue of the journal Nucleic Acids Research. The growth of the number of sequences in GenBank can be tracked at http://www.ncbi.nlm.nih.gov/GenBank/genebankstats.html.
Biological Data Analysis Program (BDAP): a multitasking biological sequence analysis program
Uncovering hidden facts in the sequences and structures of an organism's biomolecules is essential for understanding its molecular and evolutionary processes. Sequence analysis is an attractive way to explore such facts in biological data at the genomic, transcriptomic and proteomic levels. Developing bioinformatics tools that analyze biological data at these three levels is a challenging task. In this communication, an attempt has been made to develop a bioinformatics program, "Biological Data Analysis Program (BDAP)", able to analyze DNA/RNA/protein sequence data at the molecular level. It also includes links to various online databases, tools, search engines and many prestigious journals. The program is written in Perl. BDAP is freely available at https://sites.google.com/site/dwivediplanet/bdap under the terms and conditions of the GNU General Public License.
Sequential and parallel algorithms for DNA sequencing
Bioinformatics, 1997
Reconstruction of the original DNA sequence in the sequencing by the hybridization approach (SBH) requires computational support due to a large number of possible combinations. One can notice a lack of algorithms admitting false-negative data and giving in addition all possible solutions. Results: In this paper, a new method of sequencing has been proposed. An algorithm based on its idea (for the general case, when some data are missing, like in the real experiment) has been implemented and tested. Authentic DNA sequences have been used for testing. A parallel version of the algorithm has also been implemented and tested. The quality of the reconstruction is satisfactory for the library of oligonucleotides of length between 8 and 12, and 100, 200 and 300 bp long sequences. A way to a further decrease in the computation time is also suggested.
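The reconstruction problem described in this abstract can be illustrated with a minimal Python sketch. It is not the algorithm proposed in the paper: it assumes an ideal, error-free spectrum in which every oligonucleotide of length k (k at least 2) occurring in the target is present exactly as often as it occurs, so the false-negative case the paper addresses is out of scope. The function name `reconstruct_all` and the example spectrum are illustrative assumptions. The sketch uses plain backtracking to return every sequence consistent with the spectrum, which shows why the number of candidate solutions matters.

```python
from collections import Counter


def reconstruct_all(spectrum):
    """Return all sequences that use every oligonucleotide in `spectrum` exactly once.

    Assumes an ideal spectrum (no missing or spurious oligonucleotides)
    and a common oligonucleotide length k >= 2.
    """
    k = len(spectrum[0])
    remaining = Counter(spectrum)  # how many copies of each k-mer are still unused
    total = len(spectrum)
    solutions = set()

    def extend(seq, used):
        if used == total:
            solutions.add(seq)
            return
        suffix = seq[-(k - 1):]
        for base in "ACGT":
            candidate = suffix + base
            if remaining[candidate] > 0:
                remaining[candidate] -= 1
                extend(seq + base, used + 1)
                remaining[candidate] += 1  # backtrack

    # Try every distinct oligonucleotide as the first one in the sequence.
    for start in set(spectrum):
        remaining[start] -= 1
        extend(start, 1)
        remaining[start] += 1
    return sorted(solutions)


if __name__ == "__main__":
    # Hypothetical 3-mer spectrum that admits two reconstructions.
    spectrum = ["ATG", "TGG", "TGC", "GTG", "GGC", "GCA", "GCG", "CGT"]
    for candidate in reconstruct_all(spectrum):
        print(candidate)
```

Even this small spectrum is ambiguous, and the exhaustive search grows quickly with sequence length, which is consistent with the abstract's point that reconstruction needs computational support.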
The new parallel sequencing technologies produce gigabases of genome information in just a few days and bring with them new problems for data storage and processing. Sequencing technologies have applications in human, plant and animal genome studies, metagenomics, epigenetics, and the discovery of non-coding RNAs and protein binding sites. There are two major problems in next generation sequencing (NGS) data processing: algorithms for alignment of sequences (for which a reference sequence exists) and algorithms for de novo genome (sequence) assembly (for which no reference sequence is available). Several factors determine the choice of the better algorithmic solution: cost, read length, data volume, and rate of data generation. As a result, the particular bioinformatics solution depends on the biological application and on the type of sequencing technology used to generate the data. All the technologies have their strengths and weaknesses and limits of their performance for providing error free sequenc...
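To make the de novo case concrete, the following Python sketch merges reads by their longest suffix-prefix overlap without consulting any reference sequence. It is a textbook greedy heuristic chosen for illustration, not a method taken from the abstract above. The function names, the `min_len` threshold and the example reads are assumptions, and the sketch assumes error-free reads from a single strand.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of `a` equal to a prefix of `b`, or 0 if shorter than min_len."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of contigs with the largest overlap until none remains."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicate reads, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs  # no pair overlaps by at least min_len
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)


if __name__ == "__main__":
    # Hypothetical error-free reads sampled from one short sequence.
    reads = ["ATGCGTA", "CGTACGT", "ACGTTAG"]
    print(greedy_assemble(reads))
```

The all-pairs comparison is quadratic in the number of reads at every merge step, which hints at why the data volume and read length listed above weigh so heavily on the choice of algorithm.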
A field guide to whole-genome sequencing, assembly and annotation
Evolutionary applications, 2014
Genome sequencing projects were long confined to biomedical model organisms and required the concerted effort of large consortia. Rapid progress in high-throughput sequencing technology and the simultaneous development of bioinformatic tools have democratized the field. It is now within reach for individual research groups in the eco-evolutionary and conservation community to generate de novo draft genome sequences for any organism of choice. Because of the cost and considerable effort involved in such an endeavour, the important first step is to thoroughly consider whether a genome sequence is necessary for addressing the biological question at hand. Once this decision is taken, a genome project requires careful planning with respect to the organism involved and the intended quality of the genome draft. Here, we briefly review the state of the art within this field and provide a step-by-step introduction to the workflow involved in genome sequencing, assembly and annotation with pa...