Slider--maximum use of probability information for alignment of short sequence reads and SNP detection

Nawar Malhis et al. Bioinformatics. 2009.

Abstract

Motivation: Many alignment tools have been created, each designed to fit a particular type of alignment condition. While some of these are made for aligning Illumina Sequence Analyzer reads, none fully utilizes its probability (prb) output. In this article, we introduce a new alignment approach (Slider) that reduces the alignment problem space by utilizing the probabilities given for each read base in the prb files.

Results: Compared with other aligners, Slider has higher alignment accuracy and efficiency. In addition, because Slider also matches bases whose probabilities are not the highest at their position, it significantly reduces the percentage of base mismatches. As a result, its SNP predictions are more accurate than those of other SNP prediction approaches used today that start from the most probable sequence, including those using base quality.

Figures

Fig. 1.

Slider scans both the lexicographically sorted reference database and the lexicographically sorted Px_Reads (s.ol0) input table once to generate all exact matches. Exact matches are stored in the sorted s.m0 table. In this example, the input sequences are 6 bp long (SZ_r = 6) and are aligned to a reference database of 10 bp (SZ_d = 10) oligos created with a sliding window across the reference. Matching reads are shown in bold and underlined; a unique match is indicated by a solid line and a multiple match by a dashed line.
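
The single-pass scan over two sorted tables described in this caption can be sketched as a merge join. This is a minimal illustration, not Slider's actual implementation: the function names, the toy reference string, and the rule that a read matches when it equals the first SZ_r bases of a reference oligo are all assumptions made for the example.

```python
def build_reference_table(reference, db_size, read_size):
    """Sliding-window oligos of up to db_size bases, sorted lexicographically.

    Hypothetical helper: windows near the end of the reference are shorter
    than db_size but are kept as long as they can still hold a full read.
    """
    windows = [(reference[i:i + db_size], i)
               for i in range(len(reference) - read_size + 1)]
    return sorted(windows)


def exact_matches(sorted_refs, sorted_reads, read_size):
    """Merge-scan two sorted tables; return (read, positions) per matched read."""
    matches = []
    i = 0  # cursor into the reference table; it only ever moves forward
    for read in sorted_reads:
        # Skip reference oligos whose prefix sorts before the current read.
        while i < len(sorted_refs) and sorted_refs[i][0][:read_size] < read:
            i += 1
        # Collect every oligo whose prefix equals the read.
        j = i
        positions = []
        while j < len(sorted_refs) and sorted_refs[j][0][:read_size] == read:
            positions.append(sorted_refs[j][1])
            j += 1
        if positions:
            matches.append((read, sorted(positions)))
    return matches


# Toy data (assumed for illustration): 6 bp reads, 10 bp reference oligos.
reference = "ACGTACGTTAGCACGTAC"
reads = sorted(["ACGTAC", "GTTAGC", "TTTTTT"])
refs = build_reference_table(reference, db_size=10, read_size=6)
result = exact_matches(refs, reads, read_size=6)
for read, positions in result:
    kind = "unique" if len(positions) == 1 else "multiple"
    print(read, positions, kind)
```

Because both tables are sorted, the reference cursor never moves backward, so each table is traversed once apart from the short block of oligos that share a read's prefix. In the toy data, one read matches at two reference positions (a multiple match), one matches at a single position (a unique match), and one has no match.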

Fig. 2.

Probability that a given base mismatch is a true SNP as a function of the read sequence weight.
