A Faster and Simpler Algorithm for Sorting Signed Permutations by Reversals
An algorithm to enumerate sorting reversals for signed permutations
2003
The rearrangement distance between single-chromosome genomes can be estimated as the minimum number of inversions required to transform the gene ordering observed in one into that observed in the other. This measure, known as "inversion distance," can be computed as the reversal distance between signed permutations.
Sorting Signed Permutations by Inversions in O ( n log n ) Time
Journal of Computational Biology, 2010
The study of genomic inversions (or reversals) has been a mainstay of computational genomics for nearly 20 years. After the initial breakthrough of Hannenhalli and Pevzner, who gave the first polynomial-time algorithm for sorting signed permutations by inversions, improved algorithms have been designed, culminating with an optimal linear-time algorithm for computing the inversion distance and a subquadratic algorithm for providing a shortest sequence of inversions (also known as sorting by inversions). Remaining open was the question of whether sorting by inversions could be done in O(n log n) time.
Sorting Signed Permutations by Reversal (Reversal Distance)
Encyclopedia of Algorithms, 2008
Sorting by inversions. Problem definition: A signed permutation π of size n is a permutation over {−n, …, −1, 1, …, n}, where π_{−i} = −π_i for all i. The reversal ρ = ρ_{i,j} (1 ≤ i ≤ j ≤ n) is an operation that reverses the order and flips the signs of the elements
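The reversal operation in the definition above can be sketched in a few lines of Python. This is a minimal illustration, not code from the encyclopedia entry: the function name `signed_reversal` is hypothetical, and it assumes the affected elements are those at positions i through j (1-based, inclusive), which the truncated definition does not spell out.

```python
from typing import List


def signed_reversal(perm: List[int], i: int, j: int) -> List[int]:
    """Return a copy of `perm` with positions i..j (1-based, inclusive)
    reversed in order and flipped in sign.

    Assumption: the reversal acts on the contiguous segment i..j.
    """
    n = len(perm)
    if not (1 <= i <= j <= n):
        raise ValueError("require 1 <= i <= j <= n")
    # Reverse the segment, then negate every element in it.
    segment = [-x for x in reversed(perm[i - 1:j])]
    return perm[:i - 1] + segment + perm[j:]


pi = [3, -1, 2, 4]
print(signed_reversal(pi, 2, 3))  # [3, -2, 1, 4]
```

Applying the same reversal twice restores the original permutation, which makes for a convenient sanity check.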
Journal of Computational Biology, 2001
Hannenhalli and Pevzner gave the first polynomial-time algorithm for computing the inversion distance between two signed permutations, as part of the larger task of determining the shortest sequence of inversions needed to transform one permutation into the other. Their algorithm (restricted to distance calculation) proceeds in two stages: in the first stage, the overlap graph induced by the permutation is decomposed into connected components, then in the second stage certain graph structures (hurdles and others) are identified. Berman and Hannenhalli avoided the explicit computation of the overlap graph and gave an O(nα(n)) algorithm, based on a Union-Find structure, to find its connected components, where α is the inverse Ackermann function. Since for all practical purposes α(n) is a constant no larger than four, this algorithm has been the fastest practical algorithm to date. In this paper, we present a new linear-time algorithm for computing the connected components, which is more efficient than that of Berman and Hannenhalli in both theory and practice. Our algorithm uses only a stack and is very easy to implement. We give the results of computational experiments over a large range of permutation pairs produced through simulated evolution; our experiments show a speed-up by a factor of 2 to 5 in the computation of the connected components and by a factor of 1.3 to 2 in the overall distance computation.
Improved bounds on sorting by length-weighted reversals
Journal of Computer and System Sciences, 2008
We study the problem of sorting binary sequences and permutations by length-weighted reversals. We consider a wide class of cost functions, namely f(ℓ) = ℓ^α for all α ≥ 0, where ℓ is the length of the reversed subsequence. We present tight or nearly tight upper and lower bounds on the worst-case cost of sorting by reversals. Then we develop algorithms to approximate the optimal cost to sort a given input. Furthermore, we give polynomial-time algorithms to determine the optimal reversal sequence for a restricted but interesting class of sequences and cost functions. Our results have direct application in computational biology to the field of comparative genomics. The problems of sorting a given permutation by reversals and finding the reversal distance between two given permutations are equivalent: simply relabel the elements of the target permutation to be the identity and use the same relabeling for the source permutation.
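The relabeling argument in the last sentence above can be made concrete with a short sketch. The function name `relabel_to_identity` is hypothetical, and the code assumes unsigned permutations over the same set of elements; it is an illustration of the reduction, not the authors' implementation.

```python
from typing import List, Sequence


def relabel_to_identity(source: Sequence[int], target: Sequence[int]) -> List[int]:
    """Relabel elements so that `target` becomes 1, 2, ..., n and
    return `source` under the same relabeling.

    Sorting the returned permutation by reversals is then equivalent to
    transforming `source` into `target` by reversals.
    """
    if sorted(source) != sorted(target):
        raise ValueError("source and target must contain the same elements")
    # The position of each element in the target is its new label.
    new_label = {value: position for position, value in enumerate(target, start=1)}
    return [new_label[value] for value in source]


source = [3, 1, 4, 2]
target = [4, 2, 3, 1]
print(relabel_to_identity(source, target))  # [3, 4, 1, 2]
```

Relabeling the target with itself yields the identity, so any reversal sequence that sorts the relabeled source also carries the original source to the original target.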
An approximation algorithm for sorting by reversals and transpositions
Journal of Discrete Algorithms, 2008
Genome rearrangement algorithms are powerful tools to analyze gene orders in molecular evolution. Analysis of genomes evolving by reversals and transpositions leads to a combinatorial problem of sorting by reversals and transpositions, the problem of finding a shortest sequence of reversals and transpositions that sorts one genome into the other. In this paper we present a (4 − 2/k)-approximation algorithm for sorting by reversals and transpositions for unsigned permutations where k is the approximation ratio of the algorithm used for cycle decomposition. For the best known value of k our approximation ratio becomes 2.5909 + δ for any δ > 0. We also derive a lower bound on reversal and transposition distance of an unsigned permutation.
2008
A block-interchange is a rearrangement event that exchanges two, not necessarily consecutive, contiguous regions in a genome, maintaining the original orientation. Signed reversals are events that invert and change the orientation of a region in a genome. Both events are important for the comparative analysis of genomes. For this reason, we propose a new measure that consists in finding a minimum sequence of block-interchanges and signed reversals that transforms a genome into another. For each event, we assign a weight related to its norm and we argue the adequacy of this parameter to indicate the power of each event. We present a formula for the rearrangement measure and a polynomial time sorting algorithm for finding a sequence of block-interchanges and signed reversals that transforms a unichromosomal genome into another.
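A block-interchange as described above can be sketched as follows. The function name `block_interchange` and the 1-based inclusive index convention are assumptions made for illustration; the abstract does not specify how the weights of the events are computed, so the sketch covers only the operation itself.

```python
from typing import List


def block_interchange(perm: List[int], i: int, j: int, k: int, l: int) -> List[int]:
    """Exchange the blocks perm[i..j] and perm[k..l] (1-based, inclusive).

    The blocks must not overlap (i <= j < k <= l) but need not be adjacent.
    Each block keeps its internal order and its signs, so orientation
    is maintained.
    """
    n = len(perm)
    if not (1 <= i <= j < k <= l <= n):
        raise ValueError("require 1 <= i <= j < k <= l <= n")
    first = perm[i - 1:j]        # block i..j
    middle = perm[j:k - 1]       # elements between the two blocks
    second = perm[k - 1:l]       # block k..l
    return perm[:i - 1] + second + middle + first + perm[l:]


genome = [1, -2, 3, 4, -5, 6]
print(block_interchange(genome, 2, 3, 5, 5))  # [1, -5, 4, -2, 3, 6]
```

When the two blocks are adjacent (k = j + 1), the middle part is empty and the operation reduces to a transposition of neighboring blocks.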
Sorting by Restricted-Length-Weighted Reversals
Genomics, Proteomics & Bioinformatics, 2005
Classical sorting by reversals uses the unit-cost model, in which every reversal has the same cost. This model limits the biological meaning of sorting by reversals. Bender and his colleagues extended it by assigning a cost function f(l) = l^α for all α ≥ 0, where l is the length of the reversed subsequence. In this paper, we extend their results by considering a model in which long reversals are prohibited. Using the same cost function for permitted reversals, we present tight or nearly tight bounds for the worst-case cost of sorting by reversals. Then we develop algorithms to approximate the optimal cost to sort a given 0/1 sequence as well as a given permutation. Our proposed problems are more biologically meaningful and more algorithmically general and challenging than the problem considered by Bender et al. Furthermore, our bounds are tight or nearly tight, and our algorithms provide good approximation ratios compared to the optimal cost to sort 0/1 sequences or permutations by reversals.
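To illustrate the cost model, the sketch below totals the cost of a reversal sequence, with cost length^α per reversal. The names `reversal_cost`, `total_cost`, and the `max_length` parameter are hypothetical: the abstract says only that long reversals are prohibited and does not state the threshold, so the cutoff is left to the caller.

```python
from typing import Iterable, Optional, Tuple


def reversal_cost(length: int, alpha: float) -> float:
    """Cost of reversing a subsequence of the given length: length ** alpha."""
    return float(length) ** alpha


def total_cost(
    reversals: Iterable[Tuple[int, int]],
    alpha: float,
    max_length: Optional[int] = None,
) -> float:
    """Sum the costs of reversals given as (i, j) pairs, 1-based inclusive.

    `max_length` is an assumed parameter modelling the restriction on long
    reversals; None means no restriction.
    """
    total = 0.0
    for i, j in reversals:
        length = j - i + 1
        if length < 1:
            raise ValueError("each reversal needs i <= j")
        if max_length is not None and length > max_length:
            raise ValueError("reversal longer than the permitted maximum")
        total += reversal_cost(length, alpha)
    return total


print(total_cost([(1, 4), (2, 3)], alpha=1.0))  # 6.0
print(total_cost([(1, 4), (2, 3)], alpha=0.0))  # 2.0
```

With α = 0 every reversal costs 1, which recovers the classical unit-cost model; with α = 1 the cost equals the total number of elements reversed.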
An (18/11)n upper bound for sorting by prefix reversals
Theoretical Computer Science, 2009
The pancake problem asks for the minimum number of prefix reversals sufficient for sorting any permutation of length n. We improve the upper bound for the pancake problem to (18/11)n + O(1) ≈ (1.6363)n.
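For comparison with the bound above, here is a sketch of the textbook pancake-sorting procedure, which is not the algorithm from this paper. It repeatedly brings the largest unsorted element to the front and then flips it into place, using at most 2n − 3 prefix reversals for n ≥ 2. The function name `pancake_sort` is hypothetical.

```python
from typing import List, Tuple


def pancake_sort(stack: List[int]) -> Tuple[List[int], List[int]]:
    """Sort distinct integers using only prefix reversals.

    Returns the sorted list and the list of prefix lengths that were
    flipped, in order. This is the simple textbook strategy, not the
    (18/11)n construction.
    """
    a = list(stack)
    flips: List[int] = []
    for size in range(len(a), 1, -1):
        # Position of the largest element within the unsorted prefix.
        largest = max(range(size), key=a.__getitem__)
        if largest == size - 1:
            continue  # already in its final position
        if largest != 0:
            # Bring the largest element to the front.
            a[:largest + 1] = reversed(a[:largest + 1])
            flips.append(largest + 1)
        # Flip the whole unsorted prefix so the largest lands at its end.
        a[:size] = reversed(a[:size])
        flips.append(size)
    return a, flips


sorted_stack, flips = pancake_sort([3, 1, 4, 2])
print(sorted_stack, flips)  # [1, 2, 3, 4] [3, 4, 2, 3]
```

Each pass over sizes n down to 3 costs at most two flips and the final pass at most one, giving the 2n − 3 worst case, which is weaker than the (18/11)n + O(1) upper bound stated in the abstract.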
Approximation Algorithm for Sorting by Reversals and Transpositions
2007
Genome rearrangement algorithms are powerful tools to analyze gene orders in molecular evolution. Analysis of genomes evolving by reversals and transpositions leads to a combinatorial problem of sorting by reversals and transpositions, the problem of finding a shortest sequence of reversals and transpositions that sorts one genome into the other. In this paper we present a (4 − 2/k)-approximation algorithm for sorting by reversals and transpositions for unsigned permutations where k is the approximation ratio of the algorithm used for cycle decomposition. For the best known value of k our approximation ratio becomes 2.5909 + δ for any δ > 0. We also derive a lower bound on reversal and transposition distance of an unsigned permutation.