A graphical model for computing the minimum cost transposition distance

An algebraic 1.375-approximation algorithm for the Transposition Distance Problem

arXiv (Cornell University), 2020

Background: In genome rearrangements, the mutational event transposition swaps two adjacent blocks of genes in one chromosome. The Transposition Distance Problem (TDP) aims to find the minimum number of transpositions required to transform one chromosome into another (transposition distance), both represented as permutations. The pair of permutations can be transformed into another pair with the same distance where the target permutation is the identity, making TDP equivalent to the problem of Sorting by Transpositions (SBT). In 2012, SBT was proven to be NP-hard, and the best approximation algorithm, with a 1.375 ratio, was proposed in 2006 by Elias and Hartman. Their algorithm employs simplification, a technique used to transform an input permutation π into a simple permutation π̂, presumably easier to handle. The permutation π̂ is obtained by inserting new symbols into π in such a way that the lower bound on the transposition distance of π is preserved for π̂. The simplification is guaranteed to keep the lower bound, not the transposition distance. A sequence of operations sorting π̂ can be mimicked to sort π. Results and conclusions: First, we show that the algorithm of Elias and Hartman (EH algorithm) may require one extra transposition above the approximation ratio of 1.375, depending on how the input permutation is simplified. Next, using an algebraic approach, we propose a new upper bound for the transposition distance and a new 1.375-approximation algorithm to solve SBT, skipping simplification and ensuring the approximation ratio of 1.375 for all the permutations in the symmetric group S_n. We implemented our algorithm and EH's. Regarding the implementation of the EH algorithm, two issues needed to be fixed. We tested both algorithms against all permutations of size n, 2 ≤ n ≤ 12. The results show that the EH algorithm exceeds the approximation ratio of 1.375 for permutations with a size greater than 7.
Overall, the average of the distances computed by our algorithm is a little better than the average of the ones computed by the EH algorithm, and the execution times are similar. The percentages of computed distances that equal the transposition distance, for both algorithms, are also compared with others available in the literature. Finally, we investigate the performance of both implementations on longer permutations of maximum length 500. From this experiment, we conclude that both maximum and average distances computed by our algorithm are a little better than the ones computed by the EH algorithm. We also conclude that the running times of both algorithms are similar.
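
To make the problem statement concrete, the following Python sketch applies a transposition (a swap of two adjacent blocks) and computes the exact transposition distance of every permutation of a small size by breadth-first search from the identity. It is only an illustration of the definition, not the algebraic algorithm or the EH algorithm described in the abstract; the names `apply_transposition` and `all_distances` are hypothetical, and the exhaustive search is feasible only for small n.

```python
from collections import deque


def apply_transposition(perm, i, j, k):
    """Swap the adjacent blocks perm[i:j] and perm[j:k] (requires i < j < k)."""
    return perm[:i] + perm[j:k] + perm[i:j] + perm[k:]


def all_distances(n):
    """Exact transposition distance of every permutation of 1..n, by BFS.

    The inverse of a transposition is again a transposition, so the number
    of steps from the identity to a permutation equals the number of steps
    needed to sort that permutation.
    """
    identity = tuple(range(1, n + 1))
    dist = {identity: 0}
    queue = deque([identity])
    while queue:
        perm = queue.popleft()
        # Enumerate all cut points 0 <= i < j < k <= n.
        for i in range(n - 1):
            for j in range(i + 1, n):
                for k in range(j + 1, n + 1):
                    nxt = apply_transposition(perm, i, j, k)
                    if nxt not in dist:
                        dist[nxt] = dist[perm] + 1
                        queue.append(nxt)
    return dist
```

For example, `all_distances(3)[(3, 2, 1)]` evaluates to 2: the reversed permutation of size 3 cannot be sorted with a single block swap.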

New bounds and tractable instances for the transposition distance

2006

The problem of sorting by transpositions asks for a sequence of adjacent interval exchanges that sorts a permutation and is of the shortest possible length. The distance of the permutation is defined as the length of such a sequence. Despite the apparently intuitive nature of this problem, introduced in 1995 by Bafna and Pevzner, the complexity of both finding an optimal sequence and computing the distance remains open today. In this paper, we establish connections between two different graph representations of permutations, which allows us to compute the distance of a few nontrivial classes of permutations in linear time and space, bypassing the use of any graph structure. By showing that every permutation can be obtained from one of these classes, we prove a new tight upper bound on the transposition distance. Finally, we give improved bounds on some other families of permutations and prove formulas for computing the exact distance of other classes of permutations, again in polynomial time.

Reconstruction Algorithms for Permutation Graphs and Distance-Hereditary Graphs

IEICE Transactions on Information and Systems, 2013

The PREIMAGE CONSTRUCTION problem of Kratsch and Hemaspaandra arose naturally from the famous graph reconstruction conjecture. It deals with the algorithmic aspects of the conjecture. We present an O(n^8) time algorithm for PREIMAGE CONSTRUCTION on permutation graphs and an O(n^4(n + m)) time algorithm for PREIMAGE CONSTRUCTION on distance-hereditary graphs, where n is the number of graphs in the input, and m is the number of edges in a preimage. Since each graph of the input has n − 1 vertices and O(n^2) edges, the input size is O(n^3) (or O(nm)). There are polynomial-time isomorphism algorithms for permutation graphs and distance-hereditary graphs. However, the number of permutation (distance-hereditary) graphs obtained by adding a vertex to a permutation (distance-hereditary) graph is generally exponentially large. Thus, exhaustive checking of these graphs does not yield a polynomial-time algorithm. Therefore, reducing the number of preimage candidates is the key point.

Reconstruction Algorithms for Permutation Graphs and Distance-Hereditary Graphs (Algorithms (AL) Vol.2009-AL-126)

Research Reports: Algorithms (AL), 2009

A Linear-Time Algorithm for Computing Inversion Distance between Signed Permutations with an Experimental Study

Journal of Computational Biology, 2001

Hannenhalli and Pevzner gave the first polynomial-time algorithm for computing the inversion distance between two signed permutations, as part of the larger task of determining the shortest sequence of inversions needed to transform one permutation into the other. Their algorithm (restricted to distance calculation) proceeds in two stages: in the first stage, the overlap graph induced by the permutation is decomposed into connected components, then in the second stage certain graph structures (hurdles and others) are identified. Berman and Hannenhalli avoided the explicit computation of the overlap graph and gave an O(nα(n)) algorithm, based on a Union-Find structure, to find its connected components, where α is the inverse Ackermann function. Since for all practical purposes α(n) is a constant no larger than four, this algorithm has been the fastest practical algorithm to date. In this paper, we present a new linear-time algorithm for computing the connected components, which is more efficient than that of Berman and Hannenhalli in both theory and practice. Our algorithm uses only a stack and is very easy to implement. We give the results of computational experiments over a large range of permutation pairs produced through simulated evolution; our experiments show a speed-up by a factor of 2 to 5 in the computation of the connected components and by a factor of 1.3 to 2 in the overall distance computation.

A new tight upper bound on the transposition distance

Algorithms in Bioinformatics, 2005

We study the problem of computing the minimal number of adjacent, non-intersecting block interchanges required to transform a permutation into the identity permutation. In particular, we use the graph of a permutation to compute that number for a particular class of permutations in linear time and space, and derive a new tight upper bound on the so-called transposition distance.

Using Graphs for the Analysis and Construction of Permutation Distance-Preserving Mappings

IEEE Transactions on Information Theory, 2000

A new way of looking at permutation distance-preserving mappings (DPMs) is presented by making use of a graph representation. The properties necessary to make such a graph distance-preserving are also investigated. Further, this new knowledge is used to analyze previous constructions, as well as to construct a new general mapping algorithm for a previous multilevel construction.

Lower bounding edit distances between permutations

SIAM Journal on Discrete Mathematics 27 (3), 1410-1428, 2013

A number of fields, including the study of genome rearrangements and the design of interconnection networks, deal with the connected problems of sorting permutations in "as few moves as possible", using a given set of allowed operations, or computing the number of moves the sorting process requires, often referred to as the distance of the permutation. These operations often act on just one or two segments of the permutation, e.g. by reversing one segment or exchanging two segments. The cycle graph of the permutation to sort is a fundamental tool in the theory of genome rearrangements, and has proved useful in settling the complexity of many variants of the above problems. In this paper, we present an algebraic reinterpretation of the cycle graph of a permutation π as an even permutation π̄, and show how to reformulate our sorting problems in terms of particular factorisations of the latter permutation. Using our framework, we recover known results in a simple and unified way, and obtain a new lower bound on the prefix transposition distance (where a prefix transposition displaces the initial segment of a permutation), which is shown to outperform previous results. Moreover, we use our approach to improve the best known lower bound on the prefix transposition diameter from 2n/3 to ⌊3n/4⌋, and investigate a few relations between some statistics on π and π̄.
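
As an illustration of the cycle graph mentioned in this abstract, the sketch below builds the classical cycle graph of Bafna and Pevzner for an unsigned permutation and evaluates the well-known lower bound (n + 1 − c_odd)/2 on the transposition distance, where c_odd is the number of alternating cycles with an odd number of black edges. This is the standard graph-based construction, not the algebraic reinterpretation or the prefix transposition bound proposed in the paper; the function names `cycle_lengths` and `transposition_lower_bound` are hypothetical.

```python
def cycle_lengths(perm):
    """Lengths (in black edges) of the alternating cycles of the cycle graph.

    The permutation is extended with 0 in front and n + 1 at the end.
    Black edge i joins ext[i] to ext[i - 1]; a gray edge joins v to v + 1.
    Following a black edge and then a gray edge leads from black edge i
    to the black edge that starts at the position of ext[i - 1] + 1.
    """
    n = len(perm)
    ext = [0, *perm, n + 1]
    pos = {value: index for index, value in enumerate(ext)}
    seen = set()
    lengths = []
    for start in range(1, n + 2):
        if start in seen:
            continue
        length = 0
        i = start
        while i not in seen:
            seen.add(i)
            length += 1
            i = pos[ext[i - 1] + 1]
        lengths.append(length)
    return lengths


def transposition_lower_bound(perm):
    """Lower bound (n + 1 - c_odd) / 2 on the transposition distance."""
    n = len(perm)
    odd_cycles = sum(1 for length in cycle_lengths(perm) if length % 2 == 1)
    # n + 1 and the number of odd cycles have the same parity,
    # so the integer division is exact.
    return (n + 1 - odd_cycles) // 2
```

For the identity permutation every cycle has length 1 and the bound is 0. For (3, 2, 1) the graph has two cycles of length 2, giving a bound of 2, which matches the true distance in that case.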

Branch-and-bound algorithms for the problem of sorting by transpositions

2006

In computational biology, genome rearrangements is a field in which we investigate the mutational event of transposition, which moves blocks of genes from one region to another inside a chromosome. This event gives rise to the combinatorial problem of transposition distance, which consists in finding the minimum number of transpositions transforming one chromosome into another. The objective of this work is to propose algorithms that improve the results for this problem, so that they can be used by biologists to infer the evolutionary distance between two organisms, one evolved from the other by transpositions. We devise a simple branch-and-bound algorithm, and propose a branch-and-bound heuristic to improve the results computed by the 1.5-approximation algorithm of , based on a structure called interleaving graph. Executions of our algorithms on all permutations with lengths 2 to 11 have shown better performance when compared to other results. The aim of this article is to contribute to determining the complexity of the problem of transposition distance, which remains open. This work is mainly an effort to consolidate the bioinformatics and computational biology areas in the Midwest region of Brazil. This knowledge can be used in the context of the regional genome sequencing projects, and also for the training of experts who can contribute to the development of science and technology in this region.
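
The general idea of an exact search pruned by a lower bound can be sketched as follows. This is a minimal illustration, assuming the simple breakpoint bound (a transposition removes at most three breakpoints, so at least ⌈b/3⌉ transpositions are needed) and an iterative-deepening depth-first search; it is not the authors' branch-and-bound algorithm or their interleaving-graph heuristic, and all function names are hypothetical.

```python
def breakpoints(perm):
    """Number of positions where consecutive extended elements are not x, x + 1."""
    ext = (0, *perm, len(perm) + 1)
    return sum(1 for a, b in zip(ext, ext[1:]) if b != a + 1)


def lower_bound(perm):
    """Ceiling of breakpoints / 3; it is 0 exactly for the identity."""
    return -(-breakpoints(perm) // 3)


def neighbours(perm):
    """All permutations reachable from perm by one transposition."""
    n = len(perm)
    for i in range(n - 1):
        for j in range(i + 1, n):
            for k in range(j + 1, n + 1):
                yield perm[:i] + perm[j:k] + perm[i:j] + perm[k:]


def exact_distance(perm):
    """Exact transposition distance by depth-limited search with pruning."""
    perm = tuple(perm)

    def search(current, moves_left):
        bound = lower_bound(current)
        if bound == 0:
            return True   # current is the identity
        if bound > moves_left:
            return False  # prune: cannot be sorted within the budget
        return any(search(nxt, moves_left - 1) for nxt in neighbours(current))

    # Start from the lower bound and raise the budget until a sorting
    # sequence of that length exists.
    limit = lower_bound(perm)
    while not search(perm, limit):
        limit += 1
    return limit
```

For instance, `exact_distance([4, 3, 2, 1])` returns 3, although the breakpoint bound for that permutation is only 2. The running time is exponential, so the sketch is meant for short permutations only.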
