Motif Finding with Application to the Transcription Factor Binding Sites Problem

An Efficient Ant Colony Algorithm for DNA Motif Finding

2014

Finding motifs in gene sequences is one of the most important problems in bioinformatics and is NP-hard. This paper proposes a new ant colony optimization algorithm based on the consensus approach, in which a relaxation technique is applied to find the location of the motif. The efficiency of the algorithm is evaluated by comparing it with state-of-the-art algorithms.
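
To make the consensus approach concrete, the following minimal sketch scores one candidate set of motif locations by building a consensus string column by column. It is an illustration only, not the algorithm of the paper: the function name `consensus_and_score`, the toy sequences, and the choice of "sum of majority counts" as the score are all assumptions.

```python
from collections import Counter

def consensus_and_score(sequences, starts, motif_len):
    """Return (consensus, score) for the windows selected by `starts`.

    Assumed scoring: for each motif column, count how many windows
    agree with the most frequent base, then sum over columns.
    """
    windows = [seq[s:s + motif_len] for seq, s in zip(sequences, starts)]
    consensus = []
    score = 0
    for column in zip(*windows):
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base)
        score += count
    return "".join(consensus), score

# Toy input (hypothetical): one start position per sequence.
seqs = ["ACGTACGT", "TTACGTAA", "GGACGAAC"]
consensus, score = consensus_and_score(seqs, [0, 2, 2], 4)
print(consensus, score)
```

A search procedure such as ant colony optimization would then look for the start positions that maximize this kind of score.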

An ant colony optimization based algorithm for identifying gene regulatory elements

Computers in Biology and Medicine, 2013

Identifying the regulatory elements in gene sequences is one of the most important tasks in bioinformatics. Most existing algorithms for identifying regulatory elements tend to converge to a local optimum and have high time complexity. Ant Colony Optimization (ACO) is a meta-heuristic method based on swarm intelligence, derived from a model of the collective foraging behavior of real ants. Taking advantage of ACO traits such as self-organization and robustness, this paper designs and implements an ACO-based algorithm named ACRI (ant-colony-regulatory-identification) for identifying all possible transcription factor binding sites in the upstream regions of co-expressed genes. To accelerate the ants' search, a local optimization strategy is presented that adjusts the ants' start positions on the searched sequences. By exploiting the optimization ability of ACO, ACRI can not only improve the precision of the results but also achieve very high speed. Experimental results on real-world datasets show that ACRI can outperform other traditional algorithms in both speed and solution quality.
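
The general shape of an ACO search over binding-site start positions can be sketched as follows. This is a generic illustration under stated assumptions, not ACRI itself: the pheromone layout (one value per candidate start position per sequence), the parameters `n_ants`, `n_iters`, and `rho`, and the consensus-style score are hypothetical, and the local optimization of start positions described in the abstract is not modeled.

```python
import random
from collections import Counter

def consensus_score(windows):
    """Sum, over motif columns, of the count of the most frequent base."""
    return sum(Counter(col).most_common(1)[0][1] for col in zip(*windows))

def aco_motif_search(sequences, motif_len, n_ants=20, n_iters=50, rho=0.1, seed=0):
    rng = random.Random(seed)
    # Assumed pheromone model: one value per candidate start position.
    tau = [[1.0] * (len(s) - motif_len + 1) for s in sequences]
    best_starts, best_score = None, -1
    for _ in range(n_iters):
        for _ in range(n_ants):
            # Each ant picks one start per sequence, biased by pheromone.
            starts = [rng.choices(range(len(t)), weights=t)[0] for t in tau]
            windows = [s[i:i + motif_len] for s, i in zip(sequences, starts)]
            score = consensus_score(windows)
            if score > best_score:
                best_starts, best_score = starts, score
        # Evaporate everywhere, then reinforce the best-so-far positions.
        for t, i in zip(tau, best_starts):
            for j in range(len(t)):
                t[j] *= 1.0 - rho
            t[i] += best_score / (motif_len * len(sequences))
    return best_starts, best_score

seqs = ["ACGTACGT", "TTACGTAA", "GGACGAAC"]
starts, score = aco_motif_search(seqs, 4)
print(starts, score)
```

Evaporation keeps early choices from dominating, while the deposit on the best-so-far positions is what lets the colony converge; the balance between the two is controlled here by the assumed parameter `rho`.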

Algorithm based on the Ant Colony Optimization for the DNA motif fusion

are useful to determine nucleotides/amino acids that are likely involved in structure, function, regulation, and evolution, or to infer homology between genes/proteins. The main objective of this paper is the fusion of motifs. Our task is to analyze a set of possible motifs and to detect whether similarity exists between them, in order to construct a general motif. The motif fusion method is based on a combinatorial optimization algorithm called the Artificial Ants System. This method uses the nucleotides of the first motif to construct the graph on which the ants will walk. The ants then traverse the graph following the path of the second motif, using a transition function that favors moves between similar nucleotides. As they walk, the ants leave pheromone on the nodes, so that at the end some nodes hold a lot of pheromone and others little. Finally, the graph is traversed again to construct the resulting motif, composed of the nodes with the most pheromone.

Implementation of an ant colony system for DNA sequence optimization

Artificial Life and Robotics, 2009

DNA computation exploits the computational power inherent in molecules for information processing. However, in order to perform the computation correctly, a set of good DNA sequences is crucial. A lot of work has been carried out on designing good DNA sequences to achieve reliable molecular computation. In this article, the ant colony system (ACS) is introduced as a new tool for DNA sequence design. In this approach, DNA sequence design is modeled as a path-finding problem on a graph of four nodes, which enables the implementation of the ACS. The results of the proposed approach are compared with other methods such as the genetic algorithm.
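
The four-node path-finding formulation can be illustrated with a small sketch in which each node is a base and an ant's walk spells out a sequence. Everything specific here is an assumption made for illustration: the `fitness` objective (GC content near 50% and few identical neighbours), the parameter values, and the omission of a heuristic term from the transition rule. Only the overall ACS pattern (greedy-or-sample transition with threshold `q0`, local update, global update on the best walk) follows the standard ant colony system.

```python
import random

BASES = "ACGT"  # the four nodes of the graph

def fitness(seq):
    """Hypothetical objective: balanced GC content, few repeated neighbours."""
    gc = sum(b in "GC" for b in seq) / len(seq)
    repeats = sum(a == b for a, b in zip(seq, seq[1:]))
    return 1.0 / (1.0 + abs(gc - 0.5) * len(seq) + repeats)

def acs_design(length, n_ants=10, n_iters=30, q0=0.9, rho=0.1, tau0=0.1, seed=0):
    rng = random.Random(seed)
    # Pheromone on every directed edge, self-loops included.
    tau = {(a, b): tau0 for a in BASES for b in BASES}
    best_seq, best_fit = None, -1.0
    for _ in range(n_iters):
        for _ in range(n_ants):
            seq = [rng.choice(BASES)]
            while len(seq) < length:
                cur = seq[-1]
                weights = [tau[(cur, b)] for b in BASES]
                if rng.random() < q0:
                    nxt = BASES[weights.index(max(weights))]  # exploit
                else:
                    nxt = rng.choices(BASES, weights=weights)[0]  # explore
                # Local update pulls the used edge back toward tau0.
                tau[(cur, nxt)] = (1 - rho) * tau[(cur, nxt)] + rho * tau0
                seq.append(nxt)
            fit = fitness(seq)
            if fit > best_fit:
                best_seq, best_fit = "".join(seq), fit
        # Global update reinforces the edges of the best-so-far walk.
        for a, b in zip(best_seq, best_seq[1:]):
            tau[(a, b)] = (1 - rho) * tau[(a, b)] + rho * best_fit
    return best_seq, best_fit

designed, fit = acs_design(12)
print(designed, round(fit, 3))
```

A realistic design objective would combine several constraints on the sequence set; the single-sequence `fitness` above only stands in for that.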

GAMOT: An efficient genetic algorithm for finding challenging motifs in DNA sequences

2006

Weak signals that mark transcription factor binding sites involved in gene regulation are considered to be challenging motifs. Identifying these motifs in unaligned DNA sequences is a computationally hard problem that requires efficient algorithms. Genetic Algorithms (GA), inspired by evolution in nature, are a class of stochastic search algorithms that have been applied successfully to many computationally hard problems, including regulatory site prediction.

On the use of algorithms to discover motifs in DNA sequences

2011

Many approaches are currently devoted to finding DNA motifs in nucleotide sequences. However, this task remains challenging for specialists because of the difficulty of understanding gene regulatory mechanisms in depth, especially when analyzing binding sites in DNA. These sites, or specific nucleotide sequences, are known to be responsible for transcription processes. This work therefore aims to provide an updated overview of the strategies developed to discover meaningful motifs in DNA-related sequences and, in particular, of their attempts to find relevant binding sites. Among all existing approaches, this work focuses on dictionary, ensemble, and artificial intelligence-based algorithms, since they represent the classical and the leading ones, respectively.

Ant Colony Optimization for Construction of Common Pattern of the Protein Motifs

This work presents an approach for constructing common patterns of amyloid protein motifs, extracted from the AMYPdb database and written as regular expressions following the PROSITE rules. Our task is to analyze a set of possible motifs and to detect whether similarity exists between them, in order to construct a general motif. The Ant Colony Optimization Model uses a combinatorial optimization algorithm based on ant colonies. It uses the amino acids of the first motif to construct the graph on which the ants will walk. The ants then traverse the graph following the path of the second motif, guided by a transition function that favors moves between similar amino acids. As they walk, the ants leave pheromone on the nodes, so that at the end some nodes hold a lot of pheromone and others little. Finally, the graph is traversed again to construct the resulting regular expression, composed of the nodes with the most pheromone.

Novel Applications of Ant Colony Optimization with the Traveling Salesman Problem in DNA Sequence Optimization

IEEE, 2022

The Ant Colony Optimization Algorithm is a novel optimization algorithm based on the intelligence of ant behavior, whereas the Traveling Salesman Problem is the problem of determining the shortest route through a group of cities that starts in one city and visits every other city exactly once before returning to the starting (home) city. This study proposes an Ant Colony Optimization approach with the Traveling Salesman Problem (ACO-TSP) for DNA sequence optimization. The proposed technique is a unique ant colony optimization approach for reconstructing DNA sequences from fragments of DNA. Existing meta-heuristics, on the other hand, are consistently outperformed by newly invented constructive heuristics. This model was developed based on these novel heuristics, with four nodes (cities) representing the four DNA bases. According to the experimental findings, the new approach is more reliable and generates higher-quality results, with a minimum average of 17.6 and a maximum DNA accuracy similarity of 59.1.

An Efficient Combinatorial Approach for Solving the DNA Motif Finding Problem

2009

The detection of an over-represented subsequence in a set of (carefully chosen) DNA sequences is often the main clue leading to the investigation of a possible functional role for such a subsequence. Over-represented substrings (possibly with local mutations) in a biological string are termed motifs. A typical functional unit that can be modeled by a motif is a Transcription Factor Binding Site (TFBS), a portion of the DNA sequence suited to binding a protein that participates in complex transcriptomic biochemical reactions.
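
The notion of an over-represented substring with local mutations can be made concrete with a brute-force sketch that counts, for every candidate k-mer, how many sequences contain it with at most `d` mismatches. This is an illustration only and not the combinatorial approach of the paper; the function names, the Hamming-distance mutation model, and the toy sequences are assumptions, and the exhaustive enumeration of all 4^k candidates is practical only for small `k`.

```python
from itertools import product

def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def occurs_within(seq, pattern, d):
    """True if some window of `seq` is within d mismatches of `pattern`."""
    k = len(pattern)
    return any(hamming(seq[i:i + k], pattern) <= d
               for i in range(len(seq) - k + 1))

def motif_support(sequences, k, d):
    """Map each k-mer to the number of sequences that contain it
    with at most d mismatches (k-mers with zero support are omitted)."""
    support = {}
    for cand in map("".join, product("ACGT", repeat=k)):
        n = sum(occurs_within(s, cand, d) for s in sequences)
        if n:
            support[cand] = n
    return support

seqs = ["ACGTACGT", "TTACGTAA", "GGACGAAC"]
exact = motif_support(seqs, 4, 0)
fuzzy = motif_support(seqs, 4, 1)
print(exact["ACGT"], fuzzy["ACGT"])
```

In the toy input, the third sequence contains ACGA rather than ACGT, so the candidate ACGT is supported by all three sequences only once a single mismatch is allowed.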