Accelerated Motif Detection Using Combinatorial Techniques

A New Approach to Count Pattern Motifs Using Combinatorial Techniques

2012

Background: Network motif algorithms have been a topic of research mainly since the seminal 2002 paper by Milo et al., which introduced motifs as a way to uncover the basic building blocks of most networks. In bioinformatics, motifs have been applied mainly to gene regulation networks. Results: This paper proposes new algorithms to exactly count isomorphic pattern motifs of sizes 3, 4 and 5 in directed graphs. Let G(V, E) be a directed graph with m = |E|. We describe an algorithm with $O(m\sqrt{m})$ time complexity to count isomorphic patterns of size 3. To count isomorphic patterns of size 4, we propose an $O(m^2)$ algorithm. To count patterns with 5 vertices, the algorithm is $O(m^2 n)$. Conclusion: The new algorithms were implemented and compared with the FANMOD and Kavosh motif detection tools. The experiments show that our algorithms are significantly faster than FANMOD and Kavosh. We also make our motif detection tool available on the Internet.
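To give a feel for how a size-3 count can stay within $O(m\sqrt{m})$, here is a minimal sketch. It is not the authors' algorithm: it assumes a simple undirected graph (the paper handles directed graphs), and the function name `count_size3_patterns` is hypothetical. Triangles are found by orienting each edge toward the higher-degree endpoint, and open 3-vertex paths are then obtained by a counting formula instead of enumeration, which is the general flavour of a combinatorial shortcut.

```python
from collections import defaultdict
from math import comb


def count_size3_patterns(edges):
    """Return (triangles, open_paths) for a simple undirected graph.

    Illustrative only: vertex labels must be mutually comparable (e.g. ints).
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    # Rank vertices by (degree, label); keep only neighbours of higher rank.
    rank = {v: (len(adj[v]), v) for v in adj}
    fwd = {v: {w for w in adj[v] if rank[w] > rank[v]} for v in adj}

    # Each triangle is found exactly once, from its lowest-ranked vertex.
    triangles = 0
    for u in adj:
        for v in fwd[u]:
            triangles += len(fwd[u] & fwd[v])

    # Every vertex of degree d is the centre of C(d, 2) two-edge paths.
    # A triangle contributes three of them, so subtract those to keep
    # only the induced (open) paths.
    wedges = sum(comb(len(adj[v]), 2) for v in adj)
    open_paths = wedges - 3 * triangles
    return triangles, open_paths
```

For example, `count_size3_patterns([(0, 1), (1, 2)])` returns `(0, 1)`: no triangle and one open path.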

acc-MOTIF: Accelerated Motif Detection Using Combinatorial Techniques

2012

Background: Network motif algorithms have been a topic of research mainly since the seminal 2002 paper by Milo et al., which introduced motifs as a way to uncover the basic building blocks of most networks. In bioinformatics, motifs have been applied mainly to gene regulation networks. Results: This article proposes two new algorithms to exactly count isomorphic pattern motifs of sizes 3 and 4 in directed graphs. The algorithms are accelerated by combinatorial techniques. Let G(V, E) be a directed graph with m = |E|. We describe an algorithm with $O(m\sqrt{m})$ time complexity to count isomorphic patterns of size 3. To count isomorphic patterns of size 4, we propose an $O(m^2)$ algorithm. Conclusion: The new algorithms were implemented and compared with the FANMOD motif detection tool. The experiments show that our algorithms are significantly faster than the other tools. We also make our motif detection tool available on the Internet.

Efficient Counting of Network Motifs

2010 IEEE 30th International Conference on Distributed Computing Systems Workshops, 2010

Counting network motifs plays an important role in studying a wide range of complex networks. However, when the network is large, as in the case of Internet topology and WWW graphs, counting the number of motifs becomes prohibitive. Devising efficient motif counting algorithms thus becomes an important goal.

acc-Motif Detection Tool

Network motif algorithms have been a topic of research mainly since the seminal 2002 paper by Milo et al., which introduced motifs as a way to uncover the basic building blocks of most networks. In bioinformatics, motifs have been applied mainly in the field of gene regulation networks. This paper proposes new algorithms to exactly count isomorphic pattern motifs of sizes 3, 4 and 5 in directed graphs. Let $G(V,E)$ be a directed graph with $m=|E|$. We describe an algorithm with $O(m\sqrt{m})$ time complexity to count isomorphic patterns of size 3. To count isomorphic patterns of size 4, we propose an $O(m^2)$ algorithm. To count patterns with 5 vertices, the algorithm is $O(m^2 n)$. The new algorithms were implemented and compared with the FANMOD and Kavosh motif detection tools. The experiments show that our algorithms are significantly faster than FANMOD and Kavosh. We also make our motif detection tool available on the Internet.

A Faster Network Motif Detection Tool

2018

Network motifs provide a way to uncover the basic building blocks of most complex networks. This task usually demands heavy computation, especially for motifs with 5 or more vertices. This paper presents an extended methodology with the following features: (i) search for motifs of up to 6 vertices, (ii) multithreaded processing, and (iii) a new enumeration algorithm with lower complexity. The algorithm that computes motifs solves isomorphism in $O(1)$ using a hash table. Concurrent threads evaluate distinct graphs. The enumeration algorithm has lower computational complexity. The experiments show better performance than other methods available in the literature, allowing bioinformatics researchers to efficiently identify motifs of sizes 3, 4, 5, and 6.
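One way a hash table can reduce isomorphism testing to a constant-time lookup is to precompute, for every possible adjacency pattern of a small subgraph, a canonical representative of its isomorphism class. The sketch below illustrates that idea for directed subgraphs on 3 vertices. It is an assumption about how such a table could be built, not the tool's actual implementation, and the names `PAIRS`, `encode`, `build_table` and `ISO_CLASS` are hypothetical.

```python
from itertools import permutations

K = 3
# The 6 possible arcs between 3 labelled vertices, one bit each.
PAIRS = [(i, j) for i in range(K) for j in range(K) if i != j]


def encode(arcs):
    """Pack a set of arcs (i, j) into an integer bitmask."""
    mask = 0
    for bit, pair in enumerate(PAIRS):
        if pair in arcs:
            mask |= 1 << bit
    return mask


def build_table():
    """Map every arc bitmask to the smallest bitmask among its relabellings."""
    table = {}
    for mask in range(1 << len(PAIRS)):
        arcs = {PAIRS[b] for b in range(len(PAIRS)) if mask >> b & 1}
        table[mask] = min(
            encode({(p[i], p[j]) for i, j in arcs})
            for p in permutations(range(K))
        )
    return table


# Built once; afterwards two subgraphs are isomorphic exactly when their
# masks map to the same value, which is a single dictionary lookup each.
ISO_CLASS = build_table()
```

The table has 64 entries for 3 vertices. The same construction grows quickly with subgraph size, so a practical tool would likely need a more compact scheme for larger motifs.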

acc-Motif: Accelerated Network Motif Detection

IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2014

Network motif algorithms have been a topic of research mainly since the seminal 2002 paper by Milo et al. [1], which introduced motifs as a way to uncover the basic building blocks of most networks. Motifs have been applied mainly in bioinformatics, to gene regulation networks. Motif detection is based on induced subgraph counting. This paper proposes an algorithm to count subgraphs of size k + 2 based on the set of induced subgraphs of size k. The general technique was applied to detect motifs of sizes 3, 4 and 5 in directed graphs. These algorithms have time complexity $O(a(G)m)$, $O(m^2)$ and $O(nm^2)$, respectively, where a(G) is the arboricity of G(V, E). Computational experiments on public datasets show that the proposed technique was one order of magnitude faster than Kavosh and FANMOD. Compared to NetMODE, acc-Motif had slightly better performance.

Kavosh: a new algorithm for finding network motifs

BMC …, 2009

Background: Complex networks are studied across many fields of science and are particularly important for understanding biological processes. Motifs in networks are small connected subgraphs that occur at significantly higher frequencies than in random networks. They have recently attracted much attention as a useful concept for uncovering structural design principles of complex networks. Existing algorithms for finding network motifs are extremely costly in CPU time and memory consumption and have practical restrictions on the size of motifs.

Approximating the Number of Network Motifs

Internet Mathematics, 2009

The World Wide Web, the Internet, coupled biological and chemical systems, neural networks, and social interacting species are only a few examples of systems composed of a large number of highly interconnected dynamical units. These networks contain characteristic patterns, termed network motifs, which occur far more often than in randomized networks with the same degree sequence. Several algorithms have been suggested for counting or detecting the number of induced or non-induced occurrences of network motifs in the form of trees and bounded treewidth subgraphs of size O(log n), and of size at most 7 for some motifs. In addition, counting the number of motifs a node is part of was recently suggested as a method to classify nodes in the network. The promise is that the distribution of motifs a node participates in is an indication of its function in the network. Therefore, counting the number of network motifs a node is part of poses a major challenge. However, no such practical algorithm exists. We present several algorithms with time complexity $O\left(e^{2k} k \cdot n \cdot |E| \cdot \log\frac{1}{\delta} / \epsilon^2\right)$ that, for the first time, approximate for every vertex the number of non-induced occurrences of the motif the vertex is part of, for k-length cycles, k-length cycles with a chord, and (k − 1)-length paths, where k = O(log n), and for all motifs of size at most four. In addition, we show algorithms that approximate the total number of non-induced occurrences of these network motifs, when no efficient algorithm exists. Some of our algorithms use the color coding technique.
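The color coding technique mentioned in the abstract can be illustrated with a short sketch. This is the classical form of the idea applied to the total number of simple paths on k vertices in an undirected graph, not the per-vertex algorithms of the paper. The function names are hypothetical, `adj` is assumed to map each vertex to an iterable of its neighbours, and k is assumed to be at least 2. Vertices are coloured at random with k colours, paths whose vertices all have distinct colours are counted by dynamic programming over colour sets, and the count is rescaled by the probability $k!/k^k$ that a given path is colourful.

```python
import random
from math import factorial


def count_colorful_paths(adj, color, k):
    """Count simple paths on k vertices whose colours are all distinct."""
    full = (1 << k) - 1
    # dp[v] maps a colour-set bitmask to the number of colourful paths
    # that end at v and use exactly those colours.
    dp = {v: {1 << color[v]: 1} for v in adj}
    for _ in range(k - 1):
        nxt = {v: {} for v in adj}
        for v in adj:
            bit = 1 << color[v]
            for u in adj[v]:
                for mask, cnt in dp[u].items():
                    if not mask & bit:  # colour of v not used yet
                        nxt[v][mask | bit] = nxt[v].get(mask | bit, 0) + cnt
        dp = nxt
    # Each undirected path was counted once from each of its two ends.
    return sum(d.get(full, 0) for d in dp.values()) // 2


def estimate_paths(adj, k, trials=200, seed=0):
    """Average the rescaled colourful counts over random colourings."""
    rng = random.Random(seed)
    scale = k ** k / factorial(k)
    total = 0
    for _ in range(trials):
        color = {v: rng.randrange(k) for v in adj}
        total += count_colorful_paths(adj, color, k)
    return scale * total / trials
```

The dynamic programme runs in time proportional to $2^k \cdot |E|$ per colouring, which is why the approach stays practical only for small k such as k = O(log n).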

Suffix Graph - An Efficient Approach for Network Motif Mining

Journal of Data Mining in Genomics & Proteomics, 2016

A network motif is a pattern of interconnections that occurs in a complex network in numbers significantly higher than in similar randomized networks. The basic premise of finding network motifs lies in the ability to compute the frequency of subgraphs. To discover network motifs, one has to compute a subgraph census on the original network, which calculates the frequency of all subgraphs of a certain type. The frequency of the same set of subgraphs must then be computed on similar randomized networks. The bottleneck of the entire motif discovery process is therefore computing the subgraph frequencies, and this is the core computational problem. The proposed work presents the Suffix-Graph, a data structure that stores graphs efficiently, designs an algorithm that retrieves subgraphs efficiently to detect network motifs, and applies them to transcriptional interactions in Escherichia coli.
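The randomized networks used for comparison are commonly generated by repeatedly swapping the endpoints of two arcs, which keeps every vertex's in-degree and out-degree unchanged. The sketch below shows one such shuffle for a directed graph without self-loops or duplicate arcs. It is a generic illustration under those assumptions, not the Suffix-Graph method, and the function name and parameters are hypothetical.

```python
import random


def degree_preserving_shuffle(arcs, swaps, seed=0):
    """Return a randomized copy of `arcs` with the same in/out degrees.

    Assumes at least two arcs, no self-loops and no duplicate arcs.
    """
    rng = random.Random(seed)
    arcs = list(arcs)
    present = set(arcs)
    for _ in range(swaps):
        i, j = rng.sample(range(len(arcs)), 2)
        (a, b), (c, d) = arcs[i], arcs[j]
        # Proposed rewiring: a->b, c->d becomes a->d, c->b.
        # Reject it if it would create a self-loop or a duplicate arc.
        if a == d or c == b or (a, d) in present or (c, b) in present:
            continue
        present -= {(a, b), (c, d)}
        present |= {(a, d), (c, b)}
        arcs[i], arcs[j] = (a, d), (c, b)
    return arcs
```

A motif search would run the same subgraph census on the original arcs and on many shuffled copies, then compare the resulting frequencies.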

G-tries: an efficient data structure for discovering network motifs

Proceedings of the 2010 ACM Symposium on …, 2010

In this paper we propose a novel specialized data structure that we call g-trie, designed to deal with collections of subgraphs. The main conceptual idea is akin to a prefix tree in the sense that we take advantage of common topology by constructing a multiway tree where the descendants of a node share a common substructure. We give algorithms to construct a g-trie, to list all stored subgraphs, and to find occurrences on another graph of the subgraphs stored in the g-trie. We evaluate the implementation of this structure and its associated algorithms on a set of representative benchmark biological networks in order to find network motifs. To assess the efficiency of our algorithms we compare their performance with other known network motif algorithms also implemented in the same common platform. Our results show that indeed, g-tries are a feasible, adequate and very efficient data structure for network motifs discovery, clearly outperforming previous algorithms and data structures.
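The prefix-tree idea can be sketched in a few lines. The version below is a heavily simplified illustration, not the g-trie of the paper: it assumes undirected graphs given as adjacency matrices with a fixed vertex order and performs no canonical labelling or symmetry breaking, both of which a real implementation would need. The names `GTrieNode`, `insert` and `count_nodes` are hypothetical. Each tree level adds one vertex, and the edge key records how that vertex connects to the vertices already placed, so graphs that agree on their first vertices share a path from the root.

```python
class GTrieNode:
    def __init__(self):
        self.children = {}     # connection tuple -> GTrieNode
        self.is_graph = False  # True if a stored graph ends at this node


def insert(root, adj_matrix):
    """Store a graph, one vertex per tree level, in the given vertex order."""
    node = root
    for i, row in enumerate(adj_matrix):
        # Connections of vertex i to vertices 0..i (lower triangle of the matrix).
        key = tuple(row[: i + 1])
        node = node.children.setdefault(key, GTrieNode())
    node.is_graph = True


def count_nodes(root):
    """Number of tree nodes, including the root."""
    return 1 + sum(count_nodes(child) for child in root.children.values())


root = GTrieNode()
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
insert(root, triangle)
insert(root, path)
```

Here the triangle and the path share their first two levels (a single edge between vertices 0 and 1) and diverge only at the third vertex, so the tree needs five nodes instead of seven.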