Speeding up graph clustering via modular decomposition based compression
Related papers
Enhancing Modularity-Based Graph Clustering
Graph clustering is the task of grouping the vertices of an input graph into clusters. This article proposes a Two-Phase Modularity-Based Graph Clustering (2-PMGC) algorithm based on modularity optimization. The algorithm consists of two main phases: coarsening and refinement. The coarsening phase takes the original graph as input and produces a hierarchy of coarsened graphs. The refinement phase starts with the coarsest graph produced by the first phase and improves the clustering by moving vertices between clusters at each coarsening level. Our algorithm is evaluated on 16 real-world networks, where it achieves a clear increase in modularity.
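Both phases of 2-PMGC optimize Newman's modularity. For reference, the sketch below computes that measure in its per-cluster form, Q = Σ_c [ L_c/m − (D_c/2m)² ], where L_c is the number of intra-cluster edges and D_c the total degree of cluster c. The representation (an edge list plus a vertex-to-cluster dict) is chosen here for illustration and is not taken from the paper.

```python
from collections import defaultdict

def modularity(edges, community):
    """Newman modularity Q of a partition.

    edges: list of (u, v) pairs of an undirected, unweighted graph.
    community: dict mapping each vertex to its cluster label.
    """
    m = len(edges)
    degree = defaultdict(int)
    intra = defaultdict(int)        # intra-cluster edge count per cluster
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community[u] == community[v]:
            intra[community[u]] += 1
    deg_sum = defaultdict(int)      # total degree per cluster
    for node, d in degree.items():
        deg_sum[community[node]] += d
    # Q = sum over clusters c of  L_c/m - (D_c / 2m)^2
    return sum(intra[c] / m - (deg_sum[c] / (2 * m)) ** 2
               for c in deg_sum)
```

For example, two triangles joined by a single bridge edge, clustered as the two triangles, give Q = 6/7 − 1/2 ≈ 0.357.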
An Ultra-Fast Modularity-Based Graph Clustering Algorithm
2009
In this paper, we propose a multilevel graph partitioning scheme to speed up a modularity-based graph clustering technique. The modularity-based algorithm was proposed by Newman for partitioning graphs into communities without requiring the number of clusters as input. The algorithm seeks to maximize a modularity measure. However, its worst-case time complexity on sparse graphs is O(n^2), where n is the number of vertices, which can be prohibitive for many applications. The multilevel graph partitioning scheme consists of three phases: (i) reducing the size of the original graph (coarsening) by collapsing vertices and edges, (ii) partitioning the coarsened graph, and (iii) uncoarsening it to obtain a partition of the original graph. The rationale behind this strategy is to apply a computationally expensive method to a coarsened graph, i.e., one with a significantly reduced number of vertices and edges. Empirical evaluation of this approach demonstrates a significant speed-up of the modularity-based algorithm while maintaining good-quality cluster partitions.
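Phase (i) is commonly realized by collapsing a matching of the graph. The sketch below, which assumes a symmetric weighted adjacency-dict representation of our own choosing, performs one heavy-edge-matching coarsening pass; it illustrates the idea rather than reproducing the paper's implementation.

```python
def coarsen(adj):
    """One coarsening pass: greedily match each vertex with its
    heaviest unmatched neighbour, collapse matched pairs into
    supervertices, and accumulate edge weights between them.

    adj: dict vertex -> dict neighbour -> weight (symmetric).
    Returns (coarse_adj, mapping) where mapping sends each original
    vertex to its supervertex id.
    """
    mapping, matched, next_id = {}, set(), 0
    for u in adj:
        if u in matched:
            continue
        partner, best = None, float("-inf")
        for v, w in adj[u].items():
            if v != u and v not in matched and w > best:
                partner, best = v, w
        matched.add(u)
        mapping[u] = next_id
        if partner is not None:
            matched.add(partner)
            mapping[partner] = next_id
        next_id += 1
    coarse = {i: {} for i in range(next_id)}
    for u, nbrs in adj.items():
        cu = mapping[u]
        for v, w in nbrs.items():
            cv = mapping[v]
            if cu == cv:
                # weight internal to a supervertex is dropped here;
                # a full scheme would keep it as a self-loop weight
                continue
            coarse[cu][cv] = coarse[cu].get(cv, 0) + w
    return coarse, mapping
```

Repeated passes yield the hierarchy of progressively smaller graphs on which the expensive partitioning method is then run.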
Graph Compression: Toward a Generalized Algorithm
Zenodo (CERN European Organization for Nuclear Research), 2022
Currently, most graph compression algorithms focus on in-memory compression (such as for web graphs); few are feasible for external compression, and there is no generalized approach to either task. These compressed representations are versatile and can be applied to a great number of different applications, the most common being social network and search systems. We present a new set of compression approaches, both lossless and lossy, for external-memory graph compression. These new algorithms may also be applicable at runtime (i.e., running graph algorithms on the compressed representation).
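The paper's algorithms are not reproduced here, but a classical lossless building block in this space is gap encoding of sorted adjacency lists with variable-length bytes, popularized by web-graph compression frameworks; because it streams one adjacency list at a time, it also suits external-memory use. A minimal sketch:

```python
def encode_varint(n):
    """Encode a non-negative integer as little-endian 7-bit groups,
    with the high bit marking continuation."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        out.append(byte | 0x80 if n else byte)
        if not n:
            return bytes(out)

def encode_adjacency(neighbors):
    """Gap-encode a neighbour list: sort it, then store successive
    differences as varints. Small gaps (common when vertex ids have
    locality) compress to single bytes."""
    out, prev = bytearray(), 0
    for v in sorted(neighbors):
        out += encode_varint(v - prev)
        prev = v
    return bytes(out)

def decode_adjacency(data):
    """Inverse of encode_adjacency."""
    out, cur, acc, shift = [], 0, 0, 0
    for byte in data:
        acc |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:         # last byte of this varint
            cur += acc
            out.append(cur)
            acc = shift = 0
    return out
```

For instance, encode_adjacency([1000, 1001, 1003, 1024]) stores the gaps 1000, 1, 2, 21 in five bytes total, versus sixteen bytes for four fixed 32-bit vertex ids.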
Proceedings of the LinkKDD workshop at the …, 2004
Graphs form the foundation of many real-world datasets, ranging from Internet connectivity to social networks. Yet despite this underlying structure, the size of these datasets presents a nearly insurmountable obstacle to understanding the essential character of the data. We want to understand "what the graph looks like"; we want to know which vertices and edges are important and what the significant features of the graph are. For a communication network, such an understanding entails recognizing the overall design of the network (e.g., hub-and-spoke, mesh, backbone), as well as identifying the "important" nodes and links.
Scalable Compression of a Weighted Graph
arXiv, 2016
A graph is a useful data structure for modelling various real-life interactions, such as email communications, co-authorship among researchers, and interactions among chemical compounds. Recording such real-life interactions produces a knowledge-rich, massive repository of data. However, efficiently understanding the underlying trends and patterns is hard due to the large size of the graph. This paper therefore presents a scalable compression solution that computes a summary of a weighted graph. All of the aforementioned interactions from various domains are represented as edge weights in a graph, so a summary graph must take this vital aspect into account in order to capture different communication patterns. Experiments with the proposed method on two real-world, publicly available datasets against a state-of-the-art technique show an order-of-magnitude performance gain and better summarization accuracy.
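As a rough illustration of the kind of operation such a summarization performs, the sketch below collapses two vertices of a weighted graph into a supernode and sets each superedge weight to the mean of the parallel edge weights it replaces. Both the representation and the averaging rule are assumptions made for this example, not the paper's method.

```python
def merge_nodes(adj, a, b, new):
    """Collapse vertices a and b into supernode `new`, averaging
    the weights of parallel edges.

    adj: dict vertex -> dict neighbour -> weight (kept symmetric).
    `new` must be a fresh vertex id.
    """
    total, count = {}, {}
    for src in (a, b):
        for v, w in adj[src].items():
            if v in (a, b):
                continue            # edge internal to the supernode
            total[v] = total.get(v, 0.0) + w
            count[v] = count.get(v, 0) + 1
    del adj[a], adj[b]
    adj[new] = {}
    for v, t in total.items():
        w = t / count[v]            # mean of the 1 or 2 parallel edges
        adj[new][v] = w
        adj[v].pop(a, None)
        adj[v].pop(b, None)
        adj[v][new] = w
    return adj
```

A scalable summarizer would repeat such merges, choosing at each step the pair whose merge least degrades the weight information.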
Compression of Weighted Graphs
2011
We propose to compress weighted graphs (networks), motivated by the observation that large networks of social, biological, or other relations can be complex to handle and visualize. In this process, also known as graph simplification, nodes and (unweighted) edges are grouped into supernodes and superedges, respectively, to obtain a smaller graph. We propose models and algorithms for weighted graphs. The interpretation (i.e., decompression) of a compressed, weighted graph is that a pair of original nodes is connected by an edge if their supernodes are connected by one, and that the weight of an edge is approximated by the weight of the superedge. The compression problem then consists of choosing supernodes, superedges, and superedge weights so that the approximation error is minimized while the amount of compression is maximized.
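Under this decompression rule, the quality of a candidate compression can be scored directly. A minimal sketch, assuming an L1 error measured over the original edges (a complete measure would also penalize vertex pairs that decompress to spurious edges):

```python
def approximation_error(edges, supernode, superweight):
    """Sum over original edges of |true weight - decompressed weight|.

    edges:       dict (u, v) -> original weight, with u < v.
    supernode:   dict vertex -> supernode id.
    superweight: dict (s, t) -> superedge weight, with s <= t;
                 a missing pair decompresses to weight 0 (no edge).
    """
    err = 0.0
    for (u, v), w in edges.items():
        s, t = sorted((supernode[u], supernode[v]))
        err += abs(w - superweight.get((s, t), 0.0))
    return err
```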
Modular decomposition of graphs and hierarchical modeling
2018
We consider Gallai's modular decomposition theory for network analytics: on the one hand, we argue that it is a choice tool for understanding structural and functional similarities among nodes in a network; on the other, we propose a model for random graphs based on this decomposition. Our approach establishes a well-defined context for hierarchical modeling and provides a solid theoretical framework for probabilistic and statistical methods. Theoretical and simulation results show that the model accommodates scale-free networks, high clustering coefficients, and small diameters, all of which are features observed in many natural and social networks.
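For concreteness, the central notion here is Gallai's module: a vertex set M is a module if every vertex outside M is adjacent either to all of M or to none of it. A direct check of this condition (illustrative code, not from the paper):

```python
def is_module(adj, M):
    """Return True iff M is a module of the graph.

    adj: dict vertex -> set of neighbours.
    M:   collection of vertices.
    """
    M = set(M)
    for x in set(adj) - M:
        seen = len(adj[x] & M)      # how much of M does x see?
        if 0 < seen < len(M):       # x distinguishes members of M
            return False
    return True
```

For example, in the star with edges c-a and c-b, the set {a, b} is a module: the only outside vertex, c, is adjacent to both of its members.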
A New Approach of Compression of Large Community Graph Using Graph Mining Techniques
Graph representations have vast applications and are used for knowledge extraction. As the applications of graphs have grown, the graphs themselves have become more complex and larger in size. Visualizing and analyzing a large community graph is challenging, so compression techniques may be used to study it. There should be no loss of information or knowledge while compressing the community graph. This paper starts with a formal introduction, followed by a representation of graph models in compressed form, using a greedy algorithm for the purpose. The paper proceeds in the same direction and proposes a similar technique for compressing a large community graph that is suitable for carrying out the steps of graph mining. Observations show that the proposed technique reduces the number of iteration steps, which may lead to better efficiency. An algorithm for the proposed technique is elaborated, followed by a suitable example.
A simple linear-time modular decomposition algorithm for graphs, using order extension
2004
The first polynomial-time algorithm (O(n^4)) for modular decomposition appeared in 1972 [8], and since then there have been incremental improvements, eventually resulting in linear-time algorithms. Although of optimal time complexity, these algorithms are quite complicated and difficult to implement. In this paper we present an easily implementable linear-time algorithm for modular decomposition. This algorithm uses the notion of a factorizing permutation and a new data structure, the Ordered Chain Partition.
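The linear-time machinery (factorizing permutations and Ordered Chain Partitions) is too involved for a short example, but the flavour of modular decomposition can be conveyed by the standard naive closure that grows the smallest module containing two given vertices by absorbing "splitters". This is a generic textbook subroutine, not the paper's algorithm:

```python
def smallest_module(adj, u, v):
    """Smallest module containing u and v, found by repeatedly
    absorbing splitters (outside vertices adjacent to only part of
    the current set). Naive polynomial time; the paper achieves
    linear time overall via factorizing permutations instead.

    adj: dict vertex -> set of neighbours.
    """
    M = {u, v}
    changed = True
    while changed:
        changed = False
        for x in set(adj) - M:
            seen = len(adj[x] & M)
            if 0 < seen < len(M):   # x splits M, so it must be absorbed
                M.add(x)
                changed = True
                break
    return M
```

Any module containing u and v must also contain every splitter of the current set, so the fixpoint of this loop is the unique smallest such module.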
A Structural Approach to Graph Compression
We consider graph compression in terms of graph families. In particular, we show that graphs of bounded genus can be compressed to O(n) bits, where n is the number of vertices. We identify a property based on separators that makes O(n)-bit compression possible for some graphs of bounded arboricity.