Commute-time convolution kernels for graph clustering

Kernels on Graphs as Proximity Measures

Lecture Notes in Computer Science, 2017

Kernels and, broadly speaking, similarity measures on graphs are extensively used in graph-based unsupervised and semi-supervised learning algorithms as well as in the link prediction problem. We analytically study proximity and distance properties of various kernels and similarity measures on graphs. This can potentially be useful for recommending the adoption of one or another similarity measure in a machine learning method. Also, we numerically compare various similarity measures in the context of spectral clustering and observe that normalized heat-type similarity measures with log modification generally perform the best.

Graph Kernels by Spectral Transforms

Many graph-based semi-supervised learning methods can be viewed as imposing smoothness conditions on the target function with respect to a graph representing the data points to be labeled. The smoothness properties of the functions are encoded in terms of Mercer kernels over the graph. The central quantity in such regularization is the spectral decomposition of the graph Laplacian, a matrix derived from the graph's edge weights. The eigenvectors with small eigenvalues are smooth, and ideally represent large cluster structures within the data. The eigenvectors having large eigenvalues are rugged, and considered noise.
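The idea of building a kernel by transforming the Laplacian spectrum can be sketched as follows. This is a minimal illustration, not the method of the paper: the function name `spectral_transform_kernel`, the example graph `W`, and the two transform functions are assumptions chosen for the example. Each eigenvalue is mapped through a decreasing function, so smooth eigenvectors (small eigenvalues) get large weight and rugged ones are damped.

```python
import numpy as np

def spectral_transform_kernel(W, transform):
    """Build K = sum_i transform(lambda_i) * phi_i phi_i^T from the
    combinatorial Laplacian L = D - W of a weighted undirected graph."""
    W = np.asarray(W, dtype=float)
    L = np.diag(W.sum(axis=1)) - W
    eigvals, eigvecs = np.linalg.eigh(L)
    eigvals = np.clip(eigvals, 0.0, None)  # remove tiny negative round-off
    return (eigvecs * transform(eigvals)) @ eigvecs.T

# Hypothetical 4-node graph: a triangle (0, 1, 2) with a pendant node 3.
W = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# Two illustrative decreasing transforms (parameter values are arbitrary).
K_diffusion = spectral_transform_kernel(W, lambda lam: np.exp(-0.5 * lam))
K_regularized = spectral_transform_kernel(W, lambda lam: 1.0 / (1.0 + lam))
```

Because both transforms are positive on the whole spectrum, the resulting matrices are symmetric positive definite and can be used wherever a Mercer kernel is expected.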

Clustering and embedding using commute times

IEEE Transactions on Pattern Analysis and …, 2007

This paper exploits the properties of the commute time between nodes of a graph for the purposes of clustering and embedding and explores its applications to image segmentation and multibody motion tracking. Our starting point is the lazy random walk on the graph, which is determined by the heat kernel of the graph and can be computed from the spectrum of the graph Laplacian. We characterize the random walk using the commute time (that is, the expected time taken for a random walk to travel between two nodes and return) and show how this quantity may be computed from the Laplacian spectrum using the discrete Green's function. Our motivation is that the commute time can be anticipated to be a more robust measure of the proximity of data than the raw proximity matrix. In this paper, we explore two applications of the commute time. The first is to develop a method for image segmentation using the eigenvector corresponding to the smallest eigenvalue of the commute time matrix. We show that our commute time segmentation method has the property of enhancing the intragroup coherence while weakening intergroup coherence and is superior to the normalized cut. The second application is to develop a robust multibody motion tracking method using an embedding based on the commute time. Our embedding procedure preserves commute time and is closely akin to kernel PCA, the Laplacian eigenmap, and the diffusion map. We illustrate the results on both synthetic image sequences and real-world video sequences and compare our results with several alternative methods.
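As a rough illustration of computing commute times from the Laplacian, the sketch below uses the standard identity CT(i, j) = vol(G) (L+_ii + L+_jj - 2 L+_ij), where L+ is the pseudoinverse of the combinatorial Laplacian and vol(G) is the sum of the degrees. This is an assumption made for brevity: the paper works through the discrete Green's function, and the function name `commute_time_matrix` is hypothetical. The graph is assumed to be connected.

```python
import numpy as np

def commute_time_matrix(W):
    """Commute times between all node pairs of a connected, weighted,
    undirected graph, via the pseudoinverse of L = D - W."""
    W = np.asarray(W, dtype=float)
    degrees = W.sum(axis=1)
    L = np.diag(degrees) - W
    L_pinv = np.linalg.pinv(L)
    diag = np.diag(L_pinv)
    # vol(G) * (L+_ii + L+_jj - 2 L+_ij), evaluated for every pair at once
    return degrees.sum() * (diag[:, None] + diag[None, :] - 2.0 * L_pinv)

# Path graph 0 - 1 - 2 with unit weights: vol(G) = 4, and the effective
# resistances are 1 (adjacent nodes) and 2 (end to end).
W_path = np.array([[0, 1, 0],
                   [1, 0, 1],
                   [0, 1, 0]], dtype=float)
C = commute_time_matrix(W_path)
```

On this path graph the commute time between the two end nodes is twice that between adjacent nodes, which matches the intuition that a walk must cross both edges in each direction.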

Graph Kernels from the Jensen-Shannon Divergence

Graph-based representations have proved powerful in computer vision. The challenge that arises with large amounts of graph data is the computationally burdensome calculation of edit distances. Graph kernels can be used to formulate efficient algorithms for high-dimensional data and have proved an elegant way to overcome this computational bottleneck. In this paper, we investigate whether the Jensen-Shannon divergence can be used as a means of establishing a graph kernel. The Jensen-Shannon kernel is a nonextensive information-theoretic kernel, defined using the entropy and mutual information computed from probability distributions over the structures being compared. To establish a Jensen-Shannon graph kernel, we explore two different approaches. The first is based on the von Neumann entropy associated with a graph. The second uses the Shannon entropy associated with the probability state vector for a steady-state random walk on a graph. We compare the two resulting graph kernels on the problem of graph clustering. We use kernel principal component analysis (kPCA) to embed graphs into a feature space. Experimental results reveal that the method gives good classification results on graphs extracted both from an object recognition database and from an application in bioinformatics.
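The two entropies mentioned in the abstract can be sketched for a single graph as below. This is a hedged illustration only: it assumes the von Neumann entropy is taken over the normalized Laplacian spectrum rescaled to sum to one, and that the steady-state random walk distribution is proportional to node degree. The function names are hypothetical, the graph is assumed to have no isolated nodes, and the construction of the kernel itself from these entropies is not reproduced here.

```python
import numpy as np

def _entropy(p):
    """Shannon entropy (natural log) of a probability vector, skipping zeros."""
    p = p[p > 1e-12]
    return float(-(p * np.log(p)).sum())

def von_neumann_entropy(W):
    """Entropy of the normalized Laplacian spectrum rescaled to unit trace."""
    W = np.asarray(W, dtype=float)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L_norm = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals = np.clip(np.linalg.eigvalsh(L_norm), 0.0, None)
    return _entropy(eigvals / eigvals.sum())

def steady_state_entropy(W):
    """Shannon entropy of the stationary distribution p_i = d_i / vol(G)."""
    d = np.asarray(W, dtype=float).sum(axis=1)
    return _entropy(d / d.sum())

# 4-cycle: a regular graph, so the stationary distribution is uniform.
W_cycle = np.array([[0, 1, 0, 1],
                    [1, 0, 1, 0],
                    [0, 1, 0, 1],
                    [1, 0, 1, 0]], dtype=float)
h_vn = von_neumann_entropy(W_cycle)
h_ss = steady_state_entropy(W_cycle)
```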

Geometric characterization and clustering of graphs using heat kernel embeddings

IMAGE AND VISION COMPUTING, 2010

In this paper, we investigate the use of heat kernels as a means of embedding the individual nodes of a graph in a vector space. The reason for turning to the heat kernel is that it encapsulates information concerning the distribution of path lengths and hence node affinities on the graph. The heat kernel of the graph is found by exponentiating the Laplacian eigensystem over time. We explore how graphs can be characterized in a geometric manner using embeddings into a vector space obtained from the heat kernel. We explore two different embedding strategies. The first is a direct method in which the matrix of embedding coordinates is obtained by performing a Young-Householder decomposition on the heat kernel. The second is indirect and involves performing a low-distortion embedding by applying multidimensional scaling to the geodesic distances between nodes. We show how the required geodesic distances can be computed using a parametrix expansion of the heat kernel. Once the nodes of the graph are embedded using one of the two alternative methods, we can characterize them in a geometric manner using the distribution of the node coordinates. We investigate several alternative methods of characterization, including spatial moments for the embedded points, the Laplacian spectrum for the Euclidean distance matrix, and scalar curvatures computed from the difference between geodesic and Euclidean distances. We experiment with the resulting algorithms on the COIL database.
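The direct embedding described above can be sketched as follows, assuming the normalized Laplacian is used and the graph has no isolated nodes (both are assumptions of this example, as is the function name `heat_kernel_embedding`). The heat kernel is H_t = Phi exp(-t Lambda) Phi^T, and a Young-Householder style factorization H_t = Y^T Y gives the coordinate matrix Y = exp(-t Lambda / 2) Phi^T, whose columns are the node coordinates.

```python
import numpy as np

def heat_kernel_embedding(W, t):
    """Return the heat kernel H_t and a coordinate matrix Y with Y^T Y = H_t.

    Column j of Y holds the embedding coordinates of node j."""
    W = np.asarray(W, dtype=float)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L_norm = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L_norm)
    # Exponentiate the eigensystem: H_t = Phi exp(-t Lambda) Phi^T
    H = (eigvecs * np.exp(-t * eigvals)) @ eigvecs.T
    # Factor H_t = Y^T Y with Y = exp(-t Lambda / 2) Phi^T
    Y = np.exp(-0.5 * t * eigvals)[:, None] * eigvecs.T
    return H, Y

# Hypothetical 4-node graph: a triangle (0, 1, 2) with a pendant node 3.
W_example = np.array([[0, 1, 1, 0],
                      [1, 0, 1, 0],
                      [1, 1, 0, 1],
                      [0, 0, 1, 0]], dtype=float)
H_t, Y_t = heat_kernel_embedding(W_example, t=1.0)
```

Small values of `t` keep the kernel close to the identity and emphasize local structure, while larger values let heat diffuse further and emphasize global structure.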

Graph nodes clustering with the sigmoid commute-time kernel: A comparative study

Data & Knowledge Engineering, 2009

This work addresses the problem of detecting clusters in a weighted, undirected graph by using kernel-based clustering methods, directly partitioning the graph according to a well-defined similarity measure between the nodes (a kernel on a graph). The proposed algorithms are based on a two-step procedure. First, a kernel or similarity matrix, providing a meaningful similarity measure between any pair of nodes, is computed from the adjacency matrix of the graph. Then, the nodes of the graph are clustered by performing a kernel clustering on this similarity matrix. Besides the introduction of a prototype-based kernel version of the Gaussian mixture model and Ward's hierarchical clustering, in addition to the already known kernel k-means and fuzzy k-means, a new kernel, called the sigmoid commute-time kernel (K^S_CT), is presented. The joint use of the K^S_CT kernel matrix and kernel clustering appears to be quite effective. Indeed, this methodology provides the best results in a systematic comparison with a selection of graph clustering and community detection algorithms on three real-world databases. Finally, some links between the proposed hierarchical kernel clustering and spectral clustering are examined.
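The first step of the procedure, computing a similarity matrix from the adjacency matrix, might look like the sketch below. It assumes the sigmoid commute-time kernel is obtained by passing the entries of the Laplacian pseudoinverse L+ through a sigmoid, 1 / (1 + exp(-a l+_ij / sigma)), with sigma taken as the standard deviation of the entries of L+. That reading of the definition, the function name, and the default `a=1.0` are assumptions of this example rather than details given in the abstract.

```python
import numpy as np

def sigmoid_commute_time_kernel(W, a=1.0):
    """Elementwise sigmoid of the Laplacian pseudoinverse (assumed form)."""
    W = np.asarray(W, dtype=float)
    L = np.diag(W.sum(axis=1)) - W
    L_pinv = np.linalg.pinv(L)
    sigma = L_pinv.std()  # normalizing factor, assumed to be the entry std
    return 1.0 / (1.0 + np.exp(-a * L_pinv / sigma))

# Two triangles (0, 1, 2) and (3, 4, 5) joined by the single edge 2 - 3.
W_two = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    W_two[i, j] = W_two[j, i] = 1.0
K_sct = sigmoid_commute_time_kernel(W_two)
```

The resulting matrix would then be passed to a kernel clustering method such as kernel k-means in the second step. Note that an elementwise sigmoid does not guarantee positive semidefiniteness, so the matrix is better thought of as a similarity matrix.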

Automatic graph pruning based on kernel alignment for spectral clustering

Pattern Recognition Letters, 2016

Detecting data structures with spectral clustering approaches becomes difficult when dealing with complex distributions. Moreover, the user needs real prior knowledge about the influence of the free parameters when building the graph. Here, we introduce a graph pruning approach, termed Kernel Alignment based Graph Pruning (KAGP), within a spectral clustering framework that enhances both the local and global data consistencies for a given input similarity. KAGP reveals hidden data structures by finding relevant pairwise relationships among samples. To do so, KAGP estimates the loss of information during the pruning process in terms of a kernel alignment-based cost function. In addition, we encode the sample similarities using a compactly supported kernel function that yields a sparse data representation to support spectral clustering techniques. The results obtained show that KAGP enhances the clustering performance in most cases. KAGP also avoids the need for comprehensive user knowledge regarding the influence of its free parameters.
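To give a feel for an alignment-based measure of information loss, the sketch below computes the standard (uncentered) kernel alignment, the Frobenius inner product of two kernel matrices divided by the product of their Frobenius norms, between a full similarity matrix and a pruned copy. The exact cost function used by KAGP is not stated in the abstract, so this is an assumed stand-in; the data points, the Gaussian similarity, and the pruning threshold are all hypothetical.

```python
import numpy as np

def kernel_alignment(K1, K2):
    """Uncentered alignment <K1, K2>_F / (||K1||_F ||K2||_F)."""
    K1 = np.asarray(K1, dtype=float)
    K2 = np.asarray(K2, dtype=float)
    return float(np.sum(K1 * K2) / (np.linalg.norm(K1) * np.linalg.norm(K2)))

# Hypothetical 1-D samples forming two loose groups.
x = np.array([0.0, 0.2, 0.4, 3.0, 3.1, 3.3])
K_full = np.exp(-(x[:, None] - x[None, :]) ** 2)

# Prune weak pairwise relationships (the threshold is arbitrary).
K_pruned = np.where(K_full >= 0.1, K_full, 0.0)

alignment = kernel_alignment(K_full, K_pruned)
```

An alignment close to 1 indicates that pruning removed little of the similarity structure, which is the kind of quantity a pruning criterion could monitor.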

Discriminating graphs through spectral projections

Computer Networks, 2011

This paper proposes a novel non-parametric technique for clustering networks based on their structure. Many topological measures have been introduced in the literature to characterize topological properties of networks. These measures provide meaningful information about the structural properties of a network, but many networks share similar values of a given measure. Furthermore, strong correlations between these measures occur on real-world graphs [2], so using them to distinguish arbitrary graphs is difficult in practice.