Learning eigenfunctions links spectral embedding and kernel PCA

Spectral dimensionality reduction

2006

In this chapter, we study and put under a common framework a number of non-linear dimensionality reduction methods, such as Locally Linear Embedding, Isomap, Laplacian eigenmaps and kernel PCA, which are based on performing an eigen-decomposition (hence the name “spectral”). That framework also includes classical methods such as PCA and metric multidimensional scaling (MDS). It also includes the data transformation step used in spectral clustering.
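One way to see why PCA and metric MDS fit in the same spectral framework is to check numerically that classical MDS on Euclidean distances reproduces the PCA scores up to sign. The sketch below is illustrative only: the synthetic data, the variable names, and the choice of k = 2 are assumptions, not taken from the chapter.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data with well-separated variances per axis (assumption for the demo).
X = rng.normal(size=(40, 5)) * np.array([3.0, 2.0, 1.0, 0.5, 0.1])
k = 2

# PCA: project centered data onto the top-k right singular vectors.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
Y_pca = Xc @ Vt[:k].T

# Classical (metric) MDS: double-center the squared distance matrix,
# then eigendecompose the resulting Gram matrix.
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
n = X.shape[0]
J = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * J @ D2 @ J
eigvals, eigvecs = np.linalg.eigh(B)
top = np.argsort(eigvals)[::-1][:k]
Y_mds = eigvecs[:, top] * np.sqrt(eigvals[top])

# Compare Gram matrices, which are invariant to the sign of each coordinate.
same = np.allclose(Y_pca @ Y_pca.T, Y_mds @ Y_mds.T, atol=1e-8)
```

The comparison uses the Gram matrices of the two embeddings rather than the coordinates themselves, because eigenvectors are only defined up to sign.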

Out-of-sample extensions for lle, isomap, mds, eigenmaps, and spectral clustering

Advances in neural …, 2004

Several unsupervised learning algorithms based on an eigendecomposition provide either an embedding or a clustering only for given training points, with no straightforward extension for out-of-sample examples short of recomputing eigenvectors. This paper provides a unified framework for extending Local Linear Embedding (LLE), Isomap, Laplacian Eigenmaps, Multi-Dimensional Scaling (for dimensionality reduction) as well as for Spectral Clustering. This framework is based on seeing these algorithms as learning eigenfunctions of a data-dependent kernel. Numerical experiments show that the generalizations performed have a level of error comparable to the variability of the embedding algorithms due to the choice of training data.
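The eigenfunction view can be sketched with a Nyström-style formula: the k-th embedding coordinate of a new point x is (1/λ_k) Σ_i v_ki K(x, x_i), where (λ_k, v_k) are eigenpairs of the training kernel matrix. The code below is a minimal sketch under stated assumptions: it uses a plain Gaussian kernel and hypothetical helper names (`gaussian_kernel`, `fit_eigenfunctions`, `nystrom_embed`), whereas the paper works with data-dependent kernels and normalizations specific to each method.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    """Gaussian kernel between rows of A and rows of B (assumed kernel choice)."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def fit_eigenfunctions(X, n_components=2, sigma=1.0):
    """Top eigenpairs of the training kernel matrix."""
    K = gaussian_kernel(X, X, sigma)
    eigvals, eigvecs = np.linalg.eigh(K)
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvals[order], eigvecs[:, order]

def nystrom_embed(X_new, X_train, eigvals, eigvecs, sigma=1.0):
    """Out-of-sample coordinates: y_k(x) = (1 / lambda_k) * sum_i v_ki K(x, x_i)."""
    K_new = gaussian_kernel(X_new, X_train, sigma)
    return K_new @ eigvecs / eigvals

rng = np.random.default_rng(0)
X_train = rng.normal(size=(50, 3))
eigvals, eigvecs = fit_eigenfunctions(X_train)

# On training points the formula returns the eigenvector entries themselves.
Y_train = nystrom_embed(X_train, X_train, eigvals, eigvecs)
# On unseen points it gives an embedding without recomputing eigenvectors.
Y_new = nystrom_embed(rng.normal(size=(5, 3)), X_train, eigvals, eigvecs)
```

Applying the formula to a training point recovers its original coordinate because K v_k = λ_k v_k, which is what makes the extension consistent with the in-sample embedding.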

Robust non-linear dimensionality reduction using successive 1-dimensional Laplacian Eigenmaps

Proceedings of the 24th international conference on Machine learning - ICML '07, 2007

Non-linear dimensionality reduction of noisy data is a challenging problem encountered in a variety of data analysis applications. Recent results in the literature show that spectral decomposition, as used for example by the Laplacian Eigenmaps algorithm, provides a powerful tool for non-linear dimensionality reduction and manifold learning. In this paper, we discuss a significant shortcoming of these approaches, which we refer to as the repeated eigendirections problem. We propose a novel approach that combines successive 1-dimensional spectral embeddings with a data advection scheme that allows us to address this problem. The proposed method does not depend on a non-linear optimization scheme; hence, it is not prone to local minima. Experiments with artificial and real data illustrate the advantages of the proposed method over existing approaches. We also demonstrate that the approach is capable of correctly learning manifolds corrupted by significant amounts of noise.

Semi-supervised learning in Spectral Dimensionality Reduction

2016

Biometric face data are essentially high dimensional data and as such are susceptible to the well-known problem of the curse of dimensionality when analyzed using machine learning techniques. Various dimensionality reduction methods have been proposed in the literature to represent high dimensional data in a lower dimensional space. Research has shown that biometric face data are non-linear in structure, and when subject to analysis using linear dimensionality reduction methods, such as PCA and LDA, important information is lost. However, manifold learning methods (LLE, Laplacian Eigenmaps, Isomap) are able to preserve the original non-linear structure of high dimensional data in lower dimensional space, resulting in much less information loss. Despite the success in preserving the non-linear structure of data, manifold learning methods suffer from two problems. First the generalization problem, that is, the proposed methods operate in batch mode and are not extendable to new unseen...

An Iterative Locally Linear Embedding Algorithm

2012

Locally Linear Embedding (LLE) is a popular dimension reduction method. In this paper, we systematically improve the two main steps of LLE: (A) learning the graph weights W, and (B) learning the embedding Y. We propose a sparse nonnegative W learning algorithm. We propose a weighted formulation for learning Y and show the results are identical to normalized cuts spectral clustering. We further propose to iterate the two steps in LLE repeatedly to improve the results. Extensive experimental results show that the iterative LLE algorithm significantly improves both classification and clustering results.
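For reference, the two steps (A) and (B) of standard LLE can be sketched as follows. This is the baseline algorithm only, not the sparse nonnegative, weighted, or iterative variants the paper proposes; the function name `lle`, the regularization constant, and the synthetic helix data are assumptions made for illustration.

```python
import numpy as np

def lle(X, n_neighbors=8, n_components=2, reg=1e-3):
    """Baseline LLE: (A) reconstruction weights W, (B) embedding Y."""
    n = X.shape[0]
    W = np.zeros((n, n))
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    for i in range(n):
        # Nearest neighbors of point i, skipping the point itself.
        nbrs = np.argsort(sq[i])[1:n_neighbors + 1]
        Z = X[nbrs] - X[i]
        G = Z @ Z.T
        # Regularize the local Gram matrix so the linear system is well posed.
        G += reg * np.trace(G) * np.eye(n_neighbors)
        w = np.linalg.solve(G, np.ones(n_neighbors))
        W[i, nbrs] = w / w.sum()  # weights of each row sum to one
    # Step (B): bottom eigenvectors of (I - W)^T (I - W), dropping the first.
    I_minus_W = np.eye(n) - W
    M = I_minus_W.T @ I_minus_W
    eigvals, eigvecs = np.linalg.eigh(M)
    Y = eigvecs[:, 1:n_components + 1]
    return W, Y

rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 4.0 * np.pi, size=120))
# Noisy helix in 3-D (assumed toy data).
X = np.column_stack([np.cos(t), np.sin(t), 0.3 * t]) + 0.01 * rng.normal(size=(120, 3))
W, Y = lle(X)
```

Note that the weights here may be negative and every row has exactly `n_neighbors` nonzero entries, which is the behavior the paper's sparse nonnegative formulation sets out to change.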

Efficient regularized spectral data embedding

Advances in Data Analysis and Classification, 2020

Data embedding (DE) or dimensionality reduction techniques are particularly well suited to embedding high-dimensional data into a space that in most cases will have just two dimensions. Low-dimensional space, in which data samples (data points) can more easily be visualized, is also often used for learning methods such as clustering. Sometimes, however, DE will identify dimensions that contribute little in terms of the clustering structures that they reveal. In this paper we look at regularized data embedding by clustering, and we propose a simultaneous learning approach for DE and clustering that reinforces the relationships between these two tasks. Our approach is based on a matrix decomposition technique for learning a spectral DE, a cluster membership matrix, and a rotation matrix that closely maps out the continuous spectral embedding, in order to obtain a good clustering solution. We compare our approach with some traditional clustering methods and perform numerical experiments on a collection of benchmark datasets to demonstrate its potential.

Kernel-based framework for spectral dimensionality reduction and clustering formulation: A theoretical study

ADCAIJ: Advances in Distributed Computing and Artificial Intelligence Journal

This work outlines a unified formulation to represent spectral approaches for both dimensionality reduction and clustering. The proposed formulation starts with a generic latent variable model in terms of the projected input data matrix. In particular, such a projection maps data onto an unknown high-dimensional space. For this model, a generalized optimization problem is stated using quadratic formulations and a least-squares support vector machine. The optimization is solved through a primal-dual scheme. Once latent variables and parameters are determined, the resultant model outputs a versatile projected matrix able to represent data in a low-dimensional space, as well as to provide information about clusters. In particular, the proposed formulation yields solutions for kernel spectral clustering and weighted-kernel principal component analysis.

Spectral methods for dimensionality reduction

2006

How can we search for low dimensional structure in high dimensional data? If the data is mainly confined to a low dimensional subspace, then simple linear methods can be used to discover the subspace and estimate its dimensionality. More generally, though, if the data lies on (or near) a low dimensional submanifold, then its structure may be highly nonlinear, and linear methods are bound to fail. Spectral methods have recently emerged as a powerful tool for nonlinear dimensionality reduction and manifold learning.
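The linear case mentioned above can be illustrated with a short sketch: when the data lie exactly in a low-dimensional subspace, the singular values of the centered data matrix reveal both the subspace and its dimension. The synthetic data, the relative tolerance, and the variable names are assumptions chosen for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
basis = rng.normal(size=(2, 10))      # spans a 2-D subspace of R^10 (assumed)
coords = rng.normal(size=(200, 2))
X = coords @ basis                    # noise-free points in that subspace

Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Count singular values that are non-negligible relative to the largest one.
tol = 1e-8 * S[0]
est_dim = int(np.sum(S > tol))

# Rows of `subspace` form an orthonormal basis of the recovered subspace.
subspace = Vt[:est_dim]
reconstructed = Xc @ subspace.T @ subspace
```

With noisy data the singular values decay gradually instead of dropping to zero, so the threshold becomes a modeling choice; and for data on a curved submanifold this linear estimate can overstate the intrinsic dimension, which is the motivation for the spectral methods discussed here.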

Spectral embedding finds meaningful (relevant) structure in image and microarray data

BMC Bioinformatics, 2006

Background: Accurate methods for extraction of meaningful patterns in high dimensional data have become increasingly important with the recent generation of data types containing measurements across thousands of variables. Principal components analysis (PCA) is a linear dimensionality reduction (DR) method that is unsupervised in that it relies only on the data; projections are calculated in Euclidean or a similar linear space and do not use tuning parameters for optimizing the fit to the data. However, relationships within sets of nonlinear data types, such as biological networks or images, are frequently mis-rendered into a low dimensional space by linear methods. Nonlinear methods, in contrast, attempt to model important aspects of the underlying data structure, often requiring parameter(s) fitting to the data type of interest. In many cases, the optimal parameter values vary when different classification algorithms are applied on the same rendered subspace, making the results of such methods highly dependent upon the type of classifier implemented. Results: We present the results of applying the spectral method of Lafon, a nonlinear DR method based on the weighted graph Laplacian, that minimizes the requirements for such parameter optimization for two biological data types. We demonstrate that it is successful in determining implicit ordering of brain slice image data and in classifying separate species in microarray data, as compared to two conventional linear methods and three nonlinear methods (one of which is an alternative spectral method). This spectral implementation is shown to provide more meaningful information, by preserving important relationships, than the methods of DR presented for comparison. Tuning parameter fitting is simple and is a general, rather than data-type- or experiment-specific, approach for the two datasets analyzed here. Tuning parameter optimization is minimized in the DR step to each subsequent classification method, enabling the possibility of valid cross-experiment comparisons. Conclusion: Results from the spectral method presented here exhibit the desirable properties of preserving meaningful nonlinear relationships in lower dimensional space and requiring minimal parameter fitting, providing a useful algorithm for purposes of visualization and classification across diverse datasets, a common challenge in systems biology.

Diffusion Maps - a Probabilistic Interpretation for Spectral Embedding and Clustering Algorithms

Lecture Notes in Computational Science and Enginee, 2008

Spectral embedding and spectral clustering are common methods for non-linear dimensionality reduction and clustering of complex high dimensional datasets. In this paper we provide a diffusion based probabilistic analysis of algorithms that use the normalized graph Laplacian. Given the pairwise adjacency matrix of all points in a dataset, we define a random walk on the graph of points and a diffusion distance between any two points. We show that the diffusion distance is equal to the Euclidean distance in the embedded space with all eigenvectors of the normalized graph Laplacian. This identity shows that characteristic relaxation times and processes of the random walk on the graph are the key concept that governs the properties of these spectral clustering and spectral embedding algorithms. Specifically, for spectral clustering to succeed, a necessary condition is that the mean exit times from each cluster need to be significantly larger than the largest (slowest) of all relaxation times inside all of the individual clusters. For complex, multiscale data, this condition may not hold and multiscale methods need to be developed to handle such situations.
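The identity between diffusion distance and Euclidean distance in the full spectral embedding can be checked numerically. The sketch below makes several assumptions not spelled out in the abstract: a Gaussian affinity matrix, the random walk M = D^{-1} W, a diffusion distance weighted by the inverse stationary distribution, and eigenvectors scaled so that the first one is constant. The variable and function names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
W = np.exp(-sq / 2.0)                 # Gaussian affinities (assumed kernel)
d = W.sum(axis=1)
pi = d / d.sum()                      # stationary distribution of the walk
M = W / d[:, None]                    # row-stochastic random walk matrix
t = 3
Mt = np.linalg.matrix_power(M, t)     # t-step transition probabilities

def diffusion_distance(i, j):
    """Weighted L2 distance between the t-step distributions from i and j."""
    return np.sqrt(np.sum((Mt[i] - Mt[j]) ** 2 / pi))

# Eigendecompose the symmetric conjugate D^{-1/2} W D^{-1/2}; it shares its
# eigenvalues with M and gives M's right eigenvectors after rescaling.
Ms = W / np.sqrt(np.outer(d, d))
lam, V = np.linalg.eigh(Ms)
Psi = V / np.sqrt(pi)[:, None]        # right eigenvectors of M
embedding = Psi * lam ** t            # diffusion map using all eigenvectors

def embedded_distance(i, j):
    return np.linalg.norm(embedding[i] - embedding[j])
```

In practice only the leading eigenvectors are kept, in which case the embedded distance approximates the diffusion distance, with an error governed by the discarded eigenvalues raised to the power t.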