Metric learning with multiple kernels

A Metric-learning based framework for Support Vector Machines and Multiple Kernel Learning

Most metric learning algorithms, as well as Fisher's Discriminant Analysis (FDA), optimize a cost function based on different measures of within- and between-class distances. Support Vector Machines (SVMs) and several Multiple Kernel Learning (MKL) algorithms, on the other hand, are based on the SVM large-margin theory. Recently, SVMs have been analyzed from a metric learning perspective and formulated as a Mahalanobis metric learning problem. This new perspective allows us to combine ideas from both SVM and metric learning and to develop new algorithms that build on the strengths of each. Inspired by the metric learning interpretation of SVM, we develop a new metric-learning-based SVM framework that incorporates metric learning concepts within SVM. We extend the SVM optimization problem to include a measure of the within-class distance and, along the way, develop a new within-class distance measure that is appropriate for SVM. In addition, we adopt the same approach for MKL and show that it can also be formulated as a Mahalanobis metric learning problem. The end result is a number of SVM/MKL algorithms that incorporate metric learning concepts. We experiment with them on a set of benchmark datasets and observe important predictive performance improvements.
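To make the metric-learning view concrete: a Mahalanobis metric parameterized by a positive semi-definite matrix $M$ measures $d_M(x_i, x_j)^2 = (x_i - x_j)^\top M (x_i - x_j)$, and an SVM objective can be augmented with a within-class scatter term along the separating direction. The sketch below is illustrative only; the regularization weight $\lambda$ and the scatter matrix $S_W$ are assumed notation, not the paper's actual within-class distance measure.

```latex
% Illustrative SVM objective with an added within-class scatter term
% (assumed notation, not the paper's exact formulation):
\min_{w, b, \xi}\;
  \frac{1}{2}\|w\|^2
  + \lambda\, w^{\top} S_W\, w
  + C \sum_i \xi_i
\quad \text{s.t. } y_i\,(w^{\top} x_i + b) \ge 1 - \xi_i,\; \xi_i \ge 0,
\qquad
S_W = \sum_i (x_i - \mu_{y_i})(x_i - \mu_{y_i})^{\top}
```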

Metric and kernel learning using a linear transformation

2012

Metric and kernel learning are important in several machine learning applications. However, most existing metric learning algorithms are limited to learning metrics over low-dimensional data, while existing kernel learning algorithms are often limited to the transductive setting and do not generalize to new data points. In this paper, we study metric learning as a problem of learning a linear transformation of the input data. We show that for high-dimensional data, a particular framework for learning a linear transformation of the data based on the LogDet divergence can be efficiently kernelized to learn a metric (or equivalently, a kernel function) over an arbitrarily high-dimensional space. We further demonstrate that a wide class of convex loss functions for learning linear transformations can similarly be kernelized, thereby considerably expanding the potential applications of metric learning. We demonstrate our learning approach by applying it to large-scale real-world problems in computer vision and text mining.
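For reference, the LogDet divergence mentioned above, between a learned $d \times d$ Mahalanobis matrix $A$ and a positive-definite prior $A_0$ (commonly the identity), takes the standard form:

```latex
D_{\ell d}(A, A_0) = \operatorname{tr}\bigl(A A_0^{-1}\bigr) - \log\det\bigl(A A_0^{-1}\bigr) - d
```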

Distance metric learning with kernels

2003

In this paper, we propose a feature weighting method that works in both the input space and the kernel-induced feature space. It assumes only the availability of similarity (dissimilarity) information, and the number of parameters in the transformation does not depend on the number of features. Besides feature weighting, it can also be regarded as performing nonparametric kernel adaptation. Experiments on both toy and real-world datasets show promising results.
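As a schematic of what feature weighting means in the input space (the notation is illustrative, not the paper's; the kernel-space version weights directions in the induced feature space instead, which is why the number of parameters does not grow with the number of input features):

```latex
d_w(x, y)^2 = \sum_{j=1}^{d} w_j\,(x_j - y_j)^2, \qquad w_j \ge 0
```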

Learning Multi-Kernel Distance Metrics using Relative Comparisons

In this manuscript, a new form of distance function is proposed that can model spaces where a Mahalanobis distance cannot be assumed. Two novel learning algorithms are proposed that allow this distance function to be learnt from relative-comparison training examples alone. This allows a distance function to be learnt in non-linear, discontinuous spaces, avoiding the need for labelled or quantitative information. The first algorithm builds a set of basic distance bases. The second algorithm improves generalisation capability by merging different distance bases together. It is shown how the learning algorithms produce a distance function for clustering multiple disjoint clusters belonging to the same class. Crucially, this is achieved despite the lack of any explicit form of class labelling on the training data.
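For context, a relative-comparison training example is typically a triplet $(a, b, c)$ stating that $x_a$ is more similar to $x_b$ than to $x_c$; it constrains the learned distance $d$ as follows (the standard form of such a constraint, not notation specific to this manuscript):

```latex
d(x_a, x_b) < d(x_a, x_c)
```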

Learning Hierarchical Feature Space Using CLAss-specific Subspace Multiple Kernel - Metric Learning for Classification

arXiv, 2019

Metric learning for classification has been intensively studied over the last decade. The idea is to learn a metric space, induced from a normed vector space, in which data from different classes are well separated. Different measures of this separation thus lead to various designs of the objective function in the metric learning model. One classical metric is the Mahalanobis distance, where a linear transformation matrix is designed and applied to the original dataset to obtain a new subspace equipped with the Euclidean norm. The kernelized version has also been developed, followed by multiple-kernel learning models. In this paper, we consider metric learning to be the identification of the best kernel function with respect to high class separability in the corresponding metric space. The contribution is twofold: 1) no pairwise computations are required, unlike in most metric learning techniques; 2) better flexibility and lower computational complexity are achieved using the CLAss-Specific subspace formulation.
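For reference, multiple-kernel models of this kind typically combine base kernels conically and measure distances in the induced feature space; the generic form is shown below (the class-specific subspace weighting of the paper is not reproduced here):

```latex
K_{\mu}(x, y) = \sum_{m=1}^{M} \mu_m K_m(x, y), \quad \mu_m \ge 0,
\qquad
d_{\mu}(x, y)^2 = K_{\mu}(x, x) - 2 K_{\mu}(x, y) + K_{\mu}(y, y)
```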

Kernel relevant component analysis for distance metric learning

International Joint Conference on Neural Networks (IJCNN), 2005

Defining a good distance measure between patterns is of crucial importance in many classification and clustering algorithms. Recently, relevant component analysis (RCA) has been proposed, which offers a simple yet powerful method to learn this distance metric. However, it is ...
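For context, plain (linear) RCA estimates a within-chunklet covariance from groups of points known to share a class and whitens the data with its inverse square root. A minimal sketch follows; the function and argument names are illustrative, and the kernelized variant developed in the paper is not shown.

```python
import numpy as np

def rca_whitening(X, chunklets):
    """Plain (linear) RCA: estimate the within-chunklet covariance C and
    return the whitening transform W = C^{-1/2}.

    X         : (n, d) data matrix
    chunklets : list of index arrays, each grouping points known to
                belong to the same class
    """
    d = X.shape[1]
    C = np.zeros((d, d))
    n = 0
    for idx in chunklets:
        Xc = X[idx] - X[idx].mean(axis=0)   # center within the chunklet
        C += Xc.T @ Xc
        n += len(idx)
    C /= n
    # Inverse square root via the eigendecomposition of the symmetric covariance
    vals, vecs = np.linalg.eigh(C)
    W = vecs @ np.diag(1.0 / np.sqrt(np.maximum(vals, 1e-12))) @ vecs.T
    return W  # distances are then Euclidean on the transformed data X @ W.T
```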

Multitask Metric Learning: Theory and Algorithm

2019

In this paper, we study the problem of multitask metric learning (mtML). We first examine the generalization bound of the regularized mtML formulation based on the notion of algorithmic stability, proving the convergence rate of mtML and revealing the trade-off between the tasks. Moreover, we establish the theoretical connection between the mtML, single-task learning, and pooling-task learning approaches. In addition, we present a novel boosting-based mtML (mt-BML) algorithm, which scales well with the feature dimension of the data. Finally, we devise an efficient second-order Riemannian retraction operator tailored specifically to our mt-BML algorithm. It produces a low-rank solution of mtML to reduce the model complexity, and may also improve generalization performance. Extensive evaluations on several benchmark data sets verify the effectiveness of our learning algorithm.

Kernel-based distance metric learning in the output space

The 2013 International Joint Conference on Neural Networks (IJCNN), 2013

In this paper we present two related, kernel-based Distance Metric Learning (DML) methods. Their respective models non-linearly map data from the original space to an output space, and subsequent distance measurements are performed in the output space via a Mahalanobis metric. The dimensionality of the output space can be directly controlled to facilitate the learning of a low-rank metric. Both methods allow for simultaneous inference of the associated metric and the mapping to the output space, which can be used to visualize the data when the output space is 2- or 3-dimensional. Experimental results for a collection of classification tasks illustrate the advantages of the proposed methods over other traditional and kernel-based DML approaches.
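Schematically, such models measure distances as below, where $f$ is the learned kernel-based nonlinear map into an $r$-dimensional output space and $M \succeq 0$ is the associated metric; choosing $r = 2$ or $3$ gives the visualizable, low-rank setting mentioned above (the notation is illustrative, not the paper's):

```latex
d(x_i, x_j)^2 = \bigl(f(x_i) - f(x_j)\bigr)^{\top} M\, \bigl(f(x_i) - f(x_j)\bigr),
\qquad f : \mathcal{X} \to \mathbb{R}^{r}
```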

Geometry-aware metric learning

Proceedings of the 26th Annual International Conference on Machine Learning (ICML), 2009

In this paper, we introduce a generic framework for semi-supervised kernel learning. Given pairwise (dis-)similarity constraints, we learn a kernel matrix over the data that respects the provided side information as well as the local geometry of the data. Our framework is based on metric learning methods, in which we jointly model the metric/kernel over the data along with the underlying manifold. Furthermore, we show that for some important parameterized forms of the underlying manifold model, we can estimate the model parameters and the kernel matrix efficiently. Our resulting algorithm is able to incorporate local geometry into the metric learning task; at the same time it can handle a wide class of constraints. Finally, our algorithm is fast and scalable: unlike most existing methods, it is able to exploit the low-dimensional manifold structure and does not require semi-definite programming. We demonstrate the wide applicability and effectiveness of our framework by applying it to various machine learning tasks such as semi-supervised classification, colored dimensionality reduction, and manifold alignment. On each of these tasks our method performs competitively with or better than the respective state-of-the-art method.
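One generic way to write a semi-supervised kernel-learning objective of this flavor, with similar pairs $\mathcal{S}$, dissimilar pairs $\mathcal{D}$, a prior kernel $K_0$, and a graph Laplacian $L$ encoding the local geometry, is sketched below; this is an assumed illustration of the problem class, not the paper's exact formulation:

```latex
\min_{K \succeq 0}\; D(K, K_0) + \gamma\,\operatorname{tr}(K L)
\quad \text{s.t. } d_K(i, j)^2 \le u \ \ \forall (i,j) \in \mathcal{S},
\quad d_K(i, j)^2 \ge \ell \ \ \forall (i,j) \in \mathcal{D},
\qquad
d_K(i, j)^2 = K_{ii} - 2 K_{ij} + K_{jj}
```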