Ali Ghodsi - Academia.edu

Papers by Ali Ghodsi

Reproducing Kernel Hilbert Space, Mercer's Theorem, Eigenfunctions, Nyström Method, and Use of Kernels in Machine Learning: Tutorial and Survey

ArXiv, 2021

This is a tutorial and survey paper on kernels, kernel methods, and related fields. We start by reviewing the history of kernels in functional analysis and machine learning. Then, Mercer kernels, Hilbert and Banach spaces, Reproducing Kernel Hilbert Space (RKHS), Mercer's theorem and its proof, frequently used kernels, kernel construction from a distance metric, important classes of kernels (including bounded, integrally positive definite, universal, stationary, and characteristic kernels), kernel centering and normalization, and eigenfunctions are explained in detail. We then introduce the uses of kernels in machine learning, including kernel methods (such as kernel support vector machines), kernel learning by semidefinite programming, the Hilbert-Schmidt independence criterion, maximum mean discrepancy, kernel mean embedding, and kernel dimensionality reduction. We also cover the rank and factorization of the kernel matrix, as well as the approximation of eigenfunctions and kernels using th...
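
Kernel centering, one of the operations the survey covers, has a compact closed form: with the centering matrix H = I - (1/n) 1 1^T, the centered kernel is K_c = H K H. A minimal sketch of that identity (the function name and test data below are illustrative, not from the paper):

```python
import numpy as np

def center_kernel(K):
    """Double-center a kernel (Gram) matrix: K_c = H K H with H = I - (1/n) 1 1^T.

    Centering makes the implicit feature vectors zero-mean in the feature space.
    """
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return H @ K @ H

# Sanity check: for a linear kernel K = X X^T, centering K is equivalent to
# centering the data first and then forming the kernel.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))
K = X @ X.T
Xc = X - X.mean(axis=0)
assert np.allclose(center_kernel(K), Xc @ Xc.T)
```

The same H appears again in kernel PCA and in the HSIC-style methods the survey discusses, which is why centering is usually presented as a matrix identity rather than an explicit loop.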

Distance Metric Learning Versus Fisher Discriminant Analysis

National Conference on Artificial Intelligence, 2008

There has been much recent attention to the problem of learning an appropriate distance metric using class labels or other side information. Some proposed algorithms are iterative and computationally expensive. In this paper, we show how to solve one of these methods with a closed-form solution rather than semidefinite programming. We provide a new problem setup in which the algorithm performs as well as or better than some standard methods, but without the computational complexity. Furthermore, we show a strong relationship between these methods and Fisher Discriminant Analysis.
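
As a point of reference for the comparison above, two-class Fisher Discriminant Analysis itself has a well-known closed-form solution, w proportional to S_w^{-1}(mu1 - mu2). A minimal illustrative sketch (the function name and synthetic data are assumptions, not from the paper):

```python
import numpy as np

def fisher_direction(X1, X2):
    """Two-class Fisher Discriminant Analysis in closed form:
    w = S_w^{-1} (mu1 - mu2), the direction maximising between-class
    separation relative to within-class scatter."""
    mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
    # Within-class scatter = sum of the two class scatter matrices.
    Sw = (np.cov(X1, rowvar=False) * (len(X1) - 1)
          + np.cov(X2, rowvar=False) * (len(X2) - 1))
    w = np.linalg.solve(Sw, mu1 - mu2)
    return w / np.linalg.norm(w)

# Two Gaussian classes separated along the first coordinate.
rng = np.random.default_rng(0)
X1 = rng.standard_normal((200, 3)) + np.array([2.0, 0.0, 0.0])
X2 = rng.standard_normal((200, 3))
w = fisher_direction(X1, X2)
print(abs(w[0]))  # close to 1: the separating direction is essentially e1
```

The closed form is exactly the kind of one-shot linear-algebra solution the paper contrasts with iterative semidefinite-programming-based metric learning.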

Subjective Localization with Action Respecting Embedding

Springer Tracts in Advanced Robotics

Robot localization is the problem of estimating a robot's pose within an objective frame of reference. Traditional localization requires knowledge of two key conditional probabilities: the motion and sensor models. These models depend critically on the specific robot as well as its environment. Building these models can be time-consuming, manually intensive, and can require expert intuitions. However, the models are necessary for the robot to relate its own subjective view of sensors and motors to its objective pose. In this paper we seek to remove the need for human-provided models. We introduce a technique for subjective localization, relaxing the requirement that the robot localize within a global frame of reference. Using an algorithm for action-respecting non-linear dimensionality reduction, we learn a subjective representation of pose from a stream of actions and sensations. We then extract from the data natural motion and sensor models defined for this new representation. Monte Carlo localization is used to track this representation of the robot's pose while executing new actions and receiving new sensor readings. We evaluate the technique in a synthetic image-manipulation domain and with a mobile robot using vision and laser sensors.
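
The tracking step the abstract mentions, Monte Carlo localization, is a particle filter over poses. A toy 1-D sketch of the generic predict/weight/resample cycle (the Gaussian motion and sensor models here are simple stand-ins, not the learned subjective models from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)

def monte_carlo_step(particles, action, observation, motion_std=0.1, sensor_std=0.2):
    """One predict/weight/resample cycle of a 1-D particle filter.

    particles: hypothesised poses; action: commanded displacement;
    observation: noisy measurement of the pose itself.
    """
    # Predict: push every particle through the (noisy) motion model.
    particles = particles + action + rng.normal(0.0, motion_std, size=particles.shape)
    # Weight: likelihood of the observation under a Gaussian sensor model.
    w = np.exp(-0.5 * ((observation - particles) / sensor_std) ** 2)
    w /= w.sum()
    # Resample: draw particles in proportion to their weights.
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx]

# Track a robot starting near x = 1.0 that repeatedly commands a +0.5 move.
true_pose, particles = 1.0, rng.uniform(0.0, 2.0, size=500)
for _ in range(10):
    true_pose += 0.5
    obs = true_pose + rng.normal(0.0, 0.2)
    particles = monte_carlo_step(particles, 0.5, obs)
print(abs(particles.mean() - true_pose))  # posterior mean tracks the true pose
```

In the paper's setting, the pose space and the two models are replaced by the learned subjective representation; the filtering machinery is unchanged.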

Protein Structure by Semidefinite Facial Reduction

Lecture Notes in Computer Science, 2012

All practical contemporary protein NMR structure determination methods use molecular dynamics coupled with a simulated annealing schedule. The objective of these methods is to minimize the error of deviating from the NOE distance constraints. However, this objective function is highly nonconvex and, consequently, difficult to optimize. Euclidean distance geometry methods based on semidefinite programming (SDP) provide a natural formulation for this problem. However, the complexity of SDP solvers and ambiguous distance constraints are major challenges to this approach. The contribution of this paper is to provide a new SDP formulation of this problem that overcomes these two issues for the first time. We model the protein as a set of intersecting two- and three-dimensional cliques, then adapt and extend a technique called semidefinite facial reduction to reduce the SDP problem to approximately one quarter of its original size. The reduced SDP problem can be solved approximately 100 times faster and is also more resistant to numerical problems caused by erroneous and inexact distance bounds.
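
The Euclidean-distance-geometry step that such SDP formulations build on can be illustrated in the noiseless, complete-distances case by classical multidimensional scaling: the centred Gram matrix G = -1/2 H D H (D holding squared distances) factors into point coordinates. A minimal sketch under that simplifying assumption (this is the textbook building block, not the paper's facial-reduction SDP):

```python
import numpy as np

def coords_from_distances(D2, dim):
    """Classical multidimensional scaling: recover point coordinates (up to a
    rigid motion) from a complete matrix of squared Euclidean distances.

    G = -1/2 H D2 H is the centred Gram matrix; its top eigenpairs give the
    embedding. SDP formulations generalise this to noisy, incomplete bounds.
    """
    n = D2.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    G = -0.5 * H @ D2 @ H
    vals, vecs = np.linalg.eigh(G)
    order = np.argsort(vals)[::-1][:dim]          # largest eigenvalues first
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0.0))

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 3))                   # ground-truth coordinates
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
Y = coords_from_distances(D2, 3)
# Recovered pairwise distances match the originals exactly (noiseless case).
D2_rec = ((Y[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
assert np.allclose(D2, D2_rec)
```

The NMR problem is harder precisely because the distance information is a partial, ambiguous set of bounds rather than this complete, exact matrix.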

Parameter selection for smoothing splines using Stein's Unbiased Risk Estimator

The 2011 International Joint Conference on Neural Networks, 2011

A challenging problem in smoothing spline regression is determining a value for the smoothing parameter, which establishes the tradeoff between closeness of fit to the data and the smoothness of the regression function. This paper proposes a new ...
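
For any linear smoother y_hat = S(lam) y with known noise variance sigma^2, Stein's Unbiased Risk Estimate is SURE(lam) = ||y - S y||^2/n - sigma^2 + 2 sigma^2 tr(S)/n, and the smoothing parameter can be chosen by minimising it over a grid. A sketch using a Gaussian-kernel ridge smoother as a stand-in for the spline smoother (the smoother, bandwidth, and grid below are illustrative assumptions, not the paper's setup):

```python
import numpy as np

def sure(y, S, sigma2):
    """Stein's Unbiased Risk Estimate for a linear smoother y_hat = S @ y:
    SURE = ||y - S y||^2 / n - sigma^2 + 2 * sigma^2 * trace(S) / n."""
    n = len(y)
    resid = y - S @ y
    return resid @ resid / n - sigma2 + 2.0 * sigma2 * np.trace(S) / n

# Illustrative setup: noisy sine data and the smoother family
# S(lam) = K (K + n*lam*I)^(-1), with known noise variance sigma^2.
rng = np.random.default_rng(0)
n, sigma2 = 100, 0.1
x = np.linspace(0, 1, n)
y = np.sin(2 * np.pi * x) + rng.normal(0, np.sqrt(sigma2), n)
K = np.exp(-(x[:, None] - x[None, :]) ** 2 / (2 * 0.05 ** 2))

lams = np.logspace(-6, 1, 30)
scores = []
for lam in lams:
    S = K @ np.linalg.solve(K + n * lam * np.eye(n), np.eye(n))
    scores.append(sure(y, S, sigma2))
best = lams[int(np.argmin(scores))]
print(best)  # the SURE-minimising smoothing parameter on this grid
```

The key quantity is tr(S), the effective degrees of freedom of the smoother, which is what penalises under-smoothing in the estimate.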

Nonnegative matrix factorization via rank-one downdate

Proceedings of the 25th International Conference on Machine Learning (ICML '08), 2008

Nonnegative matrix factorization (NMF) was popularized as a tool for data mining by Lee and Seung in 1999. NMF attempts to approximate a matrix with nonnegative entries by a product of two low-rank matrices, also with nonnegative entries. We propose an algorithm called rank-one downdate (R1D) for computing an NMF that is partly motivated by the singular value decomposition. The algorithm computes the dominant singular values and vectors of adaptively determined submatrices of a matrix. On each iteration, R1D extracts a rank-one submatrix from the original matrix according to an objective function. We establish a theoretical result that maximizing this objective function corresponds to correctly classifying articles in a nearly separable corpus. We also provide computational experiments showing the success of this method in identifying features in realistic datasets. The method is also much faster than other NMF routines.
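
The rank-one building block that R1D relies on, the dominant singular triple of a (sub)matrix, can be computed by power iteration. A sketch of just that piece (it omits R1D's adaptive submatrix selection, which is the paper's actual contribution):

```python
import numpy as np

def dominant_rank_one(A, iters=200):
    """Dominant singular triple (sigma, u, v) of A via power iteration on A^T A.

    R1D builds its factorization from such rank-one pieces, but additionally
    restricts each one to an adaptively chosen submatrix; this sketch does not.
    """
    rng = np.random.default_rng(0)
    v = rng.standard_normal(A.shape[1])
    for _ in range(iters):
        v = A.T @ (A @ v)        # one step of power iteration on A^T A
        v /= np.linalg.norm(v)
    u = A @ v
    sigma = np.linalg.norm(u)
    return sigma, u / sigma, v

A = np.array([[3.0, 1.0], [1.0, 3.0]])
sigma, u, v = dominant_rank_one(A)
print(round(sigma, 6))  # 4.0, the largest singular value of A
```

For a nonnegative matrix, the dominant singular vectors are themselves entrywise nonnegative (Perron-Frobenius), which is what makes the SVD a natural starting point for NMF.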

Guided Locally Linear Embedding

Pattern Recognition Letters, 2011

Nonlinear dimensionality reduction is the problem of retrieving a low-dimensional representation of a manifold that is embedded in a high-dimensional observation space. Locally Linear Embedding (LLE), a prominent dimensionality reduction technique, is an unsupervised algorithm; as such, it cannot be guided toward modes of variability that may be of particular interest. This paper proposes a supervised variant of LLE. Like LLE, it retrieves a low-dimensional global coordinate system that faithfully represents the embedded manifold. Unlike LLE, however, it produces an embedding in which predefined modes of variation are preserved. This can improve several supervised learning tasks, including pattern recognition, regression, and data visualization.
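
For context, the unsupervised LLE baseline works in three steps: find each point's nearest neighbours, solve for locally linear reconstruction weights, then embed via the bottom eigenvectors of (I - W)^T (I - W). A minimal sketch of that baseline, without the paper's supervision (parameter values and the toy curve are illustrative):

```python
import numpy as np

def lle(X, k=8, d=2, reg=1e-3):
    """Minimal unsupervised LLE (Roweis & Saul); the guided variant adds
    side information to steer the embedding, which this sketch omits."""
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        # k nearest neighbours of point i (excluding itself).
        dist = np.linalg.norm(X - X[i], axis=1)
        nbrs = np.argsort(dist)[1:k + 1]
        Z = X[nbrs] - X[i]                      # neighbours in local coordinates
        C = Z @ Z.T
        C += reg * np.trace(C) * np.eye(k)      # regularise the local Gram matrix
        w = np.linalg.solve(C, np.ones(k))
        W[i, nbrs] = w / w.sum()                # normalised reconstruction weights
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    vals, vecs = np.linalg.eigh(M)
    return vecs[:, 1:d + 1]                     # skip the constant bottom eigenvector

# Unroll a noisy 1-D curve embedded in 3-D down to one dimension.
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0, 3, 200))
X = np.c_[np.cos(t), np.sin(t), t] + 0.01 * rng.standard_normal((200, 3))
Y = lle(X, k=10, d=1)
```

The supervised variant modifies how the global coordinates are chosen so that predefined modes of variation survive the embedding; the neighbourhood and weight steps above are shared.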

Supervised principal component analysis: Visualization, classification and regression on subspaces and submanifolds

Pattern Recognition, 2011

We propose "Supervised Principal Component Analysis (Supervised PCA)", a generalization of PCA that is uniquely effective for regression and classification problems with high-dimensional input data. It works by estimating a sequence of principal components that have maximal dependence on the response variable. The proposed Supervised PCA is solvable in closed form and has a dual formulation that significantly reduces the computational complexity of problems in which the number of predictors greatly exceeds the number of observations (such as DNA microarray experiments). Furthermore, we show how the algorithm can be kernelized, which makes it applicable to non-linear dimensionality reduction tasks. Experimental results on various visualization, classification, and regression problems show significant improvement over other supervised approaches in both accuracy and computational efficiency.
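
One way to realise "components with maximal dependence on the response" in closed form is the HSIC-style eigenproblem: with rows of X as samples, take the top eigenvectors of X^T H L H X, where H is the centring matrix and L is a kernel on the targets. A sketch under that formulation (presented as an illustrative reading of the closed form, not a verbatim transcription of the paper's algorithm):

```python
import numpy as np

def supervised_pca(X, Y, d):
    """Closed-form Supervised PCA sketch: projection directions are the top
    eigenvectors of X^T H L H X, with H the centring matrix and L = Y Y^T a
    linear kernel on the targets. X is n x p (rows are samples), Y is n x q."""
    n = X.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    L = Y @ Y.T
    Q = X.T @ H @ L @ H @ X                      # p x p dependence matrix
    vals, vecs = np.linalg.eigh(Q)
    U = vecs[:, np.argsort(vals)[::-1][:d]]      # top-d eigenvectors
    return X @ U                                 # n x d projected data

# Targets depend only on the first input coordinate; Supervised PCA should
# pick a projection dominated by that coordinate.
rng = np.random.default_rng(0)
X = rng.standard_normal((300, 5))
y = (2.0 * X[:, 0] + 0.1 * rng.standard_normal(300)).reshape(-1, 1)
Z = supervised_pca(X, y, 1)
print(abs(np.corrcoef(Z[:, 0], X[:, 0])[0, 1]) > 0.9)  # True
```

Ordinary PCA would ignore y entirely here; the L term is what ties the chosen subspace to the response.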

A novel greedy algorithm for Nyström approximation
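
This entry has no abstract here, but the standard Nyström approximation it builds on is K ~ C W^+ C^T, where C collects sampled columns (landmarks) of the kernel matrix and W is their intersection block. A sketch using uniform random landmarks in place of the paper's greedy selection:

```python
import numpy as np

def nystrom(K, idx):
    """Standard Nystrom approximation K ~ C @ pinv(W) @ C.T, where C holds the
    sampled columns (landmarks) and W their intersection block. The paper's
    contribution is a greedy choice of idx; here idx is simply supplied."""
    C = K[:, idx]                  # n x m sampled columns
    W = K[np.ix_(idx, idx)]        # m x m intersection block
    return C @ np.linalg.pinv(W) @ C.T

rng = np.random.default_rng(0)
x = rng.standard_normal(200)
K = np.exp(-(x[:, None] - x[None, :]) ** 2 / 2.0)   # Gaussian kernel, fast eigen-decay
idx = rng.choice(200, size=40, replace=False)
K_hat = nystrom(K, idx)
err = np.linalg.norm(K - K_hat) / np.linalg.norm(K)
print(err)  # small: 40 landmarks capture this smooth kernel almost exactly
```

Landmark choice is what drives the approximation quality, which is exactly the degree of freedom a greedy selection scheme targets.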

Dimensionality Reduction: A Short Tutorial
