Intrinsic dimension estimation via nearest constrained subspace classifier
Related papers
IDEA: Intrinsic Dimension Estimation Algorithm
Lecture Notes in Computer Science, 2011
The high dimensionality of some real-life signals makes the most common signal processing and pattern recognition methods unfeasible. For this reason, a great deal of research work in the literature has been devoted to the development of algorithms performing dimensionality reduction. To this aim, useful help can be provided by estimating the intrinsic dimensionality of a given dataset, that is, the minimum number of parameters needed to capture and describe all the information carried by the data. Although many techniques have been proposed, most of them fail in the case of noisy data or when the intrinsic dimensionality is too high. In this paper we propose a local intrinsic dimension estimator exploiting the statistical properties of data neighborhoods. The evaluation of the algorithm on both synthetic and real datasets, and the comparison with state-of-the-art algorithms, shows that the proposed technique is promising.
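The abstract does not spell out the estimator itself, so as a hedged illustration here is a minimal neighborhood-based local estimator in the same spirit, using the well-known Levina-Bickel maximum-likelihood form on k-nearest-neighbor distances as a stand-in; all function names are ours, not the paper's.

```python
# Hedged sketch: a neighborhood-based local ID estimator in the spirit of
# the abstract, using the Levina-Bickel MLE form as a stand-in (the paper's
# own estimator is not given here).
import numpy as np

def local_id_mle(X, k=10):
    """One intrinsic-dimension estimate per point from its k-NN distances."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    D.sort(axis=1)                      # row i: distances from x_i, ascending
    r = D[:, 1:k + 1]                   # k nearest-neighbor distances (skip self)
    # MLE at x: d_hat = (k - 1) / sum_{j<k} log(r_k / r_j)
    logs = np.log(r[:, -1][:, None] / r[:, :-1])
    return (k - 1) / logs.sum(axis=1)

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3)) @ rng.normal(size=(3, 10))  # 3-d data in R^10
print(np.median(local_id_mle(X, k=12)))                   # close to 3
```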
Optimized intrinsic dimension estimator using nearest neighbor graphs
2010 IEEE International Conference on Acoustics, Speech and Signal Processing, 2010
We develop an approach to intrinsic dimension estimation based on k-nearest neighbor (kNN) distances. The dimension estimator is derived using a general theory on functionals of kNN density estimates. This enables us to predict the performance of the dimension estimation algorithm. In addition, it allows for optimization of free parameters in the algorithm. We validate our theory through simulations and compare our estimator to previous kNN-based dimensionality estimation approaches.
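The paper's estimator is derived from functionals of kNN density estimates; as a simpler hedged stand-in, the sketch below uses only the scaling relation E[r_k] ~ c * (k/n)^(1/d) for the k-th neighbor distance of n points on a d-dimensional support, so the slope of log(mean r_k) against log(k) estimates 1/d. Names and parameters are illustrative, not the paper's.

```python
# Hedged stand-in for a kNN-distance dimension estimator: regress
# log(mean r_k) on log(k); the slope is approximately 1/d.
import numpy as np

def knn_slope_dimension(X, k_max=20):
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    D.sort(axis=1)
    ks = np.arange(1, k_max + 1)
    mean_rk = D[:, 1:k_max + 1].mean(axis=0)   # average k-th neighbor distance
    slope = np.polyfit(np.log(ks), np.log(mean_rk), 1)[0]
    return 1.0 / slope

rng = np.random.default_rng(1)
X = rng.uniform(size=(1000, 3))                # uniform cube, d = 3
print(knn_slope_dimension(X))                  # approximately 3
```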
Weighted locally linear embedding for dimension reduction
Pattern Recognition, 2009
The low-dimensional representation of high-dimensional data and the concise description of its intrinsic structure are central problems in data analysis. In this paper, an unsupervised learning algorithm called weighted locally linear embedding (WLLE) is presented to discover the intrinsic structure of data, such as neighborhood relationships, global distributions and clustering. The WLLE algorithm is motivated by the locally linear embedding (LLE) algorithm and the cam weighted distance, a novel distance measure that deflects the equal-distance contours used in classification, yielding improved classification. A major advantage of WLLE is that it optimizes the process of intrinsic structure discovery by avoiding unreasonable neighbor searches, while at the same time allowing the discovery to adapt to the characteristics of the input dataset. Furthermore, the discovered intrinsic structure can be used to compute manipulable embeddings for classification and recognition purposes, so WLLE can also work as a feature extraction algorithm. Simulation studies demonstrate that WLLE gives better results in manifold learning and dimensionality reduction than LLE and neighborhood linear embedding (NLE), and is more robust to parameter changes. Experiments on face image datasets and comparisons with well-known face recognition methods such as kernel PCA (KPCA) and kernel direct discriminant analysis (KDDA) show the potential of WLLE for real-world problems.
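The cam weighted distance is not defined in the abstract, so the sketch below shows only the plain LLE pipeline that WLLE builds on: reconstruction weights from a local least-squares problem, then an embedding from the bottom eigenvectors of (I - W)^T (I - W). WLLE would replace the Euclidean neighbor search with the cam weighted one.

```python
# Plain LLE pipeline (the base algorithm WLLE modifies); the cam weighted
# neighbor search would replace the Euclidean argsort below.
import numpy as np

def lle(X, n_neighbors=10, n_components=2, reg=1e-3):
    n = X.shape[0]
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    idx = np.argsort(D, axis=1)[:, 1:n_neighbors + 1]  # k nearest neighbors
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[idx[i]] - X[i]                 # neighbors centered on x_i
        G = Z @ Z.T                          # local Gram matrix
        G += reg * np.trace(G) * np.eye(n_neighbors)   # regularization
        w = np.linalg.solve(G, np.ones(n_neighbors))
        W[i, idx[i]] = w / w.sum()           # reconstruction weights, sum to 1
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    vals, vecs = np.linalg.eigh(M)           # ascending eigenvalues
    return vecs[:, 1:n_components + 1]       # skip the constant eigenvector
```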
A second order polynomial based subspace projection method for dimensionality reduction
2010 IEEE International Conference on Image Processing, 2010
A novel feature extraction method that utilizes a nonlinear mapping from the original data space to the feature space is presented in this paper. For most practical systems, the meaningful features of a pattern class lie in a low-dimensional nonlinear constraint region (manifold) within the high-dimensional data space. A learning algorithm to model this nonlinear region and to project patterns onto this feature space is developed. A least squares estimation approach that utilizes the interdependency between points in the training patterns is used to form the nonlinear region. A feature space encompassing multiple pattern classes can be trained by modeling a separate constraint region for each pattern class and obtaining a mean constraint region by averaging the individual regions. Unlike most other nonlinear techniques, the proposed method provides an easy, intuitive way to place new points onto the nonlinear region in the feature space. Classification accuracy is further improved by introducing the concepts of modularity and discriminant analysis into the proposed method.
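A hedged sketch of one way such a second-order constraint region could be built, assuming (our assumption, not the paper's stated construction) that latent coordinates come from a linear subspace and each original feature is fit as a quadratic polynomial of those coordinates by least squares; placing a new point onto the region is then a direct evaluation.

```python
# Our assumption, not the paper's stated construction: latent coordinates
# from a linear (SVD) subspace, each original feature fit as a second-order
# polynomial of those coordinates by least squares.
import numpy as np

def quad_features(T):
    """Second-order polynomial basis [1, t_i, t_i * t_j] of latent coords."""
    n, m = T.shape
    cross = np.column_stack([T[:, i] * T[:, j]
                             for i in range(m) for j in range(i, m)])
    return np.hstack([np.ones((n, 1)), T, cross])

def fit_quadratic_region(X, m=2):
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    V = Vt[:m].T                             # linear latent subspace
    T = (X - mu) @ V                         # latent coordinates
    C, *_ = np.linalg.lstsq(quad_features(T), X - mu, rcond=None)
    return mu, V, C                          # quadratic constraint region

def place_on_region(x, mu, V, C):
    """Intuitive placement of a new point onto the fitted region."""
    t = (x - mu) @ V
    return mu + quad_features(t[None, :])[0] @ C
```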
On Local Intrinsic Dimension Estimation and Its Applications
In this paper, we present multiple novel applications for local intrinsic dimension estimation. There has been much work done on estimating the global dimension of a data set, typically for the purposes of dimensionality reduction. We show that by estimating dimension locally, we are able to extend the uses of dimension estimation to many applications that are not possible with global dimension estimation. Additionally, we show that local dimension estimation can be used to obtain a better global dimension estimate, alleviating the negative bias that is common to all known dimension estimation algorithms. We further illustrate the use of local dimension estimation in applications such as learning on statistical manifolds, network anomaly detection, clustering, and image segmentation.
Minimum Neighbor Distance Estimators of Intrinsic Dimension
Most machine learning techniques suffer from the "curse of dimensionality" when applied to high-dimensional data. To face this limitation, a common preprocessing step is to employ a dimensionality reduction technique, and a great deal of research work in the literature has been devoted to developing algorithms that perform this task. Often, these techniques require the number of dimensions to retain as a parameter; to this aim, they need to estimate the "intrinsic dimensionality" of the given dataset, which refers to the minimum number of degrees of freedom needed to capture all the information carried by the data. Although many estimation techniques have been proposed, most of them fail in the case of noisy data or when the intrinsic dimensionality is too high. In this paper we present a family of estimators based on the probability density function of the normalized nearest neighbor distance. We evaluate the proposed techniques on both synthetic and real datasets, comparing their performance with that of state-of-the-art algorithms; the results show that the proposed methods are promising.
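For samples that are locally uniform on a d-dimensional manifold, the normalized nearest neighbor distance rho = r_1 / r_k has pdf g(rho; k, d) = k * d * rho^(d-1) * (1 - rho^d)^(k-1); a maximum-likelihood fit over integer d built on this density gives one such estimator, sketched below (our implementation, not the authors' code).

```python
# MLE sketch built on the pdf of the normalized nearest neighbor distance
# rho = r_1 / r_k, which for locally uniform samples on a d-manifold is
# g(rho) = k * d * rho^(d-1) * (1 - rho^d)^(k-1).
import numpy as np

def mind_ml(X, k=10, d_max=None):
    D = X.shape[1]
    d_max = d_max or D
    Dm = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    Dm.sort(axis=1)
    rho = np.clip(Dm[:, 1] / Dm[:, k], 1e-12, 1 - 1e-12)  # r_1 / r_k
    best_d, best_ll = 1, -np.inf
    for d in range(1, d_max + 1):           # maximize log-likelihood over d
        ll = (np.log(k * d) + (d - 1) * np.log(rho)
              + (k - 1) * np.log1p(-rho ** d)).sum()
        if ll > best_ll:
            best_d, best_ll = d, ll
    return best_d
```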
A Category Space Approach to Supervised Dimensionality Reduction
2016
Supervised dimensionality reduction has emerged as an important theme in the last decade. Despite the plethora of models and formulations, there is a lack of a simple model that aims to project the set of patterns into a space defined by the classes (or categories). To this end, we set up a model in which each class is represented as a 1D subspace of the vector space formed by the features. Assuming the number of classes does not exceed the number of features, the model results in multi-class supervised learning in which the features of each class are projected into the class subspace. Class discrimination is automatically guaranteed via the imposition of orthogonality of the 1D class subspaces. The resulting optimization problem, formulated as the minimization of a sum of quadratic functions on a Stiefel manifold, is non-convex (due to the constraints) but nevertheless has a structure for which we can identify when we have reached a global minimum. After formulating...
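As a hedged illustration of the optimization template only, the sketch below assumes the sum-of-quadratics objective is the per-class reconstruction residual sum_c sum_{x in class c} ||x - w_c w_c^T x||^2, minimized over orthonormal class directions by plain gradient steps with a QR retraction back onto the Stiefel manifold; the paper's exact objective and its global-minimum certificate are not reproduced here.

```python
# Generic Stiefel-manifold sketch, assuming the objective is the per-class
# reconstruction residual (equivalently, maximize sum_c w_c^T S_c w_c over
# orthonormal columns w_c). Not the paper's algorithm or certificate.
import numpy as np

def category_space_fit(X, y, iters=200, lr=0.01):
    classes = np.unique(y)
    # Normalized class scatter matrices.
    S = [X[y == c].T @ X[y == c] / (y == c).sum() for c in classes]
    rng = np.random.default_rng(0)
    W, _ = np.linalg.qr(rng.normal(size=(X.shape[1], len(classes))))
    for _ in range(iters):
        # Euclidean gradient of sum_c w_c^T S_c w_c, column by column.
        G = np.stack([S[j] @ W[:, j] for j in range(len(classes))], axis=1)
        W, _ = np.linalg.qr(W + lr * 2 * G)  # ascent step + QR retraction
    return W                                  # column c spans class c's subspace
```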
On Dimension Reduction using Supervised Distance Preserving Projection for Face Recognition
Universal Journal of Applied Mathematics, 2018
Personal identification or verification is a very common requirement in modern society, especially for access to restricted areas or resources. Biometric identification, especially face identification or recognition in controlled or uncontrolled scenarios, has become one of the most important and challenging areas of research. Images are often represented as high-dimensional vectors or arrays. Operating directly on these vectors would lead to high computational costs and storage demands, and working directly with the raw data is often difficult or even impossible. Dimensionality reduction has therefore become a necessity for data preprocessing, representation and classification. It aims to represent data in a low-dimensional space that captures the intrinsic nature of the data. In this article we apply a supervised distance preserving projection (SDPP) technique, Semidefinite Least Squares SDPP (SLS-SDPP), which we proposed recently, to reduce the dimension of face image data. Numerical experiments conducted on well-known face image datasets, both on gallery images and on blurred images of various levels, demonstrate that the performance of SLS-SDPP is promising in comparison to two leading approaches, Eigenface and Fisherface.
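SLS-SDPP itself is defined in the paper; for context, here is a minimal sketch of the Eigenface baseline it is compared against: PCA on vectorized face images, followed by nearest-neighbor matching in the reduced space. Function names are ours.

```python
# Eigenface baseline (the comparison method named in the abstract), not
# SLS-SDPP itself: PCA on vectorized face images, then nearest-neighbor
# matching in the reduced space.
import numpy as np

def eigenfaces(train, n_components=50):
    mu = train.mean(axis=0)
    _, _, Vt = np.linalg.svd(train - mu, full_matrices=False)
    return mu, Vt[:n_components].T            # mean face and principal axes

def match(probe, gallery, labels, mu, V):
    P = (gallery - mu) @ V                    # gallery in eigenface space
    q = (probe - mu) @ V                      # probe in the same space
    return labels[np.argmin(np.linalg.norm(P - q, axis=1))]
```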