Coordinating Principal Component Analyzers
Related papers
Mixture Models for Exploring Local PCA Structures
2007
Principal component analysis (PCA) is one of the most popular techniques for dimensionality reduction of multivariate data. This paper discusses a new learning algorithm for exploring local PCA structure, in which the observed data follow a mixture of several PCA models, each described by a linear combination of independent, Gaussian sources. The proposed method uses a mixture of several Gaussian distributions to extract all local PCA structures simultaneously, with parameters estimated by maximizing the likelihood function. The performance of the proposed method is compared with that of existing PCA algorithms on synthetic datasets.
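The mixture-of-Gaussians route to local PCA can be illustrated in a few lines: fit a full-covariance Gaussian mixture by maximum likelihood (EM) and eigendecompose each component's covariance to read off a local PCA model. The sketch below is a generic illustration of this idea, not the paper's specific algorithm; the names and synthetic data are ours.

```python
# Local PCA structures from a maximum-likelihood Gaussian mixture.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic data: two clusters with different dominant directions.
X = np.vstack([
    rng.normal(0, [3.0, 0.3], size=(300, 2)),
    rng.normal(6, [0.3, 3.0], size=(300, 2)),
])

# Fit a mixture of full-covariance Gaussians by EM (maximum likelihood).
gmm = GaussianMixture(n_components=2, covariance_type="full",
                      random_state=0).fit(X)

# Each component's covariance eigendecomposition gives one local PCA model.
for k, cov in enumerate(gmm.covariances_):
    eigvals, eigvecs = np.linalg.eigh(cov)       # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]
    print(f"component {k}: local PCs =\n{eigvecs[:, order].T}")
```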
Demixed Principal Component Analysis
In many experiments, the data points collected live in high-dimensional observation spaces, yet can be assigned a set of labels or parameters. In electrophysiological recordings, for instance, the responses of populations of neurons generally depend on mixtures of experimentally controlled parameters. The heterogeneity and diversity of these parameter dependencies can make visualization and interpretation of such data extremely difficult. Standard dimensionality reduction techniques such as principal component analysis (PCA) can provide a succinct and complete description of the data, but the description is constructed independently of the relevant task variables and is often hard to interpret. Here, we start with the assumption that a particularly informative description is one that reveals the dependency of the high-dimensional data on the individual parameters. We show how to modify the loss function of PCA so that the principal components capture the maximum amount of variance in the data while depending on a minimum number of parameters. We call this method demixed principal component analysis (dPCA), as the principal components here segregate the parameter dependencies. We phrase the problem as a probabilistic graphical model and present a fast Expectation-Maximization (EM) algorithm. We demonstrate the use of this algorithm on electrophysiological data and show that it serves to demix the parameter dependence of a neural population response.
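The paper's probabilistic EM formulation is more involved, but the core demixing intuition can be illustrated by marginalizing a labeled dataset over all parameters except one and running PCA on each marginalization. The array layout (stimulus × time × neuron) and all names below are assumptions made for the example.

```python
# Simplified illustration of demixing via marginalized averages.
# This is not the paper's probabilistic EM algorithm.
import numpy as np

rng = np.random.default_rng(1)
S, T, N = 5, 50, 20           # stimuli x time points x neurons
data = rng.normal(size=(S, T, N))

def top_pc(M):
    """First principal component of rows of M (observations x features)."""
    M = M - M.mean(axis=0)
    _, _, Vt = np.linalg.svd(M, full_matrices=False)
    return Vt[0]

# Stimulus marginalization: average out time, keep stimulus dependence.
stim_avg = data.mean(axis=1)              # (S, N)
# Time marginalization: average out stimulus, keep time dependence.
time_avg = data.mean(axis=0)              # (T, N)

pc_stim = top_pc(stim_avg)    # axis capturing stimulus-driven variance
pc_time = top_pc(time_avg)    # axis capturing time-driven variance
print(np.dot(pc_stim, pc_time))   # demixed axes need not be orthogonal
```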
Mixture of Bilateral-Projection Two-Dimensional Probabilistic Principal Component Analysis
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016
Probabilistic principal component analysis (PPCA) is built upon a global linear mapping, which is insufficient for modeling complex data variation. This paper proposes a mixture of bilateral-projection probabilistic principal component analysis models (mixB2DPPCA) for 2D data. With multiple components in the mixture, the model acts as a 'soft' clustering algorithm and is capable of modeling data with complex structures. A Bayesian inference scheme based on the variational EM (Expectation-Maximization) approach is proposed for learning the model parameters. Experiments on several publicly available databases show that mixB2DPPCA achieves lower reconstruction errors and higher recognition rates than existing PCA-based algorithms.
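The bilateral projection at the heart of the model compresses each image with a left and a right projection matrix. The sketch below shows only the deterministic bilateral 2DPCA core (scatter matrices on both sides, truncated eigenvectors, project and reconstruct); the paper's probabilistic mixture and variational EM are not reproduced.

```python
# Deterministic bilateral-projection 2DPCA sketch.
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(100, 28, 28))        # 100 images, kept as 2D matrices
mean = A.mean(axis=0)
C = A - mean

# Column-side scatter (n x n) -> right projection matrix R.
G_right = np.einsum("kij,kil->jl", C, C)
# Row-side scatter (m x m) -> left projection matrix L.
G_left = np.einsum("kij,klj->il", C, C)

def top_eigvecs(G, q):
    w, V = np.linalg.eigh(G)              # ascending eigenvalues
    return V[:, np.argsort(w)[::-1][:q]]

L, R = top_eigvecs(G_left, 5), top_eigvecs(G_right, 5)
B = np.einsum("im,kmn,nj->kij", L.T, C, R)           # 5x5 feature matrices
recon = mean + np.einsum("mi,kij,jn->kmn", L, B, R.T)
print(B.shape, recon.shape)
```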
Covariance-Based PCA for Multi-size Data
2014 22nd International Conference on Pattern Recognition, 2014
Principal component analysis (PCA) is used in diverse settings for dimensionality reduction. If data elements are all the same size, there are many approaches to estimating the PCA decomposition of the dataset. However, many datasets contain elements of different sizes that must be coerced into a fixed size before analysis, and such coercion introduces errors into the resulting PCA decomposition. We introduce CO-MPCA, a nonlinear method of directly estimating the PCA decomposition from datasets with elements of different sizes. We compare our method with two baseline approaches on three datasets: a synthetic vector dataset, a synthetic image dataset, and a real dataset of color histograms extracted from surveillance video. We provide quantitative and qualitative evidence that CO-MPCA gives a more accurate estimate of the PCA basis.
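The multi-size machinery of CO-MPCA is not spelled out in the abstract, so the sketch below only shows the covariance-based route to a PCA basis that such methods build on: accumulate the mean and covariance from the elements, then eigendecompose. All names are ours.

```python
# PCA basis from accumulated second-order statistics.
import numpy as np

def pca_from_covariance(samples, q):
    """PCA basis from accumulated mean and covariance statistics."""
    n = len(samples)
    mean = sum(samples) / n
    cov = sum(np.outer(x - mean, x - mean) for x in samples) / (n - 1)
    w, V = np.linalg.eigh(cov)
    order = np.argsort(w)[::-1][:q]
    return mean, V[:, order], w[order]

rng = np.random.default_rng(3)
samples = [rng.normal(size=16) for _ in range(200)]   # fixed-size histograms
mean, basis, var = pca_from_covariance(samples, q=3)
print(basis.shape, var)
```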
Supervised Principal Component Analysis
Pattern Recognition, 2011
We propose "Supervised Principal Component Analysis (Supervised PCA)", a generalization of PCA that is uniquely effective for regression and classification problems with high-dimensional input data. It works by estimating a sequence of principal components that have maximal dependence on the response variable. The proposed Supervised PCA is solvable in closed-form, and has a dual formulation that significantly reduces the computational complexity of problems in which the number of predictors greatly exceeds the number of observations (such as DNA microarray experiments). Furthermore, we show how the algorithm can be kernelized, which makes it applicable to non-linear dimensionality reduction tasks. Experimental results on various visualization, classification and regression problems show significant improvement over other supervised approaches both in accuracy and computational efficiency.
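As we understand the closed-form solution, the Supervised PCA basis consists of the top eigenvectors of X^T H K H X, where K is a kernel matrix over the responses and H is the centering matrix. The sketch below is an illustration of that formulation, not a reference implementation.

```python
# Closed-form Supervised PCA sketch via a dependence-weighted scatter.
import numpy as np

def supervised_pca(X, y, q, kernel=np.outer):
    """X: (n, d) inputs; y: (n,) responses; returns a (d, q) basis."""
    n = X.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    K = kernel(y, y)                           # linear kernel on responses
    Q = X.T @ H @ K @ H @ X                    # dependence-weighted scatter
    w, V = np.linalg.eigh(Q)
    return V[:, np.argsort(w)[::-1][:q]]

rng = np.random.default_rng(4)
X = rng.normal(size=(150, 10))
y = X[:, 0] * 2.0 + rng.normal(scale=0.1, size=150)   # depends on feature 0
U = supervised_pca(X, y, q=2)
print(np.round(U[:, 0], 2))    # first direction should emphasize feature 0
```

Swapping the `kernel` argument for a nonlinear kernel on the responses corresponds to the kernelized variant mentioned in the abstract.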
Probabilistic Disjoint Principal Component Analysis
Multivariate Behavioral Research, 2018
One of the most important problems in principal component analysis and factor analysis is the interpretation of the components/factors. In this paper, the disjoint principal component analysis model is extended in a maximum-likelihood framework to allow for inference on the model parameters. A coordinate ascent algorithm is proposed to estimate the model parameters. The performance of the methodology is evaluated on simulated and real data sets.
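A simplified, non-probabilistic reading of disjoint PCA is that each variable loads on exactly one component, and a coordinate-ascent loop alternates between re-fitting each component as the first PC of its variable block and re-assigning variables to the component that explains them best. The sketch below implements only that reading; the paper's maximum-likelihood machinery is not reproduced.

```python
# Simplified disjoint PCA by coordinate ascent (non-probabilistic).
import numpy as np

def disjoint_pca(X, n_comp, n_iter=20, seed=0):
    n, p = X.shape
    X = X - X.mean(axis=0)
    rng = np.random.default_rng(seed)
    assign = rng.integers(n_comp, size=p)          # variable -> component
    for _ in range(n_iter):
        scores = np.zeros((n, n_comp))
        loadings = np.zeros((p, n_comp))
        for k in range(n_comp):
            idx = np.flatnonzero(assign == k)
            if idx.size == 0:
                continue
            _, _, Vt = np.linalg.svd(X[:, idx], full_matrices=False)
            loadings[idx, k] = Vt[0]               # first PC of the block
            scores[:, k] = X[:, idx] @ Vt[0]
        # Re-assign each variable to the component that explains it best.
        assign = np.abs(X.T @ scores).argmax(axis=1)
    return assign, loadings

rng = np.random.default_rng(5)
z = rng.normal(size=(200, 2))
X = np.hstack([z[:, [0]] * rng.normal(size=3), z[:, [1]] * rng.normal(size=3)])
X += 0.05 * rng.normal(size=X.shape)
print(disjoint_pca(X, 2)[0])       # two disjoint groups of three variables
```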
Maximum likelihood principal component analysis
Journal of …, 1997
The theoretical principles and practical implementation of a new method for multivariate data analysis, maximum likelihood principal component analysis (MLPCA), are described. MLPCA is an analog to principal component analysis (PCA) that incorporates information about measurement errors to develop PCA models that are optimal in a maximum likelihood sense. The theoretical foundations of MLPCA are initially established using a regression model and extended to the framework of PCA and singular value decomposition (SVD). An efficient and reliable algorithm based on an alternating regression method is described. A generalization of the algorithm allows its adaptation to cases of correlated errors, provided that the error covariance matrix is known. Models with intercept terms can also be accommodated. Simulated data and near-infrared spectra, with a variety of error structures, are used to evaluate the performance of the new algorithm. Convergence times depend on the error structure but are typically a few minutes. In all cases, models determined by MLPCA are found to be superior to those obtained by PCA when non-uniform error distributions are present, although the level of improvement depends on the error structure of the particular data set.
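For uncorrelated errors, the MLPCA objective is a weighted least-squares fit of a low-rank matrix: minimize Σ_ij (x_ij − m_ij)²/σ_ij² over rank-r M. The sketch below fits this objective with a plain alternating weighted regression; it illustrates the objective rather than the authors' full algorithm or its correlated-error extension.

```python
# Maximum-likelihood PCA objective (uncorrelated errors) by
# alternating weighted least squares.
import numpy as np

def mlpca(X, S, r, n_iter=100):
    """X: data, S: elementwise error std devs (same shape), r: rank."""
    W = 1.0 / S**2                                  # elementwise weights
    U, sv, Vt = np.linalg.svd(X, full_matrices=False)
    T, V = U[:, :r] * sv[:r], Vt[:r].T              # ordinary PCA start
    for _ in range(n_iter):
        for i in range(X.shape[0]):                 # rows: solve scores
            Wi = np.diag(W[i])
            T[i] = np.linalg.solve(V.T @ Wi @ V, V.T @ Wi @ X[i])
        for j in range(X.shape[1]):                 # cols: solve loadings
            Wj = np.diag(W[:, j])
            V[j] = np.linalg.solve(T.T @ Wj @ T, T.T @ Wj @ X[:, j])
    return T @ V.T

rng = np.random.default_rng(6)
M = rng.normal(size=(40, 2)) @ rng.normal(size=(2, 8))   # true rank 2
S = rng.uniform(0.05, 0.5, size=M.shape)                 # non-uniform errors
X = M + rng.normal(scale=S)
err = np.abs(mlpca(X, S, r=2) - M).mean()
print(f"mean abs error of ML estimate: {err:.3f}")
```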
A nonlinear PCA based on manifold approximation
Computational Statistics, 2000
We address the problem of generalizing Principal Component Analysis (PCA) from the approximation point of view. Given a data set in a high-dimensional space, PCA proposes approximations by linear subspaces. These linear models can show their limits when the data distribution is not Gaussian. To overcome these limits, we present Auto-Associative Composite (AAC) models based on manifold approximation. AAC models benefit from interesting theoretical properties that generalize those of PCA. We exploit these properties to propose an iterative algorithm to compute the manifold, and prove its convergence in a finite number of steps. PCA and AAC models are first compared from a theoretical point of view. As a result, we show that PCA is the unique additive AAC model. A practical comparison of AAC and PCA models is then presented on a data set made of curves.
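One way to read the additive AAC construction is as repeated auto-association: project the residual onto a direction to get a one-dimensional index, fit a (possibly nonlinear) reconstruction of every variable from that index, deflate, and repeat; with linear reconstructions this reduces to PCA. The sketch below implements that simplified reading, not the paper's exact algorithm.

```python
# Additive auto-associative approximation (simplified AAC-style loop).
import numpy as np

def additive_auto_associative(X, n_models=2, degree=3):
    R = X - X.mean(axis=0)
    reconstruction = np.tile(X.mean(axis=0), (X.shape[0], 1))
    for _ in range(n_models):
        _, _, Vt = np.linalg.svd(R, full_matrices=False)
        t = R @ Vt[0]                         # 1-D index along first PC
        # Nonlinear "reconstruction" of every variable from the index.
        basis = np.vander(t, degree + 1)      # polynomial features of t
        coef, *_ = np.linalg.lstsq(basis, R, rcond=None)
        s_t = basis @ coef                    # fitted manifold component
        reconstruction += s_t
        R = R - s_t                           # deflate and continue
    return reconstruction

rng = np.random.default_rng(7)
t = rng.uniform(-2, 2, size=300)
X = np.c_[t, t**2] + 0.05 * rng.normal(size=(300, 2))    # a curved cloud
err = np.abs(additive_auto_associative(X) - X).mean()
print(f"mean abs reconstruction error: {err:.3f}")
```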
N-Dimensional Principal Component Analysis
2010
In this paper, we first briefly review multidimensional Principal Component Analysis (PCA) techniques, and then amend our previous N-dimensional PCA (ND-PCA) scheme by introducing multidirectional decomposition into the ND-PCA implementation. For high-dimensional data, the PCA technique is usually extended to an arbitrary n-dimensional space by the Higher-Order Singular Value Decomposition (HO-SVD) technique. Due to the size of the tensor, HO-SVD implementations usually produce a huge matrix along one direction of the tensor, often beyond the capacity of an ordinary PC. The novelty of this paper is to amend our previous ND-PCA scheme to deal with this challenge, and further to prove that the revised ND-PCA scheme provides a near-optimal linear solution under the given error bound. To evaluate the numerical properties of the revised ND-PCA scheme, experiments are performed on a set of 3D volume datasets.
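The HOSVD that ND-PCA builds on can be sketched directly: an SVD of each mode-n unfolding yields the factor matrices, and the core tensor follows by projecting onto them; truncating the factors gives the low-rank approximation. The paper's amendment for oversized unfoldings is not reproduced here.

```python
# Truncated Higher-Order SVD (HOSVD) via mode-n unfoldings.
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: move axis `mode` to the front and flatten."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def hosvd(T, ranks):
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        factors.append(U[:, :r])              # truncated mode-n factors
    core = T
    for mode, U in enumerate(factors):        # project onto each factor
        core = np.moveaxis(np.tensordot(U.T, core, axes=(1, mode)), 0, mode)
    return core, factors

rng = np.random.default_rng(8)
T = rng.normal(size=(10, 12, 14))             # a small 3D volume
core, factors = hosvd(T, ranks=(4, 4, 4))
approx = core
for mode, U in enumerate(factors):            # reconstruct from the core
    approx = np.moveaxis(np.tensordot(U, approx, axes=(1, mode)), 0, mode)
print(core.shape, np.linalg.norm(T - approx) / np.linalg.norm(T))
```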