Maximum likelihood principal component analysis
Related papers
Least Squares Regression Principal Component Analysis
2020
Dimension reduction is an important technique in surrogate modeling and machine learning. In this thesis, we present three existing dimension reduction methods in detail and then propose a novel supervised dimension reduction method, "Least Squares Regression Principal Component Analysis" (LSR-PCA), applicable to both classification and regression dimension reduction tasks. To show the efficacy of this method, we present different examples in visualization, classification and regression problems, comparing it to state-of-the-art dimension reduction methods. Furthermore, we present the kernel version of LSR-PCA for problems where the input variables are non-linearly correlated. The examples demonstrate that LSR-PCA can be a competitive dimension reduction method.

I would like to express my gratitude to my thesis supervisor, Professor Xin Yee. I would like to thank her for giving me this wonderful opportunity and for her guidance and support throughout this semester, making herself available during the difficult times of the COVID-19 situation. Without her, the making of this thesis would not have been possible. I would like to extend my thanks to Mr. Pere Balsells, for making it possible for students like me to conduct their theses abroad, as well as to the Balsells Foundation for its help and support throughout the whole stay. In addition, I would like to express my thanks to the second supervisor of this thesis, Professor Joan Torras, for helping me in the final stretch of the project and for being as helpful as he was attentive. Finally, I wish to express my most sincere appreciation to my parents and my sister, Alicia, and to my friends, for their support and encouragement during the whole stay.
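The LSR-PCA method itself is not reproduced in this listing, so the following is only a hedged sketch of the kind of comparison pipeline the abstract describes: a linear and a kernel dimension reduction step feeding a downstream classifier. Scikit-learn's PCA and KernelPCA are used here purely as stand-ins, and the moons data set is an assumed toy example.

    # Hypothetical comparison harness: linear PCA vs. kernel PCA as the
    # dimension reduction step before a simple classifier. LSR-PCA itself
    # is not implemented here; this only mirrors the evaluation setup the
    # abstract describes (assumed toy data and scikit-learn estimators).
    import numpy as np
    from sklearn.datasets import make_moons
    from sklearn.decomposition import PCA, KernelPCA
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score

    X, y = make_moons(n_samples=400, noise=0.15, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    reducers = {
        "PCA": PCA(n_components=1),
        "kernel PCA (RBF)": KernelPCA(n_components=1, kernel="rbf", gamma=5.0),
    }
    for name, reducer in reducers.items():
        Z_tr = reducer.fit_transform(X_tr)      # project onto one component
        Z_te = reducer.transform(X_te)
        clf = LogisticRegression().fit(Z_tr, y_tr)
        acc = accuracy_score(y_te, clf.predict(Z_te))
        print(f"{name}: test accuracy = {acc:.2f}")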
On the equivalence between total least squares and maximum likelihood PCA
Analytica Chimica Acta, 2005
The maximum likelihood PCA (MLPCA) method has been devised in chemometrics as a generalization of the well-known PCA method in order to derive consistent estimators in the presence of errors with known error distribution. For similar reasons, the total least squares (TLS) method has been generalized in the field of computational mathematics and engineering to maintain consistency of the parameter estimates in linear models with measurement errors of known distribution. The basic motivation for TLS is the following. Let a set of multidimensional data points (vectors) be given. How can one obtain a linear model that explains these data? The idea is to modify all data points in such a way that some norm of the modification is minimized, subject to the constraint that the modified vectors satisfy a linear relation. Although the name "total least squares" appeared in the literature only 25 years ago, this method of fitting is certainly not new and has a long history in the statistical literature, where it is known as "orthogonal regression", "errors-in-variables regression" or "measurement error modeling". The purpose of this paper is to explore the tight equivalences between MLPCA and element-wise weighted TLS (EW-TLS). Despite their seemingly different problem formulations, it is shown that both methods can be reduced to the same mathematical kernel problem, i.e. finding the closest (in a certain sense) weighted low-rank matrix approximation, where the weight is derived from the distribution of the errors in the data. Different solution approaches, as used in MLPCA and EW-TLS, are discussed. In particular, we will discuss the weighted low rank approximation (WLRA), the MLPCA, the EW-TLS and the generalized TLS (GTLS) problems. These four approaches tackle an equivalent weighted low rank approximation problem, but different algorithms are used to come up with the best approximation matrix. We will compare their computation times on chemical data and discuss their convergence behavior. One reason for the popularity of TLS is the availability of efficient and numerically robust algorithms in which the singular value decomposition (SVD) plays a prominent role. Another reason is the fact that TLS is an application-oriented procedure: it is suited for situations in which all data are corrupted by noise, which is almost always the case in engineering applications. In this sense, TLS and errors-in-variables (EIV) modeling are a powerful extension of classical least squares and ordinary regression, which correspond only to a partial modification of the data. A comprehensive description of the state of the art on TLS from its conception up to the summer of 1990 and its use in parameter estimation has been presented in an earlier book. While that book is entirely devoted to TLS, a second [4] and third book [5] present the progress in TLS and in the broader field of errors-in-variables modeling.
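For the unweighted case, the TLS idea described above (perturb the data as little as possible so that the perturbed points satisfy an exact linear relation) reduces to truncating the smallest singular value of the augmented data matrix. A minimal numpy sketch follows, assuming i.i.d. noise and simulated data; the element-wise weighted case (EW-TLS/MLPCA) requires the iterative algorithms compared in the paper and is not shown.

    # Minimal sketch of classical (unweighted) total least squares for
    # y ~ X @ a: perturb [X y] as little as possible (Frobenius norm) so
    # that an exact linear relation holds. The solution comes from the
    # right singular vector paired with the smallest singular value.
    import numpy as np

    rng = np.random.default_rng(0)
    a_true = np.array([2.0, -1.0])
    X_clean = rng.normal(size=(200, 2))
    y_clean = X_clean @ a_true
    X = X_clean + 0.05 * rng.normal(size=X_clean.shape)   # noise in X too
    y = y_clean + 0.05 * rng.normal(size=y_clean.shape)

    C = np.column_stack([X, y])                   # augmented data matrix
    _, _, Vt = np.linalg.svd(C, full_matrices=False)
    v = Vt[-1]                                    # smallest singular direction
    a_tls = -v[:-1] / v[-1]                       # TLS estimate of a

    a_ols = np.linalg.lstsq(X, y, rcond=None)[0]  # ordinary LS for comparison
    print("TLS:", a_tls, " OLS:", a_ols, " true:", a_true)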
Journal of Chemometrics, 1997
Multivariate calibration aims to model the relation between a dependent variable, e.g. analyte concentration, and the measured independent variables, e.g. spectra, for complex mixtures. The model parameters are obtained in the form of a regression vector from calibration data by regression methods such as principal component regression (PCR) or partial least squares (PLS). Subsequently, this regression vector is used to predict the dependent variable for unknown mixtures. The validation of the obtained predictions is a crucial part of the procedure, i.e. together with the point estimate an interval estimate is desired. The associated prediction intervals can be constructed from the covariance matrix of the estimated regression vector. However, currently known expressions for PCR and PLS are derived within the classical regression framework, i.e. they only take the uncertainty in the dependent variable into account. This severely limits their capability for establishing realistic prediction intervals in practical situations. In this paper, expressions are derived using the method of error propagation that also account for the measurement errors in the independent variables. An exact linear relation is assumed between the dependent and independent variables. The obtained expressions are therefore valid for the classical errors-in-variables (EIV) model. In order to make the presentation reasonably self-contained, relevant expressions are reviewed for the classical regression model as well as the classical EIV model, especially for ordinary least squares (OLS). The consequences for the limit of detection, wavelength selection, sample selection and local modeling are discussed. Diagnostics are proposed to determine the adequacy of the approximations used in the derivations. Finally, PCR and PLS are so-called biased regression methods. Compared with OLS, they yield small variance at the expense of increased bias. It follows that bias may be an important ingredient of the obtained predictions. Therefore considerable attention is paid to the quantification of bias and new stopping rules for model selection in PCR and PLS are proposed. The theoretical ideas are illustrated by the analysis of real data taken from the literature (classical regression model) as well as simulated data (classical EIV model).
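To make the role of the covariance matrix of the estimated regression vector concrete, here is a small sketch for the classical regression model only (errors in the dependent variable, ordinary least squares), with simulated data. The errors-in-variables corrections that the paper derives for PCR and PLS are not reproduced.

    # Prediction interval in the classical regression model (errors only
    # in y): the covariance matrix of the estimated regression vector is
    # propagated into an interval for a new prediction.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    n, p = 50, 3
    X = rng.normal(size=(n, p))
    beta_true = np.array([1.0, 0.5, -2.0])
    y = X @ beta_true + 0.3 * rng.normal(size=n)

    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    resid = y - X @ beta_hat
    s2 = resid @ resid / (n - p)                  # residual variance estimate
    cov_beta = s2 * XtX_inv                       # covariance of beta_hat

    x_new = np.array([0.2, -1.0, 0.5])
    y_hat = x_new @ beta_hat
    # variance of a new observation = model part + new measurement noise
    var_pred = x_new @ cov_beta @ x_new + s2
    t = stats.t.ppf(0.975, df=n - p)
    lo, hi = y_hat - t * np.sqrt(var_pred), y_hat + t * np.sqrt(var_pred)
    print(f"prediction: {y_hat:.2f}, 95% interval: [{lo:.2f}, {hi:.2f}]")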
Chapter 9: Principal Component Analysis
Springer eBooks, 2022
We will adopt the same notations as in the previous chapters. Lowercase letters x, y, ... will denote real scalar variables, whether mathematical or random. Capital letters X, Y, ... will be used to denote real matrix-variate mathematical or random variables, whether square or rectangular matrices are involved. A tilde will be placed on top of letters such as x̃, ỹ, X̃, Ỹ to denote variables in the complex domain. Constant matrices will for instance be denoted by A, B, C. A tilde will not be used on constant matrices unless the point is to be stressed that the matrix is in the complex domain. The determinant of a square matrix A will be denoted by |A| or det(A) and, in the complex case, the absolute value or modulus of the determinant of A will be denoted as |det(A)|. When matrices are square, their order will be taken as p × p, unless specified otherwise. When A is a full rank matrix in the complex domain, then AA* is Hermitian positive definite, where an asterisk designates the complex conjugate transpose of a matrix. Additionally, dX will indicate the wedge product of all the distinct differentials of the elements of the matrix X. Letting the p × q matrix X = (x_ij), where the x_ij's are distinct real scalar variables, dX = ∧_{i=1}^{p} ∧_{j=1}^{q} dx_ij. For the complex matrix X̃ = X_1 + iX_2, i = √(−1), where X_1 and X_2 are real, dX̃ = dX_1 ∧ dX_2. The requisite theory for the study of Principal Component Analysis has already been introduced in Chap. 1, namely, the problem of optimizing a real quadratic form that is subject to a constraint. We shall formulate the problem with respect to a practical situation consisting of selecting the most "relevant" variables in a study. Suppose that a scientist would like to devise a "good health" index in terms of certain indicators. After selecting a random sample of individuals belonging to a population that is homogeneous with respect to a variety of factors, such as age group, racial background and environmental conditions, she managed to secure measurements on p = 15 variables, including, for instance, x_1: weight, x_2: systolic pressure, x_3: blood sugar level, and x_4: height.
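The constrained optimization the chapter refers to, maximizing u'Su subject to u'u = 1, is solved by the leading eigenvector of the covariance matrix S. A small numerical check follows, using simulated correlated data as a stand-in (the 15-variable health study itself is not available here).

    # Numerical check: the maximizer of u'Su over unit vectors u is the
    # leading eigenvector of S, and the maximum is the largest eigenvalue.
    import numpy as np

    rng = np.random.default_rng(2)
    X = rng.normal(size=(100, 4)) @ rng.normal(size=(4, 4))  # correlated columns
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / (X.shape[0] - 1)              # sample covariance matrix

    eigvals, eigvecs = np.linalg.eigh(S)          # ascending eigenvalues
    u1 = eigvecs[:, -1]                           # leading eigenvector
    print("largest eigenvalue:", eigvals[-1])
    print("u1' S u1:", u1 @ S @ u1)               # equal up to round-off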
Multivariate Statistical Data Analysis - Principal Component Analysis (PCA)
Principal component analysis (PCA) is a multivariate technique that analyzes a data table in which observations are described by several inter-correlated quantitative dependent variables. Its goal is to extract the important information from the statistical data, to represent it as a set of new orthogonal variables called principal components, and to display the pattern of similarity between the observations and between the variables as points on maps. Mathematically, PCA depends upon the eigen-decomposition of positive semi-definite matrices and upon the singular value decomposition (SVD) of rectangular matrices. It is determined by eigenvectors and eigenvalues. Eigenvalues and eigenvectors are numbers and vectors associated with square matrices; together they provide the eigen-decomposition of a matrix, which analyzes the structure of that matrix, such as a correlation, covariance, or cross-product matrix. Performing PCA is quite simple in practice. Organize the data set as an m × n matrix, where m is the number of measurement types and n is the number of trials. Subtract the mean for each measurement type or row x_i. Calculate the SVD or the eigenvectors of the covariance matrix. Among the many interesting applications of PCA, multivariate data analysis and image compression are used, knowingly or unknowingly, in everyday life.
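The recipe above translates directly into a few lines of numpy, shown here as a sketch with assumed toy data: organize the data as an m × n matrix, subtract each row's mean, then take the SVD or, equivalently, the eigen-decomposition of the covariance matrix.

    # Direct transcription of the recipe above (m measurement types,
    # n trials): mean-center the rows, then use either the SVD of the
    # centered data or the eigen-decomposition of the covariance matrix.
    import numpy as np

    rng = np.random.default_rng(3)
    m, n = 5, 200
    X = rng.normal(size=(m, n)) * np.array([3., 2., 1., .5, .1])[:, None]

    Xc = X - X.mean(axis=1, keepdims=True)        # subtract each row's mean
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components_svd = U                            # principal directions
    variances_svd = s**2 / (n - 1)                # variance along each PC

    C = Xc @ Xc.T / (n - 1)                       # m x m covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)          # same PCs, ascending order
    print(np.allclose(np.sort(eigvals)[::-1], variances_svd))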
Assessing the Impact of Parametric Uncertainty on the Performance of Model-Based PCA
IFAC Proceedings Volumes, 2000
In model-based PCA (MBPCA), principal component analysis is carried out on the residual process measurements that cannot be predicted using a physical model. In principle, this approach can improve the detection and identification of unmeasured disturbances and faults in non-stationary and batch processes. Since process knowledge is required for the implementation of MBPCA, the uncertainty associated with the process model will inevitably affect the achievable performance. This paper presents a method for estimating the impact of parametric uncertainty on the performance of MBPCA and demonstrates its application on the monitoring of a continuous stirred tank reactor.
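As a rough illustration of the MBPCA idea, the sketch below subtracts the predictions of a purely hypothetical physical model from the measurements and applies PCA to the residuals, then monitors new samples with a simple Hotelling's T² on the retained scores. The paper's method for quantifying how parametric uncertainty in the model degrades this monitoring, and the CSTR case study, are not reproduced here.

    # Sketch of model-based PCA: PCA on the residuals between measured
    # outputs and the predictions of a (hypothetical) physical model,
    # with a basic Hotelling's T^2 statistic for monitoring.
    import numpy as np

    def physical_model(u, params):
        # hypothetical stand-in for a first-principles process model
        return u @ params

    rng = np.random.default_rng(4)
    n, n_in, n_out = 300, 3, 6
    params_true = rng.normal(size=(n_in, n_out))
    U_inputs = rng.normal(size=(n, n_in))
    Y = physical_model(U_inputs, params_true) + 0.1 * rng.normal(size=(n, n_out))

    R = Y - physical_model(U_inputs, params_true)   # residuals, normal operation
    Rc = R - R.mean(axis=0)
    _, s, Vt = np.linalg.svd(Rc, full_matrices=False)
    k = 2
    P = Vt[:k].T                                    # retained loadings
    scores = Rc @ P
    lam = scores.var(axis=0, ddof=1)                # variance of each score

    def hotelling_t2(r_new):
        t = (r_new - R.mean(axis=0)) @ P
        return np.sum(t**2 / lam)

    print("T^2 of a typical residual:", hotelling_t2(R[0]))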