A New Algorithm for Computing Disjoint Orthogonal Components in the Parallel Factor Analysis Model with Simulations and Applications to Real-World Data
Related papers
Parallel factor analysis with constraints on the configurations: An overview
The purpose of this paper is to present an overview of recent developments with respect to the use of constraints in conjunction with the Parallel Factor Analysis (PARAFAC) model. Constraints, and the ways they can be incorporated in the estimation process of the model, are reviewed. Emphasis is placed on the relatively new triadic algorithm, which provides a large number of new ways to use the PARAFAC model.
Disjoint factor analysis with cross-loadings
Advances in Data Analysis and Classification, 2016
Disjoint factor analysis (DFA) is a new latent factor model that we propose here to identify factors that relate to disjoint subsets of variables, thus simplifying the loading matrix structure. As in exploratory factor analysis (EFA), DFA does not hypothesize prior information on the number of factors or on the relevant relations between variables and factors. In DFA the population variance-covariance structure is hypothesized to be block diagonal, after a proper permutation of the variables, and is estimated by maximum likelihood using a coordinate descent-type algorithm. Inference on the parameters, on the number of factors, and on the hypothesized simple structure is provided. DFA satisfies properties such as scale equivariance, uniqueness, and optimal simplification of loadings. Relevant cross-loadings are also estimated when they are detected in the best DFA solution. DFA also offers the option to constrain a variable to load on a pre-specified factor, so that the researcher can assume, a priori, some relations between variables and loadings. A simulation study shows the performance of DFA, and an application to optimally identifying the dimensions of wellbeing illustrates the characteristics of the new methodology. A final discussion concludes the paper.
A program for non-orthogonal rotation in factor analysis
TrAC Trends in Analytical Chemistry, 1993
Rotation of initial factors is a very important step in factor analysis. In this article the program OBLIQUE, which performs the nonorthogonal "Oblimin" rotations, ranging from "Quartimin" to "Covarimin", with all the possible intermediate solutions, is described.
2013 IEEE 13th International Conference on Data Mining, 2013
Boolean matrix factorization (BMF), or decomposition, has received considerable attention in data mining research. In this paper, we argue that research should extend beyond the Boolean case toward more general types of data, such as ordinal data. Technically, such an extension amounts to replacing the two-element Boolean algebra utilized in BMF with more general structures, which brings non-trivial challenges. We first present the problem formulation, survey the existing literature, and provide an illustrative example. Second, we present new theorems regarding decompositions of matrices with ordinal data. Third, we propose a new algorithm based on these results, along with an experimental evaluation.
On the Detection of the Correct Number of Factors in Two-Facet Models by Means of Parallel Analysis
Educational and Psychological Measurement, 2021
Methods for optimal factor rotation of two-facet loading matrices have recently been proposed. However, the problem of the correct number of factors to retain for rotation of two-facet loading matrices has rarely been addressed in the context of exploratory factor analysis. Most previous studies were based on the observation that two-facet loading matrices may be rank deficient when the salient loadings of each factor have the same sign. It is shown here that full-rank two-facet loading matrices are, in principle, possible when some factors have positive and negative salient loadings. Accordingly, the current simulation study on the number of factors to extract for two-facet models was based on rank-deficient and full-rank two-facet population models. The number of factors to extract was estimated from traditional parallel analysis based on the mean of the unreduced eigenvalues, as well as from nine other versions of parallel analysis (based on the 95th percentile of eigenvalues, on reduced eigenvalues, and on eigenvalue differences). Parallel analysis based on the mean eigenvalues of the correlation matrix, with the squared multiple correlations of each variable with the remaining variables inserted in the main diagonal, had the highest detection rates for most of the two-facet factor models. Recommendations for the identification of the correct number of factors are based on the simulation results, on the results of an empirical example data set, and on the conditions for approximately rank-deficient and full-rank two-facet models.
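The core idea of parallel analysis described above can be sketched as follows. This is a minimal illustration of the traditional (mean unreduced eigenvalues) variant only, not the SMC-reduced or percentile-based variants studied in the paper; the function name and defaults are hypothetical.

```python
import numpy as np

def parallel_analysis(X, n_sim=100, seed=0):
    """Retain factors whose sample eigenvalues exceed the mean
    eigenvalues of uncorrelated random data of the same shape."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Eigenvalues of the sample correlation matrix, descending
    sample_eig = np.sort(np.linalg.eigvalsh(np.corrcoef(X, rowvar=False)))[::-1]
    # Mean eigenvalues over simulated uncorrelated normal data
    sim_eig = np.zeros(p)
    for _ in range(n_sim):
        R = np.corrcoef(rng.standard_normal((n, p)), rowvar=False)
        sim_eig += np.sort(np.linalg.eigvalsh(R))[::-1]
    sim_eig /= n_sim
    # Count the leading eigenvalues above the random benchmark
    above = sample_eig > sim_eig
    return int(np.argmin(above)) if not above.all() else p
```

Each retained factor corresponds to a leading sample eigenvalue that exceeds what chance correlation alone would produce at the same sample size and dimensionality.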
Matrix-Variate Factor Analysis and Its Applications
IEEE Transactions on Neural Networks, 2000
Factor analysis (FA) seeks to reveal the relationship between an observed vector variable and a latent variable of reduced dimension. It has been widely used in many applications involving high-dimensional data, such as image representation and face recognition. An intrinsic limitation of FA lies in its potentially poor performance when the data dimension is high, a problem known as the curse of dimensionality. Motivated by the fact that images are inherently matrices, we develop, in this brief, an FA model for matrix-variate variables and present an efficient parameter estimation algorithm. Experiments on both toy and real-world image data demonstrate that the proposed matrix-variate FA model is more efficient and accurate than the classical FA approach, especially when the observed variable is high-dimensional and the available samples are limited.
Detecting outlying samples in a parallel factor analysis model
Analytica Chimica Acta, 2011
To explore multi-way data, different methods have been proposed. Here, we study the popular PARAFAC (parallel factor analysis) model, which expresses multi-way data in a more compact way without ignoring the underlying complex structure. To estimate the score and loading matrices, an alternating least squares procedure is typically used. It is, however, well known that least squares techniques suffer from outlying observations, making the models useless when outliers are present in the data. In this paper, we present a robust PARAFAC method. Essentially, it searches for an outlier-free subset of the data, on which the classical PARAFAC algorithm can then be performed. An outlier map is constructed to identify outliers. Simulations and examples show the robustness of our approach.
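The classical alternating least squares fit mentioned above can be sketched for a three-way array as follows. This is the plain (non-robust) ALS, not the robust subset method of the paper; function names and iteration counts are illustrative.

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product of two matrices with R columns each."""
    R = A.shape[1]
    return np.einsum('ir,jr->ijr', A, B).reshape(-1, R)

def parafac_als(X, rank, n_iter=200, seed=0):
    """Fit X[i,j,k] ~ sum_r A[i,r] * B[j,r] * C[k,r] by alternating
    least squares over the three factor matrices."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = rng.standard_normal((I, rank))
    B = rng.standard_normal((J, rank))
    C = rng.standard_normal((K, rank))
    # Mode-wise unfoldings of the tensor
    X0 = X.reshape(I, J * K)
    X1 = np.moveaxis(X, 1, 0).reshape(J, I * K)
    X2 = np.moveaxis(X, 2, 0).reshape(K, I * J)
    for _ in range(n_iter):
        # Each update is an ordinary least squares solve with the
        # other two factors held fixed
        A = X0 @ np.linalg.pinv(khatri_rao(B, C)).T
        B = X1 @ np.linalg.pinv(khatri_rao(A, C)).T
        C = X2 @ np.linalg.pinv(khatri_rao(A, B)).T
    return A, B, C
```

Because every update is a least squares solve, a single grossly outlying entry of X pulls all three factor matrices, which is precisely the vulnerability the robust subset approach addresses.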
Multiple factor analysis: multi-table principal component analysis
2013
Multiple factor analysis (MFA, also called multiple factorial analysis) is an extension of principal component analysis (PCA) tailored to handle multiple data tables that measure sets of variables collected on the same observations, or, alternatively, (in dual-MFA) multiple data tables where the same variables are measured on different sets of observations. MFA proceeds in two steps: First it computes a PCA of each data table and 'normalizes' each data table by dividing all its elements by the first singular value obtained from its PCA. Second, all the normalized data tables are aggregated into a grand data table that is analyzed via a (non-normalized) PCA that gives a set of factor scores for the observations and loadings for the variables. In addition, MFA provides for each data table a set of partial factor scores for the observations that reflects the specific 'view-point' of this data table. Interestingly, the common factor scores could be obtained by replacing the original normalized data tables by the normalized factor scores obtained from the PCA of each of these tables. In this article, we present MFA, review recent extensions, and illustrate it with a detailed example.
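The two-step MFA procedure described above can be sketched as follows. This assumes each table is already column-centered and runs the global PCA via an SVD; the function name is hypothetical and partial factor scores per table are omitted.

```python
import numpy as np

def mfa(tables):
    """Multiple factor analysis: normalize each table by its first
    singular value, then run one PCA on the concatenated grand table."""
    # Step 1: scale each table so its first singular value equals 1,
    # preventing any single table from dominating the global solution
    normalized = [T / np.linalg.svd(T, compute_uv=False)[0] for T in tables]
    # Step 2: non-normalized PCA (via SVD) of the grand table
    grand = np.hstack(normalized)
    U, s, Vt = np.linalg.svd(grand, full_matrices=False)
    factor_scores = U * s   # common factor scores for the observations
    loadings = Vt.T         # loadings for all variables
    return factor_scores, loadings
```

The first-singular-value normalization plays the same role as block weighting in other multi-table methods: each table contributes at most unit inertia along its own first principal direction.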
1987
In this paper we discuss the problem of factor analysis from the Bayesian viewpoint. First, the classical factor analysis model is generalized in several directions. Then, prior distributions are adopted for the parameters of the generalized model and posterior dis-