A comparison of isometric and amalgamation logratio balances in compositional data analysis

Aitchison's Compositional Data Analysis 40 Years On: A Reappraisal

2022

The development of John Aitchison's approach to compositional data analysis is traced from the paper he read to the Royal Statistical Society in 1982. Aitchison's logratio approach, which was proposed to solve the problematic aspects of working with data with a fixed sum constraint, is summarized and reappraised. It is maintained that the principles on which this approach was originally built, the main one being subcompositional coherence, are not required to be satisfied exactly: quasi-coherence is sufficient in practice. This opens up the field to simpler data transformations with easier interpretations, and makes variable selection possible so that results are more parsimonious. The additional principle of exact isometry, which was introduced subsequently and was not part of Aitchison's original conception, imposed the use of isometric logratio transformations, but these have been shown to be problematic to interpret. If this principle is regarded as important, it can b...

Groups of Parts and Their Balances in Compositional Data Analysis

Mathematical Geology, 2005

Amalgamation of parts of a composition has been extensively used as a technique of analysis to achieve reduced dimension, as was discussed during the CoDaWork'03 meeting (Girona, Spain, 2003). It was shown to be a non-linear operation in the simplex that does not preserve distances under perturbation. The discussion motivated the introduction in the present paper of concepts such as group of parts, balance between groups, and sequential binary partition, which are intended to provide tools of compositional data analysis for dimension reduction. Key concepts underlying this development are the established tools of subcomposition, coordinates in an orthogonal basis of the simplex, balancing element and, in general, the Aitchison geometry in the simplex. The main new results are: a method to analyze grouped parts of a compositional vector through suitable coordinates in an ad hoc orthonormal basis; and the study of balances of groups of parts (inter-group analysis) as an orthogonal projection similar to that used in standard subcompositional analysis (intra-group analysis). A simulated example compares results when testing equal centers of two populations using amalgamated parts and balances; it shows that, in certain circumstances, the results of the two analyses can disagree.
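The contrast drawn in this abstract between amalgamated parts and balances can be illustrated with a minimal sketch. It assumes the usual definition of a balance between two groups as a normalized logratio of their geometric means, and compares it with the logratio of the amalgamated (summed) groups. The compositions `x`, `y`, the perturbation `p` and the grouping are hypothetical values chosen for illustration, not data from the paper.

```python
import numpy as np

def closure(x):
    """Rescale a positive vector so that its parts sum to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum()

def perturb(x, p):
    """Perturbation in the simplex: part-wise product followed by closure."""
    return closure(np.asarray(x, dtype=float) * np.asarray(p, dtype=float))

def balance(x, g1, g2):
    """Normalized logratio of the geometric means of two groups of parts."""
    x = np.asarray(x, dtype=float)
    r, s = len(g1), len(g2)
    log_gm1 = np.log(x[g1]).mean()
    log_gm2 = np.log(x[g2]).mean()
    return np.sqrt(r * s / (r + s)) * (log_gm1 - log_gm2)

def amalgamation_logratio(x, g1, g2):
    """Logratio of the summed (amalgamated) groups of parts."""
    x = np.asarray(x, dtype=float)
    return np.log(x[g1].sum() / x[g2].sum())

# Hypothetical four-part compositions and a hypothetical perturbation.
x = closure([0.1, 0.2, 0.3, 0.4])
y = closure([0.4, 0.3, 0.2, 0.1])
p = closure([1.0, 5.0, 2.0, 0.5])
g1, g2 = [0, 1], [2, 3]

# Difference between x and y, before and after perturbing both by p.
bal_gap = balance(x, g1, g2) - balance(y, g1, g2)
bal_gap_p = balance(perturb(x, p), g1, g2) - balance(perturb(y, p), g1, g2)
amal_gap = amalgamation_logratio(x, g1, g2) - amalgamation_logratio(y, g1, g2)
amal_gap_p = (amalgamation_logratio(perturb(x, p), g1, g2)
              - amalgamation_logratio(perturb(y, p), g1, g2))

print("balance gap:     ", bal_gap, bal_gap_p)
print("amalgamation gap:", amal_gap, amal_gap_p)
```

In this sketch the balance difference is unchanged by the perturbation, because a balance is linear in the logarithms of the parts, whereas the amalgamation logratio difference changes. This is one way to see the non-preservation of distances that the abstract describes.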

Compositional data and their analysis: an introduction

Geological Society, London, Special Publications, 2006

Compositional data are those which contain only relative information. They are parts of some whole. In most cases they are recorded as closed data, i.e. data summing to a constant such as 100%, whole-rock geochemical data being classic examples. Compositional data have important and particular properties that preclude the application of standard statistical techniques to such data in raw form. Standard techniques are designed to be used with data that are free to range from −∞ to +∞. Compositional data are always positive and range only from 0 to 100, or any other constant, when given in closed form. If one component increases, others must, perforce, decrease, whether or not there is a genetic link between these components. This means that the results of standard statistical analysis of the relationships between raw components or parts in a compositional dataset are clouded by spurious effects. Although such analyses may give apparently interpretable results, they are, at best, approximations and need to be treated with considerable circumspection. The methods outlined in this volume are based on the premise that it is the relative variation of components which is of interest, rather than absolute variation. Log-ratios of components provide the natural means of studying compositional data. In this contribution the basic terms and operations are introduced using simple numerical examples to illustrate their computation and to familiarize the reader with their use.
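A simple numerical example in the spirit of this abstract is sketched below. The three-part vector `raw` is a hypothetical measurement, and `closure` is an illustrative helper name, not taken from the paper. The sketch shows that closing the data to 100, or re-closing a two-part subcomposition, changes the individual percentages but leaves the log-ratio of two parts untouched.

```python
import numpy as np

def closure(x, total=100.0):
    """Rescale a positive vector so that its parts sum to `total`."""
    x = np.asarray(x, dtype=float)
    return total * x / x.sum()

# Hypothetical raw amounts of three components (arbitrary units).
raw = np.array([12.0, 30.0, 18.0])

closed = closure(raw)          # full composition, closed to 100
sub = closure(raw[:2])         # subcomposition of the first two parts

# The percentage of part 1 depends on which parts are included...
print("part 1 in full composition:", closed[0])
print("part 1 in subcomposition:  ", sub[0])

# ...but the log-ratio of part 1 to part 2 carries only relative
# information and is therefore the same in all three versions.
lr_raw = np.log(raw[0] / raw[1])
lr_closed = np.log(closed[0] / closed[1])
lr_sub = np.log(sub[0] / sub[1])
print("log-ratios:", lr_raw, lr_closed, lr_sub)
```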

Isometric Logratio Transformations for Compositional Data Analysis

2003

Geometry in the simplex has been developed in the last 15 years mainly based on the contributions due to J. Aitchison. The main goal was to develop analytical tools for the statistical analysis of compositional data. Our present aim is to get a further insight into some aspects of this geometry in order to clarify the way for more complex statistical approaches. This is done by way of orthonormal bases, which allow for a straightforward handling of geometric elements in the simplex. The transformation into real coordinates preserves all metric properties and is thus called isometric logratio transformation (ilr). An important result is the decomposition of the simplex, as a vector space, into orthogonal subspaces associated with nonoverlapping subcompositions. This gives the key to join compositions with different parts into a single composition by using a balancing element. The relationship between ilr transformations and the centered-logratio (clr) and additive-logratio (alr) transformations is also studied. Exponential growth or decay of mass is used to illustrate compositional linear processes, parallelism and orthogonality in the simplex.
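The relationship between the three transformations named in this abstract can be sketched as follows. The orthonormal basis built by `pivot_basis` is one possible choice (a pivot, or Helmert-type, construction), assumed here for illustration and not necessarily the basis used in the paper. The compositions `x` and `y` are hypothetical.

```python
import numpy as np

def clr(x):
    """Centered logratio: log parts minus their mean log."""
    logx = np.log(np.asarray(x, dtype=float))
    return logx - logx.mean()

def alr(x):
    """Additive logratio with the last part as the reference."""
    x = np.asarray(x, dtype=float)
    return np.log(x[:-1] / x[-1])

def pivot_basis(d):
    """One orthonormal basis (in clr space) for d-part compositions.

    Column i contrasts the first i parts against part i + 1.
    """
    v = np.zeros((d, d - 1))
    for i in range(1, d):
        v[:i, i - 1] = 1.0 / i
        v[i, i - 1] = -1.0
        v[:, i - 1] *= np.sqrt(i / (i + 1.0))
    return v

def ilr(x, v):
    """Isometric logratio coordinates with respect to the basis v."""
    return clr(x) @ v

# Hypothetical four-part compositions.
x = np.array([0.1, 0.2, 0.3, 0.4])
y = np.array([0.4, 0.3, 0.2, 0.1])
v = pivot_basis(len(x))

# The Aitchison distance equals the Euclidean distance between clr vectors.
d_clr = np.linalg.norm(clr(x) - clr(y))
d_ilr = np.linalg.norm(ilr(x, v) - ilr(y, v))
d_alr = np.linalg.norm(alr(x) - alr(y))

print("clr distance:", d_clr)
print("ilr distance:", d_ilr)   # matches the clr distance
print("alr distance:", d_alr)   # generally differs
```

The ilr coordinates reproduce the clr (Aitchison) distance because the basis columns are orthonormal and span the zero-sum subspace in which clr vectors live. The alr coordinates correspond to an oblique basis, so Euclidean distances computed on them generally differ.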

On interpretations of tests and effect sizes in regression models with a compositional predictor

SORT, 2020

Compositional data analysis is concerned with the relative importance of positive variables, expressed through their log-ratios. The literature has proposed a range of ways to compute log-ratios, some of whose interrelationships have never been reported when used as explanatory variables in regression models. This article shows their similarities and differences in interpretation based on the notion that one log-ratio has to be interpreted keeping all others constant. The article shows that centred, additive, pivot, balance and pairwise log-ratios lead to simple reparametrizations of the same model which can be combined to provide useful tests and comparable effect size estimates.
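The reparametrization idea can be illustrated with a minimal sketch on simulated data. Everything here is an assumption made for illustration: the Dirichlet-generated compositions, the response model, the pivot basis, and the helper name `ols_fit`. It fits ordinary least squares once with additive log-ratios and once with pivot (ilr) coordinates as predictors, and compares the fitted values.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 4

# Simulated (hypothetical) compositional predictor with d parts.
parts = rng.dirichlet(np.ones(d), size=n)
logp = np.log(parts)

# Additive log-ratios, with the last part as the reference.
alr = logp[:, :-1] - logp[:, [-1]]

# Pivot (ilr) coordinates from an orthonormal basis in clr space.
clr = logp - logp.mean(axis=1, keepdims=True)
basis = np.zeros((d, d - 1))
for i in range(1, d):
    basis[:i, i - 1] = 1.0 / i
    basis[i, i - 1] = -1.0
    basis[:, i - 1] *= np.sqrt(i / (i + 1.0))
ilr = clr @ basis

# Simulated response; the coefficients are arbitrary.
y = 1.0 + 0.5 * alr[:, 0] - 0.3 * alr[:, 1] + rng.normal(scale=0.1, size=n)

def ols_fit(z, y):
    """Least-squares fit with an intercept; returns coefficients and fits."""
    design = np.column_stack([np.ones(len(y)), z])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef, design @ coef

coef_alr, fitted_alr = ols_fit(alr, y)
coef_ilr, fitted_ilr = ols_fit(ilr, y)

print("alr coefficients:", coef_alr)
print("ilr coefficients:", coef_ilr)
print("same fitted values:", np.allclose(fitted_alr, fitted_ilr))
```

Both sets of coordinates are invertible linear functions of each other, so the two fits describe the same model and yield the same fitted values. Only the individual slopes, and hence their interpretation, change.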

Taxicab Correspondence Analysis and Taxicab Logratio Analysis: A Comparison on Contingency Tables and Compositional Data

Austrian Journal of Statistics

In this paper, we attempt to see further by relating theory with practice: First, we review the principles on which three interrelated, well-developed methods for the analysis and visualization of contingency tables and compositional data are built: correspondence analysis based on Benzécri’s principle of distributional equivalence, Goodman’s RC association model based on Yule’s principle of scale invariance, and compositional data analysis based on Aitchison’s principle of subcompositional coherence. Second, we introduce a novel index, named the intrinsic measure of the quality of the signs of the residuals, for the choice of the method. The criterion is based on taxicab singular value decomposition, on which the package TaxicabCA in R is developed. We present a minimal R script that can be executed to obtain the numerical results and the maps in this paper. Third, we introduce a flexible method based on the novel index for the choice of the constant to be added to contingency tables wit...