Working with log-ratio coordinates in coda.base
The additive logratio (alr) coordinates
The alr coordinates are accessible by setting the parameter basis='alr' or by using the basis-building function alr_basis().
If you are happy to use the last part as the denominator, the easiest way to define alr coordinates is to set basis='alr':
H1.alr = coordinates(X, basis = 'alr')
head(H1.alr)
#> alr1 alr2 alr3
#> 1 0.23864536 0.446503630 -0.7201917
#> 2 -0.10388120 0.216858085 -1.0473730
#> 3 0.36723896 0.542010167 -0.5320675
#> 4 0.53209369 0.798479995 -0.4799141
#> 5 0.54918649 0.477309280 -0.1028807
#> 6 -0.09742133 0.002856425 -0.6858265
This defines alr coordinates where the last part is used as the denominator. We can obtain the basis used to build the coordinates with the function basis():
basis(H1.alr)
#> alr1 alr2 alr3
#> erc 1 0 0
#> jxcat 0 1 0
#> psc 0 0 1
#> cs -1 -1 -1
The basis can be reproduced using the function alr_basis:
alr_basis(dim = 4)
#> alr1 alr2 alr3
#> c1 1 0 0
#> c2 0 1 0
#> c3 0 0 1
#> c4 -1 -1 -1
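As a minimal base-R sketch of what this basis does, the snippet below multiplies the log of a composition by a hand-written alr basis and checks that the result equals the log-ratios of the first three parts over the last one. The toy matrix `X_toy` is hypothetical data used only for illustration; it is not the data set used above, and no coda.base function is called.

```r
# Toy compositions (hypothetical data): 3 samples, 4 parts.
X_toy <- matrix(c(0.10, 0.20, 0.30, 0.40,
                  0.40, 0.30, 0.20, 0.10,
                  0.25, 0.25, 0.25, 0.25),
                ncol = 4, byrow = TRUE)

# alr basis with the last part in the denominator, written by hand.
B <- rbind(diag(3), rep(-1, 3))

# Coordinates as log-contrasts: log(X) %*% B.
H_matrix <- log(X_toy) %*% B

# The same coordinates written directly as log-ratios x_i / x_4.
H_ratio <- log(X_toy[, 1:3] / X_toy[, 4])

all.equal(H_matrix, H_ratio)
```

Each column of the basis holds the coefficients of one log-contrast, which is why a +1 in the numerator row and a -1 in the denominator row is enough to express a log-ratio.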
In fact, the function alr_basis can define any alr-like coordinates by specifying the numerator parts and the denominator part:
B.alr = alr_basis(dim = 4, numerator = c(4,2,3), denominator = 1)
B.alr
#> alr1 alr2 alr3
#> c1 -1 -1 -1
#> c2 0 1 0
#> c3 0 0 1
#> c4 1 0 0
The log-contrast matrix can then be passed as the basis parameter of the coordinates() function:
H2.alr = coordinates(X, basis = B.alr)
basis(H2.alr)
#> alr1 alr2 alr3
#> c1 -1 -1 -1
#> c2 0 1 0
#> c3 0 0 1
#> c4 1 0 0
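The matrix printed above can be read column by column: alr1 is log(x4/x1), alr2 is log(x2/x1) and alr3 is log(x3/x1). A small base-R sketch of that reading follows, using a hypothetical toy composition `X_toy` and a hand-built matrix rather than any coda.base call:

```r
# Toy compositions (hypothetical data): 3 samples, 4 parts.
X_toy <- matrix(c(0.10, 0.20, 0.30, 0.40,
                  0.40, 0.30, 0.20, 0.10,
                  0.25, 0.25, 0.25, 0.25),
                ncol = 4, byrow = TRUE)

numerator   <- c(4, 2, 3)  # part used as numerator of each coordinate
denominator <- 1           # common denominator part

# Build the log-contrast matrix by hand: +1 for the numerator part of
# each column, -1 in the denominator row.
B <- matrix(0, nrow = 4, ncol = 3)
B[cbind(numerator, 1:3)] <- 1
B[denominator, ] <- -1
B

# The coordinates are the corresponding log-ratios.
H <- log(X_toy) %*% B
all.equal(H, log(X_toy[, numerator] / X_toy[, denominator]))
```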
The centered logratio (clr) coordinates
Centered log-ratio coordinates can be built by setting the parameter basis='clr':
H.clr = coordinates(X, basis = 'clr')
head(H.clr)
#> clr1 clr2 clr3 clr4
#> 1 0.24740605 0.4552643 -0.7114311 0.008760689
#> 2 0.12971783 0.4504571 -0.8137740 0.233599031
#> 3 0.27294355 0.4477148 -0.6263629 -0.094295406
#> 4 0.31942879 0.5858151 -0.6925790 -0.212664904
#> 5 0.31828271 0.2464055 -0.3337844 -0.230903777
#> 6 0.09767651 0.1979543 -0.4907286 0.195097842
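Each clr coordinate is the log of a part divided by the geometric mean of all parts of the same sample, so every row sums to zero (the first row printed above does, up to rounding). A minimal base-R sketch on a hypothetical toy composition `X_toy`:

```r
# Toy compositions (hypothetical data): 3 samples, 4 parts.
X_toy <- matrix(c(0.10, 0.20, 0.30, 0.40,
                  0.40, 0.30, 0.20, 0.10,
                  0.25, 0.25, 0.25, 0.25),
                ncol = 4, byrow = TRUE)

# clr: subtract the row mean of the logs, i.e. divide each part by the
# geometric mean of its row before taking logs.
clr <- log(X_toy) - rowMeans(log(X_toy))

# Equivalent formulation with an explicit geometric mean.
gm <- exp(rowMeans(log(X_toy)))
all.equal(clr, log(X_toy / gm))

# Rows sum to zero up to floating-point error.
rowSums(clr)
```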
The isometric logratio (ilr) coordinates
coda.base can define a wide variety of ilr coordinates: principal component (PC) coordinates, user-specified balance coordinates, principal balance (PB) coordinates, and balanced coordinates (CoDaPack's default coordinates).
The default ilr coordinates used by coda.base are obtained by simply calling the function coordinates without a basis parameter:
H1.ilr = coordinates(X)
head(H1.ilr)
#> ilr1 ilr2 ilr3
#> 1 -0.14697799 0.8677450 -0.01011597
#> 2 -0.22679692 0.9012991 -0.26973693
#> 3 -0.12358191 0.8056307 0.10888296
#> 4 -0.18836356 0.9350526 0.24556428
#> 5 0.05082486 0.5030669 0.26662472
#> 6 -0.07090708 0.5213690 -0.22527958
The parameter basis is set to 'ilr' by default:
all.equal(coordinates(X, basis = 'ilr'), H1.ilr)
#> [1] TRUE
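The default basis itself is not printed here. As a sketch, assuming it is the sequential orthonormal basis in which coordinate i compares the geometric mean of the first i parts with part i+1, the base-R code below rebuilds such a basis by hand. The helper `seq_ilr_basis` is hypothetical and not part of coda.base. Applied to the first clr row printed earlier, it gives values that agree with the first row of H1.ilr up to rounding, which supports the assumption for this four-part example.

```r
# Hypothetical helper (not a coda.base function): sequential
# orthonormal basis for D parts.
seq_ilr_basis <- function(D) {
  B <- matrix(0, nrow = D, ncol = D - 1)
  for (i in seq_len(D - 1)) {
    B[1:i, i]   <-  sqrt(i / (i + 1)) / i  # first i parts, averaged
    B[i + 1, i] <- -sqrt(i / (i + 1))      # part i + 1
  }
  B
}

B <- seq_ilr_basis(4)

# The columns are orthonormal, which is what makes the map isometric.
all.equal(crossprod(B), diag(3))

# First clr row, copied from the clr output above.
clr_row1 <- c(0.24740605, 0.4552643, -0.7114311, 0.008760689)

# Approximately -0.1470, 0.8677, -0.0101.
ilr_row1 <- drop(clr_row1 %*% B)
ilr_row1
```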
Other ilr-coordinates: Principal Components and Principal balances
Other easily accessible coordinates are the Principal Component (PC) coordinates. The first PC coordinate is the log-contrast with the highest variance, the second is the log-contrast with the highest variance among those uncorrelated with the first, and so on:
H2.ilr = coordinates(X, basis = 'pc')
head(H2.ilr)
#> pc1 pc2 pc3
#> 1 -0.6787536 0.35694598 0.4319368
#> 2 -0.5581520 0.57775877 0.5396259
#> 3 -0.7013616 0.25302877 0.3467523
#> 4 -0.8973701 0.25915667 0.3125234
#> 5 -0.5362270 -0.05527103 0.1901418
#> 6 -0.2676101 0.32802497 0.3852126
barplot(apply(H2.ilr, 2, var))
Note that the PC coordinates are uncorrelated (the off-diagonal covariances are zero up to floating-point error):
cov(H2.ilr)
#> pc1 pc2 pc3
#> pc1 4.475083e-01 1.036012e-16 1.997487e-16
#> pc2 1.036012e-16 3.650673e-02 -3.031068e-17
#> pc3 1.997487e-16 -3.031068e-17 1.257989e-02
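The diagonal covariance can be illustrated with a base-R sketch, assuming that PC coordinates correspond to an ordinary principal component analysis of the clr-transformed data. This is a stand-in for illustration on simulated data (`X_sim` is hypothetical), not a description of how coda.base computes its basis internally.

```r
set.seed(1)
# Simulated positive data (hypothetical): 200 samples, 4 parts.
X_sim <- exp(matrix(rnorm(200 * 4), ncol = 4))

# clr transform.
clr <- log(X_sim) - rowMeans(log(X_sim))

# Eigen-decomposition of the clr covariance. The last eigenvalue is
# numerically zero because clr rows always sum to zero.
e <- eigen(cov(clr), symmetric = TRUE)
B_pc <- e$vectors[, 1:3]

# Project onto the first D - 1 eigenvectors.
H_pc <- clr %*% B_pc

# Off-diagonal entries vanish; the diagonal holds decreasing variances.
round(cov(H_pc), 10)
```

Because the retained eigenvectors are orthogonal to the constant vector, each column of `B_pc` sums to zero and is therefore a valid log-contrast.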
The Principal Balance (PB) coordinates are similar to PC coordinates, but with the restriction that the log-contrasts must be balances:
H3.ilr = coordinates(X, basis = 'pb')
head(H3.ilr)
#> pb1 pb2 pb3
#> 1 -0.7026704 -0.14697799 -0.50925247
#> 2 -0.5801749 -0.22679692 -0.74060456
#> 3 -0.7206583 -0.12358191 -0.37622854
#> 4 -0.9052439 -0.18836356 -0.33935049
#> 5 -0.5646882 0.05082486 -0.07274761
#> 6 -0.2956308 -0.07090708 -0.48495254
barplot(apply(H3.ilr, 2, var))
Unlike the PC coordinates, principal balances are generally correlated:
cor(H3.ilr)
#> pb1 pb2 pb3
#> pb1 1.0000000 0.6043786 -0.3197742
#> pb2 0.6043786 1.0000000 0.1594538
#> pb3 -0.3197742 0.1594538 1.0000000
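To make the balance restriction concrete, here is a base-R sketch that finds the first principal balance of a small simulated data set by brute force: it enumerates every way of assigning parts to the numerator, the denominator or neither, and keeps the balance with the largest variance. The helper `balance_vector` and the data `X_sim` are hypothetical, and exhaustive enumeration is used only because the example has four parts; it is not presented as the algorithm coda.base uses.

```r
set.seed(1)
# Simulated positive data (hypothetical): 200 samples, 4 parts.
X_sim <- exp(matrix(rnorm(200 * 4), ncol = 4))
D <- ncol(X_sim)

# Hypothetical helper: unit-norm log-contrast of the balance between
# the parts in `num` and the parts in `den`.
balance_vector <- function(D, num, den) {
  r <- length(num); s <- length(den)
  v <- numeric(D)
  v[num] <-  sqrt(s / (r * (r + s)))
  v[den] <- -sqrt(r / (s * (r + s)))
  v
}

# All assignments of parts to +1 (numerator), -1 (denominator), 0 (unused),
# keeping those with at least one part on each side.
signs <- as.matrix(expand.grid(rep(list(c(-1, 0, 1)), D)))
keep  <- rowSums(signs == 1) > 0 & rowSums(signs == -1) > 0
signs <- signs[keep, , drop = FALSE]

# One candidate balance per column of V.
V <- apply(signs, 1, function(s) balance_vector(D, which(s == 1), which(s == -1)))

# Variance of each candidate balance; the largest is the first PB.
vars <- apply(log(X_sim) %*% V, 2, var)
best <- which.max(vars)
signs[best, ]
vars[best]
```

Since every balance is a unit-norm log-contrast, its variance can never exceed that of the first principal component. The number of candidates grows roughly as 3^D, which makes this search impractical for many parts.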
Principal balances are hard to compute when the number of components is very high. coda.base provides several algorithms that build approximate principal balances.
X100 = exp(matrix(rnorm(1000*100), ncol = 100))
- Hierarchical clustering based algorithm.
PB1.ward = pb_basis(X100, method = 'cluster')
- Constrained search algorithm.
PB1.constrained = pb_basis(X100, method = 'constrained')
We can compare their performance (the variance explained by the first balance) with that of the first principal component.
PC_approx = coordinates(X100, cbind(pc_basis(X100)[,1], PB1.ward[,1], PB1.constrained[,1]))
colnames(PC_approx) = c('PC', 'Ward', 'Constrained')
apply(PC_approx, 2, var)
#> PC Ward Constrained
#> 1.702951 1.418563 1.582705
Finally, coda.base can build the default CoDaPack basis, which consists of well-balanced balances, i.e. balances with the same number of parts on each branch whenever possible.
H4.ilr = coordinates(X, basis = 'cdp')
head(H4.ilr)
#> ilr1 ilr2 ilr3
#> 1 0.7026704 -0.14697799 -0.50925247
#> 2 0.5801749 -0.22679692 -0.74060456
#> 3 0.7206583 -0.12358191 -0.37622854
#> 4 0.9052439 -0.18836356 -0.33935049
#> 5 0.5646882 0.05082486 -0.07274761
#> 6 0.2956308 -0.07090708 -0.48495254
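For four parts, the values printed above are consistent with the partition {1, 2} against {3, 4}, followed by 1 against 2 and 3 against 4. The base-R sketch below builds those three balances by hand under that assumption; the helper `balance_vector` is hypothetical and not part of coda.base. Applied to the first clr row printed earlier, it gives values that agree with the first row of H4.ilr up to rounding.

```r
# Hypothetical helper: unit-norm log-contrast of the balance between
# the parts in `num` and the parts in `den`.
balance_vector <- function(D, num, den) {
  r <- length(num); s <- length(den)
  v <- numeric(D)
  v[num] <-  sqrt(s / (r * (r + s)))
  v[den] <- -sqrt(r / (s * (r + s)))
  v
}

# Assumed partition for 4 parts: {1,2} vs {3,4}, then 1 vs 2, then 3 vs 4.
B_cdp <- cbind(balance_vector(4, c(1, 2), c(3, 4)),
               balance_vector(4, 1, 2),
               balance_vector(4, 3, 4))

# The three balances form an orthonormal basis.
all.equal(crossprod(B_cdp), diag(3))

# First clr row, copied from the clr output above.
clr_row1 <- c(0.24740605, 0.4552643, -0.7114311, 0.008760689)

# Approximately 0.7027, -0.1470, -0.5093.
cdp_row1 <- drop(clr_row1 %*% B_cdp)
cdp_row1
```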