Constraint Transforms (original) (raw)
There’s at least one typo in this algorithm. The square root of y. Also, I found it incredible difficult to understand it. In the end, what I believe it is saying it is doing is the following:
vector sum_to_zero_transform(vector latent_variables){
int N = rows(latent_variables) ;
vector[N] seq_1N ;
{
for(j in 1:N) seq_1N[j] = j ;
}
vector[N] denom = sqrt(seq_1N.*(seq_1N+1)) ;
vector[N+1] sum_to_zero_vector ;
{
sum_to_zero_vector = append_row(
reverse(cumulative_sum(reverse(latent_variables)./reverse(denom))),
0) - append_row(0,seq_1N.*latent_variables./denom) ;
}
return sum_to_zero_vector ;
}
It works for me.
spinkney April 30, 2025, 2:23pm 2
That looks about right. But it can be simplified and made faster without all the reverse and extra allocations that append_row does.
vector sum_to_zero_constrain(vector y) {
int N = num_elements(y);
vector[N + 1] z = zeros_vector(N + 1);
real sum_w = 0;
for (ii in 1:N) {
int i = N - ii + 1;
real n = i;
real w = y[i] * inv_sqrt(n * (n + 1));
sum_w += w;
z[i] += sum_w;
z[i + 1] -= w * n;
}
Maybe we just put this in the docs.
On a side note, I’m going to add the docs for the sum to zero matrix which will be in the next version. I’ll try and clarify the vector case. It’s the same transform that gets applied at each additional dimension and can be expressed as tensor products of Helmert encodings. My hope is that if a user wants a 3d or more sum to zero object they can “easily” use the doc to make it. Easy being relative here.
Thanks a lot. I hope the typos in the doc can be fixed soon as well.
Edit: just realized your example code is missing its return statement and closing brackets.
spinkney April 30, 2025, 2:47pm 4
Sorry, I copied and pasted. Yes, will look into it!
Code that works is part of the sum_to_zero_vector case study: The Sum-to-Zero Constraint in Stan
/**
* Constrain sum-to-zero vector
*
* @param y unconstrained zero-sum parameters
* @return vector z, the vector whose elements sum to zero
*/
vector zero_sum_constrain(vector y) {
int N = num_elements(y);
vector[N + 1] z = zeros_vector(N + 1);
real sum_w = 0;
for (ii in 1:N) {
int i = N - ii + 1;
real n = i;
real w = y[i] * inv_sqrt(n * (n + 1));
sum_w += w;
z[i] += sum_w;
z[i + 1] -= w * n;
}
return z;
}
Thanks for providing the notes. Stan’s documentation is so immense (a great thing) it can be difficult to find everything that is needed.
My main reason for providing this post in the Devs area is that while the example code in your case study and the code that’s be kindly provided to me is correct, the documentation on the original link provided is not.
Thanks for pointing this out. For things like this, if you have a github account you can also open a pull request with the typo correction in the markdown source of our documentation, it can speed up when these things get fixed by quite a bit when it’s just clicking approve on our end