Shape mismatch when running predictions with Gaussian Processes
March 11, 2025, 11:49am 1
Hello, PyMC-community!
I am trying to build a model, one of the components being a Gaussian process.
However, when I try to get out-of-sample predictions with pm.sample_posterior_predictive(idata), I get a shape mismatch error: Input dimension mismatch: (input[1].shape[0] = 5, input[2].shape[0] = 10)
I have already looked at the solutions in other threads about shape mismatches with pm.sample_posterior_predictive() (e.g. here and here), such as adding dims, changing the shapes of all variables, and setting shape=mu.shape on the observed variable, but none of them seem to work for me.
The code is here:
import numpy as np
import pymc as pm

integer_time = np.arange(0, 10)
future_integer_time = np.arange(0, 5)

with pm.Model() as model:
    x = pm.MutableData(name="x", value=[1, 5, 6, 4, 2, 3, 4, 7, 8, 9], dims='obs_id')
    y = pm.MutableData(name="y", value=[11, 52, 61, 43, 22, 34, 41, 76, 83, 92], dims='obs_id')  # y = 10*x + noise, here always positive

    beta = pm.Normal("beta", mu=0, sigma=500)
    sigma = pm.HalfNormal("sigma", sigma=500)

    eta = pm.HalfNormal(name='eta', sigma=10)
    ls = pm.HalfNormal(name='ls', sigma=10)
    cov_func = eta**2 * pm.gp.cov.Matern52(1, ls=ls)
    gp = pm.gp.HSGP(m=[30], c=1.2, cov_func=cov_func)
    gp_rv = gp.prior('gp_rv', integer_time[:, None])

    mu = pm.Deterministic('mu', beta * x + gp_rv, dims='obs_id')
    obs = pm.Normal("obs", mu=mu, sigma=sigma, observed=y, dims='obs_id')  # also tried to add shape=mu.shape, but does not seem to work

    idata = pm.sample()

# Want to get mu predictions for new data
with model:
    pm.set_data({'x': [4, 7, 3, 9, 1]})  # also tried adding 'y': [0, 0, 0, 0, 0]
    gp_rv_pred = gp.conditional("gp_rv_pred", Xnew=future_integer_time[:, None])
    posterior_pred = pm.sample_posterior_predictive(idata)
Thanks so much in advance!
PS. Using PyMC 5.19.0, but I also tried later versions.
Hey @aabugaev, have you tried making integer_time[:, None] a pm.Data() object and then updating it in your pm.set_data() call with future_integer_time[:, None]?
Also, this example is a really good reference when using HSGPs.
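For intuition, here is a plain-numpy sketch (stand-in values, not PyMC code) of why the error appears: the HSGP prior was evaluated at the original 10 time points, so after pm.set_data() shrinks x to 5 values, mu = beta * x + gp_rv tries to combine arrays of length 5 and 10, which is exactly the 5-vs-10 mismatch in the error message:

```python
import numpy as np

# Stand-ins, not real posterior values: gp_rv plays the role of gp.prior
# evaluated at the original 10 time points; x_new is the updated 5-point x.
gp_rv = np.zeros(10)
x_new = np.array([4, 7, 3, 9, 1], dtype=float)
beta = 10.0

try:
    mu = beta * x_new + gp_rv  # shapes (5,) and (10,) cannot broadcast
except ValueError as err:
    print(err)  # numpy's analogue of the 5-vs-10 input dimension mismatch
```

Making the time points a data container, as suggested above, keeps both shapes in sync when you call pm.set_data().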
Hi, from reading your post, it looks like you may be after some sort of post-training validation; if I misread, I apologize.
One thing that has helped me tremendously in my own workflow is to do something very similar to @Dekermanjian's recommendation: name your spatial and temporal variables, train your model on your train set, and then, after performing your inference/checks, extend your original data. This notebook, Gaussian Processes: HSGP Advanced Usage — PyMC example gallery, has an example of doing that. Here's the most relevant piece of code, which runs after you have trained your original model:
with model:
    pm.set_data({"X": x_full[:, None]})
    idata.extend(
        pm.sample_posterior_predictive(
            idata,
            var_names=["f_mu", "f"],
            predictions=True,
            compile_kwargs={"mode": "NUMBA"},
            random_seed=rng,
        ),
    )
Here, you can set the training data container "X" to contain all of your points.
I realize I'm abstracting away probably the most important bits, like defining your Kronecker structure, defining hyperparameters, etc., but if you're after some code to help you work backwards, this block might serve as a good reference!
aabugaev March 12, 2025, 11:39am 4
@Dekermanjian @brontidon thanks so much, this helped! Indeed, putting integer_time into a MutableData container makes it work.
Here’s the model I ended up with:
with pm.Model() as model:
    integer_time = pm.MutableData(name="integer_time", value=np.arange(0, 10), dims='obs_id')  # make sure you pass time as a variable
    x = pm.MutableData(name="x", value=[1, 5, 6, 4, 2, 3, 4, 7, 8, 9], dims='obs_id')
    y = pm.MutableData(name="y", value=[11, 52, 61, 43, 22, 34, 41, 76, 83, 92], dims='obs_id')  # y = 10*x + noise, here always positive

    beta = pm.Normal("beta", mu=0, sigma=500)
    sigma = pm.HalfNormal("sigma", sigma=500)

    eta = pm.HalfNormal(name='eta', sigma=10)
    ls = pm.HalfNormal(name='ls', sigma=10)
    cov_func = eta**2 * pm.gp.cov.Matern52(1, ls=ls)
    gp = pm.gp.HSGP(m=[30], c=1.2, cov_func=cov_func)
    gp_rv = gp.prior('gp_rv', integer_time[:, None])

    mu = pm.Deterministic('mu', beta * x + gp_rv, dims='obs_id')
    obs = pm.Normal("obs", mu=mu, sigma=sigma, observed=y, dims='obs_id')

    idata = pm.sample()

# Want to get mu predictions for new data
with model:
    pm.set_data({"integer_time": np.arange(0, 15), "x": [1, 5, 6, 4, 2, 3, 4, 7, 8, 9, 12, 8, 5, 4, 2]})  # reset integer time here
    gp_rv_pred = gp.conditional("gp_rv_pred", integer_time[:, None])
    idata.extend(pm.sample_posterior_predictive(idata, var_names=['obs', 'gp_rv_pred']))
Also, while looking for solutions, I found @juanitorduz's code samples, which also work, but they seem to create additional containers instead (source):
with gp_pymc_model:
    x_star_data = pm.MutableData("x_star_data", x_test)
    f_star = gp.conditional("f_star", x_star_data[:, None])
    pm.Normal("likelihood_test", mu=f_star, sigma=noise)
Hey @aabugaev, I am happy to have helped. I just want to add a couple of things. Here is your original code modified with some commentary:
import numpy as np
import pymc as pm

integer_time = np.arange(0, 10)
future_integer_time = np.arange(0, 5)

with pm.Model() as model:
    # The PyMC Data class is now mutable by default, no need to call pm.MutableData()
    x = pm.Data(name="x", value=[1, 5, 6, 4, 2, 3, 4, 7, 8, 9], dims='obs_id')
    y = pm.Data(name="y", value=[11, 52, 61, 43, 22, 34, 41, 76, 83, 92], dims='obs_id')  # y = 10*x + noise, here always positive
    integer_time = pm.Data(name="integer_time", value=integer_time[:, None])

    beta = pm.Normal("beta", mu=0, sigma=500)
    sigma = pm.HalfNormal("sigma", sigma=500)

    eta = pm.HalfNormal(name='eta', sigma=10)
    ls = pm.HalfNormal(name='ls', sigma=10)
    cov_func = eta**2 * pm.gp.cov.Matern52(1, ls=ls)
    gp = pm.gp.HSGP(m=[30], c=1.2, cov_func=cov_func)
    gp_rv = gp.prior('gp_rv', X=integer_time)

    mu = pm.Deterministic('mu', beta * x + gp_rv, dims='obs_id')
    obs = pm.Normal("obs", mu=mu, sigma=sigma, observed=y, dims='obs_id')

    idata = pm.sample()

# Want to get mu predictions for new data
with model:
    # When using HSGP and pm.set_data() for out-of-sample predictions, there is no need
    # to explicitly call gp.conditional(). Also, if you pass predictions=True to
    # pm.sample_posterior_predictive(), you don't need to supply your y variable in pm.set_data()
    pm.set_data({'x': [4, 7, 3, 9, 1], 'integer_time': future_integer_time[:, None]})
    posterior_pred = pm.sample_posterior_predictive(idata, predictions=True)
I hope this helps!
aabugaev March 12, 2025, 1:36pm 6
Thank you!
Having briefly looked into the article you recommended, I got the idea that avoiding conditional is only possible when using the linearized version of HSGP.
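For intuition on why the linearized form allows this, here is a plain-numpy sketch (all names and formulas here are ours, not PyMC's API; the basis and spectral density follow the standard 1-D HSGP construction for an RBF kernel). The linearized HSGP writes f(x) ≈ Φ(x) @ (sqrt_psd * β), so the learned weights β do not depend on X, and predicting at new inputs only requires re-evaluating the basis Φ, which is what swapping the data container does:

```python
import numpy as np

def hsgp_basis(x, m=30, L=12.0):
    """Evaluate m Laplacian eigenfunctions on [-L, L] at points x."""
    j = np.arange(1, m + 1)
    return np.sqrt(1.0 / L) * np.sin(np.pi * j * (x[:, None] + L) / (2 * L))

def sqrt_spectral_density_rbf(m=30, L=12.0, eta=1.0, ls=1.0):
    """Square root of the RBF spectral density at the eigenvalue frequencies."""
    omega = np.pi * np.arange(1, m + 1) / (2 * L)
    S = eta**2 * ls * np.sqrt(2 * np.pi) * np.exp(-0.5 * (ls * omega) ** 2)
    return np.sqrt(S)

rng = np.random.default_rng(0)
beta = rng.normal(size=30)          # stand-in for posterior basis weights
sqrt_psd = sqrt_spectral_density_rbf()

x_train = np.arange(0, 10, dtype=float)
x_new = np.arange(0, 15, dtype=float)  # extends past the training range

f_train = hsgp_basis(x_train) @ (sqrt_psd * beta)
f_new = hsgp_basis(x_new) @ (sqrt_psd * beta)  # same weights, new basis rows

# The first 10 predictions coincide with the training-time function values
assert np.allclose(f_new[:10], f_train)
```

By contrast, the non-linearized gp.prior call bakes the training inputs into the graph, which is why gp.conditional is otherwise needed for new points.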
Also thanks for elaborating about the predictions.
Largely appreciate your help!