Non-reproducible predictions using PyMC-BART
November 28, 2023, 5:36pm
Hi,
I’m using PyMC-BART to fit a BART model to some data. I’ve run a cross-validated parameter grid search and want to rebuild the best models to evaluate against the test set.
However, when refitting on the same data I get slightly different predictions and performance metrics, even though a random seed is set for each chain and for the posterior predictive sampling.
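(For context, bit-identical reproduction from a fixed seed is the behaviour I’d expect, as it works with a plain numpy generator. A minimal check:)

```python
import numpy as np

# Sanity check: the same seed should give bit-identical draws.
a = np.random.default_rng(42).normal(size=1000)
b = np.random.default_rng(42).normal(size=1000)
print(np.array_equal(a, b))  # True
```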
Reproducible via:
```python
import numpy as np
import pymc as pm
import pymc_bart as pmb
from sklearn import metrics

tr_RMSE = []
va_RMSE = []
for _ in range(5):
    with pm.Model() as modelg:
        des = pm.MutableData("des", X_tr.to_numpy())
        σ = pm.HalfNormal("σ", 1)
        μ = pmb.BART("μ", des, g_tr, m=10)
        y = pm.Normal("y", μ, σ, observed=g_tr, shape=μ.shape)
        trace_g = pm.sample(1000, tune=1000, return_inferencedata=False,
                            chains=3, cores=3, random_seed=[42, 56, 69],
                            progressbar=False)
        gtrain_posterior = pm.sample_posterior_predictive(
            trace=trace_g, random_seed=42
        )
    with modelg:
        des.set_value(X_va.to_numpy())
        gval_posterior = pm.sample_posterior_predictive(
            trace=trace_g, random_seed=42
        )
    # Average over chains and draws, then score against the observations.
    tr_RMSE.append(np.sqrt(metrics.mean_squared_error(
        g_tr, gtrain_posterior.posterior_predictive.y.mean(axis=0).mean(axis=0).values)))
    va_RMSE.append(np.sqrt(metrics.mean_squared_error(
        g_va, gval_posterior.posterior_predictive.y.mean(axis=0).mean(axis=0).values)))
```
This gives the following RMSE metrics:
| run | tr_RMSE | va_RMSE |
|-----|---------|---------|
| 0   | 0.423   | 0.385   |
| 1   | 0.429   | 0.401   |
| 2   | 0.471   | 0.404   |
| 3   | 0.461   | 0.432   |
| 4   | 0.435   | 0.379   |
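(As a side note on the metric itself: the two chained `.mean(axis=0)` calls collapse the chain and draw dimensions into a single point prediction per observation before computing RMSE. A self-contained numpy sketch of that reduction, with made-up shapes and data standing in for the posterior predictive:)

```python
import numpy as np

# Hypothetical posterior-predictive draws: (chains, draws, observations).
rng = np.random.default_rng(42)
chains, draws, n_obs = 3, 1000, 50
y_true = rng.normal(size=n_obs)
# Simulated draws centred on the observations with some noise.
y_pred = y_true + rng.normal(scale=0.4, size=(chains, draws, n_obs))

# Collapse chain and draw dimensions to one point prediction per
# observation, then compute RMSE against the observed values.
point_pred = y_pred.mean(axis=(0, 1))  # shape (n_obs,)
rmse = np.sqrt(np.mean((y_true - point_pred) ** 2))
print(rmse)
```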
Hopefully I’m missing something obvious! Thanks in advance!