Lm_plot with bambi

I’m trying to create a regression plot with ArviZ’s plot_lm in conjunction with Bambi. I passed the inference data object into plot_lm, but I don’t know what other parameters I’m supposed to pass in to make it work. Any examples of how to create these regression plots with Bambi would be helpful. Thank you!

Hello!

Do you have an example of what you’re trying to do?

However, I think it would be better to use plot_cap from Bambi.

Do

from bambi.plots import plot_cap

plot_cap(model, idata, ["predictor"])

And it should work.

I want to see the posterior predictive samples in addition to the observed data points. Here is an example of what I’m trying to do:

plot_lm

I’m not sure how to do this with the inference data returned from the Bambi model. When I pass it to the ArviZ plot_lm function, it gives me the following error:

ValueError: x and y must have same first dimension, but have shapes (200,) and (1,)

If you have a minimal reproducible example of your data and model, I can help you with the code for the visualization 🙂

hxk1633 April 11, 2023, 6:09pm 5

So I was able to figure it out by doing it this way:

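import arviz as az
import bambi as bmb

# train_data is my own DataFrame with columns 'x' and 'y' (not shown here)
model = bmb.Model('y ~ x', train_data, dropna=True)
fitted = model.fit(draws=5000, chains=4)
# Add posterior predictive samples; inplace=False returns a new InferenceData
idata = model.predict(fitted, kind='pps', inplace=False)
# Rebuild an InferenceData whose 'y' variables are indexed by an 'x' coordinate
# (the coords must have one value per row actually used in the fit)
data = az.from_dict(
    posterior={'y_mean': idata.posterior.y_mean},
    observed_data={'y': idata.observed_data.y},
    posterior_predictive={'y': idata.posterior_predictive.y},
    dims={"y": ["x"]},
    coords={"x": train_data['x']}
)
az.plot_lm(idata=data, y='y', y_model='y_mean')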

Now I’m trying to create an identical plot with Plotly. I’m building an interactive application and would like to include this visualization. I’m not sure which data format the posterior_predictive samples should be in so that I can plot them.

This is a fully reproducible example with Bambi, using Matplotlib.

Some highlights

import arviz as az
import bambi as bmb
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


rng = np.random.default_rng(1234)
b0, b1, sigma = 1.5, 0.6, 1.5
x = rng.uniform(2, 10, size=200)
y = rng.normal(b0 + b1 * x, sigma)

data = pd.DataFrame({"x": x, "y": y})

model = bmb.Model("y ~ 1 + x", data)
idata = model.fit()
model.predict(idata, kind="pps")

sort_idxs = np.argsort(data["x"])
x_sorted = data["x"][sort_idxs]

y_mean = idata.posterior["y_mean"].mean(("chain", "draw")).to_numpy()
y_mean_bands = idata.posterior["y_mean"].quantile((0.025, 0.975), ("chain", "draw")).to_numpy()

# plot_lm also uses `num_samples=50`
y_pp = az.extract(idata.posterior_predictive, num_samples=50)["y"].to_numpy().T

fig, ax = plt.subplots()
for y_values in y_pp:
    ax.scatter(data["x"], y_values, color="C0", alpha=0.1)
ax.scatter(data["x"], data["y"], color="C1");

ax.fill_between(
    x_sorted, 
    y_mean_bands[0][sort_idxs],
    y_mean_bands[1][sort_idxs], 
    color="0.4", 
    alpha=0.7
)
ax.plot(x_sorted, y_mean[sort_idxs], color="black", lw=1.55);

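On the Plotly question above: the Matplotlib example only ever needs plain 1-D arrays, so one possible approach is to flatten the draws into x-sorted NumPy arrays first and hand those to whichever plotting library you use. The sketch below is only an illustration under assumptions: `prepare_regression_traces` is a hypothetical helper (not part of Bambi or ArviZ), and the draws are simulated stand-ins with the same shapes as the model output, so that the block runs without fitting a model.

```python
import numpy as np


def prepare_regression_traces(x, y_mean_draws, y_pp_draws, num_samples=50, seed=0):
    """Hypothetical helper: flatten draws into x-sorted 1-D arrays for plotting.

    x            : shape (n_obs,)
    y_mean_draws : shape (n_draws, n_obs), draws of the regression mean
    y_pp_draws   : shape (n_draws, n_obs), posterior predictive draws
    """
    x = np.asarray(x)
    y_mean_draws = np.asarray(y_mean_draws)
    y_pp_draws = np.asarray(y_pp_draws)
    order = np.argsort(x)

    # Mean line and 95% band, sorted by x so lines and fills draw left to right.
    mean = y_mean_draws.mean(axis=0)[order]
    lower, upper = np.quantile(y_mean_draws, [0.025, 0.975], axis=0)[:, order]

    # Keep a random subset of predictive draws and flatten them into one long
    # pair of arrays: x is repeated once per kept draw.
    rng = np.random.default_rng(seed)
    n_keep = min(num_samples, y_pp_draws.shape[0])
    keep = rng.choice(y_pp_draws.shape[0], size=n_keep, replace=False)
    pp_x = np.tile(x, n_keep)
    pp_y = y_pp_draws[keep].ravel()

    return {"x": x[order], "mean": mean, "lower": lower, "upper": upper,
            "pp_x": pp_x, "pp_y": pp_y}


# Stand-in data (NOT real model output), shaped like the example above.
# With a fitted model, the (n_draws, n_obs) arrays could instead come from e.g.
#   idata.posterior["y_mean"].stack(sample=("chain", "draw"))
#        .transpose("sample", ...).to_numpy()
# and likewise for idata.posterior_predictive["y"].
rng = np.random.default_rng(1234)
x = rng.uniform(2, 10, size=200)
b0 = rng.normal(1.5, 0.1, size=400)
b1 = rng.normal(0.6, 0.02, size=400)
y_mean_draws = b0[:, None] + b1[:, None] * x[None, :]
y_pp_draws = rng.normal(y_mean_draws, 1.5)

traces = prepare_regression_traces(x, y_mean_draws, y_pp_draws)
print(traces["pp_x"].shape, traces["mean"].shape)
```

With Plotly, these arrays should map onto `plotly.graph_objects.Scatter` traces: `pp_x` and `pp_y` with `mode="markers"` and a low `opacity`, `mean` against `x` with `mode="lines"`, and the band as two line traces (`lower`, then `upper`) where the second one sets `fill="tonexty"`.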