Best way of indexing numpy array using categorical latent variables

October 30, 2024, 10:46am 1

I have a model with Categorical latent variables that act as indicator variables.
I am using the values of this latent variable vector to index a numpy array.

with pm.Model() as inference_model_categorical:
    # Weights over the max_state possible states
    w = pm.Dirichlet("w", a=np.ones(max_state))
    # N categorical state indicators drawn with probabilities w
    latent_z = pm.Categorical("z", p=w, shape=N)

    alpha = pm.Uniform("alpha", lower=0, upper=0.1)
    # Look up the expected mean for each sampled state
    mu = pt.shared(expected_mean)[latent_z]

    obs_distrib = pm.NegativeBinomial("obs", alpha=1 / alpha, mu=mu, observed=obs_draw.T)

    obs_sample = pm.sample()
pm.model_to_graphviz(model=inference_model_categorical)

As explained in the pytensor documentation, I am first converting the numpy array named “expected_mean” into a pytensor SharedVariable and then index it with my Categorical variables (latent_z).

pt.shared(expected_mean)[latent_z]

Is this the most efficient way to do it?

Thank you for your help

It’s fine, but if you don’t want to change expected_mean later down the road you can just do pt.as_tensor(expected_mean)[latent_z].

Performance should be the same, but it makes it clearer to the model that those values are never changing.