CustomDist — PyMC 5.22.0 documentation
class pymc.CustomDist(name, *dist_params, dist=None, random=None, logp=None, logcdf=None, support_point=None, ndim_supp=None, ndims_params=None, signature=None, dtype='floatX', **kwargs)[source]#
A helper class to create custom distributions.
This class can be used to wrap black-box random and logp methods for use in forward and MCMC sampling.
A user can provide a dist function that returns a PyTensor graph built from simpler PyMC distributions, which represents the distribution. This graph is used to take random draws, and to infer the logp expression automatically when not provided by the user.
Alternatively, a user can provide a random function that returns numerical draws (e.g., via NumPy routines), and a logp function that must return a PyTensor graph that represents the logp graph when evaluated. This is used for MCMC sampling.
Additionally, a user can provide logcdf and support_point functions that must return PyTensor graphs that compute those quantities. These may be used by other PyMC routines.
Parameters:
name : str
dist_params : Tuple
A sequence of the distribution's parameters. These will be converted into PyTensor tensor variables internally.
dist : Optional[Callable]
A callable that returns a PyTensor graph built from simpler PyMC distributions, which represents the distribution. This can be used by PyMC to take random draws as well as to infer the logp of the distribution in some cases. In that case it is not necessary to implement random or logp functions.
It must have the following signature: dist(*dist_params, size). The symbolic tensor distribution parameters are passed as positional arguments in the same order as they are supplied when the CustomDist is constructed.
random : Optional[Callable]
A callable that can be used to generate random draws from the distribution.
It must have the following signature: random(*dist_params, rng=None, size=None). The numerical distribution parameters are passed as positional arguments in the same order as they are supplied when the CustomDist is constructed. The keyword arguments are rng, which will provide the random variable's associated Generator, and size, which represents the desired size of the random draw. If None, a NotImplementedError will be raised when trying to draw random samples from the distribution's prior or posterior predictive.
logp : Optional[Callable]
A callable that calculates the log probability of some given value conditioned on certain distribution parameter values. It must have the following signature: logp(value, *dist_params), where value is a PyTensor tensor that represents the distribution value, and dist_params are the tensors that hold the values of the distribution parameters. This function must return a PyTensor tensor.
When the dist function is specified, PyMC will try to automatically infer the logp when this is not provided. Otherwise, a NotImplementedError will be raised when trying to compute the distribution's logp.
logcdf : Optional[Callable]
A callable that calculates the log cumulative probability of some given value conditioned on certain distribution parameter values. It must have the following signature: logcdf(value, *dist_params), where value is a PyTensor tensor that represents the distribution value, and dist_params are the tensors that hold the values of the distribution parameters. This function must return a PyTensor tensor. If None, a NotImplementedError will be raised when trying to compute the distribution's logcdf.
support_point : Optional[Callable]
A callable that can be used to compute the finite logp point of the distribution. It must have the following signature: support_point(rv, size, *rv_inputs). The distribution's variable is passed as the first argument rv. size is the random variable's size implied by the dims, size and parameters supplied to the distribution. Finally, rv_inputs is the sequence of the distribution parameters, in the same order as they were supplied when the CustomDist was created. If None, a default support_point function will be assigned that will always return 0, or an array of zeros.
ndim_supp : Optional[int]
The number of dimensions in the support of the distribution. Inferred from signature, if provided. Defaults to assuming a scalar distribution, i.e. ndim_supp = 0.
ndims_params : Optional[Sequence[int]]
The list of number of dimensions in the support of each of the distribution's parameters. Inferred from signature, if provided. Defaults to assuming all parameters are scalars, i.e. ndims_params=[0, ...].
signature : Optional[str]
A NumPy vectorize-like signature that indicates the number and core dimensionality of the input parameters and sample outputs of the CustomDist. When specified, ndim_supp and ndims_params are not needed. See examples below.
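The notation follows NumPy's generalized-ufunc convention. As a quick illustration of the notation itself, using np.vectorize (which shares it):

```python
import numpy as np

# "(),()->()": two scalar core inputs, one scalar core output; any
# leading batch dimensions are broadcast, just as a CustomDist with
# this signature broadcasts its parameters.
shift = np.vectorize(lambda loc, delta: loc + delta, signature="(),()->()")
out = shift(np.zeros((100, 3)), 1.0)
print(out.shape)  # (100, 3)
```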
dtype : str
The dtype of the distribution. All draws and observations passed into the distribution will be cast onto this dtype. This is not needed if a PyTensor dist function is provided, which should already return the right dtype!
class_name : str
Name for the class which will wrap the CustomDist methods. When not specified, it will be given the name of the model variable.
kwargs
Extra keyword arguments are passed to the parent class's __new__ method.
Examples
Create a CustomDist that wraps a black-box logp function. This variable cannot be used in prior or posterior predictive sampling because no random function was provided.
import numpy as np
import pymc as pm
from pytensor.tensor import TensorVariable

def logp(value: TensorVariable, mu: TensorVariable) -> TensorVariable:
    return -((value - mu) ** 2)

with pm.Model():
    mu = pm.Normal("mu", 0, 1)
    pm.CustomDist(
        "custom_dist",
        mu,
        logp=logp,
        observed=np.random.randn(100),
    )
    idata = pm.sample(100)
Provide a random function that returns numerical draws. This allows one to use a CustomDist in prior and posterior predictive sampling. A gufunc signature was also provided, which may be used by other routines.
from typing import Optional, Tuple

import numpy as np
import pymc as pm
from pytensor.tensor import TensorVariable

def logp(value: TensorVariable, mu: TensorVariable) -> TensorVariable:
    return -((value - mu) ** 2)

def random(
    mu: np.ndarray | float,
    rng: Optional[np.random.Generator] = None,
    size: Optional[Tuple[int]] = None,
) -> np.ndarray | float:
    return rng.normal(loc=mu, scale=1, size=size)

with pm.Model():
    mu = pm.Normal("mu", 0, 1)
    pm.CustomDist(
        "custom_dist",
        mu,
        logp=logp,
        random=random,
        signature="()->()",
        observed=np.random.randn(100, 3),
        size=(100, 3),
    )
    prior = pm.sample_prior_predictive(10)
Provide a dist function that creates a PyTensor graph built from other PyMC distributions. PyMC can automatically infer that the logp of this variable corresponds to a shifted Exponential distribution. A gufunc signature was also provided, which may be used by other routines.
import pymc as pm
from pytensor.tensor import TensorVariable

def dist(
    lam: TensorVariable,
    shift: TensorVariable,
    size: TensorVariable,
) -> TensorVariable:
    return pm.Exponential.dist(lam, size=size) + shift

with pm.Model() as m:
    lam = pm.HalfNormal("lam")
    shift = -1
    pm.CustomDist(
        "custom_dist",
        lam,
        shift,
        dist=dist,
        signature="(),()->()",
        observed=[-1, -1, 0],
    )
    prior = pm.sample_prior_predictive()
    posterior = pm.sample()
Provide a dist function that creates a PyTensor graph built from other PyMC distributions. PyMC can automatically infer that the logp of this variable corresponds to a modified-PERT distribution.
import pymc as pm
from pytensor.tensor import TensorVariable

def pert(
    low: TensorVariable,
    peak: TensorVariable,
    high: TensorVariable,
    lmbda: TensorVariable,
    size: TensorVariable,
) -> TensorVariable:
    range = (high - low)
    s_alpha = 1 + lmbda * (peak - low) / range
    s_beta = 1 + lmbda * (high - peak) / range
    return pm.Beta.dist(s_alpha, s_beta, size=size) * range + low

with pm.Model() as m:
    low = pm.Normal("low", 0, 10)
    peak = pm.Normal("peak", 50, 10)
    high = pm.Normal("high", 100, 10)
    lmbda = 4
    pm.CustomDist("pert", low, peak, high, lmbda, dist=pert, observed=[30, 35, 73])

m.point_logps()
Methods