Sampling from non-analytical distribution in C++ (original) (raw)

I wonder what non-analytical distribution in C++ looks like. Just for fun, I did something similar. What you see is done in Windows 11 with the help of Gemini 2.5 Flash. By the way, I don’t know anything about C++.

The first step is to show a known distribution, I chose the negative binomial distribution with a parametrization following this:

\text{NegBinomial}(n \, | \, \mu, \alpha) = \binom{\, n + \alpha - 1}{n} \, \left( \dfrac{\mu}{\mu+\alpha} \right)^{\!n} \, \left( \dfrac{\alpha}{\mu+\alpha} \right)^{\!\alpha} \!.

As you may know, you have to find the logarithm of that expression and then the derivatives of the logarithm of the distribution:

\log \text{NegBin} (n \, | \, \mu, \alpha) = \text{some expression} \\ \dfrac{\partial \log \text{NegBin} }{\partial \mu} = \text{some expression} \\ \dfrac{\partial \log \text{NegBin} }{\partial \alpha} = \text{some expression}

Well, let’s write the result for this in C++, call the file neg_binomial.cpp:

#include <cmath>    // Para log, lgamma
#include <vector>   // Para devolver el gradiente como vector
#include <limits>   // Para manejar valores muy pequeños/grandes
#include <iostream> // Para posibles mensajes de error/depuración

// Incluir la función digamma. Esta es la parte que requiere una librería externa.
// Aquí asumimos que usamos Boost.Math. Si no tienes Boost, esta parte fallará
// y necesitarías encontrar otra implementación de digamma o usar una aproximación
// (lo cual no es ideal para gradientes precisos).
#ifdef USE_BOOST
#include <boost/math/special_functions/digamma.hpp>
#endif


// Función para calcular el logaritmo de la PMF de la Distribución Binomial Negativa
// Parametrización: k (observación), mu (media), alpha (dispersión)
// k: entero no negativo (número de fracasos observados)
// mu: real positivo (media esperada)
// alpha: real positivo (parámetro de dispersión)
double logp_neg_binomial(int k, double mu, double alpha) {
    // Validar entradas básicas (PyMC suele manejar esto con restricciones en los parámetros,
    // pero es bueno tener validación básica).
    if (mu <= 0.0 || alpha <= 0.0) {
         // Retornar -infinito o un valor muy pequeño si los parámetros son inválidos
        return -std::numeric_limits<double>::infinity();
    }
    if (k < 0) {
         // k debe ser no negativo
        return -std::numeric_limits<double>::infinity();
    }

    // Calcular log(P(k; mu, alpha))
    // Fórmula: lgamma(k + alpha) - lgamma(k+1) - lgamma(alpha) + alpha log(alpha) + k log(mu) - (k + alpha) log(mu + alpha)

    double log_gamma_term = 0.0;
    if (k + alpha > 0 && alpha > 0) { // lgamma requiere argumento positivo
        log_gamma_term = std::lgamma(k + alpha) - std::lgamma(k + 1.0) - std::lgamma(alpha);
    } else if (k == 0 && alpha == 0) {
        // Caso especial, log(Binomial(k-1+alpha, k)) = log(Binomial(-1, 0))
        // Para k=0, Binomial(alpha-1, 0) = 1 si alpha-1 >= 0
        // Si k=0, el término es lgamma(alpha) - lgamma(1) - lgamma(alpha) = 0
         log_gamma_term = 0.0; // log(1) = 0
    } else {
        // Otros casos inválidos, aunque mu>0, alpha>0, k>=0 deberían cubrirlos
         return -std::numeric_limits<double>::infinity();
    }


    double log_prob = log_gamma_term
                      + alpha * std::log(alpha)
                      + k * std::log(mu)
                      - (k + alpha) * std::log(mu + alpha);

    return log_prob;
}

// Función para calcular el gradiente del logaritmo de la PMF
// con respecto a los parámetros mu y alpha.
// Retorna un vector<double> donde el primer elemento es el gradiente w.r.t. mu
// y el segundo elemento es el gradiente w.r.t. alpha.
// k: entero no negativo
// mu: real positivo
// alpha: real positivo
std::vector<double> grad_neg_binomial_logp(int k, double mu, double alpha) {
    std::vector<double> gradients(2); // gradients[0] para mu, gradients[1] para alpha

    // Validar entradas
    if (mu <= 0.0 || alpha <= 0.0 || k < 0) {
        gradients[0] = std::numeric_limits<double>::quiet_NaN(); // Not a Number
        gradients[1] = std::numeric_limits<double>::quiet_NaN();
        return gradients;
    }

    // Gradiente con respecto a mu: k/mu - (k + alpha)/(mu + alpha)
    gradients[0] = (double)k / mu - (double)(k + alpha) / (mu + alpha);

    // Gradiente con respecto a alpha: psi(k + alpha) - psi(alpha) + log(alpha) - log(mu + alpha) + 1 - (k + alpha)/(mu + alpha)
    // Requiere la función digamma (psi)
#ifdef USE_BOOST
    double digamma_k_alpha = boost::math::digamma(k + alpha);
    double digamma_alpha = boost::math::digamma(alpha);

    gradients[1] = digamma_k_alpha - digamma_alpha
                + std::log(alpha) - std::log(mu + alpha)
                + 1.0 - (double)(k + alpha) / (mu + alpha);
#endif

    return gradients;
}

// NOTA: Para probar estas funciones, podrías añadir una función main simple aquí,
// pero para la integración con Python no se necesita main.

// #include <iostream> // Para posibles mensajes de error/depuración
int main() {
    int k = 10;
    double mu = 3;
    double alpha = 2; // Equivalente a r=2, p = 2/(3+2) = 2/5 = 0.4

    double log_prob = logp_neg_binomial(k, mu, alpha);
    std::cout << "Log-Prob NB(k=" << k << ", mu=" << mu << ", alpha=" << alpha << ") = " << log_prob << std::endl;

    std::vector<double> grad = grad_neg_binomial_logp(k, mu, alpha);
    std::cout << "Gradiente w.r.t mu = " << grad[0] << std::endl;
    std::cout << "Gradiente w.r.t alpha = " << grad[1] << std::endl;

    return 0;
}

I left the comments in Spanish because Gemini gave me all of that. The function main is just to check the code. You can comment that.

Since I have to compile this code to check, I need g++. I installed Rtools45 and I added these folders to PATH in Windows:

C:\rtools45\x86_64-w64-mingw32.static.posix\bin
C:\rtools45\usr\bin

I also need to download the Boost library. The command to compile is:

g++ -v neg_binomial.cpp -o neg_binomial.exe -fPIC -std=c++17 -D USE_BOOST -I "C:\Users\myuser\Downloads\boost_1_82_0"

To use the functions with Python, Gemini recommended pybind11, call the file neg_binomial_bindings.cpp:

#include <pybind11/pybind11.h>
#include <pybind11/stl.h> // Necesario para convertir std::vector a listas de Python

// Incluye el archivo C++ que contiene las implementaciones de las funciones.
// En un proyecto grande, lo ideal sería incluir un archivo .h o .hpp con las declaraciones.
// Para este ejemplo simple, incluimos el .cpp directamente.
#include "neg_binomial.cpp"

namespace py = pybind11;

// PYBIND11_MODULE es una macro que crea el punto de entrada del módulo Python.
// El primer argumento (neg_binomial_ext) es el nombre que tendrá el módulo cuando lo importes en Python (import neg_binomial_ext).
PYBIND11_MODULE(neg_binomial_ext, m) {
    // Opcional: añadir un docstring al módulo
    m.doc() = "Binding de Python para las funciones de log-probabilidad y gradiente de la Binomial Negativa en C++";

    // Exponer la función logp_neg_binomial
    // m.def("nombre_en_python", &nombre_en_c++, "docstring")
    m.def("logp", &logp_neg_binomial,
          "Calcula el logaritmo de la PMF de la Binomial Negativa (parametrización mu, alpha).");

    // Exponer la función grad_neg_binomial_logp
    // pybind11/stl.h permite que std::vector<double> se convierta automáticamente en una lista de Python.
    m.def("grad", &grad_neg_binomial_logp,
          "Calcula el gradiente del log-PMF de la Binomial Negativa con respecto a mu y alpha.");

    // Opcional: Podrías exponer otras funciones si tuvieras (ej: cdf, invcdf, etc.)

To compile, the command is:

g++ -v -shared neg_binomial_bindings.cpp -o neg_binomial_ext.pyd `
-I "C:\Users\myuser\anaconda3\Include" `
-I "C:\Users\myuser\anaconda3\Lib\site-packages\pybind11\include" `
-DUSE_BOOST -I "C:\Users\myuser\Downloads\boost_1_82_0" `
-fPIC -std=c++17 `
-L "C:\Users\myuser\anaconda3\libs" -l python312

The flags could change if you are in Linux or macOS. The result is a file called neg_binomial_ext.pyd, this is the file to be used in pymc.

If you want to check that you can use the function, use this:

Click to see the code

Comments and message in Spanish

# test_neg_binomial_ext.py

import sys
import os

# Añade el directorio actual al PATH de Python para que pueda encontrar el módulo compilado
# si este script no se ejecuta en el mismo directorio del archivo .pyd/.so
# Aunque si ejecutas el script en el mismo directorio, esta línea no es estrictamente necesaria,
# es una buena práctica si el módulo está en un subdirectorio, por ejemplo.
# current_dir = os.path.dirname(os.path.abspath(__file__))
# if current_dir not in sys.path:
#     sys.path.append(current_dir)

# Importa el módulo compilado
# El nombre 'neg_binomial_ext' debe coincidir con el nombre que le diste en PYBIND11_MODULE
try:
    import neg_binomial_ext as nb_ext
    print("Módulo 'neg_binomial_ext' importado exitosamente.")
    print(f"Ubicación del módulo: {nb_ext.__file__}") # Muestra la ruta del archivo .pyd/.so que cargó

except ImportError as e:
    print(f"Error al importar el módulo: {e}")
    print("Posibles causas:")
    print("1. El archivo 'neg_binomial_ext.cp312-win_amd64.pyd' (o la extensión correcta para tu sistema/versión de Python)")
    print("   no está en el mismo directorio que este script.")
    print("2. El nombre del archivo .pyd/.so no coincide con el nombre del módulo especificado en PYBIND11_MODULE.")
    print("3. Hubo un error durante la compilación/enlazado que impidió que el módulo fuera cargado.")
    sys.exit(1) # Salir si no se puede importar

# --- Ahora puedes usar las funciones expuestas desde C++ ---

# Define algunos valores de entrada para la Distribución Binomial Negativa
# (parametrización mu, alpha)
# k: número de fracasos observados (entero no negativo)
# mu: media esperada (real positivo)
# alpha: parámetro de dispersión (real positivo)

k_observado = 10
mu_param = 3
alpha_param = 2

print(f"\nProbando con k={k_observado}, mu={mu_param}, alpha={alpha_param}")

# Llama a la función logp que expusimos desde C++ (con el nombre "logp")
try:
    log_probabilidad = nb_ext.logp(k_observado, mu_param, alpha_param)
    print(f"Log-probabilidad calculada en C++ (función 'logp'): {log_probabilidad}")
except Exception as e:
    print(f"Ocurrió un error al llamar a la función 'logp': {e}")


# Llama a la función de gradiente que expusimos desde C++ (con el nombre "grad")
try:
    vector_gradiente = nb_ext.grad(k_observado, mu_param, alpha_param)
    print(f"Vector gradiente calculado en C++ (función 'grad'): {vector_gradiente}")

    # La función grad retorna un std::vector<double> que pybind11 convierte automáticamente a una lista de Python
    if isinstance(vector_gradiente, list) and len(vector_gradiente) == 2:
        print(f"  Gradiente con respecto a mu: {vector_gradiente[0]}")
        print(f"  Gradiente con respecto a alpha: {vector_gradiente[1]}")
    else:
         print("  Tipo o tamaño inesperado del valor retornado por la función 'grad'.")

except Exception as e:
    print(f"Ocurrió un error al llamar a la función 'grad': {e}")

This part is the hardest: How to write the the code with pytensor and the function we just created? I don’t pretend that I understand the code, I have been reading the documentation for pytensor and this example to improve the code. Comments in Spanish.

import pymc as pm
import pytensor
import pytensor.tensor as pt
from pytensor.graph import Apply, Op
import numpy as np
import sys
import arviz as az
import matplotlib.pyplot as plt

try:
    import neg_binomial_ext as nb_ext
    print("Módulo C++ neg_binomial_ext importado correctamente para PyTensor Op.")
except ImportError:
    print("Error: No se pudo importar el módulo 'neg_binomial_ext'. Asegúrate de que el archivo .pyd/.so esté en el PYTHONPATH.")
    sys.exit(1)

class NegBinomialLogPGradientOp(Op):
    __hash__ = Op.__hash__
    def __eq__(self, other):
        return type(self) == type(other)

    def make_node(self, k_pt, mu_pt, alpha_pt):
        k_pt = pt.as_tensor_variable(k_pt, dtype='int32')
        mu_pt = pt.as_tensor_variable(mu_pt, dtype='float64')
        alpha_pt = pt.as_tensor_variable(alpha_pt, dtype='float64')

        # Verificaciones de tipo temporalmente quitadas

        outputs = [pt.vector(dtype='float64')]

        return Apply(self, [k_pt, mu_pt, alpha_pt], outputs)

    def perform(self, node, inputs, outputs):
        k, mu, alpha = inputs
        gradients_list = nb_ext.grad(int(k), float(mu), float(alpha))

        if not isinstance(gradients_list, list) or len(gradients_list) != 2:
             raise ValueError(f"La función nb_ext.grad retornó un formato inesperado: {gradients_list}")

        outputs[0][0] = np.array(gradients_list, dtype='float64')

    def grad(self, inputs, output_gradients):
         # CORRECCIÓN: Retornar ceros flotantes para todos los inputs (no se calcula Hessiano)
         # inputs son [k_pt, mu_pt, alpha_pt]
         return [pt.zeros_like(inputs[0], dtype='float64'), # Gradiente w.r.t k_pt
                 pt.zeros_like(inputs[1], dtype='float64'), # Gradiente w.r.t mu_pt
                 pt.zeros_like(inputs[2], dtype='float64')] # Gradiente w.r.t alpha_pt


class NegBinomialLogPOp(Op):
    __hash__ = Op.__hash__
    def __eq__(self, other):
        return type(self) == type(other)

    def make_node(self, k_pt, mu_pt, alpha_pt):
        k_pt = pt.as_tensor_variable(k_pt, dtype='int32')
        mu_pt = pt.as_tensor_variable(mu_pt, dtype='float64')
        alpha_pt = pt.as_tensor_variable(alpha_pt, dtype='float64')

        # Verificaciones de tipo temporalmente quitadas

        outputs = [pt.scalar(dtype='float64')] # Esta Op ES escalar

        return Apply(self, [k_pt, mu_pt, alpha_pt], outputs)

    def perform(self, node, inputs, outputs):
        k, mu, alpha = inputs
        logp_value = nb_ext.logp(int(k), float(mu), float(alpha))
        outputs[0][0] = np.array(logp_value, dtype='float64')

    def grad(self, inputs, output_gradients):
        k_pt, mu_pt, alpha_pt = inputs
        cost_grad = output_gradients[0]

        nb_grad_op = NegBinomialLogPGradientOp()
        gradients_pt_vector = nb_grad_op(k_pt, mu_pt, alpha_pt)

        d_logp_dmu_pt = gradients_pt_vector[0]
        d_logp_dalpha_pt = gradients_pt_vector[1]

        grad_mu = cost_grad * d_logp_dmu_pt
        grad_alpha = cost_grad * d_logp_dalpha_pt

        # CORRECCIÓN: Retornar un cero flotante simbólico para el input entero k_pt
        grad_k = pt.zeros_like(k_pt, dtype='float64')

        # La lista de retorno debe corresponder a los inputs: [grad_wrt_k, grad_wrt_mu, grad_wrt_alpha]
        return [grad_k, grad_mu, grad_alpha]


# --- Usar la Op personalizada en un Modelo PyMC ---

# Crear una instancia de la Op de log-probabilidad (la plantilla escalar)
neg_binomial_logp_op = NegBinomialLogPOp()

# Datos observados (como array NumPy)
observed_data_k = np.array([6, 9, 1, 3, 13, 8, 20, 11, 8, 16, 2, 11, 7, 6, 10, 14, 7, 45, 2, 
                            3, 3, 17, 6, 11, 3, 16, 22, 4, 15, 8, 3, 3, 4, 0, 7, 1, 0, 5, 3, 
                            5, 1, 1, 0, 1, 0, 8, 2, 15, 13, 4], dtype='int32')

if __name__ == '__main__':

    with pm.Model() as custom_nb_model:
        mu = pm.HalfNormal("mu", sigma=10)
        alpha = pm.HalfNormal("alpha", sigma=10)

        # Lista para almacenar los nodos simbólicos de log-probabilidad por cada punto de dato
        logp_terms_list = []

        # CORRECCIÓN: Construir el grafo manualmente iterando sobre los datos
        for i, k_i in enumerate(observed_data_k):
            # Convertir cada punto de dato escalar a una constante simbólica de PyTensor
            k_i_pt = pt.constant(k_i, dtype='int32')

            # Aplicar la Op escalar a este punto de dato escalar y a los parámetros escalares (priors)
            logp_i = neg_binomial_logp_op(k_i_pt, mu, alpha)

            # Añadir el resultado simbólico a la lista
            logp_terms_list.append(logp_i)

        # Concatenar todos los resultados simbólicos en un solo vector simbólico
        logp_term_vector = pt.stack(logp_terms_list)

        # Sumar los log-términos para obtener la log-verosimilitud total
        total_logp = pt.sum(logp_term_vector)

        # Añadir el total_logp al log-verosimilitud total del modelo usando pm.Potential
        logp_likelihood = pm.Potential("logp_likelihood_from_cpp", total_logp)

        print("Construcción del grafo completada. Iniciando muestreo con NUTS...")
        # pm.sample() por defecto intenta usar el muestreador NUTS si hay gradientes disponibles.
        idata = pm.sample(2500)

    print("\nAnálisis de resultados:")
    print(az.summary(idata))

    # Mostrar los gráficos de traza y distribución posterior
    az.plot_trace(idata)
    az.plot_posterior(idata)

    # Mostrar los gráficos generados por matplotlib
    plt.show()

Once again, I don’t know if this is something similar to what you are looking for.