Composite Bayesian inference

Collective Inference with Learned and Engineered Knowledge

2009

Abstract: A persistent goal of research in artificial intelligence has been to enable learning and reasoning with probabilistic models in complex domains. Much of this work has been directed toward systems that complement, rather than replace, human abilities and knowledge. Models that fuse engineered knowledge (knowledge from human sources) with learned information (information gained algorithmically) can take advantage of the strengths of both approaches, yielding more accurate predictions.
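As a toy illustration of this kind of fusion (not taken from the paper), the sketch below combines an expert-specified Beta prior, standing in for engineered knowledge, with observed counts standing in for learned information; all numbers are hypothetical.

```python
# Minimal illustration (not the paper's model): fusing an expert-specified
# prior ("engineered knowledge") with observed data ("learned information")
# in a conjugate Beta-Binomial model. All numbers are hypothetical.

# Expert belief: the event occurs roughly 70% of the time, with the
# strength of that belief encoded as a pseudo-count of 10 observations.
expert_rate, expert_weight = 0.7, 10.0
alpha_prior = expert_rate * expert_weight          # pseudo-successes
beta_prior = (1 - expert_rate) * expert_weight     # pseudo-failures

# Learned information: counts estimated from data.
successes, failures = 12, 18

# The posterior fuses both sources; its mean lies between the expert's 0.7
# and the empirical rate 12/30 = 0.4, weighted by their relative evidence.
alpha_post = alpha_prior + successes
beta_post = beta_prior + failures
posterior_mean = alpha_post / (alpha_post + beta_post)
print(f"expert prior mean:    {expert_rate:.2f}")
print(f"empirical rate:       {successes / (successes + failures):.2f}")
print(f"fused posterior mean: {posterior_mean:.2f}")
```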

Inference with Discriminative Posterior

We study Bayesian discriminative inference given a model family p(c, x, θ) that is assumed to contain all our prior information but is still known to be incorrect. This falls between "standard" Bayesian generative modeling and Bayesian regression, where the marginal p(x, θ) is known to be uninformative about p(c|x, θ). We give an axiomatic proof that the discriminative posterior is consistent for conditional inference; using the discriminative posterior is standard practice in classical Bayesian regression, but we show that it is theoretically justified for model families of joint densities as well. A practical benefit compared to Bayesian regression is that the standard methods of handling missing values in generative modeling can be extended to discriminative inference, which is useful when the amount of data is small. Compared to standard generative modeling, the discriminative posterior yields better conditional inference if the model family is incorrect. If the model family also contains the true model, the discriminative posterior gives the same result as standard Bayesian generative modeling. Practical computation is done with Markov chain Monte Carlo.
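The following sketch shows one way such a discriminative posterior can be sampled with random-walk Metropolis; the joint model family (equal class priors with unit-variance Gaussian class-conditionals), the prior, and the data are assumptions made for illustration, not the paper's setup.

```python
# A minimal sketch (assumed toy setup, not the paper's experiments) of
# sampling a discriminative posterior with Metropolis-Hastings. The model
# family is a joint density p(c, x | theta): equal class priors and
# Gaussian class-conditionals N(x; mu_c, 1) with theta = (mu_0, mu_1).
# The discriminative posterior weights theta by prod_i p(c_i | x_i, theta)
# rather than by the full joint likelihood.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data; the true generating process need not lie in the family.
n = 200
c = rng.integers(0, 2, size=n)
x = rng.normal(loc=np.where(c == 0, -1.0, 1.5), scale=1.3, size=n)

def log_cond_lik(theta):
    """log prod_i p(c_i | x_i, theta) under the joint model family."""
    mu0, mu1 = theta
    log_joint0 = -0.5 * (x - mu0) ** 2      # log p(c=0, x | theta) + const
    log_joint1 = -0.5 * (x - mu1) ** 2      # log p(c=1, x | theta) + const
    log_norm = np.logaddexp(log_joint0, log_joint1)
    return np.sum(np.where(c == 0, log_joint0, log_joint1) - log_norm)

def log_prior(theta):
    return -0.5 * np.sum(theta ** 2) / 10.0  # broad Gaussian prior

# Random-walk Metropolis over theta.
theta = np.zeros(2)
log_post = log_cond_lik(theta) + log_prior(theta)
samples = []
for _ in range(5000):
    prop = theta + rng.normal(scale=0.2, size=2)
    log_post_prop = log_cond_lik(prop) + log_prior(prop)
    if np.log(rng.uniform()) < log_post_prop - log_post:
        theta, log_post = prop, log_post_prop
    samples.append(theta.copy())

print("posterior mean of (mu_0, mu_1):", np.mean(samples[1000:], axis=0))
```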

Mini-buckets: A general scheme for bounded inference

2003

This article presents a class of approximation algorithms that extend the idea of bounded-complexity inference, inspired by successful constraint propagation algorithms, to probabilistic inference and combinatorial optimization. The idea is to bound the dimensionality of dependencies created by inference algorithms. This yields a parameterized scheme, called mini-buckets, that offers an adjustable trade-off between accuracy and efficiency. The mini-bucket approach to optimization problems, such as finding the most probable explanation (MPE) in Bayesian networks, generates both an approximate solution and bounds on the solution quality. We present empirical results demonstrating successful performance of the proposed approximation scheme for the MPE task, both on randomly generated problems and on realistic domains such as medical diagnosis and probabilistic decoding.
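The sketch below illustrates the core bound behind the scheme on two hypothetical factors: splitting a bucket into mini-buckets and maximizing each part separately can only overestimate the exact value, so the result is a guaranteed upper bound on the MPE value.

```python
# A toy sketch (hypothetical factors, not the article's benchmarks) of the
# core mini-bucket idea for MPE: when a bucket of factors over a variable
# is too large to combine exactly, split it into mini-buckets and maximize
# each part separately.  Since max_x f(x) * max_x g(x) >= max_x f(x)*g(x),
# this yields an upper bound on the exact MPE value.
import itertools
import numpy as np

# Two factors sharing variable X (values 0/1), plus neighbours A and B.
f = np.array([[0.9, 0.1],
              [0.4, 0.6]])   # f[a, x]
g = np.array([[0.2, 0.8],
              [0.7, 0.3]])   # g[b, x]

# Exact bucket elimination of X: combine f and g, then maximize over X.
exact = max(
    max(f[a, x] * g[b, x] for x in (0, 1))
    for a, b in itertools.product((0, 1), repeat=2)
)

# Mini-bucket approximation: keep f and g in separate mini-buckets and
# maximize X out of each independently.
f_max = f.max(axis=1)   # max over x for each a
g_max = g.max(axis=1)   # max over x for each b
upper_bound = max(f_max[a] * g_max[b]
                  for a, b in itertools.product((0, 1), repeat=2))

print(f"exact MPE value over this bucket: {exact:.3f}")
print(f"mini-bucket upper bound:          {upper_bound:.3f}")
assert upper_bound >= exact
```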

Learning Summary Statistics for Bayesian Inference with Autoencoders

SciPost Physics Core, 2022

For stochastic models with intractable likelihood functions, approximate Bayesian computation offers a way of approximating the true posterior through repeated comparisons of observations with simulated model outputs in terms of a small set of summary statistics. These statistics need to retain the information that is relevant for constraining the parameters but cancel out the noise; they can thus be seen as thermodynamic state variables for general stochastic models. For many scientific applications, we need strictly more summary statistics than model parameters to reach a satisfactory approximation of the posterior. Therefore, we propose to use the latent representation of deep neural networks based on autoencoders as summary statistics. To create an incentive for the encoder to encode all the parameter-related information but not the noise, we give the decoder access to explicit or implicit information on the noise that has been used to generate the training data. We validate the approach empirically on two types of stochastic models.
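A minimal sketch of the surrounding ABC rejection loop is given below; the `summarize` function is a hand-written placeholder standing in for a trained encoder, and the simulator, prior, and tolerance are toy assumptions, not the models used in the paper.

```python
# A minimal ABC rejection sketch (assumed setup, not the paper's models):
# `summarize` stands in for the encoder half of a trained autoencoder that
# maps raw simulator output to low-dimensional summary statistics.  Here it
# is a hand-written placeholder so the example runs on its own.
import numpy as np

rng = np.random.default_rng(1)

def simulate(theta, n=100):
    """Toy stochastic model: noisy observations with location theta."""
    return theta + rng.standard_normal(n)

def summarize(x):
    """Placeholder for a learned latent representation (e.g. an encoder)."""
    return np.array([x.mean(), x.std()])

# Observed data from an unknown parameter.
observed = simulate(theta=2.0)
s_obs = summarize(observed)

# ABC rejection: accept prior draws whose simulated summaries are close.
accepted = []
for _ in range(20000):
    theta = rng.uniform(-5.0, 5.0)               # draw from the prior
    s_sim = summarize(simulate(theta))
    if np.linalg.norm(s_sim - s_obs) < 0.2:      # tolerance on summaries
        accepted.append(theta)

print(f"accepted {len(accepted)} samples; "
      f"posterior mean ≈ {np.mean(accepted):.2f}")
```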

Bayesian Knowledge Fusion

We address the problem of information fusion in uncertain environments. Imagine there are multiple experts building probabilistic models of the same situation and we wish to aggregate the information they provide. There are several problems we may run into by naively merging the information from each. For example, the experts may disagree on the probability of a certain event, or they may disagree on the direction of causality between two events (e.g., one thinks A causes B while another thinks B causes A). They may even disagree on the entire structure of dependencies among a set of variables in a probabilistic network. In our proposed solution to this problem, we represent the probabilistic models as Bayesian Knowledge Bases (BKBs) and propose an algorithm called Bayesian knowledge fusion that allows the fusion of multiple BKBs into a single BKB that retains the information from all input sources. This allows for easy aggregation and de-aggregation of information from multiple expert sources and facilitates multi-expert decision making by providing a framework in which all opinions can be preserved and reasoned over.
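The sketch below shows only the bookkeeping side of such a fusion, with each rule tagged by its source so that all opinions are preserved and can be recovered; it is an illustration under simplified assumptions, not the published Bayesian knowledge fusion algorithm.

```python
# A minimal bookkeeping sketch (not the published fusion algorithm): each
# expert's probabilistic rules are tagged with their source so that the
# fused knowledge base keeps every opinion and can be split apart again.
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    head: str          # e.g. "B=true"
    tail: tuple        # conditioning assignments, e.g. ("A=true",)
    prob: float
    source: str        # which expert contributed the rule

def fuse(*knowledge_bases):
    """Union of all rules; conflicting opinions are preserved, not resolved."""
    fused = set()
    for kb in knowledge_bases:
        fused.update(kb)
    return fused

def deaggregate(fused, source):
    """Recover a single expert's contribution from the fused knowledge base."""
    return {r for r in fused if r.source == source}

expert_1 = {Rule("B=true", ("A=true",), 0.9, "expert_1")}
expert_2 = {Rule("A=true", ("B=true",), 0.7, "expert_2")}   # reversed causality

fused = fuse(expert_1, expert_2)
print(len(fused), "rules after fusion")
print(deaggregate(fused, "expert_2"))
```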

Inference Meta Models: A New Perspective On Belief Propagation With Bayesian Networks

2006

We investigate properties of Bayesian networks (BNs) in the context of robust state estimation. We focus on problems where state estimation can be viewed as a classification of the possible states, which in turn is based on the fusion of heterogeneous and noisy information. We introduce a coarse perspective of the inference processes and show that classification with BNs can be very robust, even if we use models and evidence associated with significant uncertainties. By making coarse and realistic assumptions we can (i) formulate asymptotic properties of the classification performance, (ii) identify situations in which Bayesian fusion supports robust inference and (iii) introduce techniques that support detection of potentially misleading inference results at runtime. The presented coarse-grained analysis from the runtime perspective is relevant for an important class of real-world fusion problems where it is difficult to obtain domain models that precisely describe the true probability distributions over different states.
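The following toy sketch (an assumed setup, not the article's meta-model) shows the kind of fusion-based classification being analysed: a hidden state is classified by combining several noisy, heterogeneous reports under an imprecise naive Bayes model.

```python
# A small sketch (assumed setup, not the article's meta-model) of the kind
# of fusion problem analysed: classify a hidden state by combining several
# noisy, heterogeneous evidence sources with a naive Bayes model whose
# parameters are themselves only rough guesses.
import numpy as np

states = ["ok", "fault"]
prior = np.array([0.7, 0.3])

# Assumed (imprecise) sensor models: P(report = state | true state).
sensor_accuracy = [0.8, 0.65, 0.7]   # three heterogeneous sources

def classify(reports):
    """Posterior over states given each sensor's reported state."""
    log_post = np.log(prior)
    for acc, report in zip(sensor_accuracy, reports):
        likelihood = np.where(np.array(states) == report, acc, 1 - acc)
        log_post += np.log(likelihood)
    post = np.exp(log_post - log_post.max())
    return post / post.sum()

# Even with one contradictory report, the fused classification can stay
# on the side supported by the majority of the evidence.
print(classify(["fault", "fault", "ok"]))   # posterior over ["ok", "fault"]
```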

Minimax Optimal Bayesian Aggregation

It is generally believed that ensemble approaches, which combine multiple algorithms or models, can outperform any single algorithm at machine learning tasks, such as prediction.

Generalising the Maximum Entropy Inference Process to the Aggregation of Probabilistic Beliefs

This formulation ensures that linear constraint conditions such as w(θ) = a, w(φ | ψ) = b, and w(ψ | θ) ≤ c, where a, b, c ∈ [0, 1] and θ, φ, and ψ are Boolean combinations of the α_j's, are all permissible in K, provided that the resulting constraint set K is consistent. Here a conditional constraint such as w(ψ | θ) ≤ c is interpreted as w(ψ ∧ θ) ≤ c · w(θ), which is always a well-defined linear constraint, albeit vacuous when w(θ) = 0.
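A small numerical sketch of the maximum entropy inference process under linear constraints of this kind is given below; the two propositions, the single constraint w(θ) = 0.3, and the use of a generic SLSQP solver are illustrative choices, not part of the paper.

```python
# A small numerical sketch (hypothetical numbers) of the maximum entropy
# inference process under linear constraints of the kind described above:
# pick the distribution w over the four atoms of two propositions
# theta, phi that maximises entropy subject to w(theta) = 0.3.
import numpy as np
from scipy.optimize import minimize

# Atoms: (theta & phi), (theta & ~phi), (~theta & phi), (~theta & ~phi)
def neg_entropy(w):
    w = np.clip(w, 1e-12, 1.0)
    return np.sum(w * np.log(w))

constraints = [
    {"type": "eq", "fun": lambda w: w.sum() - 1.0},          # normalisation
    {"type": "eq", "fun": lambda w: w[0] + w[1] - 0.3},      # w(theta) = 0.3
]

result = minimize(neg_entropy, x0=np.full(4, 0.25),
                  bounds=[(0.0, 1.0)] * 4, constraints=constraints,
                  method="SLSQP")
w = result.x
print("maxent w over atoms:", np.round(w, 3))
print("w(phi | theta) =", round(w[0] / (w[0] + w[1]), 3))
```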

First-order probabilistic inference

2003

There have been many proposals for first-order belief networks (i.e., where we quantify over individuals), but these typically only let us reason about the individuals that we know about. There are many instances where we have to quantify over all of the individuals in a population. When we do this, the population size often matters and we need to reason about all of the members of the population (but not necessarily individually). This paper presents an algorithm to reason about multiple individuals, where we may know particular facts about some of them, but want to treat the others as a group. Combining unification with variable elimination lets us reason about classes of individuals without needing to ground out the theory.
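As a toy illustration of why treating a population as a group pays off (this is not the paper's unification-based algorithm), the sketch below computes the probability that at least one of n exchangeable individuals is "active" both in lifted form and by grounding out all 2^n assignments.

```python
# A toy sketch (not the paper's algorithm) of why lifting helps: with n
# exchangeable individuals, each independently "active" with probability p,
# the probability that at least one is active depends only on n, so it can
# be computed without grounding out the individuals.
from itertools import product

p, n = 0.1, 4

# Lifted computation: treat the population as a group.
lifted = 1 - (1 - p) ** n

# Grounded computation: enumerate all 2^n joint assignments (infeasible
# for large n, which is the point).
grounded = sum(
    (p ** sum(assign)) * ((1 - p) ** (n - sum(assign)))
    for assign in product([0, 1], repeat=n)
    if any(assign)
)

print(f"lifted:   {lifted:.6f}")
print(f"grounded: {grounded:.6f}")
```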

Ampliative Inference Under Varied Entropy Levels

2013

Systems of logico-probabilistic (LP) reasoning characterize inference from conditional assertions that are interpreted as expressing high conditional probabilities. In previous work, we studied four well-known LP systems (namely, systems O, P, Z, and QC) and presented data from computer simulations in an attempt to illustrate the performance of the four systems. These simulations evaluated the four systems in terms of their tendency to license inference to accurate and informative lower probability bounds, given incomplete information about a randomly selected probability distribution (where this probability distribution may be understood as representing the true stochastic state of the world). In our earlier work, the procedure used in generating the unknown probability distribution (i.e., the true stochastic state of the world) tended to yield probability distributions with moderately high entropy levels. In the present article, we present data charting the performance of the four systems in reasoning about probability distributions with various entropy levels. The results allow for a more inclusive assessment of the reliability and robustness of the four LP systems.
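The article's distribution-generation procedure is not reproduced here, but the sketch below illustrates the underlying idea of controlling typical entropy levels: Dirichlet draws with larger concentration parameters tend to have higher entropy, while smaller concentrations yield lower-entropy distributions.

```python
# A small sketch (not the article's generation procedure) of how one can
# sample probability distributions with different typical entropy levels:
# Dirichlet draws with a larger concentration parameter tend to have
# higher entropy; smaller concentrations yield lower-entropy distributions.
import numpy as np

rng = np.random.default_rng(3)

def mean_entropy(alpha, k=8, n_samples=2000):
    dists = rng.dirichlet([alpha] * k, size=n_samples)
    return np.mean(-np.sum(dists * np.log(dists + 1e-12), axis=1))

for alpha in (0.1, 1.0, 10.0):
    print(f"alpha={alpha:>5}: mean entropy ≈ {mean_entropy(alpha):.3f} "
          f"(max possible {np.log(8):.3f})")
```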