Modifying Bayesian Networks by Probability Constraints

Adding Local Constraints to Bayesian Networks

When using Bayesian networks, practitioners often express constraints among variables by conditioning a common child node to induce the desired distribution. For example, an ‘or’ constraint can be easily expressed by a node modelling a logical ‘or’ of its parents’ values being conditioned to true. This has the desired effect that at least one parent must be true. However, conditioning also alters the distributions of further ancestors in the network. In this paper we argue that these side effects are undesirable when constraints are added during model design. We describe a method called shielding to remove these side effects while remaining within the directed language of Bayesian networks. This method is then compared to chain graphs which allow undirected and directed edges and which model equivalent distributions. Thus, in addition to solving this common modelling problem, shielded Bayesian networks provide a novel method for implementing chain graphs with existing Bayesian network tools.
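
The conditioning pattern described in this abstract is easy to reproduce with standard tooling. The sketch below uses pgmpy with hypothetical variables A, B, and a deterministic child OR_AB; it shows both the constraint and the side effect the paper targets, since conditioning OR_AB to true also shifts the marginals of the parents.

```python
# A minimal sketch (pgmpy assumed; in recent releases the model class
# may be named DiscreteBayesianNetwork). Variables A, B, OR_AB are
# illustrative, not taken from the paper.
from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination

model = BayesianNetwork([("A", "OR_AB"), ("B", "OR_AB")])
cpd_a = TabularCPD("A", 2, [[0.7], [0.3]])   # P(A=0)=0.7, P(A=1)=0.3
cpd_b = TabularCPD("B", 2, [[0.6], [0.4]])
# Deterministic OR: OR_AB=1 unless both parents are 0.
cpd_or = TabularCPD(
    "OR_AB", 2,
    [[1, 0, 0, 0],    # P(OR_AB=0 | A, B)
     [0, 1, 1, 1]],   # P(OR_AB=1 | A, B)
    evidence=["A", "B"], evidence_card=[2, 2],
)
model.add_cpds(cpd_a, cpd_b, cpd_or)

infer = VariableElimination(model)
print(infer.query(["A"]).values)                         # prior marginal of A
print(infer.query(["A"], evidence={"OR_AB": 1}).values)  # shifted by the constraint
```

With the numbers above, P(A=1) rises from 0.3 to roughly 0.52 once the constraint is asserted; that ancestor distortion is exactly what shielding is designed to remove.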

Decomposing local probability distributions in Bayesian networks for improved inference and parameter learning

2006

A major difficulty in building Bayesian network models is the size of conditional probability tables, which grow exponentially in the number of parents. One way of dealing with this problem is through parametric conditional probability distributions that usually require only a linear number of parameters in the number of parents. In this paper we introduce a new class of parametric models, the pICI models, that aim at lowering the number of parameters required to specify local probability distributions, but are still capable of modeling a variety of interactions. A subset of the pICI models is decomposable, and this leads to significantly faster inference compared to models that cannot be decomposed. We also show that the pICI models are especially useful for parameter learning from small data sets, leading to higher accuracy than learning full CPTs.
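
To make the idea concrete, the best-known ICI (independence of causal influence) model is the noisy-OR, implemented in the sketch below. The pICI family in the paper generalizes this style of parameterization, so treat this as textbook background rather than the paper's own models.

```python
# A minimal noisy-OR sketch: one parameter per parent instead of a CPT
# that is exponential in the number of parents.
from itertools import product

def noisy_or_cpt(link_probs, leak=0.0):
    """Return P(Y=1 | parent configuration) for every parent assignment.

    link_probs[i] is the probability that parent i alone turns Y on;
    leak is the probability that Y is on with all parents off.
    """
    n = len(link_probs)
    cpt = {}
    for config in product([0, 1], repeat=n):
        p_off = 1.0 - leak
        for active, p in zip(config, link_probs):
            if active:
                p_off *= 1.0 - p
        cpt[config] = 1.0 - p_off
    return cpt

# Three parents specified by 3 parameters rather than 2**3 CPT columns.
for cfg, p in noisy_or_cpt([0.8, 0.6, 0.5], leak=0.05).items():
    print(cfg, round(p, 3))
```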

Using sensitivity analysis for selective parameter update in Bayesian network learning

2002

The process of building a Bayesian network model is often a bottleneck in applying the Bayesian network approach to real-world problems. One of the daunting tasks is the quantification of the Bayesian network, which often requires specifying a huge number of conditional probabilities. On the other hand, the sensitivity of the network's performance to variations in different probability parameters may be quite different; thus, certain parameters should be specified with higher precision than others. We present a method for selective update of the probabilities based on the results of sensitivity analysis performed while learning a Bayesian network from data. We first perform sensitivity analysis on a Bayesian network in order to identify the most important (most critical) probability parameters, and then further update those probabilities to more accurate values. The process is repeated until refining the probabilities any further does not improve the performance of the network. Our method can also be used in active learning of Bayesian networks, in which case the sensitivity can be used as a criterion guiding active data selection.
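
The following sketch illustrates the general idea on a hypothetical two-node network; it is not the paper's procedure. Each CPT entry is perturbed, its column renormalized, and the parameters are ranked by how far a target posterior moves.

```python
# A minimal sensitivity-ranking sketch for a hypothetical net X -> Y.
import numpy as np

p_x = np.array([0.3, 0.7])                 # P(X)
p_y_given_x = np.array([[0.9, 0.2],        # P(Y=0 | X)
                        [0.1, 0.8]])       # P(Y=1 | X)

def posterior_x_given_y1(p_x, p_y_given_x):
    joint = p_y_given_x[1] * p_x           # P(Y=1, X)
    return joint / joint.sum()             # P(X | Y=1)

base = posterior_x_given_y1(p_x, p_y_given_x)

eps = 1e-3
sensitivities = {}
for x in range(2):
    cpd = p_y_given_x.copy()
    cpd[1, x] += eps                       # nudge P(Y=1 | X=x) ...
    cpd[:, x] /= cpd[:, x].sum()           # ... and renormalize the column
    moved = posterior_x_given_y1(p_x, cpd)
    sensitivities[("Y=1", f"X={x}")] = np.abs(moved - base).max() / eps

# Parameters with the largest score deserve the most elicitation effort.
print(sorted(sensitivities.items(), key=lambda kv: -kv[1]))
```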

An optimization-based approach for the design of Bayesian networks

Mathematical and Computer Modelling, 2008

Bayesian networks model conditional dependencies among the domain variables, and provide a way to deduce their interrelationships as well as a method for the classification of new instances. One of the most challenging problems in using Bayesian networks, in the absence of a domain expert who can dictate the model, is inducing the structure of the network from a large, multivariate data set. We propose a new methodology for the design of the structure of a Bayesian network based on concepts of graph theory and nonlinear integer optimization techniques.
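
The abstract does not detail the formulation, but its flavour can be sketched: binary indicators select arcs, acyclicity is the feasibility constraint, and a decomposable score is the objective. In the sketch below, brute force stands in for the nonlinear integer solver, and the score and variables are invented for illustration.

```python
# A minimal 0/1 structure-design sketch (not the paper's formulation).
from itertools import product, permutations

VARS = ["A", "B", "C"]                      # hypothetical domain variables

def toy_score(parents):
    """Stand-in for a data-driven score such as BIC; favors the arcs
    A->B and B->C and penalizes every other parent."""
    preferred = {("A", "B"), ("B", "C")}
    return sum(2.0 if (p, c) in preferred else -1.0
               for c, ps in parents.items() for p in ps)

def is_dag(parents):
    """A structure is acyclic iff some variable ordering respects it."""
    return any(
        all(order.index(p) < order.index(c)
            for c, ps in parents.items() for p in ps)
        for order in permutations(VARS)
    )

best = None
pairs = [(i, j) for i in VARS for j in VARS if i != j]
for bits in product([0, 1], repeat=len(pairs)):   # all edge-indicator choices
    parents = {v: [] for v in VARS}
    for (i, j), b in zip(pairs, bits):
        if b:
            parents[j].append(i)
    if is_dag(parents):
        s = toy_score(parents)
        if best is None or s > best[0]:
            best = (s, parents)
print(best)
```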

Specifying Prior Probabilities in Bayesian Network by Maximum Likelihood Estimation method

Sylwan Journal, Volume 160, Issue 2, February 2016, pages 281-298

Bayesian networks provide a solid inference mechanism for confirming hypotheses as evidence is collected. A Bayesian network comprises two models: a qualitative model and a quantitative model. The qualitative model is its structure, and the quantitative model is its parameters, namely the conditional probability tables (CPTs) whose entries are probabilities quantifying the dependences among the variables in the network. The quality of a CPT depends on the initial values of its entries; these initial values are the prior probabilities. Because the beta function provides some conveniences when specifying CPTs, it is used as the basic distribution in my method. The main problem in defining prior probabilities is how to estimate the parameters of the beta distribution. Unfortunately, the equations whose solutions are the parameter estimators are differential equations, and they are too difficult to solve. By applying the maximum likelihood estimation (MLE) technique, I derive simple equations that eliminate the differential equations and make it much easier to estimate the parameters in the case that they are positive integers. I also propose an algorithm to find approximate solutions of these simple equations. Keywords: prior probabilities, Bayesian network, maximum likelihood estimation.
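
As a rough illustration of the integer-parameter setting (my sketch, not the paper's equations), one can evaluate the beta log-likelihood of elicited prior probabilities directly over a grid of positive integers (a, b) and keep the best pair:

```python
# Integer MLE for Beta(a, b) by exhaustive grid search.
from math import lgamma, log

def beta_loglik(a, b, samples):
    """Sum of log Beta(a, b) densities over samples in (0, 1)."""
    log_norm = lgamma(a + b) - lgamma(a) - lgamma(b)
    return sum(log_norm + (a - 1) * log(x) + (b - 1) * log(1 - x)
               for x in samples)

# Hypothetical elicited prior probabilities for one CPT entry.
samples = [0.62, 0.70, 0.55, 0.66, 0.74]

best = max((beta_loglik(a, b, samples), a, b)
           for a in range(1, 50) for b in range(1, 50))
print("MLE over integers: a=%d, b=%d" % (best[1], best[2]))
```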

A Bayesian Approach to Learning Bayesian Networks With Local Structure

UAI'97, 1997

Recently several researchers have investigated techniques for using data to learn Bayesian networks containing compact representations for the conditional probability distributions (CPDs) stored at each node. The majority of this work has concentrated on using decision-tree representations for the CPDs. In addition, researchers typically apply non-Bayesian (or asymptotically Bayesian) scoring functions such as MDL to evaluate the goodness of fit of networks to the data.
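
A decision-tree CPD can be illustrated in a few lines; the variables and numbers below are hypothetical, but they show the context-specific sharing that makes the representation compact:

```python
# A minimal tree-structured CPD sketch: parent contexts share leaves,
# so far fewer parameters than the full table.
def tree_cpd(parents):
    """P(Y=1 | Smoker, Genetic, Exercise) as a context-specific tree."""
    if parents["Smoker"]:
        return 0.8       # once Smoker=1, the other parents are irrelevant
    if parents["Genetic"]:
        return 0.6 if not parents["Exercise"] else 0.4
    return 0.1           # default leaf shared by the 2 remaining contexts

# The full CPT would need 2**3 rows; the tree uses only 4 leaves.
for s in (0, 1):
    for g in (0, 1):
        for e in (0, 1):
            print(s, g, e, tree_cpd({"Smoker": s, "Genetic": g, "Exercise": e}))
```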

On the Use of Restrictions for Learning Bayesian Networks

Lecture Notes in Computer Science, 2005

In this paper we explore the use of several types of structural restrictions within algorithms for learning Bayesian networks. These restrictions may codify expert knowledge in a given domain, in such a way that a Bayesian network representing this domain should satisfy them. Our objective is to study whether the algorithms for automatically learning Bayesian networks from data can benefit from this prior knowledge to get better results. We formally define three types of restrictions: existence of arcs and/or edges, absence of arcs and/or edges, and ordering restrictions, and also study their interactions and how they can be managed within Bayesian network learning algorithms based on the score+search paradigm. Then we particularize our study to the classical local search algorithm with the operators of arc addition, arc removal and arc reversal, and carry out experiments using this algorithm on several data sets.
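
A sketch of how such restrictions plug into the local-search operators follows; the restriction sets and ordering are invented for illustration. Candidate additions, removals, and reversals are filtered before they are ever scored. Note that under a total ordering restriction, no reversal can ever be legal, since reversing a forward arc always produces a backward one.

```python
# A minimal restricted-operator sketch for score+search local steps.
REQUIRED = {("A", "B")}                 # existence restrictions (must stay)
FORBIDDEN = {("C", "A")}                # absence restrictions (never allowed)
ORDER = ["A", "B", "C"]                 # ordering restriction: arcs go forward

def respects(order, arc):
    u, v = arc
    return order.index(u) < order.index(v)

def legal_moves(arcs):
    """Yield (operation, arc) pairs that satisfy all restrictions."""
    nodes = set(ORDER)
    for u in nodes:
        for v in nodes - {u}:
            arc = (u, v)
            if arc not in arcs and arc not in FORBIDDEN and respects(ORDER, arc):
                yield ("add", arc)
    for arc in arcs:
        if arc not in REQUIRED:
            yield ("remove", arc)
        rev = (arc[1], arc[0])
        if (arc not in REQUIRED and rev not in FORBIDDEN
                and respects(ORDER, rev)):
            yield ("reverse", arc)

print(list(legal_moves({("A", "B"), ("A", "C")})))
```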

Approximations of Bayesian networks through KL minimisation

New Generation Computing, 2000

Exact inference in large, complex Bayesian networks is computationally intractable. Approximate schemes are therefore of great importance for real world computation. In this paper we consider an approximation scheme in which the original Bayesian network is approximated by another Bayesian network. The approximating network is optimised by an iterative procedure, which minimises the Kullback-Leibler divergence between the two networks. The procedure is guaranteed to converge to a local minimum of the Kullback-Leibler divergence. An important question in this scheme is how to choose the structure of the approximating network. In this paper we show how redundant structures of the approximating model can be pruned in advance. Simulation results of model selection and model optimisation are provided to illustrate the methods.
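
The objective itself is easy to state: D_KL(P || Q) = sum_x P(x) log(P(x)/Q(x)). The sketch below is illustrative rather than the paper's iterative procedure: it fits a fully factorized Q to a small joint P, exploiting the fact that in this direction of the KL divergence the optimal independent approximation is the product of P's marginals. Computing those marginals is itself the intractable step in large networks, which is why iterative schemes are needed.

```python
# Fitting a factorized Q to a joint P under KL(P || Q).
import numpy as np

rng = np.random.default_rng(0)
P = rng.random((2, 2, 2))
P /= P.sum()                                # arbitrary joint over 3 binaries

def kl(P, Q):
    return float(np.sum(P * np.log(P / Q)))

# The marginals of P define the best factorized approximation.
q1 = P.sum(axis=(1, 2))
q2 = P.sum(axis=(0, 2))
q3 = P.sum(axis=(0, 1))
Q = q1[:, None, None] * q2[None, :, None] * q3[None, None, :]
print("KL(P || Q) =", kl(P, Q))
```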

Learning equivalence classes of Bayesian-network structures

The Journal of Machine Learning Research, 2002

Approaches to learning Bayesian networks from data typically combine a scoring metric with a heuristic search procedure. Given a Bayesian network structure, many of the scoring metrics derived in the literature return a score for the entire equivalence class to which the structure belongs. When using such a metric, it is appropriate for the heuristic search algorithm to search over equivalence classes of Bayesian networks as opposed to individual structures. We present the general formulation of a search space for which the states of the search correspond to equivalence classes of structures. Using this space, any one of a number of heuristic search algorithms can easily be applied. We compare greedy search performance in the proposed search space to greedy search performance in a search space for which the states correspond to individual Bayesian network structures.
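
The equivalence classes in question are Markov equivalence classes, characterized by Verma and Pearl: two DAGs are equivalent iff they share the same skeleton and the same v-structures. A minimal membership test (with toy graphs) can be sketched as:

```python
# Markov equivalence test: same skeleton, same v-structures.
# Graphs are given as sets of directed arcs (u, v).
def skeleton(arcs):
    return {frozenset(a) for a in arcs}

def v_structures(arcs):
    parents = {}
    for u, v in arcs:
        parents.setdefault(v, set()).add(u)
    vs = set()
    for v, ps in parents.items():
        for a in ps:
            for b in ps:
                if a < b and frozenset((a, b)) not in skeleton(arcs):
                    vs.add((a, v, b))   # a -> v <- b with a, b non-adjacent
    return vs

def markov_equivalent(g1, g2):
    return skeleton(g1) == skeleton(g2) and v_structures(g1) == v_structures(g2)

g1 = {("A", "B"), ("C", "B")}          # A -> B <- C (a v-structure)
g2 = {("B", "A"), ("C", "B")}          # A <- B <- C (no v-structure)
g3 = {("B", "A"), ("B", "C")}          # A <- B -> C (no v-structure)
print(markov_equivalent(g1, g2))       # False: v-structures differ
print(markov_equivalent(g2, g3))       # True: same class
```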