Data fusion with entropic priors
Related papers
Objective priors from maximum entropy in data classification
Information Fusion, 2013
Lack of knowledge of the prior distribution in classification problems that operate on small data sets may make the application of Bayes' rule questionable. Uniform or arbitrary priors may provide classification answers that, even in simple examples, end up contradicting our common sense about the problem. Entropic priors (EPs), obtained via the maximum entropy (ME) principle, seem to provide good objective answers in practical cases, leading to more conservative Bayesian inferences. EPs are derived and applied to classification tasks when only the likelihood functions are available. In this paper, for inference based on a single sample, we review the use of EPs, also in comparison with priors obtained by maximizing the mutual information between observations and classes. This latter criterion coincides with maximizing the KL divergence between posteriors and priors, which for large sample sets leads to the well-known reference (or Bernardo's) priors. Our comparison on single samples puts both approaches in perspective and clarifies their differences and potential. A combinatorial justification for EPs, inspired by Wallis' combinatorial argument for the definition of entropy, is also included. The application of EPs to sequences (multiple samples), which may be affected by excessive domination of the class with the maximum entropy, is also considered, with a solution that guarantees posterior consistency. An explicit iterative algorithm is proposed for determining EPs solely from knowledge of the likelihood functions. Simulations comparing EPs with uniform priors on short sequences are also included.
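For orientation, the ME argument has a simple closed form when the class-conditional likelihoods are known: maximizing the joint entropy H(c) + Σ_c P(c) H(x|c) over the prior P(c), subject to normalization, gives P(c) ∝ exp(H(x|c)), so classes with more dispersed likelihoods receive more prior mass. The sketch below is a minimal illustration under that simplification, with our own variable names and likelihoods discretized on a grid; the paper's iterative algorithm treats the general case.

```python
import numpy as np

def entropic_prior(likelihoods, dx):
    """Entropic prior over classes from discretized likelihoods p(x|c).

    likelihoods: array of shape (C, N), each row a density p(x|c)
                 evaluated on a grid of N points with spacing dx.
    Returns P(c) proportional to exp(H(x|c)), which maximizes the
    joint entropy H(c) + sum_c P(c) H(x|c) under normalization.
    """
    L = np.asarray(likelihoods, dtype=float)
    # differential entropy of each likelihood, H_c = -integral of p log p dx
    logL = np.log(L, out=np.zeros_like(L), where=(L > 0))
    H = -np.sum(L * logL, axis=1) * dx
    w = np.exp(H - H.max())          # subtract max for numerical stability
    return w / w.sum()

# Example: two Gaussian likelihoods with different spreads.
x = np.linspace(-10, 10, 4001)
dx = x[1] - x[0]
g = lambda m, s: np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))
prior = entropic_prior([g(0.0, 1.0), g(2.0, 3.0)], dx)
print(prior)   # the broader (higher-entropy) class gets the larger prior mass
```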
Optimal Classifier Fusion in a Non-Bayesian Probabilistic Framework
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000
The combination of the outputs of classifiers is one of the strategies used to improve classification rates in general-purpose classification systems. Some of the most common approaches can be explained using Bayes' formula. In this paper, we tackle the problem of combining classifiers using a non-Bayesian probabilistic framework. This approach permits us to derive two linear combination rules that minimize misclassification rates under some constraints on the distribution of classifiers. In order to show the validity of this approach, we have compared it with other popular combination rules, both theoretically on a synthetic data set and experimentally on two standard databases: the MNIST handwritten digit database and the GREC symbol database. Results on the synthetic data set confirm the theoretical analysis, and results on real data show that the proposed methods outperform other common combination schemes.
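As a generic illustration of a linear combination rule (not the paper's specific derivation or its optimal weights, which come from the constrained minimization described above), fusing per-classifier posteriors by a convex combination can be sketched as follows; the weights used in the example are hypothetical.

```python
import numpy as np

def linear_fusion(posteriors, weights=None):
    """Fuse K classifiers' class-posterior vectors by a convex combination.

    posteriors: array of shape (K, C) -- row k is classifier k's P(c | x).
    weights:    optional length-K weights; uniform if omitted.
    Returns the fused length-C posterior and the predicted class index.
    """
    P = np.asarray(posteriors, dtype=float)
    K = P.shape[0]
    w = np.full(K, 1.0 / K) if weights is None else np.asarray(weights, dtype=float)
    w = w / w.sum()
    fused = w @ P                      # convex combination of posteriors
    return fused, int(np.argmax(fused))

# Three classifiers scoring the same sample over three classes:
fused, label = linear_fusion([[0.6, 0.3, 0.1],
                              [0.5, 0.2, 0.3],
                              [0.1, 0.7, 0.2]],
                             weights=[0.5, 0.3, 0.2])   # hypothetical weights
print(fused, label)
```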
A Risk Profile for Information Fusion Algorithms
Entropy, 2011
E.T. Jaynes, originator of the maximum entropy interpretation of statistical mechanics, emphasized that there is an inevitable trade-off between the conflicting requirements of robustness and accuracy for any inferencing algorithm. This is because robustness requires discarding information in order to reduce sensitivity to outliers. The principle of nonlinear statistical coupling, which is an interpretation of the Tsallis entropy generalization, can be used to quantify this trade-off. The coupled-surprisal,
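The abstract is truncated here mid-definition. For orientation only: coupled surprisals are built on a Tsallis-type generalized logarithm, ln_q(p) = (p^(1-q) - 1)/(1-q), which recovers ln p as q → 1. The exact coupling parameterization used by the paper is not reproduced below; the snippet merely shows how a generalized surprisal -ln_q(p) penalizes small probabilities more or less heavily than the ordinary surprisal, which is the robustness/accuracy lever the abstract alludes to.

```python
import numpy as np

def q_log(p, q):
    """Tsallis generalized logarithm; reduces to np.log(p) as q -> 1."""
    p = np.asarray(p, dtype=float)
    if np.isclose(q, 1.0):
        return np.log(p)
    return (p ** (1.0 - q) - 1.0) / (1.0 - q)

def generalized_surprisal(p, q):
    """-ln_q(p): heavier-than-log penalty for small p when q > 1."""
    return -q_log(p, q)

for q in (0.5, 1.0, 1.5):
    print(q, generalized_surprisal(0.01, q))
```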
AN INTELLIGENT CLASSIFIER FUSION TECHNIQUE FOR
Multimodal biometric technology was developed to overcome the limitations of unimodal biometric systems. The paradigm consolidates evidence from multiple biometric sources, offering considerable improvements in reliability with reasonable overall performance in many applications. The efficient and effective fusion of the evidence obtained from these different sources, however, remains an open problem that continues to attract research attention. In this paper, we consider a classical classifier fusion technique, Dempster's rule of combination from the Dempster-Shafer Theory (DST) of evidence. DST provides a useful computational scheme for integrating accumulated evidence and can update the prior every time new data are added to the database. However, it has shortcomings: Dempster-Shafer combination may fail to respond adequately when fusing different basic belief assignments (bbas), even when the level of conflict between sources is low, and its measure of belief effectively ignores plausibility. To address these problems, this paper presents a modified Dempster's rule of combination for multimodal biometric authentication that integrates hyperbolic tangent (tanh) estimators to overcome the inadequate normalization step of the original rule, and adopts a multi-level decision threshold on its measure of belief.
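For reference, the unmodified Dempster's rule of combination over a shared frame of discernment is sketched below; the paper's tanh-estimator normalization and multi-level decision threshold are not included, and the bba values in the example are hypothetical.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two basic belief assignments (bbas) with Dempster's rule.

    m1, m2: dicts mapping frozenset focal elements to masses summing to 1.
    Returns the combined bba; raises if the evidence is totally conflicting.
    """
    conflict = 0.0
    combined = {}
    for (A, a), (B, b) in product(m1.items(), m2.items()):
        inter = A & B
        if inter:
            combined[inter] = combined.get(inter, 0.0) + a * b
        else:
            conflict += a * b                      # mass falling on the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    return {A: v / (1.0 - conflict) for A, v in combined.items()}

# Two biometric matchers expressing belief over {g(enuine), i(mpostor)}:
m_face   = {frozenset({'g'}): 0.7, frozenset({'i'}): 0.1, frozenset({'g', 'i'}): 0.2}
m_finger = {frozenset({'g'}): 0.6, frozenset({'i'}): 0.3, frozenset({'g', 'i'}): 0.1}
print(dempster_combine(m_face, m_finger))
```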
Probabilistic Methods for Data Fusion
Maximum Entropy and Bayesian Methods, 1998
The main objective of this paper is to show how classical probabilistic methods such as Maximum Entropy (ME), maximum likelihood (ML) and/or Bayesian (BAYES) approaches can be used for microscopic and macroscopic data fusion. ME can be used to assign a probability law to an unknown quantity when we have macroscopic data (expectations) on it. ML can be used to estimate the parameters of a probability law when we have microscopic data (direct observations). BAYES can be used to update a prior probability law when we have microscopic data through the likelihood. When we have both microscopic and macroscopic data, we can first use ME to assign a prior and then use BAYES to update it to the posterior law, thus achieving the desired data fusion. In practical data fusion applications, however, we may still need some engineering judgment to propose realistic data fusion solutions. Some simple examples in sensor data fusion and image reconstruction using different kinds of data are presented to illustrate these ideas.
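A compact worked example of the ME-then-BAYES pipeline described above, using hypothetical numbers: assign a maximum-entropy prior to a discrete unknown from a macroscopic mean constraint, then update it with the likelihood of a microscopic observation.

```python
import numpy as np
from scipy.optimize import brentq

# Macroscopic data: the unknown value X in {1..6} has a reported mean of 4.5.
support = np.arange(1, 7, dtype=float)
target_mean = 4.5

def me_prior(lam):
    """Exponential-family form of the ME distribution under a mean constraint."""
    w = np.exp(lam * support)
    return w / w.sum()

# Solve for the Lagrange multiplier that satisfies the mean constraint.
lam = brentq(lambda l: me_prior(l) @ support - target_mean, -10.0, 10.0)
prior = me_prior(lam)

# Microscopic data: a noisy sensor reading with a known likelihood p(y | x).
likelihood = np.array([0.02, 0.05, 0.10, 0.18, 0.40, 0.25])   # hypothetical values
posterior = prior * likelihood
posterior /= posterior.sum()

print("ME prior:  ", np.round(prior, 3))
print("posterior: ", np.round(posterior, 3))
```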
Sensors
To apply data fusion in the time domain based on the Dempster–Shafer (DS) combination rule, an 8-step algorithm with a novel entropy function is proposed. The 8-step algorithm is applied in the time domain to achieve the sequential combination of time-domain data. Simulation results showed that this method is successful in capturing the changes (dynamic behavior) in time-domain object classification. The method also showed better robustness to disturbances and better transition behavior than other methods available in the literature. As an example, a convolutional neural network (CNN) is trained to classify three different types of weeds. Precision and recall from the confusion matrix of the CNN are used to update the basic probability assignment (BPA), which captures the classification uncertainty. Real data of classified weeds from a single sensor are used to test time-domain data fusion. The proposed method is successful in filtering noise (reducing sudden changes, giving smoother curves) and fusing conflicting informat...
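As a toy illustration only (not the paper's 8-step algorithm or its entropy function), sequential time-domain fusion can be pictured as folding a combination rule over per-frame basic probability assignments; with singleton focal elements, Dempster's rule reduces to a normalized element-wise product. The CNN outputs below are hypothetical.

```python
import numpy as np

def combine_singleton_bpas(bpa_sequence):
    """Sequentially fuse per-frame BPAs defined on singleton classes only.

    With singleton focal elements, Dempster's rule reduces to a normalized
    element-wise product; the running result is the fused belief so far.
    bpa_sequence: iterable of length-C vectors, one per time step.
    """
    fused = None
    history = []
    for bpa in bpa_sequence:
        bpa = np.asarray(bpa, dtype=float)
        fused = bpa if fused is None else fused * bpa
        fused = fused / fused.sum()          # renormalize (divide out conflict)
        history.append(fused.copy())
    return np.array(history)

# Hypothetical CNN outputs for three weed classes over five frames,
# with a noisy (conflicting) frame at t = 2:
frames = [[0.7, 0.2, 0.1],
          [0.6, 0.3, 0.1],
          [0.2, 0.2, 0.6],    # outlier frame
          [0.7, 0.2, 0.1],
          [0.8, 0.1, 0.1]]
print(np.round(combine_singleton_bpas(frames), 3))
```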
PERFORMANCE-DRIVEN ENTROPIC INFORMATION FUSION
Advances in technology have resulted in the acquisition and subsequent fusion of data from multiple sensors of possibly different modalities. Fusing data acquired from different sensors occurs near the front end of sensing systems and can therefore become a critical bottleneck, so it is crucial to quantify the performance of sensor fusion. Information fusion involves estimating and optimizing an information criterion over a transformation that maps data from one sensor to another. It is crucial to the task of fusion to estimate divergence to a high degree of accuracy and to quantify the error in the estimate. To this end, we propose a class of plug-in estimators based on k-nearest neighbor (k-NN) graphs for estimating divergence. For this class of estimators, we derive a large-sample theory for the bias and variance and develop a joint central limit theorem for the distribution of the estimators over the domain of the transformation space. In this paper, we apply our theory to two applications: (i) detection of anomalies in wireless sensor networks and (ii) fusion of hyperspectral images of geographic scenes using intrinsic dimension.
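As one example of the kind of k-NN graph plug-in estimator referred to here, the classical nearest-neighbor estimator of the Kullback-Leibler divergence can be sketched as follows; the class of estimators analyzed in the paper, and its bias/variance and central limit theory, are more general.

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_kl_divergence(X, Y, k=1):
    """k-NN estimate of D(p || q) from samples X ~ p and Y ~ q.

    X: (n, d) samples from p;  Y: (m, d) samples from q.
    Uses the ratio of k-th nearest-neighbor distances within X and into Y
    (Wang-Kulkarni-Verdu style plug-in estimator).
    """
    X, Y = np.atleast_2d(X), np.atleast_2d(Y)
    n, d = X.shape
    m = Y.shape[0]
    # distance to the k-th neighbor in X, excluding the query point itself
    rho = cKDTree(X).query(X, k=k + 1)[0][:, -1]
    # distance to the k-th neighbor in Y
    nu = cKDTree(Y).query(X, k=k)[0]
    nu = nu if nu.ndim == 1 else nu[:, -1]
    return d * np.mean(np.log(nu / rho)) + np.log(m / (n - 1.0))

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(2000, 2))   # p = N(0, I)
Y = rng.normal(1.0, 1.0, size=(2000, 2))   # q = N([1,1], I); true KL = 1.0
print(knn_kl_divergence(X, Y, k=5))
```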