Confident Classification Using a Hybrid Between Deterministic and Probabilistic Convolutional Neural Networks

A Survey on Uncertainty Estimation in Deep Learning Classification Systems from a Bayesian Perspective

ACM Computing Surveys, 2022

Decision-making based on machine learning systems, especially when it can affect human lives, is a subject of major interest in the machine learning community. It is therefore necessary to equip these systems with a means of estimating the uncertainty of the predictions they emit, so that practitioners can make more informed decisions. In the present work, we introduce the topic of uncertainty estimation and analyze its peculiarities when applied to classification systems. We review different methods that have been designed to provide deep learning classification systems with mechanisms for measuring the uncertainty of their predictions. We examine how this uncertainty can be modeled and measured using different approaches, as well as practical considerations for different applications of uncertainty. Moreover, we review some of the properties that should be borne in mind when developing such metrics. All in all, the ...

Correlated Parameters to Accurately Measure Uncertainty in Deep Neural Networks

IEEE Transactions on Neural Networks and Learning Systems

In this article, a novel approach for training deep neural networks using Bayesian techniques is presented. The Bayesian methodology allows for an easy evaluation of model uncertainty and, additionally, is robust to overfitting; these are commonly the two main problems that classical, i.e., non-Bayesian, architectures struggle with. The proposed approach applies variational inference in order to approximate the intractable posterior distribution. In particular, the variational distribution is defined as the product of multiple multivariate normal distributions with tridiagonal covariance matrices. Each normal distribution covers either the weights or the biases of one network layer. The layerwise a posteriori variances are defined based on the corresponding expectation values, and the correlations are assumed to be identical. Therefore, only a few additional parameters need to be optimized compared with non-Bayesian settings. The performance of the new approach is evaluated and compared with other recently developed Bayesian methods. The performance evaluations are based on the popular benchmark datasets MNIST and CIFAR-10. Among the considered approaches, the proposed one shows the best predictive accuracy. Moreover, extensive evaluations of the provided prediction uncertainty information indicate that the new approach often yields more useful uncertainty estimates than the comparison methods.
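
As a concrete illustration of the parameterization described above, the following sketch builds a tridiagonal covariance matrix from per-weight variances and a single shared correlation, then draws a reparameterized weight sample. The function names and the positive-definiteness guard are assumptions of this sketch, not the paper's code.

```python
import torch

def tridiagonal_covariance(variances, correlation):
    """Tridiagonal covariance from a 1-D tensor of per-weight variances
    and one shared correlation coefficient (hypothetical parameterization)."""
    std = variances.sqrt()
    n = variances.numel()
    cov = torch.diag(variances)
    off = correlation * std[:-1] * std[1:]   # rho * sigma_i * sigma_{i+1}
    idx = torch.arange(n - 1)
    cov[idx, idx + 1] = off
    cov[idx + 1, idx] = off
    return cov

def sample_layer_weights(mean, variances, correlation):
    """Reparameterized sample from N(mean, tridiagonal covariance).
    Keeping |correlation| < 0.5 makes the matrix diagonally dominant,
    hence positive definite and safe to factorize."""
    L = torch.linalg.cholesky(tridiagonal_covariance(variances, correlation))
    return mean + L @ torch.randn_like(mean)
```

Because only the means, the variance scales, and one correlation per layer are learned, the number of extra parameters stays small, which matches the abstract's claim.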

PremiUm-CNN: Propagating Uncertainty Towards Robust Convolutional Neural Networks

IEEE Transactions on Signal Processing, 2021

Deep neural networks (DNNs) have surpassed human-level accuracy in various learning tasks. However, unlike humans, who have a natural cognitive intuition for probabilities, DNNs cannot express their uncertainty in the output decisions. This limits the deployment of DNNs in mission-critical domains, such as warfighter decision-making or medical diagnosis. Bayesian inference provides a principled approach to reason about a model's uncertainty by estimating the posterior distribution of the unknown parameters. The challenge in DNNs remains the multiple layers of non-linearities, which make the propagation of high-dimensional distributions mathematically intractable. This paper establishes the theoretical and algorithmic foundations of uncertainty (or belief) propagation by developing new deep learning models named PremiUm-CNNs (Propagating Uncertainty in Convolutional Neural Networks). We introduce a tensor normal distribution as a prior over convolutional kernels and estimate the variational posterior by maximizing the evidence lower bound (ELBO). We start by deriving a first-order mean-covariance propagation framework. We then develop a framework based on the unscented transformation (correct at least up to the second order) that propagates sigma points of the variational distribution through the layers of a CNN. The propagated covariance of the predictive distribution captures uncertainty in the output decision. Comprehensive experiments conducted on diverse benchmark datasets demonstrate: 1) superior robustness against noise and adversarial attacks, 2) self-assessment through predictive uncertainty that increases quickly with increasing levels of noise or attack strength, and 3) an ability to distinguish a targeted attack from ambient noise.
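
The sigma-point step can be sketched as follows: points summarizing a Gaussian are pushed through a nonlinearity and re-averaged to recover the output mean and covariance. This is the textbook unscented transform, not the authors' implementation; `f` stands in for one CNN stage.

```python
import numpy as np

def unscented_propagate(mu, cov, f, alpha=1.0, beta=2.0, kappa=0.0):
    """Push a Gaussian N(mu, cov) through a nonlinearity f via the
    standard unscented transform (textbook version)."""
    n = mu.shape[0]
    lam = alpha**2 * (n + kappa) - n
    S = np.linalg.cholesky((n + lam) * cov)           # matrix square root
    sigma_pts = np.vstack([mu, mu + S.T, mu - S.T])   # 2n + 1 sigma points
    wm = np.full(2 * n + 1, 1.0 / (2.0 * (n + lam)))
    wc = wm.copy()
    wm[0] = lam / (n + lam)
    wc[0] = wm[0] + (1.0 - alpha**2 + beta)
    ys = np.array([f(x) for x in sigma_pts])          # propagate each point
    mean = wm @ ys
    diff = ys - mean
    return mean, (wc[:, None] * diff).T @ diff        # output mean, covariance
```

For instance, `f = lambda x: np.maximum(W @ x + b, 0.0)` for a hypothetical weight matrix `W` and bias `b` would model one linear-plus-ReLU stage.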

Misclassification Risk and Uncertainty Quantification in Deep Classifiers

2021 IEEE Winter Conference on Applications of Computer Vision (WACV)

In this paper, we propose risk-calibrated evidential deep classifiers to reduce the costs associated with classification errors. We use two main approaches. The first is to develop methods to quantify the uncertainty of a classifier's predictions and reduce the likelihood of acting on erroneous predictions. The second is a novel way to train the classifier such that erroneous classifications are biased towards less risky categories. We combine these two approaches in a principled way. In doing so, we extend evidential deep learning with pignistic probabilities, which are used to quantify the uncertainty of classification predictions and to model rational decision-making under uncertainty. We evaluate the performance of our approach on several image classification tasks. We demonstrate that our approach allows us to (i) incorporate misclassification cost while training deep classifiers, (ii) accurately quantify the uncertainty of classification predictions, and (iii) simultaneously learn how to make classification decisions that minimize the expected cost of classification errors.
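
For reference, here is a minimal sketch of how Dirichlet evidence yields belief masses, vacuity, and pignistic probabilities with uniform base rates. The names are mine and the paper's exact formulation may differ.

```python
import torch
import torch.nn.functional as F

def evidential_outputs(logits):
    """Belief masses, vacuity, and pignistic probabilities from raw
    network outputs (standard evidential deep learning quantities)."""
    evidence = F.softplus(logits)              # non-negative evidence per class
    alpha = evidence + 1.0                     # Dirichlet parameters
    strength = alpha.sum(dim=-1, keepdim=True)
    k = logits.shape[-1]
    belief = evidence / strength               # belief mass per class
    vacuity = k / strength                     # uncertainty mass (lack of evidence)
    pignistic = belief + vacuity / k           # split vacuity uniformly over classes
    return belief, vacuity, pignistic
```

A risk-calibrated decision would then pick the class minimizing expected cost under the pignistic distribution, e.g. `(pignistic @ cost_matrix.T).argmin(dim=-1)` for a hypothetical `cost_matrix[decision, true_class]`.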

Encoding the Latent Posterior of Bayesian Neural Networks for Uncertainty Quantification

IEEE Transactions on Pattern Analysis and Machine Intelligence

Bayesian neural networks (BNNs) have long been considered an ideal yet unscalable solution for improving the robustness and predictive uncertainty of deep neural networks. While they can capture the posterior distribution of the network parameters more accurately, most BNN approaches are either limited to small networks or rely on constraining assumptions such as parameter independence. These drawbacks have allowed simple but computationally heavy approaches such as Deep Ensembles, whose training and testing costs increase linearly with the number of networks, to gain prominence. In this work, we aim for efficient deep BNNs that are amenable to complex computer vision architectures (e.g., ResNet50 with DeepLabV3+) and tasks (e.g., semantic segmentation), with fewer assumptions on the parameters. We achieve this by leveraging variational autoencoders (VAEs) to learn the interaction and the latent distribution of the parameters at each network layer. Our approach, Latent-Posterior BNN (LP-BNN), is compatible with the recent BatchEnsemble method, leading to ensembles that are highly efficient in both computation and memory during training and testing. LP-BNNs attain competitive results across multiple metrics on several challenging benchmarks for image classification, semantic segmentation, and out-of-distribution detection.
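
The core idea, compressing a layer's weights into a low-dimensional latent distribution, can be sketched with a toy VAE. The architecture and dimensions below are assumptions of this sketch; LP-BNN in fact applies the encoder to BatchEnsemble's rank-1 weight vectors rather than to full weight matrices.

```python
import torch
import torch.nn as nn

class WeightVAE(nn.Module):
    """Minimal VAE over per-layer weight vectors (illustrative only)."""
    def __init__(self, weight_dim, latent_dim=32):
        super().__init__()
        self.encoder = nn.Linear(weight_dim, 2 * latent_dim)  # -> (mu, logvar)
        self.decoder = nn.Linear(latent_dim, weight_dim)

    def forward(self, w):
        mu, logvar = self.encoder(w).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        w_hat = self.decoder(z)
        # KL(q(z|w) || N(0, I)) regularizes the latent weight posterior
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=-1).mean()
        return w_hat, kl
```

At test time, decoding several latent samples yields an implicit ensemble of layer weights at a fraction of the memory cost of storing separate networks.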

A modified Bayesian Convolutional Neural Network for Breast Histopathology Image Classification and Uncertainty Quantification

2020

Convolutional neural network (CNN) based classification models have been successfully used on histopathological images for the detection of diseases. Despite this success, CNNs may yield erroneous or overfitted results when the data are not sufficiently large or are biased. To overcome these limitations and to provide uncertainty quantification, the Bayesian CNN has recently been proposed. However, we show that the Bayesian-CNN still suffers from inaccuracies, especially in negative predictions. In the present work, we extend the Bayesian-CNN to improve accuracy and the rate of convergence. The proposed model is called the modified Bayesian-CNN. Its novelty lies in an adaptive activation function that contains a learnable parameter for each neuron. This adaptive activation function dynamically changes the loss function, thereby providing faster convergence and better accuracy. The uncertainties associated with the predictions are obtained since the model learns a probab...
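
A plausible minimal form of such an adaptive activation is a learnable per-neuron slope on the pre-activation; the exact functional form used in the paper may differ.

```python
import torch
import torch.nn as nn

class AdaptiveActivation(nn.Module):
    """ReLU with one learnable slope per neuron (sketch of the idea)."""
    def __init__(self, num_neurons):
        super().__init__()
        self.a = nn.Parameter(torch.ones(num_neurons))  # learnable per neuron

    def forward(self, x):
        # Scaling the pre-activation reshapes the loss landscape during
        # training, which is the mechanism the abstract credits for
        # faster convergence.
        return torch.relu(self.a * x)
```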

Quantifying Classification Uncertainty using Regularized Evidential Neural Networks

ArXiv, 2019

Traditional deep neural nets (NNs) have shown state-of-the-art performance in classification tasks across various applications. However, NNs typically do not consider the uncertainty associated with class probabilities, which is needed to minimize the risk of misclassification under uncertainty in real life. Unlike Bayesian neural nets, which infer uncertainty indirectly through weight uncertainties, evidential neural networks (ENNs) have recently been proposed to support explicit modeling of the uncertainty of class probabilities. An ENN treats the predictions of an NN as subjective opinions and learns, from data, a deterministic function that collects the evidence leading to these opinions. However, an ENN is trained as a black box without explicitly considering different types of inherent data uncertainty, such as vacuity (uncertainty due to a lack of evidence) or dissonance (uncertainty due to conflicting evidence). This paper presents a new approach, called a regularized ENN, tha...
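
Both uncertainty types can be computed from an ENN's per-class evidence; the sketch below follows the standard subjective-logic formulas for vacuity and dissonance, with variable names of my choosing.

```python
import numpy as np

def vacuity_and_dissonance(evidence):
    """Vacuity and dissonance from per-class evidence (subjective-logic
    formulas commonly used with ENNs; illustrative sketch)."""
    evidence = np.asarray(evidence, dtype=float)
    k = evidence.size
    strength = evidence.sum() + k          # Dirichlet strength alpha_0
    belief = evidence / strength
    vacuity = k / strength                 # uncertainty from lack of evidence

    dissonance = 0.0
    for i in range(k):
        others = np.delete(belief, i)
        denom = others.sum()
        if denom > 0:
            # relative mass balance between pairs of belief masses
            bal = 1.0 - np.abs(others - belief[i]) / (others + belief[i] + 1e-12)
            dissonance += belief[i] * (others * bal).sum() / denom
    return vacuity, dissonance
```

For example, evidence `[10, 10]` gives low vacuity but high dissonance (strong yet conflicting evidence), while `[0, 0]` gives vacuity 1.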

Uncertainty handling in convolutional neural networks

Neural Computing and Applications

The performance of convolutional neural networks is degraded by noisy data, especially in the test phase. To address this challenge, a new convolutional neural network structure with data-indeterminacy handling in the neutrosophic (NS) domain, named the Neutrosophic Convolutional Neural Network (NCNN), is proposed for image classification. For this task, images are first mapped from the pixel domain to three sets, true (T), indeterminacy (I), and false (F), in the NS domain by the proposed method. Then, the NCNN is constructed with two parallel paths, one taking T as input and the other I, followed by an appropriate combination of the paths to generate the final output. The two paths are trained simultaneously, and the network weights are updated using the backpropagation algorithm. The effectiveness of the NCNN in handling noisy data is analyzed mathematically in terms of the weight-update rule. The proposed two-path NS idea is applied to two base models, CNN and VGG-Net, to construct NCNN and NVGG-Net, respectively. The proposed method has been evaluated on the MNIST, CIFAR-10, and CIFAR-100 datasets contaminated with 20 levels of Gaussian noise. Results show that the two-path NCNN outperforms the CNN by 5.11% and 2.21% on 5 (training, test) pairs with different noise levels on the MNIST and CIFAR-10 datasets, respectively. Finally, NVGG-Net increases accuracy by 3.09% and 2.57% compared with VGG-Net on the CIFAR-10 and CIFAR-100 datasets, respectively.
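
The pixel-to-NS mapping is commonly defined through a local mean; the sketch below uses that common formulation, which may differ in detail from the paper's exact definition.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def to_neutrosophic(image, window=3):
    """Map a grayscale image to (T, I, F) sets in the neutrosophic domain
    via the common local-mean formulation (illustrative sketch)."""
    img = image.astype(float)
    local_mean = uniform_filter(img, size=window)

    def normalize(x):
        lo, hi = x.min(), x.max()
        return (x - lo) / (hi - lo + 1e-12)

    T = normalize(local_mean)              # degree of truth (membership)
    delta = np.abs(img - local_mean)       # deviation from local mean
    I = normalize(delta)                   # indeterminacy (noise-like deviation)
    F = 1.0 - T                            # degree of falsity
    return T, I, F
```

The T and I sets would then feed the two parallel CNN paths described in the abstract.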

Uncertainty Quantification in Chest X-Ray Image Classification using Bayesian Deep Neural Networks

2020

In this presentation, we quantify the uncertainty of deep neural networks (DNNs) for the task of chest X-ray (CXR) image classification. We investigate the uncertainties of several commonly used DNN architectures, including ResNet, ResNeXt, DenseNet, and SENet. We propose an uncertainty-based strategy and analyze its impact on classifier performance. Results show that utilizing uncertainty information may improve DNN performance for some metrics and observations.
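
The abstract does not specify the uncertainty estimator; one common choice, shown here purely as an illustrative sketch, is Monte Carlo dropout with predictive entropy used to abstain on uncertain cases.

```python
import torch

@torch.no_grad()
def mc_predict(model, x, n_samples=20):
    """Monte Carlo prediction with dropout left active at test time
    (note: model.train() also switches batch-norm layers to training
    mode; production code would enable only the dropout modules)."""
    model.train()
    probs = torch.stack([model(x).softmax(dim=-1) for _ in range(n_samples)])
    mean_prob = probs.mean(dim=0)
    entropy = -(mean_prob * (mean_prob + 1e-12).log()).sum(dim=-1)
    return mean_prob, entropy

# Uncertainty-based strategy: abstain on high-entropy cases.
# The threshold is hypothetical and would be tuned on a validation set:
# mean_prob, entropy = mc_predict(model, batch)
# confident = entropy < 0.5
```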

Measuring the Uncertainty of Predictions in Deep Neural Networks with Variational Inference

Sensors

We present a novel approach for training deep neural networks in a Bayesian way. Compared to other Bayesian deep learning formulations, our approach allows for quantifying the uncertainty in model parameters while adding only very few additional parameters to be optimized. The proposed approach uses variational inference to approximate the intractable a posteriori distribution on the basis of a normal prior. By representing the a posteriori uncertainty of the network parameters per network layer, and in dependence on the estimated parameter expectation values, only very few additional parameters need to be optimized compared with a non-Bayesian network. We compare our approach to classical deep learning, Bernoulli dropout, and Bayes by Backprop using the MNIST dataset. Compared to classical deep learning, the test error is reduced by 15%. We also show that the uncertainty information obtained can be used to calculate credible intervals for the network prediction and to optimize network architec...
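
Given weight samples from the variational posterior, a credible interval for a prediction is simply an empirical quantile over repeated forward passes; a minimal sketch, with names of my choosing:

```python
import numpy as np

def credible_interval(samples, level=0.95):
    """Empirical credible interval from Monte Carlo prediction samples.
    `samples` has shape (n_draws, ...) and would come from forward passes
    with weights drawn from the variational posterior."""
    lo = (1.0 - level) / 2.0
    return np.quantile(samples, [lo, 1.0 - lo], axis=0)
```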