Although effective uncertainty estimation can be a key consideration in the development of safe and fair artificial intelligence systems, most of today’s large-scale deep learning applications are lacking in this regard.
To accelerate research in this field, a team from DeepMind has proposed epistemic neural networks (ENNs) as an interface for uncertainty modelling in deep learning, and the KL divergence from a target distribution as a precise metric to evaluate ENNs. In the paper Epistemic Neural Networks, the team also introduces a computational testbed based on inference in a neural network Gaussian process, and validates that the proposed ENNs can improve performance in terms of statistical quality and computational cost.
The researchers say all existing approaches to uncertainty modelling in deep learning can be expressed as ENNs, presenting a new perspective on the potential of neural networks as computational tools for approximate posterior inference.
The study seeks to develop neural networks that are effective tools for probabilistic inference. The proposed ENNs take an input, parameters, and an epistemic index to make predictions. They are designed to tackle the complex problem of posterior inference in Bayesian neural networks (BNNs): rather than inferring a posterior over the weights, the network learns a posterior distribution through the epistemic index it receives as an additional input.
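To make the interface concrete, here is a minimal sketch in which a small ensemble of linear models is treated as an ENN: the epistemic index simply selects an ensemble member. The class name `EnsembleENN`, its methods, and the example weights are hypothetical and chosen for illustration; they are not taken from the paper's code.

```python
import random
from typing import List, Sequence


class EnsembleENN:
    """Toy ENN (illustrative assumption): an ensemble of linear models.

    The prediction depends on an input x, the parameters (one weight
    vector per ensemble member), and an epistemic index z.
    """

    def __init__(self, weights: List[List[float]]) -> None:
        self.weights = weights  # parameters: one weight vector per member

    def sample_index(self, rng: random.Random) -> int:
        # Reference distribution over indices: uniform over ensemble members.
        return rng.randrange(len(self.weights))

    def predict(self, x: Sequence[float], z: int) -> float:
        # The index z picks which member's weights produce the prediction.
        w = self.weights[z]
        return sum(wi * xi for wi, xi in zip(w, x))


enn = EnsembleENN(weights=[[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
rng = random.Random(0)
x = [2.0, 3.0]

# Varying the index while holding x fixed exposes disagreement between
# members, which is how this toy model expresses epistemic uncertainty.
predictions = [enn.predict(x, enn.sample_index(rng)) for _ in range(5)]
```

In this sketch the spread of `predictions` across sampled indices stands in for uncertainty about the underlying function; a single fixed index gives an ordinary point prediction.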
To evaluate their proposed ENNs, the team suggests that computational constraints in both training and evaluation be taken into account. They therefore assess ENN performance through the quality of the posterior approximation, proposing the KL divergence (Kullback and Leibler, 1951) from a target distribution as their evaluation metric.
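For reference, the standard definition of the KL divergence is given below. The symbols are notation assumed here rather than taken from the paper: P* denotes the target predictive distribution, P-hat the ENN's approximation, and the expectation is taken under the target.

```latex
\[
d_{\mathrm{KL}}\left(P^{*} \,\|\, \hat{P}\right)
  = \mathbb{E}_{y \sim P^{*}}\left[\log \frac{P^{*}(y)}{\hat{P}(y)}\right]
\]
```

The quantity is zero when the two distributions coincide and grows as the approximation assigns low probability to outcomes the target considers likely.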
Another key contribution of the paper is a practical testbed for comparing the quality of ENNs in a computationally efficient manner. The researchers specify a target posterior in terms of the neural network Gaussian process (NNGP). They use the Neural Tangents library to compute the target NNGP posterior, and a sample-based approximation to estimate the KL divergence of ENN predictions from that target. Notably, the testbed is based on a generative model for the underlying posterior inference, thus alleviating a number of severe overfitting problems.
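One simple way to picture a sample-based KL estimate is sketched below. It assumes a one-dimensional Gaussian target, fits a Gaussian to sampled ENN predictions by moment matching, and then applies the closed-form KL divergence between two univariate Gaussians. The function names and the numbers are hypothetical, and this is not claimed to be the paper's exact estimator; it does not use Neural Tangents.

```python
import math
import random
import statistics
from typing import Sequence


def gaussian_kl(mean_p: float, std_p: float, mean_q: float, std_q: float) -> float:
    """Closed-form KL(N(mean_p, std_p^2) || N(mean_q, std_q^2))."""
    return (
        math.log(std_q / std_p)
        + (std_p ** 2 + (mean_p - mean_q) ** 2) / (2 * std_q ** 2)
        - 0.5
    )


def estimate_kl_from_samples(
    target_mean: float, target_std: float, enn_samples: Sequence[float]
) -> float:
    """Moment-match a Gaussian to ENN samples, then compare it to the target."""
    approx_mean = statistics.fmean(enn_samples)
    approx_std = statistics.stdev(enn_samples)
    return gaussian_kl(target_mean, target_std, approx_mean, approx_std)


rng = random.Random(0)
target_mean, target_std = 1.0, 2.0  # stand-in for a target posterior at one input

# Hypothetical ENN predictions obtained by sampling many epistemic indices.
well_calibrated = [rng.gauss(1.0, 2.0) for _ in range(5000)]
overconfident = [rng.gauss(1.0, 0.2) for _ in range(5000)]

kl_good = estimate_kl_from_samples(target_mean, target_std, well_calibrated)
kl_bad = estimate_kl_from_samples(target_mean, target_std, overconfident)
```

Under these assumptions `kl_good` comes out close to zero, while `kl_bad` is much larger because the overconfident samples understate the target's variance, which is the kind of failure such a metric is meant to penalise.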
The team evaluated their approach using several benchmarks for uncertainty estimation in deep learning. The results show that the proposed metrics are robust to the choice of kernel and that ENNs can offer orders of magnitude of savings in computation. Moreover, the GP bandit problem experiment demonstrates that testbed performance is highly correlated with performance in the sequential decision problem.
The team believes their proposed ENNs coupled with the evaluation testbed can open up exciting new avenues of research, and represent an important step towards effective uncertainty estimation in large and complex deep learning systems.
Author: Hecate He | Editor: Michael Sarazen, Chain Zhang