You're reading from Enhancing Deep Learning with Bayesian Inference

Product type: Book
Published in: Jun 2023
Publisher: Packt
ISBN-13: 9781803246888
Edition: 1st
Authors (3):

Matt Benatan

Matt Benatan is a Principal Research Scientist at Sonos and a Simon Industrial Fellow at the University of Manchester. His work involves research in robust multimodal machine learning, uncertainty estimation, Bayesian optimization, and scalable Bayesian inference.

Jochem Gietema

Jochem Gietema is an Applied Scientist at Onfido in London where he has developed and deployed several patented solutions related to anomaly detection, computer vision, and interactive data visualisation.

Marian Schneider

Marian Schneider is an applied scientist in machine learning. His work involves developing and deploying applications in computer vision, ranging from brain image segmentation and uncertainty estimation to smarter image capture on mobile devices.

Chapter 8
Applying Bayesian Deep Learning

This chapter will guide you through a variety of applications of Bayesian deep learning (BDL). These include the use of BDL in standard classification tasks, as well as demonstrations of how it can be used in more sophisticated ways: for out-of-distribution detection, data selection, and reinforcement learning.

We will cover these topics in the following sections:

  • Detecting out-of-distribution data

  • Being robust against dataset shift

  • Using data selection via uncertainty to keep models fresh

  • Using uncertainty estimates for smarter reinforcement learning

  • Susceptibility to adversarial input

8.1 Technical requirements

All of the code for this book can be found on the GitHub repository for the book: https://github.com/PacktPublishing/Enhancing-Deep-Learning-with-Bayesian-Inference.

8.2 Detecting out-of-distribution data

Typical neural networks do not handle out-of-distribution data well. We saw in Chapter 3, Fundamentals of Deep Learning, that a cat-dog classifier classified an image of a parachute as a dog with more than 99% confidence. In this section, we will look into what we can do about this vulnerability of neural networks. We will do the following:

  • Explore the problem visually by perturbing a digit of the MNIST dataset

  • Explain the typical way out-of-distribution detection performance is reported in the literature

  • Review the out-of-distribution detection performance of some of the standard practical BDL methods we look at in this chapter

  • Explore additional practical methods that are specifically tailored to out-of-distribution detection

8.2.1 Exploring the problem of out-of-distribution detection

To give you a better understanding of what out-of-distribution performance is like, we will start with a visual example. Here is what we will do...
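As a minimal sketch of this kind of experiment (the names here are assumptions, not the book's exact code: a trained Keras MNIST classifier, model, and a single digit image, digit, of shape (28, 28) scaled to [0, 1]), we can rotate the digit by increasing angles and track the model's maximum softmax probability:

import numpy as np
from scipy.ndimage import rotate

def confidence_under_rotation(model, digit, angles=range(0, 181, 15)):
    # Rotate the digit by each angle, keeping the original image size
    rotated = np.stack(
        [rotate(digit, angle, reshape=False) for angle in angles]
    )
    probs = model.predict(rotated[..., np.newaxis])
    # Report the highest softmax probability for each rotated input
    return {angle: probs[i].max() for i, angle in enumerate(angles)}

For a standard network, the maximum probability often remains high even for heavily rotated digits, despite these inputs lying far from the training distribution; a Bayesian treatment should instead show growing uncertainty.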

8.3 Being robust against dataset shift

We already encountered dataset shift in Chapter 3, Fundamentals of Deep Learning. As a reminder, dataset shift is a common problem in machine learning that happens when the joint distribution P(X, Y) of inputs X and outputs Y differs between the model training stage and model inference stage (for example, when testing the model or when running it in a production environment). Covariate shift is a specific case of dataset shift where only the distribution of the inputs changes but the conditional distribution P(Y | X) stays constant.

Dataset shift is present in most production environments because of the difficulty of including all possible inference conditions during training and because most data is not static but changes over time. The input data can shift along many different dimensions in a production environment. Geographic and temporal dataset shift are two common forms of shift. Imagine, for example, you have trained your model on data...
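As a minimal sketch of how such a shift can be simulated (the names model, x_test, and y_test are assumptions: a trained classifier, test inputs scaled to [0, 1], and integer labels), we can corrupt the test set with increasing amounts of Gaussian noise and track accuracy at each severity level:

import numpy as np

def accuracy_under_shift(model, x_test, y_test,
                         severities=(0.0, 0.1, 0.2, 0.4)):
    results = {}
    for sigma in severities:
        # Add Gaussian noise of increasing strength to mimic covariate shift
        x_shifted = np.clip(
            x_test + np.random.normal(0.0, sigma, x_test.shape), 0.0, 1.0
        )
        preds = model.predict(x_shifted).argmax(axis=-1)
        results[sigma] = (preds == y_test).mean()
    return results

This mirrors the idea behind the corruption benchmarks discussed in the Further reading section, where accuracy and calibration are measured across severity levels.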

8.4 Using data selection via uncertainty to keep models fresh

We saw at the beginning of the chapter that we can use uncertainties to figure out whether data is part of the training distribution or not. We can expand on this idea in the context of an area of machine learning called active learning. The promise of active learning is that a model can learn more effectively from less data if we have a way to control the type of data it is trained on. Conceptually, this makes sense: if we train a model on data of insufficient quality, it will not perform well. Active learning guides the learning process, and the data a model is trained on, by providing acquisition functions that select examples from a pool of data outside the training set. By iteratively selecting the right data from the pool, we can train a model that performs better than if we had chosen data from the pool at random.

Active learning can be used in many modern-day systems where there is a ton of unlabeled...
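As a minimal sketch of such an acquisition function (assuming a trained classifier, model, that outputs softmax probabilities, and an unlabelled pool, x_pool — both names are illustrative), we can score each pool example by predictive entropy and acquire the most uncertain ones:

import numpy as np

def acquire_most_uncertain(model, x_pool, n_points=100):
    probs = model.predict(x_pool)
    # Predictive entropy: high values mean the model is unsure
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=-1)
    # Return the indices of the most uncertain pool examples;
    # labelling these should be most informative for the model
    return np.argsort(entropy)[-n_points:]

In a BDL setting, the probabilities would typically be averaged over several stochastic forward passes (for example, with Monte Carlo dropout) so that the entropy also reflects the model's epistemic uncertainty.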

8.5 Using uncertainty estimates for smarter reinforcement learning

Reinforcement learning aims to develop machine learning techniques capable of learning from their environment. There’s a clue to the fundamental principle behind reinforcement learning in its name: the aim is to reinforce successful behaviour. Generally speaking, in reinforcement learning, we have an agent capable of executing a number of actions in an environment. Following these actions, the agent receives feedback from the environment, and this feedback is used to allow the agent to build a better understanding of which actions are more likely to lead to a positive outcome given the current state of the environment.

Formally, we can describe this using a set of states, S; a set of actions, A, which map from a current state, s, to a new state, s′; and a reward function, R(s, s′), describing the reward for the transition between the current state, s, and the new state, s′. The set of states comprises...
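To make the notation concrete, here is a minimal, illustrative tabular Q-learning update (all names and sizes are hypothetical, not the book's code): the Q-table is indexed by states and actions, and the reward for the transition from s to s′ reinforces the action that produced it:

import numpy as np

n_states, n_actions = 10, 4  # illustrative sizes
Q = np.zeros((n_states, n_actions))

def q_update(s, a, s_next, reward, alpha=0.1, gamma=0.99):
    # Reinforce the action a taken in state s in proportion to the
    # reward R(s, s') plus the estimated value of the new state s'
    td_target = reward + gamma * Q[s_next].max()
    Q[s, a] += alpha * (td_target - Q[s, a])

An uncertainty-aware agent would replace the single table of value estimates with a distribution over them (for example, via an ensemble), allowing it to act more cautiously where its estimates are uncertain.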

8.6 Susceptibility to adversarial input

In Chapter 3, Fundamentals of Deep Learning, we saw that we could fool a CNN by slightly perturbing the input pixels of an image. A picture that clearly looked like a cat was predicted as a dog with high confidence. The adversarial attack that we created, the Fast Gradient Sign Method (FGSM), is one of the many adversarial attacks that exist, and BDL might offer some protection against these attacks; a brief sketch of the attack appears below as a reminder. Let's see how that works in practice.
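This is a minimal FGSM sketch, not the chapter's exact code, assuming a binary cat/dog Keras classifier, model, with inputs scaled to [0, 1]: the input is nudged in the direction of the sign of the loss gradient.

import tensorflow as tf

def fgsm_perturb(model, image, label, epsilon=0.01):
    # Compute the loss gradient with respect to the input image ...
    image = tf.convert_to_tensor(image)
    with tf.GradientTape() as tape:
        tape.watch(image)
        prediction = model(image)
        loss = tf.keras.losses.binary_crossentropy(label, prediction)
    gradient = tape.gradient(loss, image)
    # ... and step each pixel in the direction that increases the loss
    return tf.clip_by_value(image + epsilon * tf.sign(gradient), 0.0, 1.0)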

Step 1: Model training

Instead of using a pre-trained model, as in Chapter 3, Fundamentals of Deep Learning, we train a model from scratch. We use the same training and test data as in Chapter 3 – see that chapter for instructions on how to load the dataset. As a reminder, it is a relatively small dataset of cat and dog images. We first define our model. We use a VGG-like architecture, but add dropout after every MaxPooling2D layer:

 
def conv_block(filters):  
   ...
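The full definition is in the book's repository; as an illustrative sketch of what such a block might look like (the filter sizes and dropout rate here are assumptions), each block stacks two convolutions, max pooling, and dropout:

import tensorflow as tf
from tensorflow.keras import layers

def conv_block(filters, dropout_rate=0.25):
    # Two convolutions followed by pooling, with dropout after the
    # MaxPooling2D layer as described above
    return tf.keras.Sequential([
        layers.Conv2D(filters, 3, padding="same", activation="relu"),
        layers.Conv2D(filters, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Dropout(dropout_rate),
    ])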

8.7 Summary

In this chapter, we have illustrated the various applications of modern BDL in five different case studies. Each case study used code examples to highlight a particular strength of BDL in response to various common problems in applied machine learning practice. First, we saw how BDL can be used to detect out-of-distribution images in a classification task. We then looked at how BDL methods can make models more robust to dataset shift, which is a very common problem in production environments. Next, we learned how BDL can help us select the most informative data points for training and updating our machine learning models. We then turned to reinforcement learning and saw how BDL can be used to facilitate more cautious behaviour in reinforcement learning agents. Finally, we saw how BDL can help us in the face of adversarial attacks.

In the next chapter, we will have a look at the future of BDL by reviewing current trends and the latest methods.

8.8 Further reading

The following reading list will offer a greater understanding of some of the topics we touched on in this chapter:

  • Benchmarking Neural Network Robustness to Common Corruptions and Perturbations, Dan Hendrycks and Thomas Dietterich, 2019: this is the paper that introduced the image quality perturbations used to benchmark model robustness, which we saw in the robustness case study.

  • Can You Trust Your Model's Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift, Yaniv Ovadia, Emily Fertig et al., 2019: this comparison paper uses image quality perturbations to introduce artificial dataset shift at different severity levels and measures how different deep neural networks respond to dataset shift in terms of accuracy and calibration.

  • A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks, Dan Hendrycks and Kevin Gimpel, 2016: this fundamental out-of-distribution detection paper introduces the concept and shows that softmax...

