You're reading from Enhancing Deep Learning with Bayesian Inference

Product type: Book
Published in: Jun 2023
Publisher: Packt
ISBN-13: 9781803246888
Edition: 1st Edition
Authors (3):

Matt Benatan

Matt Benatan is a Principal Research Scientist at Sonos and a Simon Industrial Fellow at the University of Manchester. His work involves research in robust multimodal machine learning, uncertainty estimation, Bayesian optimization, and scalable Bayesian inference.

Jochem Gietema

Jochem Gietema is an Applied Scientist at Onfido in London, where he has developed and deployed several patented solutions related to anomaly detection, computer vision, and interactive data visualisation.

Marian Schneider

Marian Schneider is an applied scientist in machine learning. His work involves developing and deploying applications in computer vision, ranging from brain image segmentation and uncertainty estimation to smarter image capture on mobile devices.

Chapter 3
Fundamentals of Deep Learning

Throughout the book, when studying how to apply Bayesian methods and extensions to neural networks, we will encounter different neural network architectures and applications. This chapter introduces the most common architecture types, laying the foundation for the Bayesian extensions we will apply to them later on. We will also review some limitations of these architectures, in particular their tendency to produce overconfident outputs and their susceptibility to adversarial manipulation of their inputs. By the end of this chapter, you should have a good understanding of deep neural network basics and know how to implement the most common neural network architecture types in code. This will help you follow the code examples in later sections.

The content will be covered in the following sections:

  • Introducing the multi-layer perceptron

  • Reviewing neural network architectures

  • Understanding...

3.1 Technical requirements

To complete the practical tasks in this chapter, you will need a Python 3.8 environment with the pandas and scikit-learn stack and the following additional Python packages installed:

  • TensorFlow 2.0

  • Matplotlib plotting library

All of the code for this book can be found in its GitHub repository: https://github.com/PacktPublishing/Enhancing-Deep-Learning-with-Bayesian-Inference.

3.2 Introducing the multi-layer perceptron

Deep neural networks are at the core of the deep learning revolution. The aim of this section is to introduce the basic concepts and building blocks of deep neural networks. To get started, we will review the components of the multi-layer perceptron (MLP) and implement it using the TensorFlow framework. This will serve as the foundation for other code examples in the book. If you are already familiar with neural networks and know how to implement them in code, feel free to jump ahead to the Understanding the problem with typical neural networks section, where we cover the limitations of deep neural networks. This chapter focuses on architectural building blocks and principles and does not cover learning rules and gradients. If you require additional background on those topics, we recommend Sebastian Raschka’s excellent Python Machine Learning book from Packt Publishing (in particular, its early chapters on perceptron learning rules and gradient descent)...

3.3 Reviewing neural network architectures

In the previous section, we saw how to implement a fully-connected network in the form of an MLP. While such networks were very popular in the early days of deep learning, over the years machine learning researchers have developed more sophisticated architectures that achieve better results by incorporating knowledge specific to a domain, such as computer vision or Natural Language Processing (NLP). In this section, we will review some of the most common of these neural network architectures, including Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), as well as attention mechanisms and transformers.

3.3.1 Exploring CNNs

When looking back at the example of trying to predict London housing prices with an MLP model, the input features we used (distance to the city centre, floor area, and construction year of the house) were still "hand-engineered," meaning that a human looked at the problem and decided which...

3.4 Understanding the problem with typical neural networks

The deep neural networks we discussed in previous sections are extremely powerful and, paired with appropriate training data, have enabled big strides in machine perception. In machine vision, convolutional neural networks enable us to classify images, locate objects within them, segment them into regions or instances, and even generate entirely novel images. In natural language processing, recurrent neural networks and transformers have allowed us to classify text, recognize speech, generate novel text and, as reviewed previously, translate between two different languages.

However, these standard types of neural network models also have several limitations, which we will explore in this section. We will look at the following (a short demonstration follows the list):

  • How the prediction scores of such neural network models can be overconfident

  • How such models can produce very confident predictions on out-of-distribution (OOD) data

  • How tiny, imperceptible...

3.5 Summary

In this chapter, we have seen different types of common neural networks. First, we discussed the key building blocks of neural networks, with a special focus on the multi-layer perceptron. Then we reviewed common neural network architectures: convolutional neural networks, recurrent neural networks, and the attention mechanism. All of these components allow us to build very powerful deep learning models that can sometimes achieve super-human performance. In the second part of the chapter, however, we reviewed several problems with neural networks. We discussed how they can be overconfident and how they do not handle out-of-distribution data well. We also saw how small, imperceptible changes to a neural network’s input can cause the model to make an incorrect prediction.

In the next chapter, we will combine the concepts learned in this chapter with those from Chapter 2, Fundamentals of Bayesian Inference, and discuss Bayesian deep learning, which has the potential to overcome some of the...

3.6 Further reading

There are many excellent resources for learning more about the essential building blocks of deep learning. Here are a few popular starting points:

  • Nielsen, M.A., 2015. Neural Networks and Deep Learning. Determination Press. http://neuralnetworksanddeeplearning.com/.

  • Chollet, F., 2021. Deep Learning with Python. Manning Publications.

  • Raschka, S., 2015. Python Machine Learning. Packt Publishing Ltd.

  • Ng, A., 2022. Deep Learning Specialization. Coursera.

  • Johnson, J., 2019. EECS 498-007/598-005: Deep Learning for Computer Vision. University of Michigan.

To learn more about the problems of deep learning models, you can read some of the following resources:

  • Overconfidence and calibration:

    • Guo, C., Pleiss, G., Sun, Y. and Weinberger, K.Q., 2017. On calibration of modern neural networks. In International Conference on Machine Learning (pp. 1321-1330). PMLR.

    • Ovadia, Y., Fertig, E., Ren, J., Nado, Z., Sculley...

