You're reading from Enhancing Deep Learning with Bayesian Inference

Product type: Book
Published in: Jun 2023
Publisher: Packt
ISBN-13: 9781803246888
Edition: 1st Edition
Authors (3):

Matt Benatan

Matt Benatan is a Principal Research Scientist at Sonos and a Simon Industrial Fellow at the University of Manchester. His work involves research in robust multimodal machine learning, uncertainty estimation, Bayesian optimization, and scalable Bayesian inference.

Jochem Gietema

Jochem Gietema is an Applied Scientist at Onfido in London, where he has developed and deployed several patented solutions related to anomaly detection, computer vision, and interactive data visualisation.

Marian Schneider

Marian Schneider is an applied scientist in machine learning. His work involves developing and deploying applications in computer vision, ranging from brain image segmentation and uncertainty estimation to smarter image capture on mobile devices.

Chapter 3
Fundamentals of Deep Learning

Throughout the book, when studying how to apply Bayesian methods and extensions to neural networks, we will encounter different neural network architectures and applications. This chapter introduces the most common architecture types, laying the foundation for the Bayesian extensions we will apply to them later on. We will also review some limitations of these architectures, in particular their tendency to produce overconfident outputs and their susceptibility to adversarial manipulation of their inputs. By the end of this chapter, you should have a good understanding of deep neural network basics and know how to implement the most common neural network architecture types in code. This will help you follow the code examples in later sections.

The content will be covered in the following sections:

  • Introducing the multi-layer perceptron

  • Reviewing neural network architectures

  • Understanding...

3.1 Technical requirements

To complete the practical tasks in this chapter, you will need a Python 3.8 environment with the pandas and scikit-learn stack and the following additional Python packages installed:

  • TensorFlow 2.0

  • Matplotlib plotting library

All of the code for this book can be found in its GitHub repository: https://github.com/PacktPublishing/Enhancing-Deep-Learning-with-Bayesian-Inference.

3.2 Introducing the multi-layer perceptron

Deep neural networks are at the core of the deep learning revolution. The aim of this section is to introduce the basic concepts and building blocks of deep neural networks. To get started, we will review the components of the multi-layer perceptron (MLP) and implement it using the TensorFlow framework. This will serve as the foundation for other code examples in the book. If you are already familiar with neural networks and know how to implement them in code, feel free to jump ahead to the Understanding the problem with typical neural networks section, where we cover the limitations of deep neural networks. This chapter focuses on architectural building blocks and principles and does not cover learning rules and gradients. If you require additional background on those topics, we recommend Sebastian Raschka’s excellent Python Machine Learning book from Packt Publishing (in particular, its early chapters on perceptron learning rules and gradient descent)...

3.3 Reviewing neural network architectures

In the previous section, we saw how to implement a fully-connected network in the form of an MLP. While such networks were very popular in the early days of deep learning, over the years machine learning researchers have developed more sophisticated architectures that achieve better results by incorporating knowledge specific to a domain, such as computer vision or Natural Language Processing (NLP). In this section, we will review some of the most common of these neural network architectures, including Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), as well as attention mechanisms and transformers.

3.3.1 Exploring CNNs

When looking back at the example of trying to predict London housing prices with an MLP model, the input features we used (distance to the city centre, floor area, and construction year of the house) were still "hand-engineered," meaning that a human looked at the problem and decided which...

3.4 Understanding the problem with typical neural networks

The deep neural networks we discussed in previous sections are extremely powerful and, paired with appropriate training data, have enabled big strides in machine perception. In machine vision, convolutional neural networks enable us to classify images, locate objects within them, segment them into regions or instances, and even generate entirely novel images. In natural language processing, recurrent neural networks and transformers have allowed us to classify text, recognize speech, generate novel text and, as reviewed previously, translate between two different languages.

However, these standard types of neural network models also have several limitations, which we will explore in this section. We will look at the following (a short demonstration follows the list):

  • How the prediction scores of such neural network models can be overconfident

  • How such models can produce very confident predictions on out-of-distribution (OOD) data

  • How tiny, imperceptible...

3.5 Summary

In this chapter, we have seen different types of common neural networks. First, we discussed the key building blocks of neural networks, with a special focus on the multi-layer perceptron. Then we reviewed common neural network architectures: convolutional neural networks, recurrent neural networks, and the attention mechanism. All of these components allow us to build very powerful deep learning models that can sometimes achieve super-human performance. In the second part of the chapter, however, we reviewed several problems with neural networks. We discussed how they can be overconfident and how they do not handle out-of-distribution data well. We also saw how small, imperceptible changes to a neural network’s input can cause the model to make an incorrect prediction.

In the next chapter, we will combine the concepts learned in this chapter with those from Chapter 2, Fundamentals of Bayesian Inference, and discuss Bayesian deep learning, which has the potential to overcome some of the...

3.6 Further reading

There are many excellent resources for learning more about the essential building blocks of deep learning. Here are a few popular starting points:

  • Nielsen, M.A., 2015. Neural Networks and Deep Learning. Determination Press. http://neuralnetworksanddeeplearning.com/.

  • Chollet, F., 2021. Deep Learning with Python. Manning Publications.

  • Raschka, S., 2015. Python Machine Learning. Packt Publishing Ltd.

  • Ng, A., 2022. Deep Learning Specialization. Coursera.

  • Johnson, J., 2019. EECS 498-007/598-005: Deep Learning for Computer Vision. University of Michigan.

To learn more about the problems of deep learning models, you can read some of the following resources:

  • Overconfidence and calibration:

    • Guo, C., Pleiss, G., Sun, Y. and Weinberger, K.Q., 2017. On calibration of modern neural networks. In International Conference on Machine Learning (pp. 1321-1330). PMLR.

    • Ovadia, Y., Fertig, E., Ren, J., Nado, Z., Sculley...

