Feedforward Neural Networks

In the previous chapter, we covered linear neural networks, which have proven to be effective for problems such as regression and so are widely used in the industry. However, we also saw that they have their limitations and are unable to work effectively on higher-dimensional problems.

In this chapter, we will take an in-depth look at the multilayer perceptron (MLP), a type of feedforward neural network (FNN). We will start by looking at how biological neurons process information and then move on to mathematical models of biological neurons. The artificial neural networks (ANNs) we will study in this book are made up of such mathematical models of biological neurons (we will learn more about this shortly). Once we have built a foundation, we will move on to understanding how MLPs, the FNNs we will focus on, work and their involvement with...

Understanding biological neural networks

The human brain is capable of some remarkable feats of information processing. The neurons that make up our brains are very densely connected and operate in parallel with one another. These biological neurons receive and pass signals to other neurons through the connections (synapses) between them. These synapses have strengths associated with them, and strengthening or weakening the connections between neurons is what facilitates our learning and allows us to continuously adapt to the dynamic environments we live in.

As we know, the brain consists of neurons—in fact, according to recent studies, it is estimated that the human brain contains roughly 86 billion neurons. That is a lot of neurons and a whole lot more connections. A very large number of these neurons are used simultaneously...

Comparing the perceptron and the McCulloch-Pitts neuron

In this section, we will cover two mathematical models of biological neurons—the McCulloch-Pitts (MP) neuron and Rosenblatt's perceptron—which create the foundation for neural networks.

The MP neuron

The MP neuron was created in 1943 by Warren McCulloch and Walter Pitts and is the first mathematical model of a biological neuron. It was designed primarily for classification tasks. The MP neuron takes binary values as input and outputs a binary value based on a threshold: if the sum of the inputs meets or exceeds the threshold, the neuron outputs 1; otherwise, it outputs 0. In the...
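
To make the thresholding rule concrete, here is a minimal sketch of an MP neuron in Python; the function name and the AND-gate example are illustrative assumptions, not taken from the book:

import numpy as np

def mp_neuron(inputs, threshold):
    # Output 1 if the sum of the binary inputs meets or exceeds the threshold, else 0
    return int(np.sum(inputs) >= threshold)

# Example: with a threshold of 2, a two-input MP neuron behaves like an AND gate
print(mp_neuron(np.array([1, 1]), threshold=2))  # 1
print(mp_neuron(np.array([1, 0]), threshold=2))  # 0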

MLPs

As mentioned, both the MP neuron and perceptron models are unable to deal with nonlinear problems. To combat this issue, modern-day perceptrons use an activation function that introduces nonlinearity to the output.

The perceptrons (neurons, but we will mostly refer to them as nodes going forward) that we will use take the following form:

y = φ(Σ_i w_i x_i + b)

Here, y is the output, φ is a nonlinear activation function, the x_i are the inputs to the unit, the w_i are the weights, and b is the bias. This improved version of the perceptron computes a weighted sum of its inputs, adds the bias, and passes the result through the activation function, which is generally the sigmoid function:

σ(z) = 1 / (1 + exp(-z))

The sigmoid activation function squashes all output values into the (0, 1) range. It is largely used for historical reasons, since the developers of the earlier neurons focused on thresholding. When gradient-based learning...
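
As a concrete illustration of such a node, here is a minimal sketch in Python with NumPy; the function names and example values are illustrative assumptions rather than the book's own code:

import numpy as np

def sigmoid(z):
    # Squashes any real value into the (0, 1) range
    return 1.0 / (1.0 + np.exp(-z))

def node_output(x, w, b):
    # Weighted sum of the inputs plus the bias, passed through the nonlinearity
    return sigmoid(np.dot(w, x) + b)

# Example with two inputs
x = np.array([0.5, -1.2])
w = np.array([0.8, 0.3])
b = 0.1
print(node_output(x, w, b))  # a value strictly between 0 and 1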

Training neural networks

Now that we have an understanding of backpropagation and how gradients are computed, you might be wondering what purpose it serves and what it has to do with training our MLP. If you recall from Chapter 1, Vector Calculus, when we covered partial derivatives, we learned that we can use partial derivatives to check the impact that changing one parameter has on the output of a function. When we use the first and second derivatives to plot our graphs, we can analytically tell where the local and global minima and maxima are. However, it isn't as straightforward as that in our case, since our model doesn't know where the optimum is or how to get there; so, instead, we use backpropagation with gradient descent as a guide to help us get to the (hopefully global) minimum.
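
As a rough illustration of the idea, here is a minimal sketch of gradient descent on a single parameter of a toy loss function; the loss, learning rate, and step count are illustrative assumptions, not an example from the book:

# Toy loss: L(w) = (w - 3)^2, which has its minimum at w = 3
def loss(w):
    return (w - 3.0) ** 2

def grad(w):
    # Analytic gradient dL/dw; in an MLP, backpropagation is what computes these gradients
    return 2.0 * (w - 3.0)

w = 0.0             # initial parameter value
learning_rate = 0.1
for step in range(100):
    w -= learning_rate * grad(w)   # move against the gradient

print(w, loss(w))   # w approaches 3 and the loss approaches 0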

In Chapter 4, Optimization, we learned about gradient descent and how we...

Deep neural networks

Now, it's time to get into the really fun stuff (and what you picked up this book for): deep neural networks. The depth comes from the number of layers in the neural network, and for an FNN to be considered deep, it must have more than 10 hidden layers. A number of today's state-of-the-art FNNs have well over 40 layers. Let's now explore some of the properties of deep FNNs and get an understanding of why they are so powerful.

If you recall, earlier on we came across the universal approximation theorem, which states that an MLP with a single hidden layer, given enough hidden units, can approximate any continuous function. But if that is the case, why do we need deep neural networks? Simply put, the capacity of a neural network increases with each hidden layer (and the brain itself has a deep structure). What this means is that deeper networks have far greater expressiveness than shallower...
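
To make the notion of depth concrete, here is a minimal sketch of a forward pass through a stack of fully connected layers; the layer sizes, random initialization, and function names are illustrative assumptions:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, layers):
    # Each layer is a (weights, bias) pair; the depth is simply the number of such layers
    for W, b in layers:
        x = sigmoid(W @ x + b)
    return x

rng = np.random.default_rng(0)
sizes = [4, 16, 16, 16, 2]   # input, three hidden layers, output
layers = [(rng.normal(size=(m, n)), np.zeros(m))
          for n, m in zip(sizes[:-1], sizes[1:])]

x = rng.normal(size=4)
print(forward(x, layers))   # a 2-dimensional output vector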

Summary

In this chapter, we first learned about a simple FNN, known as the MLP, and broke it down into its individual components to get a deeper understanding of how they work and are constructed. We then extended these concepts to further our understanding of deep neural networks. You should now have a solid understanding of how FNNs work, how various models are constructed, and how to build and possibly improve them yourself.

Let's now move on to the next chapter, where we will learn how to improve our neural networks so that they generalize better on unseen data.
