
You're reading from Applied Deep Learning and Computer Vision for Self-Driving Cars (1st Edition, Packt, August 2020, ISBN-13: 9781838646301).
Authors (2):

Sumit Ranjan

Sumit Ranjan is a silver medalist in his Bachelor of Technology (Electronics and Telecommunication) degree. He is a passionate data scientist who has worked on solving business problems to build an unparalleled customer experience across domains such as automobile, healthcare, semiconductor, cloud virtualization, and insurance. He is experienced in building applied machine learning, computer vision, and deep learning solutions to meet real-world needs. He was awarded the Autonomous Self-Driving Car Scholarship by KPIT Technologies and has worked on multiple research projects at Mercedes Benz Research and Development. Apart from work, his hobbies are traveling and exploring new places, wildlife photography, and blogging.

Dr. S. Senthamilarasu

Dr. S. Senthamilarasu was born and raised in Coimbatore, Tamil Nadu. He is a technologist, designer, speaker, storyteller, journal reviewer, educator, and researcher. He loves to learn new technologies and solve real-world problems in the IT industry. He has published various journal and research papers and has presented at several international conferences. His research areas include data mining, image processing, and neural networks. He loves reading Tamil novels and involves himself in social activities. He has also received silver medals at international exhibitions for his research products for children with autism. He currently lives in Bangalore and works closely with lead clients.

Dive Deep into Deep Neural Networks

In this chapter, you will learn about a topic that has changed the way we think about autonomous driving: Artificial Neural Networks (ANNs). You will learn how these algorithms can be used to build a self-driving car perception stack, and about the different components needed to design and train a deep neural network. You will also learn about the building blocks of feedforward neural networks, a very useful basic type of ANN. Specifically, we'll look at the hidden layers of a feedforward neural network; these hidden layers are important because they differentiate the mode of action of neural networks from that of the rest of the Machine Learning (ML) algorithms. We'll begin by looking at the mathematical definition of feedforward...

Diving deep into neural networks

Deep learning is a sub-field of ML that is based on ANNs (see Fig 2.1). It is inspired by the structure and function of the human brain and attempts to mimic it. The concept of deep learning is not new and has existed for a number of years. Its popularity and success in recent years are due to high-powered processing units, such as GPUs, and the availability of enormous amounts of data. Deep neural networks (DNNs) also perform better because they can capture the complex relationships among features in high-dimensional data:

Fig 2.1: Deep learning is a sub-field of ML

One of the great things about deep learning is that it reduces the need for human input. It replaces the costly and inefficient effort of manual feature engineering by automating most of the process of extracting features from raw data. Before, we used to extract features ourselves to make ML algorithms...

Introduction to neurons

In this section, we will discuss neurons, which are the basic building blocks of ANNs. In the following photograph, we can see real biological neurons as observed through a microscope:

Fig 2.3: A photograph of a neuron

You can find this photograph at https://commons.wikimedia.org/wiki/Neuron#/media/File:Pyramidal_hippocampal_neuron_40x.jpg.

The question now is: how can we recreate neurons in ML? We need to do so because the whole purpose of deep learning is to mimic the human brain, one of the most powerful tools on the planet. So, the first step toward creating an ANN is to recreate a neuron.

Before creating a neuron in ML, we will examine the depiction of neurons created by Spanish neuroscientist Santiago Ramon y Cajal in 1899. 

Santiago Ramon y Cajal observed two neurons that had branches at the top and many threads below (https://commons.wikimedia.org/wiki/File:PurkinjeCell.jpg).

Nowadays, we have advanced technology that...

Understanding neurons and perceptrons

As discussed in the previous section, Introduction to neurons, ANNs have a basis in biology: we mimic biological neurons with artificial neurons known as perceptrons. The perceptron is a mathematical model of a biological neuron. Later in this section, we will see how this mimicry works.

As we know, the biological neuron is a brain cell. The body of the neuron has dendrites. When an electrical signal is passed from the dendrites to the cell body of the neuron, a single output, or a single electrical signal, comes out through an axon and then connects to some other neuron, as shown in the diagram of the generic neurotransmitter system that you can find at the link provided in the Introduction to neurons section. That is the basic idea: lots of electrical input signals go through the dendrites, into the body, and then through...
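To make this concrete, here is a minimal sketch (not from the book) of a perceptron in plain Python: a weighted sum of the inputs plus a bias, passed through a step activation. The weights and bias are illustrative values rather than learned ones:

import numpy as np

def perceptron(inputs, weights, bias):
    # A single artificial neuron: weighted sum of inputs plus bias,
    # passed through a step (threshold) activation.
    weighted_sum = np.dot(inputs, weights) + bias
    return 1 if weighted_sum > 0 else 0

# Example: a perceptron acting as a logical AND gate
# (the weights and bias are illustrative, not learned)
weights = np.array([0.5, 0.5])
bias = -0.7
print(perceptron(np.array([1, 1]), weights, bias))  # prints 1
print(perceptron(np.array([1, 0]), weights, bias))  # prints 0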

The workings of ANNs

We have seen how a single neuron, or perceptron, works; now, let's expand this concept to deep learning. The following diagram shows what multiple perceptrons look like:

Fig 2.12: Multiple perceptrons

In the preceding diagram, we can see various layers of single perceptrons connected to each other through their inputs and outputs. The input layer is violet, the hidden layers are blue and green, and the output layer of the network is represented in red.

The input layer takes in real values from the data, so it takes actual data as its input. The next layers are the hidden layers, which sit between the input and output layers. If three or more hidden layers are present, the network is considered a deep neural network. The final layer is the output layer, which produces the final estimate of whatever quantity we are trying to predict. As we progress through more layers, the level of...
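As a minimal sketch of such a network in Keras (the book's framework of choice), the following builds a small feedforward model with an input layer, two hidden layers, and an output layer; the layer sizes and the assumption of four input features are illustrative only:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# A small feedforward (fully connected) network:
# input layer, two hidden layers, and one output layer.
model = Sequential([
    Dense(16, activation='relu', input_shape=(4,)),  # hidden layer 1 (input has 4 features)
    Dense(8, activation='relu'),                     # hidden layer 2
    Dense(1, activation='sigmoid'),                  # output layer (e.g. a binary prediction)
])
model.summary()  # prints the layer-by-layer architecture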

Understanding activation functions

Activation functions are important to neural networks because they introduce non-linearity into the network. Deep learning consists of multiple non-linear transformations, and activation functions are the tools that provide this non-linearity. An activation function is applied to a layer's output before the signal is passed to the next layer of the network. It is thanks to activation functions that a neural network has the power to learn complex features.

Deep learning uses many activation functions, including the following:

  • The threshold function
  • The sigmoid function
  • The rectifier function
  • The hyperbolic tangent function

Later in this chapter, we will also look at the cost function, which is used to evaluate the network rather than to activate its neurons.

In the next section, we will start with one of the most important activation functions, called the threshold activation function.

The threshold function

The threshold function can be seen in the following diagram:

Fig 2.13: The threshold function

On the horizontal axis, we have the weighted sum of the inputs, and on the vertical axis, we have the output of the threshold function, which goes from 0 to 1. The threshold function is very simple: if the weighted sum is less than 0, the output is 0, and if it is greater than 0, the output is 1. This works as a yes-or-no function.

The sigmoid function

The sigmoid function is a very interesting type of function; we can see it in the following diagram:

Fig 2.14: The sigmoid function

The sigmoid function is nothing but the logistic function. It maps any real-valued input smoothly into the range (0, 1): large negative inputs produce outputs close to 0, and large positive inputs produce outputs close to 1. This function is often used in the output layer, especially when you're trying to predict probabilities.

The rectifier linear function

The Rectified Linear Unit (ReLU) function is one of the most popular activation functions in the field of ANNs. If the input is less than or equal to 0, the output is set to 0; for positive inputs, the output increases linearly with the input. We can observe this in the following diagram:

Fig 2.15: The rectifier function

In the next section, we will learn about the hyperbolic tangent activation function.

The hyperbolic tangent activation function

Finally, we have another function, called the Hyperbolic Tangent Activation (tanh) function, which looks as follows:

Fig 2.16: Hyperbolic tangent

The tanh function is very similar to the sigmoid function; the range of a tanh function is (-1, 1). Tanh functions are also S-shaped, like sigmoid functions. The advantage of the tanh function is that positive inputs are mapped to strongly positive outputs, negative inputs to strongly negative outputs, and 0 is mapped to 0, as shown in Fig 2.16.

For more information about the performance of the hyperbolic function (tanh), refer to http://proceedings.mlr.press/v15/glorot11a/glorot11a.pdf.
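The four activation functions discussed above can be written in a few lines of NumPy; this is a minimal illustrative sketch using their standard definitions:

import numpy as np

def threshold(z):
    return np.where(z >= 0, 1.0, 0.0)   # yes-or-no output

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))     # squashes z into (0, 1)

def relu(z):
    return np.maximum(0.0, z)           # 0 for z <= 0, z otherwise

def tanh(z):
    return np.tanh(z)                   # squashes z into (-1, 1)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
for fn in (threshold, sigmoid, relu, tanh):
    print(fn.__name__, fn(z))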

In the next section of this chapter, we will learn about the cost function.

The cost function of neural networks

We will now explore how we can evaluate the performance of a neural network by using the cost function. We will use it to measure how far we are from the expected value. We are going to use the following notation and variables:

  • Variable Y to represent the true value
  • Variable a to represent the neuron prediction

In terms of weights and biases, the formula is as follows:

z = w · x + b
a = σ(z)

We pass z, which is the input (x) multiplied by the weight (w) plus the bias (b), into the activation function σ to obtain the neuron's prediction, a.

There are many types of cost functions, but we are just going to discuss two of them:

  • The quadratic cost function
  • The cross-entropy function

The first cost function we are going to discuss is the quadratic cost function. In its standard form, it is represented with the following formula:

C = Σ (Y − a)² / 2n

Here, the sum runs over the n training examples. In the preceding formula, we can see that when the error is high, which means the difference between the actual value (Y) and the predicted value (a) is large, the value of the cost function...
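As a minimal sketch (not from the book), the two cost functions can be computed with NumPy as follows; the example labels and predictions are illustrative:

import numpy as np

def quadratic_cost(y, a):
    # Quadratic (mean squared error) cost between true values y and predictions a
    return np.mean((y - a) ** 2) / 2.0

def cross_entropy_cost(y, a):
    # Binary cross-entropy; predictions a must lie strictly between 0 and 1
    return -np.mean(y * np.log(a) + (1 - y) * np.log(1 - a))

y = np.array([1.0, 0.0, 1.0])   # true labels
a = np.array([0.9, 0.2, 0.7])   # neuron predictions
print(quadratic_cost(y, a))      # small value: predictions are close to the labels
print(cross_entropy_cost(y, a))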

Optimizers

Optimizers define how a neural network learns: they determine how the values of the parameters are updated during training so that the loss function is driven to its lowest value.

Gradient descent is an optimization algorithm for finding the minimum of a function; in our case, the minimum value of the cost function. This is useful to us because we want to minimize the cost function. So, to find a local minimum, we take steps proportional to the negative of the gradient.

Let's go through a very simple example in one dimension, shown in the following plot:

Fig 2.17: Gradient descent

On the vertical axis, we have the cost (the output of the cost function), and on the horizontal axis, we have the value of the particular weight we are trying to choose (initialized to a random value). The weight that minimizes the cost function sits at the bottom of the parabola, and our goal is to drive the cost function down to that minimum value. Finding the minimum is really...
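A minimal one-dimensional sketch of this procedure (not from the book) is shown below; the parabolic cost C(w) = (w − 3)², the starting weight, and the learning rate are illustrative assumptions:

# Minimal sketch of gradient descent in one dimension.
# Assumed cost: C(w) = (w - 3)^2, whose minimum is at w = 3.
def cost(w):
    return (w - 3.0) ** 2

def gradient(w):
    return 2.0 * (w - 3.0)   # derivative of the cost with respect to w

w = 10.0                 # random starting weight
learning_rate = 0.1
for step in range(50):
    w -= learning_rate * gradient(w)   # step in the direction of the negative gradient

print(w, cost(w))        # w is now very close to 3, the bottom of the parabola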

Understanding hyperparameters

Hyperparameters serve a similar purpose to the various tone knobs on a guitar that are used to get the best sound. They are settings that you can tune to control the behavior of an ML algorithm.

A vital aspect of any deep learning solution is the selection of hyperparameters. Most deep learning models have specific hyperparameters that control various aspects of the model, including memory usage and execution cost. However, it is possible to define additional hyperparameters to help an algorithm adapt to a specific scenario or problem statement. To get the maximum performance out of a particular model, data science practitioners typically spend a lot of time tuning hyperparameters, as they play such an important role in deep learning model development.

Hyperparameters can be broadly classified into two categories:

  • Model training-specific hyperparameters
  • Network architecture-specific hyperparameters

In the following sections, we will cover model training-specific hyperparameters...

Model training-specific hyperparameters

Model training-specific hyperparameters play an important role in model training. These are hyperparameters that live outside the model but have a direct influence on it. We will discuss the following hyperparameters:

  • Learning rate
  • Batch size
  • Number of epochs

Let's start with the learning rate.

Learning rate

The learning rate is the mother of all hyperparameters: it determines the size of the weight updates the optimizer makes at each step and therefore how quickly the model learns.

A learning rate that is too low increases the training time of the model, as it takes longer to incrementally change the weights of the network to reach an optimal state. On the other hand, although a large learning rate helps the model adjust to the data quickly, it can cause the model to overshoot the minima. A good starting value for the learning rate for most models is 0.001. In the following diagram, you can see that a low learning rate requires many updates before reaching the minimum point:

Fig 2.18: A low learning rate

However, an optimal learning rate swiftly reaches the minimum point; it requires fewer updates to get near the minimum. Here, we can see a diagram with a decent learning rate:

Fig 2.19: Decent learning rate

A high learning rate causes drastic updates that lead...

Batch size

Another non-trivial hyperparameter that has a huge influence on the training accuracy, time, and resource requirements is batch size. Basically, batch size determines the number of data points that are sent to the ML algorithm in a single iteration during training.

Although a very large batch size gives a significant computational speedup, in practice it has been observed that it leads to a significant degradation in the quality of the model, as measured by its ability to generalize. A larger batch size also requires more memory during the training process.

A smaller batch size almost always yields a better model than a larger one, although it increases the training time. This can be attributed to the fact that smaller batch sizes introduce more noise into the gradient estimations, which helps training converge to flat minimizers.

In general, if the...

Number of epochs

The number of epochs is the number of cycles for which a model is trained. One epoch is one pass of the whole dataset forward and backward through the neural network. The number of epochs is therefore an easy way to track how many training cycles have run while monitoring whether the training or validation error continues to improve. Since the whole dataset is too large to feed to the machine at once, each epoch is divided into many smaller batches.

One technique for choosing when to stop is to use the early stopping Keras callback, which stops the training process if the training/validation error has not improved in the past 10 to 20 epochs.
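The following is a minimal, illustrative sketch of where these training hyperparameters appear in a Keras workflow; the tiny synthetic dataset, the network, and the specific values (a learning rate of 0.001, a batch size of 32, a patience of 15 epochs) are assumptions for demonstration only:

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping

# Tiny synthetic dataset (illustrative only)
X_train = np.random.rand(1000, 4)
y_train = (X_train.sum(axis=1) > 2.0).astype(int)

model = Sequential([Dense(16, activation='relu', input_shape=(4,)),
                    Dense(1, activation='sigmoid')])

model.compile(optimizer=Adam(learning_rate=0.001),    # learning rate
              loss='binary_crossentropy', metrics=['accuracy'])

early_stop = EarlyStopping(monitor='val_loss', patience=15,   # stop if validation loss
                           restore_best_weights=True)         # stalls for 15 epochs

model.fit(X_train, y_train,
          batch_size=32,          # batch size: samples per gradient update
          epochs=100,             # upper bound on the number of training cycles
          validation_split=0.2,
          callbacks=[early_stop])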

Network architecture-specific hyperparameters 

The hyperparameters that directly deal with the architecture of the deep learning model are called network architecture-specific hyperparameters. The different types of network-specific hyperparameters are as follows:

  • Number of hidden layers
  • Regularization
  • Activation functions as hyperparameters

In the following section, we will see how network architecture-specific hyperparameters work.

Number of hidden layers 

It is easy for a model to learn simple features with a smaller number of hidden layers. However, as the features become more complex or the non-linearity increases, the model requires more and more layers and units.

Having a small network for a complex task would result in a model that performs poorly as it wouldn't have the required learning capacity. Having a slightly larger number of units than the optimal number is not a problem; however, a much larger number will lead to the model overfitting. This means that the model will try to memorize the dataset and perform well on the training dataset, but will fail to perform well on the test data. So, we can play with the number of hidden layers and validate the accuracy of the network.
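As a rough illustration of this tuning process, the following sketch (not from the book) builds the same network with different numbers of hidden layers and compares validation accuracy; the synthetic data, layer width, and training settings are illustrative assumptions:

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Illustrative synthetic data: 8 input features, a mildly non-linear target
X = np.random.rand(1000, 8)
y = (np.sin(3 * X.sum(axis=1)) > 0).astype(int)

def build_model(num_hidden_layers, units=16):
    # Stack num_hidden_layers hidden layers of `units` neurons each
    model = Sequential()
    model.add(Dense(units, activation='relu', input_shape=(8,)))
    for _ in range(num_hidden_layers - 1):
        model.add(Dense(units, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(optimizer='adam', loss='binary_crossentropy',
                  metrics=['accuracy'])
    return model

# Try shallow and deeper networks and compare validation accuracy
for depth in (1, 2, 4):
    history = build_model(depth).fit(X, y, epochs=20, batch_size=32,
                                     validation_split=0.2, verbose=0)
    print(depth, 'hidden layer(s) -> val accuracy:',
          round(history.history['val_accuracy'][-1], 3))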

Regularization

Regularization is a technique that makes slight modifications to the learning algorithm so that the model generalizes better; the strength of the regularization is a hyperparameter. This improves the performance of the model on unseen data.

In ML, regularization penalizes the coefficients. In deep learning, regularization penalizes the weight matrices of the nodes.

We are going to discuss two types of regularization, as follows:

  • L1 and L2 regularization
  • Dropout

We will start with L1 and L2 regularization.

L1 and L2 regularization

The most common types of regularization are L1 and L2. We change the overall cost function by adding a regularization term. The addition of this term pushes the values of the weight matrices down, based on the assumption that a neural network with smaller weight matrices is a simpler model.

Regularization differs between L1 and L2. The formula for L1 regularization, in its simplest form, is as follows:

cost = loss + λ Σ |w|

In the preceding formula, the regularization strength is represented by lambda (λ). Here, we penalize the absolute values of the weights.

The formula for L2 regularization, in its simplest form, is as follows:

cost = loss + λ Σ w²

In the preceding formula, the L2 regularization strength is again represented by lambda (λ). L2 regularization is also called weight decay, as it forces the weights to decay toward 0.
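In Keras, L1 and L2 penalties can be attached to individual layers through kernel_regularizer; the following is a minimal sketch, with λ = 0.01 and the layer sizes chosen purely for illustration:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras import regularizers

model = Sequential([
    Dense(64, activation='relu', input_shape=(20,),
          kernel_regularizer=regularizers.l2(0.01)),   # L2: penalizes squared weights
    Dense(64, activation='relu',
          kernel_regularizer=regularizers.l1(0.01)),   # L1: penalizes absolute weights
    Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy')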

Dropout

Dropout is a regularization technique that is used to improve the generalizing power of a network and prevent it from overfitting. Generally, a dropout value of 0.2 to 0.5 is used, with 0.2 being a good starting point. In general, we have to select multiple values and check the performance of the model.

A dropout value that is too low has a negligible effect. However, a value that is too high causes the network to under-learn the features during model training. If dropout is used on a larger, wider network, you are likely to get better performance, since the model has a greater opportunity to learn independent representations.

An example of dropout can be seen as follows, showing how we are going to drop a few of the neurons from the network:

Fig 2.21: Dropout  
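In Keras, dropout is added as its own layer between the layers whose units it drops; the following is a minimal sketch using the 0.2 starting value mentioned above (the layer sizes are illustrative):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

model = Sequential([
    Dense(128, activation='relu', input_shape=(20,)),
    Dropout(0.2),   # randomly drops 20% of this layer's units at each update
    Dense(64, activation='relu'),
    Dropout(0.2),
    Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy')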

In the next section, we will learn about activation functions as hyperparameters. 

Activation functions as hyperparameters

Activation functions, sometimes known as transfer functions, are used to enable the model to learn non-linear prediction boundaries. Different activation functions behave differently and are chosen carefully based on the deep learning task at hand. We have already discussed the different types of activation functions in an earlier section of this chapter, Understanding activation functions.

In the next section, we will learn about the popular deep learning APIs—TensorFlow and Keras. 

TensorFlow versus Keras

Primarily, there are two levels of abstraction for deep learning frameworks:

  • Firstly, there is the lower level, where frameworks such as TensorFlow, Theano, and PyTorch sit. It is at this level where neural network elements such as convolutions and other generalized matrix operations are carried out.
  • Then, there is a higher level, where frameworks such as Keras are present. Here, primitives from the lower levels are utilized to create neural network layers and models. User-friendly APIs for training and saving models are also implemented here.

Since they sit at different levels of abstraction, Keras and TensorFlow cannot be compared directly. TensorFlow is not a dedicated deep learning library; while it is widely used for deep learning, it also serves a wide array of other applications. Keras, however, is a library developed from the ground up specifically for deep learning. It has very well-designed APIs...
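To illustrate the difference in abstraction levels, the following sketch (not from the book) computes the same dense transformation once with low-level TensorFlow operations and once with a high-level Keras layer; the tensor shapes are illustrative:

import tensorflow as tf
from tensorflow.keras.layers import Dense

# Lower level: TensorFlow primitives (generalized matrix operations)
x = tf.random.normal((1, 4))             # one sample with 4 features
w = tf.random.normal((4, 8))
b = tf.zeros((8,))
low_level_output = tf.nn.relu(tf.matmul(x, w) + b)

# Higher level: Keras wraps the same computation in a reusable layer
dense = Dense(8, activation='relu')
high_level_output = dense(x)              # weights are created and managed for us

print(low_level_output.shape, high_level_output.shape)   # both (1, 8)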

Summary

In this chapter, we learned how to convert biological neurons into artificial neurons, how ANNs work, and about various hyperparameters. We also gave an overview of two deep learning APIs: TensorFlow and Keras. This chapter has provided a foundation for deep learning. Now, you are ready to start implementing a deep learning model, which is the next step toward designing deep learning models for autonomous cars.

In the next chapter, we are going to implement a deep learning model using Keras.
