
You're reading from Applied Deep Learning and Computer Vision for Self-Driving Cars (1st Edition, Packt, August 2020, ISBN-13: 9781838646301).
Authors (2):

Sumit Ranjan

Sumit Ranjan is a silver medalist in his Bachelor of Technology (Electronics and Telecommunication) degree. He is a passionate data scientist who has worked on solving business problems to build an unparalleled customer experience across domains such as automobile, healthcare, semiconductor, cloud virtualization, and insurance. He is experienced in building applied machine learning, computer vision, and deep learning solutions to meet real-world needs. He was awarded the Autonomous Self-Driving Car Scholarship by KPIT Technologies and has worked on multiple research projects at Mercedes Benz Research and Development. Apart from work, his hobbies are traveling and exploring new places, wildlife photography, and blogging.

Dr. S. Senthamilarasu

Dr. S. Senthamilarasu was born and raised in Coimbatore, Tamil Nadu. He is a technologist, designer, speaker, storyteller, journal reviewer, educator, and researcher. He loves to learn new technologies and solve real-world problems in the IT industry. He has published various journal and research papers and has presented at several international conferences. His research areas include data mining, image processing, and neural networks. He loves reading Tamil novels and involves himself in social activities. He has also received silver medals at international exhibitions for his research products for children with autism. He currently lives in Bangalore and works closely with lead clients.

Dive Deep into Deep Neural Networks

In this chapter, you will learn about a topic that has changed the way we think about autonomous driving: Artificial Neural Networks (ANNs). You will learn how these algorithms can be used to build a self-driving car perception stack, and about the different components needed to design and train a deep neural network. You will also learn about the building blocks of feedforward neural networks, a very useful basic type of ANN. Specifically, we'll look at the hidden layers of a feedforward neural network; these hidden layers are important because they differentiate the mode of action of neural networks from that of the rest of the Machine Learning (ML) algorithms. We'll begin by looking at the mathematical definition of feedforward...

Diving deep into neural networks

Deep learning is a sub-field of ML that is based on ANNs (see Fig 2.1). It is inspired by the structure and function of the human brain and attempts to mimic it. The concept of deep learning is not new and has existed for a number of years. Its popularity and success in recent years are due to high-powered processing units, such as GPUs, and the availability of enormous amounts of data. Deep neural networks (DNNs) also perform better because they can capture the complex relationships among features in high-dimensional data:

Fig 2.1: Deep learning is a sub-field of ML

One of the great things about deep learning is that it reduces the need for human input. It replaces the costly and inefficient effort of manual feature engineering by automating most of the process of extracting features from raw data. Before, we used to extract features ourselves to make ML algorithms...

Introduction to neurons

In this section, we will discuss neurons, which are the basic building blocks of ANNs. In the following photograph, we can see real biological neurons as observed through a microscope:

Fig 2.3: A photograph of a neuron

You can find this photograph at https://commons.wikimedia.org/wiki/Neuron#/media/File:Pyramidal_hippocampal_neuron_40x.jpg.

The question now is: how can we recreate neurons in ML? We need to do so because the whole purpose of deep learning is to mimic the human brain, one of the most powerful tools on the planet. So, the first step toward creating an ANN is to recreate a neuron.

Before creating a neuron in ML, we will examine the depiction of neurons created by Spanish neuroscientist Santiago Ramon y Cajal in 1899. 

Santiago Ramon y Cajal observed two neurons that had branches at the top and many threads below (https://commons.wikimedia.org/wiki/File:PurkinjeCell.jpg).

Nowadays, we have advanced technology that...

Understanding neurons and perceptrons

As discussed in the previous section, Introduction to neurons, ANNs have a basis in biology: we mimic biological neurons with artificial neurons known as perceptrons. The perceptron is a mathematical model of a biological neuron. Later in this section, we will see how this mimicry works.

As we know, the biological neuron is a brain cell. The body of the neuron has dendrites. When an electrical signal is passed from the dendrites to the cell body of the neuron, a single output, or a single electrical signal, comes out through an axon and then connects to some other neuron, as shown in the diagram of the generic neurotransmitter system that you can find at the link provided in the Introduction to neurons section. That is the basic idea: lots of electrical input signals go through the dendrites, into the body, and then through...
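To make this concrete, here is a minimal sketch (not from the book) of a perceptron in plain Python: a weighted sum of the inputs plus a bias, passed through a step activation. The weights and bias are illustrative values rather than learned ones:

import numpy as np

def perceptron(inputs, weights, bias):
    # A single artificial neuron: weighted sum of inputs plus bias,
    # passed through a step (threshold) activation.
    weighted_sum = np.dot(inputs, weights) + bias
    return 1 if weighted_sum > 0 else 0

# Example: a perceptron acting as a logical AND gate
# (the weights and bias are illustrative, not learned)
weights = np.array([0.5, 0.5])
bias = -0.7
print(perceptron(np.array([1, 1]), weights, bias))  # prints 1
print(perceptron(np.array([1, 0]), weights, bias))  # prints 0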

The workings of ANNs

We have seen how a single neuron, or perceptron, works; now, let's expand this concept to deep learning. The following diagram shows what multiple perceptrons look like:

Fig 2.12: Multiple perceptrons

In the preceding diagram, we can see various layers of single perceptrons connected to each other through their inputs and outputs. The input layer is violet, the hidden layers are blue and green, and the output layer of the network is represented in red.

The input layer takes in real values from the data, so it takes actual data as its input. The next layers are the hidden layers, which sit between the input and output layers. If three or more hidden layers are present, the network is considered a deep neural network. The final layer is the output layer, which produces the final estimate of whatever quantity we are trying to predict. As we progress through more layers, the level of...
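As a minimal sketch of such a network in Keras (the book's framework of choice), the following builds a small feedforward model with an input layer, two hidden layers, and an output layer; the layer sizes and the assumption of four input features are illustrative only:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# A small feedforward (fully connected) network:
# input layer, two hidden layers, and one output layer.
model = Sequential([
    Dense(16, activation='relu', input_shape=(4,)),  # hidden layer 1 (input has 4 features)
    Dense(8, activation='relu'),                     # hidden layer 2
    Dense(1, activation='sigmoid'),                  # output layer (e.g. a binary prediction)
])
model.summary()  # prints the layer-by-layer architecture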

Understanding activation functions

Activation functions are important to neural networks because they introduce non-linearity into the network. Deep learning consists of multiple non-linear transformations, and activation functions are the tools that provide this non-linearity. An activation function is applied to a layer's output before the signal is passed to the next layer of the network. It is thanks to activation functions that a neural network has the power to learn complex features.

Deep learning uses many activation functions, including the following:

  • The threshold function
  • The sigmoid function
  • The rectifier function
  • The hyperbolic tangent function

Later in this chapter, we will also look at the cost function, which is used to evaluate the network rather than to activate its neurons.

In the next section, we will start with one of the most important activation functions, called the threshold activation function.

The threshold function

The threshold function can be seen in the following diagram:

Fig 2.13: The threshold function

On the horizontal axis, we have the weighted sum of the inputs, and on the vertical axis, we have the output of the threshold function, which goes from 0 to 1. The threshold function is very simple: if the weighted sum is less than 0, the output is 0, and if it is greater than 0, the output is 1. This works as a yes-or-no function.

The sigmoid function

The sigmoid function is a very interesting type of function; we can see it in the following diagram:

Fig 2.14: The sigmoid function

The sigmoid function is nothing but the logistic function. It maps any real-valued input smoothly into the range (0, 1): large negative inputs produce outputs close to 0, and large positive inputs produce outputs close to 1. This function is often used in the output layer, especially when you're trying to predict probabilities.

The rectifier linear function

The Rectified Linear Unit (ReLU) function is one of the most popular activation functions in the field of ANNs. If the input is less than or equal to 0, the output is set to 0; for positive inputs, the output increases linearly with the input. We can observe this in the following diagram:

Fig 2.15: The rectifier function

In the next section, we will learn about the hyperbolic tangent activation function.

The hyperbolic tangent activation function

Finally, we have another function, called the Hyperbolic Tangent Activation (tanh) function, which looks as follows:

Fig 2.16: Hyperbolic tangent

The tanh function is very similar to the sigmoid function; the range of a tanh function is (-1, 1). Tanh functions are also S-shaped, like sigmoid functions. The advantage of the tanh function is that positive inputs are mapped to strongly positive outputs, negative inputs to strongly negative outputs, and 0 is mapped to 0, as shown in Fig 2.16.

For more information about the performance of the hyperbolic function (tanh), refer to http://proceedings.mlr.press/v15/glorot11a/glorot11a.pdf.
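The four activation functions discussed above can be written in a few lines of NumPy; this is a minimal illustrative sketch using their standard definitions:

import numpy as np

def threshold(z):
    return np.where(z >= 0, 1.0, 0.0)   # yes-or-no output

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))     # squashes z into (0, 1)

def relu(z):
    return np.maximum(0.0, z)           # 0 for z <= 0, z otherwise

def tanh(z):
    return np.tanh(z)                   # squashes z into (-1, 1)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
for fn in (threshold, sigmoid, relu, tanh):
    print(fn.__name__, fn(z))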

In the next section of this chapter, we will learn about the cost function.

The cost function of neural networks

We will now explore how we can evaluate the performance of a neural network by using the cost function. We will use it to measure how far we are from the expected value. We are going to use the following notation and variables:

  • Variable Y to represent the true value
  • Variable a to represent the neuron prediction

In terms of weights and biases, the formula is as follows:

z = w · x + b
a = σ(z)

We pass z, which is the input (x) multiplied by the weight (w) plus the bias (b), into the activation function σ to obtain the neuron's prediction, a.

There are many types of cost functions, but we are just going to discuss two of them:

  • The quadratic cost function
  • The cross-entropy function

The first cost function we are going to discuss is the quadratic cost function. In its standard form, it is represented with the following formula:

C = Σ (Y − a)² / 2n

Here, the sum runs over the n training examples. In the preceding formula, we can see that when the error is high, which means the difference between the actual value (Y) and the predicted value (a) is large, the value of the cost function...
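As a minimal sketch (not from the book), the two cost functions can be computed with NumPy as follows; the example labels and predictions are illustrative:

import numpy as np

def quadratic_cost(y, a):
    # Quadratic (mean squared error) cost between true values y and predictions a
    return np.mean((y - a) ** 2) / 2.0

def cross_entropy_cost(y, a):
    # Binary cross-entropy; predictions a must lie strictly between 0 and 1
    return -np.mean(y * np.log(a) + (1 - y) * np.log(1 - a))

y = np.array([1.0, 0.0, 1.0])   # true labels
a = np.array([0.9, 0.2, 0.7])   # neuron predictions
print(quadratic_cost(y, a))      # small value: predictions are close to the labels
print(cross_entropy_cost(y, a))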

Optimizers

Optimizers define how a neural network learns: they determine how the values of the parameters are updated during training so that the loss function is driven to its lowest value.

Gradient descent is an optimization algorithm for finding the minimum of a function; in our case, the minimum value of the cost function. This is useful to us because we want to minimize the cost function. So, to find a local minimum, we take steps proportional to the negative of the gradient.

Let's go through a very simple example in one dimension, shown in the following plot:

Fig 2.17: Gradient descent

On the vertical axis, we have the cost (the output of the cost function), and on the horizontal axis, we have the value of the particular weight we are trying to choose (initialized to a random value). The weight that minimizes the cost function sits at the bottom of the parabola, and our goal is to drive the cost function down to that minimum value. Finding the minimum is really...
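A minimal one-dimensional sketch of this procedure (not from the book) is shown below; the parabolic cost C(w) = (w − 3)², the starting weight, and the learning rate are illustrative assumptions:

# Minimal sketch of gradient descent in one dimension.
# Assumed cost: C(w) = (w - 3)^2, whose minimum is at w = 3.
def cost(w):
    return (w - 3.0) ** 2

def gradient(w):
    return 2.0 * (w - 3.0)   # derivative of the cost with respect to w

w = 10.0                 # random starting weight
learning_rate = 0.1
for step in range(50):
    w -= learning_rate * gradient(w)   # step in the direction of the negative gradient

print(w, cost(w))        # w is now very close to 3, the bottom of the parabola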

Understanding hyperparameters

Hyperparameters serve a similar purpose to the various tone knobs on a guitar that are used to get the best sound. They are settings that you can tune to control the behavior of an ML algorithm.

A vital aspect of any deep learning solution is the selection of hyperparameters. Most deep learning models have specific hyperparameters that control various aspects of the model, including memory usage and execution cost. However, it is possible to define additional hyperparameters to help an algorithm adapt to a specific scenario or problem statement. To get the maximum performance out of a particular model, data science practitioners typically spend a lot of time tuning hyperparameters, as they play such an important role in deep learning model development.

Hyperparameters can be broadly classified into two categories:

  • Model training-specific hyperparameters
  • Network architecture-specific hyperparameters

In the following sections, we will cover model training-specific hyperparameters...

Model training-specific hyperparameters

Model training-specific hyperparameters play an important role in model training. These are hyperparameters that live outside the model but have a direct influence on it. We will discuss the following hyperparameters:

  • Learning rate
  • Batch size
  • Number of epochs

Let's start with the learning rate.

Learning rate

The learning rate is the mother of all hyperparameters: it determines the size of the weight updates the optimizer makes at each step and therefore how quickly the model learns.

A learning rate that is too low increases the training time of the model, as it takes longer to incrementally change the weights of the network to reach an optimal state. On the other hand, although a large learning rate helps the model adjust to the data quickly, it can cause the model to overshoot the minima. A good starting value for the learning rate for most models is 0.001. In the following diagram, you can see that a low learning rate requires many updates before reaching the minimum point:

Fig 2.18: A low learning rate

However, an optimal learning rate swiftly reaches the minimum point; it requires fewer updates to get near the minimum. Here, we can see a diagram with a decent learning rate:

Fig 2.19: Decent learning rate

A high learning rate causes drastic updates that lead...

Batch size

Another non-trivial hyperparameter that has a huge influence on the training accuracy, time, and resource requirements is batch size. Basically, batch size determines the number of data points that are sent to the ML algorithm in a single iteration during training.

Although a very large batch size gives a significant computational speedup, in practice it has been observed that it leads to a significant degradation in the quality of the model, as measured by its ability to generalize. A larger batch size also requires more memory during the training process.

A smaller batch size almost always yields a better model than a larger one, although it increases the training time. This can be attributed to the fact that smaller batch sizes introduce more noise into the gradient estimations, which helps training converge to flat minimizers.

In general, if the...

Number of epochs

The number of epochs is the number of cycles for which a model is trained. One epoch is one pass of the whole dataset forward and backward through the neural network. The number of epochs is therefore an easy way to track how many training cycles have run while monitoring whether the training or validation error continues to improve. Since the whole dataset is too large to feed to the machine at once, each epoch is divided into many smaller batches.

One technique for choosing when to stop is to use the early stopping Keras callback, which stops the training process if the training/validation error has not improved in the past 10 to 20 epochs.
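The following is a minimal, illustrative sketch of where these training hyperparameters appear in a Keras workflow; the tiny synthetic dataset, the network, and the specific values (a learning rate of 0.001, a batch size of 32, a patience of 15 epochs) are assumptions for demonstration only:

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping

# Tiny synthetic dataset (illustrative only)
X_train = np.random.rand(1000, 4)
y_train = (X_train.sum(axis=1) > 2.0).astype(int)

model = Sequential([Dense(16, activation='relu', input_shape=(4,)),
                    Dense(1, activation='sigmoid')])

model.compile(optimizer=Adam(learning_rate=0.001),    # learning rate
              loss='binary_crossentropy', metrics=['accuracy'])

early_stop = EarlyStopping(monitor='val_loss', patience=15,   # stop if validation loss
                           restore_best_weights=True)         # stalls for 15 epochs

model.fit(X_train, y_train,
          batch_size=32,          # batch size: samples per gradient update
          epochs=100,             # upper bound on the number of training cycles
          validation_split=0.2,
          callbacks=[early_stop])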

Network architecture-specific hyperparameters 

The hyperparameters that directly deal with the architecture of the deep learning model are called network architecture-specific hyperparameters. The different types of network-specific hyperparameters are as follows:

  • Number of hidden layers
  • Regularization
  • Activation functions as hyperparameters

In the following section, we will see how network architecture-specific hyperparameters work.

Number of hidden layers 

It is easy for a model to learn simple features with a smaller number of hidden layers. However, as the features become more complex or the non-linearity increases, the model requires more and more layers and units.

Having a small network for a complex task would result in a model that performs poorly as it wouldn't have the required learning capacity. Having a slightly larger number of units than the optimal number is not a problem; however, a much larger number will lead to the model overfitting. This means that the model will try to memorize the dataset and perform well on the training dataset, but will fail to perform well on the test data. So, we can play with the number of hidden layers and validate the accuracy of the network.
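As a rough illustration of this tuning process, the following sketch (not from the book) builds the same network with different numbers of hidden layers and compares validation accuracy; the synthetic data, layer width, and training settings are illustrative assumptions:

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Illustrative synthetic data: 8 input features, a mildly non-linear target
X = np.random.rand(1000, 8)
y = (np.sin(3 * X.sum(axis=1)) > 0).astype(int)

def build_model(num_hidden_layers, units=16):
    # Stack num_hidden_layers hidden layers of `units` neurons each
    model = Sequential()
    model.add(Dense(units, activation='relu', input_shape=(8,)))
    for _ in range(num_hidden_layers - 1):
        model.add(Dense(units, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(optimizer='adam', loss='binary_crossentropy',
                  metrics=['accuracy'])
    return model

# Try shallow and deeper networks and compare validation accuracy
for depth in (1, 2, 4):
    history = build_model(depth).fit(X, y, epochs=20, batch_size=32,
                                     validation_split=0.2, verbose=0)
    print(depth, 'hidden layer(s) -> val accuracy:',
          round(history.history['val_accuracy'][-1], 3))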

Regularization

Regularization is a technique that makes slight modifications to the learning algorithm so that the model generalizes better; the strength of the regularization is a hyperparameter. This improves the performance of the model on unseen data.

In ML, regularization penalizes the coefficients. In deep learning, regularization penalizes the weight matrices of the nodes.

We are going to discuss two types of regularization, as follows:

  • L1 and L2 regularization
  • Dropout

We will start with L1 and L2 regularization.

L1 and L2 regularization

The most common types of regularization are L1 and L2. We change the overall cost function by adding a regularization term. The addition of this term pushes the values of the weight matrices down, based on the assumption that a neural network with smaller weight matrices is a simpler model.

Regularization differs between L1 and L2. The formula for L1 regularization, in its simplest form, is as follows:

cost = loss + λ Σ |w|

In the preceding formula, the regularization strength is represented by lambda (λ). Here, we penalize the absolute values of the weights.

The formula for L2 regularization, in its simplest form, is as follows:

cost = loss + λ Σ w²

In the preceding formula, the L2 regularization strength is again represented by lambda (λ). L2 regularization is also called weight decay, as it forces the weights to decay toward 0.
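In Keras, L1 and L2 penalties can be attached to individual layers through kernel_regularizer; the following is a minimal sketch, with λ = 0.01 and the layer sizes chosen purely for illustration:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras import regularizers

model = Sequential([
    Dense(64, activation='relu', input_shape=(20,),
          kernel_regularizer=regularizers.l2(0.01)),   # L2: penalizes squared weights
    Dense(64, activation='relu',
          kernel_regularizer=regularizers.l1(0.01)),   # L1: penalizes absolute weights
    Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy')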

Dropout

Dropout is a regularization technique that is used to improve the generalizing power of a network and prevent it from overfitting. Generally, a dropout value of 0.2 to 0.5 is used, with 0.2 being a good starting point. In general, we have to select multiple values and check the performance of the model.

A dropout value that is too low has a negligible effect. However, a value that is too high causes the network to under-learn the features during model training. If dropout is used on a larger, wider network, you are likely to get better performance, since the model has a greater opportunity to learn independent representations.

An example of dropout can be seen as follows, showing how we are going to drop a few of the neurons from the network:

Fig 2.21: Dropout  
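In Keras, dropout is added as its own layer between the layers whose units it drops; the following is a minimal sketch using the 0.2 starting value mentioned above (the layer sizes are illustrative):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

model = Sequential([
    Dense(128, activation='relu', input_shape=(20,)),
    Dropout(0.2),   # randomly drops 20% of this layer's units at each update
    Dense(64, activation='relu'),
    Dropout(0.2),
    Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy')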

In the next section, we will learn about activation functions as hyperparameters. 

Activation functions as hyperparameters

Activation functions, sometimes known as transfer functions, are used to enable the model to learn non-linear prediction boundaries. Different activation functions behave differently and are chosen carefully based on the deep learning task at hand. We have already discussed the different types of activation functions in an earlier section of this chapter, Understanding activation functions.

In the next section, we will learn about the popular deep learning APIs—TensorFlow and Keras. 

TensorFlow versus Keras

Primarily, there are two levels of abstraction for deep learning frameworks:

  • Firstly, there is the lower level, where frameworks such as TensorFlow, Theano, and PyTorch sit. It is at this level where neural network elements such as convolutions and other generalized matrix operations are carried out.
  • Then, there is a higher level, where frameworks such as Keras are present. Here, primitives from the lower levels are utilized to create neural network layers and models. User-friendly APIs for training and saving models are also implemented here.

Since they sit at different levels of abstraction, Keras and TensorFlow cannot be compared directly. TensorFlow is not a dedicated deep learning library; while it is widely used for deep learning, it also serves a wide array of other applications. Keras, however, is a library developed from the ground up specifically for deep learning. It has very well-designed APIs...
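To illustrate the difference in abstraction levels, the following sketch (not from the book) computes the same dense transformation once with low-level TensorFlow operations and once with a high-level Keras layer; the tensor shapes are illustrative:

import tensorflow as tf
from tensorflow.keras.layers import Dense

# Lower level: TensorFlow primitives (generalized matrix operations)
x = tf.random.normal((1, 4))             # one sample with 4 features
w = tf.random.normal((4, 8))
b = tf.zeros((8,))
low_level_output = tf.nn.relu(tf.matmul(x, w) + b)

# Higher level: Keras wraps the same computation in a reusable layer
dense = Dense(8, activation='relu')
high_level_output = dense(x)              # weights are created and managed for us

print(low_level_output.shape, high_level_output.shape)   # both (1, 8)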

Summary

In this chapter, we learned how to convert biological neurons into artificial neurons, how ANNs work, and about various hyperparameters. We also gave an overview of two deep learning APIs: TensorFlow and Keras. This chapter has provided a foundation for deep learning. Now, you are ready to start implementing a deep learning model, which is the next step toward designing deep learning models for autonomous cars.

In the next chapter, we are going to implement a deep learning model using Keras.
