
You're reading from  Hands-On Mathematics for Deep Learning

Product type: Book
Published in: Jun 2020
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781838647292
Edition: 1st Edition
Author: Jay Dawani

Jay Dawani is a former professional swimmer turned mathematician and computer scientist. He is also a Forbes 30 Under 30 Fellow. At present, he is the Director of Artificial Intelligence at Geometric Energy Corporation (NATO CAGE) and the CEO of Lemurian Labs, a startup he founded that is developing the next generation of autonomy, intelligent process automation, and driver intelligence. Previously, he was the technology and R&D advisor to Spacebit Capital. He has spent the last three years researching at the frontiers of AI, with a focus on reinforcement learning, open-ended learning, deep learning, quantum machine learning, human-machine interaction, multi-agent and complex systems, and artificial general intelligence.

Convolutional Neural Networks

In this chapter, we will cover one of the most popular and widely used deep neural networks—the convolutional neural network (CNN, also known as ConvNet).

It is this class of neural networks that is largely responsible for the incredible feats accomplished in computer vision over the last few years. The breakthrough began with AlexNet, created by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, which outperformed all the other models in the 2012 ImageNet Large Scale Visual Recognition Challenge (ILSVRC) and set off the deep learning revolution.

ConvNets are a very powerful type of neural network for processing data that has a grid-like topology (that is, data in which there is a spatial correlation between neighboring points). They are tremendously useful in a variety of applications, such as facial recognition, self-driving cars, surveillance...

The inspiration behind ConvNets

CNNs are a type of artificial neural network (ANN); they are loosely inspired by the way the human visual cortex processes images, allowing our brains to recognize objects in the world and interact with them, which in turn allows us to do a number of things, such as drive, play sports, read, and watch movies.

It has been found that computations somewhat resembling convolutions take place in our brains. Additionally, the visual cortex contains both simple and complex cells. Simple cells pick up basic features, such as edges and curves, while complex cells respond to the same cues but also exhibit spatial invariance.

Types of data used in ConvNets

CNNs work exceptionally well on visual tasks, such as object classification and object recognition in images and videos, as well as pattern recognition in music, sound clips, and so on. They work effectively in these areas because they exploit the structure of the data to learn from it, which means the structure must be preserved: images, for example, have a fixed spatial arrangement, and if we were to alter it, the image would no longer make sense. This differs from standard ANNs, where the ordering of the features in an input vector does not matter. For this reason, the data fed to a CNN is stored in multidimensional arrays.

In computers, images are either grayscale (black and white) or colored (RGB, or RGB-D when a depth channel is added), and videos are sequences of such images; all of them are made up of pixels. A pixel is the smallest unit of a digitized image that can be displayed on a computer, and it holds values in the range [0, 255]. The...
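
To make the storage format concrete, here is a minimal sketch of how grayscale and RGB images, and video clips, map onto multidimensional arrays of pixel values in [0, 255]. It uses NumPy and arbitrary example shapes, which are illustrative assumptions rather than choices taken from the book:

```python
import numpy as np

# A grayscale image is a 2-D array: (height, width), one intensity per pixel.
gray = np.random.randint(0, 256, size=(28, 28), dtype=np.uint8)

# An RGB image adds a channel axis: (height, width, 3).
rgb = np.random.randint(0, 256, size=(28, 28, 3), dtype=np.uint8)

# A short video clip adds a time axis: (frames, height, width, 3).
video = np.random.randint(0, 256, size=(16, 28, 28, 3), dtype=np.uint8)

print(gray.shape, rgb.shape, video.shape)  # (28, 28) (28, 28, 3) (16, 28, 28, 3)
print(gray.dtype)                          # uint8, so every value lies in [0, 255]
```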

Convolutions and pooling

In Chapter 7, Feedforward Neural Networks, we saw how deep neural networks are built and how weights connect neurons in one layer to neurons in the previous or following layer. The layers in CNNs, however, are connected through a linear operation known as convolution, which is where their name comes from and is what makes the architecture so powerful for images.

Here, we will go over the various kinds of convolution and pooling operations used in practice and what the effect of each is. But first, let's see what convolution actually is.

Two-dimensional convolutions

In mathematics, we write convolutions as follows:
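One standard way of writing it (the notation here is assumed, not taken from the book) uses an input function f and a kernel g; for images, the discrete two-dimensional form, with an input I and a kernel K, is the one actually applied:

\[
(f * g)(t) = \int_{-\infty}^{\infty} f(\tau)\, g(t - \tau)\, d\tau,
\qquad
(I * K)(i, j) = \sum_{m} \sum_{n} I(m, n)\, K(i - m, j - n).
\]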

What this means is that we have a function, f, which is our input and a function...
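
As an illustrative sketch (not the book's code), the discrete two-dimensional operation can be implemented naively in a few lines. Note that deep learning libraries usually compute cross-correlation, which is convolution with the kernel left unflipped:

```python
import numpy as np

def conv2d_valid(image, kernel, flip_kernel=True):
    """Naive 2-D convolution with no padding and stride 1 ('valid' mode)."""
    if flip_kernel:                      # true convolution flips the kernel;
        kernel = kernel[::-1, ::-1]      # CNN libraries typically skip this step
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Each output element is the sum of an elementwise product between
            # the kernel and the patch of the input it currently covers.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A 3 x 3 input convolved with a 2 x 2 kernel gives a 2 x 2 output.
x = np.arange(1, 10, dtype=float).reshape(3, 3)
w = np.array([[1.0, 0.0], [0.0, -1.0]])
print(conv2d_valid(x, w, flip_kernel=False))
```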

Working with the ConvNet architecture

Now that we know all the different components that make up a ConvNet, we can put it all together and see how to construct a deep CNN. In this section, we will build a full architecture and observe how forward propagation works and how we decide the depth of the network, the number of kernels to apply, when and why to use pooling, and so on. But before we dive in, let's explore some of the ways in which CNNs differ from feedforward neural networks (FNNs); a small architecture sketch follows the list. The differences are as follows:

  • The neurons in CNNs have local connectivity, which means that each neuron in a successive layer receives input from a small local group of pixels in the image, instead of receiving the entire image, as an FNN neuron would.
  • Neurons within a layer of a CNN share the same weight parameters; the same kernel is applied at every spatial position of the input.
  • The layers in CNNs can be normalized.
  • CNNs are translation invariant, which allows us...
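
The following is a minimal sketch of such an architecture, written with PyTorch purely for illustration (the library, layer sizes, and input shape are assumptions, not the book's choices). It stacks convolution, nonlinearity, and pooling before a fully connected classifier:

```python
import torch
import torch.nn as nn

# A small CNN for, say, 28 x 28 grayscale images and 10 classes (assumed values).
model = nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=16, kernel_size=3, padding=1),   # 28x28 -> 28x28
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2),                                           # 28x28 -> 14x14
    nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3, padding=1),  # 14x14 -> 14x14
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2),                                           # 14x14 -> 7x7
    nn.Flatten(),
    nn.Linear(32 * 7 * 7, 10),                                             # class scores
)

x = torch.randn(8, 1, 28, 28)   # a batch of 8 single-channel images
print(model(x).shape)           # torch.Size([8, 10])
```

Here, each convolution preserves the spatial size because of the padding, while each pooling layer halves it; the number of kernels (16, then 32) is an arbitrary choice for the sketch.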

Training and optimization

Now that we've got that sorted, it's time for us to dive into the really fun stuff: how do we train these fantastic architectures? Do we need a completely new algorithm to facilitate training and optimization? No! We can still use backpropagation and gradient descent to calculate the error, differentiate it with respect to the weights of the previous layers, and update those weights to get us as close to the global optimum as possible.

But before we go further, let's go through how backpropagation works in CNNs, particularly with kernels. Let's revisit the example we used earlier on in this chapter, where we convolved a 3 × 3 input with a 2 × 2 kernel, which looked as follows:
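In generic notation (these symbols are illustrative placeholders, not necessarily the book's), the 3 × 3 input X and the 2 × 2 kernel W can be written as:

\[
X = \begin{pmatrix} x_{11} & x_{12} & x_{13} \\ x_{21} & x_{22} & x_{23} \\ x_{31} & x_{32} & x_{33} \end{pmatrix},
\qquad
W = \begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix},
\]

and a stride-1, no-padding convolution of X with W produces a 2 × 2 output O.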

We expressed each element in the output matrix as follows:
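Using that notation, and assuming the cross-correlation convention that CNN libraries typically use, each element of O is:

\[
o_{ij} = \sum_{m=1}^{2} \sum_{n=1}^{2} x_{i+m-1,\; j+n-1}\, w_{mn}, \qquad i, j \in \{1, 2\},
\]

so, for example, \(o_{11} = x_{11} w_{11} + x_{12} w_{12} + x_{21} w_{21} + x_{22} w_{22}\).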

We should remember from Chapter 7, Feedforward Neural Networks, where we introduced backpropagation, that we...
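
As a sketch of how the kernel gradient falls out of those expressions (a generic derivation, not the book's code): each weight's gradient is the sum, over all output positions, of the upstream gradient multiplied by the input value that weight touched, which is itself a valid convolution of the input with the upstream gradient:

```python
import numpy as np

def kernel_gradient(x, grad_out, kh=2, kw=2):
    """Gradient of the loss w.r.t. a kh x kw kernel for a stride-1, no-padding
    convolution: dL/dw[m, n] = sum_{i, j} grad_out[i, j] * x[i + m, j + n]."""
    grad_w = np.zeros((kh, kw))
    oh, ow = grad_out.shape
    for m in range(kh):
        for n in range(kw):
            # Each kernel weight sees the input patch shifted by (m, n).
            grad_w[m, n] = np.sum(grad_out * x[m:m + oh, n:n + ow])
    return grad_w

x = np.arange(1, 10, dtype=float).reshape(3, 3)   # the 3 x 3 input
grad_out = np.ones((2, 2))                        # assumed upstream gradient dL/dO
print(kernel_gradient(x, grad_out))               # [[12. 16.], [24. 28.]]
```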

Summary

Congratulations! We have just finished learning about a powerful variant of neural networks known as CNNs, which are very effective in tasks relating to computer vision and time-series prediction. We will revisit CNNs later on in this book, but in the meantime, let's move on to the next chapter and learn about recurrent and recursive neural networks.
