You're reading from Advanced Deep Learning with TensorFlow 2 and Keras - Second Edition

Product typeBook

Published inFeb 2020

Reading LevelIntermediate

PublisherPackt

ISBN-139781838821654

Edition2nd Edition

Languages

Python

Tools

Keras TensorFlow

Concepts

Programming Language

Author (1)

Rowel Atienza

4. Convolutional Neural Network (CNN)

We are now going to move onto the second artificial neural network, CNN. In this section, we're going to solve the same MNIST digit classification problem, but this time using a CNN.

Figure 1.4.1 shows the CNN model that we'll use for the MNIST digit classification, while its implementation is illustrated in Listing 1.4.1. Some changes in the previous model will be needed to implement the CNN model. Instead of having an input vector, the input tensor now has new dimensions (height, width, channels) or (image_size, image_size, 1) = (28, 28, 1) for the grayscale MNIST images. Resizing the train and test images will be needed to conform to this input shape requirement.

Figure 1.4.1: The CNN model for MNIST digit classification

Implement the preceding figure:

Listing 1.4.1: cnn-mnist-1.4.1.py

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Activation, Dense, Dropout
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten
from tensorflow.keras.utils import to_categorical, plot_model
from tensorflow.keras.datasets import mnist

# load mnist dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# compute the number of labels
num_labels = len(np.unique(y_train))

# convert to one-hot vector
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

# input image dimensions
image_size = x_train.shape[1]
# resize and normalize
x_train = np.reshape(x_train,[-1, image_size, image_size, 1])
x_test = np.reshape(x_test,[-1, image_size, image_size, 1])
x_train = x_train.astype('float32') / 255
x_test = x_test.astype('float32') / 255

# network parameters
# image is processed as is (square grayscale)
input_shape = (image_size, image_size, 1)
batch_size = 128
kernel_size = 3
pool_size = 2
filters = 64
dropout = 0.2

# model is a stack of CNN-ReLU-MaxPooling
model = Sequential()
model.add(Conv2D(filters=filters,
                 kernel_size=kernel_size,
                 activation='relu',
                 input_shape=input_shape))
model.add(MaxPooling2D(pool_size))
model.add(Conv2D(filters=filters,
                 kernel_size=kernel_size,
                 activation='relu'))
model.add(MaxPooling2D(pool_size))
model.add(Conv2D(filters=filters,
                 kernel_size=kernel_size,
                 activation='relu'))
model.add(Flatten())
# dropout added as regularizer
model.add(Dropout(dropout))
# output layer is 10-dim one-hot vector
model.add(Dense(num_labels))
model.add(Activation('softmax'))
model.summary()
plot_model(model, to_file='cnn-mnist.png', show_shapes=True)

# loss function for one-hot vector
# use of adam optimizer
# accuracy is good metric for classification tasks
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
# train the network
model.fit(x_train, y_train, epochs=10, batch_size=batch_size)

_, acc = model.evaluate(x_test,
                        y_test,
                        batch_size=batch_size,
                   verbose=0)
print("\nTest accuracy: %.1f%%" % (100.0 * acc))

The major change here is the use of the Conv2D layers. The ReLU activation function is already an argument of Conv2D. The ReLU function can be brought out as an Activation layer when the batch normalization layer is included in the model. Batch normalization is used in deep CNNs so that large learning rates can be utilized without causing instability during training.

Convolution

If, in the MLP model, the number of units characterizes the Dense layers, the kernel characterizes the CNN operations. As shown in Figure 1.4.2, the kernel can be visualized as a rectangular patch or window that slides through the whole image from left to right, and from top to bottom. This operation is called convolution. It transforms the input image into a feature map, which is a representation of what the kernel has learned from the input image. The feature map is then transformed into another feature map in the succeeding layer and so on. The number of feature maps generated per Conv2D is controlled by the filters argument.

Figure 1.4.2: A 3 × 3 kernel is convolved with an MNIST digit image.

The convolution is shown in steps t_n and t_n+1 where the kernel moved by a stride of 1 pixel to the right.

The computation involved in the convolution is shown in Figure 1.4.3:

Figure 1.4.3: The convolution operation shows how one element of the feature map is computed

For simplicity, a 5 × 5 input image (or input feature map) where a 3 × 3 kernel is applied is illustrated. The resulting feature map is shown after the convolution. The value of one element of the feature map is shaded. You'll notice that the resulting feature map is smaller than the original input image, this is because the convolution is only performed on valid elements. The kernel cannot go beyond the borders of the image. If the dimensions of the input should be the same as the output feature maps, Conv2D accepts the option padding='same'. The input is padded with zeros around its borders to keep the dimensions unchanged after the convolution.

Pooling operations

The last change is the addition of a MaxPooling2D layer with the argument pool_size=2. MaxPooling2D compresses each feature map. Every patch of size pool_size × pool_size is reduced to 1 feature map point. The value is equal to the maximum feature point value within the patch. MaxPooling2D is shown in the following figure for two patches:

Figure 1.4.4: MaxPooling2D operation. For simplicity, the input feature map is 4 × 4, resulting in a 2 × 2 feature map.

The significance of MaxPooling2D is the reduction in feature map size, which translates to an increase in receptive field size. For example, after MaxPooling2D(2), the 2 × 2 kernel is now approximately convolving with a 4 × 4 patch. The CNN has learned a new set of feature maps for a different receptive field size.

There are other means of pooling and compression. For example, to achieve a 50% size reduction as MaxPooling2D(2), AveragePooling2D(2) takes the average of a patch instead of finding the maximum. Strided convolution, Conv2D(strides=2,…), will skip every two pixels during convolution and will still have the same 50% size reduction effect. There are subtle differences in the effectiveness of each reduction technique.

In Conv2D and MaxPooling2D, both pool_size and kernel can be non-square. In these cases, both the row and column sizes must be indicated. For example, pool_ size = (1, 2) and kernel = (3, 5).

The output of the last MaxPooling2D operation is a stack of feature maps. The role of Flatten is to convert the stack of feature maps into a vector format that is suitable for either Dropout or Dense layers, similar to the MLP model output layer.

In the next section, we will evaluate the performance of the trained MNIST CNN classifier model.

Performance evaluation and model summary

As shown in Listing 1.4.2, the CNN model in Listing 1.4.1 requires a smaller number of parameters at 80,226 compared to 269,322 when MLP layers are used. The conv2d_1 layer has 640 parameters because each kernel has 3 × 3 = 9 parameters, and each of the 64 feature maps has one kernel and one bias parameter. The number of parameters for other convolution layers can be computed in a similar way.

Listing 1.4.2: Summary of a CNN MNIST digit classifier

Layer (type)	                 Output Shape	        Param #
=================================================================
conv2d_1 (Conv2D)                (None, 26, 26, 64)      640
max_pooling2d_1 (MaxPooiling2)   (None, 13, 13, 64)      0
conv2d_2 (Conv2D)                (None, 11, 11, 64)      36928
max_pooling2d_2 (MaxPooiling2)   (None, 5.5, 5, 64)      0
conv2d_3 (Conv2D)                (None, 3.3, 3, 64)      36928
flatten_1 (Flatten)              (None, 576)             0
dropout_1 (Dropout)              (None, 576)             0
dense_1 (Dense)                  (None, 10)              5770
activation_1 (Activation)        (None, 10)              0
===================================================================
Total params: 80,266
Trainable params: 80,266
Non-trainable params: 0

Figure 1.4.5: shows a graphical representation of the CNN MNIST digit classifier.

A screenshot of a cell phone Description automatically generated

Figure 1.4.5: Graphical description of the CNN MNIST digit classifier

Table 1.4.1 shows a maximum test accuracy of 99.4%, which can be achieved for a 3-layer network with 64 feature maps per layer using the Adam optimizer with dropout=0.2. CNNs are more parameter efficient and have a higher accuracy than MLPs. Likewise, CNNs are also suitable for learning representations from sequential data, images, and videos.

Layers	Optimizer	Regularizer	Train Accuracy (%)	Test Accuracy (%)
64-64-64	SGD	Dropout(0.2)	97.76	98.50
64-64-64	RMSprop	Dropout(0.2)	99.11	99.00
64-64-64	Adam	Dropout(0.2)	99.75	99.40
64-64-64	Adam	Dropout(0.4)	99.64	99.30

Table 1.4.1: Different CNN network configurations and performance measures for the CNN MNIST digit classifier.

Having looked at CNNs and evaluated the trained model, let's look at the final core network that we will discuss in this chapter: RNN.

The rest of the page is locked

You have been reading a chapter from

Advanced Deep Learning with TensorFlow 2 and Keras - Second Edition

Published in: Feb 2020Publisher: PacktISBN-13: 9781838821654

Author (1)

Rowel Atienza

Rowel Atienza is an Associate Professor at the Electrical and Electronics Engineering Institute of the University of the Philippines, Diliman. He holds the Dado and Maria Banatao Institute Professorial Chair in Artificial Intelligence. Rowel has been fascinated with intelligent robots since he graduated from the University of the Philippines. He received his MEng from the National University of Singapore for his work on an AI-enhanced four-legged robot. He finished his Ph.D. at The Australian National University for his contribution on the field of active gaze tracking for human-robot interaction. Rowel's current research work focuses on AI and computer vision. He dreams on building useful machines that can perceive, understand, and reason. To help make his dreams become real, Rowel has been supported by grants from the Department of Science and Technology (DOST), Samsung Research Philippines, and Commission on Higher Education-Philippine California Advanced Research Institutes (CHED-PCARI).
Read more about Rowel Atienza

Other recommended products

Related to this chapter

Generative Adversarial Networks Projects

In this book, we will use different complexities of datasets in order to build end-to-end projects. With every chapter, the level of complexity and operations will become advanced. It consists of 8 full-fledged projects covering approaches such as 3D-GAN, Age-cGAN, DCGAN, SRGAN, StackGAN, and CycleGAN with real-world use cases.

BookJan 2019316 pages

Hands-On Generative Adversarial Networks with Keras

This book will explore deep learning and generative models, and their applications in artificial intelligence. You will learn to evaluate and improve your GAN models by eliminating challenges that are encountered in real-world applications. You will implement GAN architectures in various domains such as computer vision, NLP, and audio processing

BookMay 2019272 pages

Hands-On Image Generation with TensorFlow

This book is a step-by-step guide to show you how to implement generative models in TensorFlow 2.x from scratch. You’ll get to grips with the image generative technology by covering autoencoders, style transfer, and GANs as well as fundamental and state-of-the-art models.

BookDec 2020306 pages

PyTorch Computer Vision Cookbook

This book enables you to solve the trickiest of problems in computer vision using deep learning algorithms and techniques. You will learn to use several different algorithms for different CV problems such as classification, detection, segmentation, and more using Pytorch. Packed with best practices in training and deployment of CV applications.

BookMar 2020364 pages

TensorFlow Reinforcement Learning Quick Start Guide

This book is an essential guide for anyone interested in Reinforcement Learning. The book provides an actionable reference for Reinforcement Learning algorithms and their applications using TensorFlow and Python. It will help readers leverage the power of algorithms such as Deep Q-Network (DQN), Deep Deterministic Policy Gradients (DDPG), and Proximal Policy Optimization (PPO) to solve challenging control and decision-making problems.

BookMar 2019184 pages

Hands-On Generative Adversarial Networks with PyTorch 1.x

This book will help you understand how GANs architecture works using PyTorch. You will get familiar with the most flexible deep learning toolkit and use it to transform ideas into actual working codes. You will apply GAN models to areas like computer vision, multimedia and natural language processing using a sample-generation perspective.

BookDec 2019312 pages

Deep Reinforcement Learning with Python

Deep Reinforcement Learning with Python - Second Edition will help you learn reinforcement learning algorithms, techniques and architectures – including deep reinforcement learning – from scratch. This new edition is an extensive update of the original, reflecting the state-of-the-art latest thinking in reinforcement learning.

BookSep 2020760 pages

Generative Adversarial Networks Cookbook

Generative Adversarial Networks have opened up many new possibilities in the machine learning domain. This book is all you need to implement different types of GANs using TensorFlow and Keras, in order to provide optimized and efficient deep learning solutions.

BookDec 2018268 pages

Hands-On Deep Learning Algorithms with Python

This book introduces basic-to-advanced deep learning algorithms used in a production environment by AI researchers and principal data scientists; it explains algorithms intuitively, including the underlying math, and shows how to implement them using popular Python-based deep learning libraries such as TensorFlow.

BookJul 2019512 pages

Deep Learning with R Cookbook

This book will help you get through the problems that you face during the execution of different tasks and understand hacks in deep learning. With unique recipes, you will implement various deep learning architectures using R 3.5.x. You will cover complex algorithms to perform tasks such as reinforcement learning, GANs, advanced neural networks and more.

BookFeb 2020328 pages

Intelligent Projects Using Python

This book includes 9 projects on building smart and practical AI-based systems. These projects cover solutions to different domain-specific problems in healthcare, e-commerce and more. With this book, you will apply different machine learning and deep learning techniques and learn how to build your own intelligent applications for smart predictions and other insight-driven tasks.

BookJan 2019342 pages

Advanced Deep Learning with Python

This book is an expert-level guide to master the neural network variants using the Python ecosystem. You will gain the skills to build smarter, faster, and efficient deep learning systems with practical examples. By the end of this book, you will be up to date with the latest advances and current researches in the deep learning domain.

BookDec 2019468 pages

Personalised recommendations for you

Based on your interests and search pattern

C++ Programming for Linux Systems

This book covers the essential system programming tools and helps you explore the features of C++20. It emphasizes important details to maintain code quality and tackle everyday challenges of developing software for high performance, optimization, and more.

BookSep 2023288 pages

Expert C++

Discover advanced programming techniques, the latest features of C++17 and C++20, and best practices for memory management, debugging, testing, and large-scale application design with Expert C++. Ideal for experienced developers advancing to proficient programmers and building professional-grade C++ applications.

BookAug 2023604 pages

iOS 17 Programming for Beginners

iOS 17 Programming for Beginners, Eighth Edition is your comprehensive guide to learning the art of iOS app development. Whether you dream of creating the next chart-topping app or simply want to enhance your programming skills, this book is your trusted companion on this exciting journey.

BookOct 2023604 pages4

Developer Career Masterplan

Written by industry experts that have spent the last 20+ years helping developers grow their career path towards senior developer positions and beyond. This book provides a comprehensive guide, sharing examples and stories from their global careers. By the end, you’ll have the knowledge to create a clear career progression plan as a technical professional.

BookSep 2023310 pages

Refactoring with C#

In Refactoring with C#, you’ll explore the process of safely refactoring modern .NET code using Visual Studio features, advanced unit tests, AI assistance, and custom Roslyn analyzers.

BookNov 2023434 pages

Python Real-World Projects

Amplify your developer journey by curating a dynamic project portfolio that outshines traditional resumes. Delve into the Python realm through immersive projects, mastering core concepts while constructing comprehensive modules and applications. From data acquisition prowess to impactful data visualization, Python Real-World Projects arms you with essential skills to beat the competition.

BookSep 2023478 pages5

The MVVM Pattern in .NET MAUI

The MVVM Pattern in .NET MAUI enables developers to master MVVM principles and effectively apply them to .NET MAUI. This book uses real-life examples and covers complex problems to help you successfully apply MVVM with .NET MAUI to confidently develop robust and high-performing cross-platform apps.

BookNov 2023386 pages

Extending Microsoft Business Central with Power Platform

Extending Business Central with the Power Platform is a step-by-step guide for Business Central professionals to create solutions that automate business processes, explain complex workflow approvals, and integrate with hundreds of other systems, without traditional development. It’ll guide you in customizing Business Central with Power Platform.

BookAug 2023458 pages5

Extending Microsoft Business Central with Power Platform

Extending Business Central with the Power Platform is a step-by-step guide for Business Central professionals to create solutions that automate business processes, explain complex workflow approvals, and integrate with hundreds of other systems, without traditional development. It’ll guide you in customizing Business Central with Power Platform.

BookAug 2023458 pages5

Quantum Computing Algorithms

The book emphasizes intuitive ideas behind quantum algorithms in ways that other books don’t cover, striking a careful balance between no math and too much math. To get the most from this book, you should be comfortable with basic algebra and writing simple computer code. No prior understanding of quantum physics is needed to get started.

BookSep 2023342 pages

Python – Complete Python, Django, Data Science and ML Guide

Unlock Python's full potential with this 50+ hour course! From programming to web and game development, data manipulation, and machine learning, gain the skills required to succeed in various Python-related careers. With practical tasks, hands-on experience, and a strong foundation in Python, you'll be ready to tackle real-world challenges and take advantage of the many opportunities this versatile language offers.

VideoNov 202350 hours 30 minutes5

Python – Complete Python, Django, Data Science and ML Guide

Unlock Python's full potential with this 50+ hour course! From programming to web and game development, data manipulation, and machine learning, gain the skills required to succeed in various Python-related careers. With practical tasks, hands-on experience, and a strong foundation in Python, you'll be ready to tackle real-world challenges and take advantage of the many opportunities this versatile language offers.

VideoNov 202350 hours 30 minutes5

You're reading from Advanced Deep Learning with TensorFlow 2 and Keras - Second Edition

4. Convolutional Neural Network (CNN)

Convolution

Pooling operations

Performance evaluation and model summary

Unlock this book and the full library FREE for 7 days

Author (1)

Generative Adversarial Networks Projects

Hands-On Generative Adversarial Networks with Keras

Hands-On Image Generation with TensorFlow

This book is a step-by-step guide to show you how to implement generative models in TensorFlow 2.x from scratch. You’ll get to grips with the image generative technology by covering autoencoders, style transfer, and GANs as well as fundamental and state-of-the-art models.

PyTorch Computer Vision Cookbook

TensorFlow Reinforcement Learning Quick Start Guide

Hands-On Generative Adversarial Networks with PyTorch 1.x

Deep Reinforcement Learning with Python

Generative Adversarial Networks Cookbook

Generative Adversarial Networks have opened up many new possibilities in the machine learning domain. This book is all you need to implement different types of GANs using TensorFlow and Keras, in order to provide optimized and efficient deep learning solutions.

Hands-On Deep Learning Algorithms with Python

Deep Learning with R Cookbook

Intelligent Projects Using Python

Advanced Deep Learning with Python

C++ Programming for Linux Systems

This book covers the essential system programming tools and helps you explore the features of C++20. It emphasizes important details to maintain code quality and tackle everyday challenges of developing software for high performance, optimization, and more.

Expert C++

iOS 17 Programming for Beginners

iOS 17 Programming for Beginners, Eighth Edition is your comprehensive guide to learning the art of iOS app development. Whether you dream of creating the next chart-topping app or simply want to enhance your programming skills, this book is your trusted companion on this exciting journey.

Developer Career Masterplan

Refactoring with C#

In Refactoring with C#, you’ll explore the process of safely refactoring modern .NET code using Visual Studio features, advanced unit tests, AI assistance, and custom Roslyn analyzers.

Python Real-World Projects

The MVVM Pattern in .NET MAUI

The MVVM Pattern in .NET MAUI enables developers to master MVVM principles and effectively apply them to .NET MAUI. This book uses real-life examples and covers complex problems to help you successfully apply MVVM with .NET MAUI to confidently develop robust and high-performing cross-platform apps.

Extending Microsoft Business Central with Power Platform

Extending Microsoft Business Central with Power Platform

Quantum Computing Algorithms

Python – Complete Python, Django, Data Science and ML Guide

Python – Complete Python, Django, Data Science and ML Guide