You're reading from  Practical Convolutional Neural Networks

Product type: Book
Published in: Feb 2018
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781788392303
Edition: 1st Edition
Authors (3):
Mohit Sewak

Mohit is a Python programmer with a keen interest in the field of information security. He completed his Bachelor's degree in technology in computer science at Kurukshetra University, Kurukshetra, and a Master's in engineering (2012) in computer science at Thapar University, Patiala. He holds CEH and ECSA certifications from EC-Council USA. He has worked at IBM, Teramatrix (a startup), and Sapient. He is currently pursuing a Ph.D. at Thapar Institute of Engineering & Technology under Dr. Maninder Singh. He has published several articles in national and international magazines. He is the author of Python Penetration Testing Essentials, Python: Penetration Testing for Developers, and Learn Python in 7 days, also by Packt. For more details on the author, you can look up the user name mohitraj.cs.

Md. Rezaul Karim

Md. Rezaul Karim is a researcher, author, and data science enthusiast with a strong computer science background, coupled with 10 years of research and development experience in machine learning, deep learning, and data mining algorithms to solve emerging bioinformatics research problems by making them explainable. He is passionate about applied machine learning, knowledge graphs, and explainable artificial intelligence (XAI). Currently, he is working as a research scientist at Fraunhofer FIT, Germany. He is also a PhD candidate at RWTH Aachen University, Germany. Before joining FIT, he worked as a researcher at the Insight Centre for Data Analytics, Ireland. Previously, he worked as a lead software engineer at Samsung Electronics, Korea.

Pradeep Pujari

https://www.linkedin.com/in/ppujari/

Build Your First CNN and Performance Optimization

A convolutional neural network (CNN) is a type of feed-forward neural network (FNN) in which the connectivity pattern between neurons is inspired by an animal's visual cortex. In the last few years, CNNs have delivered state-of-the-art, and on some visual recognition tasks superhuman, performance in applications such as image search services, self-driving cars, automatic video classification, voice recognition, and natural language processing (NLP).

With this motivation, in this chapter we will construct a simple CNN model for image classification from scratch and then cover some theoretical aspects, such as convolutional and pooling operations. We will then discuss how to tune hyperparameters and optimize the training time of CNNs for improved classification accuracy. Finally, we will build a second CNN model that follows some best practices. In a nutshell, the following topics...

CNN architectures and drawbacks of DNNs

In Chapter 2, Introduction to Convolutional Neural Networks, we discussed that a regular multilayer perceptron works fine for small images (for example, MNIST or CIFAR-10). However, it breaks down for larger images because of the huge number of parameters it requires. For example, a 100 × 100 image has 10,000 pixels, and if the first layer has just 1,000 neurons (which already severely restricts the amount of information transmitted to the next layer), that means 10 million connections, and that is just for the first layer.
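The arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the function name dense_layer_connections is made up for this example, and the three-channel case is a hypothetical extension that the text does not discuss.

```python
def dense_layer_connections(height, width, channels, neurons):
    """Count the weights linking a flattened image to a fully connected layer."""
    inputs = height * width * channels  # every pixel value is one input
    return inputs * neurons             # every input connects to every neuron

# The case from the text: a 100 x 100 single-channel image and 1,000 neurons.
print(dense_layer_connections(100, 100, 1, 1000))  # 10000000

# Hypothetical extension: the same image with three color channels.
print(dense_layer_connections(100, 100, 3, 1000))  # 30000000
```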

CNNs solve this problem using partially connected layers. Because consecutive layers are only partially connected and because a CNN heavily reuses its weights, it has far fewer parameters than a fully connected DNN, which makes it much faster to train, reduces the risk of overfitting, and requires much less...
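To see how much weight reuse saves, the sketch below counts the parameters of a single convolutional layer. The helper name conv2d_parameters and the choice of a 3 × 3 kernel with 32 filters are assumptions made for illustration; they do not come from the chapter.

```python
def conv2d_parameters(kernel_h, kernel_w, in_channels, filters):
    """Count trainable parameters in one 2D convolutional layer."""
    weights = kernel_h * kernel_w * in_channels * filters  # one kernel per filter
    biases = filters                                       # one bias per filter
    return weights + biases

# Hypothetical layer: 3 x 3 kernels, 1 input channel, 32 filters.
conv_params = conv2d_parameters(3, 3, 1, 32)
print(conv_params)  # 320

# The fully connected layer from the earlier example: 10,000 inputs x 1,000 neurons.
dense_params = 100 * 100 * 1000
print(dense_params // conv_params)  # 31250
```

Note that the convolutional count does not depend on the image size at all, because the same small kernels slide over every position of the input.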

Convolution and pooling operations in TensorFlow

Now that we have seen how convolutional and pooling operations work in theory, let's see how to perform these operations hands-on using TensorFlow.

Applying pooling operations in TensorFlow

In TensorFlow, a subsampling layer is normally represented by a max_pool operation that keeps the initial parameters of the layer. The max_pool operation has the following signature in TensorFlow:

tf.nn.max_pool(value, ksize, strides, padding, data_format, name) 

Now let's learn how to create a function that utilizes the preceding signature and returns a tensor with type tf.float32, that is, the max pooled output tensor:

import tensorflow...
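To make concrete what a max-pooling operation computes, here is a minimal pure-Python sketch that does not use TensorFlow. The function max_pool_2d is hypothetical; its ksize and stride arguments only loosely mirror the ksize and strides parameters in the signature above, and it handles a single-channel 2D input with no padding (the behavior TensorFlow calls 'VALID').

```python
def max_pool_2d(image, ksize=2, stride=2):
    """Max-pool a 2D list of numbers using windows that fit entirely inside it."""
    rows, cols = len(image), len(image[0])
    out_rows = (rows - ksize) // stride + 1
    out_cols = (cols - ksize) // stride + 1
    pooled = []
    for r in range(out_rows):
        pooled_row = []
        for c in range(out_cols):
            r0, c0 = r * stride, c * stride  # top-left corner of the window
            window = [image[i][j]
                      for i in range(r0, r0 + ksize)
                      for j in range(c0, c0 + ksize)]
            pooled_row.append(max(window))   # keep only the largest activation
        pooled.append(pooled_row)
    return pooled

# A made-up 4 x 4 feature map.
feature_map = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 0],
    [1, 8, 3, 4],
]
print(max_pool_2d(feature_map))  # [[6, 4], [8, 9]]
```

With a 2 × 2 window and a stride of 2, each output value summarizes a non-overlapping block, so the height and width are both halved.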

Training a CNN

In the previous section, we saw how to construct a CNN and apply different operations in its layers. Training a CNN is much trickier, because it requires careful choices to control those operations: applying an appropriate activation function, initializing weights and biases, and, of course, using optimizers intelligently.

There are also some advanced considerations, such as hyperparameter tuning, but those will be discussed in the next section. We start our discussion with weight and bias initialization.

Weight and bias initialization

One of the most common initialization techniques in training a DNN is random initialization. The idea...
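As a rough illustration of random initialization, the sketch below draws each weight from a zero-mean Gaussian with a small standard deviation and sets the biases to zero. These choices, the function name init_layer, and the default stddev of 0.05 are assumptions made for this example; the chapter's own initialization code is not shown here and may differ.

```python
import random

def init_layer(n_inputs, n_outputs, stddev=0.05, seed=None):
    """Return (weights, biases) for one fully connected layer."""
    rng = random.Random(seed)  # a private generator keeps runs reproducible
    # Small random weights break the symmetry between neurons.
    weights = [[rng.gauss(0.0, stddev) for _ in range(n_outputs)]
               for _ in range(n_inputs)]
    # Assumption: biases start at zero, a common but not universal choice.
    biases = [0.0] * n_outputs
    return weights, biases

weights, biases = init_layer(n_inputs=4, n_outputs=3, seed=42)
print(len(weights), len(weights[0]), len(biases))  # 4 3 3
```

If every weight started at the same value, all neurons in a layer would receive identical gradients and learn the same thing, which is why some randomness is needed.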

Building, training, and evaluating our first CNN

In the next section, we will look at how to distinguish dogs from cats based on their raw images. We will also implement our first CNN model to handle raw color images with three channels. The network design and implementation are not straightforward, because we will use TensorFlow's low-level APIs. However, do not worry; later in this chapter, we will see another example that implements a CNN using TensorFlow's high-level contrib API. Before we formally start, a short description of the dataset is in order.

Dataset description

For this example, we will use the dog versus cat dataset from Kaggle that was provided for...

Model performance optimization

Because CNNs differ from other models in their layered architecture, they have different requirements as well as different tuning criteria. How do you know which combination of hyperparameters is best for your task? For linear machine learning models, you can of course use a grid search with cross-validation to find the right hyperparameters.

However, for CNNs, there are many hyperparameters to tune, and since training a neural network on a large dataset takes a lot of time, you will only be able to explore a tiny part of the hyperparameter space in a reasonable amount of time. Here are some guidelines you can follow.
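A quick calculation shows why an exhaustive grid search becomes impractical. In the sketch below, the hyperparameter names, the candidate values, the number of folds, and the two hours per training run are all hypothetical figures chosen for illustration.

```python
from itertools import product

# Hypothetical search space: a handful of values per hyperparameter.
search_space = {
    "learning_rate": [0.1, 0.01, 0.001],
    "batch_size": [32, 64, 128],
    "num_filters": [16, 32, 64],
    "dropout_rate": [0.3, 0.5],
}

def grid(space):
    """Enumerate every combination of values in the search space."""
    names = list(space)
    return [dict(zip(names, values))
            for values in product(*(space[name] for name in names))]

configs = grid(search_space)
print(len(configs))  # 54

folds = 5          # assumed number of cross-validation folds
hours_per_run = 2  # assumed training time for one model
print(len(configs) * folds * hours_per_run)  # 540
```

Even this small grid would need 540 hours of training under these assumptions, and adding one more hyperparameter with three values would triple that.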

Number of hidden layers

For many problems, you can just begin...

Summary

In this chapter, we discussed how to use CNNs, which are a type of feed-forward artificial neural network in which the connectivity pattern between neurons is inspired by the organization of an animal's visual cortex. We saw how to cascade a set of layers to construct a CNN and perform different operations in each layer. Then we saw how to train a CNN. Later on, we discussed how to tune CNN hyperparameters and optimize model performance.

Finally, we built another CNN, where we utilized all the optimization techniques. Our CNN models did not achieve outstanding accuracy because we trained both CNNs for only a few iterations and did not apply any grid search techniques; that is, we did not hunt for the best combination of hyperparameters. Therefore, the takeaway would be to apply more robust feature engineering in the raw images, iterate the training for more...

