You're reading from Deep Learning with TensorFlow

Product type: Book
Published in: Apr 2017
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781786469786
Edition: 1st Edition
Authors (3):
Giancarlo Zaccone

Giancarlo Zaccone has over fifteen years' experience of managing research projects in the scientific and industrial domains. He is a software and systems engineer at the European Space Agency (ESTEC), where he mainly deals with the cybersecurity of satellite navigation systems. Giancarlo holds a master's degree in physics and an advanced master's degree in scientific computing. Giancarlo has already authored the following titles, available from Packt: Python Parallel Programming Cookbook (First Edition), Getting Started with TensorFlow, Deep Learning with TensorFlow (First Edition), and Deep Learning with TensorFlow (Second Edition).

Md. Rezaul Karim

Md. Rezaul Karim is a researcher, author, and data science enthusiast with a strong computer science background, coupled with 10 years of research and development experience in machine learning, deep learning, and data mining algorithms to solve emerging bioinformatics research problems by making them explainable. He is passionate about applied machine learning, knowledge graphs, and explainable artificial intelligence (XAI). Currently, he is working as a research scientist at Fraunhofer FIT, Germany. He is also a PhD candidate at RWTH Aachen University, Germany. Before joining FIT, he worked as a researcher at the Insight Centre for Data Analytics, Ireland. Previously, he worked as a lead software engineer at Samsung Electronics, Korea.

Ahmed Menshawy

Ahmed Menshawy is a Research Engineer at Trinity College Dublin, Ireland. He has more than 5 years of working experience in ML and NLP. He holds an MSc in Advanced Computer Science. He started his career as a Teaching Assistant at the Department of Computer Science, Helwan University, Cairo, Egypt, where he taught several advanced courses, including ML and Image Processing. He was involved in implementing a state-of-the-art system for Arabic Text to Speech. He was the main ML specialist at the industrial research and development lab at IST Networks, based in Egypt.

TensorFlow on a Convolutional Neural Network

Convolutional Neural Networks (CNNs) are deep learning networks that have achieved excellent results in many practical applications, primarily in object recognition in images. A CNN architecture is organized into a series of blocks. The first blocks are composed of two types of layers, convolutional layers and pooling layers, while the last blocks are fully-connected layers with softmax layers.

We'll develop two examples of CNNs for image classification problems. The first is the classic MNIST digit classification problem, for which we'll see how to build a CNN that reaches 99 percent accuracy. The training set for the second example is taken from the Kaggle platform. The purpose here is to train a network on a series of facial images to classify the emotion they express.

We'll evaluate the accuracy of the model and then we'll test it on a...

Introducing CNNs

In recent years, Deep Neural Networks (DNNs) have given new impetus to both research and industry, and are therefore being used increasingly. A special type of DNN is the Convolutional Neural Network (CNN), which has been used with great success in image classification problems.

Before diving into the implementation of an image classifier based on a CNN, we'll introduce some basic concepts in image recognition, such as feature detection and convolution.

A digital image is represented as a grid made up of a large number of small squares, called pixels. The following figure represents a black and white image on a 5x5 grid of pixels:

Black and white image

Each element of the grid corresponds to a pixel and, in the case of a black and white image, it takes either the value 1, which is associated with the color black, or the value 0, which is associated with...
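As a minimal sketch of this representation, the grid can be written as a nested list in plain Python. The pixel pattern below is invented for illustration (the original figure is not reproduced here), and the code assumes that 0 denotes a white pixel, since the sentence above is cut off. The names image, render, and flatten are hypothetical helpers, not part of the book's code.

```python
# Hypothetical 5x5 black-and-white image; the pattern is invented for illustration.
# 1 = black pixel; 0 is assumed to mean a white pixel.
image = [
    [0, 0, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
]


def render(grid):
    """Return a text rendering: '#' for a pixel value of 1, '.' for 0."""
    return "\n".join(
        "".join("#" if px == 1 else "." for px in row) for row in grid
    )


def flatten(grid):
    """Flatten the grid row by row into a single list of pixel values."""
    return [px for row in grid for px in row]


pixels = flatten(image)
print(render(image))
print(len(pixels))  # one value per pixel of the grid
```

Flattening the grid yields one value per pixel, which is the form in which an input layer would receive the image.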

CNN architecture

Taking the 5x5 input matrix shown earlier as an example, a CNN has an input layer of 25 neurons (5x5 = 25), whose task is to acquire the input value corresponding to each pixel and transfer it to the next hidden layer.

In a multilayer network, the outputs of all neurons of the input layer would be connected to each neuron of the hidden layer (fully-connected layer).

In CNNs, the connection scheme that defines the convolutional layer, which we are going to describe, is significantly different.

As you can probably guess, this is the main type of layer; the use of one or more of these layers in a CNN is indispensable.

In a convolutional layer, each neuron is connected to a certain region of the input area called the receptive field.

For example, using a 3x3 kernel filter, each neuron will have a bias and 3x3 = 9 weights connected to a single receptive field. Of course, to effectively...
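The receptive-field idea can be sketched in plain Python, without TensorFlow. The example below is an illustration under stated assumptions rather than the book's implementation: it assumes a stride of 1 and no padding, and the image, the kernel values, and the function name convolve_valid are all hypothetical.

```python
def convolve_valid(image, kernel, bias=0.0):
    """Slide a square kernel over a square image with stride 1 and no padding."""
    k = len(kernel)
    out_size = len(image) - k + 1
    output = []
    for r in range(out_size):
        row = []
        for c in range(out_size):
            # Receptive field: the k x k patch whose top-left corner is (r, c).
            total = bias
            for i in range(k):
                for j in range(k):
                    total += kernel[i][j] * image[r + i][c + j]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 5x5 input (1 = black, 0 assumed white).
image = [
    [0, 0, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
]

# Hypothetical 3x3 kernel that responds to vertical strokes: 9 weights.
kernel = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]

feature_map = convolve_valid(image, kernel, bias=0.0)
for row in feature_map:
    print(row)
```

Under these assumptions, a 5x5 input and a 3x3 kernel give a 3x3 output, and each output value depends only on the 9 pixels of its own receptive field plus the bias.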

Building your first CNN

In this section, we will learn how to build a CNN to classify images of the MNIST dataset. In the previous chapter, we saw that a simple softmax model provides about 92% classification accuracy for recognizing handwritten digits in the MNIST dataset.

Here we'll implement a CNN which has a classification accuracy of about 99%.
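For reference, the softmax function that gives the baseline model its name can be sketched as follows. This is a generic illustration in plain Python, not the previous chapter's code; the input scores are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, 1.0, 0.1]  # hypothetical raw scores for three classes
probs = softmax(scores)
predicted = probs.index(max(probs))  # index of the most probable class
print(probs, predicted)
```

The class with the highest score receives the highest probability, which is how a softmax layer turns the network's final outputs into a prediction.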

The following figure shows how the data flows through the first two convolutional layers. The input image is processed in the first convolutional layer using the filter weights. This results in 32 new images, one for each filter in the convolutional layer. The images are also downsampled with the pooling operation, so the image resolution is decreased from 28x28 to 14x14.
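The halving of the resolution can be illustrated with a small pooling sketch in plain Python. The text above does not say which pooling variant is used, so the code assumes 2x2 max pooling over non-overlapping blocks; the function name max_pool_2x2 and the input values are hypothetical.

```python
def max_pool_2x2(feature_map):
    """Downsample by taking the maximum of each non-overlapping 2x2 block."""
    height, width = len(feature_map), len(feature_map[0])
    return [
        [
            max(
                feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1],
            )
            for c in range(0, width - 1, 2)
        ]
        for r in range(0, height - 1, 2)
    ]


# Hypothetical 28x28 feature map filled with a simple ramp of values.
feature_map = [[r * 28 + c for c in range(28)] for r in range(28)]
pooled = max_pool_2x2(feature_map)
print(len(pooled), len(pooled[0]))
```

Under this assumption, each of the 32 images would be reduced independently in the same way, from 28x28 to 14x14.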

These 32 smaller images are then processed in the second convolutional layer. We need filter weights again for each of these 32 features, and we need filter weights for each output channel of this layer. The...
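To make the weight bookkeeping concrete, the following sketch counts the trainable parameters of a convolutional layer. Only the 32 input channels come from the text above; the 5x5 kernel size and the 64 output channels are assumptions chosen for illustration, and conv_layer_parameters is a hypothetical helper.

```python
def conv_layer_parameters(kernel_height, kernel_width, in_channels, out_channels):
    """Count trainable parameters: one kernel per (input, output) channel pair,
    plus one bias per output channel."""
    weights = kernel_height * kernel_width * in_channels * out_channels
    biases = out_channels
    return weights + biases


# 32 input channels is taken from the text.
# The 5x5 kernel and the 64 output channels are assumed values.
second_layer = conv_layer_parameters(5, 5, 32, 64)
print(second_layer)
```

The count grows with the product of input and output channels, which is why a second convolutional layer typically holds far more weights than the first.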

Emotion recognition with CNNs

One of the hardest problems to solve in deep learning has nothing to do with neural nets: it's the problem of getting the right data in the right format. A valuable aid for finding new problems and new datasets to study is the Kaggle platform (https://www.kaggle.com/).

The Kaggle platform was founded in 2010 as a platform for predictive modeling and analytics competitions, on which companies and researchers post their data, and statisticians and data miners from all over the world compete to produce the best models.

In this section, we show how to build a CNN for emotion detection from facial images. The training and test sets for this example can be downloaded from https://inclass.kaggle.com/c/facial-keypoints-detector/data. Please note that you can log in and download the data using Facebook, Google+ or Yahoo. Alternatively, you will have to create an account and...

Summary

In this chapter, we introduced Convolutional Neural Networks (CNNs).

We have seen how the architecture of these networks makes CNNs particularly suitable for image classification problems, making the training phase faster and the test phase more accurate.

We then implemented an image classifier and tested it on the MNIST dataset, where we achieved 99 percent accuracy.

Finally, we built a CNN to classify emotions starting from a dataset of images; we tested the network on a single image and evaluated the limitations and the quality of our model.

The next chapter describes autoencoders. These algorithms are useful for dimensionality reduction, classification, regression, collaborative filtering, feature learning, and topic modeling. We will carry out further data analysis using autoencoders and measure classification performance using image datasets.

...