You're reading from  Deep Learning with Hadoop

Product type: Book
Published in: Feb 2017
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781787124769
Edition: 1st Edition
Author: Dipayan Dev

Dipayan Dev completed his M.Tech at the National Institute of Technology, Silchar, with a first class first and currently works as a software professional in Bengaluru, India. He has extensive knowledge of and experience in non-relational database technologies, having primarily worked with large-scale data over the last few years. His core expertise lies in the Hadoop framework. During his postgraduate studies, Dipayan built an infinitely scalable framework for Hadoop, called Dr. Hadoop, which was published in a top-tier SCI-E indexed Springer journal (http://link.springer.com/article/10.1631/FITEE.1500015). Dr. Hadoop has recently been cited by Wikipedia in its Apache Hadoop article. Apart from that, he is interested in a wide range of distributed system technologies, such as Redis, Apache Spark, Elasticsearch, Hive, Pig, Riak, and other NoSQL databases. Dipayan has also authored various research papers and book chapters published by IEEE and top-tier Springer journals. To know more about him, you can visit his LinkedIn profile: https://www.linkedin.com/in/dipayandev.

Chapter 3.  Convolutional Neural Network

 

"The question of whether a computer can think is no more interesting than the question of whether a submarine can swim."

 
 --Edsger W. Dijkstra

Convolutional neural network (CNN): doesn't the name suggest an uncanny combination of mathematics and biology, with a negligible amount of computer science added? Nevertheless, these types of networks have been among the most dominant and powerful architectures in the field of computer vision. CNNs started to gain popularity after 2012, when classification accuracy improved dramatically, thanks to some pioneers in the field of deep learning. Ever since, many high-tech companies have been using deep CNNs for various services. Amazon uses CNNs for its product recommendations, Google uses them for its photo search, and Facebook primarily uses them for its automatic tagging algorithms.

CNN [89] is a type of feed-forward neural network comprised of neurons, which have learnable...

Understanding convolution


To understand the concept of convolution, let us take the example of determining the position of a lost mobile phone with the help of a laser sensor. Let's say the laser gives the location of the mobile phone at time t as f(t), and provides a separate reading for every value of t. Laser sensors are generally noisy, which is undesirable in this scenario. Therefore, to derive a less noisy measurement of the phone's location, we need to average several measurements. The more measurements we use, the more accurate the estimate, but older readings say less about where the phone is now. Hence, we should use a weighted average that gives more weight to the more recent measurements.

The weighting can be given by a function w(b), where b denotes the age of the measurement. To derive a new function that provides a better estimate of the location of the mobile phone, we need to take this weighted average at every moment.
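The weighted average described above can be sketched in a few lines of Java. This is only an illustrative sketch under stated assumptions: time is treated as discrete, the class and method names (WeightedAverageSketch, smooth) and the sample readings are hypothetical, and the renormalisation near the start of the signal is an added convenience rather than part of the chapter's definition.

```java
// Hypothetical sketch: discrete weighted average of noisy sensor readings.
class WeightedAverageSketch {

    // f[t] is the reading at time t; w[b] is the weight for a reading of age b
    // (w[0] weights the newest reading and is assumed to be positive).
    // Returns s[t] = sum over b of f[t - b] * w[b], divided by the sum of the
    // weights actually used, so early time steps with fewer past readings
    // still produce a proper average.
    static double[] smooth(double[] f, double[] w) {
        double[] s = new double[f.length];
        for (int t = 0; t < f.length; t++) {
            double weighted = 0.0;
            double weightSum = 0.0;
            // Only ages b that point at an existing reading (t - b >= 0) count.
            for (int b = 0; b < w.length && b <= t; b++) {
                weighted += f[t - b] * w[b];
                weightSum += w[b];
            }
            s[t] = weighted / weightSum;
        }
        return s;
    }

    public static void main(String[] args) {
        double[] readings = {10.0, 12.0, 8.0, 11.0, 9.0}; // made-up noisy positions
        double[] weights = {0.5, 0.3, 0.2};               // newer readings weigh more
        System.out.println(java.util.Arrays.toString(smooth(readings, weights)));
    }
}
```

Once at least as many readings as weights are available, and the weights sum to one, each output value is exactly the sum of past readings multiplied by their age-dependent weights, which is the sliding operation that convolution formalises.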

The new function...

Background of a CNN


CNNs, a particular form of deep learning model, are not a new concept; the vision community has used them for a long time. LeCun et al. showed in 1998 that the model worked well for recognizing handwritten digits [90]. Unfortunately, because CNNs at the time could not handle higher-resolution images, their popularity diminished over time. The reasons were mostly hardware and memory constraints, along with the lack of large-scale training datasets. As computational power increased, mostly due to the wide availability of CPUs and GPUs, and as big data produced various large-scale datasets, such as the MIT Places dataset (see Zhou et al., 2014) and ImageNet [91], it became possible to train larger and more complex models. This was first shown by Krizhevsky et al. [4] in their paper, ImageNet Classification with Deep Convolutional Neural Networks. In that paper, they brought down the error...

Basic layers of CNN


A CNN is composed of a sequence of layers, where every layer applies a differentiable function that transforms one volume of activations into another. Four main types of layers are used to build a CNN: the convolutional layer, the Rectified Linear Units (ReLU) layer, the pooling layer, and the fully-connected layer. These layers are stacked together to form a full CNN.

A regular CNN could have the following architecture:

[INPUT - CONV - RELU - POOL - FC]

However, in a deep CNN, there are generally more layers interspersed between these five basic layers.
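To make the CONV, RELU, and POOL stages above more concrete, here is a minimal Java sketch that applies them to a single 2D channel. It is an illustration under simplifying assumptions, not how a library such as Deeplearning4j implements these layers: there is one channel and one filter, no bias, no padding, a stride of 1 for the filter, and the fully-connected stage is omitted. The class and method names are hypothetical.

```java
// Hypothetical sketch of the CONV -> RELU -> POOL stages on one 2D channel.
class BasicLayersSketch {

    // CONV: slide the kernel over the input with stride 1 and no padding.
    // As in most CNN libraries, the kernel is applied without flipping.
    static double[][] convolve(double[][] in, double[][] k) {
        int outH = in.length - k.length + 1;
        int outW = in[0].length - k[0].length + 1;
        double[][] out = new double[outH][outW];
        for (int i = 0; i < outH; i++) {
            for (int j = 0; j < outW; j++) {
                for (int a = 0; a < k.length; a++) {
                    for (int b = 0; b < k[0].length; b++) {
                        out[i][j] += in[i + a][j + b] * k[a][b];
                    }
                }
            }
        }
        return out;
    }

    // RELU: replace every negative activation with zero.
    static double[][] relu(double[][] in) {
        double[][] out = new double[in.length][in[0].length];
        for (int i = 0; i < in.length; i++) {
            for (int j = 0; j < in[0].length; j++) {
                out[i][j] = Math.max(0.0, in[i][j]);
            }
        }
        return out;
    }

    // POOL: non-overlapping size x size max pooling. Trailing rows or
    // columns that do not fill a complete window are dropped.
    static double[][] maxPool(double[][] in, int size) {
        int outH = in.length / size;
        int outW = in[0].length / size;
        double[][] out = new double[outH][outW];
        for (int i = 0; i < outH; i++) {
            for (int j = 0; j < outW; j++) {
                double max = Double.NEGATIVE_INFINITY;
                for (int a = 0; a < size; a++) {
                    for (int b = 0; b < size; b++) {
                        max = Math.max(max, in[i * size + a][j * size + b]);
                    }
                }
                out[i][j] = max;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] image = {{1, 2, 0}, {0, 1, 3}, {4, 0, 1}}; // made-up 3x3 input
        double[][] kernel = {{1, -1}, {-1, 1}};               // made-up 2x2 filter
        double[][] pooled = maxPool(relu(convolve(image, kernel)), 2);
        System.out.println(java.util.Arrays.deepToString(pooled));
    }
}
```

Each method takes one activation volume and returns another, which mirrors the idea that a CNN is a chain of transformations stacked one after the other.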

A classic deep neural network will have the following structure:

Input -> Conv -> ReLU -> Conv -> ReLU -> Pooling -> ReLU -> Conv -> ReLU -> Pooling -> Fully Connected

AlexNet, as mentioned in the earlier section, is a good example of this kind of structure. The architecture of AlexNet is shown in Figure 3.4. An implicit ReLU non-linearity is added after every layer. We will explain...

Distributed deep CNN


This section of the chapter introduces some extremely aggressive deep CNN architectures, the challenges associated with these networks, and the need for much larger distributed computing to overcome them. It also explains how Hadoop and its YARN can provide a sufficient solution to this problem.

Most popular aggressive deep neural networks and their configurations

CNNs have shown stunning results in image recognition in recent years. Unfortunately, they are extremely expensive to train. In a sequential training process, the convolution operation takes around 95% of the total running time. With big datasets, training takes many days to complete, even with low-scale distributed training. AlexNet, the CNN that won the ImageNet challenge in 2012, took nearly an entire week to train on two GTX 580 3 GB GPUs. The following table displays a few of the most popular distributed deep CNNs with their configuration and corresponding time taken...

Convolutional layer using Deeplearning4j


This section of the chapter provides the basic idea of how to write the code for a CNN using Deeplearning4j. You'll learn the syntax for setting the various hyperparameters mentioned in this chapter.

Implementing a CNN using Deeplearning4j can be split into three core phases: loading or preparing the data, configuring the network, and training and evaluating the model.

Loading data

For CNNs, generally, we only work on the image data to train the model. In Deeplearning4j, the images can be read using ImageRecordReader. The following code snippet shows how to load 16×16 color images for the model:

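// Create a reader for 16x16 images.
RecordReader imageReader = new ImageRecordReader(16, 16, false);
// Point the reader at the "image_location" directory under the user's home folder.
imageReader.initialize(new FileSplit(
    new File(System.getProperty("user.home"), "image_location")));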

After that, using CSVRecordReader, we can load all the image labels from the input CSV files, as follows:

int numLinesToSkip = 0;
String delimiter = ",";
RecordReader...

Summary


CNNs, although not a new concept, have gained immense popularity over the last half decade. The network primarily finds its application in the field of vision. The last few years have seen major research on CNNs by technology companies such as Google, Microsoft, Apple, and the like, and also by various eminent researchers. This chapter began with the concept of convolution, which is the backbone of this type of network. It then introduced the various layers of the network and provided in-depth explanations for every associated layer of the deep CNN. After that, the various hyperparameters and their relations with the network were explained, both theoretically and mathematically. Later, the chapter discussed how to distribute the deep CNN across various machines with the help of Hadoop and its YARN. The last part discussed how to implement this network using Deeplearning4j for every worker working...
