You're reading from Deep Learning with Hadoop

Product type: Book

Published in: Feb 2017

Reading level: Intermediate

Publisher: Packt

ISBN-13: 9781787124769

Edition: 1st Edition

Languages: Java

Tools: Deeplearning4j, Hadoop

Concepts: Deep Learning

Author: Dipayan Dev

Chapter 3. Convolutional Neural Network

	"The question of whether a computer can think is no more interesting than the question of whether a submarine can swim."
	--Edsger W. Dijkstra

Convolutional neural network (CNN)--doesn't the name give an uncanny feeling of mathematics and biology combined, with a negligible amount of computer science added? Yet these types of networks have been some of the most dominant and powerful architectures in the field of computer vision. CNNs started to gain popularity after 2012, when classification precision improved dramatically thanks to pioneers in the field of deep learning. Ever since, a number of high-tech companies have been using deep CNNs for various services. Amazon uses CNNs for its product recommendations, Google uses them for its photo search, and Facebook primarily uses them for its automatic tagging algorithms.

CNN [89] is a type of feed-forward neural network comprised of neurons, which have learnable...

Understanding convolution

To understand the concept of convolution, let us take the example of determining the position of a lost mobile phone with the help of a laser sensor. Let's say the location of the mobile phone at time t is given by the laser as f(t). The laser gives a different reading of the location for every value of t. Laser sensors are generally noisy, which is undesirable for this scenario. Therefore, to derive a less noisy measurement of the location of the phone, we need to average several measurements. Ideally, the more recent a measurement, the more relevant it is to the current location. Hence, we should use a weighted average that gives more weight to recent measurements.

The weighting can be given by a function w(b), where b denotes the age of the measurement. To derive a new function that provides a better estimate of the location of the mobile phone, we need to apply this weighted average at every moment.

The new function...
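The weighted averaging described above can be sketched in plain Java. This is only an illustration under stated assumptions: the class name, the sample readings, and the weights are made up, and the discrete form s[t] = sum over b of f[t - b] * w[b] is one common way to write the idea, not necessarily the exact function the chapter goes on to define. Ages that would reach before the first reading are skipped and the remaining weights are renormalised.

```java
import java.util.Arrays;

/** Hypothetical sketch: smooth noisy position readings with a discrete weighted average. */
class WeightedAverageSketch {

    /**
     * Computes s[t] = sum over b of f[t - b] * w[b], where b is the age of a reading.
     * Ages that would reach before the first reading are skipped, and the weights
     * actually used are renormalised so that each output remains an average.
     * Assumes all weights are positive.
     */
    static double[] smooth(double[] f, double[] w) {
        double[] s = new double[f.length];
        for (int t = 0; t < f.length; t++) {
            double weighted = 0.0;
            double totalWeight = 0.0;
            for (int b = 0; b < w.length && b <= t; b++) {
                weighted += f[t - b] * w[b];
                totalWeight += w[b];
            }
            s[t] = weighted / totalWeight;
        }
        return s;
    }

    public static void main(String[] args) {
        double[] readings = {10.0, 12.0, 9.0, 11.0, 10.0}; // made-up noisy positions f(t)
        double[] weights = {0.5, 0.3, 0.2};                // w(b): the newest reading weighs most
        System.out.println(Arrays.toString(smooth(readings, weights)));
    }
}
```

Because w(0) is the largest weight, each smoothed value follows the newest reading most closely while older readings damp the noise.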

Background of a CNN

CNNs, a particular form of deep learning model, are not a new concept, and they have been widely adopted by the vision community for a long time. The model worked well in recognizing handwritten digits, as shown by LeCun et al in 1998 [90]. Unfortunately, because CNNs could not work with higher resolution images, their popularity diminished over time. This was mostly due to hardware and memory constraints, and also to the lack of large-scale training datasets. As computational power increased over time, mostly due to the wide availability of CPUs and GPUs, and as the generation of big data produced various large-scale datasets such as the MIT Places dataset (see Zhou et al., 2014) and ImageNet [91], it became possible to train larger and more complex models. This was first shown by Krizhevsky et al [4] in their paper, ImageNet classification with deep convolutional neural networks. In that paper, they brought down the error...

Basic layers of CNN

A CNN is composed of a sequence of layers, where every layer transforms one volume of activations into another through a differentiable function. Four main types of layers are used to build a CNN: the convolutional layer, the rectified linear units (ReLU) layer, the pooling layer, and the fully connected layer. All these layers are stacked together to form a full CNN.

A regular CNN could have the following architecture:

[INPUT - CONV - RELU - POOL - FC]

However, in a deep CNN, there are generally more layers interspersed between these five basic layers.
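To make the INPUT - CONV - RELU - POOL - FC sequence concrete, here is a minimal plain-Java sketch of one forward pass on a tiny single-channel image. Everything in it is an assumption made for illustration: the class and method names, the 5x5 input, the 2x2 filter, and the fully connected weights are hypothetical, and the convolution is written without flipping the filter, as many deep learning libraries do. It is not how Deeplearning4j implements these layers.

```java
/** Hypothetical plain-Java sketch of an INPUT - CONV - RELU - POOL - FC forward pass. */
class TinyCnnForwardSketch {

    /** CONV: slides the filter over the input with stride 1 and no padding. */
    static double[][] conv(double[][] in, double[][] filter) {
        int outH = in.length - filter.length + 1;
        int outW = in[0].length - filter[0].length + 1;
        double[][] out = new double[outH][outW];
        for (int i = 0; i < outH; i++) {
            for (int j = 0; j < outW; j++) {
                for (int u = 0; u < filter.length; u++) {
                    for (int v = 0; v < filter[0].length; v++) {
                        out[i][j] += in[i + u][j + v] * filter[u][v];
                    }
                }
            }
        }
        return out;
    }

    /** RELU: replaces every negative activation with zero. */
    static double[][] relu(double[][] in) {
        double[][] out = new double[in.length][in[0].length];
        for (int i = 0; i < in.length; i++) {
            for (int j = 0; j < in[0].length; j++) {
                out[i][j] = Math.max(0.0, in[i][j]);
            }
        }
        return out;
    }

    /** POOL: 2x2 max pooling with stride 2; assumes even height and width. */
    static double[][] maxPool2x2(double[][] in) {
        double[][] out = new double[in.length / 2][in[0].length / 2];
        for (int i = 0; i < out.length; i++) {
            for (int j = 0; j < out[0].length; j++) {
                double top = Math.max(in[2 * i][2 * j], in[2 * i][2 * j + 1]);
                double bottom = Math.max(in[2 * i + 1][2 * j], in[2 * i + 1][2 * j + 1]);
                out[i][j] = Math.max(top, bottom);
            }
        }
        return out;
    }

    /** FC: flattens the volume row by row and returns one weighted sum plus bias. */
    static double fullyConnected(double[][] in, double[] weights, double bias) {
        double sum = bias;
        int n = 0;
        for (double[] row : in) {
            for (double value : row) {
                sum += value * weights[n++];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        double[][] input = {            // made-up 5x5 single-channel image
            {1, 0, 2, 1, 0},
            {0, 1, 0, 2, 1},
            {2, 0, 1, 0, 2},
            {1, 2, 0, 1, 0},
            {0, 1, 2, 0, 1}
        };
        double[][] filter = {{1, -1}, {-1, 1}};       // made-up 2x2 filter
        double[] fcWeights = {0.5, -1.0, 1.0, 0.25};  // made-up FC weights
        double[][] pooled = maxPool2x2(relu(conv(input, filter)));
        System.out.println(fullyConnected(pooled, fcWeights, 0.1));
    }
}
```

Each stage shrinks or reshapes the volume: the 5x5 input becomes a 4x4 feature map after the 2x2 filter, a 2x2 map after pooling, and a single score after the fully connected step.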

A classic deep CNN will have the following structure:

Input -> Conv -> ReLU -> Conv -> ReLU -> Pooling -> ReLU -> Conv -> ReLU -> Pooling -> Fully Connected

AlexNet, as mentioned in the earlier section, can be taken as a perfect example for this kind of structure. The architecture of AlexNet is shown in Figure 3.4. After every layer, an implicit ReLU non-linearity has been added. We will explain...

Distributed deep CNN

This section of the chapter introduces some extremely aggressive deep CNN architectures, the challenges associated with these networks, and the need for much larger distributed computing to overcome them. It also explains how Hadoop and its YARN can provide a sufficient solution to this problem.

Most popular aggressive deep neural networks and their configurations

CNNs have shown stunning results in image recognition in recent years. Unfortunately, they are extremely expensive to train. In a sequential training process, the convolution operation takes around 95% of the total running time. With big datasets, even with low-scale distributed training, the training process takes many days to complete. AlexNet, the CNN that won the ImageNet competition in 2012, took nearly an entire week to train on two GTX 580 3 GB GPUs. The following table lists a few of the most popular distributed deep CNNs with their configuration and corresponding time taken...

Convolutional layer using Deeplearning4j

This section of the chapter gives a basic idea of how to write the code for a CNN using Deeplearning4j. You'll learn the syntax for using the various hyperparameters mentioned in this chapter.

To implement a CNN using Deeplearning4j, the work can be split into three core phases: loading or preparing the data, configuring the network, and training and evaluating the model.

Loading data

For CNNs, generally, we only work on the image data to train the model. In Deeplearning4j, the images can be read using ImageRecordReader. The following code snippet shows how to load 16×16 color images for the model:

RecordReader imageReader = new ImageRecordReader(16, 16, false);
// Point the reader at the "image_location" directory under the user's home folder
imageReader.initialize(new FileSplit(
    new File(System.getProperty("user.home"), "image_location")));

After that, using CSVRecordReader, we can load all the image labels from the input CSV files, as follows:

int numLinesToSkip = 0;
String delimiter = ",";
RecordReader...
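To show what the two settings above control, here is a plain-Java stand-in that skips a number of leading lines and splits each remaining line on a delimiter. It is a hypothetical sketch for illustration only: the class name, method name, and sample file names are made up, and it is not Deeplearning4j's CSVRecordReader or its implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical stand-in for a CSV label reader; not Deeplearning4j code. */
class CsvLabelSketch {

    /**
     * Skips the first numLinesToSkip lines (for example, a header row), ignores
     * blank lines, and splits every remaining line on the literal delimiter.
     */
    static List<String[]> readRecords(List<String> lines, int numLinesToSkip, String delimiter) {
        List<String[]> records = new ArrayList<>();
        String literalDelimiter = Pattern.quote(delimiter); // treat the delimiter as plain text, not a regex
        for (int i = numLinesToSkip; i < lines.size(); i++) {
            String line = lines.get(i).trim();
            if (line.isEmpty()) {
                continue;
            }
            records.add(line.split(literalDelimiter));
        }
        return records;
    }

    public static void main(String[] args) {
        // Made-up label file content: one header row, then "file name,label" pairs.
        List<String> lines = Arrays.asList("image,label", "img_001.png,3", "img_002.png,7");
        for (String[] record : readRecords(lines, 1, ",")) {
            System.out.println(record[0] + " -> " + record[1]);
        }
    }
}
```

With numLinesToSkip set to 0, as in the snippet above, the first line of the file is treated as data, so that value suits label files that have no header row.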

Summary

CNNs, although not a new concept, have gained immense popularity in the last half decade. The network primarily finds its application in the field of vision. The last few years have seen major research on CNNs by various technology companies such as Google, Microsoft, and Apple, as well as by various eminent researchers. Starting from the beginning, this chapter covered the concept of convolution, which is the backbone of this type of network. It then introduced the various layers of the network and provided in-depth explanations of every associated layer of the deep CNN. After that, the various hyperparameters and their relations with the network were explained, both theoretically and mathematically. Later, the chapter discussed how to distribute the deep CNN across various machines with the help of Hadoop and its YARN. The last part discussed how to implement this network using Deeplearning4j for every worker working...


Author (1)

Dipayan Dev

Dipayan Dev completed his M.Tech at the National Institute of Technology, Silchar, with a first class first, and is currently working as a software professional in Bengaluru, India. He has extensive knowledge and experience in non-relational database technologies, having primarily worked with large-scale data over the last few years. His core expertise lies in the Hadoop framework. During his postgraduation, Dipayan built an infinitely scalable framework for Hadoop, called Dr. Hadoop, which was published in a top-tier SCI-E indexed Springer journal (http://link.springer.com/article/10.1631/FITEE.1500015). Dr. Hadoop has recently been cited by Wikipedia in its Apache Hadoop article. Apart from that, he is interested in a wide range of distributed system technologies, such as Redis, Apache Spark, Elasticsearch, Hive, Pig, Riak, and other NoSQL databases. Dipayan has also authored various research papers and book chapters, which are published by IEEE and top-tier Springer journals. To know more about him, you can also visit his LinkedIn profile https://www.linkedin.com/in/dipayandev.