Big Data Analytics with Java

Product type Book
Published in Jul 2017
Publisher Packt
ISBN-13 9781787288980
Pages 418 pages
Edition 1st Edition
Author (1): Rajat Mehta

Table of Contents (21) Chapters

Big Data Analytics with Java
Credits
About the Author
About the Reviewers
www.PacktPub.com
Customer Feedback
Preface
Big Data Analytics with Java
First Steps in Data Analysis
Data Visualization
Basics of Machine Learning
Regression on Big Data
Naive Bayes and Sentiment Analysis
Decision Trees
Ensembling on Big Data
Recommendation Systems
Clustering and Customer Segmentation on Big Data
Massive Graphs on Big Data
Real-Time Analytics on Big Data
Deep Learning Using Big Data
Index

Chapter 13. Deep Learning Using Big Data

In recent years, few areas of computer science research have gained as much traction as deep learning. If you pick up any of the latest research papers, you will see that a lot of them are in the field of deep learning. Deep learning is a form of machine learning that sat idle for quite some time, until computation across multiple parallel computers became more advanced. The technology behind a self-driving car, or behind an ATM that recognizes a handwritten check, relies on deep learning. So, what exactly is deep learning? It is a form of machine learning that roughly mimics the working of the human brain using neural networks, and we will cover its basic details in this chapter. Deep learning is a vast and growing field, so this chapter should serve as a bare minimum for anyone looking for a basic introduction to the topic. More advanced...

Introduction to neural networks


The human brain has billions of neurons that talk and transfer signals to each other. When one neuron calculates a signal and transfers it to another neuron, the second neuron, which is connected to the first, takes that signal as its input and acts on it. In this way, the initial input passes through various neurons, being altered at each level, until a final deduction can be made. You can think of the brain as a graph of interconnected neurons that send signals, or inputs, to each other.

Let's see how a typical neuron in the human brain looks:

This picture of a human neuron has been taken from Wikipedia and can be seen at this link: https://en.wikipedia.org/wiki/Dendrite. The neuron has some important components, as seen in the labels in the image. They are explained as follows:

  • Dendrite: These are hair-like parts. They connect the neuron to other neurons, and they are used in taking input from other neurons.

  • Cell Body: This is the place...

Perceptron


A perceptron is a type of artificial neuron that is expressed mathematically and implemented programmatically. It takes in many inputs, applies a weight to each one based on its importance, adds a bias, and sums everything up to produce a result. This result is then passed to a function, such as the logistic function used in logistic regression. We call this function the activation function, and it is used to predict the final outcome.

The perceptron is depicted as follows:

As you can see in the previous image, a perceptron depicts an artificial neuron that takes in various inputs in binary form and multiplies each of them by a weight, w. The weight is calculated based on the importance of the input. A bias value is also added to the weighted inputs. The entire combination is then summed up by the perceptron. Finally, the summed-up output is tested against a threshold value, and we call this test the activation function. If the value is above a threshold...
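As a minimal sketch of the computation described above, the following Java snippet sums the weighted inputs and the bias, then applies a step activation. The class name, the weights, the bias, and the threshold of zero are illustrative assumptions, chosen here so that the perceptron behaves like a logical AND; they are not taken from the book's code.

```java
class PerceptronSketch {

    // Weighted sum of the inputs plus the bias, followed by a step activation:
    // returns 1 when the sum is above the threshold, otherwise 0.
    static int predict(double[] inputs, double[] weights, double bias, double threshold) {
        double sum = bias;
        for (int i = 0; i < inputs.length; i++) {
            sum += inputs[i] * weights[i];
        }
        return sum > threshold ? 1 : 0;
    }

    public static void main(String[] args) {
        // Hypothetical hand-picked parameters that make the perceptron act as AND.
        double[] weights = {1.0, 1.0};
        double bias = -1.5;
        double threshold = 0.0;

        int[][] cases = {{0, 0}, {0, 1}, {1, 0}, {1, 1}};
        for (int[] c : cases) {
            double[] inputs = {c[0], c[1]};
            int result = predict(inputs, weights, bias, threshold);
            System.out.println(c[0] + " AND " + c[1] + " = " + result);
        }
    }
}
```

With these values, only the input pair (1, 1) produces a sum above the threshold, so the perceptron outputs 1 for that case and 0 for the other three. In practice the weights and bias would be learned from data rather than set by hand.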

Deep learning


In the last section, we saw how a number of perceptrons can be stacked together in multiple layers to form a learning network. We saw an example of a feed-forward network with just one hidden layer. Instead of a single hidden layer, we can have multiple hidden layers stacked one after the other, which can further improve the accuracy of the artificial neural network. When an artificial neural network has multiple hidden layers (that is, more than one), the approach is called deep learning, as the network is deep.
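To make the idea of stacked hidden layers concrete, here is a small sketch of a forward pass through a feed-forward network with two hidden layers. The sigmoid activation, the layer sizes (2 inputs, then 3, 2, and 1 neurons), and all weight and bias values are assumptions made up for illustration; a real network would learn them during training.

```java
class DeepForwardPass {

    // Sigmoid activation: squashes any real value into the range (0, 1).
    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // One fully connected layer. weights[j] holds the incoming weights of neuron j.
    static double[] layer(double[] in, double[][] weights, double[] bias) {
        double[] out = new double[weights.length];
        for (int j = 0; j < weights.length; j++) {
            double z = bias[j];
            for (int i = 0; i < in.length; i++) {
                z += in[i] * weights[j][i];
            }
            out[j] = sigmoid(z);
        }
        return out;
    }

    // Feeds the input through every layer in order; the output of one layer
    // becomes the input of the next.
    static double[] forward(double[] input, double[][][] weights, double[][] biases) {
        double[] activation = input;
        for (int l = 0; l < weights.length; l++) {
            activation = layer(activation, weights[l], biases[l]);
        }
        return activation;
    }

    public static void main(String[] args) {
        // Hypothetical parameters for a 2 -> 3 -> 2 -> 1 network.
        double[][][] weights = {
            {{0.5, -0.4}, {0.3, 0.8}, {-0.6, 0.1}}, // hidden layer 1
            {{0.2, -0.5, 0.7}, {-0.3, 0.4, 0.9}},   // hidden layer 2
            {{1.0, -1.0}}                           // output layer
        };
        double[][] biases = {{0.1, 0.0, -0.1}, {0.0, 0.2}, {0.05}};

        double[] output = forward(new double[]{1.0, 0.0}, weights, biases);
        System.out.println("Network output: " + output[0]);
    }
}
```

Adding another hidden layer only means adding one more entry to the weight and bias arrays; the forward loop itself does not change.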

Deep learning is currently one of the most widely studied research topics, and it is used in practice in many real-world applications.

Let's now see some of the advantages and real-world use cases of deep learning.

Advantages and use cases of deep learning

There are two main advantages of deep learning:

  1. No feature engineering required: In traditional machine learning, feature engineering is of the utmost importance if you want your models to work well. There are...

Flower species classification using multi-layer perceptrons


This is a simple hello world-style program for performing classification using multi-layer perceptrons. For this, we will be using the famous Iris dataset, which can be downloaded from the UCI Machine Learning Repository at https://archive.ics.uci.edu/ml/datasets/Iris. This dataset has four feature attributes and a class label, shown as follows:

Attribute name    Attribute description
Petal Length      Petal length in cm
Petal Width       Petal width in cm
Sepal Length      Sepal length in cm
Sepal Width       Sepal width in cm
Class             The type of iris flower, that is, Iris Setosa, Iris Versicolour, Iris Virginica

This is a simple dataset with three types of Iris classes, as mentioned in the table.
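Before any network can be trained, each row of the dataset has to be turned into a numeric feature array and a class index. The sketch below shows one way to do that in plain Java. It assumes the comma-separated layout of the UCI `iris.data` file (sepal length, sepal width, petal length, petal width, class name) and the class spellings used in that file; the `IrisRow` class itself is a hypothetical helper, not part of the book's code or of Spark.

```java
import java.util.Arrays;
import java.util.List;

class IrisRow {

    // Class names as spelled in the UCI file (assumed); the index is the numeric label.
    static final List<String> CLASSES =
            Arrays.asList("Iris-setosa", "Iris-versicolor", "Iris-virginica");

    final double[] features; // four measurements in cm
    final int label;         // 0, 1, or 2

    private IrisRow(double[] features, int label) {
        this.features = features;
        this.label = label;
    }

    // Parses one line such as "5.1,3.5,1.4,0.2,Iris-setosa".
    static IrisRow parse(String line) {
        String[] parts = line.trim().split(",");
        if (parts.length != 5) {
            throw new IllegalArgumentException("Expected 5 fields but got " + parts.length);
        }
        double[] features = new double[4];
        for (int i = 0; i < 4; i++) {
            features[i] = Double.parseDouble(parts[i]);
        }
        int label = CLASSES.indexOf(parts[4]);
        if (label < 0) {
            throw new IllegalArgumentException("Unknown class: " + parts[4]);
        }
        return new IrisRow(features, label);
    }

    public static void main(String[] args) {
        IrisRow row = IrisRow.parse("5.1,3.5,1.4,0.2,Iris-setosa");
        System.out.println(Arrays.toString(row.features) + " -> label " + row.label);
    }
}
```

Four features and three classes translate directly into a network with four input neurons and three output neurons, with the hidden layers in between left as a design choice.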

For our neural network of perceptrons, we will be using the multilayer perceptron algorithm bundled in the Spark ML library, and we will demonstrate how you can combine it with the Spark-provided pipeline API for the easy manipulation of the machine...

Deeplearning4j


This is a Java library that is used to build different types of neural networks. It can be easily integrated with Apache Spark on the big data stack and can even run on GPUs. It is currently the main Java library with a large set of built-in algorithms focused on deep learning. It also has a very good online community and good documentation, which can be found on its website at https://deeplearning4j.org.

There are many submodules within this Java library, and we need some of them to run our machine learning algorithms. We will not cover the Deeplearning4j API in this book; for more detail and runnable samples, please refer to the documentation at https://deeplearning4j.org.

To give the reader a sense of what can be accomplished with deep learning, we will end the chapter with another simple case study of handwritten digit...

Handwritten digit recognition using CNN


This is one of the classic "Hello World" type problems in the field of deep learning. We already covered one very simple case study of flower classification earlier, and in this one we are going to classify handwritten digits. For this case study we are using the MNIST dataset. The MNIST database of handwritten digits is available at http://yann.lecun.com/exdb/mnist/. It has a training set of 60,000 examples and a test set of 10,000 examples. Some of the sample images in this dataset are as shown:

In this typical hello world exercise, we train our network on the training set and then classify the images in the test set. For this we will use a convolutional neural network (CNN).

A convolutional neural network is a special type of feed-forward neural network and is especially suited for image classification. Explaining the entire concept of a convolutional network is beyond the scope of this chapter but we will explain it briefly...
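The operation that gives a convolutional network its name can be sketched in a few lines of plain Java: a small kernel slides over the image, and at each position the overlapping values are multiplied and summed. This is only an illustration of that single step, assuming no padding and a stride of one; the class name, the tiny 4x4 image, and the edge-detecting kernel are made up, and the code is not how Deeplearning4j implements its layers.

```java
import java.util.Arrays;

class Convolution2D {

    // Slides the kernel over the image (no padding, stride 1) and returns
    // the resulting feature map.
    static double[][] convolve(double[][] image, double[][] kernel) {
        int kernelHeight = kernel.length;
        int kernelWidth = kernel[0].length;
        int outHeight = image.length - kernelHeight + 1;
        int outWidth = image[0].length - kernelWidth + 1;

        double[][] out = new double[outHeight][outWidth];
        for (int r = 0; r < outHeight; r++) {
            for (int c = 0; c < outWidth; c++) {
                double sum = 0.0;
                for (int i = 0; i < kernelHeight; i++) {
                    for (int j = 0; j < kernelWidth; j++) {
                        sum += image[r + i][c + j] * kernel[i][j];
                    }
                }
                out[r][c] = sum;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical 4x4 image: dark on the left, bright on the right.
        double[][] image = {
            {0, 0, 1, 1},
            {0, 0, 1, 1},
            {0, 0, 1, 1},
            {0, 0, 1, 1}
        };
        // Hypothetical 2x2 kernel that responds to a dark-to-bright vertical edge.
        double[][] kernel = {
            {-1, 1},
            {-1, 1}
        };
        for (double[] row : convolve(image, kernel)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

The 3x3 feature map is non-zero only in its middle column, which is where the kernel straddles the edge in the image. In a real CNN the kernel values are learned, and many kernels are applied in parallel to detect different patterns.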

Summary


This chapter gave a brief introduction to the field of deep learning for developers. We started with how an artificial neural network mimics the working of our own nervous system. We showed the basic unit of this artificial neural network, the perceptron. We showed how perceptrons can be used to depict logical functions, and then moved on to show their pitfalls. Next, we learned how the perceptron can be enhanced by making modifications to it, leading us to the artificial neuron known as the sigmoid neuron. We also covered a sample case study on classifying Iris flower species based on the features that were used to train our neural network. We mentioned how the Java library Deeplearning4j includes many deep learning algorithms that can be integrated with Apache Spark on the Java big data stack. Finally, we provided readers with information on where they can find free resources to learn more.
