Chapter 5. Restricted Boltzmann Machines

 

"What I cannot create, I do not understand."

 
 --Richard Feynman

So far in this book, we have discussed only discriminative models. In deep learning, these are used to model the dependency of an unobserved variable y on an observed variable x; mathematically, this is formulated as P(y|x). In this chapter, we will discuss the deep generative models used in deep learning.

Generative models are models that, given some hidden parameters, can randomly generate observable data values. Such a model works on a joint probability distribution over labels and observations.

Generative models are used in machine learning and deep learning either as an intermediate step to derive a conditional probability density function, or to model observations directly from a probability density function.
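To make the link between the two views concrete, a generative model of the joint distribution recovers the discriminative conditional by simple normalization; this is a standard identity, not specific to any one model:

P(y|x) = P(x, y) / P(x) = P(x, y) / Σ_y' P(x, y')

In other words, anything a discriminative model can answer, a generative model of P(x, y) can answer too, at the cost of modeling the full joint distribution.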

Restricted Boltzmann machines (RBMs) are a popular generative model that will be discussed in this chapter. RBMs are basically probabilistic...

Energy-based models


The main goal of deep learning and statistical modeling is to encode the dependencies between variables. Once a model captures those dependencies, it can use the values of the known variables to answer questions about the unknown variables.

Energy-based models (EBMs) [120] capture the dependencies by associating a scalar energy, which is generally a measure of compatibility, with each configuration of the variables. In EBMs, predictions are made by fixing the values of the observed variables and finding values of the unobserved variables that minimize the overall energy. Learning in EBMs consists of shaping an energy function that assigns low energies to correct values of the unobserved variables and higher energies to incorrect ones. Energy-based learning can be treated as an alternative to probabilistic estimation for classification, decision-making, or prediction tasks.
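In symbols, following the standard formulation of energy-based learning (the notation here is illustrative), inference picks the answer with the lowest energy, and an EBM can be turned into a probabilistic model through the Gibbs distribution:

y* = argmin_y E(x, y)

P(y|x) = exp(-βE(x, y)) / Σ_y' exp(-βE(x, y'))

where β is a positive constant (an inverse temperature) and the sum runs over all candidate values of y.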

To give a clear idea about how EBMs work, let us look at a simple example...

Boltzmann machines


Boltzmann machines [122] are networks of symmetrically connected, neuron-like units that make stochastic decisions about the given data. They were originally introduced to learn probability distributions over binary vectors. Boltzmann machines possess a simple learning algorithm that helps them infer interesting conclusions about input datasets of binary vectors. The learning algorithm becomes very slow in networks with many layers of feature detectors; however, learning one layer of feature detectors at a time can be much faster.

To solve a learning problem, a Boltzmann machine is given a set of binary data vectors and must update the weights on its connections so that those data vectors become good solutions to the optimization problem defined by the weights. The Boltzmann machine solves the learning problem by making lots of small updates to these weights.

The Boltzmann machine over a d-dimensional binary vector can...
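For reference, the standard formulation (the notation below is a common convention, not necessarily the book's) defines the model over a binary vector x in {0, 1}^d through an energy function:

P(x) = exp(-E(x)) / Z

E(x) = - Σ_{i<j} U_ij x_i x_j - Σ_i b_i x_i

Z = Σ_x' exp(-E(x'))

where U is the symmetric matrix of connection weights, b is the vector of biases, and Z, the partition function, normalizes the distribution.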

Restricted Boltzmann machine


The Restricted Boltzmann machine (RBM) is a classic example of a building block for the deep probabilistic models used in deep learning. The RBM itself is not a deep model, but it can be used as a building block to form other deep models. RBMs are undirected probabilistic graphical models that consist of one layer of observed variables and a single layer of hidden variables, which can be used to learn a representation of the input. In this section, we will explain how the RBM can be used to build many deeper models.
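Concretely, the "restriction" is that there are no visible-visible or hidden-hidden connections, only visible-hidden ones. In the standard formulation, with visible units v, hidden units h, weight matrix W, and biases b and c:

E(v, h) = - Σ_i b_i v_i - Σ_j c_j h_j - Σ_{i,j} v_i W_ij h_j

P(v, h) = exp(-E(v, h)) / Z

Because of this bipartite structure, the conditionals factorize, so each hidden unit can be sampled independently given the visible layer:

P(h_j = 1 | v) = σ(c_j + Σ_i v_i W_ij)

where σ is the logistic sigmoid function.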

Let us consider two examples to see the use cases of the RBM. An RBM primarily performs a binary version of factor analysis. Let us say we have a restaurant, and we want to ask our customers to rate the food on a scale of 0 to 5. In the traditional approach, we would try to explain each food item and each customer in terms of a set of hidden factors. For example, foods such as pasta and lasagne will have a strong association with the Italian factors...

Convolutional Restricted Boltzmann machines


Very high-dimensional inputs, such as images or videos, put immense stress on the memory, computation, and operational requirements of traditional machine learning models. In Chapter 3, Convolutional Neural Network, we showed how replacing matrix multiplication with discrete convolution operations using small kernels resolves these problems. Desjardins and Bengio [123] showed that this approach also works well when applied to RBMs. In this section, we will discuss the functionalities of this model.

Figure 5.7: The observed variables, or visible units, of an RBM can be associated with mini-batches of an image to compute the final result. The weight connections represent a set of filters.

Further, in ordinary RBMs, each visible unit is directly related to all the hidden variables through its own parameters and weights. Describing an image in terms of spatially local features ideally needs far fewer parameters...
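To see the scale of the saving, compare the parameter counts (the figures below are illustrative): fully connecting an n_v-pixel image to n_h hidden units requires n_v × n_h weights, whereas K convolution filters of size r × r require only K × r × r weights. For a 200 × 200 image and 100 hidden units, that is 4,000,000 weights in the fully connected case, against just 10,000 weights for 100 filters of size 10 × 10, with each filter additionally being reused at every image location.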

Deep Belief networks


Deep Belief networks (DBNs) were among the most popular non-convolutional models that could be successfully deployed as deep neural networks in 2006-07 [124] [125]. The renaissance of deep learning arguably started with the invention of DBNs back in 2006. Before the introduction of DBNs, it was very difficult to optimize deep models. By outperforming support vector machines (SVMs), DBNs showed that deep models can be really successful. Compared to other generative or unsupervised learning algorithms, the popularity of DBNs has since fallen, and they are rarely used these days; however, they still play a very important role in the history of deep learning.

Note

A DBN with only one hidden layer is just an RBM.

DBNs are generative models composed of more than one layer of hidden variables. The hidden variables are generally binary in nature; however, the visible units may take binary or real values. In DBNs, every unit of each layer is...
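For orientation, in the standard formulation a DBN with two hidden layers h1 and h2 factors its joint distribution as:

P(v, h1, h2) = P(v | h1) P(h1, h2)

where the top two layers, P(h1, h2), form an RBM, and P(v | h1) is a directed, sigmoid belief network layer. This is also why a DBN with a single hidden layer reduces to a plain RBM, as the preceding note states.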

Distributed Deep Belief network


DBNs have so far achieved a lot in numerous applications, such as speech and phone recognition [127], information retrieval [128], human motion modelling [129], and so on. However, the sequential implementations of both RBMs and DBNs come with various limitations. With large-scale datasets, the models show various shortcomings due to the long, time-consuming computation involved, the memory-demanding nature of the algorithms, and so on. To work with big data, RBMs and DBNs require distributed computing to provide scalable, consistent, and efficient learning.

To make DBNs amenable to large-scale datasets stored on a cluster of computers, DBNs should adopt a distributed learning approach with Hadoop and MapReduce. The paper [130] presents a key-value pair approach for each layer of an RBM, where the pre-training is accomplished layer-wise in a distributed MapReduce environment. The learning is performed on Hadoop...
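As a rough illustration of the key-value pattern (a minimal sketch of gradient averaging, not the exact scheme of [130]; the class names and the input format here are hypothetical), a mapper can emit per-shard partial gradients keyed by a parameter identifier, and a reducer can average them before the next weight update:

import java.io.IOException;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Hypothetical job: each input line holds a partial gradient for one RBM
// weight, formatted as "<paramId>\t<gradient>", precomputed on a data shard.
public class GradientAverager {

    public static class GradientMapper
            extends Mapper<LongWritable, Text, Text, DoubleWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] parts = value.toString().split("\t");
            // Key: parameter identifier (for example, "layer0:w_3_7");
            // value: the shard's partial gradient for that parameter.
            context.write(new Text(parts[0]),
                    new DoubleWritable(Double.parseDouble(parts[1])));
        }
    }

    public static class GradientReducer
            extends Reducer<Text, DoubleWritable, Text, DoubleWritable> {
        @Override
        protected void reduce(Text key, Iterable<DoubleWritable> values,
                Context context) throws IOException, InterruptedException {
            double sum = 0.0;
            long count = 0;
            for (DoubleWritable v : values) {
                sum += v.get();
                count++;
            }
            // The averaged gradient for this parameter is what the next
            // weight-update step would consume.
            context.write(key, new DoubleWritable(sum / count));
        }
    }
}

In a full pipeline, each shard's partial gradients would come from running contrastive divergence locally on its portion of the data, and the averaged updates would be broadcast back to the workers before the next epoch.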

Implementation using Deeplearning4j


This section of the chapter will provide a basic idea of how to write the code for RBMs and DBNs using Deeplearning4j. Readers will learn the syntax for setting the various hyperparameters mentioned in this chapter.

Implementing RBMs and DBNs using Deeplearning4j is conceptually simple. The overall implementation can be split into three core phases: loading and preparing the data, configuring the network, and training and evaluating the model.

In this section, we will first discuss RBMs on the Iris dataset, and then we will come to the implementation of DBNs.

Restricted Boltzmann machines

For the building and training of RBMs, we first need to define and initialize the hyperparameters needed for the model:

Nd4j.MAX_SLICES_TO_PRINT = -1;           // print all slices of an array
Nd4j.MAX_ELEMENTS_PER_SLICE = -1;        // print all elements per slice
Nd4j.ENFORCE_NUMERICAL_STABILITY = true;
final int numRows = 4;                   // the Iris dataset has 4 features per example
final int numColumns = 1;
int outputNum = 10;
int numSamples = 150;                    // the Iris dataset contains 150 examples
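With the hyperparameters defined, the next phase is network configuration. The following is a minimal sketch assuming the Deeplearning4j 0.x API of this book's era; the training settings and layer sizes are illustrative choices, not prescribed values:

import org.deeplearning4j.nn.api.OptimizationAlgorithm;
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.conf.layers.RBM;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.lossfunctions.LossFunctions;

MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
        .seed(123)                       // fix the random seed for reproducibility
        .iterations(1)
        .learningRate(1e-2)
        .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
        .list()
        // RBM layer: Gaussian visible units for the real-valued Iris features,
        // rectified hidden units; trained generatively during pre-training
        .layer(0, new RBM.Builder()
                .nIn(numRows * numColumns)
                .nOut(outputNum)
                .visibleUnit(RBM.VisibleUnit.GAUSSIAN)
                .hiddenUnit(RBM.HiddenUnit.RECTIFIED)
                .lossFunction(LossFunctions.LossFunction.KL_DIVERGENCE)
                .build())
        // Classification head used during supervised fine-tuning
        .layer(1, new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                .nIn(outputNum)
                .nOut(3)                 // the Iris dataset has 3 classes
                .activation(Activation.SOFTMAX)
                .build())
        .pretrain(true)                  // layer-wise generative pre-training
        .backprop(true)                  // then supervised fine-tuning
        .build();

MultiLayerNetwork model = new MultiLayerNetwork(conf);
model.init();

Training then amounts to calling model.fit(...) on the prepared data, and evaluation uses Deeplearning4j's Evaluation class, mirroring the three phases outlined above.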

Summary


The RBM is a generative model that can randomly produce visible data values when latent or hidden parameters are supplied to it. In this chapter, we discussed the concept and mathematical model of the Boltzmann machine, which is an energy-based model. The chapter then discussed and gave a visual representation of the RBM. Further, the chapter discussed the CRBM, which combines convolution with RBMs to extract features from high-dimensional images. We then moved on to the popular DBNs, which are essentially stacked RBMs. Finally, the chapter discussed how to distribute the training of both RBMs and DBNs in the Hadoop framework.

We concluded the chapter by providing code samples for both models. The next chapter of the book introduces another generative model, the autoencoder, and its various forms, such as the denoising autoencoder, the deep autoencoder, and so on.
