Packt+ | Advance your knowledge in tech

You're reading from Java for Data Science

Product typeBook

Published inJan 2017

Reading LevelIntermediate

PublisherPackt

ISBN-139781785280115

Edition1st Edition

Languages

Java

Tools

Weka Deeplearning4j

Concepts

Data Science

Authors (2):

Richard M. Reese

Jennifer L. Reese

View More author details

Chapter 8. Deep Learning

In this chapter, we will focus on neural networks, often referred to as Deep Learning Networks (DLNs). This type of network is characterized as a multiple-layer neural network. Each of these layers are rained on the output of the previous layer, potentially identifying features and sub-features of the dataset. A feature hierarchy is created in this manner.

DLNs typically work with unstructured and unlabeled data, which constitute the vast bulk of data found in the world today. DLN will take this unstructured data, identify features, and try to reconstruct the original input. This approach is illustrated with Restricted Boltzmann Machines (RBMs) in Restricted Boltzmann Machines and with autoencoders in Deep autoencoders. An autoencoder takes a dataset and effectively compresses it. It then decompresses it to reconstruct the original dataset.

DLN can also be used for predictive analysis. The last step of a DLN will use an activation function to generate output represented...

Deeplearning4j architecture

In this section, we will discuss its architecture and address several of the common tasks performed when using the API. DLN typically starts with the creation of a MultiLayerConfiguration instance, which defines the network, or model. The network is composed of multiple layers. Hyperparameters are used to configure the network and are variables that affect such things as learning speed, activation functions to use for a layer, and how weights are to be initialized.

As with neural networks, the basic DLN process consists of:

Acquiring and manipulating data
Configuring and building a model
Training the model
Testing the model

We will investigate each of these tasks in the next sections.

Note

The code examples in this section are not intended to be entered and executed here. Instead, these examples are snippets out of later models that we will be using.

Acquiring and manipulating data

The DL4J API has a number of techniques for acquiring data. We will focus on those specific...

Deep learning and regression analysis

Neural networks can be used to perform regression analysis. However, other techniques (see the early chapters) may offer a more effective solution. With regression analysis, we want to predict a result based on several input variables.

We can perform regression analysis using an output layer that consists of a single neuron that sums the weighted input plus bias of the previous hidden layer. Thus, the result is a single value representing the regression.

Preparing the data

We will use a car evaluation database to demonstrate how to predict the acceptability of a car based on a series of attributes. The file containing the data we will be using can be downloaded from: http://archive.ics.uci.edu/ml/machine-learning-databases/car/car.data. It consists of car data such as price, number of passengers, and safety information, and an assessment of its overall quality. It is this latter element, quality, that we will try to predict. The comma-delimited values in...

Restricted Boltzmann Machines

RBM is often used as part of a multi-layer deep belief network. The output of the RBM is used as an input to another layer. The use of the RBM is repeated until the final layer is reached.

Note

Deep Belief Networks (DBNs) consist of several RBMs stacked together. Each hidden layer provides the input for the subsequent layer. Within each layer, the nodes cannot communicate laterally and it becomes essentially a network of other single-layer networks. DBNs are especially helpful for classifying, clustering, and recognizing image data.

The term, continuous restricted Boltzmann machine, refers an RBM that uses values other than integers. Input data is normalized to values between zero and one.

Each node of the input layer is connected to each node of the second layer. No nodes of the same layer are connected to each other. That is, there is no intra-layer communication. This is what restricted means.

The number of input nodes for the visible layer is dependent on the...

Deep autoencoders

An autoencoder is used for feature selection and extraction. It consists of two symmetrical DBNs. The first half of the network is composed of several layers, which performs encoding. The second part of the network performs decoding. Each layer of the autoencoder is an RBM. This is illustrated in the following figure:

The purpose of the encoding sequence is to compress the original input into a smaller vector space. The middle layer of the previous figure is this compressed layer. These intermediate vectors can be thought of as possible features of the dataset. The encoding is also referred to as the pre-training half. It is the output of the intermediate RBM layer and does not perform classification.

The encoder's first layer will use more inputs than used by the dataset. This has the effect of expanding the features of the dataset. A sigmoid-belief unit is a form of non-linear transformation used with each layer. This unit is not able to accurately represent information...

Convolutional networks

CNNs are feed-forward networks modeled after the visual cortex found in animals. The visual cortex is arranged with overlapping neurons, and so in this type of network, the neurons are also arranged in overlapping sections, known as receptive fields. Due to their design model, they function with minimal preprocessing or prior knowledge, and this lack of human intervention makes them especially useful.

This type of network is used frequently in image and video recognition applications. They can be used for classification, clustering, and object recognition. CNNs can also be applied to text analysis by implementing Optical Character Recognition (OCR). CNNs have been a driving force in the machine learning movement in part due to their wide applicability in practical situations.

We are going to demonstrate a CNN using DL4J. The process will closely mirror the process we used in the Building an autoencoder in DL4J section. We will again use the Mnist dataset. This dataset...

Recurrent Neural Networks

RNN differ from feed-forward networks in that their input includes the input from the previous iteration or step. They still process the current input but use a feedback loop to take into consideration the inputs to the prior step, also called the recent past, for context. This step effectively gives the network memory. One popular type of recurrent network involves Long Short-Term Memory (LSTM). This type of memory improves the processing power of the network.

RNNs are designed to process sequential data and are especially useful for analysis and prediction with text data. Given a sequence of words, an RNN can predict the probability of each word being the next in the sequence. This also allows for text generation by the network. RNNs are versatile and also process image data well, especially image labeling applications. The flexibility in design and purpose and ease in training make RNNs popular choices for many data science applications. DL4J also provides support...

Summary

In this chapter, we examined deep learning techniques for neural networks. All API support in this chapter was provided by Deeplearning4j. We began by demonstrating how to acquire and prepare data for use with deep learning networks. We discussed how to configure and build a model. This was followed by an explanation of how to train and test a model by splitting the dataset into training and testing segments.

Our discussion continued with an examination of deep learning and regression analysis. We showed how to prepare the data and class, build the model, and evaluate the model. We used sample data and displayed output statistics to demonstrate the relative effectiveness of our model.

RBM and DBNs were then examined. DBNs are comprised of RBMs stacked together and are especially useful for classification and clustering applications. Deep autoencoders are also built using RBMs, with two symmetrical DBNs. The autoencoders are especially useful for feature selection and extraction.

Finally...

The rest of the chapter is locked

You have been reading a chapter from

Java for Data Science

Published in: Jan 2017Publisher: PacktISBN-13: 9781785280115

A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.

undefined

Unlock this book and the full library FREE for 7 days

Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of

Start free trial

Renews at $15.99/month. Cancel anytime

Authors (2)

Richard M. Reese

Richard Reese has worked in the industry and academics for the past 29 years. For 10 years he provided software development support at Lockheed and at one point developed a C based network application. He was a contract instructor providing software training to industry for 5 years. Richard is currently an Associate Professor at Tarleton State University in Stephenville Texas. Richard is the author of various books and video courses some of which are as follows: Natural Language Processing with Java. Java for Data Science Getting Started with Natural Language Processing in Java
Read more about Richard M. Reese

Jennifer L. Reese

Jennifer L. Reese studied computer science at Tarleton State University. She also earned her M.Ed. from Tarleton in December 2016. She currently teaches computer science to high-school students. Her interests include the integration of computer science concepts with other academic disciplines, increasing diversity in computer science courses, and the application of data science to the field of education. She has co-authored two books: Java for Data Science and Java 7 New Features Cookbook. She previously worked as a software engineer. In her free time she enjoys reading, cooking, and traveling—especially to any destination with a beach. She is a musician and appreciates a variety of musical genres.
Read more about Jennifer L. Reese

Other recommended products

Related to this chapter

Natural Language Processing with Java Cookbook

This book will teach you how to perform basic and advanced NLP tasks in Java, using independent recipes. The book not only covers the essential aspects of NLP but also addresses other important areas such as the acquisition of text and techniques for utilizing NLP in varied domains

BookApr 2019386 pages

Natural Language Processing with Java

Natural Language Processing with Java will explore how to automatically organize text using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization. You will leverage the power of Java to extract relationships within different elements of text and documents.

BookJul 2018318 pages

Java Data Science Cookbook

Java has been one of the most popular languages for developers for several decades and yet the potential of the Java ecosystem still remains untapped when it comes to using JVM-based languages and platforms to solve data science related problems. A variety of tools and libraries are available such as Spark, Hadoop, and Mahout for computation and libraries such as MLlib, Weka, DL4j to implement smart data models. This book uncovers practically all these techniques in the form of recipes showing you how these tools and libraries can solve statistical, analytical, data mining, and information science related problems.

BookMar 2017372 pages

Machine Learning in Java

Machine Learning in Java will provide you with the techniques and tools you need to quickly gain insight from complex data. You will start by learning how to apply machine learning methods to a variety of common tasks including classification, prediction, forecasting, market basket analysis, and clustering.

BookNov 2018300 pages

Java Deep Learning Projects

You will build full-fledged, deep learning applications with Java and different open-source libraries. Master numerical computing, deep learning, and the latest Java programming features to carry out complex advanced tasks. This book is filled with best practices/tips after every project to help you optimize your deep learning models with ease.

BookJun 2018436 pages

Java Deep Learning Cookbook

Deep Learning is a trending topic in AI currently, as it allows you to make faster and more accurate predictions using the power of neural networks. This book will teach you the process of neural network design, and show you how to develop efficient deep learning applications using Deeplearning4j through practical and easy to implement recipes.

BookNov 2019304 pages

Deep Learning with Hadoop

BookFeb 2017206 pages

Mastering Java for Data Science

Java is the most wide-spread programming language nowadays, and you fill find it everywhere, from small startup companies to large enterprises. It is also a common choice for developing Data Science applications thanks to Java's prevalence and rich data processing toolbox. This book will explain how to use Java for Data Science, overview the available Machine Learning libraries, and cover many topics including supervised and unsupervised learning, natural language processing, deep learning, and big data

BookApr 2017364 pages

Personalised recommendations for you

Based on your interests and search pattern

Et al.

Ever wonder why speech recognition systems don't understand the Scottish accent, or what would happen if an astronaut only ate mac 'n' cheese, or other spurious reflections you'd have at a bar? We did, then collated those deliberations into absurd research articles with fake figures and methodologies inspired by even more fictionally absurd studies.

BookAug 2023230 pages5

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages4

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages5

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages1

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages5

Mastering Tableau 2023

This book is a comprehensive resource to mastering your Tableau skills and becoming a BI expert. As you progress, you will learn how to build advanced dashboards and improve your storytelling to derive key business insight, as well as make you well-versed with advanced functionalities of Tableau in the business intelligence domain.

BookAug 2023684 pages

Building AI Applications with ChatGPT APIs

This guide covers all ChatGPT API features for effortless creation of robust AI powered apps. With its help, you’ll be able to leverage ChatGPT’s cutting-edge NLP models to take your app development skills to the next level. You’ll also work on ten exciting projects that will give you the practical know-how that you can apply to your existing applications.

BookSep 2023258 pages5

Building AI Applications with ChatGPT APIs

This guide covers all ChatGPT API features for effortless creation of robust AI powered apps. With its help, you’ll be able to leverage ChatGPT’s cutting-edge NLP models to take your app development skills to the next level. You’ll also work on ten exciting projects that will give you the practical know-how that you can apply to your existing applications.

BookSep 2023258 pages2

Data Engineering with AWS

Embark on a journey to master data engineering pipelines on AWS! Our book offers a hands-on experience of AWS services for ingesting, transforming, and consuming data. Whether you're an absolute beginner or someone with basic data engineering experience, this guide is an indispensable resource.

BookOct 2023636 pages5

Modern Data Architecture on AWS

Every organization wants an agile, performant, and cost-effective data platform that meets all their current and future business needs. Purpose-built AWS analytics services and their features play a big part in building such a modern data platform. This book brings to you all the design and architectural patterns that’ll help you achieve this goal.

BookAug 2023420 pages5

Practical Guide to Applied Conformal Prediction in Python

Discover the power of Conformal Prediction with the "Practical Guide to Applied Conformal Prediction in Python." Master the latest techniques to quantify uncertainty in machine learning and computer vision models, and seamlessly apply them to your industry applications.

BookDec 2023240 pages

TinyML Cookbook

With over 70 project-based recipes, the TinyML Cookbook is a practical guide that will help you to get the most out of your microcontrollers. It provides a comprehensive understanding of the theoretical foundations while giving you hands-on experience training ML models for deployment on Arduino Nano 33 BLE Sense, Raspberry Pi Pico, and SparkFun RedBoard Artemis Nano microcontrollers.

BookNov 2023664 pages