Unsupervised Learning

This chapter delves into unsupervised learning models. In the previous chapter we explored Autoencoders, neural networks that learn via unsupervised learning. In this chapter we go deeper into other unsupervised learning models. In contrast to supervised learning, where the training dataset consists of both the inputs and the desired labels, unsupervised learning deals with the case where the model is provided only the inputs. The model learns the inherent input distribution by itself, without any desired labels guiding it. Clustering and dimensionality reduction are the two most commonly used unsupervised learning techniques. In this chapter we will learn about different machine learning and neural network techniques for both: we will cover techniques required for clustering and dimensionality reduction, go into the details of Boltzmann machines, and finally implement the aforementioned techniques using TensorFlow. The...

Principal component analysis

Principal component analysis (PCA) is the most popular multivariate statistical technique for dimensionality reduction. It analyzes training data consisting of several dependent variables, which are in general inter-correlated, and extracts the important information from the training data in the form of a set of new orthogonal variables called principal components. PCA can be performed in one of two ways: using eigendecomposition or using singular value decomposition (SVD).

PCA reduces n-dimensional input data to r-dimensional data, where r < n. In the simplest terms, PCA involves translating the origin and rotating the axes so that one of the axes (the principal axis) captures the highest variance of the data points. A reduced-dimension dataset is obtained from the original dataset by performing this transformation and then dropping (removing) the orthogonal axes with low variance. Here we employ the SVD method...
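
To make the transformation concrete, here is a minimal sketch of PCA via SVD in TensorFlow 2. The helper name pca_svd and the toy data are illustrative assumptions, not taken from the book's code:

```python
import tensorflow as tf

def pca_svd(x, r):
    """Reduce x of shape (m, n) to r dimensions via SVD (illustrative sketch)."""
    # Center the data: PCA operates on zero-mean variables.
    x_centered = x - tf.reduce_mean(x, axis=0)
    # Factorize: x_centered = U diag(s) V^T. The singular values s come back
    # sorted in descending order, so the leading columns of V are the
    # directions of highest variance (the principal axes).
    s, u, v = tf.linalg.svd(x_centered)
    # Project the centered data onto the top-r principal components.
    return tf.matmul(x_centered, v[:, :r])

x = tf.random.normal((100, 10))   # toy data: 100 samples, 10 features
x_reduced = pca_svd(x, r=2)       # shape (100, 2)
```

Because the singular values are returned in descending order, keeping only the first r columns of V is exactly the "drop the low-variance orthogonal axes" step described above.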

Self-organizing maps

k-means and PCA can cluster and compress the input data, respectively; however, neither maintains topological relationships. In this section we consider self-organizing maps (SOMs), sometimes known as Kohonen networks or winner-take-all units (WTUs), which do maintain the topological relationships of the input. SOMs are a very special kind of neural network, inspired by a distinctive feature of the human brain: in our brain, different sensory inputs are represented in a topologically ordered manner. Unlike in other neural networks, the neurons are not all connected to each other via weights; instead, they influence each other's learning. The most important aspect of a SOM is that its neurons represent the learned inputs in a topographic manner. SOMs were proposed by Teuvo Kohonen in 1989 [2].

In SOMs, neurons are usually placed at nodes of a (1D or 2D) lattice. Higher dimensions are also possible but are rarely used in practice. Each neuron in the lattice is connected to all the input units via a weight matrix...
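
As a rough illustration of the learning rule this implies, here is a minimal NumPy sketch of one SOM training step. The function name, the Gaussian neighbourhood function, and all parameter values are illustrative assumptions rather than the book's implementation:

```python
import numpy as np

def som_step(weights, x, lr=0.1, sigma=1.0):
    """One training step for a 2D SOM lattice (illustrative sketch).
    weights: (rows, cols, dim) lattice of weight vectors; x: (dim,) input."""
    # Find the best matching unit (BMU): the neuron whose weight vector is
    # closest to the input -- the "winner" in winner-take-all terms.
    dists = np.linalg.norm(weights - x, axis=-1)
    bmu = np.unravel_index(np.argmin(dists), dists.shape)
    # Gaussian neighbourhood around the BMU, measured on the lattice,
    # so that neurons near the winner learn more than distant ones.
    rows, cols = np.indices(dists.shape)
    lattice_d2 = (rows - bmu[0]) ** 2 + (cols - bmu[1]) ** 2
    h = np.exp(-lattice_d2 / (2.0 * sigma ** 2))
    # Pull each weight vector toward the input, scaled by the neighbourhood.
    weights += lr * h[..., None] * (x - weights)
    return weights

weights = np.random.rand(10, 10, 3)   # 10x10 lattice of 3D weight vectors
weights = som_step(weights, np.array([0.5, 0.1, 0.9]))
```

In practice both lr and sigma are typically decayed over training, so the map first organizes globally and then fine-tunes locally.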

Restricted Boltzmann machines

A restricted Boltzmann machine (RBM) is a two-layer neural network: the first layer is called the visible layer and the second layer is called the hidden layer. RBMs are called shallow neural networks because they are only two layers deep. They were first proposed in 1986 by Paul Smolensky (who called them Harmony Networks [1]), and later Geoffrey Hinton proposed Contrastive Divergence (CD) in 2006 as a method to train them. All neurons in the visible layer are connected to all the neurons in the hidden layer, but there is a restriction: no two neurons in the same layer may be connected. All neurons in an RBM are binary in nature.

RBMs can be used for dimensionality reduction, feature extraction, and collaborative filtering. The training of an RBM can be divided into three parts: a forward pass, a backward pass, and then a comparison.

Let us delve deeper into the math. We can divide the operation of RBMs into two passes:

Forward pass: The information at visible units...
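
For concreteness, here is a minimal NumPy sketch of that forward/backward/compare cycle as a single contrastive divergence (CD-1) update for a binary RBM. All names, shapes, and the learning rate are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cd1_step(v0, W, b_v, b_h, lr=0.01):
    """One CD-1 update for a binary RBM (illustrative sketch).
    v0: (batch, n_visible) data; W: (n_visible, n_hidden) weights;
    b_v, b_h: visible and hidden biases."""
    # Forward pass: infer hidden probabilities from the visible data
    # and sample binary hidden states.
    p_h0 = sigmoid(v0 @ W + b_h)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
    # Backward pass: reconstruct the visible layer from the hidden sample.
    p_v1 = sigmoid(h0 @ W.T + b_v)
    # Compare: re-infer the hidden probabilities from the reconstruction.
    p_h1 = sigmoid(p_v1 @ W + b_h)
    # Move the parameters toward the data statistics and away from the
    # model's reconstruction statistics.
    batch = len(v0)
    W += lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / batch
    b_v += lr * (v0 - p_v1).mean(axis=0)
    b_h += lr * (p_h0 - p_h1).mean(axis=0)
    return W, b_v, b_h
```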

Variational Autoencoders

Like DBNs and GANs, variational autoencoders (VAEs) are also generative models. VAEs combine the best of neural networks and Bayesian inference. They are among the most interesting neural networks and have emerged as one of the most popular approaches to unsupervised learning. They are Autoencoders with a twist: along with the conventional encoder and decoder networks of Autoencoders (see Chapter 8, Autoencoders), they have additional stochastic layers. The stochastic layer after the encoder network samples from a Gaussian distribution, and the one after the decoder network samples from a Bernoulli distribution. Like GANs, VAEs can be used to generate images and figures based on the distribution they have been trained on. VAEs also allow one to place complex priors on the latent space and thus learn powerful latent representations. The following diagram describes a VAE:

The encoder network qΦ(z|x) approximates...
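
To make the encoder/stochastic-layer/decoder structure concrete, here is a minimal Keras sketch of a VAE over flattened 28x28 images. The layer sizes, latent dimension, and the sample_z helper are illustrative assumptions; the training loss (not shown) would add the KL divergence between qΦ(z|x) and the Gaussian prior to the reconstruction loss:

```python
import tensorflow as tf

latent_dim = 2  # illustrative latent size

# Encoder: maps a flattened image to the mean and log-variance of q(z|x).
inputs = tf.keras.Input(shape=(784,))
h = tf.keras.layers.Dense(256, activation="relu")(inputs)
z_mean = tf.keras.layers.Dense(latent_dim)(h)
z_log_var = tf.keras.layers.Dense(latent_dim)(h)

# Stochastic layer (reparameterization trick): z = mu + sigma * eps,
# with eps drawn from a standard Gaussian.
def sample_z(args):
    mu, log_var = args
    eps = tf.random.normal(tf.shape(mu))
    return mu + tf.exp(0.5 * log_var) * eps

z = tf.keras.layers.Lambda(sample_z)([z_mean, z_log_var])

# Decoder: maps z to per-pixel Bernoulli parameters via a sigmoid output.
h_dec = tf.keras.layers.Dense(256, activation="relu")(z)
outputs = tf.keras.layers.Dense(784, activation="sigmoid")(h_dec)

vae = tf.keras.Model(inputs, outputs)
```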

Summary

This chapter covered the major unsupervised learning algorithms. We went through algorithms best suited for dimensionality reduction, clustering, and image reconstruction. We started with the dimensionality reduction algorithm PCA, then performed clustering using k-means and self-organizing maps. After this we studied the restricted Boltzmann machine and saw how it can be used for both dimensionality reduction and image reconstruction. Next the chapter delved into stacked RBMs, that is, deep belief networks, and we trained a DBN consisting of three RBM layers on the MNIST dataset. Lastly, we learned about variational autoencoders, which, like GANs, can generate images after learning the distribution of the input sample space.

This chapter, along with Chapters 6 and 9, covered models trained using unsupervised learning. In the next chapter, we move on to a different learning paradigm: reinforcement learning.

References

  1. https://arxiv.org/abs/1404.1100
  2. http://www.cs.otago.ac.nz/cosc453/student_tutorials/principal_components.pdf
  3. http://mplab.ucsd.edu/tutorials/pca.pdf
  4. http://projector.tensorflow.org/
  5. http://web.mit.edu/be.400/www/SVD/Singular_Value_Decomposition.htm
  6. https://www.deeplearningbook.org
  7. Kanungo, Tapas, et al. An Efficient k-Means Clustering Algorithm: Analysis and Implementation. IEEE Transactions on Pattern Analysis and Machine Intelligence 24.7 (2002): 881-892.
  8. Ortega, Joaquín Pérez, et al. Research issues on K-means Algorithm: An Experimental Trial Using Matlab. CEUR Workshop Proceedings: Semantic Web and New Technologies.
  9. A Tutorial on Clustering Algorithms, http://home.deib.polimi.it/matteucc/Clustering/tutorial_html/kmeans.html.
  10. Chen, Ke. On Coresets for k-Median and k-Means Clustering in Metric and Euclidean Spaces and Their Applications. SIAM Journal on Computing 39.3 (2009): 923-947.
  11. https://en.wikipedia.org/wiki...