Generative Networks

In the last chapter, we immersed ourselves in the world of autoencoding neural networks. We saw how these models can be used to estimate parameterized functions capable of reconstructing given inputs with respect to target outputs. While this may seem trivial at first glance, we now know that this manner of self-supervised encoding has several theoretical and practical implications.

In fact, from a machine learning (ML) perspective, the ability to approximate a connected set of points in a higher dimensional space onto a lower dimensional space (that is, manifold learning) has several advantages, ranging from more efficient data storage to lower memory consumption. Practically speaking, this allows us to discover ideal coding schemes for different types of data, or to perform dimensionality reduction on them, for use cases such as Principal Component...

Replicating versus generating content

While our autoencoding use cases in the last chapter were limited to image reconstruction and denoising, these use cases are quite distinct from the one we are about to address in this chapter. So far, we made our autoencoders reconstruct given inputs by learning an arbitrary mapping function. In this chapter, we want to understand how to train a model to create new instances of some content, instead of simply replicating its inputs. In other words, what if we asked a neural network to truly be creative and generate content just like human beings do? Can this even be achieved? The canonical answer in the realm of Artificial Intelligence (AI) is yes, but it is complicated. In the search for a more detailed answer, we arrive at the topic of this chapter: generative networks.

While a plethora of generative networks exist, ranging...

Understanding the notion of latent space

Recall from the previous chapter that a latent space is nothing but a compressed representation of the input data in a lower dimensional space. It essentially includes the features that are crucial to identifying the original input. To better understand this notion, it is helpful to try to mentally visualize what type of information may be encoded in the latent space. A useful analogy is to think of how we ourselves create content with our imagination. Suppose you were asked to create an imaginary animal. What information would you rely on to create this creature? You would sample features from animals you have previously seen: their color, whether they are bipedal or quadrupedal, mammalian or reptilian, land- or sea-dwelling, and so on. As it turns out, we ourselves develop latent models of the world, as...

Diving deeper into generative networks

So, let's try to understand the core mechanics of generative networks and how such approaches differ from the ones we already know. In our quest thus far, most of the networks we have implemented performed a deterministic transformation of some inputs in order to arrive at some outputs. It was not until we explored the topic of reinforcement learning (Chapter 7, Reinforcement Learning with Deep Q-Networks) that we learned the benefits of introducing a degree of stochasticity (that is, randomness) to our modeling efforts. This is a core notion that we will explore further as we familiarize ourselves with the manner in which generative networks function. As we mentioned earlier, the central idea behind generative networks is to use a deep neural network to learn the probability distribution of variables...

Using randomness to augment outputs

Over the years, researchers have developed methods that operationalize this notion of injecting controlled randomness, guided in a sense by the structure of the inputs. When we speak of generative models, we essentially wish to implement a mechanism that allows controlled, quasi-randomized transformations of our input, to generate something new, yet still plausibly resembling the original input.

Let's consider for a moment how this can be achieved. We wish to train a neural network to use some input variables (x) to generate some output variables (y), from a latent space produced by a model. An easy way to achieve this is to simply add an element of randomness as an input to our generator network, defined here by the variable (z). The value of z may be sampled from some probability distribution (a Gaussian distribution, for example) and...

Sampling from the latent space

To further elaborate, suppose we had to draw some samples (y) from a probability distribution over the latent space, with a mean of (μ) and a variance of (σ²):

  • Sampling operation: y ~ N(μ, σ²)

Since we use a sampling process to draw from this distribution, each individual sample may change every time the process is queried. We cannot differentiate the generated sample (y) with respect to the distribution parameters (μ and σ²), since we are dealing with a sampling operation, not a function. So, how exactly can we backpropagate our model's errors? Well, one solution is to redefine the sampling process as a deterministic transformation of a random variable (z), drawn from a standard normal distribution, to get to our generated output (y), like so:

  • Sampling equation: y = μ + σz

This is a crucial...
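
In Keras, this reparameterization is commonly expressed as a Lambda layer wrapping the sampling operation. The following is a minimal sketch, assuming the variable names (z_mean, z_log_var, and latent_dim) that we will encounter later in this chapter when building our VAE; the exact implementation may differ:

from keras.layers import Lambda
from keras import backend as K

def sampling(args):
    # reparameterization trick: y = mu + sigma * z, with z ~ N(0, 1)
    z_mean, z_log_var = args
    z = K.random_normal(shape=(K.shape(z_mean)[0], latent_dim))
    return z_mean + K.exp(0.5 * z_log_var) * z

Because the randomness is confined to z, the output remains differentiable with respect to μ and σ, allowing the model's errors to be backpropagated through the sampling step.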

Understanding types of generative networks

So, all we are actually doing here is generating an output by transforming a sample taken from the probability distribution representing the encoded latent space. In the last chapter, we saw how to produce such a latent space from some input data using encoding functions. In this chapter, we will see how to learn a continuous latent space (l), and then sample from it to generate novel outputs. To do this, we essentially learn a differentiable generator function, g(l; θ^(g)), which transforms samples from the continuous latent space (l) to generate an output. Here, this function itself is what is being approximated by the neural network.

The family of generative networks includes both Variational Autoencoders (VAEs) as well as Generative Adversarial Networks (GANs). As we mentioned before, there exist many types of generative models...

Understanding VAEs

Now that we have a high-level understanding of what generative networks entail, we can focus on a specific type of generative model. One of them is the VAE, proposed independently by Kingma and Welling (2013) and by Rezende, Mohamed, and Wierstra (2014). This model is actually very similar to the autoencoders we saw in the last chapter, but it comes with a slight twist (or several twists, to be more specific). For one, the latent space being learned is no longer discrete, but continuous by design! So, what's the big deal? Well, as we explained earlier, we will be sampling from this latent space to generate our outputs. However, sampling from a discrete latent space is problematic. The fact that it is discrete implies that there will be regions of the latent space with discontinuities, meaning that if these regions were to be randomly sampled...

Designing a VAE in Keras

For this exercise, we will go back to a well-known dataset that is easily available to all: the MNIST dataset. The visual features of handwritten digits make this dataset well suited to experimenting with VAEs, allowing us to better understand how these models work. We start by importing the necessary libraries:

import numpy as np
import matplotlib.pyplot as plt
from keras.layers import Input, Dense, Lambda, Layer
from keras.models import Model
from keras import backend as K
from keras import metrics
from keras.datasets import mnist

Loading and pre-processing the data

Next, we load the dataset, just as we did in Chapter 3, Signal Processing – Data Analysis with Neural Networks. We also take the...
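
For completeness, here is a minimal sketch of the loading and pre-processing step (the dimension and hyperparameter values are illustrative assumptions, consistent with the snippets that follow):

(x_train, y_train), (x_test, y_test) = mnist.load_data()

original_dim = 784       # 28 x 28 pixels, flattened
intermediate_dim = 256   # size of the hidden layer (assumed value)
latent_dim = 2           # dimensionality of the latent space

# flatten the images and scale pixel values to [0, 1]
x_train = x_train.reshape(-1, original_dim).astype('float32') / 255.
x_test = x_test.reshape(-1, original_dim).astype('float32') / 255.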

Building the encoding module in a VAE

Next, we will start building the encoding module of our VAE. This part is almost identical to the shallow encoder we built in the last chapter, except that it splits into two separate layers: one estimating the mean and the other the log variance over the latent space:

#Encoder module
input_layer = Input(shape=(original_dim,))
intermediate_layer = Dense(intermediate_dim, activation='relu', name='intermediate_layer')(input_layer)
z_mean = Dense(latent_dim, name='z_mean')(intermediate_layer)
z_log_var = Dense(latent_dim, name='z_log_var')(intermediate_layer)

You can optionally pass the name argument while defining a layer, to be able to visualize the model more intuitively. If we want, we can actually visualize the network we have built so far, by initializing it as a model and summarizing it, as shown here:
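
A minimal sketch of that inspection step (the book's own snippet is not shown in this excerpt, and the model name here is an assumption):

# wrap the encoder layers built so far in a Model and summarize it (a sketch)
encoder = Model(input_layer, [z_mean, z_log_var])
encoder.summary()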

Note...

Building the decoder module

Now that we have a mechanism implemented to sample from the latent space, we can proceed to build a decoder capable of mapping this sample to the output space, thereby generating a novel instance of the input data. Recall that just as the encoder funnels the data by narrowing the layer dimensions until the encoded representation is reached, the decoder layers progressively enlarge the representations sampled from the latent space, mapping them back to the original image dimensions:

#Decoder module
decoder_h = Dense(intermediate_dim, activation='relu')     # hidden decoding layer
decoder_mean = Dense(original_dim, activation='sigmoid')   # maps back to pixel space
h_decoded = decoder_h(z)   # z is the sampled latent vector, produced by the sampling layer
x_decoded_mean = decoder_mean(h_decoded)

Defining a custom variational layer

...
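
The body of this section is not included in this excerpt. For orientation, here is a hedged sketch of the canonical Keras pattern for this step, a custom layer that attaches the VAE loss (reconstruction cross-entropy plus the KL divergence) to the model; the book's actual implementation may differ:

# a sketch of the canonical custom loss layer (assumed; not the book's verbatim code)
class CustomVariationalLayer(Layer):
    def vae_loss(self, x, x_decoded_mean):
        xent_loss = original_dim * metrics.binary_crossentropy(x, x_decoded_mean)
        kl_loss = -0.5 * K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
        return K.mean(xent_loss + kl_loss)

    def call(self, inputs):
        x, x_decoded_mean = inputs
        self.add_loss(self.vae_loss(x, x_decoded_mean), inputs=inputs)
        return x

y = CustomVariationalLayer()([input_layer, x_decoded_mean])
vae = Model(input_layer, y)
vae.compile(optimizer='rmsprop', loss=None)   # the loss is added by the custom layer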

Visualizing the latent space

Since we have a two-dimensional latent space, we can simply plot out the representations as a 2D manifold, where encoded instances of each digit class may be visualized with respect to their proximity to other instances. This allows us to inspect the continuous latent space that we spoke of before, and to see how the network relates the features of the 10 digit classes (0 to 9) to each other. To do this, we revisit the encoding module from our VAE, which can now be used to produce a compressed latent space from some given data. Thus, we use the encoder module to make predictions on the test set, thereby encoding these images into the latent space. Finally, we can use a scatterplot from Matplotlib to plot out the latent representation. Do note that each individual point represents an encoded instance from the test set. The colors denote the different...
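
A minimal sketch of this plot, reusing the encoder model sketched earlier (variable names are assumptions):

# encode the test images into the 2D latent space and plot them by class
x_test_encoded = encoder.predict(x_test, batch_size=128)[0]   # take the z_mean component
plt.figure(figsize=(6, 6))
plt.scatter(x_test_encoded[:, 0], x_test_encoded[:, 1], c=y_test)
plt.colorbar()
plt.show()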

Latent space sampling and output generation

Finally, we can proceed to generate some novel handwritten digits with our VAE. To do this, we simply revisit the decoder part of our VAE (which naturally excludes the loss layer). We will be using it to decode samples from the latent space and generate some handwritten digits that were never actually written by anyone:

Next, we will display a grid of 15 x 15 digits, each of size 28 x 28. To do this, we initialize a matrix of zeros, matching the dimensions of the entire output to be generated. Then, we use the ppf function from SciPy to transform linearly spaced coordinates into grid values for the latent variables (z). After this, we enumerate through these grid values to obtain a sampled (z) value. We can now feed this sample to the generator network, which will decode the latent representation, to subsequently reshape the output...
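
A hedged sketch of this generation loop (the standalone generator model and variable names are assumptions, consistent with the decoder layers defined earlier):

from scipy.stats import norm

# rebuild the decoder as a standalone generator model (a sketch)
decoder_input = Input(shape=(latent_dim,))
_h_decoded = decoder_h(decoder_input)
_x_decoded_mean = decoder_mean(_h_decoded)
generator = Model(decoder_input, _x_decoded_mean)

n = 15            # 15 x 15 grid of digits
digit_size = 28
figure = np.zeros((digit_size * n, digit_size * n))

# map linearly spaced coordinates through the inverse Gaussian CDF
grid_x = norm.ppf(np.linspace(0.05, 0.95, n))
grid_y = norm.ppf(np.linspace(0.05, 0.95, n))

for i, yi in enumerate(grid_x):
    for j, xi in enumerate(grid_y):
        z_sample = np.array([[xi, yi]])
        x_decoded = generator.predict(z_sample)
        digit = x_decoded[0].reshape(digit_size, digit_size)
        figure[i * digit_size: (i + 1) * digit_size,
               j * digit_size: (j + 1) * digit_size] = digit

plt.figure(figsize=(10, 10))
plt.imshow(figure, cmap='Greys_r')
plt.show()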

Exploring GANs

The idea behind GANs is easier to grasp than that of other, similar models. In essence, we use two neural networks to play a rather elaborate game, much like in the movie Catch Me If You Can. For those who are not familiar with the plot of this film, we apologize in advance for any missed allusions.

We can think of a GAN as a system of two actors. On one side, we have a DiCaprio-like network that attempts to recreate some Monets and Dalís and ship them off to unsuspecting art dealers. On the other, we have a vigilant Tom Hanks-style network that intercepts these shipments and identifies any forgeries present. As time goes by, both become better at what they do, leading to ever more realistic forgeries on the con man's side, and a keener eye for them on the cop's side. This variation of a commonly used analogy indeed does well at introducing the...

Diving deeper into GANs

So, let's try to better understand how the different parts of a GAN work together to generate synthetic data. Consider the parameterized function (G) (you know, the kind we usually approximate using a neural network). This will be our generator, which samples its input vectors (z) from some latent probability distribution and transforms them into synthetic images. Our discriminator network (D) will then be presented with the synthetic images produced by our generator, mixed in among real images, and will attempt to tell the real images from the forgeries. Hence, our discriminator network is simply a binary classifier, equipped with something like a sigmoid activation function. Ideally, we want the discriminator to output high values when presented with real images, and low values when presented with generated fakes. Conversely, we want our generator network to try...
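
For reference, this adversarial game is commonly written as the following minimax objective (the formula is from Goodfellow et al., 2014, and is supplied here for context):

  • GAN objective: min_G max_D V(D, G) = E_x~p_data(x)[log D(x)] + E_z~p_z(z)[log(1 - D(G(z)))]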

Designing a GAN in Keras

For this exercise, suppose you were part of a research team working for a large automobile manufacturer. Your boss wants you to come up with a way to generate synthetic designs for cars, to systematically inspire the design team. You have heard all the hype about GANs and have decided to investigate whether they can be used for the task at hand. To do this, you want to first do a proof of concept, so you quickly get a hold of some low-resolution pictures of cars and design a basic GAN in Keras to see whether the network is at least able to recreate the general morphology of cars. Once you can establish this, you can convince your manager to invest in a few Titan X GPUs for the office, get some higher-resolution data, and develop some more complex architectures. So, let's start by implementing this proof of concept by first getting our hands on some...

Designing the generator module

Now comes the fun part. We will be implementing a Deep Convolutional Generative Adversarial Network (DCGAN). We start with the first part of the DCGAN: the generator network. The generator network will essentially learn to recreate realistic car images by transforming samples from a normal probability distribution that represents the latent space.

We will again use the functional API to define our model, nesting it in a function with three different arguments. The first argument, latent_dim, refers to the dimension of the input data randomly sampled from a normal distribution. The leaky_alpha argument refers to the alpha parameter provided to the LeakyReLU activation function used throughout the network. Finally, the init_stddev argument refers to the standard deviation with which to initialize the random weights of the network, used...
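
A minimal sketch of such a generator, assuming 32 x 32 x 3 output images and the argument names described above (the layer sizes are illustrative, not the book's exact architecture):

from keras.models import Model
from keras.layers import Input, Dense, Reshape, Conv2DTranspose, LeakyReLU
from keras.initializers import RandomNormal

def gen(latent_dim, leaky_alpha, init_stddev):
    # a sketch: map the latent vector to 4 x 4 x 256, then upsample to 32 x 32 x 3
    init = RandomNormal(stddev=init_stddev)
    latent_input = Input(shape=(latent_dim,))
    x = Dense(4 * 4 * 256, kernel_initializer=init)(latent_input)
    x = LeakyReLU(alpha=leaky_alpha)(x)
    x = Reshape((4, 4, 256))(x)
    x = Conv2DTranspose(128, kernel_size=5, strides=2, padding='same', kernel_initializer=init)(x)
    x = LeakyReLU(alpha=leaky_alpha)(x)
    x = Conv2DTranspose(64, kernel_size=5, strides=2, padding='same', kernel_initializer=init)(x)
    x = LeakyReLU(alpha=leaky_alpha)(x)
    out = Conv2DTranspose(3, kernel_size=5, strides=2, padding='same',
                          activation='tanh', kernel_initializer=init)(x)
    return Model(latent_input, out)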

Designing the discriminator module

Next, we continue our journey by designing the discriminator module, which will be responsible for telling the real images from the fake ones supplied by the generator module we just designed. The concept behind the architecture is quite similar to that of the generator, with some key differences. The discriminator network receives images of dimension 32 x 32 x 3, which it then transforms into various representations as information propagates through deeper layers, until the dense classification layer is reached, equipped with one neuron and a sigmoid activation function. It has one neuron since we are dealing with the binary classification task of distinguishing fake from real. The sigmoid function ensures a probabilistic output between 0 and 1, indicating how fake or real the network thinks a given image may be. Do also note the inclusion of...
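
By the same token, here is a hedged sketch of such a discriminator (the function name disc and the layer sizes are illustrative assumptions):

from keras.layers import Conv2D, Flatten

def disc(leaky_alpha, init_stddev):
    # a sketch: downsample 32 x 32 x 3 images to a single real/fake probability
    init = RandomNormal(stddev=init_stddev)
    image_input = Input(shape=(32, 32, 3))
    x = Conv2D(64, kernel_size=5, strides=2, padding='same', kernel_initializer=init)(image_input)
    x = LeakyReLU(alpha=leaky_alpha)(x)
    x = Conv2D(128, kernel_size=5, strides=2, padding='same', kernel_initializer=init)(x)
    x = LeakyReLU(alpha=leaky_alpha)(x)
    x = Flatten()(x)
    # one neuron with a sigmoid: the probability that the input image is real
    out = Dense(1, activation='sigmoid', kernel_initializer=init)(x)
    return Model(image_input, out)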

Putting the GAN together

Next, we weave together the two modules using the function shown here. As arguments, it takes the size of the latent samples, which will be transformed by the generator network to produce synthetic images. It also accepts a learning rate and a beta_1 decay rate (for the Adam optimizer) for each of the generator and discriminator networks. Finally, the last two arguments denote the alpha value for the LeakyReLU activation function, as well as a standard deviation value for the random initialization of the network weights:

def make_DCGAN(sample_size, 
               g_learning_rate,
               g_beta_1,
               d_learning_rate,
               d_beta_1,
               leaky_alpha,
               init_std):
    # clear first
    K.clear_session()
    
    # generator
    generator = gen(sample_size, leaky_alpha, init_std)

    # discriminator
    discriminator...
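
The rest of this function is cut off in this excerpt; a plausible, hedged completion consistent with the arguments above (the disc helper and the optimizer settings are assumptions, and Adam is assumed to be imported from keras.optimizers) could look like this:

    # discriminator (a sketch of the truncated remainder; not the book's verbatim code)
    discriminator = disc(leaky_alpha, init_std)
    discriminator.compile(optimizer=Adam(lr=d_learning_rate, beta_1=d_beta_1),
                          loss='binary_crossentropy')

    # GAN: chain the generator into the discriminator
    gan_input = Input(shape=(sample_size,))
    gan_output = discriminator(generator(gan_input))
    gan = Model(gan_input, gan_output)
    gan.compile(optimizer=Adam(lr=g_learning_rate, beta_1=g_beta_1),
                loss='binary_crossentropy')

    return generator, discriminator, gan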

The training function

Next comes the training function. Yes, it is a big one. Yet, as you will soon see, it is quite intuitive, and basically combines everything we have implemented so far:

def train(
    g_learning_rate,   # learning rate for the generator
    g_beta_1,          # the exponential decay rate for the 1st moment estimates in Adam optimizer
    d_learning_rate,   # learning rate for the discriminator
    d_beta_1,          # the exponential decay rate for the 1st moment estimates in Adam optimizer
    leaky_alpha,
    init_std,
    smooth=0.1,        # label smoothing
    sample_size=100,   # latent sample size (i.e. 100 random numbers)
    epochs=200,
    batch_size=128,    # train batch size
    eval_size=16):     # evaluation set size
    
    # labels for the batch size and the test size
    y_train_real, y_train_fake = make_labels(batch_size)
    y_eval_real,  y_eval_fake...

Defining the discriminator labels

Next, we define the label arrays to be used for the training and evaluation images, by calling the make_labels() function with the appropriate batch dimension. This returns arrays of 1 and 0 labels for the real and fake instances of the training and evaluation batches:

# labels for the batch size and the test size
    y_train_real, y_train_fake = make_labels(batch_size)
    y_eval_real,  y_eval_fake  = make_labels(eval_size)
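
The make_labels() helper itself is not shown in this excerpt; a minimal sketch, assuming it simply builds the two label arrays:

def make_labels(size):
    # 1s for real images, 0s for generated (fake) ones (assumed implementation)
    return np.ones([size, 1]), np.zeros([size, 1])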

Initializing the GAN

Following this, we initialize the GAN network by calling the make_DCGAN() function we defined earlier and providing it with the appropriate arguments:

# create a GAN, a generator and a discriminator
    generator, discriminator, gan = make_DCGAN...

Training the generator per batch

After this, we freeze the layers of the discriminator, again using the make_trainable() function, this time so that only the rest of the network is trained. Now it is the generator's turn to try to beat the discriminator, by generating a realistic image:

# train the generator via GAN (discriminator frozen)
make_trainable(discriminator, False)
# use the "real" labels so the generator is rewarded for fooling the discriminator
gan.train_on_batch(latent_samples, y_train_real)
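
The make_trainable() helper is likewise not shown here; a minimal sketch, assuming it simply toggles the trainable flag on a model's layers:

def make_trainable(model, trainable):
    # freeze or unfreeze every layer of the given model (assumed implementation)
    for layer in model.layers:
        layer.trainable = trainable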

Evaluating results per epoch

Next, we exit the nested loop to perform some actions at the end of each epoch. We randomly sample some real images as well as latent variables, and then generate some fake images to plot out. Do note that we used the .test_on_batch() method to obtain the loss values of the discriminator and the GAN and append...

Executing the training session

We finally initiate the training session with the respective arguments. You will notice the tqdm module displaying a progress bar indicating the number of processed batches per epoch. At the end of each epoch, you will be able to visualize a 4 x 4 grid (shown next) of samples generated by the GAN network. And there you have it: now you know how to implement a GAN in Keras. On a side note, it can be very beneficial to have tensorflow-gpu along with CUDA set up if you're running the code on a local machine with access to a GPU. We ran this code for 200 epochs, yet it would not be uncommon to let it run for thousands of epochs, given the resources and time. Ideally, the longer the two networks battle, the better the results should get. Yet, this may not always be the case, and hence, such attempts may also require careful monitoring of the...

Conclusion

In this section of the chapter, we implemented a specific type of GAN (that is, the DCGAN) for a specific use case (image generation). The idea of using two networks in parallel to keep each other in check, however, can be applied to various types of networks, for very different use cases. For example, if you wish to generate synthetic time series data, you can implement the same concepts we learned here with recurrent neural networks to design a generative adversarial model! There have been several attempts at this in the research community, with quite successful results. A group of Swedish researchers, for example, used recurrent neural networks in a generative adversarial setup to produce synthetic segments of classical music! Other prominent ideas with GANs involve using attention models (a topic unfortunately not covered by this book) to orient network perception...

Summary

In this chapter, we saw how to augment neural networks with randomness in a systematic manner, in order to make them output instances of what we humans deem creative. With VAEs, we saw how parameterized function approximation using neural networks can be used to learn a probability distribution over a continuous latent space. We then saw how to randomly sample from such a distribution to generate synthetic instances of the original data. In the second part of the chapter, we saw how two networks can be trained in an adversarial manner for a similar task.

Training a GAN is simply a different strategy for learning a latent space, compared to its counterpart, the VAE. While GANs have some key benefits for the use case of synthetic image generation, they have some downsides as well. GANs are notoriously difficult to train and often generate images from...
