Generative Models

So far in this book, we have covered the three main types of neural networks: feedforward neural networks (FNNs), convolutional neural networks (CNNs), and recurrent neural networks (RNNs). Each of these is a discriminative model; that is, it learns to discriminate (differentiate) between the classes we want it to predict, answering questions such as: is this language French or English? Is this song classic rock or 90s pop? What objects are present in this scene? However, deep neural networks don't just stop there. They can also be used to improve image or video resolution, or to generate entirely new images and data. These types of models are known as generative models.

In this chapter, we will cover the following topics related to generative models:

  • Why we need generative models
  • Autoencoders
  • Generative adversarial networks
  • Flow-based networks...

Why we need generative models

All the various neural network architectures we have learned about in this book have served a specific purpose: to make a prediction about some given data. Each of these neural networks has its own respective strengths for various tasks. A CNN is very effective for object recognition tasks or music genre classification, an RNN is very effective for language translation or time series prediction, and FNNs are great for regression or classification. All of these are discriminative models, which learn to estimate a conditional distribution such as p(y|x). Generative models, on the other hand, model the data distribution p(x) itself, which means we can sample new data from them.
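To make the distinction concrete, here is a minimal sketch (the toy dataset and variable names are our own illustration, not the book's). A generative model as simple as a fitted Gaussian lets us estimate p(x) and then draw brand-new samples from it, something a purely discriminative classifier has no mechanism for:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Toy dataset: 1,000 two-dimensional points from some unknown source.
data = rng.normal(loc=[2.0, -1.0], scale=[0.5, 1.5], size=(1000, 2))

# Generative view: fit a model of p(x) (here, a single Gaussian)...
mu = data.mean(axis=0)            # estimated mean of p(x)
cov = np.cov(data, rowvar=False)  # estimated covariance of p(x)

# ...and then draw entirely new samples from the learned distribution.
new_samples = rng.multivariate_normal(mu, cov, size=5)
print(new_samples)
```

A discriminative model trained on the same points would only return class probabilities p(y|x) for a given input; it could not produce new points at all.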

But how does this benefit us? What can we use generative models for? Well, there are a couple of reasons why it is important for us to understand how generative models work....

Autoencoders

An autoencoder is an unsupervised type of FNN that learns to reconstruct its high-dimensional input from a compressed latent encoding. You can think of it as trying to learn an identity function (that is, it takes x as input and then predicts x).

Let's start by taking a look at the following diagram, which shows you what an autoencoder looks like:

As you can see, the network is split into two components, an encoder and a decoder, which are mirror images of each other. The two components are connected to each other through a bottleneck layer (sometimes referred to as either a latent-space representation or a compression) whose dimensions are a lot smaller than the input's. You should note that while the network architecture is symmetric, its weights do not necessarily need to be. But why? What does this network learn and how does it do it? Let's...
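To ground this description, here is a minimal sketch of such a network in PyTorch (the framework, layer sizes, and 784-dimensional input are our assumptions for illustration; the book's own examples may differ). The encoder compresses the input into a small bottleneck, the decoder mirrors it back out, and the loss compares the reconstruction against the input itself, which is exactly the identity-function objective described earlier:

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        # Encoder: maps the input down to the low-dimensional bottleneck.
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),
        )
        # Decoder: mirrors the encoder back up to the input dimension.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, input_dim),
        )

    def forward(self, x):
        z = self.encoder(x)     # latent-space representation
        return self.decoder(z)  # reconstruction of x

model = Autoencoder()
criterion = nn.MSELoss()
x = torch.randn(64, 784)        # dummy batch standing in for real data
loss = criterion(model(x), x)   # the target is the input itself
loss.backward()
print(loss.item())
```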

Generative adversarial networks

The generative adversarial network (GAN) is a game theory-inspired neural network architecture introduced by Ian Goodfellow and his colleagues in 2014. It comprises two networks, a generator network and a discriminator (or critic) network, which compete against each other in a minimax game; this allows both of them to improve simultaneously as each tries to better the other.
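Written out, the minimax game from the original 2014 formulation has the generator G minimizing, and the discriminator D maximizing, the following value function, where z is noise drawn from a simple prior p_z:

$$
\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]
$$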

In the last couple of years, GANs have produced some phenomenal results in tasks such as creating images that are indistinguishable from real images, generating music when given some recordings, and even generating text. But these models are notoriously difficult to train. Let's now find out what exactly GANs are, how they bring about such tremendous results, and what makes them so challenging to train.
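As a rough illustration of why the training dynamics are delicate, here is a skeleton of the alternating update scheme in PyTorch (a sketch under our own assumptions about network sizes, optimizers, and data; it uses the common non-saturating form of the generator loss rather than the raw minimax objective). Keeping the two updates in balance is a large part of what makes GANs hard to train:

```python
import torch
import torch.nn as nn

latent_dim, data_dim = 16, 784

# Deliberately tiny stand-ins for the two competing networks.
generator = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                          nn.Linear(128, data_dim))
discriminator = nn.Sequential(nn.Linear(data_dim, 128), nn.LeakyReLU(0.2),
                              nn.Linear(128, 1), nn.Sigmoid())

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCELoss()

real = torch.randn(64, data_dim)   # dummy batch standing in for real data
ones, zeros = torch.ones(64, 1), torch.zeros(64, 1)

for step in range(100):
    # 1) Discriminator update: push D(real) -> 1 and D(fake) -> 0.
    fake = generator(torch.randn(64, latent_dim)).detach()
    loss_d = bce(discriminator(real), ones) + bce(discriminator(fake), zeros)
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # 2) Generator update: push D(G(z)) -> 1, i.e. fool the discriminator.
    fake = generator(torch.randn(64, latent_dim))
    loss_g = bce(discriminator(fake), ones)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```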

As we know, discriminative models learn a conditional distribution...

Flow-based networks

So far in this chapter, we have studied two kinds of generative models, GANs and VAEs, but there is another kind, known as flow-based generative models, which directly learn the probability density function of the data distribution. This is something the previous models do not do: a GAN learns only an implicit density (we can sample from it but cannot evaluate it), and a VAE optimizes a lower bound on the likelihood rather than the likelihood itself. Flow-based models make use of normalizing flows, which overcome this difficulty by transforming a simple distribution into a more complex one through a series of invertible mappings. We repeatedly apply the change of variables rule, which allows the initial probability density to flow through the series of invertible mappings, and at the end, we get the target probability distribution.
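For reference, the change of variables rule being applied here takes the following standard form (the notation is ours): if z has density p_z and x = f(z) for an invertible mapping f, then

$$
p_x(x) = p_z\left(f^{-1}(x)\right)\left|\det \frac{\partial f^{-1}(x)}{\partial x}\right|
$$

and for a chain of K invertible mappings z_k = f_k(z_{k-1}) with x = z_K, taking logarithms turns the product of Jacobian determinants into a sum:

$$
\log p_x(x) = \log p_z(z_0) - \sum_{k=1}^{K} \log\left|\det \frac{\partial f_k(z_{k-1})}{\partial z_{k-1}}\right|
$$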

Normalizing...

Summary

In this chapter, we covered a variety of generative models, which learn the distribution of the true data and try to generate data that is indistinguishable from it. We started with a simple autoencoder and built on it to understand a variant that uses variational inference to generate data similar to the input. We then went on to learn about GANs, which pit two models, a discriminator and a generator, against each other in a game in which the generator tries to learn to create data that looks real enough to fool the discriminator into thinking it is real.

Finally, we learned about flow-based networks, which build up a complex probability density from a simpler one by applying a series of invertible transformations to it. These models are used in a variety of tasks, including, but not limited to, synthetic data generation to overcome data limitations...
