
You're reading from Advanced Deep Learning with Python

Product type: Book
Published in: Dec 2019
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781789956177
Edition: 1st

Author: Ivan Vasilev

Ivan Vasilev started working on the first open source Java deep learning library with GPU support in 2013. The library was acquired by a German company, where he continued its development. He has also worked as a machine learning engineer and researcher in medical image classification and segmentation with deep neural networks. Since 2017, he has focused on financial machine learning. He co-founded an algorithmic trading company, where he's the lead engineer. He holds an MSc in artificial intelligence from Sofia University St. Kliment Ohridski and has written two previous books on the same topic.

Generative Models

In the previous two chapters (Chapter 3, Advanced Convolutional Networks, and Chapter 4, Object Detection and Image Segmentation), we focused on supervised computer vision problems, such as classification and object detection. In this chapter, we'll discuss how to create new images with the help of unsupervised neural networks, which have the considerable advantage of not requiring labeled data. More specifically, we'll talk about generative models.

This chapter will cover the following topics:

  • Intuition and justification of generative models
  • Introduction to Variational Autoencoders (VAEs)
  • Introduction to Generative Adversarial Networks (GANs)
  • Types of GAN
  • Introducing artistic style transfer

Intuition and justification of generative models

So far, we've used neural networks as discriminative models. This simply means that, given input data, a discriminative model will map it to a certain label (in other words, a classification). A typical example is the classification of MNIST images into 1 of 10 digit classes, where the neural network maps the input features (pixel intensities) to the digit label. We can also say this in another way: a discriminative model gives us the conditional probability of the class, Y, given the input, X: P(Y|X). In the case of MNIST, this is the probability of the digit, given the pixel intensities of the image.

On the other hand, a generative model learns how the classes are distributed. You can think of it as the opposite of what the discriminative model does. Instead of predicting the class probability, P(Y|X), given certain input features, it tries to predict the...
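Although the paragraph is cut off in this excerpt, the standard relationship between the two model families is worth making explicit. Bayes' rule (a general identity, not something specific to this chapter) expresses the discriminative quantity P(Y|X) in terms of the generative quantities P(X|Y) and P(Y):

```latex
% Bayes' rule: the discriminative posterior in terms of generative components
P(Y \mid X) = \frac{P(X \mid Y)\,P(Y)}{P(X)},
\qquad
P(X) = \sum_{y} P(X \mid Y = y)\,P(Y = y)
```

In other words, a model of P(X|Y) can in principle also classify, but its real appeal is that sampling from it produces new data points.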

Introduction to VAEs

To understand VAEs, we need to talk about regular autoencoders first. An autoencoder is a feed-forward neural network that tries to reproduce its input. In other words, the target value (label) of an autoencoder is equal to the input data, y_i = x_i, where i is the sample index. We can formally say that it tries to learn an identity function, f(x) = x (a function that repeats its input). Since our labels are just the input data, the autoencoder is an unsupervised algorithm.

The following diagram represents an autoencoder:

An autoencoder

An autoencoder consists of input, hidden (or bottleneck), and output layers. Similar to U-Net (Chapter 4, Object Detection and Image Segmentation), we can think of the autoencoder as a virtual composition of two components:

  • Encoder: Maps the input data to the network's internal representation. For the sake of simplicity, in this example...
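The rest of the component description is locked in this excerpt, but the encoder/decoder split is easy to see in code. The following is a minimal sketch of a dense autoencoder in TensorFlow 2 / Keras, in the spirit of the chapter's examples; the bottleneck size of 64 and the use of MNIST are illustrative assumptions, not the book's exact code:

```python
import tensorflow as tf

# Load MNIST and flatten the 28x28 images to 784-dimensional vectors in [0, 1]
(x_train, _), (x_test, _) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype('float32') / 255.0
x_test = x_test.reshape(-1, 784).astype('float32') / 255.0

# Encoder: maps the input to the network's internal (bottleneck) representation
encoder = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(784,)),
])

# Decoder: reconstructs the input from the bottleneck representation
decoder = tf.keras.Sequential([
    tf.keras.layers.Dense(784, activation='sigmoid', input_shape=(64,)),
])

# The autoencoder chains the two; the target equals the input (y_i = x_i)
autoencoder = tf.keras.Sequential([encoder, decoder])
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
autoencoder.fit(x_train, x_train, epochs=5, batch_size=128,
                validation_data=(x_test, x_test))
```

After training, encoder.predict(...) yields the compressed representation of an image, and decoder.predict(...) maps such a representation back to pixel space.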

Introduction to GANs

In this section, we'll talk about arguably the most popular generative model today: the GAN framework. It was first introduced in 2014 in the landmark paper Generative Adversarial Nets (http://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf). The GAN framework can work with any type of data, but its most popular application by far is image generation, and we'll discuss it in this context only. Let's see how it works:

A GAN system

A GAN is a system of two components (neural networks):

  • Generator: This is the generative model itself. It takes a sample from a probability distribution (random noise) as input and tries to generate a realistic output image. Its purpose is similar to that of the decoder part of the VAE.
  • Discriminator: This takes two alternating inputs: real images of the training dataset or generated fake samples from the generator. It tries...
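The discriminator's description continues beyond this excerpt, but the adversarial setup itself can be summarized in a short sketch. Below is a minimal generator/discriminator pair for flattened 28x28 images and a single training step in TensorFlow 2 / Keras; the latent size, layer widths, and optimizer settings are illustrative assumptions rather than the chapter's exact implementation:

```python
import tensorflow as tf

LATENT_DIM = 100  # size of the generator's random-noise input (assumption)

# Generator: noise vector -> flattened 28x28 image in [-1, 1]
generator = tf.keras.Sequential([
    tf.keras.layers.Dense(256, activation='relu', input_shape=(LATENT_DIM,)),
    tf.keras.layers.Dense(784, activation='tanh'),
])

# Discriminator: image -> probability that the image is real
discriminator = tf.keras.Sequential([
    tf.keras.layers.Dense(256, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])

bce = tf.keras.losses.BinaryCrossentropy()
g_opt = tf.keras.optimizers.Adam(1e-4)
d_opt = tf.keras.optimizers.Adam(1e-4)

@tf.function
def train_step(real_images):
    noise = tf.random.normal([tf.shape(real_images)[0], LATENT_DIM])
    with tf.GradientTape() as g_tape, tf.GradientTape() as d_tape:
        fake_images = generator(noise, training=True)
        real_out = discriminator(real_images, training=True)
        fake_out = discriminator(fake_images, training=True)
        # Discriminator: push real -> 1 and fake -> 0
        d_loss = (bce(tf.ones_like(real_out), real_out) +
                  bce(tf.zeros_like(fake_out), fake_out))
        # Generator: try to make the discriminator output 1 for fakes
        g_loss = bce(tf.ones_like(fake_out), fake_out)
    d_grads = d_tape.gradient(d_loss, discriminator.trainable_variables)
    g_grads = g_tape.gradient(g_loss, generator.trainable_variables)
    d_opt.apply_gradients(zip(d_grads, discriminator.trainable_variables))
    g_opt.apply_gradients(zip(g_grads, generator.trainable_variables))
```

The two networks are trained in alternation: each step improves the discriminator at telling real from fake, and the generator at fooling it.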

Types of GAN

Since the GAN framework was first introduced, a lot of new variations have emerged. In fact, there are so many new GANs now that, in order to stand out, the authors have come up with creative GAN names, such as BicycleGAN, DiscoGAN, GANs for LIFE, and ELEGANT. In the next few sections, we'll discuss some of them. All of the examples have been implemented with TensorFlow 2.0 and Keras.

The code for DCGAN, CGAN, WGAN, and CycleGAN is partially inspired by https://github.com/eriklindernoren/Keras-GAN. You can find the full implementations of all the examples in this chapter at https://github.com/PacktPublishing/Advanced-Deep-Learning-with-Python/tree/master/Chapter05.

Deep Convolutional GAN

In this section...
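The section itself is locked in this excerpt, but the defining idea of the Deep Convolutional GAN (DCGAN) is to build both networks from convolutional layers: the generator upsamples a noise vector with transposed convolutions and batch normalization. The following sketch of such a generator for 28x28 single-channel output uses layer choices assumed for illustration, not the book's exact code:

```python
import tensorflow as tf

def build_dcgan_generator(latent_dim=100):
    """DCGAN-style generator: project and reshape the noise vector,
    then upsample with transposed convolutions and batch normalization."""
    return tf.keras.Sequential([
        tf.keras.layers.Dense(7 * 7 * 128, input_shape=(latent_dim,)),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.ReLU(),
        tf.keras.layers.Reshape((7, 7, 128)),
        # 7x7 -> 14x14
        tf.keras.layers.Conv2DTranspose(64, kernel_size=4, strides=2,
                                        padding='same'),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.ReLU(),
        # 14x14 -> 28x28; tanh keeps the output in [-1, 1]
        tf.keras.layers.Conv2DTranspose(1, kernel_size=4, strides=2,
                                        padding='same', activation='tanh'),
    ])

generator = build_dcgan_generator()
generator.summary()  # verify the output shape is (None, 28, 28, 1)
```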

Introducing artistic style transfer

In this final section, we'll discuss artistic style transfer. Similar to one of the applications of CycleGAN, it allows us to use the style (or texture) of one image to reproduce the semantic content of another. Although it can be implemented with different algorithms, the most popular approach was introduced in 2015 in the paper A Neural Algorithm of Artistic Style (https://arxiv.org/abs/1508.06576). It's also known as neural style transfer and it uses (you guessed it!) CNNs. The basic algorithm has been improved and tweaked over the past few years, but in this section, we'll explore its original form, as this will give us a good foundation for understanding the latest versions.

The algorithm takes two images as input:

  • The content image (C) we would like to redraw
  • The style image (I) whose style (texture) we'll use to redraw C...
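The list is truncated here, but the two losses of the original algorithm are well documented: the content loss compares CNN feature maps of C and the generated image directly, while the style loss compares their Gram matrices, which capture texture by correlating feature channels. Here is a minimal sketch of the Gram-matrix computation in TensorFlow 2; the function name and the normalization by the number of spatial positions are assumptions for illustration:

```python
import tensorflow as tf

def gram_matrix(features):
    """Gram matrix of a feature map of shape (height, width, channels):
    pairwise inner products between channels, which encode texture."""
    h, w, c = features.shape
    flat = tf.reshape(features, (h * w, c))  # one row per spatial position
    gram = tf.matmul(flat, flat, transpose_a=True)  # shape: (c, c)
    return gram / tf.cast(h * w, tf.float32)  # normalize by positions
```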

Summary

In this chapter, we discussed how to create new images with generative models, which is one of the most exciting deep learning areas at the moment. We learned about the theoretical foundations of VAEs, and then we implemented a simple VAE to generate new MNIST digits. Next, we described the GAN framework, and we discussed and implemented multiple types of GAN, including DCGAN, CGAN, WGAN, and CycleGAN. Finally, we mentioned the neural style transfer algorithm. This chapter concludes a series of four chapters dedicated to computer vision, and I really hope you've enjoyed them.

In the next few chapters, we'll talk about Natural Language Processing and recurrent networks.

