You're reading from  Advanced Deep Learning with Keras

Product type: Book
Published in: Oct 2018
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781788629416
Edition: 1st Edition
Author (1)
Rowel Atienza

Rowel Atienza is an Associate Professor at the Electrical and Electronics Engineering Institute of the University of the Philippines, Diliman. He holds the Dado and Maria Banatao Institute Professorial Chair in Artificial Intelligence. Rowel has been fascinated with intelligent robots since he graduated from the University of the Philippines. He received his MEng from the National University of Singapore for his work on an AI-enhanced four-legged robot. He finished his Ph.D. at The Australian National University for his contribution to the field of active gaze tracking for human-robot interaction. Rowel's current research work focuses on AI and computer vision. He dreams of building useful machines that can perceive, understand, and reason. To help make his dreams a reality, Rowel has been supported by grants from the Department of Science and Technology (DOST), Samsung Research Philippines, and the Commission on Higher Education-Philippine California Advanced Research Institutes (CHED-PCARI).

Implementation of InfoGAN in Keras


To implement InfoGAN on the MNIST dataset, some changes need to be made to the base code of ACGAN. As highlighted in the following listing, the generator concatenates both the entangled code (z noise code) and the disentangled codes (one-hot label and continuous codes) to serve as its input. The builder functions for the generator and discriminator are also implemented in gan.py in the lib folder.

Note

The complete code is available on GitHub:

https://github.com/PacktPublishing/Advanced-Deep-Learning-with-Keras

Listing 6.1.1, infogan-mnist-6.1.1.py, shows how the InfoGAN generator concatenates both entangled and disentangled codes to serve as input:

def generator(inputs,
              image_size,
              activation='sigmoid',
              labels=None,
              codes=None):
    """Build a Generator Model

    Stack of BN-ReLU-Conv2DTranspose to generate fake images.
    Output activation is sigmoid instead of tanh in [1].
    Sigmoid converges easily.

...
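
The concatenation described above can be sketched without Keras. The following is a minimal illustration, not the book's listing: it assembles one generator input vector from a noise code, a one-hot digit label, and two continuous codes. The helper names (`one_hot`, `build_generator_input`) and the noise size of 8 are hypothetical and chosen only for illustration; in the actual model this step would be performed by a concatenation layer on batched tensors.

```python
import random


def one_hot(digit, num_classes=10):
    """Return a one-hot list for a digit label (the discrete code)."""
    if not 0 <= digit < num_classes:
        raise ValueError("digit out of range")
    return [1.0 if i == digit else 0.0 for i in range(num_classes)]


def build_generator_input(z, digit, codes):
    """Concatenate the entangled noise code with the disentangled codes.

    z     -- list of floats, the entangled noise code
    digit -- integer label, converted to a one-hot discrete code
    codes -- list of floats, the continuous codes
    """
    return list(z) + one_hot(digit) + list(codes)


# Hypothetical sizes: an 8-dim noise code and two continuous codes.
rng = random.Random(0)
z = [rng.uniform(-1.0, 1.0) for _ in range(8)]
x = build_generator_input(z, digit=3, codes=[0.0, 0.0])
print(len(x))  # 8 noise + 10 label + 2 codes = 20
```

Keeping the three parts in a fixed order matters: the generator can only learn a stable meaning for each code if the label and continuous codes always occupy the same positions in its input.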

Generator outputs of InfoGAN


As with the GANs presented previously, we trained InfoGAN for 40,000 steps. After training is complete, we can run the InfoGAN generator to produce new outputs using the model saved in the infogan_mnist.h5 file. The following validations are conducted:

  1. Generate digits 0 to 9 by varying the discrete labels from 0 to 9. Both continuous codes are set to zero. The results are shown in Figure 6.1.5. We can see that the InfoGAN discrete code can control the digits produced by the generator:

    python3 infogan-mnist-6.1.1.py --generator=infogan_mnist.h5 --digit=0 --code1=0 --code2=0

    to

    python3 infogan-mnist-6.1.1.py --generator=infogan_mnist.h5 --digit=9 --code1=0 --code2=0
    
  2. Examine the effect of the first continuous code to understand which attribute has been affected. We vary the first continuous code from -2.0 to 2.0 for digits 0 to 9. The second continuous code is set to 0.0. Figure 6.1.6 shows that the first continuous...
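
The two validations above amount to a sweep over generator settings. As a rough sketch, the snippet below builds the ten command lines for the first validation from the flags shown above, and an evenly spaced set of values for the first continuous code covering -2.0 to 2.0. The helper names and the choice of 9 steps are assumptions for illustration; nothing here is taken from the script itself beyond the flags already shown.

```python
def digit_commands(script="infogan-mnist-6.1.1.py",
                   generator="infogan_mnist.h5"):
    """Build one command per digit with both continuous codes at zero."""
    return [
        ["python3", script,
         "--generator=" + generator,
         "--digit=%d" % digit,
         "--code1=0",
         "--code2=0"]
        for digit in range(10)
    ]


def code_values(low=-2.0, high=2.0, steps=9):
    """Return `steps` evenly spaced values from low to high, inclusive."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    width = (high - low) / (steps - 1)
    return [low + i * width for i in range(steps)]


commands = digit_commands()
print(" ".join(commands[0]))
print(code_values())
```

Each entry in `commands` is an argument list, so it could be handed to `subprocess.run` as is, provided the script and the saved model are in the working directory.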

StackedGAN


In the same spirit as InfoGAN, StackedGAN proposes a method for disentangling latent representations for conditioning generator outputs. However, StackedGAN uses a different approach to the problem. Instead of learning how to condition the noise to produce the desired output, StackedGAN breaks down a GAN into a stack of GANs. Each GAN is trained independently in the usual discriminator-adversarial manner with its own latent code.

Figure 6.2.1 shows how StackedGAN works in the context of hypothetical celebrity face generation, assuming that the Encoder network is trained to classify celebrity faces.

The Encoder network is made of a stack of simple encoders, Encoder i where i = 0 … n - 1 corresponding to n features. Each encoder extracts certain facial features. For example, Encoder0 may be the encoder for hairstyle features, Features1. All the simple encoders contribute to making the overall Encoder perform correct predictions.
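
One way to picture the Encoder as a stack is as a chain of functions in which each simple encoder consumes the previous encoder's output, and the intermediate results are kept. The toy encoders below are placeholders (simple arithmetic on lists), not trained networks, and all names are hypothetical.

```python
def run_encoder_stack(encoders, x):
    """Apply each simple encoder in order and keep every intermediate output.

    Returns a list whose i-th entry is the output of encoders[i].
    """
    features = []
    for encode in encoders:
        x = encode(x)
        features.append(x)
    return features


# Placeholder encoders; a real stack would use trained network layers.
def encoder0(image):
    """Toy Encoder0: summarize each row of the input by its mean."""
    return [sum(row) / len(row) for row in image]


def encoder1(features1):
    """Toy Encoder1: reduce the features to a two-value score."""
    total = sum(features1)
    return [total, 1.0 - total]


image = [[0.0, 1.0], [1.0, 1.0]]
features = run_encoder_stack([encoder0, encoder1], image)
print(features)
```

Keeping every intermediate output, rather than only the final prediction, is the point of the sketch: those intermediate features are what each GAN in the stack would later learn to imitate.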

The idea behind StackedGAN is that if we would...

Implementation of StackedGAN in Keras


The detailed network model of StackedGAN can be seen in the following figure. For conciseness, only two encoder-GANs per stack are shown. The figure may initially appear complex, but it is just a repetition of an encoder-GAN, meaning that once we understand how to train one encoder-GAN, the rest follow the same concept. In the following section, we assume that the StackedGAN is designed for MNIST digit generation:

Figure 6.2.2: A StackedGAN is made of a stack of encoder-GAN pairs. The encoder is pre-trained to perform classification. Generator1, G1, learns to synthesize f1f features conditioned on the fake label, yf, and latent code, z1f. Generator0, G0, produces fake images using both the fake features, f1f, and latent code, z0f.

StackedGAN starts with an Encoder. It could be a trained classifier that predicts the correct labels. The intermediate features vector, f1r, is made available for GAN training. For MNIST, we can use a CNN-based classifier similar...

Generator outputs of StackedGAN


After training the StackedGAN for 10,000 steps, the Generator0 and Generator1 models are saved to files. Stacked together, Generator0 and Generator1 can synthesize fake images conditioned on the label and the noise codes, z0 and z1.
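
The stacking described above is a two-stage data flow: Generator1 maps a label and z1 to fake features, and Generator0 maps those features and z0 to an image. The sketch below shows only that wiring. The two generator functions are stand-ins with made-up arithmetic and tiny sizes, not the saved Keras models, and all names are hypothetical.

```python
def generator1(label_one_hot, z1):
    """Stand-in for Generator1: label and noise code z1 -> fake features."""
    return [y + 0.1 * n for y, n in zip(label_one_hot, z1)]


def generator0(features1, z0):
    """Stand-in for Generator0: fake features and noise code z0 -> image."""
    return [[f + 0.1 * n for n in z0] for f in features1]


def stacked_generate(label_one_hot, z0, z1):
    """Chain the generators: (label, z1) -> features -> (features, z0) -> image."""
    fake_features = generator1(label_one_hot, z1)
    return generator0(fake_features, z0)


label = [0.0, 1.0, 0.0]      # toy 3-class one-hot label
z1 = [0.0, 0.0, 0.0]         # noise code for Generator1
z0 = [0.0, 0.0]              # noise code for Generator0
image = stacked_generate(label, z0, z1)
print(image)
```

Because the label enters only through the first stage, changing z0 alone alters the image without touching the features that encode the label.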

The StackedGAN generator can be qualitatively validated by:

  1. Varying the discrete labels from 0 to 9 with both noise codes, z0 and z1, sampled from a normal distribution with a mean of 0.5 and a standard deviation of 1.0. The results are shown in Figure 6.2.9. We can see that the StackedGAN discrete code can control the digits produced by the generator:

    python3 stackedgan-mnist-6.2.1.py \
        --generator0=stackedgan_mnist-gen0.h5 \
        --generator1=stackedgan_mnist-gen1.h5 --digit=0

    to

    python3 stackedgan-mnist-6.2.1.py \
        --generator0=stackedgan_mnist-gen0.h5 \
        --generator1=stackedgan_mnist-gen1.h5 --digit=9

  2. Varying the first noise code, z0, as a constant vector from -4.0 to 4.0 for digits 0 to 9, as follows. The second noise...
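
The two kinds of noise input used in these validations can be sketched with the standard library. The first helper draws a noise code from a normal distribution with a mean of 0.5 and a standard deviation of 1.0, as described in step 1; the second builds the constant vectors used when sweeping z0 from -4.0 to 4.0. The vector size of 4, the number of sweep steps, and the helper names are assumptions for illustration.

```python
import random


def sample_noise(size, rng, mean=0.5, stddev=1.0):
    """Draw a noise code from a normal distribution."""
    return [rng.gauss(mean, stddev) for _ in range(size)]


def constant_sweep(size, low=-4.0, high=4.0, steps=5):
    """Return `steps` constant vectors whose value runs from low to high."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    width = (high - low) / (steps - 1)
    return [[low + i * width] * size for i in range(steps)]


rng = random.Random(42)
z1 = sample_noise(4, rng)          # random code, as in step 1
z0_sweep = constant_sweep(4)       # constant codes, as in step 2
print(z0_sweep[0], z0_sweep[-1])
```

Holding one code constant across the whole vector while the other stays random makes it easier to attribute any visible change in the output to the code being swept.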

Conclusion


In this chapter, we've discussed how to disentangle the latent representations of GANs. Earlier in the chapter, we discussed how InfoGAN maximizes mutual information in order to force the generator to learn disentangled latent vectors. In the MNIST dataset example, InfoGAN uses three representations and a noise code as inputs. The noise represents the rest of the attributes in the form of an entangled representation. StackedGAN approaches the problem in a different way. It uses a stack of encoder-GANs to learn how to synthesize fake features and images. The encoder is first trained to provide a dataset of features. Then, the encoder-GANs are trained jointly to learn how to use the noise code to control attributes of the generator output.

In the next chapter, we will explore a new type of GAN that is able to generate new data in another domain. For example, given an image of a horse, the GAN can perform an automatic transformation to an image of a zebra. The interesting...

Reference


  1. Xi Chen and others. InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets. Advances in Neural Information Processing Systems, 2016 (http://papers.nips.cc/paper/6399-infogan-interpretable-representation-learning-by-information-maximizing-generative-adversarial-nets.pdf).

  2. Xun Huang and others. Stacked Generative Adversarial Networks. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Vol. 2, 2017 (http://openaccess.thecvf.com/content_cvpr_2017/papers/Huang_Stacked_Generative_Adversarial_CVPR_2017_paper.pdf).
