Principles of CycleGAN

Figure 7.1.1: Example of an aligned image pair: left, the original image; right, the transformed image using a Canny edge detector. Original photos were taken by the author.

Translating an image from one domain to another is a common task in computer vision, computer graphics, and image processing. The preceding figure shows edge detection, a common image translation task. In this example, we can consider the real photo (left) as an image in the source domain and the edge-detected photo (right) as a sample in the target domain. There are many other cross-domain translation procedures with practical applications, such as:

  • Satellite image to map
  • Face image to emoji, caricature, or anime
  • Body image to avatar
  • Colorization of grayscale photos
  • Medical scan to a real photo
  • Real photo to an artist's painting

There are many more examples of this in different fields. In computer vision and image processing, for example, we can perform the translation by inventing an algorithm...

The CycleGAN Model

Figure 7.1.3 shows the network model of the CycleGAN. The objective of the CycleGAN is to learn the function:

y' = G(x) (Equation 7.1.1)

that generates fake images, y', in the target domain as a function of the real source image, x. Learning is unsupervised, capitalizing only on the available real images, x, in the source domain and real images, y, in the target domain.

Unlike regular GANs, CycleGAN imposes the cycle-consistency constraint. The forward cycle-consistency network ensures that the real source data can be reconstructed from the fake target data:

x' = F(G(x)) (Equation 7.1.2)

This is done by minimizing the forward cycle-consistency L1 loss:

L_forward-cyc = E_x~p_data(x) [||F(G(x)) - x||_1] (Equation 7.1.3)
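As a rough sketch of how this forward cycle loss could be computed in Keras, assuming g and f are the two generator models (the function and variable names here are illustrative, not the book's exact code):

```python
import tensorflow as tf

# L1 (mean absolute error) between real source images and their
# reconstruction through the forward cycle x -> G(x) -> F(G(x)).
mae = tf.keras.losses.MeanAbsoluteError()

def forward_cycle_loss(x_real, g, f):
    """Forward cycle-consistency L1 loss (Equation 7.1.3)."""
    y_fake = g(x_real)           # translate source to target domain
    x_reconstructed = f(y_fake)  # translate back to the source domain
    return mae(x_real, x_reconstructed)
```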

The network is symmetric. The backward cycle-consistency network also attempts to reconstruct the real target data from the fake source data:

y' = G(F(y)) (Equation 7.1.4)

This is done by minimizing the backward cycle-consistency L1 loss:

L_backward-cyc = E_y~p_data(y) [||G(F(y)) - y||_1] (Equation 7.1.5)
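Putting the two directions together, here is a hedged sketch of the combined cycle-consistency loss; the weighting factor lam and its default value are illustrative assumptions, not the book's exact settings:

```python
import tensorflow as tf

mae = tf.keras.losses.MeanAbsoluteError()

def total_cycle_loss(x_real, y_real, g, f, lam=10.0):
    """Weighted sum of the forward and backward cycle-consistency
    L1 losses; g maps source -> target, f maps target -> source."""
    forward = mae(x_real, f(g(x_real)))   # Equation 7.1.3
    backward = mae(y_real, g(f(y_real)))  # Equation 7.1.5
    return lam * (forward + backward)
```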

Implementing CycleGAN using Keras

Let us tackle a simple problem that CycleGAN can address. In Chapter 3, Autoencoders, we used an autoencoder to colorize grayscale images from the CIFAR10 dataset. Recall that the CIFAR10 dataset is made up of 50,000 train and 10,000 test samples of 32 × 32 RGB images belonging to ten categories. We can convert all color images into grayscale using rgb2gray(RGB), as discussed in Chapter 3, Autoencoders.
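A minimal sketch of this preprocessing step, assuming the same luminance-weighted rgb2gray conversion used in Chapter 3:

```python
import numpy as np
from tensorflow.keras.datasets import cifar10

def rgb2gray(rgb):
    # luminance-weighted conversion (ITU-R BT.601 coefficients)
    return np.dot(rgb[..., :3], [0.299, 0.587, 0.114])

# load the 50,000 train and 10,000 test 32 x 32 RGB images
(x_train, _), (x_test, _) = cifar10.load_data()
x_train_gray = rgb2gray(x_train)  # source domain: grayscale
x_test_gray = rgb2gray(x_test)
```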

Following on from that, we can use the grayscale train images as source domain images and the original color images as the target domain images. It's worth noting that although the dataset is aligned, the input to our CycleGAN is a random sample of color images and a random sample of grayscale images; thus, our CycleGAN will not see the train data as aligned. After training, we'll use the test grayscale images to observe the performance of the CycleGAN. A quick sketch of this unaligned sampling follows.
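To make the unaligned-data point concrete, here is a small sketch (the function name is assumed, not from the book) of drawing independent random batches from each domain so that pairings are never preserved:

```python
import numpy as np

def sample_unaligned_batch(source_images, target_images, batch_size=32):
    """Sample each domain independently; the random indices differ,
    so the CycleGAN never sees source-target pairs as aligned."""
    src_idx = np.random.randint(0, len(source_images), batch_size)
    tgt_idx = np.random.randint(0, len(target_images), batch_size)
    return source_images[src_idx], target_images[tgt_idx]
```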

Figure 7.1.6: The forward cycle generator G implementation in Keras...
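Since the figure is truncated here, the following is a hedged, minimal encoder-decoder sketch of a generator in Keras; the layer choices are illustrative assumptions, and the book's full version is more elaborate:

```python
from tensorflow.keras.layers import Input, Conv2D, Conv2DTranspose
from tensorflow.keras.models import Model

def build_generator(input_shape=(32, 32, 1), output_channels=3):
    """Minimal encoder-decoder generator: grayscale in, color out."""
    inputs = Input(shape=input_shape)
    # encoder: downsample with strided convolutions
    x = Conv2D(64, 3, strides=2, padding='same', activation='relu')(inputs)
    x = Conv2D(128, 3, strides=2, padding='same', activation='relu')(x)
    # decoder: upsample back to the input resolution
    x = Conv2DTranspose(128, 3, strides=2, padding='same', activation='relu')(x)
    x = Conv2DTranspose(64, 3, strides=2, padding='same', activation='relu')(x)
    outputs = Conv2D(output_channels, 3, padding='same', activation='sigmoid')(x)
    return Model(inputs, outputs, name='forward_generator_g')
```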

Generator outputs of CycleGAN

Figure 7.1.9 shows the colorization results of CycleGAN. The source images are from the test dataset. For comparison, we show the ground truth and the colorization results from the plain autoencoder described in Chapter 3, Autoencoders. Generally, all of the colorized images are perceptually acceptable, although each colorization technique has its own pros and cons. None of the methods consistently recovers the right color of the sky and vehicles.

For example, the sky in the background of the plane (3rd row, 2nd column) is white. The autoencoder got it right, but the CycleGAN thinks it is light brown or blue. In the 6th row, 6th column, the boat on the dark sea had an overcast sky, but it was colorized with a blue sky and blue sea by the autoencoder, and with a blue sea and white sky by the CycleGAN without PatchGAN. Both predictions make sense in the real world. Meanwhile, the prediction of the CycleGAN with PatchGAN is similar to the ground truth. In the 2nd to...

Conclusion

In this chapter, we've discussed CycleGAN as an algorithm that can be used for image translation. In CycleGAN, the source and target data are not necessarily aligned. We demonstrated two examples, grayscale ↔ color and MNIST ↔ SVHN, though there are many other possible image translations that CycleGAN can perform.

In the next chapter, we'll embark on another type of generative model: Variational AutoEncoders (VAEs). VAEs have a similar objective of learning how to generate new images (data). They focus on learning a latent vector modeled as a Gaussian distribution. We'll demonstrate other similarities to the problems addressed by GANs, in the form of conditional VAEs and the disentangling of latent representations in VAEs.

References

  1. Yuval Netzer and others. Reading Digits in Natural Images with Unsupervised Feature Learning. NIPS Workshop on Deep Learning and Unsupervised Feature Learning. Vol. 2011. No. 2. 2011 (https://www-cs.stanford.edu/~twangcat/papers/nips2011_housenumbers.pdf).
  2. Jun-Yan Zhu and others. Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks. 2017 IEEE International Conference on Computer Vision (ICCV). IEEE, 2017 (http://openaccess.thecvf.com/content_ICCV_2017/papers/Zhu_Unpaired_Image-To-Image_Translation_ICCV_2017_paper.pdf).
  3. Phillip Isola and others. Image-to-Image Translation with Conditional Adversarial Networks. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2017 (http://openaccess.thecvf.com/content_cvpr_2017/papers/Isola_Image-To-Image_Translation_With_CVPR_2017_paper.pdf).
  4. Mehdi Mirza and Simon Osindero. Conditional Generative Adversarial Nets. arXiv preprint arXiv:1411.1784, 2014 (https://arxiv.org/pdf/1411.1784.pdf).
  5. Xudong...