
Chapter 6: AI Painter

In this chapter, we are going to look at two generative adversarial networks (GANs) that can be used to generate and edit images interactively: iGAN and GauGAN. iGAN (interactive GAN) was the first network to demonstrate how GANs could be used for interactive image editing and transformation, back in 2016. As GANs were still in their infancy at that time, the quality of the generated images was not as impressive as that of today's networks, but it opened the door to incorporating GANs into mainstream image editing.

In this chapter, you will be introduced to the concepts behind iGAN and to some websites that feature video demonstrations of it; there won't be any code in that section. Then, we will go over a more recent award-winning application called GauGAN, produced by Nvidia in 2019, which gives impressive results when converting semantic segmentation masks into realistic landscape photos.

We will implement GauGAN from scratch, starting with a new normalization...

Technical requirements

The relevant Jupyter notebooks and code can be found here:

https://github.com/PacktPublishing/Hands-On-Image-Generation-with-TensorFlow-2.0/tree/master/Chapter06

The notebook used in this chapter is ch6_gaugan.ipynb.

Introduction to iGAN

We are now familiar with using generative models such as pix2pix (see Chapter 4, Image-to-Image Translation) to generate images from sketches or segmentation masks. However, as most of us are not skilled artists, we can only draw simple sketches, and as a result, our generated images also have simple shapes. What if we could use a real image as input and use sketches to change its appearance?

In the early days of GANs, a paper titled Generative Visual Manipulation on the Natural Image Manifold by J-Y. Zhu (the inventor of CycleGAN) et al. explored how to use a learned latent representation to perform image editing and morphing. The authors made a website, http://efrosgans.eecs.berkeley.edu/iGAN/, that contains videos demonstrating use cases such as the following:

  • Interactive image generation: This involves generating images from sketches in real time, as shown here:
Figure 6.1 – Interactive image generation, where an image is generated only from simple brush strokes (Source: J-Y. Zhu et al., 2016, "Generative Visual Manipulation on the Natural Image Manifold", https://arxiv.org/abs/1609.03552)


Segmentation map-to-image translation with GauGAN

GauGAN (named after the 19th-century painter Paul Gauguin) is a GAN from Nvidia. Speaking of Nvidia, it is one of the handful of companies that have invested heavily in GANs. They have achieved several breakthroughs in this space, including ProgressiveGAN (which we'll cover in Chapter 7, High Fidelity Face Generation) for generating high-resolution images, and StyleGAN for generating high-fidelity faces.

Their main business is making graphics chips rather than AI software. Therefore, unlike some other companies, which keep their code and trained models as closely guarded secrets, Nvidia tends to open source its software code to the general public. They have built a web page (http://nvidia-research-mingyuliu.com/gaugan/) to showcase GauGAN, which can generate photorealistic landscape photos from segmentation maps. The following screenshot is taken from their web page.

Feel free to pause reading this chapter for a bit and have a play with the application...

Summary

Using AI in image editing is already prevalent, and it all started at around the time that iGAN was introduced. We learned that the key principle of iGAN is to first project an image onto the natural image manifold and then perform editing directly on that manifold. The edits are then optimized over the latent variables to generate a natural-looking edited image. This is in contrast with previous methods, which could only change generated images indirectly by manipulating latent variables.
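To make the projection step more concrete, here is a minimal sketch (not the book's or the paper's implementation) of projecting an image onto a GAN's manifold by optimizing a latent vector against a frozen, pretrained generator. The `generator` model, the latent size, and the pixel-wise loss are illustrative assumptions:

```python
import tensorflow as tf

def project_to_manifold(generator, target_image, latent_dim=100,
                        steps=500, learning_rate=0.05):
    # `generator` is assumed to be a pretrained, frozen Keras model that maps
    # a (1, latent_dim) latent vector to an image in the same value range as
    # `target_image` (for example, [-1, 1]).
    z = tf.Variable(tf.random.normal([1, latent_dim]))
    optimizer = tf.keras.optimizers.Adam(learning_rate)

    for _ in range(steps):
        with tf.GradientTape() as tape:
            generated = generator(z, training=False)
            # Simple pixel-wise reconstruction loss; iGAN also uses
            # feature-space losses, which are omitted here for brevity.
            loss = tf.reduce_mean(tf.square(generated - target_image))
        grads = tape.gradient(loss, [z])
        optimizer.apply_gradients(zip(grads, [z]))

    # Edits can now be applied by further constraining and optimizing z,
    # and the generator decodes the result back into a natural-looking image.
    return z
```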

GauGAN incorporates many advanced techniques to generate crisp images from semantic segmentation masks. These include the use of hinge loss and feature matching loss. However, the key ingredient is SPADE, which provides superior performance when using a segmentation mask as input. SPADE conditions the normalization on the local segmentation map to preserve its semantic meaning, which helps us to produce high-quality images. So far, we have been using images with up to 256x256 resolution to train...
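As a rough illustration of the SPADE idea (a sketch, not the exact layer we build in this chapter), the block below normalizes the activations without learned affine parameters and then re-modulates them with a per-pixel scale and bias predicted from the resized segmentation map; the hidden width and layer names are placeholders:

```python
import tensorflow as tf
from tensorflow.keras import layers

class SPADE(layers.Layer):
    """Spatially-adaptive normalization: gamma and beta are predicted
    per pixel from the segmentation map rather than learned as scalars."""

    def __init__(self, channels, hidden=128, **kwargs):
        super().__init__(**kwargs)
        # Parameter-free normalization; the modulation comes from the mask.
        self.norm = layers.BatchNormalization(center=False, scale=False)
        self.conv_shared = layers.Conv2D(hidden, 3, padding='same',
                                         activation='relu')
        self.conv_gamma = layers.Conv2D(channels, 3, padding='same')
        self.conv_beta = layers.Conv2D(channels, 3, padding='same')

    def call(self, x, mask):
        # Resize the segmentation mask to the spatial size of the activations.
        mask = tf.image.resize(mask, tf.shape(x)[1:3], method='nearest')
        shared = self.conv_shared(mask)
        gamma = self.conv_gamma(shared)
        beta = self.conv_beta(shared)
        # Spatially-varying modulation preserves the layout of the mask.
        return self.norm(x) * (1.0 + gamma) + beta
```

Inside a generator block, such a layer would be applied as, for example, `x = SPADE(channels=256)(x, segmentation_map)`, so that every resolution of the generator is re-injected with the semantic layout.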

