You're reading from Hands-On Image Generation with TensorFlow
1st Edition, published Dec 2020 by Packt
Reading level: Intermediate. ISBN-13: 9781838826789

Author: Soon Yau Cheong

Soon Yau Cheong is an AI consultant and the founder of Sooner.ai Ltd. With a history of being associated with industry giants such as NVIDIA and Qualcomm, he provides consultation in the various domains of AI, such as deep learning, computer vision, natural language processing, and big data analytics. He was awarded a full scholarship to study for his PhD at the University of Bristol while working as a teaching assistant. He is also a mentor for AI courses with Udacity.

Chapter 4: Image-to-Image Translation

In part one of the book, we learned to generate photorealistic images with VAEs and GANs. These generative models can turn simple random noise into high-dimensional images with complex distributions! However, the generation process is unconditional, and we have no fine control over the images to be generated. If we use MNIST as an example, we will not know which digit will be generated; it is a bit of a lottery. Wouldn't it be nice to be able to tell the GAN what we want it to generate? That is what we will learn in this chapter.

We will first learn to build a conditional GAN (cGAN) that allows us to specify the class of images to generate. This lays the foundation for the more complex networks that follow. We will then learn to build a GAN known as pix2pix to perform image-to-image translation, or image translation for short. This enables many cool applications, such as converting sketches into real images. After that, we will build CycleGAN...

Technical requirements

The Jupyter notebooks can be found at the following link:

https://github.com/PacktPublishing/Hands-On-Image-Generation-with-TensorFlow-2.0/tree/master/Chapter04

The notebooks used in this chapter are as follows:

  • ch4_cdcgan_mnist.ipynb
  • ch4_cdcgan_fashion_mnist.ipynb
  • ch4_pix2pix.ipynb
  • ch4_cyclegan_facade.ipynb
  • ch4_cyclegan_horse2zebra.ipynb
  • ch4_bicycle_gan.ipynb

Conditional GANs

The first goal of a generative model is to produce good-quality images. The next is to have some control over the images that are generated.

In Chapter 1, Getting Started with Image Generation Using TensorFlow, we learned about conditional probability and generated faces with certain attributes using a simple conditional probabilistic model. In that model, we generated a smiling face by forcing the model to sample only from images that had a smiling face. When we condition on something, that thing will always be present; it is no longer a variable with random probability. Put another way, the probability of the condition occurring is set to 1.

Enforcing a condition on a neural network is simple: we just need to show the labels to the network during training and inference. For example, if we want the generator to generate the digit 1, we will need to present the label of 1 in addition to the usual random...
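As a rough sketch of this idea, here is a framework-free illustration in NumPy of concatenating a one-hot label to the noise vector before it enters the generator. The 100-dimensional noise and 10 classes mirror the MNIST setup, but the function name and shapes are assumptions for illustration only:

```python
import numpy as np

def make_generator_input(noise, labels, num_classes=10):
    """Append a one-hot encoded class label to each noise vector."""
    one_hot = np.eye(num_classes)[labels]              # (batch, num_classes)
    return np.concatenate([noise, one_hot], axis=-1)   # (batch, noise_dim + num_classes)

noise = np.random.normal(size=(4, 100))   # a batch of 4 noise vectors
labels = np.array([1, 1, 1, 1])           # ask the generator for the digit 1
gen_input = make_generator_input(noise, labels)
print(gen_input.shape)                    # (4, 110)
```

In a real model, this combined vector simply replaces the plain noise vector as the generator's input; the discriminator receives the same label so it can judge whether the image matches the condition.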

Image translation with pix2pix

The introduction of pix2pix in 2017 caused quite a stir, not only within the research community but also among the wider population. This can be attributed in part to the https://affinelayer.com/pixsrv/ website, which put the models online and allows people to translate their sketches into cats, shoes, and bags. You should try it too! The following screenshot is taken from their website to give you a glimpse of how it works:

Figure 4.8 – Application of turning a sketch of a cat into a real image (Source: https://affinelayer.com/pixsrv/)

Pix2pix came from a research paper entitled Image-to-Image Translation with Conditional Adversarial Networks. From the paper title, we can tell that pix2pix is a conditional GAN that performs image-to-image translation. The model can be trained to perform general image translation, but we will need to have image pairs in the dataset. In our pix2pix implementation, we will translate masks of...

Unpaired image translation with CycleGAN

CycleGAN was created by the same research group that invented pix2pix. CycleGAN can train with unpaired images by using two generators and two discriminators. Because it uses pix2pix as a foundation, CycleGAN is actually quite simple to implement once you understand how the cycle consistency loss works. Before that, let's look at the advantages of CycleGAN over pix2pix in the following sections.
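As a preview, the cycle consistency loss can be sketched in a few lines of NumPy. The idea is that translating A to B and back to A should reconstruct the original image, measured with an L1 distance. The generator names `g_ab` and `g_ba` and the weight of 10 follow common CycleGAN conventions; the identity "generators" below are purely a toy check:

```python
import numpy as np

def l1_loss(a, b):
    return np.mean(np.abs(a - b))

def cycle_consistency_loss(real_a, real_b, g_ab, g_ba, lam=10.0):
    """L1 cycle loss: A -> B -> A and B -> A -> B should reconstruct the input."""
    cycled_a = g_ba(g_ab(real_a))   # A -> fake B -> back to A
    cycled_b = g_ab(g_ba(real_b))   # B -> fake A -> back to B
    return lam * (l1_loss(real_a, cycled_a) + l1_loss(real_b, cycled_b))

# Toy check with identity "generators": a perfect cycle gives zero loss.
identity = lambda x: x
a = np.random.rand(2, 8, 8, 3)
b = np.random.rand(2, 8, 8, 3)
print(cycle_consistency_loss(a, b, identity, identity))  # 0.0
```

Because this loss never compares a translated image against a paired target, only against the original input after a round trip, no paired dataset is needed.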

Unpaired dataset

One drawback of pix2pix is that it requires a paired training dataset. For some applications, we can create such a dataset rather easily. A grayscale-to-color image dataset (or vice versa) is probably the simplest to create, using any image processing library such as OpenCV or Pillow. Similarly, we can easily create sketches from real images using edge detection techniques. For a photo-to-artistic-painting dataset, we can use neural style transfer (we'll cover this in Chapter 5, Style Transfer) to...
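To illustrate how easy a grayscale-to-color pair is to create, here is a minimal NumPy sketch using the standard ITU-R BT.601 luminance weights (no OpenCV or Pillow required; the function names are hypothetical):

```python
import numpy as np

def to_grayscale(rgb):
    """Convert an RGB image array (H, W, 3) in [0, 1] to grayscale
    using the standard ITU-R BT.601 luminance weights."""
    weights = np.array([0.299, 0.587, 0.114])
    return rgb @ weights

def make_pair(color_image):
    """Return a (grayscale input, color target) training pair."""
    return to_grayscale(color_image), color_image

color = np.random.rand(256, 256, 3)
gray, target = make_pair(color)
print(gray.shape, target.shape)   # (256, 256) (256, 256, 3)
```

Running this over a folder of color photos yields a paired dataset for a colorization model: the grayscale image is the input and the original color image is the target.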

Diversifying translation with BicycleGAN

Both pix2pix and CycleGAN came from the Berkeley AI Research (BAIR) laboratory at UC Berkeley. They are popular and have a number of tutorials and blog posts about them online, including on the official TensorFlow site. BicycleGAN is what I see as the last of the image-to-image translation trilogy from that research group. However, you won't find a lot of example code for it online, perhaps due to its complexity.

To build the most advanced network in this book so far, we will bring together all the knowledge you have acquired in this chapter and the previous two chapters. Maybe that is why BicycleGAN is regarded by many as advanced. Don't worry; you already have all the prerequisite knowledge. Let's jump in!

Understanding the architecture

Before jumping straight into implementation, let me give you an overview of BicycleGAN. From the name, you may naturally think that BicycleGAN is an upgrade of CycleGAN by adding another cycle (from...

Summary

We began this chapter by learning how a basic cGAN enforces the class label as a condition to generate MNIST digits. We implemented two different ways of injecting the condition: one is to one-hot encode the class labels into a dense layer, reshape them to match the channel dimensions of the input noise, and concatenate them together; the other is to use an embedding layer with element-wise multiplication.
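The second method (an embedding lookup combined with element-wise multiplication) can be sketched as follows. In a real model the embedding table would be a trainable layer, so the random table here is purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
num_classes, noise_dim = 10, 100

# In a real model this would be a trainable embedding layer;
# a random table stands in for it here.
embedding_table = rng.normal(size=(num_classes, noise_dim))

def condition_by_embedding(noise, labels):
    """Look up a label embedding the same size as the noise vector
    and combine the two by element-wise multiplication."""
    label_embeddings = embedding_table[labels]   # (batch, noise_dim)
    return noise * label_embeddings

noise = rng.normal(size=(4, noise_dim))
out = condition_by_embedding(noise, np.array([3, 3, 7, 7]))
print(out.shape)   # (4, 100)
```

Unlike concatenation, this keeps the generator input the same size as the noise vector, with the label instead modulating each noise dimension.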

Next, we learned to implement pix2pix, a special type of conditional GAN for image-to-image translation. It uses PatchGAN as the discriminator, which looks at patches of the image to encourage fine details, or high-frequency components, in the generated image. We also learned about a popular network architecture, U-Net, that has been used for various applications. Although pix2pix can generate high-quality image translations, the translation is a one-to-one mapping with no diversity in the output, due to the removal of the input noise. This is overcome by BicycleGAN, which...
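As a recap of the pix2pix objective, here is a simplified NumPy sketch of the generator loss: an adversarial term computed over the PatchGAN output grid, plus a lambda-weighted L1 reconstruction term (lambda = 100 as in the paper). The 30x30 patch grid is the output size the paper reports for 256x256 inputs; the function name is hypothetical:

```python
import numpy as np

def pix2pix_generator_loss(disc_patch_probs, generated, target, lam=100.0):
    """Generator loss: push every discriminator patch towards 'real' (1)
    via binary cross-entropy, plus a lambda-weighted L1 term that keeps
    the output close to the paired target image.

    disc_patch_probs: PatchGAN outputs on the generated image, in (0, 1)
    """
    adv = -np.mean(np.log(disc_patch_probs + 1e-8))   # BCE with label 1
    l1 = np.mean(np.abs(target - generated))
    return adv + lam * l1

patches = np.full((1, 30, 30, 1), 0.5)   # an undecided 30x30 PatchGAN grid
gen = np.zeros((1, 256, 256, 3))
tgt = np.zeros((1, 256, 256, 3))
print(pix2pix_generator_loss(patches, gen, tgt))   # ~0.6931: the L1 term is zero here
```

The heavy L1 weighting is what keeps pix2pix outputs faithful to the target, but it is also part of why the mapping collapses to one-to-one, which motivates BicycleGAN.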
