You're reading from Hands-On Image Generation with TensorFlow
1st Edition, published Dec 2020 by Packt
Reading level: Intermediate. ISBN-13: 9781838826789

Author: Soon Yau Cheong

Soon Yau Cheong is an AI consultant and the founder of Sooner.ai Ltd. With a history of being associated with industry giants such as NVIDIA and Qualcomm, he provides consultation in the various domains of AI, such as deep learning, computer vision, natural language processing, and big data analytics. He was awarded a full scholarship to study for his PhD at the University of Bristol while working as a teaching assistant. He is also a mentor for AI courses with Udacity.

Chapter 4: Image-to-Image Translation

In part one of the book, we learned to generate photorealistic images with VAEs and GANs. These generative models can turn simple random noise into high-dimensional images with complex distributions! However, the generation process is unconditional, and we have no fine control over the images to be generated. If we use MNIST as an example, we will not know which digit will be generated; it is a bit of a lottery. Wouldn't it be nice to be able to tell the GAN what we want it to generate? That is what we will learn in this chapter.

We will first learn to build a conditional GAN (cGAN) that allows us to specify the class of images to generate. This lays the foundation for the more complex networks that follow. We will then learn to build a GAN known as pix2pix to perform image-to-image translation, or image translation for short. This enables many cool applications, such as converting sketches into real images. After that, we will build CycleGAN...

Technical requirements

The Jupyter notebooks can be found at the following link:

https://github.com/PacktPublishing/Hands-On-Image-Generation-with-TensorFlow-2.0/tree/master/Chapter04

The notebooks used in this chapter are as follows:

  • ch4_cdcgan_mnist.ipynb
  • ch4_cdcgan_fashion_mnist.ipynb
  • ch4_pix2pix.ipynb
  • ch4_cyclegan_facade.ipynb
  • ch4_cyclegan_horse2zebra.ipynb
  • ch4_bicycle_gan.ipynb

Conditional GANs

The first goal of a generative model is to produce good-quality images. The next is to have some control over the images that are generated.

In Chapter 1, Getting Started with Image Generation Using TensorFlow, we learned about conditional probability and generated faces with certain attributes using a simple conditional probabilistic model. In that model, we generated a smiling face by forcing the model to sample only from images that had a smiling face. When we condition on something, that thing will always be present; it is no longer a variable with random probability. Put another way, the probability of the condition occurring is set to 1.

Enforcing a condition on a neural network is simple: we just need to show the labels to the network during training and inference. For example, if we want the generator to generate the digit 1, we will need to present the label of 1 in addition to the usual random...
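As a rough sketch of this idea, here is a framework-free illustration in NumPy of concatenating a one-hot label to the noise vector before it enters the generator. The 100-dimensional noise and 10 classes mirror the MNIST setup, but the function name and shapes are assumptions for illustration only:

```python
import numpy as np

def make_generator_input(noise, labels, num_classes=10):
    """Append a one-hot encoded class label to each noise vector."""
    one_hot = np.eye(num_classes)[labels]              # (batch, num_classes)
    return np.concatenate([noise, one_hot], axis=-1)   # (batch, noise_dim + num_classes)

noise = np.random.normal(size=(4, 100))   # a batch of 4 noise vectors
labels = np.array([1, 1, 1, 1])           # ask the generator for the digit 1
gen_input = make_generator_input(noise, labels)
print(gen_input.shape)                    # (4, 110)
```

In a real model, this combined vector simply replaces the plain noise vector as the generator's input; the discriminator receives the same label so it can judge whether the image matches the condition.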

Image translation with pix2pix

The introduction of pix2pix in 2017 caused quite a stir, not only within the research community but also among the wider population. This can be attributed in part to the https://affinelayer.com/pixsrv/ website, which put the models online and allows people to translate their sketches into cats, shoes, and bags. You should try it too! The following screenshot is taken from their website to give you a glimpse of how it works:

Figure 4.8 – Application of turning a sketch of a cat into a real image (Source: https://affinelayer.com/pixsrv/)

Pix2pix came from a research paper entitled Image-to-Image Translation with Conditional Adversarial Networks. From the paper title, we can tell that pix2pix is a conditional GAN that performs image-to-image translation. The model can be trained to perform general image translation, but we will need to have image pairs in the dataset. In our pix2pix implementation, we will translate masks of...

Unpaired image translation with CycleGAN

CycleGAN was created by the same research group that invented pix2pix. CycleGAN can train with unpaired images by using two generators and two discriminators. Because it uses pix2pix as a foundation, CycleGAN is actually quite simple to implement once you understand how the cycle consistency loss works. Before that, let's look at the advantages of CycleGAN over pix2pix in the following sections.
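As a preview, the cycle consistency loss can be sketched in a few lines of NumPy. The idea is that translating A to B and back to A should reconstruct the original image, measured with an L1 distance. The generator names `g_ab` and `g_ba` and the weight of 10 follow common CycleGAN conventions; the identity "generators" below are purely a toy check:

```python
import numpy as np

def l1_loss(a, b):
    return np.mean(np.abs(a - b))

def cycle_consistency_loss(real_a, real_b, g_ab, g_ba, lam=10.0):
    """L1 cycle loss: A -> B -> A and B -> A -> B should reconstruct the input."""
    cycled_a = g_ba(g_ab(real_a))   # A -> fake B -> back to A
    cycled_b = g_ab(g_ba(real_b))   # B -> fake A -> back to B
    return lam * (l1_loss(real_a, cycled_a) + l1_loss(real_b, cycled_b))

# Toy check with identity "generators": a perfect cycle gives zero loss.
identity = lambda x: x
a = np.random.rand(2, 8, 8, 3)
b = np.random.rand(2, 8, 8, 3)
print(cycle_consistency_loss(a, b, identity, identity))  # 0.0
```

Because this loss never compares a translated image against a paired target, only against the original input after a round trip, no paired dataset is needed.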

Unpaired dataset

One drawback of pix2pix is that it requires a paired training dataset. For some applications, we can create such a dataset rather easily. A grayscale-to-color image dataset (or vice versa) is probably the simplest to create, using any image processing library such as OpenCV or Pillow. Similarly, we can easily create sketches from real images using edge detection techniques. For a photo-to-artistic-painting dataset, we can use neural style transfer (we'll cover this in Chapter 5, Style Transfer) to...
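To illustrate how easy a grayscale-to-color pair is to create, here is a minimal NumPy sketch using the standard ITU-R BT.601 luminance weights (no OpenCV or Pillow required; the function names are hypothetical):

```python
import numpy as np

def to_grayscale(rgb):
    """Convert an RGB image array (H, W, 3) in [0, 1] to grayscale
    using the standard ITU-R BT.601 luminance weights."""
    weights = np.array([0.299, 0.587, 0.114])
    return rgb @ weights

def make_pair(color_image):
    """Return a (grayscale input, color target) training pair."""
    return to_grayscale(color_image), color_image

color = np.random.rand(256, 256, 3)
gray, target = make_pair(color)
print(gray.shape, target.shape)   # (256, 256) (256, 256, 3)
```

Running this over a folder of color photos yields a paired dataset for a colorization model: the grayscale image is the input and the original color image is the target.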

Diversifying translation with BicycleGAN

Both pix2pix and CycleGAN came from the Berkeley AI Research (BAIR) laboratory at UC Berkeley. They are popular and have a number of tutorials and blog posts about them online, including on the official TensorFlow site. BicycleGAN is what I see as the last of the image-to-image translation trilogy from that research group. However, you won't find a lot of example code for it online, perhaps due to its complexity.

To build the most advanced network in this book so far, we will bring together all the knowledge you have acquired in this chapter and the previous two chapters. Maybe that is why BicycleGAN is regarded by many as advanced. Don't worry; you already have all the prerequisite knowledge. Let's jump in!

Understanding the architecture

Before jumping straight into implementation, let me give you an overview of BicycleGAN. From the name, you may naturally think that BicycleGAN is an upgrade of CycleGAN by adding another cycle (from...

Summary

We began this chapter by learning how a basic cGAN enforces the class label as a condition to generate MNIST digits. We implemented two different ways of injecting the condition: one is to one-hot encode the class labels into a dense layer, reshape them to match the channel dimensions of the input noise, and concatenate them together; the other is to use an embedding layer with element-wise multiplication.
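The second method (an embedding lookup combined with element-wise multiplication) can be sketched as follows. In a real model the embedding table would be a trainable layer, so the random table here is purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
num_classes, noise_dim = 10, 100

# In a real model this would be a trainable embedding layer;
# a random table stands in for it here.
embedding_table = rng.normal(size=(num_classes, noise_dim))

def condition_by_embedding(noise, labels):
    """Look up a label embedding the same size as the noise vector
    and combine the two by element-wise multiplication."""
    label_embeddings = embedding_table[labels]   # (batch, noise_dim)
    return noise * label_embeddings

noise = rng.normal(size=(4, noise_dim))
out = condition_by_embedding(noise, np.array([3, 3, 7, 7]))
print(out.shape)   # (4, 100)
```

Unlike concatenation, this keeps the generator input the same size as the noise vector, with the label instead modulating each noise dimension.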

Next, we learned to implement pix2pix, a special type of conditional GAN for image-to-image translation. It uses PatchGAN as the discriminator, which looks at patches of the image to encourage fine details, or high-frequency components, in the generated image. We also learned about a popular network architecture, U-Net, that has been used for various applications. Although pix2pix can generate high-quality image translations, the translation is a one-to-one mapping with no diversity in the output, due to the removal of the input noise. This is overcome by BicycleGAN, which...
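As a recap of the pix2pix objective, here is a simplified NumPy sketch of the generator loss: an adversarial term computed over the PatchGAN output grid, plus a lambda-weighted L1 reconstruction term (lambda = 100 as in the paper). The 30x30 patch grid is the output size the paper reports for 256x256 inputs; the function name is hypothetical:

```python
import numpy as np

def pix2pix_generator_loss(disc_patch_probs, generated, target, lam=100.0):
    """Generator loss: push every discriminator patch towards 'real' (1)
    via binary cross-entropy, plus a lambda-weighted L1 term that keeps
    the output close to the paired target image.

    disc_patch_probs: PatchGAN outputs on the generated image, in (0, 1)
    """
    adv = -np.mean(np.log(disc_patch_probs + 1e-8))   # BCE with label 1
    l1 = np.mean(np.abs(target - generated))
    return adv + lam * l1

patches = np.full((1, 30, 30, 1), 0.5)   # an undecided 30x30 PatchGAN grid
gen = np.zeros((1, 256, 256, 3))
tgt = np.zeros((1, 256, 256, 3))
print(pix2pix_generator_loss(patches, gen, tgt))   # ~0.6931: the L1 term is zero here
```

The heavy L1 weighting is what keeps pix2pix outputs faithful to the target, but it is also part of why the mapping collapses to one-to-one, which motivates BicycleGAN.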
