StackGAN - Text to Photo-Realistic Image Synthesis

Text-to-image synthesis is one of the use cases for Generative Adversarial Networks (GANs) that has many industrial applications, just like the GANs described in previous chapters. Synthesizing images from text descriptions is hard, because it is difficult to build a model that generates images that accurately reflect the meaning of the text. One network that tries to solve this problem is StackGAN. In this chapter, we will implement a StackGAN in the Keras framework, using TensorFlow as the backend.

In this chapter, we will cover the following topics:

  • Introduction to StackGAN
  • The architecture of StackGAN
  • Data gathering and preparation
  • A Keras implementation of StackGAN
  • Training a StackGAN
  • Evaluating the model
  • Practical applications of StackGAN

Introduction to StackGAN

A StackGAN is named as such because it has two GANs that are stacked together to form a network that is capable of generating high-resolution images. It has two stages, Stage-I and Stage-II. The Stage-I network generates low-resolution images with basic colors and rough sketches, conditioned on a text embedding, while the Stage-II network takes the image generated by the Stage-I network and generates a high-resolution image that is conditioned on a text embedding. Basically, the second network corrects defects and adds compelling details, yielding a more realistic high-resolution image.

We can compare a StackGAN network to the work of a painter. As a painter starts working, they draw primitive shapes such as lines, circles, and rectangles. Then, they try to fill in the colors. As the painting progresses, more and more detail is added. In a StackGAN, Stage...

Architecture of StackGAN

StackGAN is a two-stage network, with one generator and one discriminator per stage, so two generators and two discriminators in total. StackGAN is made up of several networks, which are as follows:

  • Stage-I GAN: text encoder, Conditioning Augmentation network, generator network, discriminator network, embedding compressor network
  • Stage-II GAN: text encoder, Conditioning Augmentation network, generator network, discriminator network, embedding compressor network
Figure: The two-stage architecture of StackGAN (source: arXiv:1612.03242 [cs.CV])

The preceding figure represents both stages of the StackGAN network. As you can see, the first stage generates images with dimensions of 64x64. The second stage then takes these low-resolution images and generates high-resolution images with dimensions of 256x256. In the next few sections, we will explore the different components of the StackGAN network. Before doing this, however, let...
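One of the components listed above, the Conditioning Augmentation network, deserves a closer look: instead of using the fixed text embedding directly, it samples a conditioning variable from a Gaussian distribution whose mean and log standard deviation are predicted from the embedding. The following is a minimal Keras sketch of such a block, assuming a 1024-dimensional char-CNN-RNN embedding and the 128-dimensional conditioning variable used later in this chapter; the function and layer arrangement are illustrative, not the book's exact code:

from keras import backend as K
from keras.layers import Input, Dense, Lambda, LeakyReLU
from keras.models import Model

embedding_dim = 1024   # size of the char-CNN-RNN text embedding (assumed)
condition_dim = 128    # size of the conditioning variable

def sample_c(args):
    # Reparameterization trick: c = mu + exp(log_sigma) * epsilon
    mu, log_sigma = args
    epsilon = K.random_normal(shape=K.shape(mu))
    return mu + K.exp(log_sigma) * epsilon

def build_ca_network():
    embedding = Input(shape=(embedding_dim,))
    # Predict the mean and log standard deviation from the text embedding
    x = Dense(condition_dim * 2)(embedding)
    x = LeakyReLU(alpha=0.2)(x)
    mu = Lambda(lambda t: t[:, :condition_dim])(x)
    log_sigma = Lambda(lambda t: t[:, condition_dim:])(x)
    c = Lambda(sample_c)([mu, log_sigma])
    # mu and log_sigma are returned as well so that a KL-divergence
    # regularizer can be added to the generator loss during training
    return Model(inputs=embedding, outputs=[c, mu, log_sigma])

The KL-divergence term keeps the predicted Gaussian close to a standard normal distribution, which smooths the conditioning manifold and helps the generator cope with the small amount of text-image training data.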

Setting up the project

If you haven't already cloned the repository with the complete code for all chapters, clone the repository now. The downloaded code has a directory called Chapter06, which contains the entire code for this chapter. Execute the following commands to set up the project:

  1. Start by navigating to the parent directory as follows:
cd Generative-Adversarial-Networks-Projects
  2. Now, change the directory from the current directory to Chapter06:
cd Chapter06
  3. Next, create a Python virtual environment for this project:
virtualenv venv
virtualenv venv -p python3 # Create a virtual environment using the python3 interpreter
virtualenv venv -p python2 # Create a virtual environment using the python2 interpreter

We will be using this newly created virtual environment for this project. Each chapter has its own separate virtual environment.

  4. Activate the...

Data preparation

In this section, we will be working with the CUB dataset, which is an image dataset of different bird species and can be found at the following link: http://www.vision.caltech.edu/visipedia/CUB-200-2011.html. The CUB dataset contains 11,788 high-resolution images. We will also need the char-CNN-RNN text embeddings, which can be found at the following link: https://drive.google.com/open?id=0B3y_msrWZaXLT1BZdVdycDY5TEE. These are pretrained text embeddings. Follow the instructions given in the next few sections to download and extract the dataset.
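Once the embeddings have been downloaded and extracted, they can be loaded from their pickle files. The following is a minimal sketch, assuming each split contains a char-CNN-RNN-embeddings.pickle file and a filenames.pickle file; these file names, and the latin1 encoding used when reading Python 2 pickles under Python 3, are assumptions rather than instructions from the book:

import os
import pickle

def load_embeddings(data_dir):
    # Return the text embeddings and the matching image file names
    embeddings_path = os.path.join(data_dir, "char-CNN-RNN-embeddings.pickle")
    filenames_path = os.path.join(data_dir, "filenames.pickle")
    with open(embeddings_path, "rb") as f:
        embeddings = pickle.load(f, encoding="latin1")
    with open(filenames_path, "rb") as f:
        filenames = pickle.load(f, encoding="latin1")
    return embeddings, filenames

# Example usage, following the Data/birds directory layout used later on:
# train_embeddings, train_filenames = load_embeddings("Data/birds/train")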

Downloading the dataset

A Keras implementation of StackGAN

The Keras implementation of StackGAN is divided into two parts: Stage-I and Stage-II. We will implement these stages in the following sections.

Stage-I

A Stage-I StackGAN contains both a generator network and a discriminator network. It also has a text encoder network and a Conditioning Augmentation network (CA network), which are explained in detail in the following section. The generator network gets the text-conditioning variable (ĉ), along with a noise vector (z). After a set of upsampling layers, it produces a low-resolution image with dimensions of 64x64x3. The discriminator network takes this low-resolution image and tries to identify whether the image is real or fake. The generator...
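To make this data flow concrete, here is a minimal Keras sketch of a Stage-I generator: it concatenates the conditioning variable and the noise vector, projects them to a small spatial feature map, and upsamples four times to reach 64x64x3. The filter counts and layer arrangement are illustrative assumptions, not the book's exact architecture:

from keras.layers import (Input, Dense, Reshape, Concatenate, Activation,
                          UpSampling2D, Conv2D, BatchNormalization)
from keras.models import Model

z_dim = 100          # size of the noise vector z
condition_dim = 128  # size of the conditioning variable

def build_stage1_generator():
    c = Input(shape=(condition_dim,))
    z = Input(shape=(z_dim,))
    x = Concatenate()([c, z])

    # Project and reshape to a 4x4 feature map
    x = Dense(1024 * 4 * 4, use_bias=False)(x)
    x = BatchNormalization()(x)
    x = Activation("relu")(x)
    x = Reshape((4, 4, 1024))(x)

    # Four upsampling blocks: 4x4 -> 8x8 -> 16x16 -> 32x32 -> 64x64
    for filters in [512, 256, 128, 64]:
        x = UpSampling2D(size=(2, 2))(x)
        x = Conv2D(filters, kernel_size=3, padding="same", use_bias=False)(x)
        x = BatchNormalization()(x)
        x = Activation("relu")(x)

    # Final convolution to a 3-channel image in the range [-1, 1]
    x = Conv2D(3, kernel_size=3, padding="same")(x)
    image = Activation("tanh")(x)
    return Model(inputs=[c, z], outputs=image)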

Training a StackGAN

In this section, we will learn how to train both stages of the StackGAN. In the first subsection, we will train the Stage-I StackGAN, and in the second subsection, we will train the Stage-II StackGAN.

Training the Stage-I StackGAN

Before starting the training, we need to specify the essential hyperparameters. Hyperparameters are values that remain fixed during training. Let's do this first:

data_dir = "Specify your dataset directory here/Data/birds"
train_dir = data_dir + "/train"
test_dir = data_dir + "/test"
image_size = 64          # Stage-I images are 64x64
batch_size = 64
z_dim = 100              # dimension of the noise vector z
stage1_generator_lr = 0.0002
stage1_discriminator_lr = 0.0002
stage1_lr_decay_step = 600
epochs = 1000
condition_dim = 128      # dimension of the conditioning variable

embeddings_file_path_train...
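With the hyperparameters in place, the next step is to build the networks and their optimizers. The snippet below is a minimal sketch of that step, reusing the learning rates defined above with the Adam optimizer; beta_1=0.5 is a common setting for GAN training and is an assumption here, as is the stage1_discriminator model name in the commented compile call:

from keras.optimizers import Adam

gen_optimizer = Adam(lr=stage1_generator_lr, beta_1=0.5, beta_2=0.999)
dis_optimizer = Adam(lr=stage1_discriminator_lr, beta_1=0.5, beta_2=0.999)

# The discriminator is typically compiled with a binary cross-entropy loss,
# for example (assuming stage1_discriminator is the model built earlier):
# stage1_discriminator.compile(loss='binary_crossentropy', optimizer=dis_optimizer)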

Practical applications of StackGAN

The industry applications of a StackGAN include the following:

  • Generating high-resolution images automatically for entertainment or educational purposes
  • Creating comics: with the help of a StackGAN, the comic-creation process can be shortened to days, as the network can generate comic content automatically and assist in the creative process
  • Movie creation: A StackGAN can assist a movie creator by generating frames from text descriptions
  • Art creation: A StackGAN can assist an artist by generating sketches from text descriptions

Summary

In this chapter, we have learned about and implemented a StackGAN network to generate high-resolution images from text descriptions. We started with a basic introduction to StackGAN, in which we explored its architectural details and the losses used for training. Then, we downloaded and prepared the dataset. After that, we implemented the StackGAN in the Keras framework and trained the Stage-I and Stage-II StackGANs sequentially. After successfully training the network, we evaluated the model and saved it for further use.

In the next chapter, we will work with CycleGAN, a network that can convert paintings into photos.
