You're reading from Deep Learning with PyTorch Lightning

Product type: Book
Published in: Apr 2022
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781800561618
Edition: 1st Edition
Languages: Python
Tools: PyTorch
Concepts: Deep Learning
Author: Kunal Sawarkar

Chapter 6: Deep Generative Models

It has always been the dream of mankind to build a machine that can match human ingenuity. While the word intelligence has various dimensions, such as calculation, object recognition, speech, understanding context, and reasoning, no aspect of human intelligence makes us more human than our creativity. The ability to create a piece of art, be it a piece of music, a poem, a painting, or a movie, has always been the epitome of human intelligence, and people who excel at such creativity are often treated as "geniuses." The question that remains unanswered is: can a machine learn creativity?

We have seen machines learn to predict images using a variety of information and sometimes even with little information. A machine learning model can learn from a set of training images and labels to recognize various objects in an image; however, the success of vision models depends on their capability for vast generalizations –...

Technical requirements

In this chapter, we will primarily be using the following Python modules, mentioned with their versions:

pytorch-lightning (version 1.5.2)
torch (version 1.10.0)
matplotlib (version 3.2.2)

Working examples for this chapter can be found at this GitHub link: https://github.com/PacktPublishing/Deep-Learning-with-PyTorch-Lightning/tree/main/Chapter06.

To make sure that these modules work together and do not go out of sync, we have used specific versions of torch, torchvision, torchtext, and torchaudio with PyTorch Lightning 1.5.2. You can also use the latest versions of PyTorch Lightning and torch, provided they are compatible with each other. More details can be found on the GitHub link: https://github.com/PacktPublishing/Deep-Learning-with-PyTorch-Lightning

!pip install torch==1.10.0 torchvision==0.11.1 torchtext==0.11.0 torchaudio==0.10.0 --quiet
!pip install pytorch-lightning==1.5.2 --quiet
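After installing, it can help to confirm that the environment really matches the pinned versions. The following is a minimal sketch using only the standard library; the helper names `installed_version` and `version_report` are hypothetical and not part of the book's code. It ignores local build suffixes such as `+cu111`, which pip-installed torch builds often carry.

```python
from importlib import metadata

# Versions pinned in this chapter's technical requirements.
PINNED = {
    "pytorch-lightning": "1.5.2",
    "torch": "1.10.0",
    "matplotlib": "3.2.2",
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_report(pinned):
    """Map each package to (pinned version, installed version, whether they match)."""
    report = {}
    for package, wanted in pinned.items():
        found = installed_version(package)
        # Drop a local build suffix such as "+cu111" before comparing.
        matches = found is not None and found.split("+")[0] == wanted
        report[package] = (wanted, found, matches)
    return report


for package, (wanted, found, ok) in version_report(PINNED).items():
    status = "ok" if ok else "differs"
    print(f"{package}: pinned {wanted}, installed {found} ({status})")
```

A package reported as "differs" is not necessarily a problem, since newer compatible versions can also be used, but it is a useful first thing to check when an import fails.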

We will be using the Food dataset, which contains a...

Getting started with GAN models

One of the most amazing applications of GANs is image generation. Just look at the following picture of a girl; can you guess whether she is real or simply generated by a machine?

Figure 6.1 – Fake face generation using StyleGAN (image credit – https://thispersondoesnotexist.com)

Creating such incredibly realistic faces is one of the most successful use cases of GANs. However, GANs are not limited to generating pretty faces or deepfake videos; they also have key commercial applications, such as generating images of houses or creating new models of cars or paintings.

While generative models have long been used in statistics, deep generative models such as GANs are relatively new. Deep generative models also include Variational Autoencoders (VAEs) and autoregressive models. However, since GANs are the most popular method, we will focus on them here.

What is a GAN?

Interestingly, GAN originated...

Creating new food items using a GAN

GANs are one of the most common and powerful algorithms used in generative modeling. GANs are used widely to generate fake faces, pictures, anime/cartoon characters, image style translations, semantic image translation, and so on.

We will start by creating an architecture for our GAN model:

Figure 6.3 – GAN architecture for creating a new food

First, we will define the neural networks for the generator and the discriminator with multiple convolutional and fully connected layers. In the architecture that we will be building, the discriminator has four convolutional layers and one fully connected layer, and the generator has five transposed convolution layers. The generator will attempt to produce fake images from Gaussian noise, and the discriminator will try to detect these fake images. Then, we will use the Adam optimizer to optimize the neural networks. For this use, we will use cross...
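To see how five transposed convolutions can grow a noise vector into an image, and how four convolutions shrink an image before the fully connected layer, it helps to trace the spatial sizes layer by layer. The sketch below uses the standard output-size formulas for convolution and transposed convolution (dilation 1, no output padding). The kernel, stride, and padding values are assumptions chosen for illustration (a common DCGAN-style setup), not values taken from the chapter, and a 64-pixel image is assumed.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a 2D convolution (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_transpose_out(size, kernel, stride, padding):
    """Spatial output size of a 2D transposed convolution (dilation 1, no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel


# Assumed (kernel, stride, padding) per layer; illustrative only.
GENERATOR_LAYERS = [(4, 1, 0), (4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 2, 1)]
DISCRIMINATOR_CONVS = [(4, 2, 1)] * 4


def trace(size, layers, step):
    """Return the spatial size before the first layer and after every layer."""
    sizes = [size]
    for kernel, stride, padding in layers:
        size = step(size, kernel, stride, padding)
        sizes.append(size)
    return sizes


# Generator: a 1x1 noise "image" is upsampled by five transposed convolutions.
generator_sizes = trace(1, GENERATOR_LAYERS, conv_transpose_out)
# Discriminator: a 64x64 image is downsampled by four convolutions.
discriminator_sizes = trace(64, DISCRIMINATOR_CONVS, conv_out)

print("generator:", generator_sizes)
print("discriminator:", discriminator_sizes)
```

Under these assumptions, the discriminator's last feature map is 4 x 4, and that map is what would be flattened and passed to the fully connected layer.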

Creating new butterfly species using a GAN

In this section, we are going to use the same GAN model that we built in the previous section with a minor tweak to generate new species of butterflies.

Since we are following the same steps here, we will keep the description concise and observe the outputs. (The full code can be found in the GitHub repository for this chapter.)

We will first try the architecture that we used for generating food images (4 convolution layers, 1 fully connected layer, and 5 transposed convolution layers). We will then try another architecture with 5 convolution layers and 5 transposed convolution layers:

Download the dataset:

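# The od alias is assumed to refer to the opendatasets package, which downloads Kaggle datasets
import opendatasets as od
dataset_url = 'https://www.kaggle.com/gpiosenka/butterfly-images40-species'
od.download(dataset_url)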

Initialize the variables for the images:

image_size = 64
batch_size = 128
normalize = [(0.5, 0.5, 0.5), (0.5, 0.5, 0.5)]
latent_size = 256
butterfly_data_directory = "/content/butterfly...
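The `normalize` setting above holds one tuple of per-channel means and one tuple of per-channel standard deviations. Assuming it is applied in the usual way, as `(value - mean) / std` on pixel values already scaled to [0, 1], a mean and standard deviation of 0.5 map each channel to the range [-1, 1]. The sketch below shows that arithmetic on single values in plain Python; the function names are hypothetical.

```python
# Per-channel statistics from the normalize setting (same for R, G, and B).
MEAN, STD = 0.5, 0.5


def normalize(value):
    """Map a pixel value in [0, 1] to [-1, 1]."""
    return (value - MEAN) / STD


def denormalize(value):
    """Invert normalize, mapping [-1, 1] back to [0, 1] for display."""
    return value * STD + MEAN


for pixel in (0.0, 0.25, 1.0):
    scaled = normalize(pixel)
    print(pixel, "->", scaled, "->", denormalize(scaled))
```

The inverse step matters when plotting generated images, since a plotting library expects values back in the [0, 1] range.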

GAN training challenges

Training a GAN model to get a good result requires a lot of compute resources, especially when the dataset is not very clean and the representations in an image are not easy to learn. To get clean output with sharp representations in our generated fake images, we need to pass higher-resolution images as input to our GAN model. However, a higher resolution means many more parameters in the model, which in turn requires much more memory to train it.

Here is an example scenario. We have trained our models using an image size of 64 pixels, but if we increase the image size to 128 pixels, the number of parameters in the GAN model increases drastically from 15.9 M to 93.4 M. This, in turn, requires much more compute power to train the model, and with the limited resources in the Google Colab environment, you might get an error similar to this after 20–25 epochs:

RuntimeError: CUDA out of...
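One reason resolution drives up the parameter count can be seen with simple arithmetic. A convolution's parameter count depends only on its channels and kernel size, but a fully connected layer that follows the convolutions grows with the size of the flattened feature map. The sketch below is a rough illustration under assumed values (four stride-2 convolutions, 512 final channels, a single output unit); it does not reproduce the chapter's 15.9 M and 93.4 M figures, which depend on the full architecture.

```python
def conv_params(in_channels, out_channels, kernel, bias=True):
    """Parameter count of a 2D convolution; independent of image size."""
    return out_channels * in_channels * kernel * kernel + (out_channels if bias else 0)


def linear_params(in_features, out_features, bias=True):
    """Parameter count of a fully connected layer."""
    return in_features * out_features + (out_features if bias else 0)


def flattened_features(image_size, channels, num_stride2_convs):
    """Size of the flattened feature map after repeated halving of the image."""
    side = image_size // (2 ** num_stride2_convs)
    return channels * side * side


# Assumed values for illustration only.
CHANNELS, NUM_CONVS, OUT_FEATURES = 512, 4, 1

for image_size in (64, 128):
    features = flattened_features(image_size, CHANNELS, NUM_CONVS)
    print(
        f"image {image_size}: flattened features = {features}, "
        f"fully connected parameters = {linear_params(features, OUT_FEATURES)}"
    )

# The first convolution has the same parameter count at either resolution.
print("first conv parameters:", conv_params(3, 64, 4))
```

Doubling the image side quadruples the flattened feature map, so any layer that consumes it grows by roughly the same factor. Activation memory grows with the image area as well, which adds to the pressure on GPU memory.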

Creating images using DCGAN

A DCGAN is a direct extension of the GAN model discussed in the previous section, except that it explicitly uses the convolutional and convolutional-transpose layers in the discriminator and generator respectively. DCGAN was first proposed in a paper, Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, by Alec Radford, Luke Metz, and Soumith Chintala:

Figure 6.13 – A DCGAN architecture overview

The DCGAN architecture consists of 5 convolution layers and 5 transposed convolution layers. There is no fully connected layer in this architecture. We will also use a learning rate of 0.0002 for training the model.

We can also take a more in-depth look at the generator architecture of the DCGAN to see how it works:

Figure 6.14 – The DCGAN generator architecture from the paper

It can be observed from the DCGAN generator architecture diagram...

Summary

A GAN is a powerful method for generating not only images but also paintings and even 3D objects (using newer variants of a GAN). We saw how, using a combination of discriminator and generator networks (each with five convolutional layers), we can start with random noise and generate an image that mimics real images. The interplay between the generator and discriminator keeps producing better images by minimizing the loss function over multiple iterations. The end result is fake pictures of things that never existed in real life.

It's a powerful method, and there are concerns about its ethical use. Fake images and objects can be used to defraud people; however, it also creates endless new opportunities. For example, imagine looking at a picture of fashion models while shopping for a new outfit. Instead of relying on endless image shoots, using a GAN (and DCGAN), you can generate realistic pictures of models with all body types, sizes, shapes, and colors, helping both...

Author (1)

Kunal Sawarkar

Kunal Sawarkar is a chief data scientist and AI thought leader. He leads the worldwide partner ecosystem in building innovative AI products. He also serves as an advisory board member and an angel investor. He holds a master's degree from Harvard University with major coursework in applied statistics. He has been applying machine learning to solve previously unsolved problems in industry and society, with a special focus on deep learning and self-supervised learning. Kunal has led various AI product R&D labs and has 20+ patents and papers published in this field. When not diving into data, he loves doing rock climbing and learning to fly aircraft, in addition to an insatiable curiosity for astronomy and wildlife.