Hands-On Image Generation with TensorFlow

Published in: Dec 2020
Publisher: Packt
ISBN-13: 9781838826789
Edition: 1st
Reading level: Intermediate
Author: Soon Yau Cheong

Soon Yau Cheong is an AI consultant and the founder of Sooner.ai Ltd. Having previously worked with industry giants such as NVIDIA and Qualcomm, he provides consultation in various domains of AI, such as deep learning, computer vision, natural language processing, and big data analytics. He was awarded a full scholarship to study for his PhD at the University of Bristol while working as a teaching assistant. He is also a mentor for AI courses with Udacity.

Preface

Any sufficiently advanced technology is indistinguishable from magic.

– Arthur C. Clarke

This phrase best describes image generation using artificial intelligence (AI). The field of deep learning—a subset of artificial intelligence—has been developing rapidly in the last decade. Now we can generate artificial faces that are indistinguishable from real people's, and create realistic paintings from simple brush strokes. Most of these abilities are owed to a type of deep neural network known as a generative adversarial network (GAN). With this hands-on book, you'll not only develop image generation skills but also gain a solid understanding of the underlying principles.

The book starts with an introduction to the fundamentals of image generation using TensorFlow, covering variational autoencoders and GANs. As you progress through the chapters, you'll learn to build models for different applications, including performing face swaps using deepfakes, neural style transfer, image-to-image translation, turning simple images into photorealistic images, and much more. You'll also understand how and why to construct state-of-the-art deep neural networks using advanced techniques such as spectral normalization and self-attention layers before working with advanced models for face generation and editing. Later chapters introduce photo restoration, text-to-image synthesis, video retargeting, and neural rendering. Throughout the book, you'll learn to implement models from scratch in TensorFlow 2.x, including PixelCNN, VAE, DCGAN, WGAN, pix2pix, CycleGAN, StyleGAN, GauGAN, and BigGAN.

By the end of this book, you'll be well-versed in TensorFlow and image generative technologies.

Who this book is for

This book is for deep learning engineers, practitioners, and researchers who have basic knowledge of convolutional neural networks and want to learn various image generation techniques using TensorFlow 2.x. You'll also find this book useful if you are an image processing professional or computer vision engineer looking to explore state-of-the-art architectures to improve and enhance images and videos. Knowledge of Python and TensorFlow is required to get the best out of the book.

How to use this book

There are many online tutorials available teaching the basics of GANs. However, the models in them tend to be rather simple and suitable only for toy datasets. At the other end of the spectrum, there is also free code available for state-of-the-art models that generate realistic images. Nevertheless, that code tends to be complex, and the lack of explanation makes it difficult for beginners to understand. Many of the "Git cloners" who download the code have no clue how to tweak the models to make them work for their own applications. This book aims to bridge that gap.

We will start by learning the basic principles and immediately implement code to put them to the test, so you'll be able to see the result of your work instantly. All the necessary code to build a model is laid bare in a single Jupyter notebook, making it easier for you to follow the flow of the code and to modify and test it interactively. I believe writing from scratch is the best way to learn and master deep learning. There are between one and three models in each chapter, and we will write all of them from scratch. When you finish this book, not only will you be familiar with image generation but you will also be an expert in TensorFlow 2.

The chapters are arranged in roughly chronological order of the history of GANs, where the chapters may build upon knowledge from previous chapters. Therefore, it is best to read the chapters in order, especially the first three chapters, which cover the fundamentals. After that, you may jump to chapters that interest you more. Should you feel confused by the acronyms during the reading, you can refer to the summary of GAN techniques listed in the last chapter.

What this book covers

Chapter 1, Getting Started with Image Generation Using TensorFlow, walks through the basics of pixel probability and uses it to build our first model to generate handwritten digits.

Chapter 2, Variational Autoencoder, explains how to build a variational autoencoder (VAE) and use it to generate and edit faces.

Chapter 3, Generative Adversarial Network, introduces the fundamentals of GANs and builds a DCGAN to generate photorealistic images. We'll then learn about new adversarial losses that stabilize the training.

Chapter 4, Image-to-Image Translation, covers several models and interesting applications. We will first implement pix2pix to convert sketches into photorealistic photos. Then we'll use CycleGAN to transform a horse into a zebra. Lastly, we will use BicycleGAN to generate a variety of shoes.

Chapter 5, Style Transfer, explains how to extract the style from a painting and transfer it into a photo. We'll also learn advanced techniques to make neural style transfer run faster at runtime, and see how it is used in state-of-the-art GANs.

Chapter 6, AI Painter, goes through the underlying principles of image editing and transformation using interactive GAN (iGAN) as an example. Then we will build a GauGAN to create photorealistic building facades from a simple segmentation map.

Chapter 7, High Fidelity Face Generation, shows how to build a StyleGAN using techniques from style transfer. Before that, however, we will learn to grow the network layers progressively using a Progressive GAN.

Chapter 8, Self-Attention for Image Generation, shows how to build self-attention into a Self-Attention GAN (SAGAN) and a BigGAN for conditional image generation.

Chapter 9, Video Synthesis, demonstrates how to use autoencoders to create a deepfake video. Along the way, we'll learn how to use OpenCV and dlib for face processing.

Chapter 10, Road Ahead, reviews and summarizes the generative techniques we have learned. Then we will look at how they are used as the basis of up-and-coming applications, including text-to-image synthesis, video compression, and video retargeting.

To get the most out of this book

Readers should have basic knowledge of deep learning training pipelines, such as training convolutional neural networks for image classification. This book will mainly use high-level Keras APIs in TensorFlow 2, which is easy to learn. Should you need to refresh or learn TensorFlow 2, there are many free tutorials available online, such as the one on the official TensorFlow website, https://www.tensorflow.org/tutorials/keras/classification.

Training deep neural networks is computationally intensive. You can train the first few simple models using only the CPU. However, as we progress to more complex models and datasets in later chapters, model training could take a few days before you start to see satisfactory results. To get the most out of this book, you should have access to a GPU to accelerate model training. There are also free cloud services, such as Google Colab, that provide GPUs on which you can upload and run the code.
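If you are unsure whether TensorFlow can see a GPU on your machine (or on a Colab instance), a quick check like the following can help; it is a minimal sketch that only lists the visible devices and changes nothing:

```python
import tensorflow as tf

# List the physical GPUs visible to TensorFlow.
# An empty list means training will fall back to the CPU.
gpus = tf.config.list_physical_devices('GPU')
print(f"GPUs available: {len(gpus)}")
```

On Colab, remember to enable a GPU runtime first (Runtime > Change runtime type), otherwise the list will be empty.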

If you are using the digital version of this book, we advise you to type the code yourself or access the code via the GitHub repository (link available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.

Download the example code files

You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Hands-On-Image-Generation-with-TensorFlow-2.0. In case there's an update to the code, it will be updated on the existing GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Download the color images

We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: https://static.packt-cdn.com/downloads/9781838826789_ColorImages.pdf.

Conventions used

There are a number of text conventions used throughout this book.

Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: “This is done using tf.gather(self.beta, labels), which is conceptually equivalent to beta = self.beta[labels], as follows.”
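As a small aside, the equivalence quoted above can be verified with a self-contained sketch; the `beta` and `labels` values here are made up for illustration (in the book, `self.beta` holds per-class parameters):

```python
import tensorflow as tf

# Hypothetical per-class parameters: one value per class label.
beta = tf.constant([0.1, 0.2, 0.3, 0.4])   # shape (num_classes,)
labels = tf.constant([2, 0, 3])            # a batch of class labels

# tf.gather selects the entries of `beta` at the positions in `labels`...
gathered = tf.gather(beta, labels)

# ...which matches NumPy-style fancy indexing on the underlying array.
indexed = beta.numpy()[labels.numpy()]

print(gathered.numpy())   # elements at indices 2, 0, and 3
```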

A block of code is set as follows:

attn = tf.matmul(theta, phi, transpose_b=True)
attn = tf.nn.softmax(attn)

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

self.conv_theta = Conv2D(c//8, 1, padding='same',
                         kernel_constraint=SpectralNorm(),
                         name='Conv_Theta')

Any command-line input or output is written as follows:

$ mkdir css
$ cd css

Bold: Indicates a new term, an important word, or words that you see onscreen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: “From the preceding architecture diagram, we can see that G1's encoder output concatenates with G1's features and feeds into the decoder part of G2 to generate high-resolution images.”

Tips or important notes

Appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at customercare@packtpub.com.

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata, selecting your book, clicking on the Errata Submission Form link, and entering the details.

Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at copyright@packt.com with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Reviews

Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!

For more information about Packt, please visit packt.com.
