An Introduction to Generative AI: "Drawing" Data from Models
In this chapter, we will dive into the various applications of generative models. Before that, we will take a step back and examine how exactly generative models differ from other types of machine learning. The difference lies in the basic units of any machine learning algorithm: probability, and the various ways we use mathematics to quantify the shape and distribution of the data we encounter in the world.
In the rest of this chapter, we will cover:
- Applications of AI
- Discriminative and generative models
- Implementing generative models
- The rules of probability
- Why use generative models?
- Unique challenges of generative models
Applications of AI
In New York City in October 2018, the international auction house Christie's sold the Portrait of Edmond Belamy (Figure 1.1) during the show Prints & Multiples for $432,500.00. This sale was remarkable both because the sale price was 45 times higher than the initial estimates for the piece, and due to the unusual origin of this portrait. Unlike the majority of other artworks sold by Christie's since the 18th century, the Portrait of Edmond Belamy is not painted using oil or watercolors, nor is its creator even human; rather, it is an entirely digital image produced by a sophisticated machine learning algorithm. The creators—a Paris-based collective named Obvious—used a collection of 15,000 portraits created between the 14th and 20th centuries to tune an artificial neural network model capable of generating aesthetically similar, albeit synthetic, images.
Figure 1.1: The Portrait of Edmond Belamy1
Portraiture is far from...
The rules of probability
In the task of modeling, we usually think about separating the variables of our dataset into two broad classes:
- Independent data, which primarily means inputs to a model, are denoted by X. These could be categorical features (such as a "0" or "1" in six columns indicating which of six schools a student attends), continuous (such as the heights or test scores of the same students), or ordinal (the rank of a student in the class).
- Dependent data, conversely, are the outputs of our models, and are denoted by Y. (Note that in some cases Y is a label that can be used to condition a generative output, such as in a conditional GAN.) As with the independent variables, these can be continuous, categorical, or ordinal, and they can...
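The split between X and Y can be sketched in a few lines of NumPy. This is a minimal illustration rather than code from the chapter: the student records, the variable names, and the binary outcome Y are all invented for the example. It shows a categorical feature encoded as six "0"/"1" columns, alongside one continuous and one ordinal feature.

```python
import numpy as np

# Hypothetical toy records for four students (all values are invented).
school_index = np.array([0, 3, 5, 3])                # categorical: which of six schools
height_cm = np.array([150.0, 162.5, 158.0, 171.0])   # continuous
class_rank = np.array([2, 1, 4, 3])                  # ordinal: rank in the class

# Encode the categorical feature as six 0/1 columns, one per school.
n_schools = 6
one_hot_school = np.eye(n_schools)[school_index]     # shape (4, 6)

# Independent data X: one row per student, one column per feature.
X = np.column_stack([one_hot_school, height_cm, class_rank])  # shape (4, 8)

# Dependent data Y: a hypothetical binary outcome, one value per student.
Y = np.array([1, 1, 0, 1])

print(X.shape, Y.shape)
```

Each row of the one-hot block contains exactly one 1, marking the school that student attends; the remaining two columns carry the height and the rank unchanged.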
Why use generative models?
Now that we have reviewed what generative models are and defined them more formally in the language of probability, why would we need such models in the first place? What value do they provide in practical applications? To answer this question, let's take a brief tour of the topics that we will cover in more detail in the rest of this book.
The promise of deep learning
As noted already, many of the models we will survey in the book are deep, multi-level neural networks. The last 15 years have seen a renaissance in the development of deep learning models for image classification, natural language processing and understanding, and reinforcement learning. These advances were enabled by breakthroughs in traditional challenges in tuning and optimizing very complex models, combined with access to larger datasets, distributed computational power in the cloud, and frameworks such as TensorFlow that make it easier to prototype and...
Style transfer and image transformation
In addition to mapping artificial images to a space of random numbers, we can also use generative models to learn a mapping between one kind of image and a second. This kind of model can, for example, be used to convert an image of a horse into that of a zebra (Figure 1.7), create deepfake videos in which one actor's face has been replaced with another's, or transform a photo into a painting (Figures 1.2 and 1.4):21
Figure 1.7: CycleGANs apply stripes to horses to generate zebras22
Another fascinating example of applying generative modeling is a study in which lost masterpieces of the artist Pablo Picasso were discovered to have been painted over with another image. After X-ray imaging of The Old Guitarist and The Crouching Beggar indicated that earlier images of a woman and a landscape lay underneath (Figure 1.8), researchers used the other paintings from Picasso's blue period or other color photographs (Figure...
Unique challenges of generative models
Given the powerful applications that generative models have, what are the major challenges in implementing them? As described, most of these models utilize complex data, requiring us to fit large models to capture all the nuances of their features and distribution. This has implications both for the number of examples that we must collect to adequately represent the kind of data we are trying to generate, and the computational resources needed to build the model. We will discuss techniques in Chapter 2, Setting up a TensorFlow Lab, to parallelize the training of these models using cloud computing frameworks and graphics processing units (GPUs).
A more subtle problem that comes from having complex data, and from the fact that we are trying to generate data rather than a numerical label or value, is that our notion of model accuracy is much more complicated: we cannot simply calculate the distance to a single label or score.
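To make the contrast concrete, here is a small sketch under stated assumptions. The arrays, the Gaussian "real" and "generated" samples, and the choice of comparing means and standard deviations are all invented for illustration; the comparison of summary statistics is only a crude stand-in, not an evaluation method prescribed by this chapter.

```python
import numpy as np

rng = np.random.default_rng(0)

# Discriminative case: every prediction has one paired target,
# so error is simply a distance to that target.
y_true = np.array([1.0, 0.0, 1.0, 1.0])
y_pred = np.array([0.9, 0.2, 0.6, 0.8])
mse = float(np.mean((y_true - y_pred) ** 2))

# Generative case: a generated sample has no paired "correct" sample.
# One crude option (an assumption of this sketch) is to compare summary
# statistics of the generated population against the real population.
real = rng.normal(loc=0.0, scale=1.0, size=1000)       # stand-in for real data
generated = rng.normal(loc=0.5, scale=1.0, size=1000)  # stand-in for model output
mean_gap = float(abs(real.mean() - generated.mean()))
std_gap = float(abs(real.std() - generated.std()))

print(f"MSE against labels: {mse:.4f}")
print(f"Gap in means: {mean_gap:.3f}, gap in standard deviations: {std_gap:.3f}")
```

The first number is well defined because each prediction is scored against its own label. The second pair only says how two populations differ on chosen statistics, and two very different sets of samples could share the same mean and standard deviation, which is part of why evaluating generated data is harder.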
We will discuss...
In this chapter, we discussed what generative modeling is, and how it fits into the landscape of more familiar machine learning methods. We used probability theory and Bayes' theorem to describe how these models approach prediction in an opposite manner to discriminative learning.
We reviewed use cases for generative learning, both for specific kinds of data and general prediction tasks. Finally, we examined some of the specialized challenges that arise from building these models.
In the next chapter, we will begin our practical implementation of these models by exploring how to set up a development environment for TensorFlow 2.0 using Docker and Kubeflow.
- Baltruschat, I.M., Nickisch, H., Grass, M. et al. (2019). Comparison of Deep Learning Approaches for Multi-Label Chest X-Ray Classification. Sci Rep 9, 6381. https://doi.org/10.1038/s41598-019-42294-8
- AlphaGo (n.d.). DeepMind. Retrieved April 20, 2021, from https://deepmind.com/research/case-studies/alphago-the-story-so-far
- The AlphaStar team (2019, October). AlphaStar: Grandmaster level in StarCraft II using multi-agent reinforcement learning. DeepMind. https://deepmind.com/blog/article/AlphaStar-Grandmaster-level-in-StarCraft-II-using-multi-agent-reinforcement-learning
- Devlin, J., Chang, M., Lee, K., Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv. https://arxiv.org/abs/1810.04805v2
- Brandon, J. (2018, February 16). Terrifying high-tech porn: Creepy 'deepfake'...