You're reading from Advanced Deep Learning with Keras

Product type: Book
Published in: Oct 2018
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781788629416
Edition: 1st Edition

Author: Rowel Atienza

Rowel Atienza is an Associate Professor at the Electrical and Electronics Engineering Institute of the University of the Philippines, Diliman. He holds the Dado and Maria Banatao Institute Professorial Chair in Artificial Intelligence. Rowel has been fascinated with intelligent robots since he graduated from the University of the Philippines. He received his MEng from the National University of Singapore for his work on an AI-enhanced four-legged robot. He finished his Ph.D. at The Australian National University for his contributions to the field of active gaze tracking for human-robot interaction. Rowel's current research work focuses on AI and computer vision. He dreams of building useful machines that can perceive, understand, and reason. To help make his dreams real, Rowel has been supported by grants from the Department of Science and Technology (DOST), Samsung Research Philippines, and the Commission on Higher Education-Philippine California Advanced Research Institutes (CHED-PCARI).

Chapter 3. Autoencoders

In the previous chapter, Chapter 2, Deep Neural Networks, you were introduced to the concepts of deep neural networks. We're now going to move on to autoencoders, a neural network architecture that attempts to find a compressed representation of the given input data.

Similar to the previous chapters, the input data may come in multiple forms, including speech, text, image, or video. An autoencoder attempts to find a representation, or code, that enables useful transformations on the input data. As an example, in denoising autoencoders, a neural network attempts to find a code that can be used to transform noisy data into clean data. The noisy data could be an audio recording with static that is then converted into clear sound. Autoencoders learn the code automatically from the data alone, without human labeling. As such, autoencoders can be classified as unsupervised learning algorithms...

Principles of autoencoders


In this section, we're going to go over the principles of autoencoders, using the MNIST dataset, which we were first introduced to in the previous chapters.

Firstly, we need to be aware that an autoencoder has two operators:

  • Encoder: This transforms the input, x, into a low-dimensional latent vector, z = f(x). Since the latent vector is of low dimension, the encoder is forced to learn only the most important features of the input data. For example, in the case of MNIST digits, the important features to learn may include writing style, tilt angle, roundness of stroke, thickness, and so on. Essentially, this is the most important information needed to represent digits zero to nine.

  • Decoder: This tries to recover the input from the latent vector, x̃ = g(z). Although the latent vector has a low dimension, it has a sufficient size to allow the decoder to recover the input data.

The goal of the decoder...
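The two operators above can be sketched as a pair of Keras models chained into one autoencoder. The layer sizes here (a 256-unit hidden layer, a 16-dim latent vector, flattened 28x28 inputs) are illustrative assumptions, not the book's exact listing:

```python
# Minimal sketch of the encoder z = f(x) and decoder g(z) as Keras models.
# Hidden-layer width and latent dimension are illustrative assumptions.
import numpy as np
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

latent_dim = 16

# Encoder: compress a flattened 28x28 digit into a 16-dim latent vector z
inputs = Input(shape=(784,))
h = Dense(256, activation='relu')(inputs)
z = Dense(latent_dim, activation='relu')(h)
encoder = Model(inputs, z, name='encoder')

# Decoder: try to recover the input from the latent vector
latent_inputs = Input(shape=(latent_dim,))
h = Dense(256, activation='relu')(latent_inputs)
outputs = Dense(784, activation='sigmoid')(h)
decoder = Model(latent_inputs, outputs, name='decoder')

# Autoencoder: reconstruct x from z = f(x); trained to minimize MSE
autoencoder = Model(inputs, decoder(encoder(inputs)), name='autoencoder')
autoencoder.compile(loss='mse', optimizer='adam')

x = np.random.rand(2, 784).astype('float32')  # stand-in for two MNIST digits
z_sample = encoder.predict(x, verbose=0)      # shape (2, 16)
```

Training would then call `autoencoder.fit(x_train, x_train, ...)`, using the input itself as the target since no labels are involved.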

Building autoencoders using Keras


We're now going to move on to something really exciting: building an autoencoder using the Keras library. For simplicity, we'll be using the MNIST dataset for the first set of examples. The autoencoder will generate a latent vector from the input data and recover the input using the decoder. The latent vector in this first example is 16-dim.

Firstly, we're going to implement the autoencoder by building the encoder. Listing 3.2.1 shows the encoder that compresses the MNIST digit into a 16-dim latent vector. The encoder is a stack of two Conv2D layers. The final stage is a Dense layer with 16 units to generate the latent vector. Figure 3.2.1 shows the architecture model diagram generated by plot_model(), which is the same as the text version produced by encoder.summary(). The shape of the output of the last Conv2D layer is saved to compute the dimensions of the decoder input layer for easy reconstruction of the MNIST image.

The following Listing 3.2.1 shows autoencoder-mnist...
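An encoder of the kind described above can be sketched as follows. The filter counts, kernel size, and strides are assumptions for illustration, not necessarily those of the book's listing; the key points from the text are the two Conv2D layers, the 16-unit Dense output, and saving the pre-flatten shape for the decoder:

```python
# Sketch of the described encoder: two Conv2D layers, then a Dense layer that
# emits the 16-dim latent vector. Filter counts and kernel size are assumed.
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Input, Conv2D, Flatten, Dense
from tensorflow.keras.models import Model

latent_dim = 16
inputs = Input(shape=(28, 28, 1), name='encoder_input')
x = Conv2D(32, kernel_size=3, strides=2, activation='relu', padding='same')(inputs)
x = Conv2D(64, kernel_size=3, strides=2, activation='relu', padding='same')(x)

# Save the feature-map shape so the decoder's input layers can invert Flatten
# and reconstruct the 28x28 image.
shape = K.int_shape(x)  # (None, 7, 7, 64) with the strides/padding used here

x = Flatten()(x)
latent = Dense(latent_dim, name='latent_vector')(x)
encoder = Model(inputs, latent, name='encoder')
encoder.summary()
```

With `strides=2` and `padding='same'`, each Conv2D halves the spatial size, so 28x28 becomes 14x14 and then 7x7 before flattening.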

Denoising autoencoder (DAE)


We're now going to build an autoencoder with a practical application. Let's paint a picture and imagine that the MNIST digit images were corrupted by noise, making them harder for humans to read. We can build a Denoising Autoencoder (DAE) to remove the noise from these images. Figure 3.3.1 shows us three sets of MNIST digits. The top rows of each set (for example, MNIST digits 7, 2, 1, 9, 0, 6, 3, 4, 9) are the original images. The middle rows show the inputs to the DAE, which are the original images corrupted by noise. The last rows show the outputs of the DAE:

Figure 3.3.1: Original MNIST digits (top rows), corrupted original images (middle rows), and denoised images (last rows)

Figure 3.3.2: The input to the denoising autoencoder is the corrupted image. The output is the clean or denoised image. The latent vector is assumed to be 16-dim.

As shown in Figure 3.3.2, the denoising autoencoder has practically the same structure as the autoencoder for MNIST...
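The corrupted inputs for a DAE can be generated by adding Gaussian noise to the normalized images and clipping back to the valid pixel range. The noise mean and standard deviation below (0.5 each) are assumptions for illustration; the clean images serve as the training targets:

```python
# Sketch of the DAE corruption step: add Gaussian noise to normalized images
# and clip to [0, 1]. Noise mean/std of 0.5/0.5 are illustrative assumptions.
import numpy as np

def corrupt(images, loc=0.5, scale=0.5, seed=0):
    """Return a noisy copy of `images`, clipped to the valid pixel range."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(loc=loc, scale=scale, size=images.shape)
    return np.clip(images + noise, 0.0, 1.0)

clean = np.zeros((4, 28, 28, 1))  # stand-in for normalized MNIST digits
noisy = corrupt(clean)            # DAE input; `clean` is the training target
```

The DAE is then trained with the noisy images as inputs and the clean images as targets, so it learns a latent code that discards the noise.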

Automatic colorization autoencoder


We're now going to work on another practical application of autoencoders. In this case, we're going to imagine that we have grayscale photos and that we want to build a tool that will automatically add color to them. We would like to replicate the human ability to identify that the sea and sky are blue, the grass field and trees are green, clouds are white, and so on.

As shown in Figure 3.4.1, if we are given a grayscale photo of a rice field in the foreground, a volcano in the background, and sky on top, we're able to add the appropriate colors.

Figure 3.4.1: Adding color to a grayscale photo of the Mayon Volcano. A colorization network should replicate human abilities by adding color to a grayscale photo. The left photo is grayscale; the right photo is color. The original color photo can be found in the book's GitHub repository, https://github.com/PacktPublishing/Advanced-Deep-Learning-with-Keras/blob/master/chapter3-autoencoders/README.md.

A simple automatic...
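Training pairs for a colorization autoencoder can be prepared by converting each color image to grayscale: the grayscale version is the input and the original RGB image is the target. A sketch using the standard ITU-R BT.601 luminance weights (the dataset shape here is an assumption for illustration):

```python
# Sketch of preparing colorization training pairs: grayscale in, RGB out.
# The luminance weights are the standard ITU-R BT.601 coefficients.
import numpy as np

def rgb_to_gray(rgb):
    """Convert a batch of RGB images in [0, 1] to single-channel grayscale."""
    # rgb: (batch, height, width, 3) -> (batch, height, width, 1)
    return np.dot(rgb[..., :3], [0.299, 0.587, 0.114])[..., np.newaxis]

color = np.random.rand(2, 32, 32, 3)  # stand-in for normalized color photos
gray = rgb_to_gray(color)             # network input; `color` is the target
```

The autoencoder is then trained with `gray` as input and `color` as target, so the decoder learns to predict the three color channels from luminance alone.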

Conclusion


In this chapter, we've been introduced to autoencoders: neural networks that compress input data into low-dimensional codes in order to efficiently perform structural transformations such as denoising and colorization. We've laid the foundations for the more advanced topics of GANs and VAEs, which we will introduce in later chapters, while still exploring how autoencoders can be built with Keras. We've demonstrated how to implement an autoencoder from two building-block models, the encoder and the decoder. We've also learned how the extraction of the hidden structure of the input distribution is one of the common tasks in AI.

Once the latent code has been uncovered, there are many structural operations that can be performed on the original input distribution. To gain a better understanding of the input distribution, the hidden structure in the form of the latent vector can be visualized using a low-level embedding similar to what we did in this chapter, or through more sophisticated...

