
You're reading from  Deep Learning with Keras

Product typeBook
Published inApr 2017
Reading LevelIntermediate
PublisherPackt
ISBN-139781787128422
Edition1st Edition
Authors (2):
Antonio Gulli

Antonio Gulli has a passion for establishing and managing global technological talent for innovation and execution. His core expertise is in cloud computing, deep learning, and search engines. Currently, Antonio works for Google in the Cloud Office of the CTO in Zurich, working on Search, Cloud Infra, Sovereignty, and Conversational AI.

Sujit Pal

Sujit Pal is a Technology Research Director at Elsevier Labs, an advanced technology group within the Reed-Elsevier Group of companies. His interests include semantic search, natural language processing, machine learning, and deep learning. At Elsevier, he has worked on several initiatives involving search quality measurement and improvement, image classification and duplicate detection, and annotation and ontology development for medical and scientific corpora.


Chapter 7. Additional Deep Learning Models

So far, most of the discussion has been focused on models that perform classification. These models are trained using object features and their labels to predict labels for hitherto unseen objects. The models also had a fairly simple architecture; all the ones we have seen so far have a linear pipeline modeled by the Keras sequential API.

In this chapter, we will focus on more complex architectures where the pipelines are not necessarily linear. Keras provides the functional API to deal with these sorts of architectures. We will learn how to define our networks using the functional API in this chapter. Note that the functional API can be used to build linear architectures as well.

The two broad subcategories of supervised machine learning are classification and regression, and the simplest extension of a classification network is a regression network. Instead of predicting a category, the network now predicts a continuous value. You saw an...

Keras functional API


The Keras functional API defines each layer as a function and provides operators to compose these functions into a larger computational graph. A function is some sort of transformation with a single input and single output. For example, the function y = f(x) defines a function f with input x and output y. Let us consider the simple sequential model from Keras (for more information refer to: https://keras.io/getting-started/sequential-model-guide/):

from keras.models import Sequential
from keras.layers.core import Dense, Activation

model = Sequential([
   Dense(32, input_dim=784),
   Activation("sigmoid"),
   Dense(10),
   Activation("softmax"),
])

model.compile(loss="categorical_crossentropy", optimizer="adam")

As you can see, the sequential model represents the network as a linear pipeline, or list, of layers. We can also represent the network as the composition of the following nested functions. Here x is the input tensor of shape (None, 784) and y is the output tensor...
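For comparison, the same two-layer network can be written with the functional API, where each layer is applied like a function to a tensor. A minimal sketch:

```python
from keras.layers import Activation, Dense, Input
from keras.models import Model

x = Input(shape=(784,))           # input tensor of shape (None, 784)
h = Dense(32)(x)                  # each layer acts as a function on a tensor
h = Activation("sigmoid")(h)
y = Dense(10)(h)
y = Activation("softmax")(y)      # output tensor of shape (None, 10)

model = Model(inputs=x, outputs=y)
model.compile(loss="categorical_crossentropy", optimizer="adam")
```

Because Model(inputs=x, outputs=y) recovers the graph of layers connecting x to y, this same style scales naturally to non-linear graphs with multiple inputs and outputs.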

Regression networks


The two major techniques of supervised learning are classification and regression. In both cases, the model is trained on data with known labels. In the case of classification, these labels are discrete values such as genres of text or categories of images. In the case of regression, these labels are continuous values, such as stock prices or human intelligence quotients (IQs).

Most of the examples we have seen show deep learning models being used to perform classification. In this section, we will look at how to perform regression using such a model.

Recall that classification models have a dense layer with a nonlinear activation at the end, whose output dimension corresponds to the number of classes the model can predict. Thus, an ImageNet image classification model has a Dense(1000) layer at the end, corresponding to the 1,000 ImageNet classes it can predict. Similarly, a sentiment analysis model has a dense layer at the end, corresponding to positive or negative sentiment...
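To turn such a network into a regression network, we replace the final classification layer with a single linear output unit and train against a regression loss such as mean squared error. A minimal sketch, assuming a hypothetical dataset with 14 input features:

```python
from keras.layers import Dense, Input
from keras.models import Sequential

# Hypothetical regressor: 14 input features, one continuous output.
# The final Dense(1) uses the default linear activation, and the loss
# is mean squared error rather than cross-entropy.
model = Sequential([
    Input(shape=(14,)),
    Dense(32, activation="relu"),
    Dense(1),
])
model.compile(loss="mse", optimizer="adam")
```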

Unsupervised learning — autoencoders


Autoencoders are a class of neural networks that attempt to recreate the input as their target using back-propagation. An autoencoder consists of two parts: an encoder and a decoder. The encoder reads the input and compresses it to a compact representation; the decoder reads the compact representation and recreates the input from it. In other words, the autoencoder tries to learn the identity function by minimizing the reconstruction error.

Even though the identity function does not seem like a very interesting function to learn, the way in which this is done makes it interesting. The number of hidden units in the autoencoder is typically smaller than the number of input (and output) units. This forces the encoder to learn a compressed representation of the input, which the decoder then reconstructs. If there is structure in the input data in the form of correlations between input features, then the autoencoder will discover some of these correlations, and...
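The structure described above can be sketched as a fully connected autoencoder; the sizes here (784-dimensional inputs, a 32-unit bottleneck) are illustrative assumptions, not a prescription:

```python
from keras.layers import Dense, Input
from keras.models import Model

# Sketch of a fully connected autoencoder: 784-dim inputs (for example,
# flattened 28x28 images) squeezed through a 32-unit bottleneck.
inputs = Input(shape=(784,))
encoded = Dense(32, activation="relu")(inputs)        # encoder: compress
decoded = Dense(784, activation="sigmoid")(encoded)   # decoder: reconstruct

autoencoder = Model(inputs=inputs, outputs=decoded)
# The target is the input itself, so the loss measures reconstruction error
autoencoder.compile(loss="binary_crossentropy", optimizer="adam")
```

Training would then call autoencoder.fit(X, X, ...), that is, with the input doubling as the target.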

Composing deep networks


We have looked extensively at three basic deep learning networks: the fully connected network (FCN), the CNN, and the RNN. While each of these has specific use cases for which it is best suited, you can also compose larger and more useful models by combining them as Lego-like building blocks and using the Keras functional API to glue them together in new and interesting ways.

Such models tend to be somewhat specialized to the task for which they were built, so it is hard to generalize about them. Usually, however, they involve learning from multiple inputs or generating multiple outputs. One example is a question answering network, where the network learns to predict answers given a story and a question. Another is a Siamese network that calculates the similarity between a pair of images, where the network is trained to predict either a binary (similar/not similar) or categorical (gradations of similarity) label using a...
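As an illustration of such composition, here is a sketch of a Siamese-style network: a shared Dense encoder (standing in for whatever CNN or RNN encoder the task calls for) is applied to both inputs, and a small head predicts a binary similarity label. All sizes here are made up for the example:

```python
from keras.layers import Concatenate, Dense, Input
from keras.models import Model

# A single encoder layer is shared (same weights) across both inputs.
encoder = Dense(16, activation="relu")

left = Input(shape=(64,))
right = Input(shape=(64,))
merged = Concatenate()([encoder(left), encoder(right)])
similarity = Dense(1, activation="sigmoid")(merged)  # similar / not similar

model = Model(inputs=[left, right], outputs=similarity)
model.compile(loss="binary_crossentropy", optimizer="adam")
```

The key design choice is that encoder is instantiated once and called twice, so both branches learn the same transformation.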

Customizing Keras


Just as composing our basic building blocks into larger architectures enables us to build interesting deep learning models, sometimes we need to look at the other end of the spectrum. Keras has a lot of functionality built in already, so it is very likely that you can build all your models with the provided components and not feel the need for customization at all. In case you do need customization, Keras has you covered.

As you will recall, Keras is a high-level API that delegates to either a TensorFlow or Theano backend for the computational heavy lifting. Any code you build for your customization will call out to one of these backends. To keep your code portable across the two backends, your custom code should use the Keras backend API (https://keras.io/backend/), which provides a set of functions that act as a facade over your chosen backend. Depending on the backend selected, a call to the backend facade translates to the appropriate TensorFlow or Theano...
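As a small taste of customization, a one-off transformation can be wrapped in a Lambda layer; the function body runs on whichever backend Keras is configured with, and for anything beyond simple arithmetic you would reach for the backend API functions mentioned above. The doubling operation here is purely illustrative:

```python
from keras.layers import Input, Lambda
from keras.models import Model

# Element-wise custom transformation; tensor operator overloading
# dispatches to the active backend.
x = Input(shape=(3,))
doubled = Lambda(lambda t: t * 2)(x)
model = Model(inputs=x, outputs=doubled)
```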

Generative models


Generative models are models that learn to create data similar to the data they are trained on. We saw one example of a generative model that learns to write prose in the style of Alice in Wonderland in Chapter 6, Recurrent Neural Network — RNN. In that example, we trained a model to predict the 11th character of a text given the first 10 characters. Another type of generative model is the generative adversarial network (GAN), which has recently emerged as a very powerful class of models; you saw examples of GANs in Chapter 4, Generative Adversarial Networks and WaveNet. The intuition behind generative models is that they learn a good internal representation of their training data, and are therefore able to generate similar data during the prediction phase.

Another perspective on generative models is the probabilistic one. A typical classification or regression network, also called a discriminative model, learns a function that maps the input data X to some label or output y, that is, these models...

Summary


In this chapter, we covered some deep learning networks not covered in earlier chapters. We started with a brief look at the Keras functional API, which allows us to build networks that are more complex than the sequential networks we have seen so far. We then looked at regression networks, which let us make predictions in a continuous space and open up a whole new range of problems we can solve. However, a regression network is really a very simple modification of a standard classification network. Next we looked at autoencoders, a style of network that allows us to do unsupervised learning and make use of the massive amount of unlabeled data that all of us have access to nowadays. We also learned how to compose the networks we had already covered, like giant Lego-like building blocks, into larger and more interesting networks. We then moved from building large networks out of smaller ones to learning how to customize individual layers...

