You're reading from Hands-On Meta Learning with Python

Product type: Book
Published in: Dec 2018
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781789534207
Edition: 1st Edition
Author: Sudharsan Ravichandiran

Sudharsan Ravichandiran is a data scientist and artificial intelligence enthusiast. He holds a Bachelors in Information Technology from Anna University. His area of research focuses on practical implementations of deep learning and reinforcement learning including natural language processing and computer vision. He is an open-source contributor and loves answering questions on Stack Overflow.

Chapter 3. Prototypical Networks and Their Variants

In the last chapter, we learned what siamese networks are and how they are used to perform few-shot learning tasks. We also explored how to use siamese networks for face and audio recognition. In this chapter, we will look at another interesting few-shot learning algorithm, the prototypical network, which can generalize even to classes that are not present in the training set. We will start by understanding what prototypical networks are, after which we will see how to perform a classification task on the Omniglot dataset using a prototypical network. We will then look at different variants of prototypical networks, such as Gaussian prototypical networks and semi-prototypical networks.

In this chapter, you will learn about the following:

  • Prototypical networks
  • The algorithm of prototypical networks
  • Classification using prototypical networks
  • Gaussian prototypical networks
  • The Gaussian prototypical network algorithm
  • Semi...

Prototypical networks


Prototypical networks are another simple and efficient few-shot learning algorithm. Like siamese networks, a prototypical network tries to learn a metric space in which to perform classification. The basic idea is to create a prototypical representation of each class and to classify a query point (that is, a new point) based on its distance to each class prototype.

Let's say we have a support set comprising images of lions, elephants, and dogs, as shown in the following diagram:

So, we have three classes: {lion, elephant, dog}. Now we need to create a prototypical representation for each of these three classes. How can we build the prototypes? First, we will learn the embedding of each data point using an embedding function. The embedding function can be any function that can be used to extract features. Since our input is an image, we can use a convolutional network as our embedding function, which will...
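The idea described in this section (one prototype per class, and classification by distance to the prototypes) can be sketched in a few lines of NumPy. This is a minimal illustration, not the book's implementation: the toy 2-D vectors stand in for the output of an embedding network, and the helper names `compute_prototypes` and `classify` are hypothetical. It assumes that a prototype is the mean embedding of its class and that the distance is Euclidean, as stated in this chapter's summary.

```python
import numpy as np


def compute_prototypes(embeddings, labels, num_classes):
    """Return one prototype per class: the mean embedding of that class."""
    return np.stack(
        [embeddings[labels == c].mean(axis=0) for c in range(num_classes)]
    )


def classify(query_embeddings, prototypes):
    """Assign each query point to the class with the nearest prototype."""
    # Pairwise differences, shape (num_queries, num_classes, embed_dim).
    diffs = query_embeddings[:, None, :] - prototypes[None, :, :]
    # Euclidean distances, shape (num_queries, num_classes).
    distances = np.sqrt((diffs ** 2).sum(axis=-1))
    return distances.argmin(axis=1), distances


# Toy 2-D "embeddings" (assumed values) standing in for encoder output.
support = np.array([
    [0.0, 0.0], [0.2, 0.0],   # class 0: lion
    [5.0, 5.0], [5.0, 5.2],   # class 1: elephant
    [0.0, 5.0], [0.2, 5.0],   # class 2: dog
])
support_labels = np.array([0, 0, 1, 1, 2, 2])

prototypes = compute_prototypes(support, support_labels, num_classes=3)

queries = np.array([[0.1, 0.3], [4.8, 5.1], [0.3, 4.7]])
predictions, distances = classify(queries, prototypes)
print(predictions)
```

In a real few-shot setup, `support` and `queries` would come from a trained embedding network rather than being written by hand.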

Gaussian prototypical network


Now, we will look at a variant of the prototypical network called the Gaussian prototypical network. We just learned how a prototypical network learns embeddings of the data points, builds each class prototype by taking the mean of the embeddings in that class, and uses the class prototypes to perform classification.

In a Gaussian prototypical network, along with generating embeddings for the data points, we add a confidence region around them, characterized by a Gaussian covariance matrix. A confidence region helps characterize the quality of individual data points and is useful for noisy and less homogeneous data.

So, in Gaussian prototypical networks, the output of the encoder is the embeddings as well as the covariance matrix. Instead of using the full covariance matrix, we include either a radius or a diagonal component of the covariance matrix along with the embeddings:

  • Radius component: If we use the radius component of...
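To make the radius and diagonal options concrete, here is a rough sketch of how an encoder output might be split into an embedding and a covariance estimate, and how that estimate could weight the class prototype. Several things here are assumptions rather than details confirmed by the text above: the function names are hypothetical, the raw covariance outputs are mapped to a positive inverse variance with `1 + softplus(raw)`, and the prototype is taken as an inverse-variance-weighted mean, so that points the encoder is more confident about count for more.

```python
import numpy as np


def split_encoder_output(raw, embed_dim, mode):
    """Split raw encoder output into embeddings and inverse variances.

    mode="radius":   one extra output per point, shared by all dimensions.
    mode="diagonal": embed_dim extra outputs, one per embedding dimension.
    """
    embedding = raw[:, :embed_dim]
    raw_cov = raw[:, embed_dim:]
    expected = 1 if mode == "radius" else embed_dim
    if raw_cov.shape[1] != expected:
        raise ValueError(f"{mode} mode expects {expected} covariance outputs")
    # Assumed transform: 1 + softplus keeps the inverse variance positive.
    inv_var = 1.0 + np.log1p(np.exp(raw_cov))
    # A radius is broadcast so every dimension gets the same value.
    return embedding, np.broadcast_to(inv_var, embedding.shape)


def weighted_prototype(embeddings, inv_var):
    """Inverse-variance-weighted mean of one class's embeddings (assumed rule)."""
    return (inv_var * embeddings).sum(axis=0) / inv_var.sum(axis=0)


# Radius mode: two points of one class, equal confidence.
raw_equal = np.array([[0.0, 0.0, 0.0],
                      [2.0, 2.0, 0.0]])
emb, inv_var_equal = split_encoder_output(raw_equal, embed_dim=2, mode="radius")
proto_equal = weighted_prototype(emb, inv_var_equal)

# Radius mode: the second point is more confident, so it pulls the prototype.
raw_skewed = np.array([[0.0, 0.0, 0.0],
                       [2.0, 2.0, 5.0]])
emb, inv_var_skewed = split_encoder_output(raw_skewed, embed_dim=2, mode="radius")
proto_skewed = weighted_prototype(emb, inv_var_skewed)

# Diagonal mode: confidence differs per dimension.
raw_diag = np.array([[0.0, 0.0, 0.0, 5.0],
                     [2.0, 2.0, 5.0, 0.0]])
emb, inv_var_diag = split_encoder_output(raw_diag, embed_dim=2, mode="diagonal")
proto_diag = weighted_prototype(emb, inv_var_diag)
```

With equal confidence the weighted prototype reduces to the plain mean used by a vanilla prototypical network, which is a useful sanity check for any weighting scheme of this kind.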

Semi-prototypical networks


Now, we will see another interesting variant of prototypical networks called the semi-prototypical network, which handles unlabeled examples. As we know, in a prototypical network, we compute the prototype of each class by taking the mean embedding of that class and then predict the class of each query point by finding its distance to the class prototypes.

Consider the case where our dataset contains some unlabeled data points: how can we use these unlabeled data points when computing the class prototypes?

Let's say we have a support set of labeled examples (x, y), where x is the feature and y is the label, and a query set. Along with these, we have one more set called the unlabeled set, R, which contains only unlabeled examples.

So, what can we do with this unlabeled set?

First, we will compute the class prototype with all the examples given in the support set. Next, we use soft k-means and assign the class for unlabeled examples in R—that is, we assign the class...
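The two steps just described (initial prototypes from the support set, then a soft k-means assignment of the unlabeled examples in R) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the book's code: the soft assignment is taken to be a softmax over negative squared Euclidean distances, the refined prototype is assumed to be a weighted mean over labeled and softly assigned unlabeled embeddings, and the names `soft_assign` and `refine_prototypes` are hypothetical.

```python
import numpy as np


def soft_assign(points, prototypes):
    """Soft class assignment: softmax over negative squared distances."""
    diffs = points[:, None, :] - prototypes[None, :, :]
    logits = -(diffs ** 2).sum(axis=-1)
    # Subtract the row maximum for numerical stability.
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)


def refine_prototypes(support, support_labels, unlabeled, num_classes):
    """Compute support-only prototypes, then refine them with unlabeled data."""
    one_hot = np.eye(num_classes)[support_labels]          # (N, C)
    class_sums = one_hot.T @ support                       # (C, D)
    class_counts = one_hot.sum(axis=0)                     # (C,)
    initial = class_sums / class_counts[:, None]

    soft = soft_assign(unlabeled, initial)                 # (M, C)
    # Labeled points have weight 1; unlabeled points have their soft weight.
    refined = (class_sums + soft.T @ unlabeled) / (
        class_counts + soft.sum(axis=0)
    )[:, None]
    return initial, refined, soft


# Toy 2-D embeddings (assumed values): one labeled example per class.
support = np.array([[0.0, 0.0], [4.0, 0.0]])
support_labels = np.array([0, 1])
unlabeled = np.array([[1.0, 0.0], [1.0, 0.0], [5.0, 0.0]])

initial, refined, soft = refine_prototypes(
    support, support_labels, unlabeled, num_classes=2
)
print(initial)
print(refined)
```

In this toy case the two unlabeled points near class 0 pull its prototype to the right, and the unlabeled point near class 1 does the same for the second prototype.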

Summary


In this chapter, we started off with prototypical networks, and we saw how a prototypical network computes the class prototype using the embedding function and predicts the class label of the query set by comparing the Euclidean distance between the class prototype and the query set embeddings. Following this, we experimented with a prototypical network by performing classification on the Omniglot dataset. Then, we learned about the Gaussian prototypical network, which, along with the embeddings, also uses the covariance matrix to compute the class prototype. Following this, we explored semi-prototypical networks, which are used to handle semi-supervised classification. In the next chapter, we will learn about relation and matching networks.

Questions


  1. What is a prototypical network?
  2. What is the use of computing embeddings?
  3. How do we calculate the class prototype?
  4. What is a Gaussian prototypical network?
  5. How do Gaussian prototypical networks differ from vanilla ones?
  6. What are the different components of the covariance matrix used in a Gaussian prototypical network?

Further reading

