
You're reading from  Hands-On Meta Learning with Python

Product type: Book
Published in: Dec 2018
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781789534207
Edition: 1st Edition
Author: Sudharsan Ravichandiran

Sudharsan Ravichandiran is a data scientist and artificial intelligence enthusiast. He holds a bachelor's degree in Information Technology from Anna University. His research focuses on practical implementations of deep learning and reinforcement learning, including natural language processing and computer vision. He is an open-source contributor and loves answering questions on Stack Overflow.

Chapter 8. Gradient Agreement as an Optimization Objective

In the last chapter, we learned about the Meta-SGD and Reptile algorithms. We saw how Meta-SGD is used to find the optimal parameter, the optimal learning rate, and the gradient update direction. We also saw how the Reptile algorithm works and how it is more efficient than MAML. In this chapter, we'll learn how gradient agreement is used as an optimization objective for meta learning. As you saw in MAML, we took an average of the gradients across tasks and used it to update the model parameter. In the gradient agreement algorithm, we'll instead take a weighted average of the gradients, and we'll see how weighting the gradients helps us find a better model parameter. We'll explore exactly how the gradient agreement algorithm works in this chapter. The gradient agreement algorithm can be plugged into both MAML and Reptile. We'll also see how to implement gradient agreement in MAML from scratch.
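To make the contrast between the two updates concrete, they can be written side by side. This is only a sketch in commonly used MAML notation, all of which is assumed here rather than taken from this excerpt: $n$ is the number of sampled tasks, $\beta$ is the meta step size, $\theta'_i$ is the parameter adapted to task $T_i$, and $w_i$ is the weight assigned to that task's gradient.

```latex
% Plain average of task gradients (MAML-style meta update)
\theta \leftarrow \theta - \beta \, \frac{1}{n} \sum_{i=1}^{n} \nabla_\theta L_{T_i}(f_{\theta'_i})

% Weighted combination of task gradients (gradient agreement)
\theta \leftarrow \theta - \beta \sum_{i=1}^{n} w_i \, \nabla_\theta L_{T_i}(f_{\theta'_i})
```

The only difference is that the uniform factor $1/n$ is replaced by a per-task weight $w_i$.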

In this...

Gradient agreement as an optimization


The gradient agreement algorithm is an interesting and recently introduced algorithm that acts as an enhancement to meta learning algorithms. In MAML and Reptile, we try to find a model parameter that generalizes across several related tasks so that we can learn quickly from few data points. Recall from the previous chapters that we randomly initialize the model parameter $\theta$ and then sample a random batch of tasks, $T_i$, from the task distribution, $p(T)$. For each of the sampled tasks, $T_i$, we minimize the loss by calculating gradients and we get the updated parameters, $\theta'_i$, and that forms our inner loop:

$$\theta'_i = \theta - \alpha \nabla_\theta L_{T_i}(f_\theta)$$

Here, $\alpha$ is the inner-loop learning rate. After calculating the optimal parameter for each of the sampled tasks, we perform meta optimization: we calculate the loss on a new set of tasks, minimize it by calculating gradients with respect to the optimal parameters $\theta'_i$, which we obtained in the inner loop, and we update our...
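The inner-loop step described above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the book's implementation: the single-layer sigmoid model, the cross-entropy loss, the step size `alpha`, and the helper names `loss_and_grad` and `inner_loop` are all hypothetical choices made for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_and_grad(theta, x, y):
    """Binary cross-entropy of a single-layer sigmoid model, and its gradient."""
    y_hat = sigmoid(x @ theta)
    eps = 1e-12  # avoids log(0)
    loss = -np.mean(y * np.log(y_hat + eps) + (1 - y) * np.log(1 - y_hat + eps))
    grad = x.T @ (y_hat - y) / len(x)
    return loss, grad

def inner_loop(theta, tasks, alpha=0.1):
    """One gradient step per task: theta_i' = theta - alpha * grad_i."""
    adapted = []
    for x, y in tasks:
        _, grad = loss_and_grad(theta, x, y)
        adapted.append(theta - alpha * grad)
    return adapted

# Toy batch of three tasks with random inputs and random binary labels
# (assumed data, for illustration only).
rng = np.random.default_rng(0)
theta = rng.normal(size=(5, 1))
tasks = [
    (rng.random((10, 5)), rng.integers(0, 2, size=(10, 1)).astype(float))
    for _ in range(3)
]
adapted = inner_loop(theta, tasks)
```

Each entry of `adapted` plays the role of one task-specific parameter; the shared `theta` itself is left unchanged by the inner loop.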

Building gradient agreement algorithm with MAML


In the last section, we saw how the gradient agreement algorithm works and how it adds weights to the gradients to indicate their importance. Now, we'll see how to use the gradient agreement algorithm with MAML by coding it from scratch using NumPy. For better understanding, we'll consider a simple binary classification task. We'll randomly generate our input data, train a simple single-layer neural network on it, and try to find the optimal parameter θ.

Now we'll see step by step exactly how to do this.

You can also check out the whole code, available as a Jupyter Notebook here: https://github.com/sudharsan13296/Hands-On-Meta-Learning-With-Python/blob/master/08.%20Gradient%20Agreement%20As%20An%20Optimization%20Objective/8.4%20Building%20Gradient%20Agreement%20Algorithm%20with%20MAML.ipynb.

We import all of the necessary libraries:

import numpy as np

Generating data points

Now, we define a function called sample_points for generating...
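The original definition is cut off in this excerpt. As a purely hypothetical sketch of what such a data generator might look like for a binary classification task, assuming uniformly random features and random 0/1 labels (the `num_features` and `rng` parameters are inventions of this example, not part of the source):

```python
import numpy as np

def sample_points(k, num_features=50, rng=None):
    """Return k random inputs and k random binary labels (illustrative only)."""
    rng = np.random.default_rng() if rng is None else rng
    x = rng.random((k, num_features))      # features drawn uniformly from [0, 1)
    y = rng.integers(0, 2, size=(k, 1))    # labels drawn from {0, 1}
    return x, y

x, y = sample_points(10, rng=np.random.default_rng(0))
print(x.shape, y.shape)
```

Passing a seeded generator keeps the sample reproducible, which helps when comparing runs.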

Summary


In this chapter, we've learned about the gradient agreement algorithm. We've seen how the gradient agreement algorithm uses a weighted gradient to find a better initial model parameter, θ. We also saw how these weights are proportional to the inner product of the gradients of a task and the average of the gradients of all of the tasks in a sampled batch. We also explored how the gradient agreement algorithm can be plugged into both MAML and Reptile. Following this, we saw how to find the optimal parameter θ in a classification task using the gradient agreement algorithm.
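The weighting rule recapped above can be sketched in a few lines of NumPy. This is an illustration under assumptions, not the chapter's code: the function name `gradient_agreement_weights` is hypothetical, and the normalization by the sum of absolute agreement scores is one common choice, assumed here.

```python
import numpy as np

def gradient_agreement_weights(grads):
    """Weight each task gradient by its inner product with the batch average."""
    g = np.stack([np.ravel(gi) for gi in grads])   # shape: (n_tasks, n_params)
    agreement = g @ g.mean(axis=0)                 # inner product with the average gradient
    norm = np.sum(np.abs(agreement))               # assumed normalization factor
    if norm == 0:
        return np.zeros(len(grads))
    return agreement / norm

# Two tasks agree on the direction, one disagrees (toy gradients).
grads = [np.array([1.0, 0.0]), np.array([1.0, 0.0]), np.array([-1.0, 0.0])]
w = gradient_agreement_weights(grads)
weighted_grad = sum(wi * gi for wi, gi in zip(w, grads))
```

In this toy batch, the task whose gradient points against the average receives a negative weight, so its contribution to `weighted_grad` is flipped rather than allowed to cancel the others.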

In the next chapter, we'll learn about some of the recent advancements in meta learning, such as task-agnostic meta learning, learning to learn in the concept space, and meta imitation learning.

Questions


  1. What is gradient agreement and disagreement?
  2. What is the update equation of MAML in gradient agreement?
  3. What are the weights in gradient agreement?
  4. How are the weights computed?
  5. What is a normalization factor?
  6. When do we increase and decrease weights?
