You're reading from Hands-On Meta Learning with Python
- Meta learning produces a versatile AI model that can learn to perform various tasks without being trained from scratch. We train the meta learning model on various related tasks with only a few data points each, so that for a new but related task, the model can make use of what it learned from the previous tasks.
- Learning from fewer data points is called few-shot learning or k-shot learning, where k denotes the number of data points in each of the classes in the dataset.
- In order to make our model learn from a few data points, we train it in the same few-shot regime: given a dataset D, we sample some data points from each of the classes present in the dataset and call this the support set.
- We then sample a different set of data points from the same classes, disjoint from the support set, and call this the query set.
- In a metric-based meta learning setting, we will learn the appropriate metric space. Let's say we want to find out the similarities between two images. In a metric-based setting, we use a simple neural network, which extracts the features from the two images and finds the similarities by computing the distance between the features of those two images.
- We train our model in an episodic fashion; that is, in each episode, we sample a few data points from our dataset D, prepare a support set, and learn from it. Over a series of episodes, our model learns how to learn from a small dataset.
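The episodic sampling described above can be sketched as follows. This is an illustrative NumPy sketch, not the book's code: `sample_episode`, the dict-based `dataset`, and all parameter names are hypothetical.

```python
import numpy as np

def sample_episode(dataset, n_way, k_shot, n_query, seed=None):
    """Sample one episode: an n-way, k-shot support set plus a disjoint
    query set. `dataset` is a hypothetical dict mapping a class label to
    an array of examples."""
    rng = np.random.default_rng(seed)
    classes = rng.choice(list(dataset.keys()), size=n_way, replace=False)
    support, query = [], []
    for c in classes:
        idx = rng.permutation(len(dataset[c]))
        # k data points per class form the support set
        support.append((c, dataset[c][idx[:k_shot]]))
        # a disjoint set of points from the same class forms the query set
        query.append((c, dataset[c][idx[k_shot:k_shot + n_query]]))
    return support, query
```

Each call to `sample_episode` produces one episode; training loops over many such episodes.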
- A siamese network is a special type of neural network, and it is one of the simplest and most commonly used one-shot learning algorithms. Siamese networks basically consist of two symmetrical neural networks that share the same weights and architecture and are joined together at the end using an energy function, E.
- The contrastive loss function can be expressed as follows:

  Contrastive loss = Y * E^2 + (1 - Y) * max(0, margin - E)^2

In the preceding equation, Y is the true label, which is 1 when the two inputs are similar and 0 when they are dissimilar, and E is our energy function, which can be any distance measure. The margin term holds the constraint: when two inputs are dissimilar and their distance is already greater than the margin, they incur no loss.
The energy function tells us how similar the two inputs are. It is basically any similarity measure, such as Euclidean distance and cosine similarity.
The input to the siamese network should be in pairs, (X1, X2), along with their binary label, Y ∈ {0, 1}, stating whether the input pair is a genuine pair (the same) or an impostor pair (different).
The applications of siamese networks are endless; they've been combined with various architectures for performing various tasks, such as human action recognition, scene change detection, and machine translation.
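The contrastive loss described above can be sketched in NumPy. This is a minimal illustrative implementation, assuming Y = 1 for similar pairs and Y = 0 for dissimilar pairs, as defined in the text; the function name is hypothetical.

```python
import numpy as np

def contrastive_loss(E, Y, margin=1.0):
    """Contrastive loss: Y = 1 for similar pairs, Y = 0 for dissimilar
    pairs; E is the energy (distance) between the two embeddings.
    A dissimilar pair farther apart than `margin` incurs no loss."""
    E, Y = np.asarray(E, dtype=float), np.asarray(Y, dtype=float)
    return float(np.mean(Y * E ** 2 + (1 - Y) * np.maximum(margin - E, 0) ** 2))
```

Note how a dissimilar pair with distance greater than the margin contributes zero loss, matching the constraint described in the text.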
- Prototypical networks are simple, efficient, and one of the most popularly used few-shot learning algorithms. The basic idea of the prototypical network is to create a prototypical representation of each class and classify a query point (new point) based on the distance between the class prototype and the query point.
- We compute embeddings for each of the data points to learn the features.
- Once we learn the embeddings of each data point, we take the mean of the embeddings of the data points in each class to form the class prototype. So, a class prototype is simply the mean embedding of the data points in a class.
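The prototype computation and nearest-prototype classification can be sketched as follows. This is an illustrative NumPy sketch (the function names are hypothetical), assuming the embeddings have already been produced by some embedding network.

```python
import numpy as np

def class_prototypes(embeddings, labels):
    """Each class prototype is the mean of that class's embeddings."""
    classes = np.unique(labels)
    protos = np.stack([embeddings[labels == c].mean(axis=0) for c in classes])
    return classes, protos

def classify_query(query_emb, classes, protos):
    """Assign the query point to the class with the nearest prototype
    (Euclidean distance)."""
    dists = np.linalg.norm(protos - query_emb, axis=1)
    return classes[np.argmin(dists)]
```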
- In a Gaussian prototypical network, along with generating embeddings for the data points, we add a confidence region around them, which is characterized by a Gaussian covariance matrix. Having a confidence region helps to characterize the quality of individual data points, and it is useful with noisy and less homogeneous data.
- Gaussian prototypical networks differ from vanilla prototypical networks in that in a vanilla prototypical network, we learn only the embeddings of a data point, but in a Gaussian prototypical network, along with learning embeddings, we also add a confidence region to them.
- The radius and diagonal are the different components of the covariance matrix used in a Gaussian prototypical network.
- A relation network consists of two important functions: the embedding function, denoted by f_φ, and the relation function, denoted by g_Φ.
- Once we have the feature vectors of the support set, f_φ(x_i), and the query set, f_φ(x_j), we combine them using an operator, Z. Here, Z can be any combination operator; we use concatenation as the operator to combine the feature vectors of the support set and the query set, that is, Z(f_φ(x_i), f_φ(x_j)).
- The relation function, g_Φ, will generate a relation score, r_ij, ranging from 0 to 1, representing the similarity between a sample in the support set, x_i, and a sample in the query set, x_j.
Our loss function can be represented as follows:

  L = Σ_{i,j} (r_ij - 1(y_i == y_j))^2

That is, a mean squared error that regresses the relation score to 1 for matching pairs and to 0 for mismatched pairs.
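The combination operator and the regression-style loss described above can be sketched as follows. This is an illustrative NumPy sketch with hypothetical function names; in practice, the relation scores would come from the relation network itself.

```python
import numpy as np

def combine(support_feat, query_feat):
    """The Z operator: concatenate the two feature vectors."""
    return np.concatenate([support_feat, query_feat], axis=-1)

def relation_loss(relation_scores, same_class):
    """Mean squared error: regress the relation score toward 1 for
    matching (same-class) pairs and toward 0 for mismatched pairs."""
    target = np.asarray(same_class, dtype=float)
    return float(np.mean((relation_scores - target) ** 2))
```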
- In matching networks, we use two embedding functions, f and g, to learn the embeddings of the query set and the support set, respectively.
- The output, ŷ, for the query point, x̂, can be predicted as follows:

  ŷ = Σ_{i=1}^{k} a(x̂, x_i) y_i

  where a(x̂, x_i) is an attention mechanism over the support samples x_i and y_i are the support labels.
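The attention-weighted prediction above can be sketched as follows. This is an illustrative NumPy sketch assuming the attention is a softmax over cosine similarities (a common choice in matching networks); the function names are hypothetical, and the embeddings are assumed to be precomputed.

```python
import numpy as np

def attention(query_emb, support_embs):
    """a(x̂, x_i): softmax over the cosine similarities between the
    query embedding and each support embedding."""
    sims = support_embs @ query_emb / (
        np.linalg.norm(support_embs, axis=1) * np.linalg.norm(query_emb) + 1e-8)
    e = np.exp(sims - sims.max())
    return e / e.sum()

def predict(query_emb, support_embs, support_labels_onehot):
    """ŷ = Σ_i a(x̂, x_i) · y_i: a weighted sum of the support labels."""
    return attention(query_emb, support_embs) @ support_labels_onehot
```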
- NTM is an interesting algorithm that has the ability to store and retrieve information from memory. The idea of NTM is to augment the neural network with external memory—that is, instead of using hidden states as memory, it uses external memory to store and retrieve information.
- The controller is basically a feed-forward neural network or recurrent neural network. It reads from and writes to memory.
- The read head and write head are the pointers containing the addresses of the memory locations that the controller has to read from and write to.
- The memory matrix or memory bank, or simply the memory, is where we will store the information. Memory is basically a two-dimensional matrix composed of memory cells. The memory matrix contains N rows and M columns. Using the controller, we access the content from the memory. So, the controller receives input from the external environment and emits the response by interacting with the memory matrix.
- Location-based addressing and content-based addressing are the different types of addressing mechanisms used in NTM.
- An interpolation gate, g_t, is used to decide whether we should use the weights we obtained at the previous time step, w_{t-1}, or the weights obtained through content-based addressing, w_t^c.
- Computing the least-used weight vector, w_t^{lu}, from the usage weight vector, w_t^u, is very simple. We simply set the index of the lowest value in the usage weight vector to 1 and the rest of the values to 0, as the lowest value in the usage weight vector means that the location was least recently used.
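The two addressing pieces described above can be sketched in NumPy. This is an illustrative sketch, not the book's code; the function names are hypothetical.

```python
import numpy as np

def interpolate(w_prev, w_content, g):
    """Interpolation gate g_t: blend the previous time step's weights
    w_{t-1} with the content-based weights w_t^c."""
    return g * w_content + (1 - g) * w_prev

def least_used(w_usage):
    """Least-used weight vector: 1 at the index of the smallest usage
    weight, 0 everywhere else."""
    w_lu = np.zeros_like(w_usage)
    w_lu[np.argmin(w_usage)] = 1.0
    return w_lu
```

With g = 1 the gate keeps only the content-based weights; with g = 0 it keeps only the previous weights.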
- MAML is one of the most recently introduced and most commonly used meta learning algorithms, and it has led to a major breakthrough in meta learning research. The basic idea of MAML is to find better initial parameters so that, with good initial parameters, the model can learn quickly on new tasks with fewer gradient steps.
- MAML is model agnostic, meaning that we can apply it to any model that is trainable with gradient descent.
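The inner/outer structure of MAML can be sketched on a toy problem. This is an illustrative first-order sketch (it omits the second-order terms of full MAML) on 1-D linear regression tasks of the form y ≈ θ·x; the function and parameter names are hypothetical, not the book's code.

```python
import numpy as np

def maml_step(theta, tasks, alpha=0.1, beta=0.01):
    """One first-order meta-update. Each task is a pair (x, y); the model
    is y ≈ theta * x with mean squared error loss."""
    def grad(theta, x, y):
        # d/d theta of mean((theta*x - y)^2)
        return np.mean(2 * (theta * x - y) * x)

    meta_grad = 0.0
    for x, y in tasks:
        theta_i = theta - alpha * grad(theta, x, y)   # inner adaptation step
        meta_grad += grad(theta_i, x, y)              # gradient at adapted params
    return theta - beta * meta_grad / len(tasks)      # outer (meta) update
```

Repeatedly applying `maml_step` moves θ toward an initialization from which a single inner step fits each task well.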
- ADML is a variant of MAML that makes use of both clean and adversarial samples to find a better and more robust initial model parameter, θ.
- In FGSM, we generate the adversarial sample of an image by calculating the gradients of our loss with respect to the image (that is, with respect to the input pixels of the image) instead of with respect to the model parameters.
- The context parameter is a task-specific parameter that's updated in the inner loop. It is denoted by φ; it is specific to each task and represents the embeddings of an individual task.
- The shared parameter is shared across tasks and updated in the outer loop to find the optimal model parameter. It is denoted by θ.
- Unlike MAML, in Meta-SGD, along with finding the optimal parameter value, θ, we also find the optimal learning rate, α, and the update direction.
- The learning rate is implicitly implemented in the adaptation term. So, in Meta-SGD, we don't initialize the learning rate with a small scalar value. Instead, we initialize the learning rates with random values of the same shape as θ and learn them along with θ.
- The update equation of the learning rate can be expressed as (θ, α) = (θ, α) - β ∇_{(θ,α)} Σ_{T_i ~ p(T)} L_{T_i}(θ'_i); that is, α is meta-learned jointly with θ by gradient descent on the meta-objective.
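The per-parameter learning rate in the adaptation term can be sketched as follows. This is an illustrative NumPy sketch with hypothetical names: α is a vector of the same shape as θ, applied element-wise, rather than a scalar.

```python
import numpy as np

def meta_sgd_adapt(theta, alpha, grad):
    """Meta-SGD inner update: theta' = theta - alpha ∘ grad, where alpha
    has the same shape as theta (element-wise product), so each
    parameter effectively gets its own learned learning rate."""
    return theta - alpha * grad

# alpha is initialized with random values of theta's shape, not a scalar
rng = np.random.default_rng(0)
theta = np.zeros(3)
alpha = rng.uniform(0.001, 0.1, size=theta.shape)
```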
- Sample n tasks and run SGD for a few iterations on each of the sampled tasks, and then update our model parameter in a direction that is common to all the tasks.
- The Reptile update equation can be expressed as θ = θ + ε (1/n) Σ_{i=1}^{n} (θ'_i - θ), where θ'_i is the parameter obtained after training on the i-th sampled task.
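The Reptile meta-update can be sketched as follows. This is an illustrative NumPy sketch with hypothetical names; the task-adapted parameters θ'_i are assumed to come from a few SGD steps on each sampled task.

```python
import numpy as np

def reptile_update(theta, adapted_thetas, epsilon=0.1):
    """Reptile meta-update: move theta toward the average of the
    task-adapted parameters, theta ← theta + ε · mean_i(theta'_i - theta)."""
    mean_adapted = np.mean(adapted_thetas, axis=0)
    return theta + epsilon * (mean_adapted - theta)
```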
- When the gradients of all tasks are in the same direction, it is called gradient agreement, and when the gradients of some tasks differ greatly from the others, it is called gradient disagreement.
- The update equation in gradient agreement can be expressed as θ = θ - β Σ_i w_i g_i, where g_i is the gradient of task T_i and w_i is its weight.
- Weights are proportional to the inner product of the gradients of a task and the average of gradients of all of the tasks in the sampled batch of tasks.
The weights are calculated as follows:

  w_i = (g_i · g_avg) / Σ_j |g_j · g_avg|

where g_i is the gradient of task T_i and g_avg is the average of the gradients of all tasks in the sampled batch.
- The normalization factor is proportional to the inner product of g_i and g_avg.
- If the gradient of a task is in the same direction as the average gradient of all tasks in a sampled batch of tasks, then we can increase its weights so that it'll contribute more when updating our model parameter. Similarly, if the gradient of a task is in the direction that's greatly different from the average gradient of all tasks in a sampled batch of tasks, then we can decrease its weights so that it'll contribute less when updating our model parameter.
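The weighting scheme described above can be sketched as follows. This is an illustrative NumPy sketch with a hypothetical function name: each task's gradient is weighted by its inner product with the batch-average gradient, normalized by the sum of the absolute inner products.

```python
import numpy as np

def agreement_weights(task_grads):
    """Weight each task's gradient by its inner product with the average
    gradient over the batch; tasks whose gradients agree with the
    average get larger weights, disagreeing tasks get smaller (or
    negative) weights."""
    G = np.stack(task_grads)
    g_avg = G.mean(axis=0)
    inner = G @ g_avg                       # inner product with the average
    return inner / (np.abs(inner).sum() + 1e-12)   # normalization factor
```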
- Different types of inequality measures are the Gini coefficient, the Theil index, and variance.
- The Theil index is the most commonly used inequality measure. It's named after a Dutch econometrician, Henri Theil, and it's a special case of the family of inequality measures called generalized entropy measures. It can be defined as the difference between the maximum entropy and observed entropy.
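The Theil index can be computed directly from its definition. This is an illustrative NumPy sketch (the function name is hypothetical), assuming strictly positive values.

```python
import numpy as np

def theil_index(values):
    """Theil index: T = (1/N) Σ_i (x_i / μ) · ln(x_i / μ), where μ is
    the mean of the values. T is 0 when all values are equal (observed
    entropy equals maximum entropy) and grows with inequality."""
    x = np.asarray(values, dtype=float)
    r = x / x.mean()
    return float(np.mean(r * np.log(r)))
```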
- If we enable our robot to learn by just looking at our actions, then we can easily make the robot learn complex goals efficiently and we don't have to engineer complex goal and reward functions. This type of learning—that is, learning from human actions—is called imitation learning, where the robot tries to mimic human action.
- A concept generator is used to extract features. We can use a deep neural net, parameterized by some parameter, to generate the concepts. For example, our concept generator can be a CNN if our input is an image.
We sample a batch of tasks from the task distributions, learn their concepts via the concept generator, perform meta learning on those concepts, and then we compute the meta learning loss: