Deep Learning with MXNet Cookbook

Product type: Book
Published: Dec 2023
Publisher: Packt
ISBN-13: 9781800569607
Pages: 370
Edition: 1st
Author: Andrés P. Torres

Table of Contents (12 chapters)

Preface
Chapter 1: Up and Running with MXNet
Chapter 2: Working with MXNet and Visualizing Datasets – Gluon and DataLoader
Chapter 3: Solving Regression Problems
Chapter 4: Solving Classification Problems
Chapter 5: Analyzing Images with Computer Vision
Chapter 6: Understanding Text with Natural Language Processing
Chapter 7: Optimizing Models with Transfer Learning and Fine-Tuning
Chapter 8: Improving Training Performance with MXNet
Chapter 9: Improving Inference Performance with MXNet
Index
Other Books You May Enjoy

Optimizing Models with Transfer Learning and Fine-Tuning

As models grow in size (in depth and in the number of processing modules per layer), training time grows dramatically: each epoch takes longer, and typically, more epochs are required to reach optimum performance.

For this reason, MXNet provides state-of-the-art pre-trained models via the GluonCV and GluonNLP libraries. As we have seen in previous chapters, these models can help us solve a variety of problems when our final dataset is similar to the one the selected model was pre-trained on.

However, sometimes this is not good enough, and our final dataset might have some nuances that the pre-trained model fails to capture. In these cases, it would be ideal to combine the stored knowledge of the pre-trained model with our final dataset. This is called transfer learning: the knowledge of our pre-trained model is transferred to a new task (the final dataset).

In this chapter, we will learn how to use GluonCV and...

Technical requirements

Apart from the technical requirements specified in the Preface, the following technical requirements apply:

  • Ensure that you have completed the first recipe, Installing MXNet, Gluon, GluonCV and GluonNLP, from Chapter 1, Up and Running with MXNet.
  • Ensure that you have completed Chapter 5, Analyzing Images with Computer Vision, and Chapter 6, Understanding Text with Natural Language Processing.

The code for this chapter can be found at the following GitHub URL: https://github.com/PacktPublishing/Deep-Learning-with-MXNet-Cookbook/tree/main/ch07.

Furthermore, you can access each recipe directly from Google Colab; for example, the first recipe of this chapter can be found here: https://colab.research.google.com/github/PacktPublishing/Deep-Learning-with-MXNet-Cookbook/blob/main/ch07/7_1_Understanding_Transfer_Learning_and_Fine_Tuning.ipynb.

Understanding transfer learning and fine-tuning

In the previous chapters, we saw how we could leverage MXNet, GluonCV, and GluonNLP to retrieve models pre-trained on certain datasets (such as ImageNet, MS COCO, and IWSLT2015) and use them for our specific tasks and datasets.

In this recipe, we will introduce a methodology called transfer learning, which will allow us to combine the information from pre-trained models (trained on general knowledge datasets) with the information from the new domain (the dataset for the task we want to solve). This approach has two significant advantages. On the one hand, pre-training datasets are typically large-scale (ImageNet-22k has 14 million images), and using a pre-trained model saves us that training time. On the other hand, we use our specific dataset not only for evaluation but also for training the model, improving its performance in the desired scenario. As we will discover, there is not always an easy way to achieve this, as it requires...

Improving performance for classifying images

After introducing transfer learning and fine-tuning in the previous recipe, in this one, we will apply these techniques to image classification, a CV task.

In the second recipe, Classifying images with MXNet – GluonCV Model Zoo, AlexNet, and ResNet, in Chapter 5, Analyzing Images with Computer Vision, we saw how we could use GluonCV to retrieve pre-trained models and use them directly for an image classification task. First, we trained the models from scratch, leveraging only the architecture of the pre-trained model; the pre-trained weights were re-initialized, discarding any stored knowledge. Afterward, the pre-trained models were used directly for the task, also leveraging the weights/parameters of the model.

In this recipe, we will combine the weights/parameters of the model with the target dataset, applying the...

Improving performance for segmenting images

In this recipe, we will apply transfer learning and fine-tuning to semantic segmentation, a CV task.

In the fourth recipe, Segmenting objects in images with MXNet: PSPNet and DeepLab-v3, in Chapter 5, Analyzing Images with Computer Vision, we saw how we could use GluonCV to retrieve pre-trained models and use them directly for a semantic segmentation task, effectively leveraging past knowledge by using the architecture and the weights/parameters of the pre-trained model.

In this recipe, we will continue leveraging the weights/parameters of the model, obtained by pre-training semantic segmentation models on a task that classifies pixels among a set of 21 classes. The dataset used for pre-training was MS COCO (the source task), and we will run several experiments to evaluate our models on a new (target) task, using the Penn-Fudan Pedestrian dataset. In these experiments, we will also include knowledge from the target dataset to improve...
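The experiments themselves are beyond this preview, but segmentation results are commonly compared with mean intersection-over-union (mIoU). A small NumPy helper (my own sketch, not the book's code) illustrates the metric on predicted and ground-truth class masks:

```python
import numpy as np

def mean_iou(pred, label, num_classes):
    """Mean intersection-over-union between two integer class masks."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, label == c).sum()
        union = np.logical_or(pred == c, label == c).sum()
        if union > 0:  # skip classes absent from both masks
            ious.append(inter / union)
    return float(np.mean(ious))

# Toy 2x2 masks: class 0 IoU = 2/3, class 1 IoU = 1/2.
pred = np.array([[0, 0], [1, 1]])
label = np.array([[0, 0], [1, 0]])
print(mean_iou(pred, label, num_classes=2))  # ≈ 0.5833
```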

Improving performance for translating English to German

In the previous recipes, we have seen how we can leverage pre-trained models and new datasets for transfer learning and fine-tuning applied to CV tasks. In this recipe, we will follow a similar approach, but with an NLP task, translating from English to German.

In the fourth recipe, Translating text from Vietnamese to English, in Chapter 6, Understanding Text with Natural Language Processing, we saw how we could use GluonNLP to retrieve model architectures and train them from scratch for a translation task, effectively leveraging only the architecture of the pre-trained model.

In this recipe, we will also leverage the weights/parameters of the model, obtained by pre-training machine translation models on a task that translates text from English to German. The dataset used for pre-training is WMT2014 (the source task), and we will run several experiments to evaluate...
