
You're reading from  Deep Learning with MXNet Cookbook

Product type: Book
Published in: Dec 2023
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781800569607
Edition: 1st
Author: Andrés P. Torres

Andrés P. Torres is the Head of Perception at Oxa, a global leader in industrial autonomous vehicles, where he leads the design and development of state-of-the-art algorithms for autonomous driving. Before that, Andrés was an advisor and Head of AI at Maekersuite, an early-stage content generation startup, where he developed several AI-based algorithms for mobile phones and the web. Prior to this, Andrés was a Software Development Manager at Amazon Prime Air, developing software to optimize operations for autonomous drones.

Optimizing Models with Transfer Learning and Fine-Tuning

As models grow in size (in depth and in the number of processing modules per layer), the cost of training them grows dramatically: each epoch takes longer, and more epochs are typically required to reach optimal performance.

For this reason, MXNet provides state-of-the-art pre-trained models via the GluonCV and GluonNLP libraries. As we have seen in previous chapters, these models can help us solve a variety of problems when our final dataset is similar to the one the selected model was pre-trained on.

However, sometimes this is not good enough, and our final dataset might have some nuances that the pre-trained model does not pick up. In these cases, it is ideal to combine the stored knowledge of the pre-trained model with our final dataset. This is called transfer learning: the knowledge of the pre-trained model is transferred to a new task (our final dataset).

In this chapter, we will learn how to use GluonCV and...

Technical requirements

Apart from the technical requirements specified in the Preface, the following technical requirements apply:

  • Ensure that you have completed the first recipe, Installing MXNet, Gluon, GluonCV and GluonNLP, from Chapter 1, Up and Running with MXNet.
  • Ensure that you have completed Chapter 5, Analyzing Images with Computer Vision, and Chapter 6, Understanding Text with Natural Language Processing.

The code for this chapter can be found at the following GitHub URL: https://github.com/PacktPublishing/Deep-Learning-with-MXNet-Cookbook/tree/main/ch07.

Furthermore, you can access each recipe directly from Google Colab; for example, the first recipe of this chapter can be found here: https://colab.research.google.com/github/PacktPublishing/Deep-Learning-with-MXNet-Cookbook/blob/main/ch07/7_1_Understanding_Transfer_Learning_and_Fine_Tuning.ipynb.

Understanding transfer learning and fine-tuning

In the previous chapters, we saw how we could leverage MXNet, GluonCV, and GluonNLP to retrieve models pre-trained on certain datasets (such as ImageNet, MS COCO, and IWSLT2015) and use them for our specific tasks and datasets.

In this recipe, we will introduce a methodology called transfer learning, which will allow us to combine the information from pre-trained models (on general knowledge datasets) and the information from the new domain (the dataset from the task we want to solve). There are two significant advantages to this approach. On the one hand, pre-training datasets are typically large-scale (ImageNet-22k has 14 million images), and using a pre-trained model saves us that training time. On the other hand, we use our specific dataset not only for evaluation but also for training the model, improving its performance in the desired scenario. As we will discover, there is not always an easy way to achieve this, as it requires...

Improving performance for classifying images

Having introduced transfer learning and fine-tuning in the previous recipe, in this one we will apply them to image classification, a CV task.

In the second recipe, Classifying images with MXNet – GluonCV Model Zoo, AlexNet, and ResNet, in Chapter 5, Analyzing Images with Computer Vision, we saw how we could use GluonCV to retrieve pre-trained models and use them for an image classification task. First, we trained them from scratch, leveraging only the architecture of the pre-trained model: the weights were re-initialized, discarding any knowledge gained during pre-training. Afterward, we used the pre-trained models directly for the task, thereby also leveraging the weights/parameters of the model.

In this recipe, we will combine the weights/parameters of the model with the target dataset, applying the...

Improving performance for segmenting images

In this recipe, we will apply transfer learning and fine-tuning to semantic segmentation, a CV task.

In the fourth recipe, Segmenting objects in images with MXNet: PSPNet and DeepLab-v3, in Chapter 5, Analyzing Images with Computer Vision, we saw how we could use GluonCV to retrieve pre-trained models and use them directly for a semantic segmentation task, effectively leveraging past knowledge by using the architecture and the weights/parameters of the pre-trained model.

In this recipe, we will continue leveraging the weights/parameters of the model, obtained from a source task: segmenting images into a set of 21 classes with semantic segmentation models. The dataset used for pre-training was MS COCO (the source task), and we will run several experiments to evaluate our models on a new (target) task, using the Penn-Fudan Pedestrian dataset. In these experiments, we will also include knowledge from the target dataset to improve...

Improving performance for translating English to German

In the previous recipes, we have seen how we can leverage pre-trained models and new datasets for transfer learning and fine-tuning applied to CV tasks. In this recipe, we will follow a similar approach, but with an NLP task, translating from English to German.

In the fourth recipe, Translating text from Vietnamese to English, in Chapter 6, Understanding Text with Natural Language Processing, we saw how we could use GluonNLP to retrieve pre-trained model architectures and train them from scratch for a translation task, effectively leveraging only the architecture of the pre-trained model.

In this recipe, we will also leverage the weights/parameters of the model, obtained from a source task: translating text from English to German with machine translation models. The dataset used for pre-training will be WMT2014 (the source task), and we will run several experiments to evaluate...

