
You're reading from  Deep Learning with MXNet Cookbook

Product type: Book
Published in: Dec 2023
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781800569607
Edition: 1st
Author: Andrés P. Torres

Andrés P. Torres is the Head of Perception at Oxa, a global leader in industrial autonomous vehicles, where he leads the design and development of state-of-the-art algorithms for autonomous driving. Before that, Andrés was an advisor and Head of AI at Maekersuite, an early-stage content generation startup, where he developed several AI-based algorithms for mobile phones and the web. Prior to this, Andrés was a Software Development Manager at Amazon Prime Air, developing software to optimize operations for autonomous drones.

Optimizing Models with Transfer Learning and Fine-Tuning

As models grow in size (in depth and in the number of processing modules per layer), the cost of training them grows dramatically: each epoch takes longer, and more epochs are typically required to reach optimal performance.

For this reason, MXNet provides state-of-the-art pre-trained models via the GluonCV and GluonNLP libraries. As we have seen in previous chapters, these models can help us solve a variety of problems when our final dataset is similar to the one the selected model was pre-trained on.

However, sometimes this is not good enough, and our final dataset might have some nuances that the pre-trained model does not pick up. In these cases, it is ideal to combine the stored knowledge of the pre-trained model with our final dataset. This is called transfer learning: the knowledge of the pre-trained model is transferred to a new task (our final dataset).

In this chapter, we will learn how to use GluonCV and...

Technical requirements

Apart from the technical requirements specified in the Preface, the following technical requirements apply:

  • Ensure that you have completed the first recipe, Installing MXNet, Gluon, GluonCV and GluonNLP, from Chapter 1, Up and Running with MXNet.
  • Ensure that you have completed Chapter 5, Analyzing Images with Computer Vision, and Chapter 6, Understanding Text with Natural Language Processing.

The code for this chapter can be found at the following GitHub URL: https://github.com/PacktPublishing/Deep-Learning-with-MXNet-Cookbook/tree/main/ch07.

Furthermore, you can access each recipe directly from Google Colab; for example, the first recipe of this chapter can be found here: https://colab.research.google.com/github/PacktPublishing/Deep-Learning-with-MXNet-Cookbook/blob/main/ch07/7_1_Understanding_Transfer_Learning_and_Fine_Tuning.ipynb.

Understanding transfer learning and fine-tuning

In the previous chapters, we saw how we could leverage MXNet, GluonCV, and GluonNLP to retrieve models pre-trained on certain datasets (such as ImageNet, MS COCO, and IWSLT2015) and use them for our specific tasks and datasets.

In this recipe, we will introduce a methodology called transfer learning, which will allow us to combine the information from pre-trained models (on general knowledge datasets) and the information from the new domain (the dataset from the task we want to solve). There are two significant advantages to this approach. On the one hand, pre-training datasets are typically large-scale (ImageNet-22k has 14 million images), and using a pre-trained model saves us that training time. On the other hand, we use our specific dataset not only for evaluation but also for training the model, improving its performance in the desired scenario. As we will discover, there is not always an easy way to achieve this, as it requires...

Improving performance for classifying images

Having introduced transfer learning and fine-tuning in the previous recipe, in this one we will apply them to image classification, a CV task.

In the second recipe, Classifying images with MXNet – GluonCV Model Zoo, AlexNet, and ResNet, in Chapter 5, Analyzing Images with Computer Vision, we saw how we could use GluonCV to retrieve pre-trained models and use them for an image classification task. First, we trained them from scratch, leveraging only the architecture of the pre-trained model: the weights were re-initialized, discarding any knowledge gained during pre-training. Afterward, we used the pre-trained models directly for the task, thereby also leveraging the weights/parameters of the model.

In this recipe, we will combine the weights/parameters of the model with the target dataset, applying the...

Improving performance for segmenting images

In this recipe, we will apply transfer learning and fine-tuning to semantic segmentation, a CV task.

In the fourth recipe, Segmenting objects in images with MXNet: PSPNet and DeepLab-v3, in Chapter 5, Analyzing Images with Computer Vision, we saw how we could use GluonCV to retrieve pre-trained models and use them directly for a semantic segmentation task, effectively leveraging past knowledge by using the architecture and the weights/parameters of the pre-trained model.

In this recipe, we will continue leveraging the weights/parameters of the model, obtained from a source task: segmenting images into a set of 21 classes with semantic segmentation models. The dataset used for pre-training was MS COCO (the source task), and we will run several experiments to evaluate our models on a new (target) task, using the Penn-Fudan Pedestrian dataset. In these experiments, we will also include knowledge from the target dataset to improve...

Improving performance for translating English to German

In the previous recipes, we have seen how we can leverage pre-trained models and new datasets for transfer learning and fine-tuning applied to CV tasks. In this recipe, we will follow a similar approach, but with an NLP task, translating from English to German.

In the fourth recipe, Translating text from Vietnamese to English, in Chapter 6, Understanding Text with Natural Language Processing, we saw how we could use GluonNLP to retrieve pre-trained model architectures and train them from scratch for a translation task, effectively leveraging only the architecture of the pre-trained model.

In this recipe, we will also leverage the weights/parameters of the model, obtained from a source task: translating text from English to German with machine translation models. The dataset used for pre-training will be WMT2014 (the source task), and we will run several experiments to evaluate...

