Chapter 3: Transfer Learning Using Pre-Trained Models

Deep learning models become more accurate the more data they have for training. The most spectacular Deep Learning models, such as those trained on the ImageNet dataset, learn from millions of images and often require a massive amount of computing power. To put things into perspective, the amount of energy used to train OpenAI's GPT-3 model could power an entire city. Unsurprisingly, the cost of training such Deep Learning models from scratch is prohibitive for most projects.

This raises the question: do we really need to train a Deep Learning model from scratch each time? Rather than training a model from scratch, one way around this problem is to borrow representations from a model that has already been trained on a similar subject. For example, if you wanted to train an image recognition model to detect faces, you could train your Convolutional Neural Network (CNN) to learn all the representations for each of the layers – or...
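The following is a minimal sketch of that idea (not the chapter's exact code): load a network pre-trained on ImageNet via torchvision, freeze its borrowed layers, and replace only the output layer for the new task. The choice of ResNet-18 and the two-class head here are illustrative assumptions.

import torch.nn as nn
import torchvision.models as models

# Load a CNN whose layers already contain representations learned on ImageNet
backbone = models.resnet18(pretrained=True)

# Freeze the borrowed layers so their representations are reused, not retrained
for param in backbone.parameters():
    param.requires_grad = False

# Replace the final fully connected layer with a new head for our own task,
# for example, two classes for a face/no-face detector
backbone.fc = nn.Linear(backbone.fc.in_features, 2)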

Technical requirements

The code for this chapter has been developed and tested on macOS with Anaconda, and in Google Colab with Python 3.6. If you are using another environment, please make the appropriate changes to your environment variables.

In this chapter, we will primarily be using the following Python modules, mentioned with their versions:

  • PyTorch Lightning (version: 1.5.2)
  • Seaborn (version: 0.11.2)
  • NumPy (version: 1.21.5)
  • Torch (version: 1.10.0)
  • pandas (version: 1.3.5)

Please import all these modules into your Jupyter environment. To make sure that these modules work together and do not go out of sync, we have used specific versions of torch, torchvision, torchtext, and torchaudio that are compatible with PyTorch Lightning 1.5.2. You can also use the latest versions of PyTorch Lightning and torch, as long as they are compatible with each other. More details can be found in the book's GitHub repository: https://github.com/PacktPublishing/Deep-Learning-with-PyTorch-Lightning

!pip install torch==1.10.0 pytorch-lightning==1.5.2
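If you also need torchvision, torchtext, and torchaudio, pin them to the releases that shipped alongside torch 1.10.0 (torchvision 0.11.x, torchtext 0.11.x, torchaudio 0.10.x); the exact pins used for the book's notebooks are listed in the GitHub repository above. Once the packages are installed, a typical import cell for this chapter looks like the following sketch (the aliases are a common convention, not necessarily the book's exact code):

import numpy as np
import pandas as pd
import seaborn as sns
import torch
import pytorch_lightning as pl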

Getting started with transfer learning

Transfer learning has many interesting applications, with one of the most fascinating being converting an image into the style of a famous painter, such as Van Gogh or Picasso.

Figure 3.1 – Image credit: A neural algorithm of artistic style (https://arxiv.org/pdf/1508.06576v2.pdf)

The preceding example is also known as Style Transfer. There are many specialized algorithms for accomplishing this task, and VGG-16, ResNet, and AlexNet are some of the more popular architectures.

In this chapter, we will start by creating a simple image classification model using the ResNet-50 architecture on the PCam dataset, which contains image scans of cancer tissue. Later, we will build a text classification model that uses Bidirectional Encoder Representations from Transformers (BERT).

In both examples in this chapter, we will make use of a pre-trained model and its weights and fine-tune the model to make it work...

An image classifier using a pre-trained ResNet-50 architecture

ResNet stands for Residual Network, a type of CNN architecture that was first introduced in the computer vision research paper Deep Residual Learning for Image Recognition, by Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun, in 2015; ResNet-50 is its 50-layer variant.

ResNet is currently one of the most popular architectures for image-related tasks. While it certainly works well on image classification problems (as we will see shortly), it works equally well as an encoder that learns image representations for more complex tasks such as Self-Supervised Learning. There are multiple variants of the ResNet architecture, including ResNet-18, ResNet-34, ResNet-50, and ResNet-152, distinguished by the number of layers they contain.

The ResNet-50 architecture has 50 layers and is pre-trained on the ImageNet dataset, which contains over 14 million images; its widely used classification subset covers 1,000 classes, including animals, cars, keyboards, mice, pens, and pencils. The...
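To make the fine-tuning workflow concrete, here is a minimal sketch of how a pre-trained ResNet-50 could be wrapped in a LightningModule for a two-class task such as PCam. The class name, learning rate, and the decision to freeze the entire backbone are illustrative assumptions, not the chapter's exact code.

import torch
import torch.nn as nn
import torchvision.models as models
import pytorch_lightning as pl

class CancerImageClassifier(pl.LightningModule):
    def __init__(self, num_classes=2, lr=1e-3):
        super().__init__()
        # Load ResNet-50 with ImageNet weights and freeze its layers
        self.backbone = models.resnet50(pretrained=True)
        for param in self.backbone.parameters():
            param.requires_grad = False
        # Replace the 1,000-class ImageNet head with a head for our task
        self.backbone.fc = nn.Linear(self.backbone.fc.in_features, num_classes)
        self.loss_fn = nn.CrossEntropyLoss()
        self.lr = lr

    def forward(self, x):
        return self.backbone(x)

    def training_step(self, batch, batch_idx):
        images, labels = batch
        logits = self(images)
        loss = self.loss_fn(logits, labels)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        # Only the new head has trainable parameters, so training stays cheap
        return torch.optim.Adam(self.backbone.fc.parameters(), lr=self.lr)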

Text classification using BERT transformers

Bidirectional Encoder Representations from Transformers (BERT) is a transformer-based machine learning technique for Natural Language Processing (NLP) developed by Google. BERT was created and published in 2018 by Jacob Devlin and his colleagues. Before BERT, sequence models such as Recurrent Neural Networks (RNNs) were commonly used for language tasks. BERT pre-trains deep bi-directional language representations on large amounts of unlabeled text and achieved state-of-the-art performance on a wide range of NLP tasks. The large BERT model consists of 24 encoder layers and 16 bi-directional attention heads, and it was trained on the BooksCorpus and English Wikipedia, roughly 3.3 billion words in total. It was later extended to over 100 languages. Using pre-trained BERT models, we can perform several tasks on text, such as classification, information extraction, question answering, summarization, translation, and text generation.
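As an illustration of the fine-tuning idea, the following is a minimal sketch of a BERT-based text classifier in PyTorch Lightning. It assumes the Hugging Face transformers library and the bert-base-uncased checkpoint; the class name, classification head, and hyperparameters are illustrative assumptions rather than the chapter's exact code.

import torch
import torch.nn as nn
import pytorch_lightning as pl
from transformers import BertModel, BertTokenizer

class TextClassifier(pl.LightningModule):
    def __init__(self, num_classes=2, lr=2e-5):
        super().__init__()
        # Pre-trained BERT encoder; only the small classification head is new
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        self.classifier = nn.Linear(self.bert.config.hidden_size, num_classes)
        self.loss_fn = nn.CrossEntropyLoss()
        self.lr = lr

    def forward(self, input_ids, attention_mask):
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        # Use the pooled [CLS] representation as the sentence embedding
        return self.classifier(outputs.pooler_output)

    def training_step(self, batch, batch_idx):
        logits = self(batch["input_ids"], batch["attention_mask"])
        loss = self.loss_fn(logits, batch["label"])
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=self.lr)

# Tokenize raw text into the tensors the model expects
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
encoded = tokenizer(["a sample sentence"], padding=True, truncation=True, return_tensors="pt")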

Figure 3.7 – BERT architecture diagram (Image credit...

Summary

Transfer learning is one of the most common ways to cut compute costs, save time, and still get strong results. In this chapter, we learned how to build models with pre-trained ResNet-50 and BERT architectures using PyTorch Lightning.

We built an image classifier and a text classifier, and along the way, we covered some useful PyTorch Lightning life cycle methods. We learned how to make use of pre-trained models on our own customized datasets with less effort and fewer training epochs. Even with very little model tuning, we were able to achieve decent accuracy.

While transfer learning methods work great, their limitations should also be borne in mind. They work incredibly well for language models because the text in a given dataset is usually made up of the same English words that appear in the corpus the model was pre-trained on. When that pre-training corpus is very different from your dataset, performance suffers. For example, if you want to build an image...
