Chapter 3: Transfer Learning Using Pre-Trained Models

Deep learning models become more accurate the more data they have for training. The most spectacular Deep Learning models, such as those trained on the ImageNet dataset, learn from millions of images and often require a massive amount of computing power. To put things into perspective, the amount of energy used to train OpenAI's GPT-3 model could power an entire city. Unsurprisingly, the cost of training such Deep Learning models from scratch is prohibitive for most projects.

This raises the question: do we really need to train a Deep Learning model from scratch each time? Rather than training a model from scratch, one way around this problem is to borrow representations from a model that has already been trained on a similar subject. For example, if you wanted to train an image recognition model to detect faces, you could train your Convolutional Neural Network (CNN) to learn all the representations for each of the layers – or...
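The following is a minimal sketch of that idea (not the chapter's exact code): load a network pre-trained on ImageNet via torchvision, freeze its borrowed layers, and replace only the output layer for the new task. The choice of ResNet-18 and the two-class head here are illustrative assumptions.

import torch.nn as nn
import torchvision.models as models

# Load a CNN whose layers already contain representations learned on ImageNet
backbone = models.resnet18(pretrained=True)

# Freeze the borrowed layers so their representations are reused, not retrained
for param in backbone.parameters():
    param.requires_grad = False

# Replace the final fully connected layer with a new head for our own task,
# for example, two classes for a face/no-face detector
backbone.fc = nn.Linear(backbone.fc.in_features, 2)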

Technical requirements

The code for this chapter has been developed and tested on macOS with Anaconda, and in Google Colab with Python 3.6. If you are using another environment, please make the appropriate changes to your environment variables.

In this chapter, we will primarily be using the following Python modules, mentioned with their versions:

  • PyTorch Lightning (version: 1.5.2)
  • Seaborn (version: 0.11.2)
  • NumPy (version: 1.21.5)
  • Torch (version: 1.10.0)
  • pandas (version: 1.3.5)

Please import all these modules into your Jupyter environment. To make sure that these modules work together and do not go out of sync, we have used specific versions of torch, torchvision, torchtext, and torchaudio that are compatible with PyTorch Lightning 1.5.2. You can also use the latest versions of PyTorch Lightning and torch, as long as they are compatible with each other. More details can be found in the book's GitHub repository: https://github.com/PacktPublishing/Deep-Learning-with-PyTorch-Lightning

!pip install torch==1.10.0 pytorch-lightning==1.5.2
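If you also need torchvision, torchtext, and torchaudio, pin them to the releases that shipped alongside torch 1.10.0 (torchvision 0.11.x, torchtext 0.11.x, torchaudio 0.10.x); the exact pins used for the book's notebooks are listed in the GitHub repository above. Once the packages are installed, a typical import cell for this chapter looks like the following sketch (the aliases are a common convention, not necessarily the book's exact code):

import numpy as np
import pandas as pd
import seaborn as sns
import torch
import pytorch_lightning as pl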

Getting started with transfer learning

Transfer learning has many interesting applications, with one of the most fascinating being converting an image into the style of a famous painter, such as Van Gogh or Picasso.

Figure 3.1 – Image credit: A neural algorithm of artistic style (https://arxiv.org/pdf/1508.06576v2.pdf)

The preceding example is also known as Style Transfer. There are many specialized algorithms for accomplishing this task, and VGG-16, ResNet, and AlexNet are some of the more popular architectures.

In this chapter, we will start by creating a simple image classification model using the ResNet-50 architecture on the PCam dataset, which contains image scans of cancer tissue. Later, we will build a text classification model that uses Bidirectional Encoder Representations from Transformers (BERT).

In both examples in this chapter, we will make use of a pre-trained model and its weights and fine-tune the model to make it work...

An image classifier using a pre-trained ResNet-50 architecture

ResNet stands for Residual Network, a type of CNN architecture that was first introduced in the computer vision research paper Deep Residual Learning for Image Recognition, by Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun, in 2015; ResNet-50 is its 50-layer variant.

ResNet is currently one of the most popular architectures for image-related tasks. While it certainly works well on image classification problems (as we will see shortly), it works equally well as an encoder that learns image representations for more complex tasks such as Self-Supervised Learning. There are multiple variants of the ResNet architecture, including ResNet-18, ResNet-34, ResNet-50, and ResNet-152, distinguished by the number of layers they contain.

The ResNet-50 architecture has 50 layers and is pre-trained on the ImageNet dataset, which contains over 14 million images; its widely used classification subset covers 1,000 classes, including animals, cars, keyboards, mice, pens, and pencils. The...
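To make the fine-tuning workflow concrete, here is a minimal sketch of how a pre-trained ResNet-50 could be wrapped in a LightningModule for a two-class task such as PCam. The class name, learning rate, and the decision to freeze the entire backbone are illustrative assumptions, not the chapter's exact code.

import torch
import torch.nn as nn
import torchvision.models as models
import pytorch_lightning as pl

class CancerImageClassifier(pl.LightningModule):
    def __init__(self, num_classes=2, lr=1e-3):
        super().__init__()
        # Load ResNet-50 with ImageNet weights and freeze its layers
        self.backbone = models.resnet50(pretrained=True)
        for param in self.backbone.parameters():
            param.requires_grad = False
        # Replace the 1,000-class ImageNet head with a head for our task
        self.backbone.fc = nn.Linear(self.backbone.fc.in_features, num_classes)
        self.loss_fn = nn.CrossEntropyLoss()
        self.lr = lr

    def forward(self, x):
        return self.backbone(x)

    def training_step(self, batch, batch_idx):
        images, labels = batch
        logits = self(images)
        loss = self.loss_fn(logits, labels)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        # Only the new head has trainable parameters, so training stays cheap
        return torch.optim.Adam(self.backbone.fc.parameters(), lr=self.lr)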

Text classification using BERT transformers

Bidirectional Encoder Representations from Transformers (BERT) is a transformer-based machine learning technique for Natural Language Processing (NLP) developed by Google. BERT was created and published in 2018 by Jacob Devlin and his colleagues. Before BERT, sequence models such as Recurrent Neural Networks (RNNs) were commonly used for language tasks. BERT pre-trains deep bi-directional language representations on large amounts of unlabeled text and achieved state-of-the-art performance on a wide range of NLP tasks. The large BERT model consists of 24 encoder layers and 16 bi-directional attention heads, and it was trained on the BooksCorpus and English Wikipedia, roughly 3.3 billion words in total. It was later extended to over 100 languages. Using pre-trained BERT models, we can perform several tasks on text, such as classification, information extraction, question answering, summarization, translation, and text generation.
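As an illustration of the fine-tuning idea, the following is a minimal sketch of a BERT-based text classifier in PyTorch Lightning. It assumes the Hugging Face transformers library and the bert-base-uncased checkpoint; the class name, classification head, and hyperparameters are illustrative assumptions rather than the chapter's exact code.

import torch
import torch.nn as nn
import pytorch_lightning as pl
from transformers import BertModel, BertTokenizer

class TextClassifier(pl.LightningModule):
    def __init__(self, num_classes=2, lr=2e-5):
        super().__init__()
        # Pre-trained BERT encoder; only the small classification head is new
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        self.classifier = nn.Linear(self.bert.config.hidden_size, num_classes)
        self.loss_fn = nn.CrossEntropyLoss()
        self.lr = lr

    def forward(self, input_ids, attention_mask):
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        # Use the pooled [CLS] representation as the sentence embedding
        return self.classifier(outputs.pooler_output)

    def training_step(self, batch, batch_idx):
        logits = self(batch["input_ids"], batch["attention_mask"])
        loss = self.loss_fn(logits, batch["label"])
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=self.lr)

# Tokenize raw text into the tensors the model expects
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
encoded = tokenizer(["a sample sentence"], padding=True, truncation=True, return_tensors="pt")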

Figure 3.7 – BERT architecture diagram (Image credit...

Summary

Transfer learning is one of the most common ways to cut compute costs, save time, and still get strong results. In this chapter, we learned how to build models with pre-trained ResNet-50 and BERT architectures using PyTorch Lightning.

We built an image classifier and a text classifier, and along the way, we covered some useful PyTorch Lightning life cycle methods. We learned how to make use of pre-trained models on our own customized datasets with less effort and fewer training epochs. Even with very little model tuning, we were able to achieve decent accuracy.

While transfer learning methods work great, their limitations should also be borne in mind. They work incredibly well for language models because the text in a given dataset is usually made up of the same English words that appear in the corpus the model was pre-trained on. When that pre-training corpus is very different from your dataset, performance suffers. For example, if you want to build an image...
