Chapter 8: Self-Supervised Learning

Since the dawn of Machine Learning, the field has been neatly divided into two camps: supervised learning and unsupervised learning. Supervised learning requires a labeled dataset; when no labels are available, the only option left is unsupervised learning. While unsupervised learning may sound great because it works without labels, in practice the applications of unsupervised methods such as clustering are quite limited, and there is no easy way to evaluate the accuracy of unsupervised methods or to deploy them.

The most practical Machine Learning applications tend to be supervised learning applications (for example, recognizing objects in images, predicting future stock prices or sales, or recommending the right movie to you on Netflix). The trade-off for supervised learning is that it requires well-curated, high-quality, trustworthy labels. Most datasets are not born with labels, and getting such labels can be...

Technical requirements

In this chapter, we will be primarily using the following Python modules:

  • NumPy (version 1.21.5)
  • torch (version 1.10)
  • torchvision (version 0.11.1)
  • PyTorch Lightning (version 1.5.2)

Please check that you have the correct versions of these packages before running the code.

To make sure that these modules work together and do not go out of sync, we have used specific versions of torch, torchvision, torchtext, and torchaudio with PyTorch Lightning 1.5.2. You can also use the latest versions of PyTorch Lightning and torch that are compatible with each other. More details can be found at the GitHub link:

https://github.com/PacktPublishing/Deep-Learning-with-PyTorch-Lightning.

!pip install torch==1.10.0 torchvision==0.11.1 torchtext==0.11.0 torchaudio==0.10.0 --quiet
!pip install pytorch-lightning==1.5.2 --quiet
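
After installing, you can verify the installed versions with a quick sanity check (a minimal sketch of our own, not part of the book's code):

import torch
import torchvision
import pytorch_lightning as pl

# Print the installed versions to confirm they match the pinned ones above
print(torch.__version__)        # expected: 1.10.0
print(torchvision.__version__)  # expected: 0.11.1
print(pl.__version__)           # expected: 1.5.2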

Working examples for this chapter can be found at this GitHub link: https://github.com/PacktPublishing/Deep-Learning-with-PyTorch-Lightning...

Getting started with Self-Supervised Learning

The future of Machine Learning has been hotly contested given the spectacular success of Deep Learning methods such as CNNs and RNNs in recent years. CNNs can do amazing things, such as image recognition; RNNs can generate text; and advanced NLP methods, such as the Transformer, achieve marvelous results. Yet all of them have serious limitations compared to human intelligence. They do not compare well to humans on tasks such as reasoning, deduction, and comprehension. Most notably, they require an enormous amount of well-labeled training data to learn even something as simple as image recognition.

Figure 8.2 – A child learns to classify objects with very few labels

Unsurprisingly, that is not the way humans learn. A child does not need millions of labeled images as input before it can recognize objects. The incredible ability of the human brain to generate its own new labels...

What is Contrastive Learning?

The idea behind understanding an image is that once we have seen an image of a particular kind (say, a dog), we can recognize all other dogs by reasoning that they share the same representation or structure. For example, show a child who cannot yet talk or understand language (say, under 2 years old) a picture of a dog (or a real dog, for that matter), then give them a pack of cards with a collection of animals, including dogs, cats, elephants, and birds, and ask which picture is similar to the first one. Most likely, the child could easily pick the card with a dog on it, and would be able to do so even without you explaining that this picture equals "dog" (in other words, without supplying any new labels).

You could say that the child learned to recognize all dogs from a single instance and with a single label! Wouldn't it be awesome if a machine could do that as well? That is exactly what contrastive...
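
To make the idea of pulling similar images together and pushing dissimilar ones apart concrete, here is a minimal sketch (our own illustration, not the book's code) of the NT-Xent loss used by contrastive methods such as SimCLR, which we cover next. It assumes z1 and z2 are the embeddings of two augmented views of the same batch of images:

import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    # z1, z2: embeddings of two augmented views of a batch, shape (N, D)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)  # 2N unit vectors
    sim = z @ z.T / temperature                         # pairwise cosine similarities
    sim.fill_diagonal_(-float('inf'))                   # a view is not its own positive
    n = z1.size(0)
    # the positive for each row is the other augmented view of the same image
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

For each embedding, the loss treats the other view of the same image as the positive and all remaining 2N - 2 embeddings in the batch as negatives.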

SimCLR architecture

SimCLR stands for Simple framework for Contrastive Learning of visual Representations. The architecture was introduced in the paper "A Simple Framework for Contrastive Learning of Visual Representations" by Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton at Google Research. Geoffrey Hinton (just like Yann LeCun) is a co-recipient of the Turing Award for his work on Deep Learning. There are two versions: SimCLR and SimCLRv2, which is a larger and deeper network than the original. At the time of writing, SimCLRv2 was the best architecture update available, but don't be surprised if a SimCLR3 soon appears that is even bigger and better than its predecessors.

On the ImageNet dataset, the architecture has shown that we can achieve 93% accuracy with just 1% of the labels. This is a truly remarkable result considering that it took over 2 years and a great deal of effort from over 140,000 labelers (mostly graduate students) on Mechanical Turk to label ImageNet by hand. It was a massive undertaking carried out...

SimCLR model for image recognition

We have seen that SimCLR can do the following:

  • Learn feature representations (on a unit hypersphere) by grouping similar images together and pushing dissimilar images apart.
  • Balance alignment (keeping similar images together) and uniformity (preserving the maximum information); both properties are sketched in code after this list.
  • Learn from unlabeled training data.
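
The alignment and uniformity properties can be measured directly. The following is a minimal sketch using the commonly cited formulation of these two metrics (our own illustration, not the book's code), assuming x and y are L2-normalized embeddings of positive pairs:

import torch

def alignment(x, y, alpha=2):
    # average distance between embeddings of positive pairs (similar images)
    return (x - y).norm(p=2, dim=1).pow(alpha).mean()

def uniformity(x, t=2):
    # log of the mean Gaussian potential between all pairs of embeddings;
    # lower values mean the embeddings spread more evenly over the hypersphere
    return torch.pdist(x, p=2).pow(2).mul(-t).exp().mean().log()

Lower values are better for both: a good representation keeps positive pairs close (alignment) while spreading all embeddings evenly over the unit hypersphere (uniformity).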

The primary challenge is to use the unlabeled data (which comes from a distribution similar to, but different from, that of the labeled data) to build a useful prior, which is then used to generate labels for the unlabeled set. Let's look at the architecture we will implement in this section.

Figure 8.7 – SimCLR architecture implementation

We will use ResNet-50 as the encoder, followed by a three-layer MLP as the projection head. We will then use logistic regression, or an MLP, as the supervised classifier to measure accuracy.
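
Putting these pieces together, the skeleton below shows one way the encoder and projection head could be wired up in a LightningModule (a simplified sketch under our own naming, reusing the nt_xent_loss function sketched earlier; the book's full implementation is available in its GitHub repository):

import torch
from torch import nn
import torchvision
import pytorch_lightning as pl

class SimCLRModel(pl.LightningModule):
    def __init__(self, hidden_dim=2048, proj_dim=128, lr=1e-3, temperature=0.5):
        super().__init__()
        self.lr = lr
        self.temperature = temperature
        # ResNet-50 backbone as the encoder, with its classification head removed
        backbone = torchvision.models.resnet50(pretrained=False)
        backbone.fc = nn.Identity()
        self.encoder = backbone
        # three-layer MLP projection head, as in Figure 8.7
        self.projection = nn.Sequential(
            nn.Linear(2048, hidden_dim), nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, proj_dim),
        )

    def forward(self, x):
        # the encoder output (not the projection) feeds the downstream classifier
        return self.encoder(x)

    def training_step(self, batch, batch_idx):
        # the dataloader is assumed to yield two augmented views per image
        (view1, view2), _ = batch
        z1 = self.projection(self.encoder(view1))
        z2 = self.projection(self.encoder(view2))
        loss = nt_xent_loss(z1, z2, self.temperature)  # defined earlier
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=self.lr)

After pretraining, the projection head is discarded and a logistic regression or MLP classifier is trained on the frozen encoder features to measure accuracy.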

The SimCLR architecture involves the following steps, which we implement...

Summary

Most of the image datasets that exist in nature or industry are unlabeled. Think of X-ray images generated by diagnostic labs, MRI or dental scans, and many more. Pictures posted in Amazon reviews and images from Google Street View or e-commerce websites such as eBay are also mostly unlabeled; likewise, a large proportion of Facebook, Instagram, or WhatsApp images are never tagged and therefore remain unlabeled as well. A lot of these image datasets go unused, with untapped potential, because current modeling techniques require large amounts of manually labeled data. Self-Supervised Learning removes the need for large labeled datasets and expands the realm of what is possible.

We have seen in this chapter how PyTorch Lightning can be used to quickly create Self-Supervised Learning models, such as contrastive learning models. In fact, PyTorch Lightning is the first framework to provide out-of-the-box support for many Self-Supervised Learning models. We implemented...
