You're reading from The Deep Learning with Keras Workshop, 1st Edition (Packt, July 2020, ISBN-13: 9781800562967).
Authors (3):

Matthew Moocarme

Matthew Moocarme is an accomplished data scientist with more than eight years of experience in creating and utilizing machine learning models. He comes from a background in the physical sciences, in which he holds a Ph.D. in physics from the Graduate Center of CUNY. Currently, he leads a team of data scientists and engineers in the media and advertising space to build and integrate machine learning models for a variety of applications. In his spare time, Matthew enjoys sharing his knowledge with the data science community through published works, conference presentations, and workshops.

Mahla Abdolahnejad

Mahla Abdolahnejad is a Ph.D. candidate in systems and computer engineering with Carleton University, Canada. She also holds a bachelor's degree and a master's degree in biomedical engineering, which first exposed her to the field of artificial intelligence and artificial neural networks, in particular. Her Ph.D. research is focused on deep unsupervised learning for computer vision applications. She is particularly interested in exploring the differences between a human's way of learning from the visual world and a machine's way of learning from the visual world, and how to push machine learning algorithms toward learning and thinking like humans.

Ritesh Bhagwat

Ritesh Bhagwat has a master's degree in applied mathematics with a specialization in computer science. He has over 14 years of experience in data-driven technologies and has led and been a part of complex projects ranging from data warehousing and business intelligence to machine learning and artificial intelligence. He has worked with top-tier global consulting firms as well as large multinational financial institutions. Currently, he works as a data scientist. Besides work, he enjoys playing and watching cricket and loves to travel. He is also deeply interested in Bayesian statistics.

8. Transfer Learning and Pre-Trained Models

Overview

This chapter introduces pre-trained models and shows how to use them for applications different from those they were originally trained for, a practice known as transfer learning. By the end of this chapter, you will be able to apply feature extraction to pre-trained models, use pre-trained models for image classification, and fine-tune pre-trained models to classify images of flowers and cars into their respective classes. We will see that this accomplishes the same task we completed in the previous chapter, but with greater accuracy and shorter training times.

Introduction

In the previous chapter, we learned how to create a Convolutional Neural Network (CNN) from scratch with Keras. We experimented with different architectures by adding more convolutional and Dense layers and by changing the activation functions. We then compared the models by classifying images of cars and flowers into their respective classes and measuring each model's accuracy.

In real-world projects, however, you almost never code a convolutional neural network from scratch. Instead, you tweak and retrain existing networks to suit the requirements at hand. This chapter introduces the important concepts of transfer learning and pre-trained networks (also known as pre-trained models), both of which are widely used in industry.

Rather than building a CNN from scratch, we will feed our images to pre-trained models to classify them. We will also tweak these models to make them more flexible. The models we will use in this chapter are called VGG16...
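As a quick illustration of what using a pre-trained model looks like in practice, the following is a minimal sketch that loads VGG16 with ImageNet weights through Keras and classifies a single image. The file path 'sample.jpg' is a placeholder for an image of your own; everything else uses the standard Keras applications API.

import numpy as np
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input, decode_predictions
from tensorflow.keras.preprocessing import image

# Load VGG16 with weights pre-trained on ImageNet (downloads on first use)
model = VGG16(weights='imagenet')

# Load an image and resize it to VGG16's expected 224x224 input
img = image.load_img('sample.jpg', target_size=(224, 224))
x = image.img_to_array(img)      # shape: (224, 224, 3)
x = np.expand_dims(x, axis=0)    # add a batch dimension: (1, 224, 224, 3)
x = preprocess_input(x)          # apply VGG16-specific preprocessing

# Predict and decode the top-3 ImageNet class labels
preds = model.predict(x)
print(decode_predictions(preds, top=3)[0])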

Pre-Trained Networks and Transfer Learning

Humans learn by experience. We apply the knowledge we gain in one situation to similar situations we face in the future. Suppose you want to learn how to drive an SUV. You have never driven an SUV; all you know is how to drive a small hatchback car.

The dimensions of an SUV are considerably larger than those of a hatchback, so navigating the SUV through traffic will surely be a challenge. Still, some basic systems (such as the clutch, accelerator, and brakes) remain similar to those of the hatchback. So, knowing how to drive a hatchback will surely be of great help when you are learning to drive the SUV. All the knowledge you acquired while driving a hatchback can be reused when you learn to drive a big SUV.

This is precisely what transfer learning is. By definition, transfer learning is a concept in machine learning in which knowledge gained while learning one task is stored and applied to a different but related task. The hatchback-SUV model...
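In Keras, the most common form of transfer learning is feature extraction: the convolutional base of a pre-trained network is kept frozen and a small new classifier is trained on top of it. The following is a minimal sketch of this idea; the layer sizes and the binary (cars vs. flowers) setup are illustrative assumptions, not taken verbatim from the book's exercises.

from tensorflow.keras.applications import VGG16
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Dense

# Load the convolutional base without the original ImageNet classifier
base = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze the pre-trained weights

# Stack a small new classifier on top for our two classes
model = Sequential([
    base,
    Flatten(),
    Dense(256, activation='relu'),
    Dense(1, activation='sigmoid')
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

Only the two Dense layers are trained; the millions of pre-trained convolutional weights are reused as-is, which is why training is so much faster than starting from scratch.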

Fine-Tuning a Pre-Trained Network

Fine-tuning means tweaking our neural network so that it becomes more relevant to the task at hand. We can freeze some of the initial layers of the network so that we don't lose the information stored in those layers; that information is generic and useful. We keep those layers frozen while our new classifier learns, and then unfreeze some of them so that they can be adjusted slightly to fit the problem at hand even better. Suppose we have a pre-trained network that identifies animals. If we want to identify specific animals, such as dogs and cats, we can tweak the later layers a little so that they learn what dogs and cats look like. This amounts to reusing the whole pre-trained network and adding a new classifier on top of it that is trained on images of dogs and cats. We will be doing a similar activity by taking a pre-built network and adding a classifier on top of it, which will be trained on pictures of dogs and cats, as the sketch below shows.
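The following sketch shows the two phases of this process under illustrative assumptions: binary classification, a small Dense classifier, and a training generator named train_generator that you would build yourself from your image folders. First the new classifier is trained with the base frozen; then the last convolutional block of VGG16 is unfrozen and the whole model is retrained with a small learning rate. The 'block5' prefix follows VGG16's standard layer naming.

from tensorflow.keras.applications import VGG16
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Dense
from tensorflow.keras.optimizers import Adam

# Build a frozen-base model as in the previous sketch
base = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
model = Sequential([base, Flatten(),
                    Dense(256, activation='relu'),
                    Dense(1, activation='sigmoid')])

# Phase 1: train only the new classifier while the base stays frozen
base.trainable = False
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# model.fit(train_generator, epochs=5)  # hypothetical training data

# Phase 2: unfreeze only the last convolutional block and fine-tune it
base.trainable = True
for layer in base.layers:
    layer.trainable = layer.name.startswith('block5')

# Recompile with a low learning rate so the unfrozen weights change slowly
model.compile(optimizer=Adam(learning_rate=1e-5),
              loss='binary_crossentropy', metrics=['accuracy'])
# model.fit(train_generator, epochs=5)  # continue training at the low rate

Recompiling after changing the trainable flags is essential; Keras only picks up the new freeze/unfreeze configuration at compile time. The low learning rate prevents the fine-tuned block from drifting too far from its useful pre-trained weights.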

There is...

Summary

In this chapter, we covered the concept of transfer learning and how it is related to pre-trained networks. We applied this knowledge by using the pre-trained deep learning networks VGG16 and ResNet50 to predict various images. We practiced taking advantage of such pre-trained networks with techniques such as feature extraction and fine-tuning, which let us train models faster and more accurately. Finally, we learned the powerful technique of tweaking existing models to make them work with our own datasets. Building our own ANN on top of an existing CNN in this way is one of the most powerful techniques used in the industry.

In the next chapter, we will learn about sequential modeling and sequential memory by looking at some real-life cases involving Google Assistant. Furthermore, we will learn how sequential modeling is related to Recurrent Neural Networks (RNNs). We will learn about the vanishing gradient problem in detail and how using an LSTM is better than a simple RNN...
