In the previous chapter, we learned how to build a convolutional neural network (CNN) from scratch with Keras. In real-world projects, however, you rarely code a CNN from scratch; instead, you tweak and retrain existing networks to suit your requirements. This chapter introduces the important concepts of transfer learning and pre-trained networks, also known as pre-trained models, both of which are widely used in industry. This is advanced material, so the chapter assumes that you have adequate knowledge of neural networks and CNNs. Rather than building a CNN from scratch, we will feed images to pre-trained models and try to classify them, and we will also tweak those models to make them more flexible. The models we will use here are VGG16 and ResNet50, which we will discuss further in the chapter. Before starting to work with pre-trained models, we need to understand transfer learning.
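To give a taste of what is coming, here is a minimal sketch of classifying an image with a pre-trained VGG16 network in Keras. The file name `elephant.jpg` is only a placeholder; substitute any image you have on disk.

```python
# Classify a single image with a pre-trained VGG16 network.
import numpy as np
from tensorflow.keras.applications.vgg16 import (
    VGG16, preprocess_input, decode_predictions,
)
from tensorflow.keras.preprocessing import image

# Downloads the ImageNet weights on first use.
model = VGG16(weights="imagenet")

# VGG16 expects 224x224 RGB input; "elephant.jpg" is a placeholder path.
img = image.load_img("elephant.jpg", target_size=(224, 224))
x = np.expand_dims(image.img_to_array(img), axis=0)  # add a batch dimension
x = preprocess_input(x)  # VGG16-specific channel preprocessing

preds = model.predict(x)
# Each prediction decodes to (class_id, class_name, probability) triples.
print(decode_predictions(preds, top=3)[0])
```

The key point is that no training happens here at all: the network's ImageNet knowledge is reused as-is.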
Humans are trained to learn by experience. We tend to use the knowledge we gain in one situation in similar situations we face in the future. Suppose you want to learn how to drive an SUV. You have never driven an SUV; all you know is how to drive a small hatchback car.
The dimensions of the SUV are considerably larger than those of the hatchback, so navigating the SUV in traffic will surely be a challenge. Still, some basic systems, such as the clutch, accelerator, and brakes, remain similar to those of the hatchback. So, knowing how to drive a hatchback will surely be of great help to you when you start learning to drive the SUV. All the knowledge that you acquired while driving a hatchback can be used when you are learning to drive a big SUV.
This is precisely what transfer learning is. By definition, transfer learning is a concept in machine learning in which we store and use knowledge gained in one activity while learning another similar activity. The hatchback...
Fine-tuning means tweaking our neural network in such a way that it becomes more relevant to the task at hand. We can freeze some of the initial layers of the network so that we don't lose the information stored in those layers; the information stored there is generic and useful. If we keep those layers frozen while our new classifier is learning, and then unfreeze them, we can tweak them a little so that they fit the problem at hand even better. Suppose we have a pre-trained network that identifies animals. If we want to identify specific animals, such as dogs and cats, we can tweak its layers a little so that they learn what dogs and cats look like. This is like reusing the whole pre-trained network and then adding a new classifier that is trained on images of dogs and cats. We will do exactly that: take a pre-built network and add a classifier on top of it, which will be trained on pictures of dogs and cats.
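The freeze-then-fine-tune idea described above can be sketched as follows. This is a hedged, minimal example: VGG16 serves as the frozen convolutional base, and the input size (150×150) and dense-layer width (256) are illustrative choices, not prescriptions from the text.

```python
# Freeze a pre-trained convolutional base and add a new classifier on top,
# for a two-class (dogs vs. cats) problem.
from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, models

# Load the convolutional base without its ImageNet classifier head.
conv_base = VGG16(weights="imagenet", include_top=False,
                  input_shape=(150, 150, 3))
conv_base.trainable = False  # freeze: keep the generic features intact

model = models.Sequential([
    conv_base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # dog vs. cat probability
])
model.compile(optimizer="rmsprop", loss="binary_crossentropy",
              metrics=["accuracy"])

# After training the new classifier head, fine-tuning would unfreeze only
# the last convolutional block and retrain with a very low learning rate,
# e.g.:
#   conv_base.trainable = True
#   for layer in conv_base.layers:
#       layer.trainable = layer.name.startswith("block5")
#   (then recompile with a small learning rate and continue training)
```

Freezing first matters: if the base were trainable while the randomly initialized classifier produced large gradients, the useful pre-trained features would be destroyed before the classifier had learned anything.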
There is a three...
In this chapter, we learned the concept of transfer learning and how it relates to pre-trained networks, and we learned how to use a pre-trained deep learning network. We also learned how techniques such as feature extraction and fine-tuning let us make better use of pre-trained networks for image classification tasks. We used both the VGG16 and ResNet50 networks. First, we learned how to use an existing model to classify images, and then we learned the powerful technique of tweaking existing models to make them work with our own dataset. This technique of building our own ANN on top of an existing CNN is one of the most powerful techniques used in industry.
In the next chapter, we will learn about sequential modeling and sequential memory by looking at some real-life cases involving Google Assistant. Further to this, we will learn how sequential modeling is related to Recurrent Neural Networks (RNNs). We will learn about the vanishing gradient problem in detail, and how using an LSTM is better than a simple...