You're reading from Practical Convolutional Neural Networks

Product type: Book | Published: Feb 2018 | Reading level: Intermediate | Publisher: Packt | ISBN-13: 9781788392303 | Edition: 1st
Authors (3):

Mohit Sewak

Mohit is a Python programmer with a keen interest in the field of information security. He completed his Bachelor's degree in technology in computer science from Kurukshetra University, Kurukshetra, and a Master's in engineering (2012) in computer science from Thapar University, Patiala. He is a CEH and ECSA from EC-Council USA. He has worked at IBM, Teramatrix (a startup), and Sapient, and is currently doing a Ph.D. at Thapar Institute of Engineering & Technology under Dr. Maninder Singh. He has published several articles in national and international magazines, and is the author of Python Penetration Testing Essentials, Python: Penetration Testing for Developers, and Learn Python in 7 Days, also by Packt. For more details on the author, see the username mohitraj.cs

Md. Rezaul Karim

Md. Rezaul Karim is a researcher, author, and data science enthusiast with a strong computer science background, coupled with 10 years of research and development experience in machine learning, deep learning, and data mining algorithms to solve emerging bioinformatics research problems by making them explainable. He is passionate about applied machine learning, knowledge graphs, and explainable artificial intelligence (XAI). Currently, he is working as a research scientist at Fraunhofer FIT, Germany. He is also a PhD candidate at RWTH Aachen University, Germany. Before joining FIT, he worked as a researcher at the Insight Centre for Data Analytics, Ireland. Previously, he worked as a lead software engineer at Samsung Electronics, Korea.

Pradeep Pujari

https://www.linkedin.com/in/ppujari/


Introduction to ImageNet

ImageNet is a database of over 15 million hand-labeled, high-resolution images in roughly 22,000 categories. The database is organized according to the WordNet hierarchy, in which each concept is called a synset (that is, synonym set). Each synset is a node in the ImageNet hierarchy, and each node contains, on average, more than 500 images.

The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) was founded in 2010 to advance the state of the art in large-scale object detection and image classification.

Following this overview of ImageNet, we will now take a look at various CNN model architectures.

LeNet

The first architecture we will look at, LeNet-5, was built by Yann LeCun in 1998, long before ILSVRC. This network takes a 32 x 32 image as input, which goes to the convolution layer (C1) and then to the subsampling layer (S2). Today, the subsampling layer is replaced by a pooling layer. Then, there is another convolution layer (C3) followed by another pooling (that is, subsampling) layer (S4). Finally, there are three fully connected layers, including the OUTPUT layer at the end. This network was used for zip code recognition in post offices. Since ILSVRC began in 2010, various CNN architectures have been introduced every year through the competition:

LeNet-5 – CNN architecture from Yann LeCun's 1998 paper

Therefore, we can conclude the following points:

  • The input to this network is a grayscale 32 x 32 image
  • The architecture implemented is a CONV layer, followed by POOL and a fully connected layer
  • CONV filters are 5 x 5, applied at a stride of 1
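As a sanity check, the layer sizes above can be reproduced with the standard convolution output-size formula. This is a plain-Python sketch (not code from the book); the filter counts follow the 1998 LeNet-5 paper:

```python
def out_size(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution or subsampling layer."""
    return (size + 2 * pad - kernel) // stride + 1

# LeNet-5 shape trace for a grayscale 32 x 32 input
c1 = out_size(32, 5)      # C1: six 5 x 5 filters, stride 1 -> 28 x 28
s2 = out_size(c1, 2, 2)   # S2: 2 x 2 subsampling, stride 2 -> 14 x 14
c3 = out_size(s2, 5)      # C3: sixteen 5 x 5 filters -> 10 x 10
s4 = out_size(c3, 2, 2)   # S4: 2 x 2 subsampling -> 5 x 5
print(c1, s2, c3, s4)     # 28 14 10 5
```

After S4, a further 5 x 5 convolution (C5) reduces the map to 1 x 1, which feeds the fully connected layers.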

AlexNet architecture

The first breakthrough in CNN architecture came in 2012. This award-winning architecture is called AlexNet. It was developed at the University of Toronto by Alex Krizhevsky and his professor, Geoffrey Hinton.

In this network, a ReLU activation function was used, along with a dropout of 0.5 to fight overfitting. As we can see in the following image, a (local response) normalization layer is used in the architecture, but this is not used in practice anymore, as it turned out to provide little benefit. AlexNet is still used today, even though more accurate networks are available, because of its relatively simple structure and small depth. It is widely used in computer vision:

AlexNet was trained on the ImageNet database using two separate GPUs, because the network was too large to fit in the memory of a single GPU at the time, as shown in the...
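The ReLU activation and dropout mentioned above can be sketched in a few lines of plain Python. Note this uses inverted dropout, the formulation common in modern frameworks; it is an illustration rather than AlexNet's exact implementation:

```python
import random

def relu(xs):
    # ReLU: f(x) = max(0, x), the non-saturating activation AlexNet popularized
    return [max(0.0, x) for x in xs]

def dropout(xs, p=0.5, training=True):
    # Inverted dropout: drop each unit with probability p during training
    # and scale survivors by 1 / (1 - p); at test time, pass through unchanged.
    if not training:
        return list(xs)
    return [0.0 if random.random() < p else x / (1.0 - p) for x in xs]

print(relu([-1.5, 0.2, 3.0]))               # [0.0, 0.2, 3.0]
print(dropout([1.0, 2.0], training=False))  # [1.0, 2.0]
```

The scaling by 1 / (1 - p) keeps the expected activation the same at training and test time, so no correction is needed at inference.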

VGGNet architecture

The runner-up in the 2014 ImageNet challenge was VGGNet, from the Visual Geometry Group at Oxford University. This convolutional neural network is a simple and elegant architecture with a 7.3% top-5 error rate. It comes in two versions: VGG16 and VGG19.

VGG16 has 16 weight layers (convolution and fully connected), not counting the max pooling layers and the softmax layer; hence the name VGG16. VGG19 consists of 19 such layers. A pre-trained model is available in Keras for both Theano and TensorFlow backends.

The key design consideration here is depth. Increasing the depth of the network was made feasible by adding more convolution layers, which in turn was practical because of the small 3 x 3 convolution filters used in all the layers. The default input size of an image for this model is 224 x 224 x 3. The image is passed through a stack of convolution layers with a stride of 1 pixel and padding of 1. It uses 3 x 3...
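A quick calculation shows why the 3 x 3, stride-1, padding-1 combination is convenient: it preserves spatial size, and stacking two 3 x 3 layers covers the same 5 x 5 receptive field as one larger filter with fewer weights. This is a plain-Python sketch; the channel count of 64 is illustrative:

```python
def out_size(size, kernel=3, stride=1, pad=1):
    # Standard convolution output-size formula
    return (size + 2 * pad - kernel) // stride + 1

print(out_size(224))  # 224: 3 x 3 conv with stride 1, padding 1 keeps the size

# Weight counts for C input channels and C output channels (biases ignored)
C = 64
two_3x3 = 2 * (3 * 3 * C * C)  # two stacked 3 x 3 convs (5 x 5 receptive field)
one_5x5 = 5 * 5 * C * C        # one 5 x 5 conv with the same receptive field
print(two_3x3, one_5x5)        # 73728 102400
```

The stacked version is cheaper and inserts an extra non-linearity between the two layers, which is part of why depth helped VGGNet.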

GoogLeNet architecture

In the 2014 ILSVRC, Google published its own network, known as GoogLeNet. Its performance is a little better than VGGNet's: GoogLeNet's top-5 error rate is 6.7%, compared to VGGNet's 7.3%. The main attractive feature of GoogLeNet is that it runs very fast, due to the introduction of a new concept called the inception module, which reduces the number of parameters to only 5 million; that's 12 times fewer than AlexNet. It has lower memory use and lower power use too.

With 22 layers, it is a very deep network. Simply adding more layers would increase the number of parameters and make it likely that the network overfits; it would also mean more computation, because a linear increase in filters results in a quadratic increase in computation. So, the designers use the inception module and global average pooling (GAP). The fully connected layer at the end of the network is replaced...
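The parameter saving from the inception module's 1 x 1 "bottleneck" convolutions can be illustrated with a quick count. The channel numbers below are an illustrative example (chosen to resemble the 5 x 5 branch of an early inception module), not a full accounting of the network:

```python
# Direct 5 x 5 convolution: 192 input channels -> 32 output channels
direct = 5 * 5 * 192 * 32                     # 153600 weights
# Bottleneck: 1 x 1 conv reduces 192 -> 16 channels, then 5 x 5 conv 16 -> 32
reduced = 1 * 1 * 192 * 16 + 5 * 5 * 16 * 32  # 3072 + 12800 = 15872 weights
print(direct, reduced)  # 153600 15872
```

Reducing the channel count before the expensive 5 x 5 convolution cuts the weights by roughly an order of magnitude in this example, which is how GoogLeNet stays at about 5 million parameters despite its depth.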
