Practical Convolutional Neural Networks

Product type: Book
Published: Feb 2018
Publisher: Packt
ISBN-13: 9781788392303
Pages: 218
Edition: 1st
Authors: Mohit Sewak, Md. Rezaul Karim, Pradeep Pujari

Table of Contents (11 chapters)

  • Preface
  • Deep Neural Networks – Overview
  • Introduction to Convolutional Neural Networks
  • Build Your First CNN and Performance Optimization
  • Popular CNN Model Architectures
  • Transfer Learning
  • Autoencoders for CNN
  • Object Detection and Instance Segmentation with CNN
  • GAN: Generating New Images with CNN
  • Attention Mechanism for CNN and Visual Models
  • Other Books You May Enjoy

References


  1. Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron C. Courville, Ruslan Salakhutdinov, Richard S. Zemel, Yoshua Bengio, Show, Attend and Tell: Neural Image Caption Generation with Visual Attention, CoRR, arXiv:1502.03044, 2015.
  2. Karl Moritz Hermann, Tomáš Kočiský, Edward Grefenstette, Lasse Espeholt, Will Kay, Mustafa Suleyman, Phil Blunsom, Teaching Machines to Read and Comprehend, CoRR, arXiv:1506.03340, 2015.
  3. Volodymyr Mnih, Nicolas Heess, Alex Graves, Koray Kavukcuoglu, Recurrent Models of Visual Attention, CoRR, arXiv:1406.6247, 2014.
  4. Long Chen, Hanwang Zhang, Jun Xiao, Liqiang Nie, Jian Shao, Tat-Seng Chua, SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks for Image Captioning, CoRR, arXiv:1611.05594, 2016.
  5. Kan Chen, Jiang Wang, Liang-Chieh Chen, Haoyuan Gao, Wei Xu, Ram Nevatia, ABC-CNN: An Attention Based Convolutional Neural Network for Visual Question Answering, CoRR, arXiv:1511.05960, 2015.
  6. Wenpeng Yin, Sebastian Ebert, Hinrich Schütze, Attention-Based...

Summary


The attention mechanism is one of the most active topics in deep learning today and sits at the center of much cutting-edge research and many probable future applications. Problems such as image captioning and visual question answering have found strong solutions through this approach. In fact, attention is not limited to visual tasks: it was conceived earlier for neural machine translation and other sophisticated NLP problems. Understanding the attention mechanism is therefore vital to mastering many advanced deep learning techniques.

CNNs are used not only for vision but also, combined with attention, for solving complex NLP problems such as modeling sentence pairs and machine translation. This chapter covered the attention mechanism and its application to some NLP problems, along with image captioning and recurrent models of visual attention. In RAMs, we did not use a CNN; instead, we applied an RNN and attention to reduced...

Types of Attention


There are two types of attention mechanisms. They are as follows:

  • Hard attention
  • Soft attention

Let's now take a look at each one in detail in the following sections.

Hard Attention

In reality, in our recent image-captioning example, several more patches would be selected, but because we trained with the handwritten captions, those would never be weighted higher. The essential thing to understand is how the system decides which pixels (or, more precisely, their CNN representations) to focus on when drawing these high-resolution views of different aspects, and then how it chooses the next pixel at which to repeat the process.

In the preceding example, the points are chosen at random from a distribution, and the process is repeated. Which pixels around each chosen point receive a higher resolution is decided inside the attention network. This type of attention is known as hard attention.
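To make the sampling step concrete, here is a minimal NumPy sketch of a hard-attention glimpse selection. The function name and the toy scores are illustrative assumptions, not code from this book; the point is that hard attention converts scores into a probability distribution and then draws a single location, zeroing out all others:

```python
import numpy as np

def hard_attention_glimpse(scores, rng):
    """Sample one glimpse location from a categorical distribution
    defined by raw attention scores (hard attention)."""
    # Softmax turns the raw scores into a probability distribution.
    exp = np.exp(scores - scores.max())
    probs = exp / exp.sum()
    # Hard attention: draw ONE location; every other location gets
    # exactly zero weight. This discrete sampling step is what makes
    # hard attention non-differentiable.
    idx = rng.choice(len(scores), p=probs)
    one_hot = np.zeros_like(scores)
    one_hot[idx] = 1.0
    return idx, one_hot

rng = np.random.default_rng(0)
scores = np.array([0.1, 2.0, 0.3, 0.5])  # raw scores for 4 candidate regions
idx, mask = hard_attention_glimpse(scores, rng)
```

Because only one location survives (`mask` is one-hot), gradients cannot flow back through the choice itself, which is why training hard attention typically relies on techniques such as reinforcement learning rather than plain backpropagation.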

Hard attention suffers from what is known as the differentiability problem. Let's spend some...

Using attention to improve visual models


As we discovered in the NLP example covered in the earlier section on attention mechanism intuition, attention helped us greatly, both in enabling new use cases that were not feasible with conventional NLP and in vastly improving the performance of existing NLP mechanisms. The same holds for the use of attention in CNNs and visual models.

In Chapter 7, Object Detection and Instance Segmentation with CNN, we discovered how attention-like mechanisms are used as region proposal networks in architectures such as Faster R-CNN and Mask R-CNN to greatly enhance and optimize the proposed regions and to enable the generation of segmentation masks. That corresponds to the first part of the discussion. In this section, we cover the second part, where we use the attention mechanism to improve the performance of our CNNs, even under extreme conditions.
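As a rough illustration of how attention can re-weight a CNN's output, here is a minimal NumPy sketch of soft spatial attention over a feature map. This is an assumption-laden toy (the function name, the dot-product scoring, and the tiny feature map are all invented for illustration), not the region proposal mechanism of Faster R-CNN or Mask R-CNN:

```python
import numpy as np

def soft_spatial_attention(features, query):
    """Re-weight a CNN feature map of shape (H, W, C) by how well each
    spatial location matches a query vector of shape (C,)."""
    h, w, c = features.shape
    flat = features.reshape(-1, c)       # (H*W, C): one row per location
    scores = flat @ query                # alignment score per location
    exp = np.exp(scores - scores.max())
    weights = exp / exp.sum()            # softmax over ALL locations
    # Weighted sum of location features. Every location contributes a
    # little, so the whole operation stays differentiable, unlike the
    # sampling step in hard attention.
    context = weights @ flat             # (C,) attended context vector
    return context, weights.reshape(h, w)

feats = np.arange(2 * 2 * 3, dtype=float).reshape(2, 2, 3)  # toy 2x2x3 map
q = np.ones(3)                                              # toy query
ctx, attn = soft_spatial_attention(feats, q)
```

The returned `attn` map sums to 1 and can be inspected to see where the model "looks", which is one reason soft attention is also a useful visualization tool for CNNs.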

Reasons for sub-optimal performance of visual CNN models

The performance...

