Packt+ | Advance your knowledge in tech

You're reading from Hands-On Image Processing with Python

Product typeBook

Published inNov 2018

Reading LevelIntermediate

PublisherPackt

ISBN-139781789343731

Edition1st Edition

Languages

Python

Tools

Keras

Concepts

Computer Vision

Author (1)

Sandipan Dey

Chapter 10. Deep Learning in Image Processing - Image Classification

In this chapter, we shall discuss recent advances in image processing with deep learning. We'll start by differentiating between classical and deep learning techniques, followed by a conceptual section on convolutional neural networks (CNN), the deep neural net architectures particularly useful for image processing. Then we'll continue our discussion on the image classification problem with a couple of image datasets and how to implement it with TensorFlow and Keras, two very popular deep learning libraries. Also, we'll see how to train deep CNN architectures and use them for predictions.

The topics to be covered in this chapter are as follows:

Deep learning in image processing
CNNs
Image classification with TensorFlow or Keras with the handwritten digits images dataset
Some popular deep CNNs (VGG-16/19, InceptionNet, ResNet) with an application in classifying the cats versus dogs images with the VGG-16 network

Deep learning in image processing

The main goal of Machine Learning (ML) is generalization; that is, we train an algorithm on a training dataset and we want the algorithm to work with high performance (accuracy) on an unseen dataset. In order to solve a complex image processing task (such as image classification), the more training data we have, we may expect better generalization—ability of the ML model learned, provided we have taken care of overfitting (for example, with regularization). But with traditional ML techniques, not only does it become computationally very expensive with huge training data, but also, the learning (improvement in generalization) often stops at a certain point. Also, the traditional ML algorithms often need lots of domain expertise and human intervention and they are only capable of what they are designed for—nothing more and nothing less. This is where deep learning models are very promising.

What is deep learning?

Some of the well-known and widely accepted definitions...

CNNs

CNNs are deep neural networks for which the primarily used input is images. CNNs learn the filters (features) that are hand-engineered in traditional algorithms. This independence from prior knowledge and human effort in feature design is a major advantage. They also reduce the number of parameters to be learned with their shared-weights architecture and possess translation invariance characteristics. In the next subsection, we'll discuss the general architecture of a CNN and how it works.

Conv or pooling or FC layers – CNN architecture and how it works

The next screenshot shows the typical architecture of a CNN. It consists of one or more convolutional layer, followed by a nonlinear ReLU activation layer, a pooling layer, and, finally, one (or more) fully connected (FC) layer, followed by an FC softmax layer, for example, in the case of a CNN designed to solve an image classification problem.

There can be multiple convolution ReLU pooling sequences of layers in the network, making the...

Image classification with TensorFlow or Keras

In this section, we shall revisit the problem of handwritten digits classification (with the MNIST dataset), but this time with deep neural networks. We are going to solve the problem using two very popular deep learning libraries, namely TensorFlow and Keras. TensorFlow (TF) is the most famous library used in production for deep learning models. It has a very large and awesome community. However, TensorFlow is not that easy to use. On the other hand, Keras is a high level API built on TensorFlow. It is more user-friendly and easy to use compared to TF, although it provides less control over low-level structures. Low-level libraries provide more flexibility. Hence TF can be tweaked much more as compared to Keras.

Classification with TF

First, we shall start with a very simple deep neural network, one containing only a single FC hidden layer (with ReLU activation) and a softmax FC layer, with no convolutional layer. The next screenshot shows the...

Some popular deep CNNs

In this section, let's discuss popular deep CNNs (for example, VGG-18/19, ResNet, and InceptionNet) used for image classification. The following screenshot shows single-crop accuracies (top-1 accuracy: how many times the correct label has the highest probability predicted by the CNN) of the most relevant entries submitted to the ImageNet challenge, from AlexNet (Krizhevsky et al., 2012), on the far left, to the best performing, Inception-v4 (Szegedy et al., 2016):

Also, we shall train a VGG-16 CNN with Keras to classify the cat images against the dog images.

VGG-16/19

The following screenshot shows the architecture of a popular CNN called VGG-16/19. The remarkable thing about the VGG-16 net is that, instead of having so many hyper-parameters, it lets you use a much simpler network where you focus on just having convolutional layers that are just 3 x 3 filters with a stride of 1 and that always use the same padding and make all the max pooling layers 2 x 2 with a stride...

Summary

In this chapter, the recent advances in image processing with deep learning models were introduced. We started by discussing the basic concepts of deep learning, how it's different from traditional ML, and why we need it. Then CNNs were introduced as deep neural networks designed particularly to solve complex image processing and computer vision tasks. The CNN architecture with convolutional, pooling, and FC layers were discussed. Next, we introduced TensorFlow and Keras, two popular deep learning libraries in Python. We showed how test accuracy on the MNIST dataset for handwritten digits classification can be increased with CNNs, then the same using FC layers only. Finally, we discussed a few popular networks such as VGG-16/19, GoogleNet, and ResNet. Kera's VGG-16 model was trained on Kaggle's Dogs vs. Cats competition images and we showed how it performs on the validation image dataset with decent accuracy.

In the next chapter, we'll discuss how to solve more complex image processing...

Questions

For classification of the mnist dataset using an FC layer with Keras, write a Python code fragment to visualize the output layer (what the neural network sees).
For classification of the mnist dataset using the neural network with FC layers only and with the CNN with Keras, we have directly used the test dataset for evaluating the model while training it. Set aside a few thousand images from training images and create a validation dataset and train the model on the remaining images. Use the validation dataset to evaluate the model while training. At the end of training, use the model learned to predict the labels of the test dataset and evaluate the accuracy of the model. Does it increase?

Use VGG-16/19, Resnet-50, and Inception V3 models (from Keras) to train (from scratch) on the mnist training images. What is the maximum accuracy you get on the test images?

Sandipan Dey is a data scientist with a wide range of interests, covering topics such as machine learning, deep learning, image processing, and computer vision. He has worked in numerous data science fields, working with recommender systems, predictive models for the events industry, sensor localization models, sentiment analysis, and device prognostics. He earned his master's degree in computer science from the University of Maryland, Baltimore County, and has published in a few IEEE Data Mining conferences and journals. He has earned certifications from 100+ MOOCs on data science, machine learning, deep learning, image processing, and related courses. He is a regular blogger (sandipanweb) and is a machine learning education enthusiast.
Read more about Sandipan Dey

Other recommended products

Related to this chapter

Python Image Processing Cookbook

Advancements in wireless devices and mobile technology have enabled the acquisition of a tremendous amount of graphics, pictures, and videos. Through cutting edge recipes, this book provides coverage on tools, algorithms, and analysis for image processing. This book provides solutions addressing the challenges and complex tasks of image processing.

BookApr 2020438 pages

Computer Vision with Python 3

The field of computer vision involves designing and implementing algorithms to understand images and extract meaningful information from them. This book enables you to build real-world applications using Python and open source image processing libraries.

BookAug 2017206 pages

Hands-On Computer Vision with Julia

This book is a thorough guide for developers who want to get started with building computer vision applications using Julia. Julia is well suited to image processing because of its ease of use and the fact that it lets you write easy-to-compile and efficient machine code.

BookJun 2018202 pages

Raspberry Pi Computer Vision Programming

You will learn the basics of hardware and software required for image processing and computer vision with Raspberry Pi and Python 3. You will have a look at all the major image processing, manipulation, and computer vision techniques and algorithms in detail using engaging examples. You will build a lot of real-life computer vision applications.

BookJun 2020306 pages5

OpenCV 3 Computer Vision Application Programming Cookbook

BookFeb 2017474 pages

OpenCV 3.x with Python By Example

Computer vision is found everywhere in modern technology. OpenCV for Python enables us to run computer vision algorithms in real time. With the advent of powerful machines, we have more processing power to work with. Using this technology, we can seamlessly integrate our computer vision applications into the cloud. Focusing on OpenCV 3.x and Python 3.6, this book will walk you through all the building blocks needed to build amazing computer vision applications with ease.

BookJan 2018268 pages

OpenCV 3 Computer Vision with Python Cookbook

OpenCV 3 is a native cross-platform library for computer vision, machine learning, and image processing. OpenCV's convenient high-level APIs hide very powerful internals designed for computational efficiency that can take advantage of multicore and GPU processing. This book will help you tackle increasingly challenging computer vision problems by providing a number of recipes that you can use to improve your applications.

BookMar 2018306 pages

Practical Computer Vision

Computer Vision is a broadly used term associated with acquiring, processing, and analyzing images. This book will show you how you can perform various Computer Vision techniques in the most practical way possible. Right from capturing images from various sources, you will learn how to perform image filtering/manipulation and detect features in your images. As you go through the chapters, you'll work with increasingly complex algorithms to develop complex Computer Vision applications

BookFeb 2018234 pages

OpenCV 4 Computer Vision Application Programming Cookbook

This book will present a variety of CV algorithms using the standard library. It will implement any shortfall that might come in CV by practicing the recipes that implement various tasks such as image processing and object recognition among others. It will help you in implementing CV algorithms to meet the technical requirement of your projects.

BookMay 2019494 pages

Hands-On GPU-Accelerated Computer Vision with OpenCV and CUDA

This book is a guide to explore how accelerating of computer vision applications using GPUs will help you develop algorithms that work on complex image data in real time. It will solve the problems you face while deploying these algorithms on embedded platforms with the help of development boards from NVIDIA such as the Jetson TX1, Jetson TX2, and Jetson TK1.

BookSep 2018380 pages

The Computer Vision Workshop

With The Computer Vision Workshop, you’ll explore the basic and advanced techniques in video and image processing using OpenCV and Python. It is filled with real-world exercises and activities that will make the learning process easy and enjoyable.

BookJul 2020568 pages

Learn OpenCV 4 By Building Projects

OpenCV is mainly used in Computer Vision and image processing and is considered to be one of the best open source libraries that helps developers focus on constructing complete projects on image processing, motion detection, and image segmentation. This book will be your guide to understanding the basic OpenCV concepts and algorithms.

BookNov 2018310 pages

Personalised recommendations for you

Based on your interests and search pattern

Et al.

Ever wonder why speech recognition systems don't understand the Scottish accent, or what would happen if an astronaut only ate mac 'n' cheese, or other spurious reflections you'd have at a bar? We did, then collated those deliberations into absurd research articles with fake figures and methodologies inspired by even more fictionally absurd studies.

BookAug 2023230 pages5

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages4

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages5

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages1

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages5

Mastering Tableau 2023

This book is a comprehensive resource to mastering your Tableau skills and becoming a BI expert. As you progress, you will learn how to build advanced dashboards and improve your storytelling to derive key business insight, as well as make you well-versed with advanced functionalities of Tableau in the business intelligence domain.

BookAug 2023684 pages

Building AI Applications with ChatGPT APIs

This guide covers all ChatGPT API features for effortless creation of robust AI powered apps. With its help, you’ll be able to leverage ChatGPT’s cutting-edge NLP models to take your app development skills to the next level. You’ll also work on ten exciting projects that will give you the practical know-how that you can apply to your existing applications.

BookSep 2023258 pages5

Building AI Applications with ChatGPT APIs

This guide covers all ChatGPT API features for effortless creation of robust AI powered apps. With its help, you’ll be able to leverage ChatGPT’s cutting-edge NLP models to take your app development skills to the next level. You’ll also work on ten exciting projects that will give you the practical know-how that you can apply to your existing applications.

BookSep 2023258 pages2

Data Engineering with AWS

Embark on a journey to master data engineering pipelines on AWS! Our book offers a hands-on experience of AWS services for ingesting, transforming, and consuming data. Whether you're an absolute beginner or someone with basic data engineering experience, this guide is an indispensable resource.

BookOct 2023636 pages5

Modern Data Architecture on AWS

Every organization wants an agile, performant, and cost-effective data platform that meets all their current and future business needs. Purpose-built AWS analytics services and their features play a big part in building such a modern data platform. This book brings to you all the design and architectural patterns that’ll help you achieve this goal.

BookAug 2023420 pages5

Practical Guide to Applied Conformal Prediction in Python

Discover the power of Conformal Prediction with the "Practical Guide to Applied Conformal Prediction in Python." Master the latest techniques to quantify uncertainty in machine learning and computer vision models, and seamlessly apply them to your industry applications.

BookDec 2023240 pages

TinyML Cookbook

With over 70 project-based recipes, the TinyML Cookbook is a practical guide that will help you to get the most out of your microcontrollers. It provides a comprehensive understanding of the theoretical foundations while giving you hands-on experience training ML models for deployment on Arduino Nano 33 BLE Sense, Raspberry Pi Pico, and SparkFun RedBoard Artemis Nano microcontrollers.

BookNov 2023664 pages

You're reading from Hands-On Image Processing with Python

Chapter 10. Deep Learning in Image Processing - Image Classification

Deep learning in image processing

What is deep learning?

CNNs

Conv or pooling or FC layers – CNN architecture and how it works

Image classification with TensorFlow or Keras

Classification with TF

Some popular deep CNNs

VGG-16/19

Summary

Questions

Further reading

Unlock this book and the full library FREE for 7 days

Author (1)

Python Image Processing Cookbook

Computer Vision with Python 3

The field of computer vision involves designing and implementing algorithms to understand images and extract meaningful information from them. This book enables you to build real-world applications using Python and open source image processing libraries.

Hands-On Computer Vision with Julia

This book is a thorough guide for developers who want to get started with building computer vision applications using Julia. Julia is well suited to image processing because of its ease of use and the fact that it lets you write easy-to-compile and efficient machine code.

Raspberry Pi Computer Vision Programming

OpenCV 3 Computer Vision Application Programming Cookbook

OpenCV 3.x with Python By Example

OpenCV 3 Computer Vision with Python Cookbook

Practical Computer Vision

OpenCV 4 Computer Vision Application Programming Cookbook

Hands-On GPU-Accelerated Computer Vision with OpenCV and CUDA

The Computer Vision Workshop

With The Computer Vision Workshop, you’ll explore the basic and advanced techniques in video and image processing using OpenCV and Python. It is filled with real-world exercises and activities that will make the learning process easy and enjoyable.

Learn OpenCV 4 By Building Projects

Et al.

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

Mastering Tableau 2023

Building AI Applications with ChatGPT APIs

Building AI Applications with ChatGPT APIs

Data Engineering with AWS

Embark on a journey to master data engineering pipelines on AWS! Our book offers a hands-on experience of AWS services for ingesting, transforming, and consuming data. Whether you're an absolute beginner or someone with basic data engineering experience, this guide is an indispensable resource.

Modern Data Architecture on AWS

Practical Guide to Applied Conformal Prediction in Python

Discover the power of Conformal Prediction with the "Practical Guide to Applied Conformal Prediction in Python." Master the latest techniques to quantify uncertainty in machine learning and computer vision models, and seamlessly apply them to your industry applications.

TinyML Cookbook