Modern Computer Vision with PyTorch


Product type Book
Published in Nov 2020
Publisher Packt
ISBN-13 9781839213472
Pages 824 pages
Edition 1st Edition
Authors (2): V Kishore Ayyadevara, Yeshwanth Reddy

Table of Contents (25 chapters)

  • Preface
  • Section 1 - Fundamentals of Deep Learning for Computer Vision
      • Artificial Neural Network Fundamentals
      • PyTorch Fundamentals
      • Building a Deep Neural Network with PyTorch
  • Section 2 - Object Classification and Detection
      • Introducing Convolutional Neural Networks
      • Transfer Learning for Image Classification
      • Practical Aspects of Image Classification
      • Basics of Object Detection
      • Advanced Object Detection
      • Image Segmentation
      • Applications of Object Detection and Segmentation
  • Section 3 - Image Manipulation
      • Autoencoders and Image Manipulation
      • Image Generation Using GANs
      • Advanced GANs to Manipulate Images
  • Section 4 - Combining Computer Vision with Other Techniques
      • Training with Minimal Data Points
      • Combining Computer Vision and NLP Techniques
      • Combining Computer Vision and Reinforcement Learning
      • Moving a Model to Production
      • Using OpenCV Utilities for Image Analysis
  • Other Books You May Enjoy
  • Appendix

Using OpenCV Utilities for Image Analysis

In previous chapters, we learned how to perform object classification, localization, and segmentation, and how to generate images. All of these techniques rely on deep learning, but for relatively simple, well-defined tasks we can instead use specific functionality provided by the OpenCV package. For example, we don't need YOLO if the object to be detected is always the same object against the same background. When images come from a constrained environment, there is a good chance that one of the OpenCV utilities can solve much of the problem.

We cover only a few use cases in this chapter; OpenCV has so many utilities that covering them all would warrant a separate book. In doing word detection, you will...

Drawing bounding boxes around words in an image

Imagine a scenario where you are building a model that performs word transcription from the image of a document. The first step would be to identify the location of words within the image. Primarily, there are two ways of identifying words within an image:

  • Using deep learning techniques such as CRAFT, EAST, and more
  • Using OpenCV-based techniques

In this section, we will learn how machine-printed words can be identified in a clean image without deep learning. Because the contrast between the background and foreground is high, you do not need a heavyweight solution such as YOLO to locate individual words. OpenCV is especially handy in these scenarios because we can arrive at a solution with very limited computational resources, and consequently the inference time is also very small. The only drawback is that the accuracy may not be 100%, but that is also subject to how clean the scanned...

Detecting lanes in an image of a road

Imagine a scenario where you have to detect the lanes within an image of a road. One way to solve this is with semantic segmentation techniques from deep learning. A traditional OpenCV approach, by contrast, relies on edge and line detectors. In this section, we will learn how edge detection followed by line detection can help identify lanes within an image of a road.

Here is a high-level outline of the strategy:

  1. Find the edges of various objects present in the image.
  2. Identify the edges that follow a straight line and are also connected.
  3. Extend the identified lines from one end of the image to the other end.

Let's code up our strategy:

The following code is available as detecting_lanes_in_the_image_of_a_road.ipynb in the Chapter18 folder of this book's GitHub repository: https://tinyurl.com/mcvp-packt. Be sure to copy the URL from the notebook in GitHub to avoid any...

Detecting objects based on color

Green screen is a classic video editing technique that makes someone appear to be standing in front of a completely different background. It is widely used in weather reports, where reporters point to backgrounds of moving clouds and maps. The trick is that the reporter never wears a certain color of clothing (say, green) and stands in front of a background that is only green. Identifying the green pixels then identifies the background, so that content can be replaced at only those pixels.

In this section, we will learn about leveraging the cv2.inRange and cv2.bitwise_and methods to detect the green color in any given image.

The strategy that we will adopt is as follows:

  1. Convert the image from RGB into HSV space.
  2. Specify the upper and lower limits of HSV space that correspond to the color green.
  3. Identify the pixels that have a green color – this will be the mask.
  4. Perform a bitwise_and operation between the original...

Building a panoramic view of images

In this section, we will learn about one of the techniques that helps in creating a panoramic view by combining multiple images.

Imagine a scenario where you are capturing the panorama of a place using your camera. Essentially, you are taking multiple shots, and in the backend, the algorithm is mapping the common elements present across the images (moving from the leftmost to the rightmost side) into a single image.

To perform the stitching of images, we will leverage the ORB (Oriented FAST and Rotated BRIEF) method available in cv2. Getting into the details of how these algorithms work is beyond the scope of this book – we encourage you to go through the documentation and the paper available at https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_feature2d/py_orb/py_orb.html.

At a high level, the method identifies keypoints within a query image (image1) and then associates them with the keypoints identified in another training...

Detecting the number plate of a car

Imagine a scenario where we ask you to identify the location of a number plate in the image of a car. One approach, covered in the chapters on object detection, is to use anchor box-based techniques to locate the number plate. This would require us to train a model on a few hundred images before we could use it.

However, there is a cascade classifier that is readily available as a pre-trained file that we can use to identify the location of the number plate in an image of a car. A classifier is a cascade classifier if it consists of several simpler classifiers (stages) that are applied one after another to a region of interest until, at some stage, the candidate region is rejected or all the stages are passed. The stages are analogous to the convolution kernels that we have used so far. Instead of having a deep neural network that learns kernels from other kernels, this is a list of kernels that have...
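The stage-by-stage rejection described above can be illustrated with a toy cascade in plain Python. This is only a conceptual sketch: the three stages and their thresholds are hypothetical hand-written rules, whereas the stages of a real pre-trained cascade are learned from data.

```python
from typing import Callable, List, Sequence

Region = Sequence[int]            # flattened grayscale pixel values
Stage = Callable[[Region], bool]  # a stage accepts or rejects a region

def run_cascade(stages: List[Stage], region: Region) -> bool:
    """Return True only if every stage accepts the region."""
    for stage in stages:
        if not stage(region):
            return False  # rejected: the remaining stages never run
    return True

calls = []  # records which stages were evaluated

def bright_enough(region: Region) -> bool:
    calls.append("bright_enough")
    return sum(region) / len(region) > 100

def has_contrast(region: Region) -> bool:
    calls.append("has_contrast")
    return max(region) - min(region) > 80

def has_dark_strokes(region: Region) -> bool:
    calls.append("has_dark_strokes")
    dark_fraction = sum(1 for p in region if p < 60) / len(region)
    return 0.1 <= dark_fraction <= 0.5

stages = [bright_enough, has_contrast, has_dark_strokes]

plate_like = [230] * 70 + [20] * 30  # bright background, dark characters
dark_patch = [15] * 100              # uniformly dark region

is_plate = run_cascade(stages, plate_like)
calls_for_plate = list(calls)
calls.clear()
is_dark_plate = run_cascade(stages, dark_patch)
calls_for_dark = list(calls)

print(is_plate, calls_for_plate)
print(is_dark_plate, calls_for_dark)
```

Because most candidate regions are rejected by the cheap early stages, only a few regions pay the cost of every stage, which is what makes cascades fast. In OpenCV, a pre-trained cascade file is loaded with `cv2.CascadeClassifier` and applied to an image with its `detectMultiScale` method.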

Summary

In this chapter, we learned how to use OpenCV-based techniques to identify contours, edges, and lines, and to track colored objects. Although we discussed only a few use cases, these techniques apply to a much broader range of problems. We then learned how to identify similarities between two related images using keypoint and feature extraction techniques in order to stitch them together. Finally, we learned about cascade classifiers and how pre-trained ones let us arrive at a working solution with little development effort while also generating predictions in real time.

Broadly, this chapter aimed to show that not all problems need neural networks; especially in constrained environments, we can draw on a vast library of established knowledge and techniques to solve them quickly. Where a problem cannot be solved with OpenCV, we have already delved deep into neural networks.

Images are fascinating...
