
You're reading from OpenCV 3.0 Computer Vision with Java

Product type: Book
Published in: Jul 2015
Reading level: Intermediate
ISBN-13: 9781783283972
Edition: 1st
Author: Daniel Lélis Baggio

Daniel Lélis Baggio started his work in computer vision through medical image processing at InCor (Instituto do Coração – Heart Institute) in São Paulo, Brazil, where he worked on intravascular ultrasound (IVUS) image segmentation. He then focused on GPGPU and ported that algorithm to work with NVIDIA's CUDA. He has also dived into six-degrees-of-freedom head tracking with the Natural User Interface group through a project called EHCI (http://code.google.com/p/ehci/). He also wrote Mastering OpenCV with Practical Computer Vision Projects for Packt Publishing.
Chapter 4. Image Transforms

This chapter covers methods that change an image into an alternate representation of its data in order to solve important problems in computer vision and image processing. Some examples of these methods are filters that are used to find image edges, as well as transforms that help us find lines and circles in an image. In this chapter, we cover the stretch, shrink, warp, and rotate operations. A very useful and famous transform is the Fourier transform, which converts signals between the time domain and the frequency domain; in OpenCV, you can find the Discrete Fourier Transform (DFT) and the Discrete Cosine Transform (DCT). Another transform that we cover in this chapter is related to integral images, which allow the rapid summing of subregions, a very useful step in face-tracking algorithms. Besides this, you will also get to see the distance transform and histogram equalization in this chapter.

We will cover the following topics:

  • Gradients and Sobel derivatives

  • The Laplace...

The Gradient and Sobel derivatives


A key building block in computer vision is finding edges, and this is closely related to finding an approximation to derivatives of an image. From basic calculus, it is known that a derivative shows the variation of a given function or input signal along some dimension. The local maxima of the derivative yield the regions where the signal varies the most, which for an image might mean an edge. Fortunately, there is an easy way to approximate a derivative for discrete signals: kernel convolution. A convolution basically means applying some transform to every part of the image. The transform most used for differentiation is the Sobel filter [1], which works for horizontal, vertical, and even mixed partial derivatives of any order.

In order to approximate the value of the horizontal derivative, the following Sobel kernel matrix is convolved with the input image:

    Gx = | -1  0  +1 |
         | -2  0  +2 |
         | -1  0  +1 |

This means that, for each input pixel, the calculated value of its...
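To make the mechanics concrete, here is a minimal plain-Java sketch of this horizontal filtering step. This is an illustrative stand-alone example (the class and method names are made up), not the book's code; in practice you would call Imgproc.Sobel and let OpenCV handle borders and output depth.

```java
// Minimal sketch: filter a grayscale image (2D int array) with the
// horizontal Sobel kernel Gx. Border pixels are left at zero for brevity.
public class SobelSketch {
    static final int[][] GX = {
        {-1, 0, 1},
        {-2, 0, 2},
        {-1, 0, 1}
    };

    public static int[][] sobelX(int[][] img) {
        int h = img.length, w = img[0].length;
        int[][] out = new int[h][w];
        for (int y = 1; y < h - 1; y++) {
            for (int x = 1; x < w - 1; x++) {
                int sum = 0;
                for (int ky = -1; ky <= 1; ky++)
                    for (int kx = -1; kx <= 1; kx++)
                        sum += GX[ky + 1][kx + 1] * img[y + ky][x + kx];
                out[y][x] = sum;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // A vertical step edge: dark on the left, bright on the right.
        int[][] img = {
            {0, 0, 255, 255},
            {0, 0, 255, 255},
            {0, 0, 255, 255},
            {0, 0, 255, 255}
        };
        int[][] gx = sobelX(img);
        System.out.println(gx[1][1] + " " + gx[1][2]); // strong response at the edge
    }
}
```

Note how the kernel responds strongly where pixel values change from left to right and gives zero in flat regions, which is exactly the behavior we want from a horizontal derivative.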

The Laplace and Canny transforms


Another quite useful operator for finding edges is the Laplacian transformation. Instead of relying on first-order derivatives, OpenCV's Laplacian transformation implements the discrete operator for the following function:

    laplace(f) = ∂²f/∂x² + ∂²f/∂y²

This operator can be approximated by convolution with the following kernel when finite difference methods and a 3 x 3 aperture are used:

    | 0   1  0 |
    | 1  -4  1 |
    | 0   1  0 |

The signature for the preceding function is as follows:

Laplacian(Mat source, Mat destination, int ddepth)

While the source and destination matrices are straightforward parameters, ddepth is the depth of the destination matrix. When you set this parameter to -1, the destination will have the same depth as the source image, although you might want more depth when you apply this operator. Besides this, there are overloaded versions of this method that receive an aperture size, a scale factor, and a delta value that is added to the result.
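To illustrate what this kernel computes, here is a hedged plain-Java sketch of the 3 x 3 Laplacian; the class and method names are invented for this example, and OpenCV's Laplacian method does the same work (plus border handling and depth control) for you.

```java
// Minimal sketch of the discrete Laplacian using the 3x3 finite-difference
// kernel [[0,1,0],[1,-4,1],[0,1,0]]. Borders are skipped for brevity.
public class LaplacianSketch {
    public static int[][] laplacian(int[][] img) {
        int h = img.length, w = img[0].length;
        int[][] out = new int[h][w];
        for (int y = 1; y < h - 1; y++)
            for (int x = 1; x < w - 1; x++)
                out[y][x] = img[y - 1][x] + img[y + 1][x]
                          + img[y][x - 1] + img[y][x + 1]
                          - 4 * img[y][x];
        return out;
    }

    public static void main(String[] args) {
        // A flat region gives zero response; a bright spot gives a strong one.
        int[][] img = {
            {10, 10, 10},
            {10, 50, 10},
            {10, 10, 10}
        };
        System.out.println(laplacian(img)[1][1]); // 10+10+10+10 - 4*50 = -160
    }
}
```

The sign flip around an edge (positive on one side, negative on the other) is what makes the Laplacian's zero crossings useful for edge localization.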

Besides using the Laplacian method, you can also use the Canny algorithm, which is an excellent approach that was proposed by...

The line and circle Hough transforms


If you need to find straight lines or circles in an image, the Hough transforms are very useful. In this section, we will cover the OpenCV methods that extract them from your image.

The idea behind the original Hough line transform is that any point in a binary image could be part of a set of lines. Suppose each straight line is parameterized by the line equation y = mx + b, where m is the line's slope and b is its y-intercept. We could iterate over the whole binary image, storing the m and b parameters for each foreground point and checking their accumulation. The local maxima of the accumulated m and b parameters would yield the equations of the straight lines that appear most in the image. In practice, instead of the slope and y-intercept, we use the polar representation of a straight line.
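The voting scheme just described can be sketched in plain Java using the polar parameterization rho = x·cos(theta) + y·sin(theta). This is an illustrative toy with whole-degree and whole-pixel quantization and no thresholding, not OpenCV's implementation (for real work, use Imgproc.HoughLines).

```java
public class HoughSketch {
    // Vote in a (theta, rho) accumulator for every foreground pixel of a
    // binary image; return {theta, rho, votes} for the strongest line.
    public static int[] strongestLine(int[][] binary) {
        int h = binary.length, w = binary[0].length;
        int maxRho = (int) Math.ceil(Math.hypot(h, w));
        int[][] acc = new int[180][2 * maxRho + 1]; // rho can be negative: offset by maxRho
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++) {
                if (binary[y][x] == 0) continue;
                for (int t = 0; t < 180; t++) {
                    double rad = Math.toRadians(t);
                    int rho = (int) Math.round(x * Math.cos(rad) + y * Math.sin(rad));
                    acc[t][rho + maxRho]++; // this pixel votes for line (t, rho)
                }
            }
        int bestT = 0, bestR = 0, bestVotes = -1;
        for (int t = 0; t < 180; t++)
            for (int r = 0; r < acc[t].length; r++)
                if (acc[t][r] > bestVotes) {
                    bestVotes = acc[t][r];
                    bestT = t;
                    bestR = r - maxRho;
                }
        return new int[]{bestT, bestR, bestVotes};
    }

    public static void main(String[] args) {
        // A vertical line at x = 2: theta = 0 degrees, rho = 2, five votes.
        int[][] img = new int[5][5];
        for (int y = 0; y < 5; y++) img[y][2] = 1;
        int[] line = strongestLine(img);
        System.out.println(line[0] + " " + line[1] + " " + line[2]);
    }
}
```

Each foreground pixel votes for every line that could pass through it; collinear pixels pile their votes into the same accumulator cell, which is why the maximum reveals the dominant line.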

OpenCV supports not only the standard Hough transform, but also the progressive probabilistic Hough transform, for which the two functions...

Geometric transforms – stretch, shrink, warp, and rotate


While working with images and computer vision, it is very common to require the ability to preprocess an image using known geometric transforms, such as stretching, shrinking, rotation, and warping. Warping is the same as nonuniform resizing. Transforms that can be realized by multiplying source points with a 2 x 3 matrix are called affine transformations; they turn rectangles into parallelograms, and hence have the limitation of requiring the destination to have parallel sides. A 3 x 3 matrix multiplication, on the other hand, represents a perspective transform. Perspective transforms offer more flexibility, since they can map a 2D quadrilateral to any other quadrilateral. The following screenshot shows a very useful application of this concept.

Here, we will find out which is the perspective transform that maps the side of a building in a perspective view to its frontal view:

Note that the input to this problem is the perspective...
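The arithmetic behind a perspective mapping can be shown with a small plain-Java sketch that applies a 3 x 3 matrix to a single point. The matrices below are made-up examples; in practice you would let Imgproc.getPerspectiveTransform compute the matrix from four point correspondences and Imgproc.warpPerspective apply it to the whole image.

```java
public class PerspectiveSketch {
    // Apply a 3x3 perspective (homography) matrix H to a 2D point.
    public static double[] apply(double[][] H, double x, double y) {
        double xp = H[0][0] * x + H[0][1] * y + H[0][2];
        double yp = H[1][0] * x + H[1][1] * y + H[1][2];
        double wp = H[2][0] * x + H[2][1] * y + H[2][2];
        // Dividing by w is the perspective step; affine transforms keep w = 1.
        return new double[]{xp / wp, yp / wp};
    }

    public static void main(String[] args) {
        // An affine transform is the special case with a last row of (0, 0, 1):
        // here, a translation by (5, 3).
        double[][] translate = {{1, 0, 5}, {0, 1, 3}, {0, 0, 1}};
        double[] p = apply(translate, 2, 2);
        System.out.println(p[0] + " " + p[1]); // 7.0 5.0
    }
}
```

A nonzero bottom row makes w depend on the input point, which is exactly what lets a perspective transform map a rectangle to an arbitrary quadrilateral, something no 2 x 3 affine matrix can do.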

Discrete Fourier Transform and Discrete Cosine Transform


When dealing with image analysis, it would be very useful if you could change an image from the spatial domain, which is the image in terms of its x and y coordinates, to the frequency domain, in which the image is decomposed into its high- and low-frequency components, so that you would be able to see and manipulate frequency parameters. This can come in handy in image compression, because it is known that human vision is not as sensitive to high-frequency signals as it is to low-frequency signals. In this way, you can transform an image from the spatial domain to the frequency domain and remove its high-frequency components, reducing the memory required to represent the image and hence compressing it. Image frequency is pictured in a better way by the next image.

In order to change an image from the spatial domain to the frequency domain, the Discrete Fourier Transform can be used. As we might need to bring it back from the frequency domain...
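As a concrete illustration of what the DFT computes, here is a naive plain-Java sketch of the one-dimensional case. OpenCV's Core.dft works on 2D matrices and uses a fast algorithm; this toy version (with invented names) is only meant to show the formula.

```java
public class DftSketch {
    // Naive 1D discrete Fourier transform, returning per-bin magnitudes.
    // Bin k measures how much of frequency k is present in the signal.
    public static double[] magnitudes(double[] signal) {
        int n = signal.length;
        double[] mag = new double[n];
        for (int k = 0; k < n; k++) {
            double re = 0, im = 0;
            for (int t = 0; t < n; t++) {
                double angle = -2 * Math.PI * k * t / n;
                re += signal[t] * Math.cos(angle);
                im += signal[t] * Math.sin(angle);
            }
            mag[k] = Math.hypot(re, im);
        }
        return mag;
    }

    public static void main(String[] args) {
        // A constant (pure "low frequency") signal: all energy lands in bin 0.
        double[] mag = magnitudes(new double[]{3, 3, 3, 3});
        System.out.println(mag[0] + " " + Math.round(mag[1]));
    }
}
```

Discarding the high-frequency bins of such a spectrum and inverting the transform is the basic compression idea described above, and the DCT used by JPEG follows the same principle with cosine basis functions only.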

Integral images


Some face detection algorithms, such as OpenCV's face detection algorithm, make heavy use of features like the ones shown in the following image:

These are the so-called Haar-like features, and they are calculated as the sum of pixels in the white area minus the sum of pixels in the black area. You might find this type of feature kind of odd, but when it is trained for face detection, an extremely powerful classifier can be built using only two of these features, as depicted in the following image:

In fact, a classifier that uses only the two preceding features can be adjusted to detect 100 percent of a given face training database with only 40 percent false positives. Summing all the pixels in an image, as well as calculating the sum over each area, can be a long process. However, these features must be evaluated over many regions of each given input frame, hence calculating them fast is a requirement that we need to fulfill.

First, let's define an integral...
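As a preview of the idea, here is a plain-Java sketch of an integral image and of the four-lookup rectangle sum. It mirrors the convention of OpenCV's Imgproc.integral, which adds an extra zero row and column, but it is an illustrative example rather than the book's code.

```java
public class IntegralSketch {
    // Integral image: ii[y+1][x+1] holds the sum of all pixels above and to
    // the left of (x, y), inclusive. Built in a single pass.
    public static long[][] integral(int[][] img) {
        int h = img.length, w = img[0].length;
        long[][] ii = new long[h + 1][w + 1]; // extra zero row and column
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                ii[y + 1][x + 1] = img[y][x] + ii[y][x + 1] + ii[y + 1][x] - ii[y][x];
        return ii;
    }

    // Sum over the rectangle with corners (x0, y0) and (x1, y1), inclusive:
    // just four lookups, regardless of the rectangle's size. This is what
    // makes evaluating Haar-like features fast.
    public static long rectSum(long[][] ii, int x0, int y0, int x1, int y1) {
        return ii[y1 + 1][x1 + 1] - ii[y0][x1 + 1] - ii[y1 + 1][x0] + ii[y0][x0];
    }

    public static void main(String[] args) {
        int[][] img = {
            {1, 2, 3},
            {4, 5, 6},
            {7, 8, 9}
        };
        long[][] ii = integral(img);
        System.out.println(rectSum(ii, 0, 0, 2, 2)); // 45: the whole image
        System.out.println(rectSum(ii, 1, 1, 2, 2)); // 5+6+8+9 = 28
    }
}
```

A Haar-like feature then costs only a handful of these constant-time rectangle sums, no matter how large the feature window is.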

Distance transforms


Simply put, a distance transform applied to an image generates an output image whose pixel values are the distance to the closest zero-valued pixel in the input image. Basically, each pixel holds the distance to the closest background pixel, given a specified distance measure. The following screenshot gives you an idea of what happens to the silhouette of a human body:

Human silhouette by J E Theriot

This transform can be very useful in the process of obtaining the topological skeleton of a given segmented image, as well as in producing blurring effects. Another interesting application of this transform is the segmentation of overlapping objects, when it is used along with the watershed transform.

Generally, the distance transform is applied to an edge image, such as the result of a Canny filter. We are going to make use of Imgproc's distanceTransform method, which can be seen in action in the distance project, available in this chapter's source code. Here are the most important lines of this example...
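Independently of that project, the two-pass idea behind a simple distance transform can be sketched in plain Java. This toy version uses the city-block (L1) metric and invented names; it is only an illustration of the concept behind Imgproc.distanceTransform, which supports this and other metrics.

```java
public class DistanceSketch {
    // Two-pass distance transform with the city-block (L1) metric: each
    // output pixel gets the distance to the nearest zero pixel of the input.
    public static int[][] cityBlock(int[][] img) {
        int h = img.length, w = img[0].length, INF = h + w;
        int[][] d = new int[h][w];
        // Forward pass: propagate distances from the top-left corner.
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++) {
                if (img[y][x] == 0) { d[y][x] = 0; continue; }
                d[y][x] = INF;
                if (y > 0) d[y][x] = Math.min(d[y][x], d[y - 1][x] + 1);
                if (x > 0) d[y][x] = Math.min(d[y][x], d[y][x - 1] + 1);
            }
        // Backward pass: propagate distances from the bottom-right corner.
        for (int y = h - 1; y >= 0; y--)
            for (int x = w - 1; x >= 0; x--) {
                if (y < h - 1) d[y][x] = Math.min(d[y][x], d[y + 1][x] + 1);
                if (x < w - 1) d[y][x] = Math.min(d[y][x], d[y][x + 1] + 1);
            }
        return d;
    }

    public static void main(String[] args) {
        // A single background (zero) pixel in the middle of a foreground patch.
        int[][] img = {
            {1, 1, 1},
            {1, 0, 1},
            {1, 1, 1}
        };
        int[][] d = cityBlock(img);
        System.out.println(d[0][0] + " " + d[0][1] + " " + d[1][1]);
    }
}
```

The ridge of local maxima in such a distance map is what skeletonization exploits, and its regional peaks are the usual seeds for watershed-based segmentation of overlapping objects.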

Histogram equalization


The human visual system is very sensitive to contrast in images, which is the difference in the color and brightness of different objects. Besides, the human eye is a miraculous system that can perceive intensities over some 10^16 light levels [4]. No wonder some sensors can mess up the image data.

When analyzing images, it is very useful to draw their histograms. A histogram simply shows you the lightness distribution of a digital image. In order to draw it, you count the number of pixels at each exact lightness value and plot that as a distribution graph. This gives us great insight into the dynamic range of an image.

When a camera picture has been captured over a very narrow light range, it becomes difficult to see details in shadowed areas or other regions with poor local contrast. Fortunately, there is a technique to spread the frequencies for a uniform intensity distribution, which is called histogram equalization. The following image shows the same picture with their respective...
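The remapping at the heart of histogram equalization can be sketched in plain Java. This is an illustrative version (with invented names) of the cumulative-distribution trick that Imgproc.equalizeHist applies to 8-bit grayscale images.

```java
public class EqualizeSketch {
    // Histogram equalization for 8-bit gray values via the cumulative
    // distribution function (CDF). Assumes at least two distinct gray levels.
    public static int[] equalize(int[] pixels) {
        int[] hist = new int[256];
        for (int p : pixels) hist[p]++;
        // Build the CDF: cdf[i] = number of pixels with value <= i.
        int[] cdf = new int[256];
        int running = 0;
        for (int i = 0; i < 256; i++) { running += hist[i]; cdf[i] = running; }
        int cdfMin = 0;
        for (int i = 0; i < 256; i++) if (cdf[i] > 0) { cdfMin = cdf[i]; break; }
        // Remap each pixel so the output histogram is roughly uniform.
        int n = pixels.length;
        int[] out = new int[n];
        for (int i = 0; i < n; i++)
            out[i] = (int) Math.round(255.0 * (cdf[pixels[i]] - cdfMin) / (n - cdfMin));
        return out;
    }

    public static void main(String[] args) {
        // A narrow range (100..103) is stretched across the full 0..255 scale.
        int[] out = equalize(new int[]{100, 101, 102, 103});
        System.out.println(java.util.Arrays.toString(out));
    }
}
```

Because the mapping is the normalized CDF itself, frequently occurring gray levels are pushed apart while rare ones are squeezed together, which is exactly the contrast stretch described above.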

References


  1. A 3x3 Isotropic Gradient Operator for Image Processing, presented at a talk at the Stanford Artificial Intelligence Project in 1968, by I. Sobel and G. Feldman.

  2. A Computational Approach to Edge Detection, IEEE Trans. Pattern Analysis and Machine Intelligence, by Canny, J. (1986).

  3. Robust Detection of Lines Using the Progressive Probabilistic Hough Transform, CVIU 78 1, by Matas, J. and Galambos, C., and Kittler, J.V. pp 119-137 (2000).

  4. Advanced High Dynamic Range Imaging: Theory and Practice, CRC Press, by Banterle, Francesco; Artusi, Alessandro; Debattista, Kurt; Chalmers, Alan.

Summary


This chapter covered transforms that are part of computer vision's daily toolbox. We started with the important edge detectors, where you gained experience of how to find edges with the Sobel, Laplacian, and Canny operators. Then, we saw how to use the Hough transforms to find straight lines and circles. After that, the geometric transforms stretch, shrink, warp, and rotate were explored with an interactive sample. We then explored how to transform images from the spatial domain to the frequency domain using the Discrete Fourier Transform. After that, we showed you a trick to quickly calculate Haar-like features in an image through the use of integral images. We then explored the important distance transforms and finished the chapter by explaining histogram equalization.

Now, be ready to dive into machine learning algorithms, as we will cover how to detect faces in the next chapter. Also, you will learn how to create your own object detector and understand how supervised learning...

