
You're reading from OpenCV 3.0 Computer Vision with Java

Product type: Book
Published in: Jul 2015
Reading level: Intermediate
ISBN-13: 9781783283972
Edition: 1st
Author: Daniel Lélis Baggio

Daniel Lélis Baggio started his work in computer vision through medical image processing at InCor (Instituto do Coração – Heart Institute) in São Paulo, Brazil, where he worked on intravascular ultrasound (IVUS) image segmentation. He then focused on GPGPU and ported that algorithm to work with NVIDIA's CUDA. He has also dived into six-degrees-of-freedom head tracking with the Natural User Interface group through a project called EHCI (http://code.google.com/p/ehci/). He also wrote Mastering OpenCV with Practical Computer Vision Projects for Packt Publishing.
Chapter 4. Image Transforms

This chapter covers methods that change an image into an alternate representation of its data in order to solve important problems in computer vision and image processing. Some examples of these methods are filters that are used to find image edges, as well as transforms that help us find lines and circles in an image. In this chapter, we cover the stretch, shrink, warp, and rotate operations. A very useful and famous transform is the Fourier transform, which converts signals between the time domain and the frequency domain; in OpenCV, you can find the Discrete Fourier Transform (DFT) and the Discrete Cosine Transform (DCT). Another transform that we cover in this chapter is related to integral images, which allow the rapid summing of subregions, a very useful step in face-tracking algorithms. Besides this, you will also get to see the distance transform and histogram equalization in this chapter.

We will cover the following topics:

  • Gradients and Sobel derivatives

  • The Laplace...

The Gradient and Sobel derivatives


A key building block in computer vision is finding edges, and this is closely related to finding an approximation to derivatives of an image. From basic calculus, it is known that a derivative shows the variation of a given function or input signal along some dimension. The local maxima of the derivative yield the regions where the signal varies the most, which for an image might mean an edge. Fortunately, there is an easy way to approximate a derivative for discrete signals: kernel convolution. A convolution basically means applying some transform to every part of the image. The transform most used for differentiation is the Sobel filter [1], which works for horizontal, vertical, and even mixed partial derivatives of any order.

In order to approximate the value of the horizontal derivative, the following Sobel kernel matrix is convolved with the input image:

    Gx = | -1  0  +1 |
         | -2  0  +2 |
         | -1  0  +1 |

This means that, for each input pixel, the calculated value of its...
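To make the mechanics concrete, here is a minimal plain-Java sketch of this horizontal filtering step. This is an illustrative stand-alone example (the class and method names are made up), not the book's code; in practice you would call Imgproc.Sobel and let OpenCV handle borders and output depth.

```java
// Minimal sketch: filter a grayscale image (2D int array) with the
// horizontal Sobel kernel Gx. Border pixels are left at zero for brevity.
public class SobelSketch {
    static final int[][] GX = {
        {-1, 0, 1},
        {-2, 0, 2},
        {-1, 0, 1}
    };

    public static int[][] sobelX(int[][] img) {
        int h = img.length, w = img[0].length;
        int[][] out = new int[h][w];
        for (int y = 1; y < h - 1; y++) {
            for (int x = 1; x < w - 1; x++) {
                int sum = 0;
                for (int ky = -1; ky <= 1; ky++)
                    for (int kx = -1; kx <= 1; kx++)
                        sum += GX[ky + 1][kx + 1] * img[y + ky][x + kx];
                out[y][x] = sum;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // A vertical step edge: dark on the left, bright on the right.
        int[][] img = {
            {0, 0, 255, 255},
            {0, 0, 255, 255},
            {0, 0, 255, 255},
            {0, 0, 255, 255}
        };
        int[][] gx = sobelX(img);
        System.out.println(gx[1][1] + " " + gx[1][2]); // strong response at the edge
    }
}
```

Note how the kernel responds strongly where pixel values change from left to right and gives zero in flat regions, which is exactly the behavior we want from a horizontal derivative.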

The Laplace and Canny transforms


Another quite useful operator for finding edges is the Laplacian transformation. Instead of relying on first-order derivatives, OpenCV's Laplacian transformation implements the discrete operator for the following function:

    laplace(f) = ∂²f/∂x² + ∂²f/∂y²

This operator can be approximated by convolution with the following kernel when finite difference methods and a 3 x 3 aperture are used:

    | 0   1  0 |
    | 1  -4  1 |
    | 0   1  0 |

The signature for the preceding function is as follows:

Laplacian(Mat source, Mat destination, int ddepth)

While the source and destination matrices are straightforward parameters, ddepth is the depth of the destination matrix. When you set this parameter to -1, the destination will have the same depth as the source image, although you might want more depth when you apply this operator. Besides this, there are overloaded versions of this method that receive an aperture size, a scale factor, and a delta value that is added to the result.
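To illustrate what this kernel computes, here is a hedged plain-Java sketch of the 3 x 3 Laplacian; the class and method names are invented for this example, and OpenCV's Laplacian method does the same work (plus border handling and depth control) for you.

```java
// Minimal sketch of the discrete Laplacian using the 3x3 finite-difference
// kernel [[0,1,0],[1,-4,1],[0,1,0]]. Borders are skipped for brevity.
public class LaplacianSketch {
    public static int[][] laplacian(int[][] img) {
        int h = img.length, w = img[0].length;
        int[][] out = new int[h][w];
        for (int y = 1; y < h - 1; y++)
            for (int x = 1; x < w - 1; x++)
                out[y][x] = img[y - 1][x] + img[y + 1][x]
                          + img[y][x - 1] + img[y][x + 1]
                          - 4 * img[y][x];
        return out;
    }

    public static void main(String[] args) {
        // A flat region gives zero response; a bright spot gives a strong one.
        int[][] img = {
            {10, 10, 10},
            {10, 50, 10},
            {10, 10, 10}
        };
        System.out.println(laplacian(img)[1][1]); // 10+10+10+10 - 4*50 = -160
    }
}
```

The sign flip around an edge (positive on one side, negative on the other) is what makes the Laplacian's zero crossings useful for edge localization.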

Besides using the Laplacian method, you can also use the Canny algorithm, which is an excellent approach that was proposed by...

The line and circle Hough transforms


If you need to find straight lines or circles in an image, the Hough transforms are very useful. In this section, we will cover the OpenCV methods that extract them from your image.

The idea behind the original Hough line transform is that any point in a binary image could be part of a set of lines. Suppose each straight line is parameterized by the line equation y = mx + b, where m is the line's slope and b is its y-intercept. We could iterate over the whole binary image, storing the m and b parameters for each foreground point and checking their accumulation. The local maxima of the accumulated m and b parameters would yield the equations of the straight lines that appear most in the image. In practice, instead of the slope and y-intercept, we use the polar representation of a straight line.
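The voting scheme just described can be sketched in plain Java using the polar parameterization rho = x·cos(theta) + y·sin(theta). This is an illustrative toy with whole-degree and whole-pixel quantization and no thresholding, not OpenCV's implementation (for real work, use Imgproc.HoughLines).

```java
public class HoughSketch {
    // Vote in a (theta, rho) accumulator for every foreground pixel of a
    // binary image; return {theta, rho, votes} for the strongest line.
    public static int[] strongestLine(int[][] binary) {
        int h = binary.length, w = binary[0].length;
        int maxRho = (int) Math.ceil(Math.hypot(h, w));
        int[][] acc = new int[180][2 * maxRho + 1]; // rho can be negative: offset by maxRho
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++) {
                if (binary[y][x] == 0) continue;
                for (int t = 0; t < 180; t++) {
                    double rad = Math.toRadians(t);
                    int rho = (int) Math.round(x * Math.cos(rad) + y * Math.sin(rad));
                    acc[t][rho + maxRho]++; // this pixel votes for line (t, rho)
                }
            }
        int bestT = 0, bestR = 0, bestVotes = -1;
        for (int t = 0; t < 180; t++)
            for (int r = 0; r < acc[t].length; r++)
                if (acc[t][r] > bestVotes) {
                    bestVotes = acc[t][r];
                    bestT = t;
                    bestR = r - maxRho;
                }
        return new int[]{bestT, bestR, bestVotes};
    }

    public static void main(String[] args) {
        // A vertical line at x = 2: theta = 0 degrees, rho = 2, five votes.
        int[][] img = new int[5][5];
        for (int y = 0; y < 5; y++) img[y][2] = 1;
        int[] line = strongestLine(img);
        System.out.println(line[0] + " " + line[1] + " " + line[2]);
    }
}
```

Each foreground pixel votes for every line that could pass through it; collinear pixels pile their votes into the same accumulator cell, which is why the maximum reveals the dominant line.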

OpenCV supports not only the standard Hough transform, but also the progressive probabilistic Hough transform, for which the two functions...

Geometric transforms – stretch, shrink, warp, and rotate


While working with images and computer vision, it is very common to require the ability to preprocess an image using known geometric transforms, such as stretching, shrinking, rotation, and warping. Warping is the same as nonuniform resizing. Transforms that can be realized by multiplying source points with a 2 x 3 matrix are called affine transformations; they turn rectangles into parallelograms, and hence have the limitation of requiring the destination to have parallel sides. A 3 x 3 matrix multiplication, on the other hand, represents a perspective transform. Perspective transforms offer more flexibility, since they can map a 2D quadrilateral to any other quadrilateral. The following screenshot shows a very useful application of this concept.

Here, we will find out which is the perspective transform that maps the side of a building in a perspective view to its frontal view:

Note that the input to this problem is the perspective...
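The arithmetic behind a perspective mapping can be shown with a small plain-Java sketch that applies a 3 x 3 matrix to a single point. The matrices below are made-up examples; in practice you would let Imgproc.getPerspectiveTransform compute the matrix from four point correspondences and Imgproc.warpPerspective apply it to the whole image.

```java
public class PerspectiveSketch {
    // Apply a 3x3 perspective (homography) matrix H to a 2D point.
    public static double[] apply(double[][] H, double x, double y) {
        double xp = H[0][0] * x + H[0][1] * y + H[0][2];
        double yp = H[1][0] * x + H[1][1] * y + H[1][2];
        double wp = H[2][0] * x + H[2][1] * y + H[2][2];
        // Dividing by w is the perspective step; affine transforms keep w = 1.
        return new double[]{xp / wp, yp / wp};
    }

    public static void main(String[] args) {
        // An affine transform is the special case with a last row of (0, 0, 1):
        // here, a translation by (5, 3).
        double[][] translate = {{1, 0, 5}, {0, 1, 3}, {0, 0, 1}};
        double[] p = apply(translate, 2, 2);
        System.out.println(p[0] + " " + p[1]); // 7.0 5.0
    }
}
```

A nonzero bottom row makes w depend on the input point, which is exactly what lets a perspective transform map a rectangle to an arbitrary quadrilateral, something no 2 x 3 affine matrix can do.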

Discrete Fourier Transform and Discrete Cosine Transform


When dealing with image analysis, it would be very useful if you could change an image from the spatial domain, which is the image in terms of its x and y coordinates, to the frequency domain, in which the image is decomposed into its high- and low-frequency components, so that you would be able to see and manipulate frequency parameters. This can come in handy in image compression, because it is known that human vision is not as sensitive to high-frequency signals as it is to low-frequency signals. In this way, you can transform an image from the spatial domain to the frequency domain and remove its high-frequency components, reducing the memory required to represent the image and hence compressing it. Image frequency is pictured in a better way by the next image.

In order to change an image from the spatial domain to the frequency domain, the Discrete Fourier Transform can be used. As we might need to bring it back from the frequency domain...
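As a concrete illustration of what the DFT computes, here is a naive plain-Java sketch of the one-dimensional case. OpenCV's Core.dft works on 2D matrices and uses a fast algorithm; this toy version (with invented names) is only meant to show the formula.

```java
public class DftSketch {
    // Naive 1D discrete Fourier transform, returning per-bin magnitudes.
    // Bin k measures how much of frequency k is present in the signal.
    public static double[] magnitudes(double[] signal) {
        int n = signal.length;
        double[] mag = new double[n];
        for (int k = 0; k < n; k++) {
            double re = 0, im = 0;
            for (int t = 0; t < n; t++) {
                double angle = -2 * Math.PI * k * t / n;
                re += signal[t] * Math.cos(angle);
                im += signal[t] * Math.sin(angle);
            }
            mag[k] = Math.hypot(re, im);
        }
        return mag;
    }

    public static void main(String[] args) {
        // A constant (pure "low frequency") signal: all energy lands in bin 0.
        double[] mag = magnitudes(new double[]{3, 3, 3, 3});
        System.out.println(mag[0] + " " + Math.round(mag[1]));
    }
}
```

Discarding the high-frequency bins of such a spectrum and inverting the transform is the basic compression idea described above, and the DCT used by JPEG follows the same principle with cosine basis functions only.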

Integral images


Some face detection algorithms, such as OpenCV's face detection algorithm, make heavy use of features like the ones shown in the following image:

These are the so-called Haar-like features, and they are calculated as the sum of pixels in the white area minus the sum of pixels in the black area. You might find this type of feature kind of odd, but when it is trained for face detection, an extremely powerful classifier can be built using only two of these features, as depicted in the following image:

In fact, a classifier that uses only the two preceding features can be adjusted to detect 100 percent of a given face training database with only 40 percent false positives. Summing all the pixels in an image, as well as calculating the sum over each area, can be a long process. However, these features must be evaluated over many regions of each given input frame, hence calculating them fast is a requirement that we need to fulfill.

First, let's define an integral...
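As a preview of the idea, here is a plain-Java sketch of an integral image and of the four-lookup rectangle sum. It mirrors the convention of OpenCV's Imgproc.integral, which adds an extra zero row and column, but it is an illustrative example rather than the book's code.

```java
public class IntegralSketch {
    // Integral image: ii[y+1][x+1] holds the sum of all pixels above and to
    // the left of (x, y), inclusive. Built in a single pass.
    public static long[][] integral(int[][] img) {
        int h = img.length, w = img[0].length;
        long[][] ii = new long[h + 1][w + 1]; // extra zero row and column
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                ii[y + 1][x + 1] = img[y][x] + ii[y][x + 1] + ii[y + 1][x] - ii[y][x];
        return ii;
    }

    // Sum over the rectangle with corners (x0, y0) and (x1, y1), inclusive:
    // just four lookups, regardless of the rectangle's size. This is what
    // makes evaluating Haar-like features fast.
    public static long rectSum(long[][] ii, int x0, int y0, int x1, int y1) {
        return ii[y1 + 1][x1 + 1] - ii[y0][x1 + 1] - ii[y1 + 1][x0] + ii[y0][x0];
    }

    public static void main(String[] args) {
        int[][] img = {
            {1, 2, 3},
            {4, 5, 6},
            {7, 8, 9}
        };
        long[][] ii = integral(img);
        System.out.println(rectSum(ii, 0, 0, 2, 2)); // 45: the whole image
        System.out.println(rectSum(ii, 1, 1, 2, 2)); // 5+6+8+9 = 28
    }
}
```

A Haar-like feature then costs only a handful of these constant-time rectangle sums, no matter how large the feature window is.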

Distance transforms


Simply put, a distance transform applied to an image generates an output image whose pixel values are the distance to the closest zero-valued pixel in the input image. Basically, each pixel holds the distance to the closest background pixel, given a specified distance measure. The following screenshot gives you an idea of what happens to the silhouette of a human body:

Human silhouette by J E Theriot

This transform can be very useful in the process of obtaining the topological skeleton of a given segmented image, as well as in producing blurring effects. Another interesting application of this transform is the segmentation of overlapping objects, when it is used along with the watershed transform.

Generally, the distance transform is applied to an edge image, such as the result of a Canny filter. We are going to make use of Imgproc's distanceTransform method, which can be seen in action in the distance project, available in this chapter's source code. Here are the most important lines of this example...
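Independently of that project, the two-pass idea behind a simple distance transform can be sketched in plain Java. This toy version uses the city-block (L1) metric and invented names; it is only an illustration of the concept behind Imgproc.distanceTransform, which supports this and other metrics.

```java
public class DistanceSketch {
    // Two-pass distance transform with the city-block (L1) metric: each
    // output pixel gets the distance to the nearest zero pixel of the input.
    public static int[][] cityBlock(int[][] img) {
        int h = img.length, w = img[0].length, INF = h + w;
        int[][] d = new int[h][w];
        // Forward pass: propagate distances from the top-left corner.
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++) {
                if (img[y][x] == 0) { d[y][x] = 0; continue; }
                d[y][x] = INF;
                if (y > 0) d[y][x] = Math.min(d[y][x], d[y - 1][x] + 1);
                if (x > 0) d[y][x] = Math.min(d[y][x], d[y][x - 1] + 1);
            }
        // Backward pass: propagate distances from the bottom-right corner.
        for (int y = h - 1; y >= 0; y--)
            for (int x = w - 1; x >= 0; x--) {
                if (y < h - 1) d[y][x] = Math.min(d[y][x], d[y + 1][x] + 1);
                if (x < w - 1) d[y][x] = Math.min(d[y][x], d[y][x + 1] + 1);
            }
        return d;
    }

    public static void main(String[] args) {
        // A single background (zero) pixel in the middle of a foreground patch.
        int[][] img = {
            {1, 1, 1},
            {1, 0, 1},
            {1, 1, 1}
        };
        int[][] d = cityBlock(img);
        System.out.println(d[0][0] + " " + d[0][1] + " " + d[1][1]);
    }
}
```

The ridge of local maxima in such a distance map is what skeletonization exploits, and its regional peaks are the usual seeds for watershed-based segmentation of overlapping objects.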

Histogram equalization


The human visual system is very sensitive to contrast in images, which is the difference in the color and brightness of different objects. Besides, the human eye is a miraculous system that can perceive intensities over some 10^16 light levels [4]. No wonder some sensors can mess up the image data.

When analyzing images, it is very useful to draw their histograms. A histogram simply shows you the lightness distribution of a digital image. In order to draw it, you count the number of pixels at each exact lightness value and plot that as a distribution graph. This gives us great insight into the dynamic range of an image.

When a camera picture has been captured over a very narrow light range, it becomes difficult to see details in shadowed areas or other regions with poor local contrast. Fortunately, there is a technique to spread the frequencies for a uniform intensity distribution, which is called histogram equalization. The following image shows the same picture with their respective...
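The remapping at the heart of histogram equalization can be sketched in plain Java. This is an illustrative version (with invented names) of the cumulative-distribution trick that Imgproc.equalizeHist applies to 8-bit grayscale images.

```java
public class EqualizeSketch {
    // Histogram equalization for 8-bit gray values via the cumulative
    // distribution function (CDF). Assumes at least two distinct gray levels.
    public static int[] equalize(int[] pixels) {
        int[] hist = new int[256];
        for (int p : pixels) hist[p]++;
        // Build the CDF: cdf[i] = number of pixels with value <= i.
        int[] cdf = new int[256];
        int running = 0;
        for (int i = 0; i < 256; i++) { running += hist[i]; cdf[i] = running; }
        int cdfMin = 0;
        for (int i = 0; i < 256; i++) if (cdf[i] > 0) { cdfMin = cdf[i]; break; }
        // Remap each pixel so the output histogram is roughly uniform.
        int n = pixels.length;
        int[] out = new int[n];
        for (int i = 0; i < n; i++)
            out[i] = (int) Math.round(255.0 * (cdf[pixels[i]] - cdfMin) / (n - cdfMin));
        return out;
    }

    public static void main(String[] args) {
        // A narrow range (100..103) is stretched across the full 0..255 scale.
        int[] out = equalize(new int[]{100, 101, 102, 103});
        System.out.println(java.util.Arrays.toString(out));
    }
}
```

Because the mapping is the normalized CDF itself, frequently occurring gray levels are pushed apart while rare ones are squeezed together, which is exactly the contrast stretch described above.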

References


  1. A 3x3 Isotropic Gradient Operator for Image Processing, presented at a talk at the Stanford Artificial Intelligence Project in 1968, by I. Sobel and G. Feldman.

  2. A Computational Approach to Edge Detection, IEEE Trans. Pattern Analysis and Machine Intelligence, by Canny, J. (1986).

  3. Robust Detection of Lines Using the Progressive Probabilistic Hough Transform, CVIU 78 1, by Matas, J. and Galambos, C., and Kittler, J.V. pp 119-137 (2000).

  4. Advanced High Dynamic Range Imaging: Theory and Practice, CRC Press, by Banterle, Francesco; Artusi, Alessandro; Debattista, Kurt; Chalmers, Alan.

Summary


This chapter covered transforms that are part of computer vision's daily toolbox. We started with the important edge detectors, where you gained experience of how to find edges with the Sobel, Laplacian, and Canny operators. Then, we saw how to use the Hough transforms to find straight lines and circles. After that, the geometric transforms stretch, shrink, warp, and rotate were explored with an interactive sample. We then explored how to transform images from the spatial domain to the frequency domain using the Discrete Fourier Transform. After that, we showed you a trick to quickly calculate Haar-like features in an image through the use of integral images. We then explored the important distance transforms and finished the chapter by explaining histogram equalization.

Now, be ready to dive into machine learning algorithms, as we will cover how to detect faces in the next chapter. Also, you will learn how to create your own object detector and understand how supervised learning...

