Chapter 9. Machine Learning with OpenCV

We are now at the last leg of our journey. We started off with the very basics of images, pixels, and their traversals, and moved on to our very first image processing algorithms: image filtering (the box filter as well as Gaussian smoothing). Gradually, we paved our way up to more sophisticated image processing and computer vision algorithms such as image histograms, thresholding, and edge detectors. To demonstrate the true power of the OpenCV library, we showed how seemingly complex algorithms, such as those for detecting faces in images, can be run effortlessly with a single line of OpenCV/C++ code!

After having covered the major parts of the OpenCV toolkit, it was time to dig into a real-world project. Using the example of gender classification from facial images, we demonstrated the concept of feature descriptors, digging deeper into the nuances of uniform pattern LBP histograms. This chapter will witness a...

What is machine learning?


Most programs that we have worked with as part of this book followed a specific pattern. We gave the computer detailed instructions on how to operate on input data (mostly images in our case), and our algorithm diligently followed our instructions to generate the output (which could have been an image or a sequence of values as in the case of a histogram). For example, let's consider image filtering covered in Chapter 2, Image Filtering. We call the OpenCV function, filter2D(), with the appropriate parameters. The implementation of filter2D() holds the detailed instructions to perform image filtering, that is, it consists of the logic to traverse through all the pixels in the image and perform the correlation operation at each pixel location. Everything, including the input image and the filter parameters, is provided to the algorithm. It simply follows the instructions and generates the output image.
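As a quick illustration, a minimal sketch of such a call might look as follows (the file name and the 3x3 box kernel are illustrative choices, not from the chapter):

```cpp
#include <opencv2/opencv.hpp>

int main() {
    // Load an input image (the path is illustrative)
    cv::Mat src = cv::imread("input.jpg");
    if (src.empty()) return -1;

    // A 3x3 box (averaging) kernel: all weights equal, summing to 1
    cv::Mat kernel = cv::Mat::ones(3, 3, CV_32F) / 9.0f;

    // filter2D traverses every pixel and performs the correlation
    // of the kernel with the neighborhood centered at that pixel
    cv::Mat dst;
    cv::filter2D(src, dst, -1 /* same depth as src */, kernel);

    cv::imwrite("filtered.jpg", dst);
    return 0;
}
```

Notice that every detail of the computation is spelled out for the machine: the input, the kernel weights, and the operation to apply at each pixel.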

In fact, this is how programs were written for a long time. The...

Supervised and unsupervised learning


Now that we know what machine learning involves (learning a set of rules, or building a model, by looking at examples, and then using these rules to work out answers for previously unseen data), let's dig a little deeper. In this section, we will discuss two major categories of learning algorithms: supervised and unsupervised learning. These two categories differ in the nature and type of data presented to the learning algorithm.

Instead of working with formal definitions, let's go with examples. Let's say that we are interested in building a machine learning system that can differentiate between images of cats and dogs. That is, given an image, our algorithm should tell us whether the picture is that of a cat or a dog. Following the general guidelines that we laid out in the previous section, we have to present our system with a set of example images from which the learning will take place. For such a problem, we present to our system what are known...

Revisiting the image classification framework


Right at the outset of Chapter 6, Face Detection Using OpenCV, we had a brief discourse on image classification systems. Let's revisit that once again and put it in the context of what we have learnt about machine learning so far.

For convenience, we reproduce the figure representing a typical image classification framework that we introduced in Chapter 6, Face Detection Using OpenCV:

This schematic is, in fact, incomplete! While this perfectly describes what happens once our algorithm has already seen the training data and created its model, it does not depict what goes on during the training phase. In order to incorporate that information, let's revise the preceding diagram as follows:

As you can see, during the training phase, we provide both the image and the associated label as inputs (assume that we are dealing with a supervised machine learning setup for now). The first step is to extract relevant features from the input image. We have...

k-means clustering - the basics


We are going to start with an unsupervised learning algorithm that goes by the name of k-means clustering. As the name suggests, k-means clustering belongs to a more general class of clustering algorithms. So, what do we understand by clustering?

Clustering does what you would expect it to do: it groups together similar objects (similar in the sense that the English word clustering implies). What do we mean by similar objects, and how exactly is the grouping performed? We will answer these questions in detail in this and the following sections.

Like before, we will motivate the basic concept behind k-means clustering by showing examples of what kind of data it operates on and what it does. Let's say that we have a sufficiently large class of students. We want to divide them into three separate groups for the purpose of some academic activity. We want the group division to happen on the basis of the marks that they obtained in the most recent exams. For each...
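To make this concrete, here is a minimal sketch of how such a grouping could be computed with OpenCV's kmeans() function; the marks are made-up values, and K = 3 matches the three groups we want to form:

```cpp
#include <opencv2/opencv.hpp>
#include <iostream>

int main() {
    // Each row is one student; the two columns are marks in two
    // subjects (the values here are made up for illustration)
    float marks[][2] = {
        {92, 88}, {85, 90}, {40, 35}, {38, 45},
        {65, 60}, {70, 62}, {95, 91}, {42, 50}
    };
    cv::Mat data(8, 2, CV_32F, marks);

    // Divide the students into K = 3 groups
    cv::Mat labels, centers;
    cv::kmeans(data, 3, labels,
               cv::TermCriteria(cv::TermCriteria::EPS + cv::TermCriteria::MAX_ITER, 10, 1.0),
               3 /* attempts */, cv::KMEANS_PP_CENTERS, centers);

    // labels.at<int>(i) now holds the cluster (0, 1, or 2) assigned to
    // the i-th student; centers holds the means of the three clusters
    std::cout << "Cluster assignments:\n" << labels << std::endl;
    return 0;
}
```

Note that no labels were supplied: the algorithm discovers the grouping from the data alone, which is exactly what makes it unsupervised.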

k-nearest neighbors classifier - introduction


We have covered an unsupervised learning algorithm: k-means clustering. It is time we move on to its supervised counterparts. We are going to discuss a machine learning algorithm that goes by the name of the k-nearest neighbors classifier, often abbreviated as the kNN classifier. Although the names of the two (k-means and kNN) sound similar, they are, in fact, quite different in how they work. The most glaring difference is that k-means clustering is an unsupervised technique used to divide data points into meaningful clusters, while the kNN algorithm is a classifier that associates a class label with each data point.

As always, let's use an example to motivate the main concepts behind the kNN classification algorithm. In the previous example, we had information about the marks of every student in a couple of subjects. Based on this information, our goal was to divide them into some meaningful groups so that each group can then be...
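Before continuing with that example, here is a minimal sketch of what a kNN classifier looks like in OpenCV; the toy 2D points and labels are made up for illustration:

```cpp
#include <opencv2/opencv.hpp>
#include <iostream>

using namespace cv;
using namespace cv::ml;

int main() {
    // Toy 2D training data: 4 labelled points, classes 0 and 1
    // (the values are made up for illustration)
    float pts[][2] = { {1, 1}, {2, 1}, {8, 9}, {9, 8} };
    int lbl[] = { 0, 0, 1, 1 };
    Mat samples(4, 2, CV_32F, pts);
    Mat labels(4, 1, CV_32S, lbl);

    // Create and train the kNN classifier (supervised: labels required)
    Ptr<KNearest> knn = KNearest::create();
    knn->train(samples, ROW_SAMPLE, labels);

    // Classify a query point by a majority vote among its k = 3 neighbors
    Mat query = (Mat_<float>(1, 2) << 8.5f, 8.5f);
    Mat response;
    knn->findNearest(query, 3, response);
    std::cout << "Predicted class: " << response.at<float>(0) << std::endl;
    return 0;
}
```

Unlike the k-means sketch earlier, the training data here carries class labels, which is the defining trait of a supervised method.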

Support vector machines (SVMs) - introduction


Right at the outset of this chapter, we defined the modus operandi of machine learning algorithms. If you recall, we said that an ML system is presented with training data. It then makes its own set of rules, or builds a model, which it uses to make predictions on unseen (test) data. In revisiting this definition, I want to focus on the two key things that an ML algorithm can do with the training data:

  1. Formulate a set of rules.

  2. Build a model.

We have covered the basics of the k-nearest neighbors classifier in great detail. Let's try to place the operation of the kNN algorithm in the context of the two points listed above. Given the training data and a query point to classify, the kNN algorithm looks at the neighboring points and decides the class of the query point based on a majority vote. Clearly, this is a case of an ML algorithm that applies a set of rules based on the training data it has been presented with for the purpose of classifying...
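An SVM, by contrast, takes the second route: it builds a model from the training data. A minimal sketch of that route, using OpenCV's SVM on toy 2D data (the points, labels, and parameters are illustrative, not the chapter's own example):

```cpp
#include <opencv2/opencv.hpp>
#include <iostream>

using namespace cv;
using namespace cv::ml;

int main() {
    // Linearly separable 2D toy data (made-up values)
    float pts[][2] = { {1, 2}, {2, 1}, {2, 3}, {7, 8}, {8, 7}, {8, 9} };
    int lbl[] = { -1, -1, -1, 1, 1, 1 };
    Mat samples(6, 2, CV_32F, pts);
    Mat labels(6, 1, CV_32S, lbl);

    // Training builds a model: the maximum-margin separating hyperplane
    Ptr<SVM> svm = SVM::create();
    svm->setType(SVM::C_SVC);
    svm->setKernel(SVM::LINEAR);
    svm->setTermCriteria(TermCriteria(TermCriteria::MAX_ITER, 100, 1e-6));
    svm->train(samples, ROW_SAMPLE, labels);

    // The trained model then predicts labels for unseen points
    Mat query = (Mat_<float>(1, 2) << 7.5f, 7.5f);
    std::cout << "Predicted label: " << svm->predict(query) << std::endl;
    return 0;
}
```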

Non-linear SVMs


If you have been following the discussion on SVMs closely, you will have noticed a fundamental limitation in the way an SVM operates. We have discussed how the SVM classifier fits a hyperplane to our data such that the margin of separation between the classes is maximized. Doing that involves a very strong assumption: it assumes that our data is linearly separable. This is a fancy way of saying that we can separate the data using geometric structures such as straight lines or planes (hyperplanes, in general). What would happen if our data were not linearly separable?

For example, try as hard as you might, there is simply no way to fit a straight line that separates the two classes of data in the following image:

As you can see, the decision boundary here is highly non-linear. So, how does an SVM classifier overcome this? The answer is known as the kernel trick.

The basic idea behind the kernel trick is that even if our data is not linearly separable...
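In OpenCV's API, the kernel trick amounts to a one-line change in the classifier's configuration. A sketch, assuming we pick the RBF kernel; the gamma and C values are illustrative, not tuned:

```cpp
#include <opencv2/opencv.hpp>

using namespace cv;
using namespace cv::ml;

// Configure an SVM with a non-linear (RBF) kernel. Instead of fitting
// a straight line/plane in the original space, the kernel implicitly
// maps the data to a higher-dimensional space where a linear separator
// may exist.
Ptr<SVM> createNonLinearSVM() {
    Ptr<SVM> svm = SVM::create();
    svm->setType(SVM::C_SVC);
    svm->setKernel(SVM::RBF); // the kernel trick: swap LINEAR for RBF
    svm->setGamma(0.5);       // illustrative values, not tuned;
    svm->setC(1.0);           // SVM::trainAuto() can search for these
    return svm;
}
```

Training and prediction then proceed exactly as in the linear case; only the kernel choice changes.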

Using an SVM as a gender classifier


Now that we have seen how to implement a generic SVM classifier using OpenCV/C++, in this section we outline the steps to use an SVM for the gender classification project that we have been working on.

As you may have noticed in the example that we discussed in the last section, the training data that we loaded was 2-dimensional and had 10 data points. In the previous chapter, we discussed the fact that we are going to represent our faces using the 531-dimensional uniform pattern LBP histogram descriptor. This means that each data point (face) will be represented using 531 dimensions. These values (the feature vector corresponding to the representation of a face) are usually read into the program through text files. This means that we design our program to accept two files as input: one holding the feature vectors of the faces in the training dataset and the other those of the faces in the test dataset.

So essentially, this means that we want the feature descriptors of all our face...
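A minimal sketch of this two-file design follows. The exact file format isn't shown in this excerpt, so the sketch assumes one face per line: the class label followed by its 531 histogram values, separated by whitespace. The file names are placeholders:

```cpp
#include <opencv2/opencv.hpp>
#include <fstream>
#include <sstream>

using namespace cv;
using namespace cv::ml;

// Read feature vectors from a text file. Assumed (hypothetical) format:
// one face per line, the class label (0 = female, 1 = male) followed
// by the 531 values of the uniform pattern LBP histogram.
void readFeatures(const std::string& path, Mat& samples, Mat& labels) {
    std::ifstream in(path);
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream ss(line);
        int label;
        ss >> label;
        Mat row(1, 531, CV_32F);
        for (int i = 0; i < 531; ++i)
            ss >> row.at<float>(0, i);
        samples.push_back(row);
        labels.push_back(label);
    }
}

int main() {
    // Train on the first file...
    Mat trainSamples, trainLabels;
    readFeatures("train_features.txt", trainSamples, trainLabels);

    Ptr<SVM> svm = SVM::create();
    svm->setType(SVM::C_SVC);
    svm->setKernel(SVM::LINEAR);
    svm->train(trainSamples, ROW_SAMPLE, trainLabels);

    // ...and predict on the second
    Mat testSamples, testLabels;
    readFeatures("test_features.txt", testSamples, testLabels);
    Mat predictions;
    svm->predict(testSamples, predictions);
    return 0;
}
```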

Overfitting


This completes our discussion of some representative machine learning algorithms. We will now focus on some crucial issues that we need to keep in mind while applying these ML algorithms in any application domain. First, we will discuss the concept of overfitting to our training data.

The whole point of presenting our learning algorithm with training data is that it can, in the future, predict labels for data points that it has never seen. The ability of a learning algorithm to apply its learnt set of rules to completely new and unseen data is known as the generalization ability of the algorithm. The aim of training any ML classifier is that it should generalize well to unseen data.

Let's briefly go back to an example that we introduced early on in this chapter. When students attend classes, a professor teaches them a concept using some illustrative examples (training data). The students (ML algorithms) are expected to build a mental model out of the information they are...

Cross-validation


Now that we know overfitting is a serious issue in designing and running machine learning-based systems, let's look at a way in which we can mitigate its effects. Remember that we need to ensure that our learning algorithm doesn't start overfitting on the training data; instead, it should maintain a good enough generalization power to predict labels on unseen data.

How can we enforce such behavior? Let's go back to our classroom example. To make sure that the students are actually understanding the concepts, and not merely overfitting by memorizing the classroom problems, the teacher hands out certain assignments. These assignments contain questions that are similar in concept to what has been taught in the classroom, but at the same time give the students an idea of the type of questions to expect in the actual exam. In machine learning parlance, the assignments are analogous to the validation set. The students are expected to periodically check their level of understanding...
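In code, the simplest version of this idea is a held-out validation set. The following sketch uses OpenCV's TrainData to hold out a portion of the labelled data; the 80/20 split ratio is an illustrative choice, not a rule:

```cpp
#include <opencv2/opencv.hpp>

using namespace cv;
using namespace cv::ml;

// Hold out part of the labelled data as a validation set, so we can
// check generalization rather than only the fit to the training set.
float validate(const Mat& samples, const Mat& labels) {
    Ptr<TrainData> data = TrainData::create(samples, ROW_SAMPLE, labels);
    data->setTrainTestSplitRatio(0.8, /*shuffle=*/true); // 80/20 split

    Ptr<SVM> svm = SVM::create();
    svm->setType(SVM::C_SVC);
    svm->setKernel(SVM::LINEAR);
    svm->train(data); // trains only on the 80% training portion

    // calcError with test = true evaluates on the held-out 20%; for
    // classifiers it returns the misclassification rate (in percent)
    return svm->calcError(data, true, noArray());
}
```

If the error on the held-out portion is much worse than on the training portion, that gap is the telltale sign of overfitting.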

Common evaluation metrics


I hope the previous section gave you some insight into how to deal with your dataset while training ML models, and how to avoid common pitfalls such as overfitting. In this section, we will take a brief look at some evaluation metrics that can help us judge how well our model is performing. For the purpose of all our explanations, we will assume a binary classification framework.

Before we begin, let's introduce some new terminology. When we are dealing with a binary problem, instead of labelling the classes as 0 and 1 (or +1 and -1), we usually prefer the labels positive and negative. Which of the two classes is positive is a choice left to the designers of the ML algorithm. Having said that, each prediction that our algorithm makes on the test data falls into one of the following four categories:

  • True positive (TP): The data points which actually belong to the positive class, and have indeed been classified as positive by our...

The P-R curve


Why do we need two separate metrics, precision as well as recall? Let's dig a little deeper into their meanings. When we say that a classification system has high precision, what exactly does it mean? It means that if the system predicts that a particular data point belongs to the positive class, then there is a very high probability that it indeed does. You can revisit the definition of precision to convince yourself that this is indeed the case. Now, imagine that we have a classification system in place at a pathology laboratory which, given the necessary medical details of a patient, classifies them as a positive (or negative) occurrence of cancer. Obviously, we would want such a system to have very high precision. It would be disastrous (mentally, physically, and financially) to tell a healthy patient that they have been diagnosed with cancer.
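For reference, precision is defined as TP / (TP + FP) and recall as TP / (TP + FN). A tiny sketch with made-up counts:

```cpp
#include <iostream>

// precision = TP / (TP + FP): how trustworthy a positive prediction is
// recall    = TP / (TP + FN): how many actual positives we catch
double precision(int tp, int fp) { return double(tp) / (tp + fp); }
double recall(int tp, int fn)    { return double(tp) / (tp + fn); }

int main() {
    // Made-up counts, purely for illustration
    int tp = 90, fp = 10, fn = 30;
    std::cout << "precision = " << precision(tp, fp)  // 0.9
              << ", recall = "  << recall(tp, fn)     // 0.75
              << std::endl;
    return 0;
}
```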

We would also like our cancer classifier to have a high recall. Having a low recall would mean that there are a lot...

Some qualitative results


So, we have covered a lot of ground in this chapter! Before we close, it would be a good idea to see what all this effort has resulted in. One of the key motivations for doing an image processing/computer vision project is that, by the end of it, you get to see some really cool results!

The following are some images which have been classified as Male by our algorithm! There are some interesting results for Justin Bieber fans:

In the following image, you can see some Male predictions:

Summary


This brings us to the end of our chapter on machine learning with OpenCV. We started the chapter by introducing the learning paradigm of solving problems. Under such a scheme, we saw that if our algorithm is presented with a lot of data, it can learn to detect patterns and develop its own set of rules, which further help it to make predictions on new, unseen data.

We touched upon a lot of different aspects of ML, in both the supervised and the unsupervised domains. We discussed in detail the k-means clustering algorithm (unsupervised), the k-nearest neighbors classifier, and support vector machines (both supervised). We also looked at some practical issues that crop up when we try to deploy a machine learning algorithm on our data. You must also have noticed that employing ML algorithms enables our programs to make much more human-like predictions using the available data.

This completes our journey that we began in Chapter 1, Laying the Foundation. The book started with the...
