A Taste of Machine Learning
So, you have decided to enter the field of machine learning. That's great!
Nowadays, machine learning is all around us--from protecting our email, to automatically tagging our friends in pictures, to predicting what movies we like. As a form of artificial intelligence, machine learning enables computers to learn through experience: to make predictions about the future using collected data from the past. On top of that, computer vision is one of today's most exciting application fields of machine learning, with deep learning and convolutional neural networks driving innovative systems such as self-driving cars and Google's DeepMind.
However, fret not; your application does not need to be as large-scale or world-changing as the previous examples in order to benefit from machine learning. In this chapter, we will talk about why machine learning has become so popular and discuss the kinds of problems that it can solve. We will then introduce the tools that we need in order to solve machine learning problems using OpenCV. Throughout the book, I will assume that you already have a basic knowledge of OpenCV and Python, but that there is always room to learn more.
Are you ready then? Let's go!
Getting started with machine learning
Machine learning has been around for at least 60 years. Growing out of the quest for artificial intelligence, early machine learning systems used hand-coded rules of if...else statements to process data and make decisions. Think of a spam filter whose job is to parse incoming emails and move unwanted messages to a spam folder:

We could come up with a blacklist of words that, whenever they show up in a message, would mark an email as spam. This is a simple example of a hand-coded expert system. (We will build a smarter one in Chapter 7, Implementing a Spam Filter with Bayesian Learning.)
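To make the idea concrete, here is a minimal sketch of such a blacklist filter. The blacklisted words, the function name is_spam, and the sample messages are all made up for illustration; a real filter would need a far longer list and smarter text handling.

```python
# Hypothetical blacklist; these words are chosen purely for illustration.
BLACKLIST = {"lottery", "viagra", "prince"}


def is_spam(message, blacklist=BLACKLIST):
    """Return True if any blacklisted word appears in the message."""
    # Lowercase every word and strip surrounding punctuation so that
    # "Lottery!" and "lottery" are treated as the same word.
    words = {word.strip(".,!?:;\"'()").lower() for word in message.split()}
    return not words.isdisjoint(blacklist)


# Made-up example messages.
inbox = [
    "You have won the lottery!",
    "Meeting moved to 3 pm tomorrow",
]
for email in inbox:
    folder = "spam" if is_spam(email) else "inbox"
    print("{:>5}: {}".format(folder, email))
```

Note how brittle this is: a spammer only has to misspell a word to slip through, and every new trick requires another hand-written rule.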
These expert decision rules can become arbitrarily complicated if we are allowed to combine and nest them in what is known as a decision tree (Chapter 5, Using Decision Trees to Make a Medical Diagnosis). Then, it becomes possible to make more informed decisions that involve a series of decision steps, as shown in the following image:

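In code, nesting decision rules might look like the following sketch. The three yes/no questions (sender_known, contains_link, mentions_money) and the resulting labels are assumptions invented for this example; they are not the rules from the image.

```python
def classify_email(sender_known, contains_link, mentions_money):
    """Hand-coded decision tree: each nested if is one decision step."""
    if sender_known:
        # Mail from a known sender is trusted regardless of content.
        return "inbox"
    if contains_link:
        # Unknown sender plus a link: the verdict depends on a third question.
        if mentions_money:
            return "spam"
        return "needs review"
    # Unknown sender, but nothing suspicious in the message.
    return "inbox"


print(classify_email(sender_known=False, contains_link=True, mentions_money=True))
print(classify_email(sender_known=True, contains_link=True, mentions_money=True))
```

Every additional question doubles the number of paths that someone has to think through by hand, which is where this approach starts to hurt.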
Hand-coding these decision rules is sometimes feasible, but has two major disadvantages:
- The logic required to make a decision applies only to a specific task in a single domain. For example, there is no way that we could use this spam filter to tag our friends in a picture. Even if we wanted to change the spam filter to do something slightly different, such as filtering out phishing emails in general, we would have to redesign all the decision rules.
- Designing rules by hand requires a deep understanding of the problem. We would have to know exactly which types of emails constitute spam, including all possible exceptions. This is not as easy as it seems; otherwise, we wouldn't so often be double-checking our spam folder for important messages that might have been accidentally filtered out. For other domain problems, it is simply not possible to design the rules by hand.
This is where machine learning comes in. Sometimes, tasks cannot be defined well--except maybe by example--and we would like machines to make sense of and solve the tasks by themselves. Other times, it is possible that, hidden among large piles of data, are important relationships and correlations that we as humans might have missed (see Chapter 8, Discovering Hidden Structures with Unsupervised Learning). In these cases, machine learning can often be used to extract these hidden relationships (also known as data mining).
A good example of where man-made expert systems have failed is in detecting faces in images. Silly, isn't it? Today, every smartphone can detect a face in an image. However, 20 years ago, this problem was largely unsolved. The reason was that the way humans think about what constitutes a face was not very helpful to machines. As humans, we tend not to think in pixels. If we were asked to detect a face, we would probably just look for the defining features of a face, such as eyes, nose, mouth, and so on. But how would we tell a machine what to look for, when all the machine knows is that images have pixels and pixels have a certain shade of gray? For the longest time, this difference in image representation basically made it impossible for a human to come up with a good set of decision rules that would allow a machine to detect a face in an image. We will talk about different approaches to this problem in Chapter 4, Representing Data and Engineering Features.
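The gap between the two representations is easy to see with a toy example. The 5 x 5 grid below is a made-up grayscale "image" stored as a plain list of lists (0 is black, 255 is white); real images are much larger, but the principle is the same.

```python
# A made-up 5 x 5 grayscale "image": 0 is black, 255 is white.
image = [
    [255, 255, 255, 255, 255],
    [255,   0, 255,   0, 255],
    [255, 255, 255, 255, 255],
    [255,   0,   0,   0, 255],
    [255, 255, 255, 255, 255],
]

# What we see: two "eyes" and a "mouth".
for row in image:
    print("".join("#" if pixel < 128 else "." for pixel in row))

# What the machine sees: 25 numbers with no notion of eyes or mouths.
flat = [pixel for row in image for pixel in row]
print(len(flat), "pixel values, starting with", flat[:7])
```

Writing a rule such as "a face has two eyes above a mouth" is simple for us, but expressing it in terms of these raw numbers is not.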
However, with the advent of convolutional neural networks and deep learning (Chapter 9, Using Deep Learning to Classify Handwritten Digits), machines have become as successful as us when it comes to recognizing faces. All we had to do was simply present a large collection of images of faces to the machine. From there on, the machine was able to discover the set of characteristics that would allow it to identify a face, without having to approach the problem in the same way as we would do. This is the true power of machine learning.
Problems that machine learning can solve
Most machine learning problems belong to one of the following three main categories:
- In supervised learning, each data point is labeled or associated with a category or value of interest (Chapter 3, First Steps in Supervised Learning). An example of a categorical label is assigning an image as either a cat or dog. An example of a value label is the sale price associated with a used car. The goal of supervised learning is to study many labeled examples like these (called training data) in order to make predictions about future data points (called test data). These predictions come in two flavors, such as identifying new photos with the correct animal (called a classification problem) or assigning accurate sale prices to other used cars (called a regression problem). Don't worry if this seems a little over your head for now--we will have the entirety of the book to nail down the details.
- In unsupervised learning, data points have no labels associated with them (Chapter 8, Discovering Hidden Structures with Unsupervised Learning). Instead, the goal of an unsupervised learning algorithm is to organize the data in some way or to describe its structure. This can mean grouping them into clusters or finding different ways of looking at complex data so that they appear simpler.
- In reinforcement learning, the algorithm gets to choose an action in response to each data point. It is a common approach in robotics, where the set of sensor readings at one point in time is a data point and the algorithm must choose the robot's next action. It's also a natural fit for Internet of Things applications. In both cases, the learning algorithm receives a reward signal a short time into the future, indicating how good the decision was. Based on this, the algorithm modifies its strategy in order to achieve the highest reward.
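To give the two flavors of supervised learning a concrete shape, here is a rough sketch in plain Python. The training data is invented, and the two techniques used (a 1-nearest-neighbor classifier and a least-squares line fit) are simply easy-to-read stand-ins, not the algorithms we will use later in the book.

```python
# Toy training data, invented for illustration only.
# Classification: (weight in kg, height in cm) -> animal label
animals = [((4.0, 25.0), "cat"), ((5.0, 23.0), "cat"),
           ((20.0, 55.0), "dog"), ((30.0, 60.0), "dog")]


def classify(point, training_data):
    """1-nearest-neighbor: copy the label of the closest training point."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    _, label = min(training_data,
                   key=lambda item: squared_distance(item[0], point))
    return label


# Regression: (car age in years, sale price)
cars = [(1.0, 18000.0), (3.0, 14000.0), (5.0, 10000.0), (7.0, 6000.0)]


def fit_line(training_data):
    """Least-squares fit of price = slope * age + intercept."""
    n = len(training_data)
    mean_x = sum(x for x, _ in training_data) / n
    mean_y = sum(y for _, y in training_data) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in training_data)
             / sum((x - mean_x) ** 2 for x, _ in training_data))
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(cars)
print("New animal is a:", classify((6.0, 24.0), animals))   # a category
print("Price of a 4-year-old car:", slope * 4.0 + intercept)  # a number
```

The classifier answers with one of the known labels, whereas the regression model answers with a number on a continuous scale.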
These three main categories are illustrated in the following figure:

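The reward-driven loop of reinforcement learning can be sketched in a few lines as well. The example below uses an epsilon-greedy strategy on a made-up two-action problem; the function name run_bandit, the reward probabilities, and the parameter values are all assumptions chosen for illustration, and this is only one of many possible strategies.

```python
import random


def run_bandit(reward_probs, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy learner: mostly pick the best-looking action,
    but explore a random one with probability epsilon."""
    rng = random.Random(seed)
    counts = [0] * len(reward_probs)    # how often each action was chosen
    values = [0.0] * len(reward_probs)  # running average reward per action
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(reward_probs))  # explore
        else:
            action = max(range(len(values)), key=values.__getitem__)  # exploit
        # The environment answers with a reward signal of 1 or 0.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        # Update the strategy: fold the new reward into the running average.
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]
    return counts, values


# Hypothetical problem: action 1 pays off more often than action 0.
counts, values = run_bandit([0.2, 0.8])
print("Times each action was chosen:", counts)
print("Estimated reward per action:", values)
```

The learner is never told which action is better; it can only find out by trying actions and watching the rewards that come back.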
Getting started with Python
Python has become the common language for many data science and machine learning applications, thanks to its great number of open-source libraries for processes such as data loading, data visualization, statistics, image processing, and natural language processing. One of the main advantages of using Python is the ability to interact directly with the code, using a terminal or other tools such as the Jupyter Notebook, which we'll look at shortly.
If you have mostly been using OpenCV in combination with C++, I would strongly suggest that you switch to Python, at least for the purpose of studying this book. This decision has not been made out of spite! Quite the contrary: I have done my fair share of C/C++ programming--especially in combination with GPU computing via NVIDIA's Compute Unified Device Architecture (CUDA)--and like it a lot. However, I consider Python to be a better choice fundamentally if you want to pick up a new topical skill, because you can do more by typing less. This will help reduce the cognitive load. Rather than getting annoyed by the syntactic subtleties of C++ or wasting hours trying to convert data from one format to another, Python will help you concentrate on the topic at hand: becoming an expert in machine learning.
Getting started with OpenCV
Being the avid user of OpenCV that I believe you are, I probably don't have to convince you about the power of OpenCV.
Built to provide a common infrastructure for computer vision applications, OpenCV has become a comprehensive set of both classic and state-of-the-art computer vision and machine learning algorithms. According to its own documentation, OpenCV has a user community of more than 47,000 people and has been downloaded over seven million times. That's pretty impressive! As an open-source project, it is very easy for researchers, businesses, and government bodies to utilize and modify already available code.
This being said, a number of open-source machine learning libraries have popped up since the recent machine learning boom that provide far more functionality than OpenCV. A prominent example is scikit-learn, which provides a number of state-of-the-art machine learning algorithms as well as a wealth of online tutorials and code snippets. As OpenCV was developed mainly to provide computer vision algorithms, its machine learning functionality is restricted to a single module, called ml. As we will see in this book, OpenCV still provides a number of state-of-the-art algorithms, but sometimes lacks a bit in functionality. In these rare cases, instead of reinventing the wheel, we will simply use scikit-learn for our purposes.
Last but not least, installing OpenCV using the Python Anaconda distribution is essentially a one-liner!
Installation
Before we get started, let's make sure that we have all the tools and libraries installed that are necessary to create a fully functioning data science environment. After downloading the latest code for this book from GitHub, we are going to install the following software:
- Python's Anaconda distribution, based on Python 3.5 or higher
- OpenCV 3.1 or higher
- Some supporting packages
Getting the latest code for this book
You can get the latest code for this book from GitHub, https://github.com/mbeyeler/opencv-machine-learning. You can either download a .zip package (beginners) or clone the repository using git (intermediate users).
If you choose to go with git, the first step is to make sure it is installed (https://git-scm.com/downloads).
Then, open a terminal (or command prompt, as it is called in Windows):
- On Windows 10, right-click on the Start Menu button, and select Command Prompt.
- On Mac OS X, press Cmd + Space to open Spotlight search, then type terminal, and hit Enter.
- On Ubuntu and friends, press Ctrl + Alt + T. On Red Hat, right-click on the desktop and choose Open Terminal from the menu.
Navigate to a directory where you want the code downloaded, for example:
$ cd Desktop
Then you can grab a local copy of the latest code by typing the following:
$ git clone https://github.com/mbeyeler/opencv-machine-learning.git
This will download the latest code in a folder called opencv-machine-learning.
After a while, the code might change online. In that case, you can update your local copy by running the following command from within the opencv-machine-learning directory:
$ git pull origin master
Getting to grips with Python's Anaconda distribution
Anaconda is a free Python distribution developed by Continuum Analytics that is made for scientific computing. It works across Windows, Linux, and Mac OS X platforms and is free even for commercial use. However, the best thing about it is that it comes with a number of preinstalled packages that are essential for data science, math, and engineering. These packages include the following:
- NumPy: A fundamental package for scientific computing in Python, which provides functionality for multidimensional arrays, high-level mathematical functions, and pseudo-random number generators
- SciPy: A collection of functions for scientific computing in Python, which provides advanced linear algebra routines, mathematical function optimization, signal processing, and so on
- scikit-learn: An open-source machine learning library in Python, which provides useful helper functions and infrastructure that OpenCV lacks
- Matplotlib: The primary scientific plotting library in Python, which provides functionality for producing line charts, histograms, scatter plots, and so on
- Jupyter Notebook: An interactive environment for the running of code in a web browser
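Once Anaconda is installed, a quick way to confirm that these packages are actually available is to ask Python whether it can find them. The following is a small sketch using only the standard library; the helper name find_missing is made up, and the import names in the dictionary reflect how each package is imported rather than how it is spelled in the list above.

```python
import importlib.util

# Import names differ from the display names in two cases:
# scikit-learn is imported as "sklearn" and Jupyter Notebook as "notebook".
PACKAGES = {
    "NumPy": "numpy",
    "SciPy": "scipy",
    "scikit-learn": "sklearn",
    "Matplotlib": "matplotlib",
    "Jupyter Notebook": "notebook",
}


def find_missing(packages):
    """Return the display names of packages that cannot be found."""
    return [name for name, module in packages.items()
            if importlib.util.find_spec(module) is None]


missing = find_missing(PACKAGES)
print("Missing:", ", ".join(missing) if missing else "none")
```

The check only looks the packages up without importing them, so it runs quickly even in a large environment.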
An installer for our platform of choice (Windows, Mac OS X, or Linux) can be found on the Continuum website, https://www.continuum.io/Downloads. I recommend using the Python 3.6-based distribution, as Python 2 is no longer under active development.
To run the installer, do one of the following:
- On Windows, double-click on the .exe file and follow the instructions on the screen
- On Mac OS X, double-click on the .pkg file and follow the instructions on the screen
- On Linux, open a terminal and run the .sh script using bash:
$ bash Anaconda3-4.3.0-Linux-x86_64.sh # Python 3.6 based
$ bash Anaconda2-4.3.0-Linux-x86_64.sh # Python 2.7 based
In addition, Python Anaconda comes with conda--a simple package manager similar to apt-get on Linux. After successful installation, we can install new packages in the terminal using the following command:
$ conda install package_name
Here, package_name is the actual name of the package that we want to install.
Existing packages can be updated using the following command:
$ conda update package_name
We can also search for packages using the following command:
$ anaconda search -t conda package_name
This will bring up a whole list of packages available through individual users. For example, searching for a package named opencv, we get the following hits:

The results list the users who provide OpenCV packages, along with the versions and platforms that each package supports, so we can locate one that matches our version of the software and our own platform. A package called package_name from a user called user_name can then be installed as follows:
$ conda install -c user_name package_name
Finally, conda provides something called an environment, which allows us to manage different versions of Python and/or packages installed in them. This means we could have one environment where we have all packages necessary to run OpenCV 2.4 with Python 2.7, and another where we run OpenCV 3.2 with Python 3.6. In the following section, we will create an environment that contains all the packages needed to run the code in this book.
Installing OpenCV in a conda environment
In a terminal, navigate to the directory where you downloaded the code:
$ cd Desktop/opencv-machine-learning
Before we create a new conda environment, we want to make sure we added the Conda-Forge channel to our list of trusted conda channels:
$ conda config --add channels conda-forge
The Conda-Forge channel is led by an open-source community that provides a wide variety of code recipes and software packages (for more info, see https://conda-forge.github.io). Specifically, it provides an OpenCV package for 64-bit Windows, which will simplify the remaining steps of the installation.
Then run the following command to create a conda environment based on Python 3.5, which will also install all the necessary packages listed in the file requirements.txt in one fell swoop:
$ conda create -n Python3 python=3.5 --file requirements.txt
To activate the environment, type one of the following, depending on your platform:
$ source activate Python3 # on Linux / Mac OS X
$ activate Python3 # on Windows
Once we close the terminal, the environment will be deactivated--so we will have to run the activation command again the next time we open a terminal. We can also deactivate the environment by hand:
$ source deactivate # on Linux / Mac OS X
$ deactivate # on Windows
And done!
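If you are ever unsure which environment a Python session is running in, you can ask Python itself. The sketch below relies on the CONDA_DEFAULT_ENV environment variable, which conda sets on activation; outside of an activated conda environment, the variable is simply missing, so we fall back to a placeholder.

```python
import os
import sys

# conda sets CONDA_DEFAULT_ENV when an environment is activated;
# the variable is absent outside of conda, hence the fallback value.
env_name = os.environ.get("CONDA_DEFAULT_ENV", "(none)")

print("Active conda environment:", env_name)
print("Python interpreter:", sys.executable)
print("Python version: {}.{}".format(*sys.version_info[:2]))
```

With the environment from this section activated, the first line should report Python3 and the interpreter path should point inside that environment's folder.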
Verifying the installation
It's a good idea to double-check our installation. While our terminal is still open, we fire up IPython, which is an interactive shell to run Python commands:
$ ipython
Now make sure that you are running (at least) Python 3.5 and not Python 2.7. You might see the version number displayed in IPython's welcome message. If not, you can run the following commands:
In [1]: import sys
   ...: print(sys.version)
3.5.3 |Continuum Analytics, Inc.| (default, Feb 22 2017, 21:28:42) [MSC v.1900 64 bit (AMD64)]
Now try to import OpenCV:
In [2]: import cv2
You should get no error messages. Then, try to find out the version number:
In [3]: cv2.__version__
Out[3]: '3.1.0'
Make sure that the Python version reads 3.5 or 3.6, but not 2.7. Additionally, make sure that OpenCV's version number reads at least 3.1.0; otherwise, you will not be able to use some OpenCV functionality later on.
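Rather than comparing version numbers by eye, we can let Python do the check. This is a minimal sketch: the helper version_tuple is a made-up name, and the thresholds are the ones stated above (Python 3.5 and OpenCV 3.1.0). The import of cv2 is wrapped in a try block so that the script also gives a useful message if OpenCV is missing.

```python
import re
import sys


def version_tuple(version):
    """Turn a string such as '3.1.0' into (3, 1, 0) for comparison."""
    return tuple(int(n) for n in re.findall(r"\d+", version)[:3])


python_ok = sys.version_info[:2] >= (3, 5)
print("Python", sys.version.split()[0], "OK" if python_ok else "too old")

try:
    import cv2
except ImportError:
    print("OpenCV is not installed in this environment")
else:
    opencv_ok = version_tuple(cv2.__version__) >= (3, 1, 0)
    print("OpenCV", cv2.__version__, "OK" if opencv_ok else "too old")
```

Comparing tuples of integers avoids the classic pitfall of comparing version strings, where '3.10.0' would sort before '3.2.0'.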
You can then exit the IPython shell by typing exit, or by hitting Ctrl + D and confirming that you want to quit.
Alternatively, you can run the code in a web browser thanks to Jupyter Notebook. If you have never heard of Jupyter Notebooks or played with them before, trust me - you will love them! If you followed the directions as mentioned earlier and installed the Python Anaconda stack, Jupyter is already installed and ready to go. In a terminal, type as follows:
$ jupyter notebook
This will automatically open a browser window, showing a list of files in the current directory. Click on the opencv-machine-learning folder, then on the notebooks folder, and voila! Here you will find all the code for this book, ready for you to explore:

The notebooks are arranged by chapter and section. For the most part, they contain only the relevant code, but no additional information or explanations. These are reserved for those who support our effort by buying this book - so thank you!
Simply click on a notebook of your choice, such as 01.00-A-Taste-of-Machine-Learning.ipynb, and you will be able to run the code yourself by selecting Kernel > Restart & Run All:

There are a few handy keyboard shortcuts for navigating Jupyter Notebooks. However, the only ones that you need to know about right now are the following:
- Click in a cell in order to edit it.
- While the cell is selected, hit Ctrl + Enter to execute the code in it.
- Alternatively, hit Shift + Enter to execute a cell and select the cell below it.
- Hit Esc to exit write mode, then hit A to insert a cell above the currently selected one and B to insert a cell below.
However, I strongly encourage you to follow along with the book by actually typing out the commands yourself, preferably in an IPython shell or an empty Jupyter Notebook. There is no better way to learn how to code than by getting your hands dirty. Even better if you make mistakes--we have all been there. At the end of the day, it's all about learning by doing!
Getting a glimpse of OpenCV's ML module
Starting with OpenCV 3.1, all machine learning-related functions in OpenCV have been grouped into the ml module. This has been the case for the C++ API for quite some time. You can get a glimpse of what's to come by displaying all functions in the ml module:
In [4]: dir(cv2.ml)
Out[4]: ['ANN_MLP_BACKPROP',
'ANN_MLP_GAUSSIAN',
'ANN_MLP_IDENTITY',
'ANN_MLP_NO_INPUT_SCALE',
'ANN_MLP_NO_OUTPUT_SCALE',
...
'__spec__']
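The full listing is long, so it can help to filter it. The sketch below keeps only the names that end in _create, on the assumption that your OpenCV build follows the usual Python naming pattern in which each model is instantiated through a function named after it with that suffix. The helper names_ending_with is a made-up name.

```python
def names_ending_with(module, suffix):
    """Return the sorted attribute names of module that end in suffix."""
    return sorted(name for name in dir(module) if name.endswith(suffix))


try:
    import cv2
except ImportError:
    print("OpenCV is not installed in this environment")
else:
    # Assumption: model factories follow the <Model>_create naming pattern.
    for name in names_ending_with(cv2.ml, "_create"):
        print(name)
```

The same helper can be pointed at other suffixes or prefixes of interest; for example, filtering on names that start with ANN_MLP would need only a small change to the condition.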
Summary
In this chapter, we talked about machine learning at a high abstraction level: what it is, why it is important, and what kinds of problems it can solve. We learned that machine learning problems come in three flavors: supervised learning, unsupervised learning, and reinforcement learning. We talked about the prominence of supervised learning, and that this field can be further divided into two subfields: classification and regression. Classification models allow us to categorize objects into known classes (such as animals into cats and dogs), whereas regression analysis can be used to predict continuous outcomes of target variables (such as the sales price of used cars).
We also learned how to set up a data science environment using the Python Anaconda distribution, how to get the latest code of this book from GitHub, and how to run code in a Jupyter Notebook.
With these tools in hand, we are now ready to start talking about machine learning in more detail. In the next chapter, we will look at the inner workings of machine learning systems and learn how to work with data in OpenCV with the help of common Pythonic tools such as NumPy and Matplotlib.