OpenCV 4 Computer Vision Application Programming Cookbook - Fourth Edition

By David Millán Escrivá , Robert Laganiere
About this book
OpenCV is an image and video processing library used for all types of image and video analysis. Throughout the book, you'll work with recipes to implement a variety of tasks. With 70 self-contained tutorials, this book examines common pain points and best practices for computer vision (CV) developers. Each recipe addresses a specific problem and offers a proven, best-practice solution with insights into how it works, so that you can copy the code and configuration files and modify them to suit your needs. This book begins by guiding you through setting up OpenCV, and explaining how to manipulate pixels. You'll understand how you can process images with classes and count pixels with histograms. You'll also learn detecting, describing, and matching interest points. As you advance through the chapters, you'll get to grips with estimating projective relations in images, reconstructing 3D scenes, processing video sequences, and tracking visual motion. In the final chapters, you'll cover deep learning concepts such as face and object detection. By the end of this book, you'll have the skills you need to confidently implement a range of computer vision algorithms to meet the technical requirements of your complex CV projects.
Publication date:
May 2019
Publisher
Packt
Pages
494
ISBN
9781789340723

 

Playing with Images

This chapter will teach you the basic elements of OpenCV and will show you how to accomplish the most fundamental image-processing tasks—reading, displaying, and saving images. However, before you can start with OpenCV, you need to install the library. This is a simple process that is explained in the first recipe of this chapter.

All your computer vision applications will involve the processing of images. This is why the most fundamental tool that OpenCV offers you is a data structure to handle images and matrices. It is a powerful data structure, with many useful attributes and methods. It also incorporates an advanced memory management model that greatly facilitates the development of applications. The last two recipes of this chapter will teach you how to use this important OpenCV data structure.

In this chapter, we will get you started with the OpenCV library. You will learn how to perform the following tasks:

  • Installing the OpenCV library
  • Loading, displaying, and saving images
  • Exploring the cv::Mat data structure
  • Defining regions of interest
 

Installing the OpenCV library

OpenCV is an open source library for developing computer vision applications that run on Windows, Linux, Android, and macOS. It can be used in both academic and commercial applications under a BSD license that allows you to use, distribute, and adapt it freely. This recipe will show you how to install the library on your machine.

Getting ready

When you visit the OpenCV official website at https://opencv.org/, you will find the latest release of the library, the online documentation, and many other useful resources concerning OpenCV.

How to do it...

The following steps take you through the installation:

  1. From the OpenCV website, go to the downloads page that corresponds to the platform of your choice (Unix/Windows or Android). From there, you will be able to download the OpenCV package.
  2. Uncompress it, normally under a directory with a name that corresponds to the library version (for example, in Windows, you can save the uncompressed directory under C:\OpenCV4.0.0).

Once this is done, you will find a collection of files and directories that constitute the library at the chosen location. Notably, you will find the modules directory here, which contains all the source files. (Yes, it is open source!)

  3. To complete the installation of the library and have it ready for use, you need to undertake an additional step: generating the binary files of the library for the environment of your choice. This is the point where you have to decide on the target platform that you will use to create your OpenCV applications. Which operating system should you use? Windows or Linux? Which compiler should you use? Microsoft Visual Studio 2013 or MinGW? 32-bit or 64-bit? The integrated development environment (IDE) that you will use in your project development will also guide these choices.
Note that if you are working under Windows with Visual Studio, the executable installation package will, most probably, install not only the library sources but also all of the precompiled binaries needed to build your applications. Check for the build directory; it should contain the x64 and x86 subdirectories (corresponding to the 64-bit and 32-bit versions). Within these subdirectories, you should find directories such as vc14 and vc15; these contain the binaries for the different versions of Microsoft Visual Studio. In that case, you are ready to start using OpenCV and can skip the compilation step described in this recipe, unless you want a customized build with specific options.
  4. To build the OpenCV binaries, you need to use the CMake tool, available at https://cmake.org/.

CMake is another open source software tool designed to control the compilation process of a software system using platform-independent configuration files. It generates the makefiles or workspaces needed for compiling a software library in your environment. Therefore, you need to download and install CMake.

  5. You can run CMake from the command line, but it is easier to use it through its GUI (cmake-gui).
  6. Specify the folder containing the OpenCV library source and the one that will contain the binaries. You need to click on Configure in order to select the compiler of your choice, and then click on Configure again.
  7. You are now ready to generate your project files by clicking on the Generate button. These files will allow you to compile the library.
  8. This is the last step of the installation process, which will make the library ready to be used under your development environment:
    1. If you have selected Visual Studio, then all you need to do is to open the top-level solution file that CMake has created for you (most probably, the OpenCV.sln file).
    2. You then click on Build Solution in Visual Studio.
    3. To get both a Release and a Debug build, you will have to run the compilation twice, once for each configuration. The bin directory that is created contains the dynamic library files that your executable will call at runtime.
    4. Make sure to set your system PATH environment variable from the Control Panel such that your operating system can find the dll files when you run your applications.
  9. In Linux environments, you will use the generated makefiles by running the make utility. To complete the installation of all the directories, you also have to build the INSTALL project (in Visual Studio) or run the sudo make install command (in Linux).

If you wish to use Qt as your IDE, the There's more... section of this recipe describes an alternative way to compile the OpenCV project.

How it works...

Since Version 2.2, the OpenCV library has been divided into several modules. These modules are built into library files located in the lib directory. Some of the commonly used modules are as follows:

  • The opencv_core module that contains the core functionalities of the library, in particular, basic data structures and arithmetic functions
  • The opencv_imgproc module that contains the main image-processing functions
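  • The opencv_highgui module that contains the user interface functions; since OpenCV 3, the image reading and writing functions are in the opencv_imgcodecs module and the video reading and writing functions are in the opencv_videoio module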
  • The opencv_features2d module that contains the feature point detectors and descriptors and the feature point matching framework
  • The opencv_calib3d module that contains the camera calibration, two-view geometry estimation, and stereo functions
  • The opencv_video module that contains the motion estimation, feature tracking, and foreground extraction functions and classes
  • The opencv_objdetect module that contains the object detection functions such as the face and people detectors

The library also includes other utility modules that contain machine learning functions (opencv_ml), fast approximate nearest-neighbor search (opencv_flann), and many more. You will also find other specialized libraries that implement higher level functions, such as opencv_photo for computational photography and opencv_stitching for image-stitching algorithms. There is also a separate set of extra modules, which includes non-free algorithms, non-stable modules, and experimental modules; it is hosted in the opencv_contrib GitHub repository. When you compile your application, you will have to link your program with the libraries that contain the OpenCV functions you are using, including any opencv_contrib modules you rely on.

All these modules have a header file associated with them (located in the include directory). A typical OpenCV C++ program will, therefore, start by including the headers of the required modules. For example (and this is the suggested declaration style), it will look like the following code:

#include <opencv2/core/core.hpp> 
#include <opencv2/imgproc/imgproc.hpp> 
#include <opencv2/highgui/highgui.hpp>
Downloading the example code
You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files emailed directly to you.

You might see OpenCV code starting with the following include directive:

#include "cv.h" 

This is because the code uses the old style, from before the library was restructured into modules; this header was kept for compatibility with older definitions.

There's more...

The OpenCV website at https://opencv.org/ contains detailed instructions on how to install the library. It also contains complete online documentation that includes several tutorials on the different components of the library.

Using Qt for OpenCV developments

Qt is a cross-platform development framework for C++ applications, developed as an open source project. It is offered under the GNU Lesser General Public License (LGPL) open source license as well as under a commercial (and paid) license for the development of proprietary projects. It is composed of two separate elements: a cross-platform IDE called Qt Creator, and a set of Qt class libraries and development tools. Using Qt to develop C++ applications has the following benefits:

  • It is an open source initiative, developed by the Qt community, that gives you access to the source code of the different Qt components
  • It is cross-platform, meaning that you can develop applications that run on different operating systems, such as Windows, Linux, macOS, and so on
  • It includes a complete and cross-platform GUI library that follows an effective object-oriented and event-driven model
  • Qt also includes several cross-platform libraries that help you to develop multimedia, graphics, databases, multithreading, web applications, and many other interesting building blocks useful for designing advanced applications

You can download Qt from https://www.qt.io/developers/. When you install it, you will be offered the choice of different compilers. Under Windows, MinGW is an excellent alternative to the Visual Studio compilers.

Compiling the OpenCV library with Qt is particularly easy because it can read CMake files. Once OpenCV and CMake have been installed, simply select Open File or Project... from the Qt menu, and open the CMakeLists.txt file that you will find under the sources directory of OpenCV. This will create an OpenCV project that you can then build by clicking on Build Project in the Qt menu.

You might get a few warnings, but these can be overlooked without consequences.

The OpenCV developer site

OpenCV is an open source project that welcomes user contributions. You can access the developer site at https://docs.opencv.org/. Among other things, you can access the currently developed version of OpenCV. The community uses Git as its version control system, so you have to use it to check out the latest version of OpenCV. Git is also a free and open source software system; it is probably the best tool you can use to manage your own source code. You can download it from https://git-scm.com/.

See also

  • The website of the author of this cookbook (www.laganiere.name) also presents step-by-step instructions on how to install the latest versions of the library.
  • The There's more... section of the next recipe explains how to create an OpenCV project with Qt.

We've successfully learned how to install the OpenCV library. Now, let's move on to the next recipe!

 

Loading, displaying, and saving images

It is now time to run your first OpenCV application. Since OpenCV is about processing images, this task will show you how to perform the most fundamental operations needed in the development of imaging applications. These are loading an input image from a file, displaying an image on a window, applying a processing function, and storing an output image on a disk.

Getting ready

Using your favorite IDE (for example, MS Visual Studio or Qt), create a new console application with a main function that is ready to be filled in.

How to do it...

Let's take a look at the following steps:

  1. Include the header files that declare the classes and functions you will use. Here, we simply want to display an image, so we need the core header that declares the image data structure and the highgui header that contains all the graphical interface functions, along with the standard iostream header for console output:
#include <iostream> 
#include <opencv2/core/core.hpp> 
#include <opencv2/highgui/highgui.hpp> 
  2. Our main function starts by declaring a variable that will hold the image. In OpenCV, this is done by defining an object of the cv::Mat class:
cv::Mat image; // create an empty image 
  3. This definition creates an image sized 0 x 0. This can be confirmed by accessing the cv::Mat size attributes:
std::cout << "This image is " << image.rows << " x " << image.cols << std::endl; 
  4. Next, a simple call to the reading function will read an image from the file, decode it, and allocate the memory:
image = cv::imread("puppy.bmp"); // read an input image 
  5. You are now ready to use this image. However, you should first check whether the image has been correctly read (reading fails if the file is not found, if the file is corrupted, or if it is not in a recognizable format). Use the empty method, which returns true if no image data has been allocated:
if (image.empty()) {  // error handling 
  // no image has been created... 
  // possibly display an error message 
  // and quit the application ... 
} 
  6. The first thing you might want to do with this image is to display it. You can do this by using the functions of the highgui module. Start by declaring the window on which you want to display the images, and then specify the image to be shown on this window:
// define the window (optional) 
cv::namedWindow("Original Image"); 
// show the image  
cv::imshow("Original Image", image); 

As you can see, the window is identified by a name. You can reuse this window to display another image later, or you can create multiple windows with different names. When you run this application, you will see the image displayed in its own window.

  7. Now, you would normally apply some processing to the image. OpenCV offers a wide selection of processing functions, and several of them are explored in this book. Let's start with a very simple one that flips an image horizontally. Several image transformations in OpenCV can be performed in-place, meaning that the transformation is applied directly on the input image (no new image is created). This is the case with the flip function. However, we can always create another matrix to hold the output result, and that is what we will do:
cv::Mat result; // we create another empty image 
cv::flip(image, result, 1); // positive for horizontal 
                            // 0 for vertical 
                            // negative for both 
  8. We are going to display the result on another window:
cv::namedWindow("Output Image"); // the output window 
cv::imshow("Output Image", result);

  9. Since this is a console application that will terminate when it reaches the end of the main function, we add an extra highgui function to wait for a user keypress before ending the program:
cv::waitKey(0); // 0 to indefinitely wait for a key pressed 
                // specifying a positive value will wait for 
                // the given amount of msec 

You can then see that the output image is displayed on a distinct window.

  10. Finally, you will probably want to save the processed image on your disk. This is done using the following function:
cv::imwrite("output.bmp", result); // save result 

The file extension determines which codec will be used to save the image. Other popular supported image formats are JPG, TIFF, and PNG.

How it works...

All classes and functions in the C++ API of OpenCV are defined within the cv namespace. You have two ways to access them. First, precede the main function's definition with the following declaration:

using namespace cv; 

Alternatively, prefix all OpenCV class and function names by the namespace specification, that is, cv::, as we will do in this book. The use of the prefix makes the OpenCV classes and functions easier to identify.

The highgui module contains a set of functions that allows you to visualize and interact with your images easily. When you load an image with the imread function, you also have the option to read it as a gray-level image. This is very advantageous since several computer vision algorithms require gray-level images. Converting an input color image on the fly as you read it will save you time and minimize your memory usage. This can be done as follows:

// read the input image as a gray-scale image 
image= cv::imread("puppy.bmp", cv::IMREAD_GRAYSCALE); 

This will produce an image made of unsigned bytes (unsigned char in C++) that OpenCV designates with the CV_8U defined constant. Alternatively, it is sometimes necessary to read an image as a three-channel color image even if it has been saved as a gray-level image. This can be achieved by calling the imread function with a positive second argument:

// read the input image as a 3-channel color image 
image= cv::imread("puppy.bmp", cv::IMREAD_COLOR); 

This time, an image made of three bytes per pixel will be created, designated as CV_8UC3 in OpenCV. Of course, if your input image has been saved as a gray-level image, all three channels will contain the same value. Finally, if you wish to read the image in the format in which it has been saved, then simply pass a negative value (cv::IMREAD_UNCHANGED) as the second argument. The number of channels in an image can be checked by using the channels method:

std::cout << "This image has " << image.channels() << " channel(s)"; 
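To make the CV_8U and CV_8UC3 designations more concrete, the following plain C++ sketch mirrors how OpenCV 4 packs the element depth and the channel count into a single integer type code: the depth sits in the low three bits and the channel count minus one in the bits above. The constants are copied from the OpenCV 4 headers so that the sketch compiles without the library, and the helper names makeType, depthOf, and channelsOf are hypothetical. In real code you would rely on the CV_MAKETYPE macro and the depth and channels methods of cv::Mat instead.

```cpp
#include <iostream>

// Depth codes with the values used by OpenCV 4 (copied here so that the
// sketch does not need the library).
constexpr int kDepth8U  = 0;  // CV_8U : unsigned 8-bit values
constexpr int kDepth16U = 2;  // CV_16U: unsigned 16-bit values
constexpr int kDepth32F = 5;  // CV_32F: 32-bit floating-point values

constexpr int kChannelShift = 3;                          // depth uses the low 3 bits
constexpr int kDepthMask    = (1 << kChannelShift) - 1;   // 0b111

// Hypothetical helper: combine a depth and a channel count into a type code.
constexpr int makeType(int depth, int channels) {
    return (depth & kDepthMask) + ((channels - 1) << kChannelShift);
}

// Hypothetical helpers: recover the two components from a type code.
constexpr int depthOf(int type)    { return type & kDepthMask; }
constexpr int channelsOf(int type) { return (type >> kChannelShift) + 1; }

// Print a short description of a type code.
void describeType(int type) {
    std::cout << "type " << type << ": depth code " << depthOf(type)
              << ", " << channelsOf(type) << " channel(s)" << std::endl;
}
```

For example, makeType(kDepth8U, 1) gives 0, the value of CV_8U, and makeType(kDepth8U, 3) gives 16, the value of CV_8UC3.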

Pay attention when you open an image with imread without specifying a full path (as we did here). In that case, the file is looked up relative to the current working directory. When you run your application from the console, this is the directory from which you launch it, typically the one that contains your executable file. However, if you run the application directly from your IDE, the working directory will most often be the one that contains your project file. Consequently, make sure that your input image file is located in the right directory.
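One way to find out where a relative file name such as puppy.bmp will be looked up is to ask the standard library for the current working directory. The sketch below assumes a C++17 compiler (for std::filesystem) and does not use OpenCV at all; the function names resolveAgainstWorkingDirectory and reportImageLocation are hypothetical helpers introduced for illustration.

```cpp
#include <filesystem>
#include <iostream>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: return the full path at which a file name would be
// looked up. Absolute paths are returned unchanged; relative paths are
// resolved against the current working directory.
fs::path resolveAgainstWorkingDirectory(const std::string& fileName) {
    const fs::path p(fileName);
    return p.is_absolute() ? p : fs::current_path() / p;
}

// Hypothetical helper: print the resolved location and report whether a
// file actually exists there.
bool reportImageLocation(const std::string& fileName) {
    const fs::path full = resolveAgainstWorkingDirectory(fileName);
    const bool found = fs::exists(full);
    std::cout << fileName << " resolves to " << full.string()
              << (found ? " (found)" : " (not found)") << std::endl;
    return found;
}
```

Calling reportImageLocation("puppy.bmp") before the call to imread, once from the console and once from the IDE, should make any difference between the two working directories visible.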

When you use imshow to display an image made up of integers (designated as CV_16U for 16-bit unsigned integers, or as CV_32S for 32-bit signed integers), the pixel values of this image will be divided by 256 first, in an attempt to make it displayable with 256 gray shades. Similarly, an image made of floating points will be displayed by assuming a range of possible values between 0.0 (displayed as black) and 1.0 (displayed as white). Values outside this defined range are displayed in white (for values above 1.0) or black (for values below 0.0).
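The mapping described above can be written down as two small functions. This is only an illustration in plain C++ of the behavior stated in the text, not the code that imshow actually runs; the function names are hypothetical, and the rounding applied to floating-point values is an assumption made for this sketch.

```cpp
#include <algorithm>
#include <cstdint>

// 16-bit unsigned pixel -> one of 256 gray shades, by dividing by 256.
std::uint8_t displayLevelFrom16U(std::uint16_t value) {
    return static_cast<std::uint8_t>(value / 256);
}

// Floating-point pixel -> gray shade, assuming 0.0 is black and 1.0 is white.
// Values below 0.0 are shown as black and values above 1.0 as white.
std::uint8_t displayLevelFromFloat(float value) {
    const float clamped = std::min(std::max(value, 0.0f), 1.0f);
    // Rounding to the nearest level is an assumption of this sketch.
    return static_cast<std::uint8_t>(clamped * 255.0f + 0.5f);
}
```

A practical consequence is that a floating-point image whose values span, say, 0 to 255 will appear almost entirely white unless you rescale it to the 0.0 to 1.0 range before displaying it.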

The highgui module is very useful for building quick prototype applications. When you are ready to produce a finalized version of your application, you will probably want to use the GUI module offered by your IDE in order to build an application with a more professional look.

Here, our application uses both input and output images. As an exercise, you should rewrite this simple program such that it takes advantage of the function's in-place processing, that is, by not declaring the output image and writing the following instead:

cv::flip(image,image,1); // in-place processing 
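To see what the difference between the two calling styles amounts to, here is a plain C++ sketch of a horizontal flip on a gray-level image stored row by row. It is not how cv::flip is implemented; the GrayImage structure and both function names are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical gray-level image: rows * cols values stored row by row.
struct GrayImage {
    int rows;
    int cols;
    std::vector<unsigned char> data;
};

// Out-of-place version: the input is left untouched and a new image is returned.
GrayImage flipHorizontalCopy(const GrayImage& in) {
    GrayImage out{in.rows, in.cols, std::vector<unsigned char>(in.data.size())};
    for (int r = 0; r < in.rows; ++r) {
        const std::size_t rowStart = static_cast<std::size_t>(r) * in.cols;
        for (int c = 0; c < in.cols; ++c) {
            // Column c of the output comes from the mirrored column of the input.
            out.data[rowStart + c] = in.data[rowStart + (in.cols - 1 - c)];
        }
    }
    return out;
}

// In-place version: each row is reversed directly inside the input image.
void flipHorizontalInPlace(GrayImage& img) {
    for (int r = 0; r < img.rows; ++r) {
        auto rowBegin = img.data.begin() + static_cast<std::ptrdiff_t>(r) * img.cols;
        std::reverse(rowBegin, rowBegin + img.cols);
    }
}
```

Both versions produce the same pixels; the in-place one simply avoids allocating a second buffer, at the cost of losing the original image.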

There's more...

The highgui module contains a rich set of functions that help you to interact with your images. Using these, your applications can react to mouse or key events. You can also draw shapes and write texts on images.

Clicking on images

You can program your mouse to perform specific operations when it is over one of the image windows you created. This is done by defining an appropriate callback function. A callback function is a function that you do not explicitly call but which is called by your application in response to specific events (here, the events that concern the mouse interacting with an image window). To be recognized by applications, callback functions need to have a specific signature and must be registered. In the case of the mouse event handler, the callback function must have the following signature:

void onMouse( int event, int x, int y, int flags, void* param); 

The first parameter is an integer that is used to specify which type of mouse event has triggered the call to the callback function. The next two parameters are simply the pixel coordinates of the mouse location when the event occurred. The flags are used to determine which button was pressed when the mouse event was triggered. Finally, the last parameter is used to send an extra parameter to the function in the form of a pointer to an object. This callback function can be registered in the application through the following call:

cv::setMouseCallback("Original Image", onMouse, reinterpret_cast<void*>(&image));

In this example, the onMouse function is associated with the image window called Original Image, and the address of the displayed image is passed as an extra parameter to the function. Now, if we define the onMouse callback function as shown in the following code, then each time the mouse is clicked, the value of the corresponding pixel will be displayed on the console (here, we assume that it is a gray-level image):

void onMouse(int event, int x, int y, int flags, void* param) { 

  cv::Mat *im = reinterpret_cast<cv::Mat*>(param); 
  switch (event) {  // dispatch the event 
  case cv::EVENT_LBUTTONDOWN: // left mouse button down event 
    // display pixel value at (x,y) 
    std::cout << "at (" << x << "," << y << ") value is: " 
              << static_cast<int>(im->at<uchar>(cv::Point(x,y))) << std::endl; 
    break; 
  } 
} 

Note that in order to obtain the pixel value at (x,y), we used the at method of the cv::Mat object here; this is discussed in Chapter 2, Manipulating the Pixels. Other possible events that can be received by the mouse event callback function include cv::EVENT_MOUSEMOVE, cv::EVENT_LBUTTONUP, cv::EVENT_RBUTTONDOWN, and cv::EVENT_RBUTTONUP.
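If the callback mechanism itself is unfamiliar, the following plain C++ sketch imitates it without any GUI. A hypothetical Window class plays the role of the image window: it stores a function pointer and an untyped user pointer at registration time and hands both back when an event is dispatched. The class, the event codes, and the ClickLog structure are all made up for this illustration and are not part of OpenCV.

```cpp
#include <iostream>

// Hypothetical event codes used only by this sketch.
enum MouseEvent { kMouseMove = 0, kLeftButtonDown = 1 };

// Same shape as the signature expected for a mouse callback.
using MouseCallback = void (*)(int event, int x, int y, int flags, void* param);

// Hypothetical stand-in for an image window.
class Window {
public:
    // Registration: remember the function and the extra parameter.
    void setMouseCallback(MouseCallback callback, void* userData) {
        callback_ = callback;
        userData_ = userData;
    }
    // Called by the (simulated) GUI whenever a mouse event occurs.
    void dispatch(int event, int x, int y, int flags = 0) const {
        if (callback_ != nullptr) {
            callback_(event, x, y, flags, userData_);
        }
    }
private:
    MouseCallback callback_ = nullptr;
    void* userData_ = nullptr;
};

// Application data reached through the untyped pointer.
struct ClickLog {
    int clicks = 0;
    int lastX = -1;
    int lastY = -1;
};

// The callback: never called directly by the application code.
void onMouse(int event, int x, int y, int /*flags*/, void* param) {
    ClickLog* log = static_cast<ClickLog*>(param);  // recover the real type
    if (event == kLeftButtonDown) {
        ++log->clicks;
        log->lastX = x;
        log->lastY = y;
        std::cout << "click at (" << x << "," << y << ")" << std::endl;
    }
}
```

The important design point is that the window knows nothing about ClickLog: the type information is restored inside the callback by the cast, which is why the pointer passed at registration must remain valid for as long as the callback can be triggered.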

Drawing on images

OpenCV also offers a few functions to draw shapes and write text on images. Examples of basic shape-drawing functions are circle, ellipse, line, and rectangle. The following is an example of how to use the circle function:

cv::circle(image,  // destination image  
            cv::Point(155,110), // center coordinate 
            65,                 // radius
            0,                  // color (here black) 
            3);                 // thickness 

The cv::Point structure is often used in OpenCV methods and functions to specify a pixel coordinate. Note that here we assume that the drawing is done on a gray-level image; this is why the color is specified with a single integer. In the next recipe, you will learn how to specify a color value for color images using the cv::Scalar structure. It is also possible to write text on an image. This can be done as follows:
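To give an idea of what the center, radius, color, and thickness arguments mean at the pixel level, here is a deliberately naive sketch in plain C++ that draws a circle outline on a gray-level buffer. It simply marks every pixel whose distance to the center is within half the thickness of the radius; this is an assumption made for the illustration and is not the algorithm used by cv::circle. The GrayImage structure and the drawCircleOutline function are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical gray-level image: rows * cols values stored row by row.
struct GrayImage {
    int rows;
    int cols;
    std::vector<unsigned char> data;
};

// Naive outline drawing: a pixel is painted when its distance to the center
// differs from the radius by at most half the thickness.
void drawCircleOutline(GrayImage& img, int centerX, int centerY,
                       int radius, unsigned char color, int thickness) {
    const double halfThickness = thickness / 2.0;
    for (int y = 0; y < img.rows; ++y) {
        for (int x = 0; x < img.cols; ++x) {
            const double distance = std::hypot(static_cast<double>(x - centerX),
                                               static_cast<double>(y - centerY));
            if (std::fabs(distance - radius) <= halfThickness) {
                img.data[static_cast<std::size_t>(y) * img.cols + x] = color;
            }
        }
    }
}
```

Note that x selects the column and y the row, which is the same convention as the cv::Point(x,y) coordinates used in the calls above.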

cv::putText(image,  // destination image 
            "This is a dog.",// text 
            cv::Point(40,200), // text position 
            cv::FONT_HERSHEY_PLAIN,  // font type 
            2.0, // font scale 
            255, // text color (here white) 
            2);  // text thickness 

Calling these two functions on our test image will then result in the following screenshot:

Let's see what happens when you run the example using Qt.

Running the example with Qt

If you wish to use Qt to run your OpenCV applications, you will need to create project files. For the example of this recipe, here is how the project file (loadDisplaySave.pro) will look:

QT += core 
QT -= gui 
 
TARGET = loadDisplaySave 
CONFIG  += console 
CONFIG  -= app_bundle 
 
TEMPLATE = app 
 
SOURCES += loadDisplaySave.cpp 
INCLUDEPATH += C:\OpenCV4.0.0\build\include 
LIBS += -LC:\OpenCV4.0.0\build\x86\MinGWqt32\lib \ 
-lopencv_core400 \ 
-lopencv_imgproc400 \ 
-lopencv_highgui400

This file shows you where to find the include and library files. It also lists the library modules that are used by the example. Make sure to use the library binaries compatible with the compiler that Qt is using. Note that if you download the source code of the examples for this book, you will find the CMakeLists files that you can open with Qt (or CMake) in order to create the associated projects.

See also

We've successfully learned how to load, display, and save images. Now, let's move on to the next recipe!

 

Exploring the cv::Mat data structure

In the previous recipe, you were introduced to the cv::Mat data structure. As mentioned, this is a key element of the library. It is used to manipulate images and matrices (in fact, an image is a matrix from a computational and mathematical point of view). Since you will be using this data structure extensively in your applications, it is imperative that you become familiar with it. Notably, you will learn in this recipe that this data structure incorporates an elegant memory management mechanism that allows efficient memory usage.

How to do it...

Let's write a test program that exercises the different properties of the cv::Mat data structure:

  1. Include the OpenCV headers and the C++ I/O stream header:
#include <iostream>
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>

  2. We are going to create a function that generates a new gray image with a default value for all its pixels:
cv::Mat function() {
  // create image
  cv::Mat ima(500,500,CV_8U,50);
  // return it
  return ima;
}
  3. In the main function, we are going to create six windows to show our results:
  // define image windows
  cv::namedWindow("Image 1");
  cv::namedWindow("Image 2");
  cv::namedWindow("Image 3");
  cv::namedWindow("Image 4");
  cv::namedWindow("Image 5");
  cv::namedWindow("Image");
  4. Now, we can start to create different mats (with different sizes, channels, and default values) and wait for a key to be pressed:
  // create a new image made of 240 rows and 320 columns
  cv::Mat image1(240,320,CV_8U,100);

  cv::imshow("Image", image1); // show the image
  cv::waitKey(0); // wait for a key pressed

  // re-allocate a new image
  image1.create(200,200,CV_8U);
  image1= 200;

  cv::imshow("Image", image1); // show the image
  cv::waitKey(0); // wait for a key pressed

  // create a red color image
  // channel order is BGR
  cv::Mat image2(240,320,CV_8UC3,cv::Scalar(0,0,255));

  // or:
  // cv::Mat image2(cv::Size(320,240),CV_8UC3);
  // image2= cv::Scalar(0,0,255);

  cv::imshow("Image", image2); // show the image
  cv::waitKey(0); // wait for a key pressed
  5. We are going to read an image with the imread function and copy it to other mats:
  // read an image
  cv::Mat image3= cv::imread("puppy.bmp");

  // all these images point to the same data block
  cv::Mat image4(image3);
  image1= image3;

  // these images are new copies of the source image
  image3.copyTo(image2);
  cv::Mat image5= image3.clone();
  6. Now, we are going to apply an image transformation (a horizontal flip) to the source image, show all the images created, and wait for a keypress:
  // transform the image for testing
  cv::flip(image3,image3,1);

  // check which images have been affected by the processing
  cv::imshow("Image 3", image3);
  cv::imshow("Image 1", image1);
  cv::imshow("Image 2", image2);
  cv::imshow("Image 4", image4);
  cv::imshow("Image 5", image5);
  cv::waitKey(0); // wait for a key pressed
  7. Now, we are going to use the function created before to generate a new gray mat:
  // get a gray-level image from a function
  cv::Mat gray= function();

  cv::imshow("Image", gray); // show the image
  cv::waitKey(0); // wait for a key pressed
  8. Finally, we are going to load a color image but convert it to gray in the loading process. Then, we will convert its values into a floating-point mat:
  // read the image in gray scale
  image1= cv::imread("puppy.bmp", cv::IMREAD_GRAYSCALE);
  image1.convertTo(image2,CV_32F,1/255.0,0.0);

  cv::imshow("Image", image2); // show the image
  cv::waitKey(0); // wait for a key pressed
 

Run this program and take a look at the following images produced:

Now, let's go behind the scenes to understand the code better.

How it works...

The cv::Mat data structure is essentially made up of two parts: a header and a data block. The header contains all the information associated with the matrix (size, number of channels, data type, and so on). The previous recipe showed you how to access some of the attributes of this structure contained in its header (for example, by using cols, rows, or channels). The data block holds all the pixel values of an image. The header contains a pointer variable that points to this data block; it is the data attribute. An important property of the cv::Mat data structure is the fact that the memory block is only copied when it is explicitly requested. Indeed, most operations will simply copy the cv::Mat header such that multiple objects will point to the same data block at the same time. This memory management model makes your applications more efficient while avoiding memory leaks, but its consequences have to be understood. The examples for this recipe illustrate this fact.

By default, the cv::Mat objects have a zero size when they are created, but you can also specify an initial size as follows:

// create a new image made of 240 rows and 320 columns 
cv::Mat image1(240,320,CV_8U,100);

In this case, you also need to specify the type of each matrix element; CV_8U here, which corresponds to 1-byte pixel images. The letter U means it is unsigned. You can also declare signed numbers by using the letter S. For a color image, you would specify three channels (CV_8UC3). You can also declare 16-bit integers, signed or unsigned (for example, CV_16SC3), and 32-bit signed integers (CV_32S). You also have access to 32-bit and 64-bit floating-point numbers (for example, CV_32F and CV_64F).
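As a rough illustration of how such type names combine a depth and a channel count, here is a minimal standard-library sketch. The Depth enum and the makeType, depthOf, channelsOf, and elemSize functions are hypothetical names introduced for this example; the numeric layout (depth code in the low 3 bits, channel count minus one above them) is assumed to match the OpenCV 4 headers and should be checked against your installed version:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for OpenCV's depth codes
// (values assumed to match OpenCV 4: CV_8U=0 ... CV_64F=6).
enum Depth { U8 = 0, S8 = 1, U16 = 2, S16 = 3, S32 = 4, F32 = 5, F64 = 6 };

// Pack a depth and a channel count into a single type code,
// in the manner of the CV_MAKETYPE macro (assumed layout).
inline int makeType(Depth depth, int channels) {
    assert(channels >= 1);
    return static_cast<int>(depth) + ((channels - 1) << 3);
}

// Recover the two parts from a packed type code.
inline int depthOf(int type)    { return type & 7; }
inline int channelsOf(int type) { return (type >> 3) + 1; }

// Bytes used by a single channel value of the given depth.
inline std::size_t bytesPerValue(int depth) {
    assert(depth >= U8 && depth <= F64);
    if (depth <= S8)  return 1;  // 8-bit integers
    if (depth <= S16) return 2;  // 16-bit integers
    if (depth <= F32) return 4;  // 32-bit integers and floats
    return 8;                    // 64-bit floats
}

// Bytes used by one matrix element (all the channels of one pixel).
inline std::size_t elemSize(int type) {
    return bytesPerValue(depthOf(type)) * channelsOf(type);
}
```

Under these assumptions, makeType(U8, 3) plays the role of CV_8UC3 and occupies 3 bytes per pixel. In real code, rely on the CV_ constants and on the type(), depth(), and channels() methods of cv::Mat rather than on helpers like these.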

Each element of an image (or a matrix) can be composed of more than one value (for example, the three channels of a color image); therefore, OpenCV has introduced a simple data structure that is used when pixel values are passed to functions. It is the cv::Scalar structure, which can hold up to four values and is generally used to hold one value or three values. For example, to create a color image initialized with red pixels, you will write the following code:

// create a red color image 
// channel order is BGR 
cv::Mat image2(240,320,CV_8UC3,cv::Scalar(0,0,255)); 

Similarly, the initialization of the gray-level image could also have been done using this structure by writing cv::Scalar(100).

The image size also often needs to be passed to functions. We have already mentioned that the cols and rows attributes can be used to get the dimensions of a cv::Mat instance. The size information can also be provided through the cv::Size structure, which simply contains the width and height of the matrix, in that order; note that this is the reverse of the (rows, columns) order used by the cv::Mat constructor. The size() method allows you to obtain the current matrix size. It is the format that is used in many methods where a matrix size must be specified. For example, an image could be created as follows:

// create a non-initialized color image  
cv::Mat image2(cv::Size(320,240),CV_8UC3); 

The data block of an image can always be allocated or reallocated using the create method. When an image has been previously allocated, its old content is deallocated first. For reasons of efficiency, if the newly proposed size and type match the already existing size and type, then no new memory allocation is performed:

// re-allocate a new image 
// (only if size or type are different) 
image1.create(200,200,CV_8U);
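The "allocate only if something changed" rule can be sketched with a toy type. This is a simplified model under stated assumptions, not OpenCV's implementation: ToyImage is a hypothetical name, it handles a single 8-bit channel only (so there is no type argument), and, unlike the real create method, it zero-fills the new buffer:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical toy image used only to illustrate the create() rule.
struct ToyImage {
    int rows = 0;
    int cols = 0;
    std::shared_ptr<std::vector<std::uint8_t>> data;  // pixel buffer

    // Allocate a rows x cols buffer, but only when the requested size
    // differs from the current one. Returns true if a new buffer was made.
    bool create(int newRows, int newCols) {
        assert(newRows >= 0 && newCols >= 0);
        if (data && newRows == rows && newCols == cols) {
            return false;  // same size: the existing buffer is kept
        }
        rows = newRows;
        cols = newCols;
        // Replacing the shared_ptr drops our reference to the old buffer,
        // which is freed if no other object still refers to it.
        data = std::make_shared<std::vector<std::uint8_t>>(
            static_cast<std::size_t>(rows) * cols);
        return true;
    }
};
```

With this model, calling create(240, 320) twice in a row allocates once; a later create(200, 200) releases the first buffer and allocates a new one.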

When no more references point to the data block of a given cv::Mat object, the allocated memory is automatically released. This is very convenient because it avoids the memory leaks often associated with dynamic memory allocation in C++. This key mechanism, part of OpenCV since version 2, is accomplished by having the cv::Mat class implement reference counting and shallow copying. Therefore, when an image is assigned to another one, the image data (that is, the pixels) is not copied; both images will point to the same memory block. This also applies to images passed by value or returned by value. A reference count is kept, such that the memory is released only when all the references to the image have been destroyed or assigned to another image:

// all these images point to the same data block 
cv::Mat image4(image3); 
image1= image3; 

Any transformation applied to one of the preceding images will also affect the other images. If you wish to create a deep copy of the content of an image, use the copyTo method. In that case, the create method is called on the destination image. Another method that produces a copy of an image is the clone method, which creates an identical new image as follows:

// these images are new copies of the source image 
image3.copyTo(image2); 
cv::Mat image5= image3.clone(); 
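The difference between the shallow copies and the deep copies shown above can be modeled with the standard library. The sketch below is an analogy, not OpenCV's code: ToyImage is a hypothetical class in which std::shared_ptr stands in for the reference counter that cv::Mat maintains internally:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical toy image: a small header plus a reference-counted data block.
class ToyImage {
public:
    ToyImage(int rows, int cols, std::uint8_t value)
        : rows_(rows), cols_(cols),
          data_(std::make_shared<std::vector<std::uint8_t>>(
              static_cast<std::size_t>(rows) * cols, value)) {
        assert(rows > 0 && cols > 0);
    }

    // The compiler-generated copy constructor and assignment operator copy
    // only the header fields and the shared_ptr: copies share the pixels.

    // Deep copy: allocate a new buffer and duplicate the pixels into it.
    ToyImage clone() const {
        ToyImage copy(*this);
        copy.data_ = std::make_shared<std::vector<std::uint8_t>>(*data_);
        return copy;
    }

    std::uint8_t& at(int r, int c) {
        assert(r >= 0 && r < rows_ && c >= 0 && c < cols_);
        return (*data_)[static_cast<std::size_t>(r) * cols_ + c];
    }

    int rows() const { return rows_; }
    int cols() const { return cols_; }

    // Number of ToyImage objects currently sharing this buffer.
    long refCount() const { return data_.use_count(); }

private:
    int rows_;
    int cols_;
    std::shared_ptr<std::vector<std::uint8_t>> data_;
};
```

In this model, writing a pixel through one object is visible through every shallow copy of it, while a clone keeps its original values. The buffer is freed when the last object sharing it goes away, which mirrors the behavior described for cv::Mat.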

If you need to copy an image into another image that does not necessarily have the same data type, you have to use the convertTo method:

// convert the image into a floating point image [0,1] 
image1.convertTo(image2,CV_32F,1/255.0,0.0); 

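In this example, the source image is copied into a floating-point image. The method has two optional parameters: a scaling factor and an offset. Each source value is multiplied by the scaling factor and the offset is then added, so here the 8-bit range [0,255] is mapped to [0,1]. Note, however, that both images must have the same number of channels.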
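To make the arithmetic of such a conversion concrete, here is a minimal sketch on a flat pixel buffer. The toFloat function is a hypothetical helper written for this illustration; it only reproduces the "value times scale plus offset" computation and none of the other behavior of convertTo:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical helper: convert 8-bit values to floats,
// computing value * scale + offset for each element.
std::vector<float> toFloat(const std::vector<std::uint8_t>& src,
                           double scale = 1.0, double offset = 0.0) {
    assert(std::isfinite(scale) && std::isfinite(offset));
    std::vector<float> dst;
    dst.reserve(src.size());
    for (std::uint8_t v : src) {
        dst.push_back(static_cast<float>(v * scale + offset));
    }
    return dst;
}
```

Calling toFloat(pixels, 1/255.0, 0.0) maps 0 to 0 and 255 to 1, as in the floating-point image created in this recipe.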

The allocation model for the cv::Mat objects also allows you to safely write functions (or class methods) that return an image:

cv::Mat function() {  
  // create image 
  cv::Mat ima(240,320,CV_8U,cv::Scalar(100)); 
  // return it 
  return ima; 
}

We also call this function from our main function, as follows:

  // get a gray-level image 
  cv::Mat gray= function(); 

If we do this, then the gray variable will now hold the image created by the function without extra memory allocation. Indeed, as we explained, only a shallow copy of the image will be transferred from the returned cv::Mat instance to the gray image. When the ima local variable goes out of scope, this variable is deallocated, but since the associated reference counter indicates that its internal image data is being referred to by another instance (that is, the gray variable), its memory block is not released.

It's worth noting that in the case of classes, you should be careful and not return image class attributes. Here is an example of an error-prone implementation:

class Test { 
  // image attribute 
  cv::Mat ima; 
  public: 
  // constructor creating a gray-level image 
 Test() : ima(240,320,CV_8U,cv::Scalar(100)) {} 
 
  // method return a class attribute, not a good idea... 
  cv::Mat method() { return ima; } 
}; 

Here, if a function calls the method of this class, it obtains a shallow copy of the image attribute. If this copy is later modified, the class attribute will also be surreptitiously modified, which can affect the subsequent behavior of the class (and vice versa). To avoid these kinds of errors, you should instead return a clone of the attribute.

There's more...

While you are manipulating the cv::Mat class, you will discover that OpenCV also includes several other related classes. It will be important for you to become familiar with them.

The input and output arrays

If you look at the OpenCV documentation, you will see that many methods and functions accept parameters of the cv::InputArray type as input. This type is a simple proxy class introduced to generalize the concept of arrays in OpenCV, and thus avoid the duplication of several versions of the same method or function with different input parameter types. It basically means that you can supply a cv::Mat object or other compatible types as an argument. This class is just an interface, so you should never declare it explicitly in your code. It is interesting to know that cv::InputArray can also be constructed from the popular std::vector class. This means that such objects can be used as the input to OpenCV methods and functions (as long as it makes sense to do so). Other compatible types are cv::Scalar and cv::Vec; this latter structure will be presented in Chapter 2, Manipulating the Pixels. There is also a cv::OutputArray proxy class that is used to designate the arrays returned by some methods or functions.

See also

  • The complete OpenCV documentation can be found at https://docs.opencv.org/.
  • Chapter 2, Manipulating the Pixels, will show you how to access and modify the pixel values of an image represented by the cv::Mat class efficiently.

The next recipe will explain how to define a region of interest (ROI) inside an image.

 

Defining regions of interest

Sometimes, a processing function needs to be applied only to a portion of an image. OpenCV incorporates an elegant and simple mechanism to define a subregion in an image and manipulate it as a regular image. This recipe will teach you how to define an ROI inside an image.

Getting ready

Suppose we want to copy a small image onto a larger one. For example, let's say we want to insert the following small logo into our test image:

To do this, an ROI can be defined over which the copy operation can be applied. As we will see, the position of the ROI will determine where the logo will be inserted in the image.

How to do it...

Let's take a look at the following steps:

  1. The first step consists of defining the ROI. We can use Rect to define the ROI:
cv::Rect myRoi= cv::Rect(image.cols-logo.cols, //ROI coordinates 
                image.rows-logo.rows, 
                logo.cols,logo.rows);
  2. Once the ROI is defined, we can apply it to a mat to create a new mat that can be manipulated as a regular cv::Mat instance. The key is that the ROI is itself a cv::Mat object that points to the same data buffer as its parent image and has a header that specifies the coordinates of the ROI. Inserting the logo would then be accomplished as follows:
  // define image ROI at image bottom-right 
  cv::Mat imageROI(image, myRoi);
 
  // insert logo 
  logo.copyTo(imageROI);

Here, image is the destination image, and logo is the logo image (of a smaller size). The following image is then obtained by executing the previous code:

Now, let's go behind the scenes to understand the code better.

How it works...

One way to define an ROI is to use a cv::Rect instance. As the name indicates, it describes a rectangular region by specifying the position of the upper-left corner (the first two parameters of the constructor) and the size of the rectangle (the width and height are given in the last two parameters). In our example, we used the size of the image and the size of the logo in order to determine the position where the logo would cover the bottom-right corner of the image. Obviously, the ROI should always be completely inside the parent image.

The ROI can also be described using row and column ranges. A range is a continuous sequence from a start index to an end index (the start is included, the end is excluded). The cv::Range structure is used to represent this concept. Therefore, an ROI can be defined from two ranges; in our example, the ROI could have been equivalently defined as follows:

imageROI= image(cv::Range(image.rows-logo.rows,image.rows),  
                cv::Range(image.cols-logo.cols,image.cols)); 

In this case, the operator() function of cv::Mat returns another cv::Mat instance that can then be used in subsequent calls. Any transformation of the ROI will affect the original image in the corresponding area because the image and the ROI share the same image data. Since the definition of an ROI does not include the copying of data, it is executed in a constant amount of time, no matter the size of the ROI.
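Why defining an ROI takes constant time, and why writing into it changes the parent image, can be seen in a small model. The following sketch is an assumption-laden analogy rather than OpenCV's implementation: ToyView and makeImage are hypothetical names, and the view covers a single 8-bit channel. An ROI is simply a new header (offset, row step, and size) over the same shared buffer:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical view type: a header describing a rectangle in a shared buffer.
struct ToyView {
    std::shared_ptr<std::vector<std::uint8_t>> buffer;  // shared with the parent
    std::size_t offset;  // index of the view's top-left pixel in the buffer
    std::size_t step;    // elements between the starts of consecutive buffer rows
    int rows;            // height of the view
    int cols;            // width of the view

    std::uint8_t& at(int r, int c) {
        assert(r >= 0 && r < rows && c >= 0 && c < cols);
        return (*buffer)[offset + static_cast<std::size_t>(r) * step + c];
    }

    // Define a sub-view from an upper-left corner (x, y) and a size.
    // Only header fields are computed; no pixel is copied.
    ToyView roi(int x, int y, int width, int height) const {
        assert(x >= 0 && y >= 0 && x + width <= cols && y + height <= rows);
        return ToyView{buffer, offset + static_cast<std::size_t>(y) * step + x,
                       step, height, width};
    }
};

// Create a view covering a freshly allocated rows x cols buffer.
ToyView makeImage(int rows, int cols, std::uint8_t value) {
    auto buf = std::make_shared<std::vector<std::uint8_t>>(
        static_cast<std::size_t>(rows) * cols, value);
    return ToyView{buf, 0, static_cast<std::size_t>(cols), rows, cols};
}
```

Here, image.roi(image.cols - w, image.rows - h, w, h) selects the bottom-right corner in the same way as the cv::Rect used in this recipe, and a pixel written through the returned view lands in the parent's buffer.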

If you want to define an ROI made of some rows of an image, the following call could be used (as with cv::Range, start is included and end is excluded):

cv::Mat imageROI= image.rowRange(start,end); 

Similarly, for an ROI made of some image columns, the following could be used:

cv::Mat imageROI= image.colRange(start,end); 

There's more...

The OpenCV methods and functions include many optional parameters that are not discussed in the recipes of this book. When you wish to use a function for the first time, you should always take the time to look at the documentation to learn more about the possible options that this function offers. One very common option is the possibility to define image masks.

Using image masks

Some OpenCV operations allow you to define a mask that will limit the applicability of a given function or method, which is normally supposed to operate on all the image pixels. A mask is an 8-bit image that should be nonzero at all locations where you want an operation to be applied. At the pixel locations that correspond to the zero values of the mask, the image is untouched. For example, the copyTo method can be called with a mask. We can use it here to copy only the white portion of the logo shown previously, as follows:

// define image ROI at image bottom-right 
imageROI= image(cv::Rect(image.cols-logo.cols,image.rows-logo.rows,  logo.cols,logo.rows)); 
// use the logo as a mask (must be gray-level) 
cv::Mat mask(logo); 
 
// insert by copying only at locations of non-zero mask 
logo.copyTo(imageROI,mask); 

The following image is obtained by executing the previous code:

The background of our logo was black (that is, it had the value 0), so it was easy to use the logo as both the copied image and the mask. Of course, you can define the mask of your choice in your application; most OpenCV pixel-based operations give you the opportunity to use masks.
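The effect of a mask on a copy can be sketched in a few lines. The maskedCopy function below is a hypothetical helper working on flat 8-bit buffers of equal size; it illustrates the rule "copy only where the mask is nonzero" and is not how copyTo is implemented:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: copy src into dst only at positions where
// mask is nonzero; all other dst values are left untouched.
void maskedCopy(const std::vector<std::uint8_t>& src,
                const std::vector<std::uint8_t>& mask,
                std::vector<std::uint8_t>& dst) {
    assert(src.size() == mask.size() && src.size() == dst.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        if (mask[i] != 0) {
            dst[i] = src[i];
        }
    }
}
```

Passing the same buffer as both source and mask, as done with the logo in this recipe, copies the bright pixels and leaves the destination unchanged wherever the logo is black.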

See also

  • The row and col methods will be used in the Scanning an image with neighbor access recipe of Chapter 2, Manipulating the Pixels. These are special cases of the rowRange and colRange methods in which the range covers a single index, in order to define a single-line or single-column ROI.
About the Authors
  • David Millán Escrivá

    David Millán Escrivá was 8 years old when he wrote his first program on an 8086 PC in Basic, which enabled the 2D plotting of basic equations. In 2005, he finished his studies in IT with honors, through the Universitat Politécnica de Valencia, in human-computer interaction supported by computer vision with OpenCV (v0.96). He has worked with Blender, an open source, 3D software project, and on its first commercial movie, Plumiferos, as a computer graphics software developer. David has more than 10 years' experience in IT, with experience in computer vision, computer graphics, pattern recognition, and machine learning, working on different projects, and at different start-ups, and companies. He currently works as a researcher in computer vision.

  • Robert Laganiere

    Robert Laganiere is a professor at the School of Electrical Engineering and Computer Science of the University of Ottawa, Canada. He is also a faculty member of the VIVA research lab and is the co-author of several scientific publications and patents in content based video analysis, visual surveillance, driver-assistance, object detection, and tracking. Robert authored the OpenCV2 Computer Vision Application Programming Cookbook in 2011 and co-authored Object Oriented Software Development published by McGraw Hill in 2001. He co-founded Visual Cortek in 2006, an Ottawa-based video analytics start-up that was later acquired by iwatchlife.com in 2009. He is also a consultant in computer vision and has assumed the role of Chief Scientist in a number of start-up companies such as Cognivue Corp, iWatchlife, and Tempo Analytics. Robert has a Bachelor of Electrical Engineering degree from Ecole Polytechnique in Montreal (1987) and MSc and PhD degrees from INRS-Telecommunications, Montreal (1996). You can visit the author's website at laganiere.name.

Latest Reviews (3 reviews total)
Not finished yet. Until now, well written, I use it as introduction to image processing (I have already some knowledge about it).
The programs that come with the GitHub repository are excellent.
A very simple process, except that I had problems creating a user account on their website.