OpenCV 3 Computer Vision Application Programming Cookbook - Third Edition

By Robert Laganiere

About this book

Making your applications see has never been easier with OpenCV. With it, you can teach your robot how to follow your cat, write a program to correctly identify the members of One Direction, or even find the right colors for your redecoration.

OpenCV 3 Computer Vision Application Programming Cookbook Third Edition provides a complete introduction to the OpenCV library and explains how to build your first computer vision program. You will be presented with a variety of computer vision algorithms and exposed to important concepts in image and video analysis that will enable you to build your own computer vision applications.

This book helps you to get started with the library, and shows you how to install and deploy the OpenCV library to write effective computer vision applications following good programming practices. You will learn how to read and write images and manipulate their pixels. Different techniques for image enhancement and shape analysis will be presented. You will learn how to detect specific image features such as lines, circles or corners. You will be introduced to the concepts of mathematical morphology and image filtering.

The most recent methods for image matching and object recognition are described, and you’ll discover how to process video from files or cameras, as well as how to detect and track moving objects. Techniques to achieve camera calibration and perform multiple-view analysis will also be explained. Finally, you’ll also get acquainted with recent approaches in machine learning and object classification.

Publication date:
February 2017


Chapter 1. Playing with Images

In this chapter, we will get you started with the OpenCV library. You will learn how to perform the following tasks:

  • Installing the OpenCV library

  • Loading, displaying, and saving images

  • Exploring the cv::Mat data structure

  • Defining regions of interest



This chapter will teach you the basic elements of OpenCV and will show you how to accomplish the most fundamental image processing tasks: reading, displaying, and saving images. However, before you start with OpenCV, you need to install the library. This is a simple process that is explained in the first recipe of this chapter.

All your computer vision applications will involve the processing of images. This is why OpenCV offers you a data structure to handle images and matrices. It is a powerful data structure with many useful attributes and methods. It also incorporates an advanced memory management model that greatly facilitates the development of applications. The last two recipes of this chapter will teach you how to use this important data structure of OpenCV.


Installing the OpenCV library

OpenCV is an open source library for developing computer vision applications that can run on multiple platforms, such as Windows, Linux, Mac, Android, and iOS. It can be used in both academic and commercial applications under a BSD license that allows you to freely use, distribute, and adapt it. This recipe will show you how to install the library on your machine.

Getting ready

When you visit the official OpenCV website, you will find the latest release of the library, the online documentation describing the Application Programming Interface (API), and many other useful resources on OpenCV.

How to do it...

From the OpenCV website, find the latest available downloads and select the one that corresponds to the platform of your choice (Windows, Linux/Mac, or iOS). Once the OpenCV package is downloaded, extract it to the location of your choice (on Windows, run the WinZip self-extractor). An opencv directory will be created; it is a good idea to rename it in a way that shows which version you are using (for example, in Windows, your final directory could be C:\opencv-3.2). This directory will contain a collection of files and directories that constitute the library. Notably, you will find the sources directory that contains all the source files (yes, it is open source!).

In order to complete the installation of the library and have it ready for use, you need to take an important step: generate the binary files of the library for the environment of your choice. This is indeed the point where you have to make a decision on the target platform you wish to use to create your OpenCV applications. Which operating system do you prefer to use? Which compiler should you select? Which version? 32-bit or 64-bit? As you can see, there are many possible options, and this is why you have to build the library that fits your needs.

The Integrated Development Environment (IDE) you will use in your project development will also guide you to make these choices. Note that the library package also comes with precompiled binaries that you can directly use if they correspond to your situation (check the build directory adjacent to the sources directory). If one of the precompiled binaries satisfies your requirements, then you are ready to go.

One important remark, however: since version 3, OpenCV has been split into two major components. The first one is the main OpenCV source repository that includes the mature algorithms. This is the one you have downloaded. A separate contribution repository also exists, and it contains the new computer vision algorithms recently added by the OpenCV contributors. If your plan is to use only the core functions of OpenCV, you do not need the contrib package. But if you want to play with the latest state-of-the-art algorithms, then there is a good chance that you will need these extra modules. As a matter of fact, this cookbook will show you how to use several of these advanced algorithms. You therefore need the contrib modules to follow the recipes of this book. So you have to go to the opencv_contrib repository on GitHub and download OpenCV's extra modules (download the ZIP file). You can unzip the extra modules into the directory of your choice; these modules should be found at opencv_contrib-master/modules. For simplicity, you can rename this directory as contrib and copy it directly inside the sources directory of the main package. Note that you can also pick the extra modules of your choice and only save them; however, you will probably find it easier, at this point, to simply keep everything.

You are now ready to proceed with the installation. To build the OpenCV binaries, it is highly recommended that you use the CMake tool. CMake is another open source software tool designed to control the compilation process of a software system using platform-independent configuration files. It generates the makefiles or solution files needed for compiling a software library in your environment. Therefore, you have to download and install CMake. Also see the There's more... section of this recipe for an additional software package, the Visualization Toolkit (VTK), that you may want to install before compiling the library.

You can run cmake using a command-line interface, but it is easier to use CMake with its graphical interface (cmake-gui). In the latter case, all you need to do is specify the folder containing the OpenCV library source and the one that will contain the binaries. Now click on Configure and select the compiler of your choice.

Once this initial configuration is completed, CMake will provide you with a number of configuration options. You have to decide, for example, whether you want to have the documentation installed or whether you wish to have some additional libraries installed. Unless you know what you are doing, it is probably better to leave the default options as they are. However, since we want to include the extra modules, we have to specify the directory where they can be found (this is the OPENCV_EXTRA_MODULES_PATH option).

Once the extra module path is specified, click on Configure again. You are now ready to generate the project files by clicking on the Generate button. These files will allow you to compile the library. This is the last step of the installation process, which will make the library ready to be used in your development environment. For example, if you select MS Visual Studio, then all you need to do is open the top-level solution file that CMake has created for you (the OpenCV.sln file). You then select the INSTALL project (under CMakeTargets) and issue the Build command (use right-click).

To get both a Release and a Debug build, you will have to run the compilation process twice, once for each configuration. If everything goes well, an install directory will be created (under build). This directory will contain all the binary files of the OpenCV library to be linked with your application as well as the dynamic library files that your executables have to call at runtime. Make sure you set your system's PATH environment variable (from Control Panel) such that your operating system is able to find the .dll files when you run your applications (for example, C:\opencv-3.2\build\install\x64\vc14\bin). You should also define the OPENCV_DIR environment variable so that it points to the install directory. This way, CMake will be able to find the library when configuring future projects.

In Linux environments, you can use CMake to generate the required Makefiles; you then complete the installation by executing a sudo make install command. Alternatively, you could also use the packaging tool apt-get, which can automatically perform a complete installation of the library. For Mac OS, you should use the Homebrew package manager. Once it is installed, you just have to type brew install opencv3 --with-contrib in order to have the complete library installed (run brew info opencv3 to view all possible options).

How it works...

OpenCV is a library that is in constant evolution. With version 3, the library continues to expand, offering many new functionalities with improved performance. The move to a full C++ API, which was initiated in version 2, is now almost complete, and more uniform interfaces have been implemented. One of the major changes introduced in this new version is the restructuring of the modules of the library in order to facilitate its distribution. In particular, a separate repository containing the most recent algorithms has been created. This contrib repository also contains non-free algorithms that are subject to specific licenses. The idea is for OpenCV to be able to offer state-of-the-art functionalities that developers and researchers want to share while still being able to offer a very stable and well-maintained core API. The main modules are therefore the ones you get when you download the library from the OpenCV website. The extra modules must be downloaded directly from the development repository hosted on GitHub. Since these extra modules are in constant development, you should expect more frequent changes to the algorithms they contain.

The OpenCV library is divided into several modules. For example, the opencv_core module contains the core functionalities of the library; the opencv_imgproc module includes the main image processing functions; the opencv_highgui module offers user interface functions and, through its header, gives access to the image and video reading and writing functions (which OpenCV 3 places in the opencv_imgcodecs and opencv_videoio modules); and so on. To use a particular module, you have to include the corresponding top-level header file. For instance, most applications that use OpenCV start with the following declarations:

    #include <opencv2/core.hpp> 
    #include <opencv2/imgproc.hpp> 
    #include <opencv2/highgui.hpp> 

As you learn to work with OpenCV, you will discover more and more functionalities available in its numerous modules.

There's more...

The OpenCV website contains detailed instructions on how to install the library. It also contains complete online documentation that includes several tutorials on the different components of the library.

The Visualization Toolkit and the cv::viz module

In some applications, computer vision is used to reconstruct the 3D information of a scene from images. When working with 3D data, it is often useful to be able to visualize the results in some 3D virtual world. As you will learn in Chapter 11, Reconstructing 3D Scenes, the cv::viz module offers many useful functions that allow you to visualize scene objects and cameras in 3D. However, this module is built on top of another open source library: VTK. Therefore, if you want to use the cv::viz module, you need to install VTK on your machine before compiling OpenCV.

VTK is available from its official website. All you have to do is download the library and use CMake in order to create the binaries for your development environment. In this book, we used version 6.3.0. In addition, you should define the VTK_DIR environment variable, pointing to the directory containing the built files. Also, in the configuration options proposed during the OpenCV installation process with CMake, make sure that the WITH_VTK option is checked.

The OpenCV developer site

OpenCV is an open source project that welcomes user contributions. The library is hosted on GitHub, a web service that offers version control and source code management tools based on Git. Among other things, the developer site on GitHub gives you access to the currently developed version of OpenCV. The community uses Git as their version control system. Git is also a free open source software system; it is probably the best tool you can use to manage your own source code.


Downloading the example source code of this book: The source code files of the examples presented in this cookbook are also hosted on GitHub. Please visit the author's GitHub repository to obtain the latest version of the code. Note that you can download the example code files for all the Packt books you have purchased from your Packt account. If you have purchased this book elsewhere, you can register on the Packt website to have the files e-mailed directly to you.


Loading, displaying, and saving images

It is now time to run your first OpenCV application. Since OpenCV is about processing images, this task will show you how to perform the most fundamental operations needed in the development of imaging applications. These are loading an input image from a file, displaying an image in a window, applying a processing function, and saving the output image.

Getting ready

Using your favorite IDE (for example, MS Visual Studio or Qt), create a new console application with a main function that is ready to be filled.

How to do it...

The first thing to do is to include the header files, declaring the classes and functions you wish to use. Here, we simply want to display an image, so we need the core header that declares the image data structure and the highgui header file that contains all the graphical interface functions:

    #include <iostream>             // for std::cout 
    #include <opencv2/core.hpp> 
    #include <opencv2/highgui.hpp> 

Our main function starts by declaring a variable that will hold the image. Under OpenCV, this is done by defining an object of the cv::Mat class:

    cv::Mat image; // create an empty image 

This definition creates an image of size 0x0. This can be confirmed by accessing the cv::Mat size attributes:

    std::cout << "This image is " << image.rows << " x "  
              << image.cols << std::endl; 

Next, a simple call to the reading function will read an image from a file, decode it, and allocate the memory:

    image=  cv::imread("puppy.bmp"); // read an input image 

You are now ready to use this image. However, you should first check whether the image has been correctly read (imread returns an empty image if the file is not found, is corrupted, or is not in a recognizable format). The validity of the image is tested using the following code:

    if (image.empty()) {  // error handling 
      // no image has been created... 
      // possibly display an error message 
      // and quit the application  
    } 

The empty method returns true if no image data has been allocated.

The first thing you might want to do with this image is display it. You can do this using the functions of the highgui module. Start by declaring the window on which you want to display the images, then specify the image to be shown on this special window:

    // define the window (optional) 
    cv::namedWindow("Original Image"); 
    // show the image  
    cv::imshow("Original Image", image); 

As you can see, the window is identified by a name. You can reuse this window to display another image later, or you can create multiple windows with different names. When you run this application, you will see a window showing the image.

Now, you would normally apply some processing to the image. OpenCV offers a wide selection of processing functions, and several of them are explored in this book. Let's start with a very simple one that flips an image horizontally. Several image transformations in OpenCV can be performed in-place, meaning the transformation is applied directly on the input image (no new image is created). This is the case for the flipping method. However, we can always create another matrix to hold the output result, and this is what we will do:

    cv::Mat result; // we create another empty image 
    cv::flip(image,result,1); // positive for horizontal 
                              // 0 for vertical, 
                              // negative for both 

The result is displayed on another window:

    cv::namedWindow("Output Image");    // the output window 
    cv::imshow("Output Image", result); 

Since this is a console application that will terminate when it reaches the end of the main function, we add an extra highgui function to wait for a user key before we end the program:

    cv::waitKey(0); // 0 to indefinitely wait for a key pressed 
                    // specifying a positive value will wait for 
                    // the given amount of msec 

You can then see that the output image is displayed in a distinct window.

Finally, you will probably want to save the processed image on your disk. This is done using the imwrite function, which is also made available by the highgui header:

    cv::imwrite("output.bmp", result); // save result 

The file extension determines which codec will be used to save the image. Other popular supported image formats are JPG, TIFF, and PNG.

How it works...

All classes and functions in the C++ API of OpenCV are defined within the cv namespace. You have two ways to access them. First, precede the main function's definition with the following declaration:

    using namespace cv; 

Alternatively, prefix all OpenCV class and function names with the namespace specification, that is, cv::, as we will do in this book. The use of this prefix makes the OpenCV classes and functions easier to identify within your code.

The highgui module contains a set of functions that allow you to easily visualize and interact with your images. When you load an image with the imread function, you also have the option to read it as a gray-level image. This is very advantageous since several computer vision algorithms require gray-level images. Converting an input color image on the fly as you read it will save you time and minimize your memory usage. This can be done as follows:

    // read the input image as a gray-scale image 
    image=  cv::imread("puppy.bmp", cv::IMREAD_GRAYSCALE); 

This will produce an image made of unsigned bytes (unsigned char in C++) that OpenCV designates with the constant CV_8U. Alternatively, it is sometimes necessary to read an image as a three-channel color image even if it has been saved as a gray-level image. This can be achieved by calling the imread function with a positive second argument:

    // read the input image as a 3-channel color image 
    image=  cv::imread("puppy.bmp", cv::IMREAD_COLOR); 

This time, an image made of 3 bytes per pixel will be created and designated as CV_8UC3 in OpenCV. Of course, if your input image has been saved as a gray-level image, all three channels will contain the same value. Finally, if you wish to read the image in the format in which it has been saved, then simply input a negative value (cv::IMREAD_UNCHANGED) as the second argument. The number of channels in an image can be checked using the channels method:

    std::cout << "This image has "  
              << image.channels() << " channel(s)"; 

Pay attention when you open an image with imread without specifying a full path (as we did here). In such a case, the default directory will be used. When you run your application from the console, this directory is obviously the current console's directory. However, if you run the application directly from your IDE, the default directory will most often be the one that contains your project file. Consequently, make sure that your input image file is located in the right directory.

When you use imshow to display an image made up of integers (designated as CV_16U for 16-bit unsigned integers or as CV_32S for 32-bit signed integers), the pixel values of this image will be divided by 256 first. This is done in an attempt to make it displayable with 256 gray shades. Similarly, an image made up of floating-point values will be displayed by assuming a range of possible values between 0.0 (displayed as black) and 1.0 (displayed as white). Values outside this defined range are displayed in white (for values above 1.0) or black (for values below 0.0).

The highgui module is very useful to build quick prototype applications. When you are ready to produce a finalized version of your application, you will probably want to use the GUI module offered by your IDE in order to build an application with a more professional look.
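
The display rules just described amount to simple arithmetic. The following sketch mimics them in plain C++; it is not the actual imshow implementation, the two function names are hypothetical, and the rounding applied to floating-point values is an assumption made for the illustration.

```cpp
#include <algorithm>
#include <cstdint>

// 16-bit unsigned value: divided by 256 to obtain one of 256 gray shades.
std::uint8_t displayLevelFrom16U(std::uint16_t value) {
    return static_cast<std::uint8_t>(value / 256);
}

// Floating-point value: the range [0.0, 1.0] is mapped to [0, 255];
// values outside the range are clamped to black or white.
std::uint8_t displayLevelFromFloat(float value) {
    const float clamped = std::min(1.0f, std::max(0.0f, value));
    return static_cast<std::uint8_t>(clamped * 255.0f + 0.5f);  // assumed rounding
}
```

This makes it clear why a floating-point image whose values span 0 to 255 appears almost entirely white: it should first be rescaled to the 0.0 to 1.0 range.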

Here, our application uses both input and output images. As an exercise, you should rewrite this simple program such that it takes advantage of the function's in-place processing, that is, by not declaring the output image and writing the following instead:

    cv::flip(image,image,1); // in-place processing 

There's more...

The highgui module contains a rich set of functions that help you interact with your images. Using these, your applications can react to mouse or key events. You can also draw shapes and write text on images.

Clicking on images

You can program your mouse to perform specific operations when it is over one of the image windows you created. This is done by defining an appropriate callback function. A callback function is a function that you do not explicitly call but which is called by your application in response to specific events (here, the events that concern the mouse interacting with an image window). To be recognized by applications, callback functions need to have a specific signature and must be registered. In the case of a mouse event handler, the callback function must have the following signature:

    void onMouse( int event, int x, int y, int flags, void* param); 

The first parameter is an integer that is used to specify which type of mouse event has triggered the call to the callback function. The next two parameters are simply the pixel coordinates of the mouse location when the event has occurred. The flags are used to determine which button was pressed when the mouse event was triggered. Finally, the last parameter is used to send an extra parameter to the function in the form of a pointer to any object. This callback function can be registered in the application through the following call:

    cv::setMouseCallback("Original Image", onMouse,  
                         reinterpret_cast<void*>(&image)); 

In this example, the onMouse function is associated with the image window called Original Image, and the address of the displayed image is passed as an extra parameter to the function. Now, if we define the onMouse callback function as shown in the following code, then each time the mouse is clicked, the value of the corresponding pixel will be displayed on the console (here, we assume that it is a gray-level image):

    void onMouse( int event, int x, int y, int flags, void* param)  { 
      cv::Mat *im= reinterpret_cast<cv::Mat*>(param); 
      switch (event) {  // dispatch the event 
        case cv::EVENT_LBUTTONDOWN: // left mouse button down event 
          // display pixel value at (x,y) 
          std::cout << "at (" << x << "," << y << ") value is: "  
                    << static_cast<int>(               
                            im->at<uchar>(cv::Point(x,y))) << std::endl; 
          break; 
      } 
    } 

Note that in order to obtain the pixel value at (x,y), we used the at method of the cv::Mat object; this is discussed in Chapter 2, Manipulating Pixels. Other possible events that can be received by the mouse event callback function include cv::EVENT_MOUSEMOVE, cv::EVENT_LBUTTONUP, cv::EVENT_RBUTTONDOWN, and cv::EVENT_RBUTTONUP.

Drawing on images

OpenCV also offers a few functions to draw shapes and write text on images. Examples of basic shape-drawing functions are circle, ellipse, line, and rectangle. The following is an example of how to use the circle function:

    cv::circle(image,                // destination image  
               cv::Point(155,110),   // center coordinate 
               65,                   // radius   
               0,                    // color (here black) 
               3);                   // thickness 

The cv::Point structure is often used in OpenCV methods and functions to specify a pixel coordinate. Note that here we assume that the drawing is done on a gray-level image; this is why the color is specified with a single integer. In the next recipe, you will learn how to specify a color value for color images using the cv::Scalar structure. It is also possible to write text on an image. This can be done as follows:

    cv::putText(image,                    // destination image 
                "This is a dog.",         // text 
                cv::Point(40,200),        // text position 
                cv::FONT_HERSHEY_PLAIN,   // font type 
                2.0,                      // font scale 
                255,                      // text color (here white) 
                2);                       // text thickness 

Calling these two functions on our test image draws the circle and the text directly on it.

Note that you have to include the top-level module header opencv2/imgproc.hpp for these examples to work.

See also

  • The cv::Mat class is the data structure that is used to hold your images (and obviously, other matrix data). This data structure is at the core of all OpenCV classes and functions; the next recipe offers a detailed explanation of this data structure.


Exploring the cv::Mat data structure

In the previous recipe, you were introduced to the cv::Mat data structure. As mentioned, this is a key component of the library. It is used to manipulate images and matrices (in fact, an image is a matrix from a computational and mathematical point of view). Since you will be using this data structure extensively in your application development processes, it is imperative that you become familiar with it. Notably, in this recipe, you will learn that this data structure incorporates an elegant memory management mechanism.

How to do it...

Let's write the following test program that will allow us to test the different properties of the cv::Mat data structure:

    #include <iostream> 
    #include <opencv2/core.hpp> 
    #include <opencv2/highgui.hpp> 

    // test function that creates an image 
    cv::Mat function() { 
       // create image 
       cv::Mat ima(500,500,CV_8U,50); 
       // return it 
       return ima; 
    } 

    int main() { 
      // create a new image made of 240 rows and 320 columns 
      cv::Mat image1(240,320,CV_8U,100); 
      cv::imshow("Image", image1); // show the image 
      cv::waitKey(0); // wait for a key pressed 
      // re-allocate a new image 
      // (the size used here is only an example) 
      image1.create(200,200,CV_8U); 
      image1= 200; // set all pixels to 200 
      cv::imshow("Image", image1); // show the image 
      cv::waitKey(0); // wait for a key pressed 
      // create a red color image 
      // channel order is BGR 
      cv::Mat image2(240,320,CV_8UC3,cv::Scalar(0,0,255)); 
      // or: 
      // cv::Mat image2(cv::Size(320,240),CV_8UC3); 
      // image2= cv::Scalar(0,0,255); 
      cv::imshow("Image", image2); // show the image 
      cv::waitKey(0); // wait for a key pressed 
      // read an image 
      cv::Mat image3=  cv::imread("puppy.bmp");  
      // all these images point to the same data block 
      cv::Mat image4(image3); 
      image1= image3; 
      // these images are new copies of the source image 
      cv::Mat image5= image3.clone(); 
      // transform the image for testing 
      cv::flip(image3,image3,1); 
      // check which images have been affected by the processing 
      cv::imshow("Image 3", image3);  
      cv::imshow("Image 1", image1);  
      cv::imshow("Image 2", image2);  
      cv::imshow("Image 4", image4);  
      cv::imshow("Image 5", image5);  
      cv::waitKey(0); // wait for a key pressed 
      // get a gray-level image from a function 
      cv::Mat gray= function(); 
      cv::imshow("Image", gray); // show the image 
      cv::waitKey(0); // wait for a key pressed 
      // read the image in gray scale 
      image1= cv::imread("puppy.bmp", cv::IMREAD_GRAYSCALE);  
      cv::imshow("Image", image1); // show the image 
      cv::waitKey(0); // wait for a key pressed 
      return 0; 
    } 

Run this program and take a look at the images it produces.

How it works...

The cv::Mat data structure is essentially made up of two parts: a header and a data block. The header contains all of the information associated with the matrix (size, number of channels, data type, and so on). The previous recipe showed you how to access some of the attributes of this structure contained in its header (for example, by using cols, rows, or channels). The data block holds all the pixel values of an image. The header contains a pointer variable that points to this data block; it is the data attribute. An important property of the cv::Mat data structure is the fact that the memory block is only copied when explicitly requested. Indeed, most operations will simply copy the cv::Mat header such that multiple objects will point to the same data block. This memory management model makes your applications more efficient while avoiding memory leaks, but its consequences need to be understood. The examples in this recipe illustrate this fact.

By default, the cv::Mat objects have a zero size when they are created, but you can also specify an initial size as follows:

    // create a new image made of 240 rows and 320 columns 
    cv::Mat image1(240,320,CV_8U,100); 

In this case, you also need to specify the type of each matrix element (CV_8U here, which corresponds to 1-byte pixel (grayscale) images). The U letter here means it is unsigned. You can also declare signed numbers using S. For a color image, you would specify three channels (CV_8UC3). You can also declare 16-bit integers (signed or unsigned) and 32-bit signed integers (for example, CV_16SC3 or CV_32S). You also have access to 32-bit and 64-bit floating-point numbers (for example, CV_32F).

Each element of an image (or a matrix) can be composed of more than one value (for example, the three channels of a color image); therefore, OpenCV has introduced a simple data structure that is used when pixel values are passed to functions. This is the cv::Scalar structure, which is generally used to hold one or three values. For example, to create a color image initialized with red pixels, write the following code:

    // create a red color image 
    // channel order is BGR 
    cv::Mat image2(240,320,CV_8UC3,cv::Scalar(0,0,255)); 

Similarly, the initialization of the gray-level image could have also been done using this structure by writing cv::Scalar(100).

The image size often needs to be passed to functions as well. We have already mentioned that the cols and rows attributes can be used to get the dimensions of a cv::Mat instance. The size information can also be provided through the cv::Size structure that simply contains the width and height of the matrix. Note the order: cv::Size takes the width first and then the height, whereas the cv::Mat constructor used previously takes the number of rows first and then the number of columns. The size() method allows you to obtain the current matrix size. This is the format that is used in many methods where a matrix size must be specified.

For example, an image could be created as follows:

    // create a non-initialized color image  
    cv::Mat image2(cv::Size(320,240),CV_8UC3); 

The data block of an image can always be allocated or reallocated using the create method. When an image has already been allocated, its old content is deallocated first. For reasons of efficiency, if the new proposed size and type match the existing size and type, then no new memory allocation is performed:

    // re-allocate a new image 
    // (only if size or type are different) 

When no more references point to a given cv::Mat object, the allocated memory is automatically released. This is very convenient because it avoids the common memory leak problems often associated with dynamic memory allocation in C++. This is a key mechanism in OpenCV (introduced in version 2) that is accomplished by having the cv::Mat class implement reference counting and shallow copy. Therefore, when an image is assigned to another one, the image data (that is, the pixels) is not copied; both images will point to the same memory block. This also applies to images passed or returned by value. A reference count is kept such that the memory will be released only when all the references to the image are destructed or assigned to another image:

    // all these images point to the same data block 
    cv::Mat image4(image3); 
    image1= image3; 

Any transformation applied to one of the preceding images will also affect the other images. If you wish to create a deep copy of the content of an image, use the copyTo method. In this case, the create method is called on the destination image. Another method that produces a copy of an image is the clone method, which creates a new identical image as follows:

    // these images are new copies of the source image 
    cv::Mat image5= image3.clone(); 

In the example of this recipe, we applied a transformation to image3. The other images also contain this image; some of them share the same image data, while others hold a copy of this image. Check the displayed images and find out which ones were affected by the image3 transformation.

If you need to copy an image into another image that does not necessarily have the same data type, use the convertTo method:

    // convert the image into a floating point image [0,1] 

In this example, the source image is copied into a floating-point image. The method includes two optional parameters: a scaling factor and an offset. Note that both the images must, however, have the same number of channels.

The allocation model for the cv::Mat objects also allows you to safely write functions (or class methods) that return an image:

    cv::Mat function() { 
      // create image 
      cv::Mat ima(240,320,CV_8U,cv::Scalar(100)); 
      // return it 
      return ima; 
    } 

We can also call this function from our main function as follows:

      // get a gray-level image 
      cv::Mat gray= function(); 

If we do this, the gray variable will then hold the image created by the function without extra memory allocation. Indeed, as we explained, only a shallow copy of the image will be transferred from the returned cv::Mat instance to the gray image. When the ima local variable goes out of scope, this variable is deallocated. However, since the associated reference counter indicates that its internal image data is being referred to by another instance (that is, the gray variable), its memory block is not released.

It's worth noting that in the case of classes, you should be careful and not return image class attributes. Here is an example of an error-prone implementation:

    class Test { 
      // image attribute 
      cv::Mat ima; 
      public: 
        // constructor creating a gray-level image 
        Test() : ima(240,320,CV_8U,cv::Scalar(100)) {} 
        // method returns a class attribute, not a good idea... 
        cv::Mat method() { return ima; } 
    }; 

Here, if a function calls the method of this class, it obtains a shallow copy of the image attribute. If this copy is modified later, the class attribute will also be surreptitiously modified, which can affect the subsequent behavior of the class (and vice versa). This is a violation of the important principle of encapsulation in object-oriented programming. To avoid these kinds of errors, you should instead return a clone of the attribute.

There's more...

When you are manipulating the cv::Mat class, you will discover that OpenCV also includes several other related classes. It will be important for you to become familiar with them.

The input and output arrays

If you look at the OpenCV documentation, you will see that many methods and functions accept parameters of the cv::InputArray type as an input. This type is a simple proxy class introduced to generalize the concept of arrays in OpenCV and thus avoid the duplication of several versions of the same method or function with different input parameter types. It basically means that you can supply either a cv::Mat object or other compatible types as an argument. Since it is declared as an input array, you have the guarantee that your data structure will not be modified by the function. It is interesting to know that cv::InputArray can also be constructed from the popular std::vector class. This means that such objects can be used as input parameters to OpenCV methods and functions (however, never use this class inside your classes and functions). Other compatible types are cv::Scalar and cv::Vec; the latter structure will be presented in the next chapter. There is also a cv::OutputArray proxy class that is used to designate parameters that correspond to an image that is returned by a function or method.

Manipulating small matrices

When writing your applications, you might have to manipulate small matrices. You can then use the cv::Matx template class and its subclasses. For example, the following code declares a 3x3 matrix of double-precision floating-point numbers and a 3-element vector. These two are then multiplied together:

      // a 3x3 matrix of double 
      cv::Matx33d matrix(3.0, 2.0, 1.0, 
                         2.0, 1.0, 3.0, 
                         1.0, 2.0, 3.0); 
      // a 3x1 matrix (a vector) 
      cv::Matx31d vector(5.0, 1.0, 3.0); 
      // multiplication 
      cv::Matx31d result = matrix*vector; 

As you can see, the usual math operators can be applied to these matrices.

See also

  • The complete OpenCV documentation can be found at

  • Chapter 2, Manipulating Pixels, will show you how to efficiently access and modify the pixel values of an image represented by the cv::Mat class

  • The next recipe, Defining regions of interest, will explain how to define a region of interest inside an image


Defining regions of interest

Sometimes, a processing function needs to be applied only to a portion of an image. OpenCV incorporates an elegant and simple mechanism to define a subregion in an image and manipulate it as a regular image. This recipe will teach you how to define a region of interest inside an image.

Getting ready

Suppose we want to copy a small image onto a larger one. For example, let's say we want to insert the following logo into our test image:

To do this, a Region Of Interest (ROI) can be defined over which the copy operation can be applied. As we will see, the position of the ROI will determine where the logo will be inserted in the image.

How to do it...

The first step consists of defining the ROI. Once defined, the ROI can be manipulated as a regular cv::Mat instance. The key is that the ROI is indeed a cv::Mat object that points to the same data buffer as its parent image and has a header that specifies the coordinates of the ROI. Inserting the logo is then accomplished as follows:

    // define image ROI at image bottom-right 
    cv::Mat imageROI(image,  
              cv::Rect(image.cols-logo.cols,   // ROI coordinates 
                       image.rows-logo.rows, 
                       logo.cols,logo.rows));  // ROI size 
    // insert logo 
    logo.copyTo(imageROI); 

Here, image is the destination image and logo is the logo image (of a smaller size). The following image is then obtained by executing the previous code:

How it works...

One way to define an ROI is to use a cv::Rect instance. As the name indicates, it describes a rectangular region by specifying the position of the upper-left corner (the first two parameters of the constructor) and the size of the rectangle (the width and height are given in the last two parameters). In our example, we used the size of the image and the size of the logo in order to determine the position where the logo would cover the bottom-right corner of the image. Obviously, the ROI should always be completely inside the parent image.

The ROI can also be described using row and column ranges. A range is a continuous sequence from a start index to an end index (the start index is included, the end index is excluded). The cv::Range structure is used to represent this concept. Therefore, an ROI can be defined from two ranges; in our example, the ROI could have been equivalently defined as follows:

    imageROI= image(cv::Range(image.rows-logo.rows,image.rows),  
                    cv::Range(image.cols-logo.cols,image.cols)); 

In this case, the operator() function of cv::Mat returns another cv::Mat instance that can then be used in subsequent calls. Any transformation of the ROI will affect the original image in the corresponding area because the image and the ROI share the same image data. Since the definition of an ROI does not include the copying of data, it is executed in a constant amount of time, no matter the size of the ROI.

If you want to define an ROI made up of some lines of an image, the following call can be used:

    cv::Mat imageROI= image.rowRange(start,end); 

Similarly, for an ROI made up of some image columns, the following can be used:

    cv::Mat imageROI= image.colRange(start,end); 

There's more...

The OpenCV methods and functions include many optional parameters that are not discussed in the recipes of this book. When you wish to use a function for the first time, you should always take the time to look at the documentation to learn more about the possible options that the function offers. One very common option is the possibility to define image masks.

Using image masks

Some OpenCV operations allow you to define a mask that will limit the applicability of a given function or method, which is normally supposed to operate on all the image pixels. A mask is an 8-bit image that should be nonzero at all locations where you want an operation to be applied. At the pixel locations that correspond to the zero values of the mask, the image is untouched. For example, the copyTo method can be called with a mask. We can use it here to copy only the white portion of the logo shown previously, as follows:

    // define image ROI at image bottom-right 
    imageROI= image(cv::Rect(image.cols-logo.cols, 
                             image.rows-logo.rows, 
                             logo.cols,logo.rows)); 
    // use the logo as a mask (must be gray-level) 
    cv::Mat mask(logo); 
    // insert by copying only at locations of non-zero mask 
    logo.copyTo(imageROI,mask); 

The following image is obtained by executing the previous code:

The background of our logo was black (therefore, it had the value 0); this is why it was easy to use it as both the copied image and the mask. Of course, you can define the mask of your choice in your application; most OpenCV pixel-based operations give you the opportunity to use masks.

See also

  • The row and col methods will be used in the Scanning an image with neighbor access recipe of Chapter 2, Manipulating Pixels. These are a special case of the rowRange and colRange methods in which the range covers a single index (the end index is the start index plus one) in order to define a single-line or single-column ROI.

About the Author

  • Robert Laganiere

    Robert Laganiere is a professor at the University of Ottawa, Canada. He is also a faculty member of the VIVA research lab and is the coauthor of several scientific publications and patents in content-based video analysis, visual surveillance, driver-assistance, object detection, and tracking. He cofounded Visual Cortek, a video analytics start-up, which was later acquired by iWatchLife. He is also a consultant in computer vision and has assumed the role of chief scientist in a number of start-up companies, including Cognivue Corp, iWatchLife, and Tempo Analytics. Robert has a Bachelor of Electrical Engineering degree from Ecole Polytechnique in Montreal (1987), and M.Sc. and Ph.D. degrees from INRS-Telecommunications, Montreal (1996).
