This chapter will teach you the basic elements of OpenCV and will show you how to accomplish the most fundamental image processing tasks: reading, displaying, and saving images. However, before you start with OpenCV, you need to install the library. This is a simple process that is explained in the first recipe of this chapter.
All your computer vision applications will involve the processing of images. This is why OpenCV offers you a data structure to handle images and matrices. It is a powerful data structure with many useful attributes and methods. It also incorporates an advanced memory management model that greatly facilitates the development of applications. The last two recipes of this chapter will teach you how to use this important data structure of OpenCV.
OpenCV is an open source library for developing computer vision applications that can run on multiple platforms, such as Windows, Linux, Mac, Android, and iOS. It can be used in both academic and commercial applications under a BSD license that allows you to freely use, distribute, and adapt it. This recipe will show you how to install the library on your machine.
When you visit the OpenCV official website at http://opencv.org/, you will find the latest release of the library, the online documentation describing the Application Programming Interface (API), and many other useful resources on OpenCV.
From the OpenCV website, find the latest available downloads and select the one that corresponds to the platform of your choice (Windows, Linux/Mac, or iOS). Once the OpenCV package is downloaded, run the WinZip self-extractor and select the location of your choice. An opencv
directory will be created; it is a good idea to rename it in a way that will show which version you are using (for example, in Windows, your final directory could be C:\opencv-3.2
). This directory will contain a collection of files and directories that constitute the library. Notably, you will find the sources
directory that will contain all the source files (yes, it is open source!).
In order to complete the installation of the library and have it ready for use, you need to take an important step: generate the binary files of the library for the environment of your choice. This is indeed the point where you have to make a decision on the target platform you wish to use to create your OpenCV applications. Which operating system do you prefer to use? Which compiler should you select? Which version? 32-bit or 64-bit? As you can see, there are many possible options, and this is why you have to build the library that fits your needs.
The Integrated Development Environment (IDE) you will use in your project development will also guide you to make these choices. Note that the library package also comes with precompiled binaries that you can directly use if they correspond to your situation (check the build
directory adjacent to the sources
directory). If one of the precompiled binaries satisfies your requirements, then you are ready to go.
One important remark, however: since version 3, OpenCV has been split into two major components. The first one is the main OpenCV source repository that includes the mature algorithms. This is the one you have downloaded. A separate contribution repository also exists, and it contains the new computer vision algorithms recently added by the OpenCV contributors. If your plan is to use only the core functions of OpenCV, you do not need the contrib
package. But if you want to play with the latest state-of-the-art algorithms, then there is a good chance that you will need this extra module. As a matter of fact, this cookbook will show you how to use several of these advanced algorithms. You therefore need the contrib
modules to follow the recipes of this book. So you have to go to
https://github.com/opencv/opencv_contrib
and download OpenCV's extra modules (download the ZIP file). You can unzip the extra modules into the directory of your choice; these modules should be found at opencv_contrib-master/modules
. For simplicity, you can rename this directory as contrib
and copy it directly inside the sources
directory of the main package. Note that you can also pick the extra modules of your choice and only save them; however, you will probably find it easier, at this point, to simply keep everything.
You are now ready to proceed with the installation. To build the OpenCV binaries, it is highly suggested that you use the CMake tool, available at
http://cmake.org
. CMake is another open source software tool designed to control the compilation process of a software system using platform-independent configuration files. It generates the required makefile
or solution
files needed for compiling a software library in your environment. Therefore, you have to download and install CMake. Also see the There's more... section of this recipe for an additional software package, the Visualization Toolkit (VTK), that you may want to install before compiling the library.
You can run cmake
using a command-line interface, but it is easier to use CMake
with its graphical interface (cmake-gui). In the latter case, all you need to do is specify the folder containing the OpenCV library source and the one that will contain the binaries. Now click on Configure and select the compiler of your choice:

Once this initial configuration is completed, CMake will provide you with a number of configuration options. You have to decide, for example, whether you want to have the documentation installed or whether you wish to have some additional libraries installed. Unless you know what you are doing, it is probably better to leave the default options as they are. However, since we want to include the extra modules, we have to specify the directory where they can be found:

Once the extra module path is specified, click on Configure again. You are now ready to generate the project files by clicking on the Generate button. These files will allow you to compile the library. This is the last step of the installation process, which will make the library ready to be used in your development environment. For example, if you select MS Visual Studio, then all you need to do is open the top-level solution file that CMake has created for you (the OpenCV.sln
file). You then select the INSTALL project (under CMakeTargets) and issue the Build command (use right-click).

To get both a Release and a Debug build, you will have to run the compilation process twice, once for each configuration. If everything goes well, you will have an install
directory (under build
) created. This directory will contain all the binary
files of the OpenCV library to be linked with your application as well as the dynamic library files that your executables have to call at runtime. Make sure you set your system's PATH
environment variable (from Control Panel) such that your operating system would be able to find the .dll
files when you run your applications (for example, C:\opencv-3.2\build\install\x64\vc14\bin
). You should also define the environment variable, OPENCV_DIR
pointing to the INSTALL
directory. This way, CMake will be able to find the library when configuring future projects.
In Linux environments, you can use CMake to generate the required Makefiles
; you then complete the installation by executing a sudo make install
command. Alternatively, you could also use the packaging tool apt-get
which can automatically perform a complete installation of the library. For Mac OS, you should use the Homebrew
package manager. Once installed, you just have to type brew install opencv3 --with-contrib
in order to have the complete library installed (run brew info opencv3
to view all possible options).
OpenCV is a library that is in constant evolution. With version 3, the library continues to expand, offering many new functionalities with enhanced performance. The move to having a full C++ API, which was initiated in version 2, is now almost complete, and more uniform interfaces have been implemented. One of the major changes introduced in this new version is the restructuring of the modules of the library in order to facilitate its distribution. In particular, a separate repository containing the most recent algorithms has been created. This contrib
repository also contains non-free algorithms that are subject to specific licenses. The idea is for OpenCV to be able to offer state-of-the-art functionalities that developers and researchers want to share while still being able to offer a very stable and well-maintained core API. The main modules are therefore the ones you get when you download the library at http://opencv.org/. The extra modules must be downloaded directly from the development repository hosted on GitHub (
https://github.com/opencv/
). Since these extra modules are in constant development, you should expect more frequent changes to the algorithms they contain.
The OpenCV library is divided into several modules. For example, the opencv_core
module contains the core functionalities of the library; the opencv_imgproc
module includes the main image processing functions; the opencv_highgui
module offers the image and video reading and writing functions along with some user interface functions; and so on. To use a particular module, you have to include the corresponding top-level header file. For instance, most applications that use OpenCV start with the following declarations:
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>
As you learn to work with OpenCV, you will discover more and more functionalities available in its numerous modules.
The OpenCV website at http://opencv.org/ contains detailed instructions on how to install the library. It also contains complete online documentation that includes several tutorials on the different components of the library.
In some applications, computer vision is used to reconstruct the 3D information of a scene from images. When working with 3D data, it is often useful to be able to visualize the results in some 3D virtual world. As you will learn in
Chapter 11
, Reconstructing 3D Scenes, the cv::viz
module offers many useful functions that allow you to visualize scene objects and cameras in 3D. However, this module is built on top of another open source library: VTK. Therefore, if you want to use the cv::viz
module, you need to install VTK on your machine before compiling OpenCV.
VTK is available at http://www.vtk.org/. All you have to do is download the library and use CMake in order to create the binaries for your development environment. In this book, we used version 6.3.0. In addition, you should define the VTK_DIR
environment variable, pointing to the directory containing the built files. Also, in the configuration options proposed during the OpenCV installation process with CMake, make sure that the WITH_VTK
option is checked.
OpenCV is an open source project that welcomes user contributions. The library is hosted on GitHub, a web service that offers version control and source code management tools based on Git. You can access the developer site at https://github.com/opencv/opencv/wiki. Among other things, you can access the currently developed version of OpenCV. The community uses Git as their version control system. Git is also a free open source software system; it is probably the best tool you can use to manage your own source code.
Note
Downloading the example source code of this book: The source code files of the examples presented in this cookbook are also hosted on GitHub. Please visit the author's repository at https://github.com/laganiere to obtain the latest version of the code. Note that you can download the example code files for all the Packt books you have purchased from your account at http://www.packtpub.com. If you have purchased this book elsewhere, you can visit http://www.packtpub.com/support and register there to have the files e-mailed directly to you.
The author's website (http://www.laganiere.name/) also presents step-by-step instructions on how to install the latest versions of the OpenCV library.
Visit https://git-scm.com/ and https://github.com/ to learn more about source code management.
It is now time to run your first OpenCV application. Since OpenCV is about processing images, this task will show you how to perform the most fundamental operations needed in the development of imaging applications. These are loading an input image from a file, displaying an image on a window, applying a processing function, and saving the output image.
Using your favorite IDE (for example, MS Visual Studio or Qt), create a new console application with a main
function that is ready to be filled.
The first thing to do is to include the header files, declaring the classes and functions you wish to use. Here, we simply want to display an image, so we need the core
header that declares the image data structure and the highgui
header file that contains all the graphical interface functions:
#include <opencv2/core.hpp>
#include <opencv2/highgui.hpp>
Our main function starts by declaring a variable that will hold the image. Under OpenCV, this is done by defining an object of the cv::Mat
class:
cv::Mat image; // create an empty image
This definition creates an image of size 0x0
. This can be confirmed by accessing the cv::Mat
size attributes:
std::cout << "This image is " << image.rows << " x " << image.cols << std::endl;
Next, a simple call to the reading function will read an image from a file, decode it, and allocate the memory:
image= cv::imread("puppy.bmp"); // read an input image
You are now ready to use this image. However, you should first check whether the image has been correctly read (an error will occur if the file is not found, is corrupted, or is not in a recognizable format). The validity of the image is tested using the following code:
if (image.empty()) { // error handling
  // no image has been created...
  // possibly display an error message
  // and quit the application
  ...
}
The empty
method returns true
if no image data has been allocated.
The first thing you might want to do with this image is display it. You can do this using the functions of the highgui
module. Start by declaring the window on which you want to display the images, then specify the image to be shown on this special window:
// define the window (optional)
cv::namedWindow("Original Image");
// show the image
cv::imshow("Original Image", image);
As you can see, the window is identified by a name. You can reuse this window to display another image later, or you can create multiple windows with different names. When you run this application, you will see an image window, as follows:

Now, you would normally apply some processing to the image. OpenCV offers a wide selection of processing functions, and several of them are explored in this book. Let's start with a very simple one that flips an image horizontally. Several image transformations in OpenCV can be performed in-place, meaning the transformation is applied directly on the input image (no new image is created). This is the case for the flipping method. However, we can always create another matrix to hold the output result, and this is what we will do:
cv::Mat result; // we create another empty image
cv::flip(image,result,1); // positive for horizontal
                          // 0 for vertical,
                          // negative for both
The result is displayed on another window:
cv::namedWindow("Output Image"); // the output window
cv::imshow("Output Image", result);
Since this is a console application that will terminate when it reaches the end of the main
function, we add an extra highgui
function to wait for a user key before we end the program:
cv::waitKey(0); // 0 to indefinitely wait for a key pressed
                // specifying a positive value will wait for
                // the given amount of msec
You can then see that the output image is displayed in a distinct window, as shown in the following screenshot:

Finally, you will probably want to save the processed image on your disk. This is done using the following highgui
function:
cv::imwrite("output.bmp", result); // save result
The file extension determines which codec will be used to save the image. Other popular supported image formats are JPG, TIFF, and PNG.
All classes and functions in the C++ API of OpenCV are defined within the cv
namespace. You have two ways to access them. First, precede the main
function's definition with the following declaration:
using namespace cv;
Alternatively, prefix all OpenCV class and function names with the namespace specification, that is, cv::
, as we will do in this book. The use of this prefix makes the OpenCV classes and functions easier to identify within your code.
The highgui
module contains a set of functions that allow you to easily visualize and interact with your images. When you load an image with the imread
function, you also have the option to read it as a gray-level image. This is very advantageous since several computer vision algorithms require gray-level images. Converting an input color image on the fly as you read it will save you time and minimize your memory usage. This can be done as follows:
// read the input image as a gray-scale image
image= cv::imread("puppy.bmp", cv::IMREAD_GRAYSCALE);
This will produce an image made of unsigned bytes (unsigned char
in C++) that OpenCV designates with the constant CV_8U
. Alternatively, it is sometimes necessary to read an image as a three-channel color image even if it has been saved as a gray-level image. This can be achieved by calling the imread
function with a positive second argument:
// read the input image as a 3-channel color image
image= cv::imread("puppy.bmp", cv::IMREAD_COLOR);
This time, an image made of 3 bytes per pixel will be created and designated as CV_8UC3
in OpenCV. Of course, if your input image has been saved as a gray-level image, all three channels will contain the same value. Finally, if you wish to read the image in the format in which it has been saved, then simply input a negative value as the second argument. The number of channels in an image can be checked using the channels
method:
std::cout << "This image has " << image.channels() << " channel(s)";
Pay attention when you open an image with imread
without specifying a full path (as we did here). In such a case, the default directory will be used. When you run your application from the console, this directory is obviously the current console's directory. However, if you run the application directly from your IDE, the default directory will most often be the one that contains your project file. Consequently, make sure that your input image file is located in the right directory.
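Because a missing file only shows up as an empty image, it can help to check that the file is reachable from the working directory before reading it. The following is a minimal sketch that uses only the C++ standard library; the helper name fileIsReadable is hypothetical and is not part of OpenCV.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper (not an OpenCV function): returns true if the file
// can be opened for reading. A relative name is resolved against the
// current working directory of the process.
bool fileIsReadable(const std::string& filename) {
    std::ifstream file(filename.c_str(), std::ios::binary);
    return file.is_open();
}
```

Calling fileIsReadable("puppy.bmp") before cv::imread lets you report a wrong working directory separately from a corrupted or unsupported file.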
When you use imshow
to display an image made up of integers (designated as CV_16U
for 16-bit unsigned integers or as CV_32S
for 32-bit signed integers), the pixel values of this image will be divided by 256
first. This is done in an attempt to make it displayable with 256
gray shades. Similarly, an image made up of floating points will be displayed by assuming a range of possible values between 0.0
(displayed as black) and 1.0
(displayed as white). Values outside this defined range are displayed in white (for values above 1.0
) or black (for values below 0.0
).
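The display rules described above can be sketched as two small functions. This is a toy model of the documented behavior, not OpenCV's code; the function names are hypothetical, and rounding to the nearest level for floating-point values is an assumption.

```cpp
#include <cstdint>

// Toy model of the display mapping; these helpers are hypothetical
// and are not OpenCV functions.

// 16-bit unsigned pixel: divided by 256 to fit into 256 gray shades.
int displayLevelFrom16U(std::uint16_t value) {
    return value / 256;
}

// Floating-point pixel: the range [0.0, 1.0] is stretched to [0, 255];
// values outside that range are clamped to black or white.
int displayLevelFromFloat(float value) {
    if (value <= 0.0f) return 0;
    if (value >= 1.0f) return 255;
    return static_cast<int>(value * 255.0f + 0.5f);  // round to nearest (assumption)
}
```

In practice, this means a 16-bit image whose values stay below 256 will appear completely black, and a floating-point image holding values in the 0 to 255 range will appear almost entirely white unless it is rescaled first.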
The highgui
module is very useful to build quick prototypal applications. When you are ready to produce a finalized version of your application, you will probably want to use the GUI module offered by your IDE in order to build an application with a more professional look.
Here, our application uses both input and output images. As an exercise, you should rewrite this simple program such that it takes advantage of the function's in-place processing, that is, by not declaring the output image and writing it instead:
cv::flip(image,image,1); // in-place processing
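To see what in-place processing means at the level of the pixel buffer, here is a minimal sketch of a horizontal flip on a row-major gray-level image stored in a std::vector. It is only an illustration under these assumptions; the helper name is hypothetical and this is not how cv::flip is implemented.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper, not OpenCV code: mirrors a row-major gray-level
// image left to right without allocating a second buffer.
void flipHorizontalInPlace(std::vector<unsigned char>& pixels,
                           std::size_t rows, std::size_t cols) {
    for (std::size_t r = 0; r < rows; ++r) {
        unsigned char* row = pixels.data() + r * cols;
        std::reverse(row, row + cols);  // reverse the pixels of row r
    }
}
```

The input buffer is overwritten, so the original content is lost; keeping a separate output image, as done earlier in this recipe, is the alternative when the original is still needed.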
The highgui
module contains a rich set of functions that help you interact with your images. Using these, your applications can react to mouse or key events. You can also draw shapes and write text on images.
You can program your mouse to perform specific operations when it is over one of the image windows you created. This is done by defining an appropriate callback function. A callback function is a function that you do not explicitly call but which is called by your application in response to specific events (here, the events that concern the mouse interacting with an image window). To be recognized by applications, callback functions need to have a specific signature and must be registered. In the case of a mouse event handler, the callback function must have the following signature:
void onMouse( int event, int x, int y, int flags, void* param);
The first parameter is an integer that is used to specify which type of mouse event has triggered the call to the callback function. The next two parameters are simply the pixel coordinates of the mouse location when the event has occurred. The flags are used to determine which button was pressed when the mouse event was triggered. Finally, the last parameter is used to send an extra parameter to the function in the form of a pointer to any object. This callback function can be registered in the application through the following call:
cv::setMouseCallback("Original Image", onMouse, reinterpret_cast<void*>(&image));
In this example, the onMouse
function is associated with the image window called Original Image, and the address of the displayed image is passed as an extra parameter to the function. Now, if we define the onMouse
callback function as shown in the following code, then each time the mouse is clicked, the value of the corresponding pixel will be displayed on the console (here, we assume that it is a gray-level image):
void onMouse( int event, int x, int y, int flags, void* param) {

  cv::Mat *im= reinterpret_cast<cv::Mat*>(param);

  switch (event) { // dispatch the event

    case cv::EVENT_LBUTTONDOWN: // left mouse button down event

      // display pixel value at (x,y)
      std::cout << "at (" << x << "," << y << ") value is: "
                << static_cast<int>(
                     im->at<uchar>(cv::Point(x,y))) << std::endl;
      break;
  }
}
Note that in order to obtain the pixel value at (x,y)
, we used the at
method of the cv::Mat
object; this is discussed in
Chapter 2
, Manipulating Pixels. Other possible events that can be received by the mouse event callback function include cv::EVENT_MOUSEMOVE
, cv::EVENT_LBUTTONUP
, cv::EVENT_RBUTTONDOWN
, and cv::EVENT_RBUTTONUP
.
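The registration mechanism with a void* extra parameter can be modeled with the standard library alone. The sketch below is a toy model, not OpenCV code: ToyWindow, ClickLog, onToyMouse, and TOY_LBUTTONDOWN are hypothetical names that only mirror the shape of the mouse callback signature shown above.

```cpp
// Toy model, not OpenCV code: a callback type with the same shape as the
// mouse handler, and a tiny "window" that stores it with its user data.
typedef void (*ToyMouseCallback)(int event, int x, int y, int flags, void* param);

struct ToyWindow {
    ToyMouseCallback callback;
    void* userData;

    ToyWindow() : callback(nullptr), userData(nullptr) {}

    // register a handler together with the pointer to hand back to it
    void setCallback(ToyMouseCallback cb, void* param) {
        callback = cb;
        userData = param;
    }

    // called by the "GUI" when an event occurs
    void dispatch(int event, int x, int y) {
        if (callback) callback(event, x, y, 0, userData);
    }
};

// example state recovered from the void* parameter
struct ClickLog {
    int count;
    int lastX;
    int lastY;
};

const int TOY_LBUTTONDOWN = 1;  // hypothetical event code

void onToyMouse(int event, int x, int y, int /*flags*/, void* param) {
    // the handler must know the real type behind the void* pointer
    ClickLog* log = static_cast<ClickLog*>(param);
    if (event == TOY_LBUTTONDOWN) {
        ++log->count;
        log->lastX = x;
        log->lastY = y;
    }
}
```

The object passed as the extra parameter must outlive the window that holds the pointer, since only its address is stored. This is also why the recipe passes the address of an image that stays alive for the whole duration of the main function.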
OpenCV also offers a few functions to draw shapes and write text on images. Examples of basic shape-drawing functions are circle
, ellipse
, line
, and rectangle
. The following is an example of how to use the circle
function:
cv::circle(image,               // destination image
           cv::Point(155,110),  // center coordinate
           65,                  // radius
           0,                   // color (here black)
           3);                  // thickness
The cv::Point
structure is often used in OpenCV methods and functions to specify a pixel coordinate. Note that here we assume that the drawing is done on a gray-level image; this is why the color is specified with a single integer. In the next recipe, you will learn how to specify a color value for color images using the cv::Scalar
structure. It is also possible to write text on an image. This can be done as follows:
cv::putText(image,                   // destination image
            "This is a dog.",        // text
            cv::Point(40,200),       // text position
            cv::FONT_HERSHEY_PLAIN,  // font type
            2.0,                     // font scale
            255,                     // text color (here white)
            2);                      // text thickness
Calling these two functions on our test image will then result in the following screenshot:

Note that you have to include the top-level module header opencv2/imgproc.hpp
for these examples to work.
In the previous recipe, you were introduced to the cv::Mat
data structure. As mentioned, this is a key component of the library. It is used to manipulate images and matrices (in fact, an image is a matrix from a computational and mathematical point of view). Since you will be using this data structure extensively in your application development processes, it is imperative that you become familiar with it. Notably, in this recipe, you will learn that this data structure incorporates an elegant memory management mechanism.
Let's write the following test program that will allow us to test the different properties of the cv::Mat
data structure:
#include <iostream>
#include <opencv2/core.hpp>
#include <opencv2/highgui.hpp>

// test function that creates an image
cv::Mat function() {
  // create image
  cv::Mat ima(500,500,CV_8U,50);
  // return it
  return ima;
}

int main() {
  // create a new image made of 240 rows and 320 columns
  cv::Mat image1(240,320,CV_8U,100);

  cv::imshow("Image", image1); // show the image
  cv::waitKey(0); // wait for a key pressed

  // re-allocate a new image
  image1.create(200,200,CV_8U);
  image1= 200;

  cv::imshow("Image", image1); // show the image
  cv::waitKey(0); // wait for a key pressed

  // create a red color image
  // channel order is BGR
  cv::Mat image2(240,320,CV_8UC3,cv::Scalar(0,0,255));

  // or:
  // cv::Mat image2(cv::Size(320,240),CV_8UC3);
  // image2= cv::Scalar(0,0,255);

  cv::imshow("Image", image2); // show the image
  cv::waitKey(0); // wait for a key pressed

  // read an image
  cv::Mat image3= cv::imread("puppy.bmp");

  // all these images point to the same data block
  cv::Mat image4(image3);
  image1= image3;

  // these images are new copies of the source image
  image3.copyTo(image2);
  cv::Mat image5= image3.clone();

  // transform the image for testing
  cv::flip(image3,image3,1);

  // check which images have been affected by the processing
  cv::imshow("Image 3", image3);
  cv::imshow("Image 1", image1);
  cv::imshow("Image 2", image2);
  cv::imshow("Image 4", image4);
  cv::imshow("Image 5", image5);
  cv::waitKey(0); // wait for a key pressed

  // get a gray-level image from a function
  cv::Mat gray= function();

  cv::imshow("Image", gray); // show the image
  cv::waitKey(0); // wait for a key pressed

  // read the image in gray scale
  image1= cv::imread("puppy.bmp", cv::IMREAD_GRAYSCALE);
  image1.convertTo(image2,CV_32F,1/255.0,0.0);

  cv::imshow("Image", image2); // show the image
  cv::waitKey(0); // wait for a key pressed

  return 0;
}
Run this program and take a look at the images it produces:

The cv::Mat
data structure is essentially made up of two parts: a header and a data block. The header contains all of the information associated with the matrix (size, number of channels, data type, and so on). The previous recipe showed you how to access some of the attributes of this structure contained in its header (for example, by using cols
, rows
, or channels
). The data block holds all the pixel values of an image. The header contains a pointer variable that points to this data block; it is the data
attribute. An important property of the cv::Mat
data structure is that the memory block is only copied when explicitly requested. Indeed, most operations will simply copy the cv::Mat
header such that multiple objects will point to the same data block. This memory management model makes your applications more efficient while avoiding memory leaks, but its consequences need to be understood. The examples of this recipe illustrate this fact.
By default, the cv::Mat
objects have a zero size when they are created, but you can also specify an initial size as follows:
// create a new image made of 240 rows and 320 columns
cv::Mat image1(240,320,CV_8U,100);
In this case, you also need to specify the type of each matrix element, CV_8U
here, which corresponds to 1-byte pixel (grayscale) images. The U
letter here means it is unsigned. You can also declare signed numbers using S
. For a color image, you would specify three channels (CV_8UC3
). You can also declare 16-bit integers (signed or unsigned) and 32-bit signed integers
(for example, CV_16SC3
). You also have access to 32-bit and 64-bit floating-point numbers (for example, CV_32F
).
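The type name tells you how much memory each pixel occupies: the depth gives the size of one channel value, and the channel count multiplies it. The following sketch makes this arithmetic explicit. The enum and function names are hypothetical stand-ins rather than OpenCV identifiers, and the image size computation assumes that rows are stored without padding.

```cpp
#include <cstddef>

// Hypothetical names for illustration; OpenCV itself expresses depth and
// channel count through constants such as CV_8U or CV_16SC3.
enum ToyDepth { DEPTH_8U, DEPTH_8S, DEPTH_16U, DEPTH_16S,
                DEPTH_32S, DEPTH_32F, DEPTH_64F };

// bytes used by one channel value of the given depth
std::size_t bytesPerChannel(ToyDepth depth) {
    switch (depth) {
        case DEPTH_8U:  case DEPTH_8S:  return 1;
        case DEPTH_16U: case DEPTH_16S: return 2;
        case DEPTH_32S: case DEPTH_32F: return 4;
        case DEPTH_64F:                 return 8;
    }
    return 0;  // not reached for valid depths
}

// bytes used by one pixel (all channels together)
std::size_t bytesPerPixel(ToyDepth depth, int channels) {
    return bytesPerChannel(depth) * static_cast<std::size_t>(channels);
}

// bytes needed for the pixels of a rows x cols image (no row padding assumed)
std::size_t imageBytes(int rows, int cols, ToyDepth depth, int channels) {
    return static_cast<std::size_t>(rows) * static_cast<std::size_t>(cols)
           * bytesPerPixel(depth, channels);
}
```

With these definitions, a 240 by 320 single-channel 8-bit image needs 76,800 bytes, and the three-channel version of the same size needs 230,400 bytes.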
Each element of an image (or a matrix) can be composed of more than one value (for example, the three channels of a color image); therefore, OpenCV has introduced a simple data structure that is used when pixel values are passed to functions. This is the cv::Scalar
structure, which is generally used to hold one or three values. For example, to create a color image initialized with red pixels, write the following code:
// create a red color image
// channel order is BGR
cv::Mat image2(240,320,CV_8UC3,cv::Scalar(0,0,255));
Similarly, the initialization of the gray-level image could have also been done using this structure by writing cv::Scalar(100)
.
The image size often needs to be passed to functions as well. We have already mentioned that the cols
and rows
attributes can be used to get the dimensions of a cv::Mat
instance. The size information can also be provided through the cv::Size
structure that simply contains the width and height of the matrix (in that order). The size()
method allows you to obtain the current matrix size. This is the format that is used in many methods where a matrix size must be specified.
For example, an image could be created as follows:
// create a non-initialized color image
cv::Mat image2(cv::Size(320,240),CV_8UC3);
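Note the argument order in the two ways of giving dimensions used in this recipe: the constructor taking two integers expects rows first and then columns, whereas cv::Size(320,240) lists the width first and then the height. The toy structures below (hypothetical stand-ins, not the OpenCV classes) only mirror that ordering to make the correspondence explicit.

```cpp
// Toy stand-ins, not the OpenCV classes: they only mirror the argument order.
struct ToySize {
    int width;
    int height;
    ToySize(int w, int h) : width(w), height(h) {}
};

struct ToyMatHeader {
    int rows;
    int cols;

    // (rows, cols) order, as in cv::Mat image1(240,320,...)
    ToyMatHeader(int r, int c) : rows(r), cols(c) {}

    // (width, height) order, as in cv::Size(320,240)
    explicit ToyMatHeader(const ToySize& s) : rows(s.height), cols(s.width) {}
};
```

Both ToyMatHeader(240,320) and ToyMatHeader(ToySize(320,240)) therefore describe the same image of 240 rows and 320 columns.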
The data block of an image can always be allocated or reallocated using the create
method. When an image has already been allocated, its old content is deallocated first. For reasons of efficiency, if the new proposed size and type match the already existing size and type, then no new memory allocation is performed:
// re-allocate a new image
// (only if size or type are different)
image1.create(200,200,CV_8U);
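The "allocate only when needed" rule can be illustrated with a small toy class. This is a simplified model and not OpenCV's implementation: ToyImage is a hypothetical name, the element type is fixed to one byte for brevity, and an allocation counter is added only so the behavior can be observed.

```cpp
#include <cstddef>
#include <vector>

// Toy model, not OpenCV's implementation: create() keeps the existing
// buffer when the requested size equals the current one.
struct ToyImage {
    int rows;
    int cols;
    std::vector<unsigned char> pixels;
    int allocations;  // counts how many times a buffer was allocated

    ToyImage() : rows(0), cols(0), allocations(0) {}

    void create(int newRows, int newCols) {
        if (newRows == rows && newCols == cols) {
            return;  // same size: nothing to do, the content is kept
        }
        // different size: replace the old buffer with a newly allocated one
        pixels = std::vector<unsigned char>(
            static_cast<std::size_t>(newRows) * static_cast<std::size_t>(newCols));
        rows = newRows;
        cols = newCols;
        ++allocations;
    }
};
```

One consequence worth remembering is that the pixel values are not reset when the size is unchanged, so the image should be initialized explicitly if its content matters.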
When no more references point to a given cv::Mat
object, the allocated memory is automatically released. This is very convenient because it avoids the common memory leak problems often associated with dynamic memory allocation in C++. This is a key mechanism in OpenCV (introduced in version 2) that is accomplished by having the cv::Mat
class implement reference counting and shallow copy. Therefore, when an image is assigned to another one, the image data (that is, the pixels) is not copied; both images will point to the same memory block. This also applies to images either passed or returned by value. A reference count is kept such that the memory will be released only when all the references to the image are destroyed or assigned to another image:
// all these images point to the same data block
cv::Mat image4(image3);
image1= image3;
Any transformation applied to one of the preceding images will also affect the other images. If you wish to create a deep copy of the content of an image, use the copyTo
method. In this case, the create
method is called on the destination image. Another method that produces a copy of an image is the clone
method, which creates a new identical image as follows:
// these images are new copies of the source image
image3.copyTo(image2);
cv::Mat image5= image3.clone();
In the example of this recipe, we applied a transformation to image3
. The other images also contain this image; some of them share the same image data, while others hold a copy of this image. Check the displayed images and find out which ones were affected by the image3
transformation.
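The difference between sharing a data block and copying it can be reproduced with std::shared_ptr. The sketch below is a toy model under that assumption: SharedImage is a hypothetical name, and OpenCV uses its own reference counter rather than std::shared_ptr.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

typedef std::vector<unsigned char> PixelBlock;

// Toy model of a header that shares its data block; this is not OpenCV's
// implementation.
struct SharedImage {
    int rows;
    int cols;
    std::shared_ptr<PixelBlock> data;  // shared pixel block

    SharedImage(int r, int c, unsigned char value)
        : rows(r), cols(c),
          data(std::make_shared<PixelBlock>(
              static_cast<std::size_t>(r) * static_cast<std::size_t>(c), value)) {}

    // deep copy: a new data block with the same content
    SharedImage clone() const {
        SharedImage copy(*this);                      // copies the header only
        copy.data = std::make_shared<PixelBlock>(*data);  // then duplicates the pixels
        return copy;
    }

    // number of headers currently pointing to this data block
    long references() const { return data.use_count(); }
};
```

Copy construction and assignment of SharedImage duplicate only the header, so a change made through one object is visible through the other; an object obtained with clone() is unaffected. In the recipe's program, image1 and image4 are in the first situation with respect to image3, while image2 and image5 are in the second.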
If you need to copy an image into another image that does not necessarily have the same data type, use the convertTo
method:
// convert the image into a floating point image [0,1]
image1.convertTo(image2,CV_32F,1/255.0,0.0);
In this example, the source image is copied into a floating-point image. The method includes two optional parameters: a scaling factor and an offset. Note that both the images must, however, have the same number of channels.
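The effect of the scaling factor and the offset is a per-element computation of the form value * scale + offset. Here is a minimal sketch of that arithmetic for an 8-bit source and a floating-point destination; toFloatImage is a hypothetical helper written for illustration, not an OpenCV function.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not an OpenCV function: converts 8-bit values to
// float while applying value * scale + offset to each element.
std::vector<float> toFloatImage(const std::vector<unsigned char>& source,
                                double scale, double offset) {
    std::vector<float> result(source.size());
    for (std::size_t i = 0; i < source.size(); ++i) {
        result[i] = static_cast<float>(source[i] * scale + offset);
    }
    return result;
}
```

With a scale of 1/255.0 and an offset of 0.0, the input range 0 to 255 is mapped to 0.0 to 1.0, which is the range that a floating-point image is expected to cover when it is displayed.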
The allocation model for the cv::Mat
objects also allows you to safely write functions (or class methods) that return an image:
cv::Mat function() {
  // create image
  cv::Mat ima(240,320,CV_8U,cv::Scalar(100));
  // return it
  return ima;
}
We can also call this function from our main
function as follows:
// get a gray-level image
cv::Mat gray= function();
If we do this, the gray variable will then hold the image created by the function without extra memory allocation. Indeed, as we explained, only a shallow copy of the image will be transferred from the returned cv::Mat instance to the gray image. When the ima local variable goes out of scope, this variable is deallocated. However, since the associated reference counter indicates that its internal image data is being referred to by another instance (that is, the gray variable), its memory block is not released.
It's worth noting that in the case of classes, you should be careful and not return image class attributes. Here is an example of an error-prone implementation:
class Test {
  // image attribute
  cv::Mat ima;
public:
  // constructor creating a gray-level image
  Test() : ima(240,320,CV_8U,cv::Scalar(100)) {}
  // method returning a class attribute, not a good idea...
  cv::Mat method() { return ima; }
};
Here, if a function calls the method of this class, it obtains a shallow copy of the image attribute. If this copy is modified later, the class attribute will also be surreptitiously modified, which can affect the subsequent behavior of the class (and vice versa). This is a violation of the important principle of encapsulation in object-oriented programming. To avoid these kinds of errors, you should instead return a clone of the attribute.
When you are manipulating the cv::Mat class, you will discover that OpenCV also includes several other related classes. It will be important for you to become familiar with them.
If you look at the OpenCV documentation, you will see that many methods and functions accept parameters of the cv::InputArray type as an input. This type is a simple proxy class introduced to generalize the concept of arrays in OpenCV and thus avoid the duplication of several versions of the same method or function with different input parameter types. It basically means that you can supply either a cv::Mat object or other compatible types as an argument. Since it is declared as an input array, you have the guarantee that your data structure will not be modified by the function. It is interesting to know that cv::InputArray can also be constructed from the popular std::vector class. This means that such objects can be used as input parameters to OpenCV methods and functions (however, never declare cv::InputArray variables or attributes inside your own classes and functions; the type is meant only for passing parameters). Other compatible types are cv::Scalar and cv::Vec; the latter structure will be presented in the next chapter. There is also a cv::OutputArray proxy class that is used to designate parameters that correspond to an image that is returned by a function or method.
When writing your applications, you might have to manipulate small matrices. You can then use the cv::Matx template class and its subclasses. For example, the following code declares a 3x3 matrix of double-precision floating-point numbers and a 3-element vector. These two are then multiplied together:
// a 3x3 matrix of double
cv::Matx33d matrix(3.0, 2.0, 1.0,
                   2.0, 1.0, 3.0,
                   1.0, 2.0, 3.0);
// a 3x1 matrix (a vector)
cv::Matx31d vector(5.0, 1.0, 3.0);
// multiplication
cv::Matx31d result = matrix*vector;
As you can see, the usual math operators can be applied to these matrices.
The complete OpenCV documentation can be found at http://docs.opencv.org/
Chapter 2, Manipulating Pixels, will show you how to efficiently access and modify the pixel values of an image represented by the cv::Mat class
The next recipe, Defining regions of interest, will explain how to define a region of interest inside an image
Sometimes, a processing function needs to be applied only to a portion of an image. OpenCV incorporates an elegant and simple mechanism to define a subregion in an image and manipulate it as a regular image. This recipe will teach you how to define a region of interest inside an image.
Suppose we want to copy a small image onto a larger one. For example, let's say we want to insert the following logo into our test image:

To do this, a Region Of Interest (ROI) can be defined over which the copy operation can be applied. As we will see, the position of the ROI will determine where the logo will be inserted in the image.
The first step consists of defining the ROI. Once defined, the ROI can be manipulated as a regular cv::Mat instance. The key is that the ROI is indeed a cv::Mat object that points to the same data buffer as its parent image and has a header that specifies the coordinates of the ROI. Inserting the logo is then accomplished as follows:
// define image ROI at image bottom-right
cv::Mat imageROI(image,
                 cv::Rect(image.cols-logo.cols, // ROI coordinates
                          image.rows-logo.rows,
                          logo.cols,logo.rows)); // ROI size
// insert logo
logo.copyTo(imageROI);
Here, image is the destination image and logo is the logo image (of a smaller size). The following image is then obtained by executing the previous code:

One way to define an ROI is to use a cv::Rect instance. As the name indicates, it describes a rectangular region by specifying the position of the upper-left corner (the first two parameters of the constructor) and the size of the rectangle (the width and height are given in the last two parameters). In our example, we used the size of the image and the size of the logo in order to determine the position where the logo would cover the bottom-right corner of the image. Obviously, the ROI should always be completely inside the parent image.
The ROI can also be described using row and column ranges. A range is a continuous sequence from a start index to an end index (the start index is included, the end index is excluded). The cv::Range structure is used to represent this concept. Therefore, an ROI can be defined from two ranges; in our example, the ROI could have been equivalently defined as follows:
imageROI= image(cv::Range(image.rows-logo.rows,image.rows), cv::Range(image.cols-logo.cols,image.cols));
In this case, the operator() function of cv::Mat returns another cv::Mat instance that can then be used in subsequent calls. Any transformation of the ROI will affect the original image in the corresponding area because the image and the ROI share the same image data. Since the definition of an ROI does not include the copying of data, it is executed in a constant amount of time, no matter the size of the ROI.
If you want to define an ROI made up of some rows of an image, the following call can be used:
cv::Mat imageROI= image.rowRange(start,end);
Similarly, for an ROI made up of some image columns, the following can be used:
cv::Mat imageROI= image.colRange(start,end);
The OpenCV methods and functions include many optional parameters that are not discussed in the recipes of this book. When you wish to use a function for the first time, you should always take the time to look at the documentation to learn more about the possible options that the function offers. One very common option is the possibility to define image masks.
Some OpenCV operations allow you to define a mask that will limit the applicability of a given function or method, which is normally supposed to operate on all the image pixels. A mask is an 8-bit image that should be nonzero at all locations where you want an operation to be applied. At the pixel locations that correspond to the zero values of the mask, the image is untouched. For example, the copyTo method can be called with a mask. We can use it here to copy only the white portion of the logo shown previously, as follows:
// define image ROI at image bottom-right
imageROI= image(cv::Rect(image.cols-logo.cols,
                         image.rows-logo.rows,
                         logo.cols,logo.rows));
// use the logo as a mask (must be gray-level)
cv::Mat mask(logo);
// insert by copying only at locations of non-zero mask
logo.copyTo(imageROI,mask);
The following image is obtained by executing the previous code:

The background of our logo was black (therefore, it had the value 0); this is why it was easy to use it as both the copied image and the mask. Of course, you can define the mask of your choice in your application; most OpenCV pixel-based operations give you the opportunity to use masks.
The row and col methods will be used in the Scanning an image with neighbor access recipe of Chapter 2, Manipulating Pixels. These are a special case of the rowRange and colRange methods in which the range covers a single index (the end index is the start index plus one) in order to define a single-line or single-column ROI.