Learning Image Processing with OpenCV

By Gloria Bueno García , Oscar Deniz Suarez , José Luis Espinosa Aranda and 3 more

About this book

OpenCV, arguably the most widely used computer vision library, includes hundreds of ready-to-use imaging and vision functions and is used in both academia and enterprises.

This book provides an example-based tour of OpenCV's main image processing algorithms. Starting with an exploration of library installation, wherein the library structure and basics of image and video reading/writing are covered, you will dive into image filtering and the color manipulation features of OpenCV with LUTs. You'll then be introduced to techniques such as inpainting and denoising to enhance images as well as the process of HDR imaging. Finally, you'll master GPU-based accelerations. By the end of this book, you will be able to create smart and powerful image processing applications with ease! All the topics are described with short, easy-to-follow examples.

Publication date:
March 2015


Chapter 1. Handling Image and Video Files

This chapter is intended as a first contact with OpenCV, its installation, and first basic programs. We will cover the following topics:

  • A brief introduction to OpenCV for the novice, followed by an easy step-by-step guide to the installation of the library

  • A quick tour of OpenCV's structure after the installation in the user's local disk

  • Quick recipes to create projects using the library with some common programming frameworks

  • How to use the functions to read and write images and videos

  • Finally, we describe the library functions to add rich user interfaces to the software projects, including mouse interaction, drawing primitives, and Qt support


An introduction to OpenCV

Initially developed by Intel, OpenCV (Open Source Computer Vision) is a free cross-platform library for real-time image processing that has become a de facto standard tool for all things related to computer vision. The first version was released in 2000 under a BSD license and, since then, its functionality has been greatly enriched by the scientific community. In 2012, the nonprofit foundation OpenCV.org took on the task of maintaining a support site for developers and users.


At the time of writing this book, a new major version of OpenCV (Version 3.0) is available, although it is still in beta. Throughout the book, we will present the most relevant changes brought by this new version.

OpenCV is available for the most popular operating systems, such as GNU/Linux, OS X, Windows, Android, iOS, and some more. The first implementation was in the C programming language; however, its popularity grew with its C++ implementation as of Version 2.0. New functions are programmed with C++. However, nowadays, the library has a full interface for other programming languages, such as Java, Python, and MATLAB/Octave. Also, wrappers for other languages (such as C#, Ruby, and Perl) have been developed to encourage adoption by programmers.

In an attempt to maximize the performance of computing intensive vision tasks, OpenCV includes support for the following:

  • Multithreading on multicore computers using Threading Building Blocks (TBB)—a template library developed by Intel.

  • A subset of Integrated Performance Primitives (IPP) on Intel processors to boost performance. Thanks to Intel, these primitives are freely available as of Version 3.0 beta.

  • Interfaces for processing on Graphic Processing Unit (GPU) using Compute Unified Device Architecture (CUDA) and Open Computing Language (OpenCL).

The applications for OpenCV cover areas such as segmentation and recognition, 2D and 3D feature toolkits, object identification, facial recognition, motion tracking, gesture recognition, image stitching, high dynamic range (HDR) imaging, augmented reality, and so on. Moreover, to support some of the previous application areas, a module with statistical machine learning functions is included.


Downloading and installing OpenCV

OpenCV is freely available for download at http://opencv.org. This site provides the latest version for distribution (currently, 3.0 beta) and older versions.


Special care should be taken with possible errors when the downloaded version is a nonstable release, for example, the current 3.0 beta version.

On http://opencv.org/downloads.html, suitable versions of OpenCV for each platform can be found. The code and information of the library can be obtained from different repositories depending on the final purpose:

  • The main repository (at http://sourceforge.net/projects/opencvlibrary), devoted to final users. It contains binary versions of the library and ready-to‑compile sources for the target platform.

  • The test data repository (at https://github.com/itseez/opencv_extra) with data sets for testing some of the library modules.

  • The contributions repository (at http://github.com/itseez/opencv_contrib) with the source code corresponding to extra and cutting-edge features supplied by contributors. This code is more error-prone and less tested than the main trunk.


    With the last version, OpenCV 3.0 beta, the extra contributed modules are not included in the main package. They should be downloaded separately and explicitly included in the compilation process through the proper options. Be cautious if you include some of those contributed modules, because some of them have dependencies on third‑party software not included with OpenCV.

  • The documentation site (at http://docs.opencv.org/master/) for each of the modules, including the contributed ones.

  • The development repository (at https://github.com/Itseez/opencv) with the current development version of the library. It is intended for developers of the main features of the library and the "impatient" user who wishes to use the latest update even before it is released.

Unlike GNU/Linux and OS X, where OpenCV is distributed as source code only, the Windows distribution contains precompiled versions of the library (built with Microsoft Visual C++ v10, v11, and v12). Each precompiled version is ready to be used with Microsoft compilers. However, if the primary intention is to develop projects with a different compiler framework, we need to compile the library for that specific compiler (for example, GNU GCC).


The fastest route to working with OpenCV is to use one of the precompiled versions included with the distribution. However, a better choice is to build a fine-tuned version of the library with the best settings for the local platform used for software development. This chapter provides the information to build and install OpenCV on Windows. Further information to set up the library on Linux can be found at http://docs.opencv.org/doc/tutorials/introduction/linux_install and https://help.ubuntu.com/community/OpenCV.

Getting a compiler and setting CMake

A good choice for cross‑platform development with OpenCV is to use the GNU toolkit (including gmake, g++, and gdb). The GNU toolkit can be easily obtained for the most popular operating systems. Our preferred choice for a development environment consists of the GNU toolkit and the cross‑platform Qt framework, which includes the Qt library and the Qt Creator Integrated Development Environment (IDE). The Qt framework is freely available at http://qt-project.org/.


After installing the compiler on Windows, remember to properly set the Path environment variable, adding the path for the compiler's executable, for example, C:\Qt\Qt5.2.1\5.2.1\mingw48_32\bin for the GNU/compilers included with the Qt framework. On Windows, the free Rapid Environment Editor tool (available at http://www.rapidee.com) provides a convenient way to change Path and other environment variables.

To manage the build process for the OpenCV library in a compiler-independent way, CMake is the recommended tool. CMake is a free and open source cross‑platform tool available at http://www.cmake.org/.

Configuring OpenCV with CMake

Once the sources of the library have been downloaded into the local disk, it is required that you configure the makefiles for the compilation process of the library. CMake is the key tool for an easy configuration of OpenCV's installation process. It can be used from the command line or in a more user‑friendly way with its Graphical User Interface (GUI) version.

The steps to configure OpenCV with CMake can be summarized as follows:

  1. Choose the source (let's call it OPENCV_SRC in what follows) and target (OPENCV_BUILD) directories. The target directory is where the compiled binaries will be located.

  2. Mark the Grouped and Advanced checkboxes and click on the Configure button.

  3. Choose the desired compiler (for example, GNU default compilers, MSVC, and so on).

  4. Set the preferred options and unset those not desired.

  5. Click on the Configure button and repeat steps 4 and 5 until no errors are obtained.

  6. Click on the Generate button and close CMake.

The following screenshot shows you the main window of CMake with the source and target directories and the checkboxes to group all the available options:

The main window of CMake after the preconfiguration step


For brevity, we use OPENCV_BUILD and OPENCV_SRC in this text to denote the target and source directories of the OpenCV local setup, respectively. Keep in mind that all directories should match your current local configuration.

During the preconfiguration process, CMake detects the compilers present and many other local properties to set the build process of OpenCV. The previous screenshot displays the main CMake window after the preconfiguration process, showing the grouped options in red.

It is possible to leave the default options unchanged and continue the configuration process. However, some convenient options can be set:

  • BUILD_EXAMPLES: This is set to build some examples using OpenCV.

  • BUILD_opencv_<module_name>: This is set to include the module (module_name) in the build process.

  • OPENCV_EXTRA_MODULES_PATH: This is used when you need some extra contributed module; set the path for the source code of the extra modules here (for example, C:/opencv_contrib-master/modules).

  • WITH_QT: This is turned on to include the Qt functionality into the library.

  • WITH_IPP: This option is turned on by default. The current OpenCV 3.0 version includes a subset of the Intel Integrated Performance Primitives (IPP) that speed up the execution time of the library.


If you compile the new OpenCV 3.0 (beta), be cautious because some unexpected errors have been reported related to the IPP inclusion (that is, with the default value of this option). We recommend that you unset the WITH_IPP option.

If the configuration steps with CMake (loop through steps 4 and 5) don't produce any further errors, it is possible to generate the final makefiles for the build process. The following screenshot shows you the main window of CMake after a generation step without errors:

Compiling and installing the library

The next step after the generation process of makefiles with CMake is the compilation with the proper make tool. This tool is usually executed on the command line (the console) from the target directory (the one set at the CMake configuration step). For example, in Windows, the compilation should be launched from the command line as follows:

OPENCV_BUILD>mingw32-make


This command launches a build process using the makefiles generated by CMake. The whole compilation typically takes several minutes. If the compilation ends without errors, the installation continues with the execution of the following command:

OPENCV_BUILD>mingw32-make install

This command copies the OpenCV binaries to the OPENCV_BUILD\install directory.

If something went wrong during the compilation, we should run CMake again to change the options selected during the configuration. Then, we should regenerate the makefiles.

The installation ends by adding the location of the library binaries (for example, in Windows, the resulting DLL files are located at OPENCV_BUILD\install\x64\mingw\bin) to the Path environment variable. Without this directory in the Path field, the execution of every OpenCV executable will give an error as the library binaries won't be found.

To check the success of the installation process, it is possible to run some of the examples compiled along with the library (if the BUILD_EXAMPLES option was set using CMake). The code samples (written in C++) can be found at OPENCV_BUILD\install\x64\mingw\samples\cpp.


The short instructions given to install OpenCV apply to Windows. A detailed description with the prerequisites for Linux can be read at http://docs.opencv.org/doc/tutorials/introduction/linux_install/linux_install.html. Although the tutorial applies to OpenCV 2.0, almost all the information is still valid for Version 3.0.


The structure of OpenCV

Once OpenCV is installed, the OPENCV_BUILD\install directory will be populated with three types of files:

  • Header files: These are located in the OPENCV_BUILD\install\include subdirectory and are used to develop new projects with OpenCV.

  • Library binaries: These are static or dynamic libraries (depending on the option selected with CMake) with the functionality of each of the OpenCV modules. They are located in the bin subdirectory (for example, x64\mingw\bin when the GNU compiler is used).

  • Sample binaries: These are executables with examples that use the libraries. The sources for these samples can be found in the source package (for example, OPENCV_SRC\sources\samples).

OpenCV has a modular structure, which means that the package includes a static or dynamic (DLL) library for each module. The official documentation for each module can be found at http://docs.opencv.org/master/. The main modules included in the package are:

  • core: This defines the basic functions used by all the other modules and the fundamental data structures including the important multidimensional array Mat.

  • highgui: This provides simple user interface (UI) capabilities. Building the library with Qt support (the WITH_QT CMake option) allows UI compatibility with such a framework.

  • imgproc: These are image processing functions that include filtering (linear and nonlinear), geometric transformations, color space conversion, histograms, and so on.

  • imgcodecs: This is an easy-to-use interface to read and write images.


    Pay attention to the changes in modules since OpenCV 3.0 as some functionality has been moved to a new module (for example, reading and writing images functions were moved from highgui to imgcodecs).

  • photo: This covers computational photography, including inpainting, denoising, High Dynamic Range (HDR) imaging, and some others.

  • stitching: This is used for image stitching.

  • videoio: This is an easy-to-use interface for video capture and video codecs.

  • video: This supplies the functionality of video analysis (motion estimation, background extraction, and object tracking).

  • features2d: These are functions for feature detection (corners and planar objects), feature description, feature matching, and so on.

  • objdetect: These are functions for object detection and instances of predefined detectors (such as faces, eyes, smile, people, cars, and so on).

Some other modules are calib3d (camera calibration), flann (clustering and search), ml (machine learning), shape (shape distance and matching), superres (super resolution), and videostab (video stabilization).


As of Version 3.0 beta, the new contributed modules are distributed in a separate package (opencv_contrib-master.zip) that can be downloaded from https://github.com/itseez/opencv_contrib. These modules provide extra features that should be fully understood before using them. For a quick overview of the new functionality in the new release of OpenCV (Version 3.0), refer to the document at http://opencv.org/opencv-3-0-beta.html.


Creating user projects with OpenCV

In this book, we assume that C++ is the main language for programming image processing applications, although interfaces and wrappers for other programming languages are actually provided (for instance, Python, Java, MATLAB/Octave, and some more).

In this section, we explain how to develop applications with OpenCV's C++ API using an easy-to-use cross-platform framework.

General usage of the library

To develop an OpenCV application with C++, we require our code to:

  • Include the OpenCV header files with definitions

  • Link the OpenCV libraries (binaries) to get the final executable

The OpenCV header files are located in the OPENCV_BUILD\install\include\opencv2 directory where there is a file (*.hpp) for each of the modules. The inclusion of the header file is done with the #include directive, as shown here:

#include <opencv2/<module_name>/<module_name>.hpp>
// Including the header file for each module used in the code

With this directive, it is possible to include every header file needed by the user program. On the other hand, if the opencv.hpp header file is included, all the header files will be automatically included as follows:

#include <opencv2/opencv.hpp>
// Including all the OpenCV's header files in the code


Remember that all the modules installed locally are defined in the OPENCV_BUILD\install\include\opencv2\opencv_modules.hpp header file, which is generated automatically during the building process of OpenCV.

The use of the #include directive is not always a guarantee for the correct inclusion of the header files, because it is necessary to tell the compiler where to find the include files. This is achieved by passing a special argument with the location of the files (such as -I<location> for GNU compilers).

The linking process requires you to provide the linker with the libraries (dynamic or static) where the required OpenCV functionality can be found. This is usually done with two types of arguments for the linker: the location of the library (such as -L<location> for GNU compilers) and the name of the library (such as -l<module_name>).


You can find a complete list of available online documentation for GNU GCC and Make at https://gcc.gnu.org/onlinedocs/ and https://www.gnu.org/software/make/manual/.

Tools to develop new projects

The main prerequisites to develop our own OpenCV C++ applications are:

  • OpenCV header files and library binaries: Of course we need to compile OpenCV, and the auxiliary libraries are prerequisites for such a compilation. The package should be compiled with the same compiler used to generate the user application.

  • A C++ compiler: Some associated tools are also convenient, such as a code editor, debugger, project manager, build process manager (for instance, CMake), revision control system (such as Git, Mercurial, SVN, and so on), and class inspector, among others. Usually, these tools are deployed together in a so-called Integrated Development Environment (IDE).

  • Any other auxiliary libraries: Optionally, any other auxiliary libraries needed to program the final application, such as graphical, statistical, and so on will be required.

The most popular available compiler kits to program OpenCV C++ applications are:

  • Microsoft Visual C++ (MSVC): This is only supported on Windows and it is very well integrated with the Visual Studio IDE, although it can also be integrated with other cross-platform IDEs, such as Qt Creator or Eclipse. The versions of MSVC that are currently compatible with the latest OpenCV release are VC 10, VC 11, and VC 12 (Visual Studio 2010, 2012, and 2013).

  • GNU Compiler Collection (GNU GCC): This is a cross-platform compiler system developed by the GNU project. For Windows, this kit is known as MinGW (Minimalist GNU for Windows). The version compatible with the current OpenCV release is GNU GCC 4.8. This kit may be used with several IDEs, such as Qt Creator, Code::Blocks, and Eclipse, among others.

For the examples presented in this book, we used the MinGW 4.8 compiler kit for Windows plus the Qt 5.2.1 library and the Qt Creator IDE (3.0.1). The cross-platform Qt library is required to compile OpenCV with the new UI capabilities provided by such a library.


For Windows, it is possible to download a Qt bundle (including Qt library, Qt Creator, and the MinGW kit) from http://qt-project.org/. The bundle is approximately 700 MB.

Qt Creator is a cross-platform IDE for C++ that integrates the tools we need to code applications. In Windows, it may be used with MinGW or MSVC. The following screenshot shows you the Qt Creator main window with the different panels and views for an OpenCV C++ project:

The main window of Qt Creator with some views from an OpenCV C++ project

Creating an OpenCV C++ program with Qt Creator

Next, we explain how to create a code project with the Qt Creator IDE. In particular, we apply this description to an OpenCV example.

We can create a project for any OpenCV application using Qt Creator by navigating to File | New File or Project… and then navigating to Non-Qt Project | Plain C++ Project. Then, we have to choose a project name and the location at which it will be stored. The next step is to pick a kit (that is, the compiler) for the project (in our case, Desktop Qt 5.2.1 MinGW 32 bit) and the location for the binaries generated. Usually, two possible build configurations (profiles) are used: debug and release. These profiles set the appropriate flags to build and run the binaries.

When a project is created using Qt Creator, two special files (with .pro and .pro.user extensions) are generated to configure the build and run processes. The build process is determined by the kit chosen during the creation of the project. With the Desktop Qt 5.2.1 MinGW 32 bit kit, this process relies on the qmake and mingw32‑make tools. Using the *.pro file as the input, qmake generates the makefile that drives the build process for each profile (that is, release and debug). The qmake tool is used from the Qt Creator IDE as an alternative to CMake to simplify the build process of software projects. It automates the generation of makefiles from a few lines of information.

The following lines represent an example of a *.pro file (for example, showImage.pro):

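TARGET = showImage
CONFIG += console
CONFIG -= app_bundle
CONFIG -= qt
SOURCES += showImage.cpp
INCLUDEPATH += C:/opencv300-buildQt/install/include
LIBS += -LC:/opencv300-buildQt/install/x64/mingw/lib \
    -lopencv_core300.dll \
    -lopencv_imgcodecs300.dll \
    -lopencv_highgui300.dll \
    -lopencv_imgproc300.dll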

The preceding file illustrates the options that qmake needs to generate the appropriate makefiles to build the binaries for our project. Each line starts with a tag indicating an option (TARGET, CONFIG, SOURCES, INCLUDEPATH, and LIBS) followed with a mark to add (+=) or remove (-=) the value of the option. In this sample project, we use the non-Qt console application. The executable file is showImage.exe (TARGET) and the source file is showImage.cpp (SOURCES). As this project is an OpenCV-based application, the two last tags indicate the location of the header files (INCLUDEPATH) and the OpenCV libraries (LIBS) used by this particular project (core, imgcodecs, highgui, and imgproc). Note that a backslash at the end of the line denotes continuation in the next line.


For a detailed description of the tools (including Qt Creator and qmake) developed within the Qt project, visit http://doc.qt.io/.


Reading and writing image files

Image processing relies on getting an image (for instance, a photograph or a video frame) and "playing" with it by applying signal processing techniques to get the desired results. In this section, we show you how to read images from files using the functions supplied by OpenCV.

The basic API concepts

The Mat class is the main data structure that stores and manipulates images in OpenCV. This class is defined in the core module. OpenCV has implemented mechanisms to allocate and release memory automatically for these data structures. However, the programmer should still take special care when data structures share the same buffer memory. For instance, the assignment operator does not copy the memory content from an object (Mat A) to another (Mat B); it only copies the reference (the memory address of the content). Then, a change in one object (A or B) affects both objects. To duplicate the memory content of a Mat object, the Mat::clone() member function should be used.


Many functions in OpenCV process dense single or multichannel arrays, usually using the Mat class. However, in some cases, a different datatype may be convenient, such as std::vector<>, Matx<>, Vec<>, or Scalar. For this purpose, OpenCV provides the proxy classes InputArray and OutputArray, which allow any of the previous types to be used as parameters for functions.

The Mat class is used for dense n-dimensional single or multichannel arrays. It can actually store real or complex-valued vectors and matrices, colored or grayscale images, histograms, point clouds, and so on.

There are many different ways to create a Mat object, the most popular being the constructor where the size and type of the array are specified as follows:

Mat(nrows, ncols, type, fillValue)

The initial value for the array elements might be set by the Scalar class as a typical four-element vector (for each RGB and transparency component of the image stored in the array). Next, we show you a usage example of Mat as follows:

Mat img_A(4, 4, CV_8U, Scalar(255));
// White image:
// 4 x 4 single-channel array with 8 bits of unsigned integers
// (values from 0 to 255, valid for a grayscale image, for example,
// 255=white)

The DataType class defines the primitive datatypes for OpenCV. The primitive datatypes can be bool, unsigned char, signed char, unsigned short, signed short, int, float, double, or a tuple of values of one of these primitive types. Any primitive type can be defined by an identifier in the following form:

CV_<bit depth>{U|S|F}C(<number of channels>)

In the preceding code U, S, and F stand for unsigned, signed, and float, respectively. For the single channel arrays, the following enumeration is applied, describing the datatypes:

enum {CV_8U=0, CV_8S=1, CV_16U=2, CV_16S=3,CV_32S=4, CV_32F=5, CV_64F=6};


Here, it should be noted that these three declarations are equivalent: CV_8U, CV_8UC1, and CV_8UC(1). The single-channel declaration fits well for integer arrays devoted to grayscale images, whereas the three-channel declaration of an array is more appropriate for images with three components (for example, RGB, BGR, HSV, and so on). For linear algebra operations, arrays of type float (F) might be used.
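
To make the relation between the depth enumeration and the channel count concrete, here is a library-free sketch of how such an identifier can be packed into one integer. It assumes the layout used by OpenCV's CV_MAKETYPE macro (depth in the three lowest bits, number of channels minus one in the bits above them). The names Depth, makeType, depthOf, and channelsOf are ours and only stand in for the real macros.

```cpp
// Depth codes, copied from the enumeration in the text.
enum Depth { DEPTH_8U = 0, DEPTH_8S = 1, DEPTH_16U = 2, DEPTH_16S = 3,
             DEPTH_32S = 4, DEPTH_32F = 5, DEPTH_64F = 6 };

// Hypothetical stand-in for CV_<bit depth>{U|S|F}C(<number of channels>):
// the depth goes in the low 3 bits, (channels - 1) in the bits above them.
constexpr int makeType(int depth, int channels)
{
    return (depth & 7) + ((channels - 1) << 3);
}

// Inverse operations: recover the depth and the channel count.
constexpr int depthOf(int type)    { return type & 7; }
constexpr int channelsOf(int type) { return (type >> 3) + 1; }
```

With this layout, a single-channel type has the same value as its depth code, which is why CV_8U, CV_8UC1, and CV_8UC(1) coincide, while a three-channel 8-bit type evaluates to 16.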

We can define all of the preceding datatypes for multichannel arrays (up to 512 channels). The following screenshots illustrate an image's internal representation with one single channel (CV_8U, grayscale) and the same image represented with three channels (CV_8UC3, RGB). These screenshots are taken by zooming in on an image displayed in the window of an OpenCV executable (the showImage example):

An 8-bit representation of an image in RGB color and grayscale


It is important to notice that, to properly save an RGB image with OpenCV functions, the image must be stored in memory with its channels ordered as BGR. In the same way, when an RGB image is read from a file, it is stored in memory with its channels in BGR order. Moreover, a supplementary fourth channel (alpha) is needed to manipulate images with three color channels plus transparency. For the color channels, a larger integer value means a brighter pixel; for the alpha channel, a larger value means a more opaque pixel.

All OpenCV classes and functions are in the cv namespace, and consequently, we will have the following two options in our source code:

  • Add the using namespace cv declaration after including the header files (this is the option used in all the code examples in this book).

  • Append the cv:: prefix to all the OpenCV classes, functions, and data structures that we use. This option is recommended if the external names provided by OpenCV conflict with the often-used standard template library (STL) or other libraries.

Image file-supported formats

OpenCV supports the most common image formats. However, some of them need (freely available) third-party libraries. The main formats supported by OpenCV are:

  • Windows bitmaps (*.bmp, *.dib)

  • Portable image formats (*.pbm, *.pgm, *.ppm)

  • Sun rasters (*.sr, *.ras)

The formats that need auxiliary libraries are:

  • JPEG (*.jpeg, *.jpg, *.jpe)

  • JPEG 2000 (*.jp2)

  • Portable Network Graphics (*.png)

  • TIFF (*.tiff, *.tif)

  • WebP (*.webp)

In addition to the preceding formats, OpenCV 3.0 includes a driver for the formats (NITF, DTED, SRTM, and others) supported by the Geographic Data Abstraction Library (GDAL), enabled with the CMake option WITH_GDAL. Notice that GDAL support has not been extensively tested on Windows OSes yet. In Windows and OS X, the codecs shipped with OpenCV are used by default (libjpeg, libjasper, libpng, and libtiff). Therefore, in these OSes, it is possible to read the JPEG, PNG, and TIFF formats. Linux (and other Unix-like open source OSes) looks for codecs installed in the system. The codecs can be installed before OpenCV, or else the libraries can be built from the OpenCV package by setting the proper options in CMake (for example, BUILD_JASPER, BUILD_JPEG, BUILD_PNG, and BUILD_TIFF).

The example code

To illustrate how to read and write image files with OpenCV, we will now describe the showImage example. The example is executed from the command line with the corresponding output windows as follows:

<bin_dir>\showImage.exe fruits.jpg fruits_bw.jpg

The output window for the showImage example

In this example, two filenames are given as arguments. The first one is the input image file to be read. The second one is the image file to be written with a grayscale copy of the input image. Next, we show you the source code and its explanation:

#include <opencv2/opencv.hpp>
#include <iostream>

using namespace std;
using namespace cv;

int main(int argc, char *argv[])
{
    Mat in_image, out_image;

    // Usage: <cmd> <file_in> <file_out>
    if (argc != 3) {
        // Check the number of arguments and show a help message
        cout << "Usage: " << argv[0] << " <file_in> <file_out>\n";
        return -1;
    }
    // Read original image
    in_image = imread(argv[1], IMREAD_UNCHANGED);
    if (in_image.empty()) {
        // Check whether the image is read or not
        cout << "Error! Input image cannot be read...\n";
        return -1;
    }
    // Creates two windows with the names of the images
    namedWindow(argv[1], WINDOW_AUTOSIZE);
    namedWindow(argv[2], WINDOW_AUTOSIZE);
    // Shows the input image in the previously created window
    imshow(argv[1], in_image);
    // Converts to grayscale (assumes a 3-channel BGR input) and shows the result
    cvtColor(in_image, out_image, COLOR_BGR2GRAY);
    imshow(argv[2], out_image);
    cout << "Press any key to exit...\n";
    waitKey(); // Wait for key press
    // Writing the grayscale image
    imwrite(argv[2], out_image);
    return 0;
}

Here, we use the #include directive with the opencv.hpp header file that, in fact, includes all the OpenCV header files. By including this single file, no more files need to be included. After declaring the use of cv namespace, all the variables and functions inside this namespace don't need the cv:: prefix. The first thing to do in the main function is to check the number of arguments passed in the command line. Then, a help message is displayed if an error occurs.

Reading image files

If the number of arguments is correct, the image file is read into the Mat in_image object with the imread(argv[1], IMREAD_UNCHANGED) function, where the first parameter is the first argument (argv[1]) passed in the command line and the second parameter is a flag (IMREAD_UNCHANGED), which means that the image stored into the memory object should be unchanged. The imread function determines the type of image (codec) from the file content rather than from the file extension.

The prototype for the imread function is as follows:

Mat imread(const String& filename, int flags = IMREAD_COLOR)

The flags specify the color format and depth of the image read. They are defined and explained by the following enumeration in the imgcodecs.hpp header file:

enum { IMREAD_UNCHANGED = -1, // 8bit, color or not
  IMREAD_GRAYSCALE = 0, // 8bit, gray
  IMREAD_COLOR = 1, // unchanged depth, color
  IMREAD_ANYDEPTH = 2, // any depth, unchanged color
  IMREAD_ANYCOLOR = 4, // unchanged depth, any color
  IMREAD_LOAD_GDAL = 8 // Use gdal driver
};
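The non-negative flag values are powers of two, so they can be combined with a bitwise OR. The following is a small sketch of that idea. It uses a local copy of the values listed above (the READ_* names and both helper functions are ours, introduced only for illustration) so that it builds without OpenCV; a real program would use the IMREAD_* constants directly.

```cpp
// Local copy of the flag values listed above, so the sketch builds
// without OpenCV; in a real program use the cv::IMREAD_* constants.
enum ReadFlag {
    READ_UNCHANGED = -1,
    READ_GRAYSCALE = 0,
    READ_COLOR     = 1,
    READ_ANYDEPTH  = 2,
    READ_ANYCOLOR  = 4
};

// True when every bit of `flag` is set in `flags`.
bool hasFlag(int flags, int flag) {
    return flag != 0 && (flags & flag) == flag;
}

// Combination that asks for any depth and any color at the same time.
int anyDepthAnyColor() {
    return READ_ANYDEPTH | READ_ANYCOLOR;
}
```

The equivalent call would look like imread(filename, IMREAD_ANYDEPTH | IMREAD_ANYCOLOR). Note that -1 has every bit set, so hasFlag(READ_UNCHANGED, READ_ANYCOLOR) is also true in this sketch.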


As of Version 3.0 of OpenCV, the imread function is in the imgcodecs module and not in highgui like in OpenCV 2.x.


As several functions and declarations have moved between modules in OpenCV 3.0, you may get compilation or linking errors because one or more declarations (symbols and/or functions) are not found. To figure out where (*.hpp) a symbol is defined and which library to link, we recommend the following trick using the Qt Creator IDE:

Add the #include <opencv2/opencv.hpp> directive to the code. Press the F2 function key with the mouse cursor over the symbol or function; this opens the *.hpp file where the symbol or function is declared.

After the input image file is read, check whether the operation succeeded. This check is done with the in_image.empty() member function. If the image file is read without errors, two windows are created to display the input and output images, respectively. The creation of windows is carried out with the following function:

void namedWindow(const String& winname, int flags = WINDOW_AUTOSIZE)

OpenCV windows are identified by a unique name in the program. The flags and their explanation are given by the following enumeration in the highgui.hpp header file:

enum { WINDOW_NORMAL = 0x00000000, 
  // the user can resize the window (no constraint) 
  // also use to switch a fullscreen window to a normal size
  WINDOW_AUTOSIZE = 0x00000001, 
  // the user cannot resize the window,
  // the size is constrained by the image displayed
  WINDOW_OPENGL = 0x00001000, // window with opengl support
  WINDOW_FREERATIO = 0x00000100, 
  // the image expands as much as it can (no ratio constraint)
  WINDOW_KEEPRATIO = 0x00000000 
  // the ratio of the image is respected
};

The creation of a window does not show anything on screen. The function (belonging to the highgui module) to display an image in a window is:

void imshow(const String& winname, InputArray mat)

The image (mat) is shown with its original size if the window (winname) was created with the WINDOW_AUTOSIZE flag.

In the showImage example, the second window shows a grayscale copy of the input image. To convert a color image to grayscale, the cvtColor function from the imgproc module is used. This function can actually be used to change the image color space.
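As a rough illustration of what the color-to-grayscale conversion computes for each pixel, the sketch below applies the weighted sum documented for this conversion, Y = 0.299 R + 0.587 G + 0.114 B. The bgrToGray helper is our own name and works on a single pixel; cvtColor processes whole images with optimized arithmetic, so its rounding may differ by one gray level.

```cpp
#include <cmath>

// Weighted sum documented for the BGR-to-gray conversion:
// Y = 0.299*R + 0.587*G + 0.114*B. Note the B, G, R argument order,
// which matches how OpenCV stores color pixels.
unsigned char bgrToGray(unsigned char b, unsigned char g, unsigned char r) {
    double y = 0.299 * r + 0.587 * g + 0.114 * b;
    return static_cast<unsigned char>(std::lround(y));
}
```

Green contributes the most and blue the least, so a pure blue pixel becomes a dark gray while a pure green one becomes a fairly bright gray.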

Any window created in a program can be resized and moved from its default settings. When any window is no longer required, it should be destroyed in order to release its resources. This resource liberation is done implicitly at the end of a program, like in the example.

Event handling into the intrinsic loop

If we do nothing more after showing an image on a window, surprisingly, the image will not be shown at all. After showing an image on a window, we should start a loop to fetch and handle events related to user interaction with the window. Such a task is carried out by the following function (from the highgui module):

int waitKey(int delay=0)

This function waits for a key press for a number of milliseconds (delay > 0) and returns the code of the key, or -1 if the delay ends without a key being pressed. If delay is 0 or negative, the function waits indefinitely until a key is pressed.


Remember that the waitKey function only works if there is at least one created and active window.
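The delay and return value of waitKey are often wrapped in small helpers when a display loop has to run at a target rate. The following sketch shows one possible way to do it; frameDelayMs and keyWasPressed are hypothetical names of ours, not OpenCV functions, and the sketch does not call the library at all.

```cpp
// Hypothetical helpers around the waitKey(delay) contract described above.

// Delay in milliseconds that paces a loop at roughly `fps` iterations per
// second. It never returns 0, because waitKey(0) blocks until a key press.
int frameDelayMs(double fps) {
    if (fps <= 0.0) return 1;
    int delay = static_cast<int>(1000.0 / fps);
    return delay > 0 ? delay : 1;
}

// waitKey returns -1 on timeout; any other value means a key was pressed.
bool keyWasPressed(int waitKeyResult) {
    return waitKeyResult >= 0;
}
```

A loop would then read something like if (keyWasPressed(waitKey(frameDelayMs(30)))) break;. The rate obtained is only approximate, since the time spent processing each iteration is added to the delay.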

Writing image files

Another important function in the imgcodecs module is:

bool imwrite(const String& filename, InputArray img, const vector<int>& params=vector<int>())

This function saves the image (img) into a file (filename). The optional third argument is a vector of property-value pairs specifying the parameters of the codec (leave it empty to use the default values). The codec is determined by the extension of the file.
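The vector of property-value pairs is a flat list of integers: a property identifier followed by its value. As a minimal sketch, the hypothetical jpegParams helper below builds such a list for the JPEG quality setting. To keep the sketch free of OpenCV, the property identifier is copied into a local constant; real code would use IMWRITE_JPEG_QUALITY from imgcodecs.hpp.

```cpp
#include <vector>

// Value copied from imgcodecs.hpp so the sketch builds without OpenCV;
// real code would use cv::IMWRITE_JPEG_QUALITY instead.
const int kJpegQuality = 1; // property id; its value ranges from 0 to 100

// Flat list of (property, value) pairs, as expected by the third argument
// of imwrite. The quality is clamped to the valid range.
std::vector<int> jpegParams(int quality) {
    if (quality < 0) quality = 0;
    if (quality > 100) quality = 100;
    return {kJpegQuality, quality};
}
```

A call would then look like imwrite("out.jpg", img, jpegParams(90)). More pairs can be appended to the same vector when a codec accepts several properties.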


For a detailed list of codec properties, take a look at the imgcodecs.hpp header file and the OpenCV API reference at http://docs.opencv.org/master/modules/refman.html.


Reading and writing video files

Rather than still images, a video deals with moving images. The sources of video can be a dedicated camera, a webcam, a video file, or a sequence of image files. In OpenCV, the VideoCapture and VideoWriter classes provide an easy-to-use C++ API for the task of capturing and recording involved in video processing.

The example code

The recVideo example is a short snippet of code where you can see how to use a default camera as a capture device to grab frames, convert them to grayscale, and save each converted frame to a file. Also, two windows are created to simultaneously show you the original frame and the processed one. The example code is:

#include <opencv2/opencv.hpp>
#include <iostream>

using namespace std;
using namespace cv;

int main(int, char **)
{
  Mat in_frame, out_frame;
  const char win1[]="Grabbing...", win2[]="Recording...";
  double fps=30; // Frames per second
  char file_out[]="recorded.avi";

  VideoCapture inVid(0); // Open default camera
  if (!inVid.isOpened()) { // Check error
    cout << "Error! Camera not ready...\n";
    return -1;
  }
  // Gets the width and height of the input video
  int width = (int)inVid.get(CAP_PROP_FRAME_WIDTH);
  int height = (int)inVid.get(CAP_PROP_FRAME_HEIGHT);
  // Last argument (isColor=false): grayscale frames are written
  VideoWriter recVid(file_out,
    VideoWriter::fourcc('M','S','V','C'),
    fps, Size(width, height), false);
  if (!recVid.isOpened()) {
    cout << "Error! Video file not opened...\n";
    return -1;
  }
  // Create two windows for orig. and final video
  namedWindow(win1);
  namedWindow(win2);
  while (true) {
    // Read frame from camera (grabbing and decoding)
    inVid >> in_frame;
    if (in_frame.empty()) // Stop if no frame was grabbed
      break;
    // Convert the frame to grayscale
    cvtColor(in_frame, out_frame, COLOR_BGR2GRAY);
    // Write frame to video file (encoding and saving)
    recVid << out_frame;
    imshow(win1, in_frame); // Show frame in window
    imshow(win2, out_frame); // Show frame in window
    if (waitKey((int)(1000/fps)) >= 0)
      break;
  }
  inVid.release(); // Close camera
  return 0;
}

In this example, the following functions deserve a quick review:

  • double VideoCapture::get(int propId): This returns the value of the specified property for a VideoCapture object. A complete list of properties based on DC1394 (IEEE 1394 Digital Camera Specifications) is included with the videoio.hpp header file.

  • static int VideoWriter::fourcc(char c1, char c2, char c3, char c4): This concatenates four characters to a fourcc code. In the example, MSVC stands for Microsoft Video (only available for Windows). The list of valid fourcc codes is published at http://www.fourcc.org/codecs.php.

  • bool VideoWriter::isOpened(): This returns true if the object for writing the video was successfully initialized. For instance, using an improper codec produces an error.


    Be cautious; the valid fourcc codes in a system depend on the locally installed codecs. To know the installed fourcc codecs available in the local system, we recommend the open source tool MediaInfo, available for many platforms at http://mediaarea.net/en/MediaInfo.

  • VideoCapture& VideoCapture::operator>>(Mat& image): This grabs, decodes, and returns the next frame. This method has the equivalent bool VideoCapture::read(OutputArray image) function. It can be used rather than using the VideoCapture::grab() function, followed by VideoCapture::retrieve().

  • VideoWriter& VideoWriter::operator<<(const Mat& image): This writes the next frame. This method has the equivalent void VideoWriter::write(const Mat& image) function.

    In this example, there is a reading/writing loop where the window events are fetched and handled as well. The waitKey(1000/fps) function call is in charge of that; however, in this case, 1000/fps indicates the number of milliseconds to wait before returning to the external loop. Although not exact, an approximate measure of frames per second is obtained for the recorded video.

  • void VideoCapture::release(): This releases the video file or capturing device. Although not explicitly necessary in this example, we include it to illustrate its use.
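The fourcc code mentioned in the list above is simply four characters packed into one 32-bit integer, with the first character in the lowest byte. The following standalone sketch shows that packing and its inverse. The packFourcc and unpackFourcc names are ours; to our knowledge, the byte layout matches the one used by VideoWriter::fourcc, but the sketch does not call OpenCV.

```cpp
#include <cstdint>
#include <string>

// Pack four characters into one 32-bit code, first character in the
// lowest byte. A standalone stand-in for VideoWriter::fourcc.
std::int32_t packFourcc(char c1, char c2, char c3, char c4) {
    return static_cast<std::int32_t>(
        (static_cast<std::uint32_t>(c1) & 255u) |
        ((static_cast<std::uint32_t>(c2) & 255u) << 8) |
        ((static_cast<std::uint32_t>(c3) & 255u) << 16) |
        ((static_cast<std::uint32_t>(c4) & 255u) << 24));
}

// Recover the four characters, for example to print a code in readable form.
std::string unpackFourcc(std::int32_t code) {
    std::string s(4, ' ');
    for (int i = 0; i < 4; ++i) {
        s[i] = static_cast<char>(
            (static_cast<std::uint32_t>(code) >> (8 * i)) & 255u);
    }
    return s;
}
```

For instance, the characters 'M', 'S', 'V', 'C' used in the example pack to the hexadecimal value 0x4356534D, and unpacking that value gives the string "MSVC" again.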


User-interaction tools

In the previous sections, we explained how to create (namedWindow) a window to display (imshow) an image and fetch/handle events (waitKey). The examples we provide show you a very easy method for user interaction with OpenCV applications through the keyboard. The waitKey function returns the code of a key pressed before a timeout expires.

Fortunately, OpenCV provides more flexible ways for user interaction, such as trackbars and mouse interaction, which can be combined with some drawing functions to provide a richer user experience. Moreover, if OpenCV is locally compiled with Qt support (the WITH_QT option of CMake), a set of new functions are available to program an even better UI.

In this section, we provide a quick review of the available functionality to program user interfaces in an OpenCV project with Qt support. We illustrate this review on OpenCV UI support with the next example named showUI.

The example shows you a color image in a window, illustrating how to use some basic elements to enrich the user interaction. The following screenshot displays the UI elements created in the example:

The output window for the showUI example

The source code of the showUI example (without the callback functions) is as follows:

#include <opencv2/opencv.hpp>
#include <iostream>

using namespace std;
using namespace cv;

// Callback functions declarations
void cbMouse(int event, int x, int y, int flags, void*);
void tb1_Callback(int value, void *);
void tb2_Callback(int value, void *);
void checkboxCallBack(int state, void *);
void radioboxCallBack(int state, void *id);
void pushbuttonCallBack(int, void *font);

// Global definitions and variables
Mat orig_img, tmp_img;
const char main_win[]="main_win";
char msg[50];

int main(int, char* argv[]) {
  const char track1[]="TrackBar 1";
  const char track2[]="TrackBar 2";
  const char checkbox[]="Check Box";
  const char radiobox1[]="Radio Box1";
  const char radiobox2[]="Radio Box2";
  const char pushbutton[]="Push Button";
  int tb1_value = 50; // Initial value of trackbar 1
  int tb2_value = 25; // Initial value of trackbar 2

  orig_img = imread(argv[1]); // Open and read the image
  if (orig_img.empty()) {
    cout << "Error!!! Image cannot be loaded..." << endl;
    return -1;
  }
  namedWindow(main_win); // Creates main window
  // Creates a font for adding text to the image
  // (weight and style are left at their default values)
  QtFont font = fontQt("Arial", 20, Scalar(255,0,0,0));
  // Creation of CallBack functions
  setMouseCallback(main_win, cbMouse, NULL);
  createTrackbar(track1, main_win, &tb1_value,
    100, tb1_Callback);
  createButton(checkbox, checkboxCallBack, 0,
    QT_CHECKBOX);
  // Passing values (font) to the CallBack
  createButton(pushbutton, pushbuttonCallBack,
    (void *)&font, QT_PUSH_BUTTON);
  // No window name: the trackbar goes to the properties window
  createTrackbar(track2, NULL, &tb2_value,
    50, tb2_Callback);
  // Passing values to the CallBack
  createButton(radiobox1, radioboxCallBack,
    (void *)radiobox1, QT_RADIOBOX);
  createButton(radiobox2, radioboxCallBack,
    (void *)radiobox2, QT_RADIOBOX);

  imshow(main_win, orig_img); // Shows original image
  cout << "Press any key to exit..." << endl;
  waitKey(); // Infinite loop with handle for events
  return 0;
}

When OpenCV is built with Qt support, every created window—through the highgui module—shows a default toolbar (see the preceding figure) with options (from left to right) for panning, zooming, saving, and opening the properties window.

In addition to the mentioned toolbar (only available with Qt), the next subsections describe the different UI elements created in the example and the code to implement them.


Trackbars

Trackbars are created with the createTrackbar(const String& trackbarname, const String& winname, int* value, int count, TrackbarCallback onChange=0, void* userdata=0) function in the specified window (winname), with a linked integer value (value), a maximum value (count), an optional callback function (onChange) to be called on changes of the slider, and an argument (userdata) to the callback function. The callback function itself gets two arguments: value (selected by the slider) and a pointer to userdata (optional).

With Qt support, if no window is specified, the trackbar is created in the properties window. In the showUI example, we create two trackbars: the first in the main window and the second one in the properties window. The code for the trackbar callbacks is:

void tb1_Callback(int value, void *) {

  sprintf(msg, "Trackbar 1 changed. New value=%d", value);
  displayOverlay(main_win, msg);
  return;
}

void tb2_Callback(int value, void *) {

  sprintf(msg, "Trackbar 2 changed. New value=%d", value);
  displayStatusBar(main_win, msg, 1000);
  return;
}
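Since a trackbar only holds an integer between 0 and its maximum value (count), a callback that needs a real-valued parameter usually rescales the slider position. The following is a minimal sketch of such a mapping; sliderToRange is a hypothetical helper of ours, not an OpenCV function.

```cpp
// Hypothetical helper: map an integer trackbar position in [0, count]
// to a real-valued parameter in [lo, hi]. Positions outside the slider
// range are clamped first.
double sliderToRange(int value, int count, double lo, double hi) {
    if (count <= 0) return lo;
    if (value < 0) value = 0;
    if (value > count) value = count;
    return lo + (hi - lo) * static_cast<double>(value) / count;
}
```

Inside a callback such as tb1_Callback, a call like sliderToRange(value, 100, 0.0, 1.0) would turn the slider position into a weight between 0 and 1.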

Mouse interaction

Mouse events are generated whenever the user interacts with the mouse (moving and clicking). By setting the proper handler or callback functions, it is possible to implement actions such as select, drag and drop, and so on. The callback function (onMouse) is enabled with the setMouseCallback(const String& winname, MouseCallback onMouse, void* userdata=0) function in the specified window (winname), with an optional argument (userdata).

The source code for the callback function that handles the mouse event is:

void cbMouse(int event, int x, int y, int flags, void*) {
  // Static vars hold values between calls
  static Point p1, p2;
  static bool p2set = false;

  // Left mouse button pressed
  if (event == EVENT_LBUTTONDOWN) {
    p1 = Point(x, y); // Set orig. point
    p2set = false;
  } else if (event == EVENT_MOUSEMOVE &&
    flags == EVENT_FLAG_LBUTTON) {
    // Check moving mouse and left button down
    // Check out bounds
    if (x > orig_img.size().width)
      x = orig_img.size().width;
    else if (x < 0)
      x = 0;
    // Check out bounds
    if (y > orig_img.size().height)
      y = orig_img.size().height;
    else if (y < 0)
      y = 0;
    p2 = Point(x, y); // Set final point
    p2set = true;
    // Copy orig. to temp. image
    orig_img.copyTo(tmp_img);
    // Draws rectangle
    rectangle(tmp_img, p1, p2, Scalar(0, 0, 255));
    // Draw temporal image with rect.
    imshow(main_win, tmp_img);
  } else if (event == EVENT_LBUTTONUP
    && p2set) {
    // Check if left button is released
    // and selected an area
    // Set subarray on orig. image
    // with selected rectangle
    Mat submat = orig_img(Rect(p1, p2));
    // Here some processing for the submatrix
    // Mark the boundaries of selected rectangle
    rectangle(orig_img, p1, p2, Scalar(0, 0, 255), 2);
    imshow(main_win, orig_img);
  }
  return;
}

In the showUI example, the mouse events are used, through a callback function (cbMouse), to control the selection of a rectangular region by drawing a rectangle around it. In the example, this function is declared as void cbMouse(int event, int x, int y, int flags, void*), the arguments being the type of event (event), the position of the pointer (x, y) where the event occurs, the condition when the event occurs (flags), and optionally, userdata.
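Two small steps of the mouse callback are worth isolating: keeping the pointer coordinates inside the image, and turning two corners given in any order into a rectangle. The sketch below reproduces both with plain structs so that it builds without OpenCV. The Pt, Box, clampTo, and boxFromCorners names are ours; boxFromCorners applies the same normalization that Rect(p1, p2) performs in the callback.

```cpp
#include <algorithm>

struct Pt  { int x, y; };
struct Box { int x, y, width, height; };

// Keep a coordinate inside [0, limit], as the callback does for x and y.
int clampTo(int v, int limit) {
    return std::max(0, std::min(v, limit));
}

// Rectangle spanned by two opposite corners given in any order: the
// top-left corner is the minimum of each coordinate and the size is the
// distance between the corners.
Box boxFromCorners(Pt a, Pt b) {
    Box r;
    r.x = std::min(a.x, b.x);
    r.y = std::min(a.y, b.y);
    r.width  = std::max(a.x, b.x) - r.x;
    r.height = std::max(a.y, b.y) - r.y;
    return r;
}
```

Because of this normalization, the user can drag the mouse in any direction (for example, from the bottom-right to the top-left corner) and still obtain a rectangle with a positive width and height.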


The available events, flags, and their corresponding definition symbols can be found in the highgui.hpp header file.


Buttons

OpenCV (only with Qt support) allows you to create three types of buttons: checkbox (QT_CHECKBOX), radiobox (QT_RADIOBOX), and push button (QT_PUSH_BUTTON). These types of button can be used respectively to set options, set exclusive options, and take actions on push. The three are created with the createButton(const String& button_name, ButtonCallback on_change, void* userdata=0, int type=QT_PUSH_BUTTON, bool init_state=false) function in the properties window, arranged in a buttonbar after the last trackbar created in this window. The arguments for the button are its name (button_name), the callback function called on the status change (on_change), and optionally, an argument (userdata) to the callback, the type of button (type), and the initial state of the button (init_state).

Next, we show you the source code for the callback functions corresponding to buttons in the example:

void checkboxCallBack(int state, void *) {

  sprintf(msg, "Check box changed. New state=%d", state);
  displayStatusBar(main_win, msg);
  return;
}

void radioboxCallBack(int state, void *id) {

  // Id of the radio box passed to the callBack
  sprintf(msg, "%s changed. New state=%d",
    (char *)id, state);
  displayStatusBar(main_win, msg);
  return;
}

void pushbuttonCallBack(int, void *font) {

  // Add text to the image
  addText(orig_img, "Push button clicked",
    Point(50,50), *((QtFont *)font));
  imshow(main_win, orig_img); // Shows original image
  return;
}

The callback function for a button gets two arguments: its status and, optionally, a pointer to user data. In the showUI example, we show you how to pass a string (radioboxCallBack(int state, void *id)) that identifies the button and a more complex object (pushbuttonCallBack(int, void *font)).
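The pattern of passing data through an untyped pointer can be tried without any GUI. In the following sketch, ButtonInfo, onButton, and fireButton are hypothetical names of ours: fireButton stands in for the library, which only stores the pointer and hands it back to the callback, and the callback casts it to the type that was registered.

```cpp
#include <string>

// Same shape as the button callbacks above: a state plus an untyped pointer.
typedef void (*ButtonHandler)(int state, void* userdata);

// Hypothetical payload; any object that outlives the callback will do.
struct ButtonInfo {
    std::string name;
    int clicks;
};

// The callback casts the pointer back to the type that was registered
// and counts the calls that report a non-zero state.
void onButton(int state, void* userdata) {
    ButtonInfo* info = static_cast<ButtonInfo*>(userdata);
    if (state != 0) {
        ++info->clicks;
    }
}

// Stand-in for the library side: it only forwards the pointer it was given.
void fireButton(ButtonHandler handler, int state, void* userdata) {
    handler(state, userdata);
}
```

The object behind the pointer must stay alive for as long as the callback can be invoked. In the showUI example this holds because font is a local variable of main, and main stays blocked in waitKey while the callbacks run.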

Drawing and displaying text

A very efficient way to communicate the results of some image processing to the user is by drawing shapes and/or displaying text over the image being processed. Through the imgproc module, OpenCV provides some convenient functions for tasks such as putting text and drawing lines, circles, ellipses, rectangles, polygons, and so on. The showUI example illustrates how to select a rectangular region over an image and draw a rectangle to mark the selected area. The following function draws a rectangle defined by two points (pt1, pt2) over an image (img) with the specified color and other optional parameters, such as thickness (negative for a filled shape) and the type of lines:

void rectangle(InputOutputArray img, Point pt1, Point pt2, const Scalar& color, int thickness=1, int lineType=LINE_8, int shift=0)
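To make the effect of the sign of thickness concrete, the following toy sketch marks a rectangle on a small grid of 0/1 cells, either as an outline or filled. It is only an illustration under our own assumptions: Raster, drawRect, and countMarked are hypothetical names, the grid is not a Mat, and only a one-cell outline is supported, whereas the real function handles colors, arbitrary thickness, and line types.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Toy binary raster (not cv::Mat): rows of 0/1 cells.
typedef std::vector<std::vector<int> > Raster;

// Mark a rectangle whose opposite corners (x1, y1) and (x2, y2) are both
// included. thickness < 0 fills it; otherwise a one-cell outline is drawn.
// Cells that fall outside the raster are skipped.
void drawRect(Raster& img, int x1, int y1, int x2, int y2, int thickness) {
    int left = std::min(x1, x2), right = std::max(x1, x2);
    int top = std::min(y1, y2), bottom = std::max(y1, y2);
    for (int y = top; y <= bottom; ++y) {
        if (y < 0 || y >= static_cast<int>(img.size())) continue;
        for (int x = left; x <= right; ++x) {
            if (x < 0 || x >= static_cast<int>(img[y].size())) continue;
            bool onEdge = (x == left || x == right ||
                           y == top || y == bottom);
            if (thickness < 0 || onEdge) img[y][x] = 1;
        }
    }
}

// Number of marked cells, handy for checking the result.
int countMarked(const Raster& img) {
    int n = 0;
    for (std::size_t y = 0; y < img.size(); ++y)
        for (std::size_t x = 0; x < img[y].size(); ++x)
            n += img[y][x];
    return n;
}
```

For corners (1, 1) and (4, 3), the outline marks only the border cells of the 4 x 3 block, while a negative thickness marks all of them.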

In addition to shape drawing, the imgproc module provides the following function to put text over an image:

void putText(InputOutputArray img, const String& text, Point org, int fontFace, double fontScale, Scalar color, int thickness=1, int lineType=LINE_8, bool bottomLeftOrigin=false )


The available font faces for the text can be inspected in the core.hpp header file.

Qt support, in the highgui module, adds some additional ways to show text on the main window of an OpenCV application:

  • Text over the image: We get this result using the addText(const Mat& img, const String& text, Point org, const QtFont& font) function. This function allows you to select the origin point for the displayed text with a font previously created with the fontQt(const String& nameFont, int pointSize=-1, Scalar color=Scalar::all(0), int weight=QT_FONT_NORMAL, int style=QT_STYLE_NORMAL, int spacing=0) function. In the showUI example, this function is used to put text over the image when the push button is clicked on, calling the addText function inside the callback function.

  • Text on the status bar: Using the displayStatusBar(const String& winname, const String& text, int delayms=0 ) function, we display text in the status bar for a number of milliseconds given by the last argument (delayms). In the showUI example, this function is used (in the callback functions) to display an informative text when the buttons and trackbar of the properties window change their state.

  • Text overlaid on the image: Using the displayOverlay(const String& winname, const String& text, int delayms=0) function, we display text overlaid on the image for a number of milliseconds given by the last argument. In the showUI example, this function is used (in the callback function) to display informative text when the main window trackbar changes its value.



Summary

In this chapter, you got a quick review of the main purpose of the OpenCV library and its modules. You learned the foundations of how to compile, install, and use the library in your local system to develop C++ OpenCV applications with Qt support. To develop your own software, we explained how to start with the free Qt Creator IDE and the GNU compiler kit.

To start with, full code examples were provided in the chapter. These examples showed you how to read and write images and video. Finally, the chapter gave you an example of displaying some easy-to-implement user interface capabilities in OpenCV programs, such as trackbars, buttons, putting text on images, drawing shapes, and so on.

The next chapter will be devoted to establishing the main image processing tools and tasks that will set the basis for the remaining chapters.

About the Authors

  • Gloria Bueno García

    Gloria Bueno García holds a PhD in machine vision from Coventry University, UK. She has experience working as the principal researcher in several research centers, such as UMR 7005 research unit CNRS/ Louis Pasteur Univ. Strasbourg (France), Gilbert Gilkes & Gordon Technology (UK), and CEIT San Sebastian (Spain). She is the author of two patents, one registered type of software, and more than 100 refereed papers. Her interests are in 2D/3D multimodality image processing and artificial intelligence. She leads the VISILAB research group at the University of Castilla-La Mancha. She has coauthored a book on OpenCV programming for mobile devices: OpenCV essentials, Packt Publishing.

  • Oscar Deniz Suarez

    Oscar Deniz Suarez's research interests are mainly focused on computer vision and pattern recognition. He is the author of more than 50 refereed papers in journals and conferences. He received the runner-up award for the best PhD work on computer vision and pattern recognition by AERFAI and the Image File and Reformatting Software Challenge Award by Innocentive Inc. He has been a national finalist for the 2009 Cor Baayen award. His work is used by cutting-edge companies, such as Existor, Gliif, Tapmedia, E-Twenty, and others, and has also been added to OpenCV. Currently, he works as an associate professor at the University of Castilla-La Mancha and contributes to VISILAB. He is a senior member of IEEE and is affiliated with AAAI, SIANI, CEA-IFAC, AEPIA, and AERFAI-IAPR. He serves as an academic editor of the PLoS ONE journal. He has been a visiting researcher at Carnegie Mellon University, Imperial College London, and Leica Biosystems. He has coauthored two books on OpenCV previously.

  • José Luis Espinosa Aranda

    José Luis Espinosa Aranda holds a PhD in computer science from the University of Castilla-La Mancha. He has been a finalist for Certamen Universitario Arquímedes de Introducción a la Investigación científica in 2009 for his final degree project in Spain. His research interests involve computer vision, heuristic algorithms, and operational research. He is currently working at the VISILAB group as an assistant researcher and developer in computer vision topics.

  • Jesus Salido Tercero

    Jesus Salido Tercero gained his electrical engineering degree and PhD (1996) from Universidad Politécnica de Madrid (Spain). He then spent 2 years (1997 and 1998) as a visiting scholar at the Robotics Institute (Carnegie Mellon University, Pittsburgh, USA), working on cooperative multirobot systems. Since his return to the Spanish University of Castilla-La Mancha, he spends his time teaching courses on robotics and industrial informatics, along with research on vision and intelligent systems. Over the last 3 years, his efforts have been directed to develop vision applications on mobile devices. He has coauthored a book on OpenCV programming for mobile devices.

  • Ismael Serrano Gracia

    Ismael Serrano Gracia received his degree in computer science in 2012 from the University of Castilla-La Mancha. He got the highest marks for his final degree project on person detection. This application uses depth cameras with OpenCV libraries. Currently, he is a PhD candidate at the same university, holding a research grant from the Spanish Ministry of Science and Research. He is also working at the VISILAB group as an assistant researcher and developer on different computer vision topics.

  • Noelia Vállez Enano

    Noelia Vállez Enano has liked computers since her childhood, though she didn't have one before her mid-teens. In 2009, she finished her studies in computer science at the University of Castilla-La Mancha, where she graduated with top honors. She started working at the VISILAB group through a project on mammography CAD systems and electronic health records. Since then, she has obtained a master's degree in physics and mathematics and has enrolled for a PhD degree. Her work involves using image processing and pattern recognition methods. She also likes teaching and working in other areas of artificial intelligence.
