Free eBook - Learning OpenCV 4 Computer Vision with Python 3 - Third Edition

4 (1 review total)
By Joseph Howse , Joe Minichino

About this book

Computer vision is a rapidly evolving science, encompassing diverse applications and techniques. This book will not only help those who are getting started with computer vision but also experts in the domain. You’ll be able to put theory into practice by building apps with OpenCV 4 and Python 3.

You’ll start by understanding OpenCV 4 and how to set it up with Python 3 on various platforms. Next, you’ll learn how to perform basic operations such as reading, writing, manipulating, and displaying still images, videos, and camera feeds. The book then takes you through image processing, video analysis, and depth estimation and segmentation, and gives you hands-on practice by building a GUI app. After that, you’ll tackle two popular challenges: face detection and face recognition. You’ll also learn about object classification and machine learning concepts, which will enable you to create and use object detectors and classifiers, and even track objects in movies or video camera feeds. Later, you’ll develop your skills in 3D tracking and augmented reality. Finally, you’ll cover artificial neural networks (ANNs) and deep neural networks (DNNs), learning how to develop apps for recognizing handwritten digits and classifying a person's gender and age.

By the end of this book, you’ll have the skills you need to execute real-world computer vision projects.

Publication date:
February 2020


Handling Files, Cameras, and GUIs

Installing OpenCV and running samples is fun, but at this stage, we want to try things out in our own way. This chapter introduces OpenCV's I/O functionality. We also discuss the concept of a project and the beginnings of an object-oriented design for this project, which we will flesh out in subsequent chapters.

By starting with a look at I/O capabilities and design patterns, we will build our project in the same way we would make a sandwich: from the outside in. Bread slices and spread, or endpoints and glue, come before fillings or algorithms. We choose this approach because computer vision is mostly extroverted (it contemplates the real world outside our computer), and we want to apply all of our subsequent algorithmic work to the real world through a common interface.

Specifically, in this chapter, our code samples and discussions...


Technical requirements


Basic I/O scripts

Most CV applications need to get images as input. Most also produce images as output. An interactive CV application might require a camera as an input source and a window as an output destination. However, other possible sources and destinations include image files, video files, and raw bytes. For example, raw bytes might be transmitted via a network connection, or they might be generated by an algorithm if we incorporate procedural graphics into our application. Let's look at each of these possibilities.
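As a rough illustration of the raw-bytes case, the sketch below reinterprets a flat buffer of bytes as a grayscale image and as a three-channel image using NumPy. The 400x300 size and the use of os.urandom as a stand-in for bytes received from a network or generated procedurally are assumptions made for this example only.

```python
import os

import numpy as np

# Assumed image size for this sketch (not taken from the text).
width, height = 400, 300

# Stand-in for raw bytes that might arrive over a network connection.
raw_gray = os.urandom(width * height)

# One byte per pixel: reshape to (rows, columns) for a grayscale image.
gray = np.frombuffer(raw_gray, dtype=np.uint8).reshape(height, width)

# Three bytes per pixel: reshape to (rows, columns, channels).
raw_color = os.urandom(width * height * 3)
color = np.frombuffer(raw_color, dtype=np.uint8).reshape(height, width, 3)

# frombuffer on an immutable bytes object yields a read-only view,
# so take a copy before modifying pixels.
editable = color.copy()
editable[0, 0] = (255, 255, 255)
```

The only requirement is that the buffer length matches rows x columns x channels; otherwise reshape raises a ValueError.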

Reading/writing an image file

OpenCV provides the imread function to load an image from a file and the imwrite function to write an image to a file. These functions support various file formats for...


Project Cameo (face tracking and image manipulation)

OpenCV is often studied through a cookbook approach that covers a lot of algorithms, but nothing about high-level application development. To an extent, this approach is understandable because OpenCV's potential applications are so diverse. OpenCV is used in a wide variety of applications, such as photo/video editors, motion-controlled games, a robot's AI, or psychology experiments where we log participants' eye movements. Across these varied use cases, can we truly study a useful set of abstractions?

The book's authors believe we can, and the sooner we start creating abstractions, the better. We will structure many of our OpenCV examples around a single application, but, at each step, we will design a component of this application to be extensible and reusable.

We will develop an interactive application...


Cameo – an object-oriented design

Python applications can be written in a purely procedural style. This is often done with small applications, such as our basic I/O scripts, discussed previously. However, from now on, we will often use an object-oriented style because it promotes modularity and extensibility.

From our overview of OpenCV's I/O functionality, we know that all images are similar, regardless of their source or destination. No matter how we obtain a stream of images or where we send it as output, we can apply the same application-specific logic to each frame in this stream. Separation of I/O code and application code becomes especially convenient in an application such as Cameo, which uses multiple I/O streams.
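The idea of keeping per-frame logic independent of the I/O source and destination can be sketched in plain Python as follows. This is not the book's design: the names Frame, process_stream, frames_from_memory, and ListSink are hypothetical, and a frame is modeled as a nested list of pixel values purely to keep the example free of dependencies.

```python
from typing import Callable, Iterable, Iterator, List

# Hypothetical stand-in for an image: rows of grayscale pixel values.
Frame = List[List[int]]


def invert(frame: Frame) -> Frame:
    """Application-specific logic: invert each 8-bit pixel."""
    return [[255 - pixel for pixel in row] for row in frame]


def process_stream(frames: Iterable[Frame],
                   transform: Callable[[Frame], Frame]) -> Iterator[Frame]:
    """Apply the same transform to every frame, whatever the source is."""
    for frame in frames:
        yield transform(frame)


def frames_from_memory() -> Iterator[Frame]:
    """One possible source; a camera or a video file could replace it."""
    yield [[0, 128], [255, 64]]
    yield [[10, 20], [30, 40]]


class ListSink:
    """One possible destination; a window or a file writer could replace it."""

    def __init__(self) -> None:
        self.frames: List[Frame] = []

    def write(self, frame: Frame) -> None:
        self.frames.append(frame)


sink = ListSink()
for output in process_stream(frames_from_memory(), invert):
    sink.write(output)
```

Because invert never touches the source or the sink, either end can be swapped without changing the per-frame code.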

We will create classes called CaptureManager and WindowManager as high-level interfaces to I/O streams. Our application code may use CaptureManager...



By now, we should have an application that displays a camera feed, listens for keyboard input, and (on command) records a screenshot or screencast. We are ready to extend the application by inserting some image-filtering code (Chapter 3, Processing Images with OpenCV) between the start and end of each frame. Optionally, we are also ready to integrate other camera drivers or application frameworks besides the ones supported by OpenCV.

We also possess the knowledge to manipulate images as NumPy arrays. This forms the perfect foundation for our next topic, filtering images.

About the Authors

  • Joseph Howse

    Joseph Howse lives in a Canadian fishing village with four cats; the cats like fish, but they prefer chicken.

    Joseph provides computer vision expertise through his company, Nummist Media. His books include OpenCV 4 for Secret Agents, Learning OpenCV 4 Computer Vision with Python 3, OpenCV 3 Blueprints, Android Application Programming with OpenCV 3, iOS Application Development with OpenCV 3, and Python Game Programming by Example, published by Packt.

  • Joe Minichino

    Joe Minichino is an R&D labs engineer at Teamwork. He is a passionate programmer who is immensely curious about programming languages and technologies and constantly experimenting with them. Born and raised in Varese, Lombardy, Italy, and coming from a humanistic background in philosophy (at Milan's Università Statale), Joe has lived in Cork, Ireland, since 2004. There, he became a computer science graduate at the Cork Institute of Technology.


Latest Reviews

I like the book. Some examples are really great and detailed. The author goes deep into the subjects and it explains them well. My only problem is some examples are not complete and the author ask us to see the complete files in the code .. Other than that, it's well.
