You're reading from  iOS Application Development with OpenCV 3

Product type: Book
Published in: Jun 2016
Reading Level: Intermediate
Publisher: Packt
ISBN-13: 9781785289491
Edition: 1st Edition
Author: Joseph Howse

Joseph Howse lives in a Canadian fishing village, where he chats with his cats, crafts his books, and nurtures an orchard of hardy fruit trees. He is President of Nummist Media Corporation, which exists to support his books and to provide mentoring and consulting services, with a specialty in computer vision. On average, in 2015-2022, Joseph has written 1.4 new books or new editions per year for Packt. He also writes fiction, including an upcoming novel about the lives of a group of young people in the last days of the Soviet Union.

Chapter 4. Detecting and Merging Faces of Mammals

 

"A cat may look at a king."

 
 --English proverb

This chapter puts a spotlight on two of my favorite subjects: cats and augmented reality (AR). We will build an AR application called ManyMasks, which will detect, highlight, and merge the faces of humans and cats. Specifically, the app's user will be able to do the following things:

  • See the boundaries of a human face or cat face in a live camera view as well as the centers of the eyes and the tip of the nose. This visualization depends on the result of a face detection algorithm.

  • Select two detected faces from different camera frames.

  • See a hybrid face, which is produced by aligning and blending the two selected faces.

  • Save and share the hybrid face.

Our face detection algorithm relies on cascade classifiers, which attempt to match various patches of the image to a pretrained, generic model of a human face, human eye, or cat face. We estimate the positions of other facial features based on a set...

Understanding detection with cascade classifiers


A cascade is a series of tests or stages, which differentiate between a positive and negative class of objects, such as face and non-face. For a positive classification, a patch of an image must pass all stages of the cascade. Conversely, if the patch fails any stage, the classifier immediately makes a negative classification.
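
The early-rejection behavior described above can be sketched in a few lines of C++. This is only an illustration of the idea, not OpenCV's implementation: Patch, Stage, and classify are hypothetical names, and a real cascade stage evaluates trained image features rather than an arbitrary function.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in for an image patch: a flat list of grayscale values.
using Patch = std::vector<int>;

// A stage is any test that accepts (true) or rejects (false) a patch.
using Stage = std::function<bool(const Patch &)>;

// Returns true only if the patch passes every stage of the cascade.
// Evaluation stops at the first failing stage, so later stages never run
// for a patch that has already been rejected. If stagesRun is non-null,
// it receives the number of stages that were actually evaluated.
bool classify(const std::vector<Stage> &stages, const Patch &patch,
              int *stagesRun = nullptr) {
  assert(!stages.empty());  // A cascade needs at least one stage.
  int count = 0;
  bool positive = true;
  for (const Stage &stage : stages) {
    ++count;
    if (!stage(patch)) {
      positive = false;  // Immediate negative classification.
      break;
    }
  }
  if (stagesRun != nullptr) {
    *stagesRun = count;
  }
  return positive;
}
```

Because most patches of a typical image are not faces, most of them fail an early stage, and the classifier spends very little time on them.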

A patch or window of an image is a sample of pixels around a given position and at a given magnification level. A cascade classifier takes windows of the image at various positions and various magnification levels, and for each window it runs the stages of the cascade. Often, positive detections occur in multiple, overlapping windows. These overlapping positive detections are called neighbors, and they imply a greater likelihood of a true positive. For example, a real face still looks like a face if we move or resize the frame around it slightly.
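
To make the idea of neighbors concrete, here is a simplified sketch that keeps a positive window only if enough other positive windows overlap it. The Window struct, the areNeighbors test, and the 50% overlap threshold are assumptions chosen for illustration. OpenCV's cv::CascadeClassifier::detectMultiScale exposes a minNeighbors parameter in the same spirit, but its grouping logic is more involved than this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical square detection window: top-left corner and side length.
struct Window {
  int x, y, size;
};

// Treats two windows as neighbors if their overlap covers at least
// minOverlap (as a fraction) of the smaller window's area.
bool areNeighbors(const Window &a, const Window &b, double minOverlap = 0.5) {
  const int left = std::max(a.x, b.x);
  const int top = std::max(a.y, b.y);
  const int right = std::min(a.x + a.size, b.x + b.size);
  const int bottom = std::min(a.y + a.size, b.y + b.size);
  if (right <= left || bottom <= top) {
    return false;  // No overlap at all.
  }
  const double overlapArea =
      static_cast<double>(right - left) * (bottom - top);
  const double smallerSide = std::min(a.size, b.size);
  return overlapArea >= minOverlap * smallerSide * smallerSide;
}

// Keeps only the positive windows that have at least minNeighbors
// other positive windows as neighbors.
std::vector<Window> filterByNeighbors(const std::vector<Window> &positives,
                                      int minNeighbors) {
  assert(minNeighbors >= 0);
  std::vector<Window> kept;
  for (std::size_t i = 0; i < positives.size(); ++i) {
    int neighbors = 0;
    for (std::size_t j = 0; j < positives.size(); ++j) {
      if (i != j && areNeighbors(positives[i], positives[j])) {
        ++neighbors;
      }
    }
    if (neighbors >= minNeighbors) {
      kept.push_back(positives[i]);
    }
  }
  return kept;
}
```

A higher neighbor threshold suppresses isolated detections, which are more likely to be false positives, at the cost of possibly missing a real face that was detected in only one window.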

By now, you might be wondering exactly how we design a cascade's stages....

Understanding transformations


After we detect two faces and before we blend them, we will try to align the faces based on the eye and nose coordinates. This alignment step is a geometric transformation, which remaps points (or pixels) from one space to another. For example, the following geometric operations are special cases of a transformation:

  • Translation: This shifts every point by the same offset. It repositions the points around a new center.

  • Rotation: This spins the points around a center.

  • Scale: This moves the points farther from or nearer to a center.

Mathematically, a transformation is a matrix and a point (or pixel position) is a vector. We can multiply them together to apply the transformation to the point. The output of the multiplication is a new point.
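
As a minimal sketch of this multiplication, the following code stores a 2D transformation as a 2x3 matrix and applies it to a point. The Point struct, the Affine alias, and the helper names are hypothetical and exist only so that the example compiles without OpenCV. The point is implicitly extended to (x, y, 1) so that the third column of the matrix can carry the translation.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a 2D point.
struct Point {
  double x, y;
};

// Row-major 2x3 matrix: [m0 m1 m2; m3 m4 m5].
using Affine = std::array<double, 6>;

// Multiplies the matrix by the column vector (p.x, p.y, 1).
Point apply(const Affine &m, const Point &p) {
  return {m[0] * p.x + m[1] * p.y + m[2],
          m[3] * p.x + m[4] * p.y + m[5]};
}

// Shifts every point by (dx, dy).
Affine translation(double dx, double dy) {
  return {1.0, 0.0, dx, 0.0, 1.0, dy};
}

// Spins points around center by the given angle. The rotation is
// counterclockwise when y points up and appears clockwise on screen,
// where y points down.
Affine rotation(double radians, const Point &center) {
  const double c = std::cos(radians);
  const double s = std::sin(radians);
  return {c, -s, center.x - c * center.x + s * center.y,
          s, c, center.y - s * center.x - c * center.y};
}

// Moves points farther from (factor > 1) or nearer to (factor < 1) center.
Affine scale(double factor, const Point &center) {
  assert(factor != 0.0);  // A zero factor would collapse all points.
  return {factor, 0.0, center.x * (1.0 - factor),
          0.0, factor, center.y * (1.0 - factor)};
}
```

For example, apply(scale(2.0, Point{1.0, 1.0}), Point{2.0, 3.0}) doubles the point's offset from the center (1, 1) and yields (3, 5).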

Conversely, given three pairs of points—in our case, the pairs of left eye centers, right eye centers, and nose tips—we can solve for the transformation matrix that maps one set of points onto the other. This is a problem of linear algebra. After...
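
To illustrate the linear algebra, here is a self-contained sketch that solves for a 2x3 matrix from three point pairs using Cramer's rule. The names Point, Affine, det3, and solveAffine are hypothetical. OpenCV offers cv::getAffineTransform for the same kind of problem, so in practice there is rarely a need to hand-write the solver. The sketch assumes that the three source points are not collinear.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a 2D point.
struct Point {
  double x, y;
};

// Row-major 2x3 matrix: [m0 m1 m2; m3 m4 m5].
using Affine = std::array<double, 6>;

// Determinant of the 3x3 matrix [a b c; d e f; g h i].
double det3(double a, double b, double c,
            double d, double e, double f,
            double g, double h, double i) {
  return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g);
}

// Finds the matrix M such that M * (src[k].x, src[k].y, 1) = dst[k]
// for k = 0, 1, 2. Each row of M is the solution of a 3x3 linear system.
Affine solveAffine(const std::array<Point, 3> &src,
                   const std::array<Point, 3> &dst) {
  const double det = det3(src[0].x, src[0].y, 1.0,
                          src[1].x, src[1].y, 1.0,
                          src[2].x, src[2].y, 1.0);
  assert(std::fabs(det) > 1e-12);  // Source points must not be collinear.

  Affine m{};
  for (int row = 0; row < 2; ++row) {
    // Target values for this row: the x (row 0) or y (row 1) of dst.
    const double t0 = (row == 0) ? dst[0].x : dst[0].y;
    const double t1 = (row == 0) ? dst[1].x : dst[1].y;
    const double t2 = (row == 0) ? dst[2].x : dst[2].y;
    // Cramer's rule: replace one column at a time with the targets.
    m[row * 3 + 0] = det3(t0, src[0].y, 1.0,
                          t1, src[1].y, 1.0,
                          t2, src[2].y, 1.0) / det;
    m[row * 3 + 1] = det3(src[0].x, t0, 1.0,
                          src[1].x, t1, 1.0,
                          src[2].x, t2, 1.0) / det;
    m[row * 3 + 2] = det3(src[0].x, src[0].y, t0,
                          src[1].x, src[1].y, t1,
                          src[2].x, src[2].y, t2) / det;
  }
  return m;
}
```

In our case, src and dst would hold the left eye center, right eye center, and nose tip of the two faces, in the same order.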

Planning a face merging application


When ManyMasks opens, it will present a live camera view, a toolbar, and two small images of a masked face in the lower corners. Whenever the application detects a human face, it will draw the following shapes:

  • A yellow rectangle around the face region

  • A red rectangle around the left eye region

  • A red circle at the left eye's center or pupil

  • A green rectangle around the right eye region

  • A green circle at the right eye's center or pupil

  • A blue circle at the tip of the nose

Similarly, for a detected cat face, the application will draw the following shapes:

  • A white rectangle around the face region

  • A red circle at the left eye's center or pupil

  • A green circle at the right eye's center or pupil

  • A blue circle at the tip of the nose

Note

For our purposes, the left and right directions refer to the viewer's perspective, not the subject's perspective. The OpenCV developers, and most authors in computer vision, also follow this convention.

The following screenshot shows how the...

Configuring the project


Create an Xcode project named ManyMasks. Use the Single View Application template. Configure the project according to the instructions in Chapter 1, Setting Up Software and Hardware, and Chapter 2, Capturing, Storing, and Sharing Photos. (See the Configuring the project section of each chapter.) The ManyMasks project depends on the same frameworks and device capabilities as the LightWork project.

Our face detector will depend on several pretrained cascade files that come with OpenCV's source code. If you do not already have the source code, get it as described in Chapter 1, Setting Up Software and Hardware, in the Building an additional framework from source with extra modules section. Add copies of the following cascade files to the Supporting Files folder of the ManyMasks project:

  • <opencv_source_path>/data/haarcascades/haarcascade_frontalface_alt.xml. Alternatively, you may want to try <opencv_source_path>/data/lbpcascades/lbpcascade_frontalface.xml for...

Defining faces and a face detector


Let's define faces and a face detector in pure C++ code without using any dependencies except OpenCV. This ensures that the computer vision functionality of ManyMasks is portable. We could reuse the core of our code on a different platform with a different set of UI libraries.

A face has a species. For our purposes, this could be Human, Cat, or Hybrid. Let's create a header file, Species.h, and define the following enum in it:

#ifndef SPECIES_H
#define SPECIES_H

enum Species {
  Human,
  Cat,
  Hybrid
};

#endif // !SPECIES_H

A face also has a matrix of image data and three feature points representing the centers of the eyes and tip of the nose. We may construct a face in any of the following ways:

  • Specify a species, matrix, and feature points.

  • Create an empty face with default values, including an empty matrix.

  • Copy an existing face.

  • Merge two existing faces.

Let's create another header file, Face.h, and declare the following public interface of a Face class...

Defining and laying out the view controllers


ManyMasks divides its application logic between two view controllers. The first view controller enables the user to capture and preview real faces. The second enables the user to review, save, and share merged faces. A transition called a segue instantiates the second view controller, and a callback method associated with the segue enables the first view controller to pass a merged face to it.

Capturing and previewing real faces

Import copies of the VideoCamera.h and VideoCamera.m files that we created in Chapter 2, Capturing, Storing, and Sharing Photos. These files contain our VideoCamera class, which extends OpenCV's CvVideoCamera to fix bugs and add new functionality.

Rename ViewController.h and ViewController.m to CaptureViewController.h and CaptureViewController.m. Edit CaptureViewController.h so that it declares a CaptureViewController class, as seen in the following code:

#import <UIKit/UIKit.h>

@interface CaptureViewController : UIViewController

@end

CaptureViewController will have...

Detecting a hierarchy of face elements


As part of our face detection algorithm, we will reject cat faces that intersect with human faces. The reason is that the cat face cascade produces more false positives than the human face cascade. Thus, if a region is detected as both a human face and cat face, it is probably a human face in reality. To help us check for intersections between face rectangles, let's write a utility function, intersects. Declare the function in a new header file, GeomUtils.h, with the following code:

#ifndef GEOM_UTILS_H
#define GEOM_UTILS_H

#include <opencv2/core.hpp>

namespace GeomUtils {
  bool intersects(const cv::Rect &rect0, const cv::Rect &rect1);
}

#endif // !GEOM_UTILS_H

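Two rectangles certainly intersect if a corner of one rectangle lies inside the other rectangle. (This condition is sufficient but not strictly necessary: two rectangles that cross like a plus sign overlap even though neither contains a corner of the other.) Create another file, GeomUtils.cpp, with the following implementation of the intersects function: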

#include "GeomUtils.h"

bool GeomUtils::intersects(const cv::Rect ...
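
The overlap test and the rejection rule from this section can be sketched in a self-contained form as follows. This is a minimal sketch under stated assumptions, not necessarily the book's implementation: Rect is a hypothetical stand-in for cv::Rect so that the example compiles without OpenCV, and rejectOverlappingCats is a hypothetical helper name. The test compares the rectangles' extents along each axis, which also catches rectangles that cross without containing each other's corners.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for cv::Rect: top-left corner plus size.
struct Rect {
  int x, y, width, height;
};

// Two axis-aligned rectangles overlap if their extents overlap along
// both the x axis and the y axis. Rectangles that merely touch along
// an edge are not counted as intersecting.
bool intersects(const Rect &rect0, const Rect &rect1) {
  assert(rect0.width >= 0 && rect0.height >= 0 &&
         rect1.width >= 0 && rect1.height >= 0);
  return rect0.x < rect1.x + rect1.width &&
         rect1.x < rect0.x + rect0.width &&
         rect0.y < rect1.y + rect1.height &&
         rect1.y < rect0.y + rect0.height;
}

// Keeps only the cat face rectangles that do not intersect any human
// face rectangle, on the assumption that a region detected as both is
// more likely to be a human face.
std::vector<Rect> rejectOverlappingCats(const std::vector<Rect> &humanFaces,
                                        const std::vector<Rect> &catFaces) {
  std::vector<Rect> kept;
  for (const Rect &cat : catFaces) {
    bool overlapsHuman = false;
    for (const Rect &human : humanFaces) {
      if (intersects(cat, human)) {
        overlapsHuman = true;
        break;
      }
    }
    if (!overlapsHuman) {
      kept.push_back(cat);
    }
  }
  return kept;
}
```

With one human face at (10, 10, 100, 100) and cat candidates at (50, 50, 80, 80) and (300, 300, 60, 60), only the second cat candidate survives.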

Aligning and blending face elements


The rest of our app's functionality is in the implementation of the Face class. Create a new file, Face.cpp. Remember that Face has a species, matrix of image data, and coordinates for the centers of the eyes and tip of the nose. Also remember that we designed Face as an immutable type, and for this reason the constructor copies a given matrix rather than storing a reference to external data. At the start of Face.cpp, let's implement the constructor that takes a species, matrix, and feature points as arguments:

#include <opencv2/imgproc.hpp>

#include "Face.h"

Face::Face(Species species, const cv::Mat &mat,
    const cv::Point2f &leftEyeCenter,
    const cv::Point2f &rightEyeCenter, const cv::Point2f &noseTip)
: species(species)
, leftEyeCenter(leftEyeCenter)
, rightEyeCenter(rightEyeCenter)
, noseTip(noseTip)
{
  mat.copyTo(this->mat);
}

Face also has the following default constructor for an empty face:

Face::Face() {
}

The following...

Using the application and acting like a cat


Build ManyMasks and run it on an iOS device. For best results, obey the following guidelines:

  • Work in an area with bright lighting and no shadows.

  • Fill most of the frame with the face so that the image's resolution is not wasted on background areas.

  • Capture an upright image of the face. This is especially important for a cat because our algorithm does not address the problem of locating the eyes and nose in a tilted cat face. To entice a cat to look straight at the camera, you might need to use a toy or treat.

  • Capture a similar expression on the two faces. Like humans, cats have expressive faces, and different cats may develop different expressions as a form of communication with their humans. Here are some examples of my cats' expressions:

    • Wide eyes: Alert or assertive

    • Narrow eyes: Relaxed or submissive

    • Yawn: "Hello, my human."

    • Meow: "Pay attention, my human."

    • Tongue between lips: Paying attention to a scent, possibly a pleasant scent such as "my...

Learning more about face analysis


Although our model of a face is a good start, we could make it much more sophisticated. We could model many feature points in order to accurately represent the differences between expressions, such as happiness and sadness. We could consider the third dimension and the camera's perspective. We could identify specific humans and specific cats based on the details of the face or even just the eye. We could train cascades for other species besides humans and cats.

Packt Publishing offers several more advanced OpenCV books with fascinating projects about face analysis. You can consider the following titles:

  • OpenCV 3 Blueprints offers chapters on facial expression recognition, cascade training, and biometric identification of human faces, eyes, and fingerprints. The code is in C++.

  • OpenCV for Secret Agents has a chapter on cascade training and biometric identification of human and cat faces. The code is in Python.

  • Mastering OpenCV with Practical Computer Vision...

Summary


This chapter has been a big step forward for us because we have focused on developing a modular and artificially intelligent solution. Unlike our previous apps, ManyMasks has multiple view controllers connected by a segue, as well as pure C++ classes dedicated to computer vision. It is truly a smart application because it can classify things in its environment and perform computations based on their geometry. The next chapter will explore other smart approaches to problems of classification and geometry.
