
You're reading from OpenCV for Secret Agents

Product type: Book
Published in: Jan 2015
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781783287376
Edition: 1st Edition
Author: Joseph Howse

Joseph Howse lives in a Canadian fishing village, where he chats with his cats, crafts his books, and nurtures an orchard of hardy fruit trees. He is President of Nummist Media Corporation, which exists to support his books and to provide mentoring and consulting services, with a specialty in computer vision. On average, in 2015-2022, Joseph has written 1.4 new books or new editions per year for Packt. He also writes fiction, including an upcoming novel about the lives of a group of young people in the last days of the Soviet Union.

Chapter 3. Training a Smart Alarm to Recognize the Villain and His Cat

 

The naming of cats is a difficult matter.

 
 -- T. S. Eliot, Old Possum's Book of Practical Cats (1939)
 

Blofeld: I've taught you to love chickens, to love their flesh, their voice.

 
 -- On Her Majesty's Secret Service (1969)

If you saw Ernst Stavro Blofeld, would you recognize him?

Let me remind you that Blofeld, as the Number 1 man in SPECTRE (SPecial Executive for Counterintelligence, Terrorism, Revenge, and Extortion), is a supervillain who eludes James Bond countless times before being written out of the movies due to an intellectual property dispute. Blofeld last appears as an anonymous character in the introductory sequence of the movie For Your Eyes Only (1981), where we see him fall from a helicopter and down a factory's smokestack as he shouts, "Mr. Boooooooooond!"

Despite this dramatic exit, the evidence of Blofeld's death is unclear. After all, Blofeld is a notoriously difficult man to recognize. His face is seldom...

Understanding machine learning in general


Our work throughout this chapter builds on the techniques of machine learning, meaning that the software makes predictions or decisions based on statistical models. Our approach is one of supervised learning, meaning that we (programmers and users) will provide the software with examples of data and correct responses. The software creates the statistical model to extrapolate from these examples. The human-provided examples are referred to as reference data or training data (or reference images or training images in the context of computer vision). Conversely, the software's extrapolations pertain to test data (or test images or scenes in the context of computer vision).

Supervised learning is much like the "flashcard" pedagogy used in early childhood education. The teacher shows the child a series of pictures (training images) and says, "This is a cow. Moo! This is a horse. Neigh!"

Then, on a field trip to a farm (a scene), the child can hopefully...
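To make the distinction between training data and test data concrete, here is a minimal sketch of supervised learning in plain Python. It is not the technique used later in this chapter. It assumes a toy nearest-neighbor classifier, and the feature values (height and weight) and the names `train` and `predict` are invented for illustration.

```python
def train(examples):
    """Build a 'model' from (features, label) training pairs.

    For a nearest-neighbor classifier, the model is simply the stored
    examples. (This is an illustrative assumption, not the book's method.)
    """
    return list(examples)


def squared_distance(a, b):
    """Return the squared Euclidean distance between two feature tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def predict(model, features):
    """Return the label of the training example nearest to the features."""
    nearest_features, nearest_label = min(
        model, key=lambda example: squared_distance(example[0], features))
    return nearest_label


# Training data: hypothetical (height in m, weight in kg) pairs, each with
# a correct response supplied by a human.
training_data = [
    ((1.4, 600.0), 'cow'),
    ((1.5, 750.0), 'cow'),
    ((1.6, 450.0), 'horse'),
    ((1.7, 500.0), 'horse'),
]
model = train(training_data)

# Test data: an animal that the model has not seen before.
print(predict(model, (1.65, 480.0)))  # prints horse
```

The model extrapolates from the labeled examples to an unlabeled one, just as the child extrapolates from flashcards to the animals on the farm.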

Planning the Interactive Recognizer app


Let's begin this project with the middle layer, the Interactive Recognizer app, in order to see how all layers connect. Like Luxocator (the previous chapter's project), Interactive Recognizer is a GUI app built with wxPython. Refer to the following screenshot, which features one of my colleagues, Chief Science Officer Sanibel "San" Delphinium Andromeda, Oracle of the Numm:

The app uses a face detection model, which is loaded from disk, and it maintains a face recognition model, which is saved to or loaded from disk. The user may specify the identity of any detected face, and this input is added to the face recognition model. A detection result is shown by outlining the face in the video feed, while a recognition result is shown by displaying the name of the face in the text below. To elaborate, the app has the following flow of execution:

  1. The app loads a face detection model from a file. The role of the detection model is to distinguish...

Understanding Haar cascades and LBPH


 

Cookie Monster: Hey, you know what? A round cookie with one bite out of it looks like a "C". A round donut with one bite out of it also looks like a "C" but it is not as good as a cookie. Oh, and the moon sometimes looks like a "C" but you can't eat that.

 
 -- "C is for Cookie", Sesame Street

Think about cloud watching. If you lie on the ground and look up at the clouds, maybe you imagine that one cloud is shaped like a mound of mashed potatoes on a plate. If you board an airplane and fly to this cloud, you will still see some resemblance between the cloud's surface and the fluffy, lumpy texture of hearty mashed potatoes. However, if you could slice off a piece of cloud and examine it under a microscope, you might see ice crystals that do not resemble the microscopic structure of mashed potatoes at all.

Similarly, in an image made up of pixels, a person or a computer vision algorithm can see many distinctive shapes or patterns, partly depending on the...

Implementing the Interactive Recognizer app


Let's create a new folder, where we will store this chapter's project, including the following subfolders and files that are relevant to the Interactive Recognizer app:

  • cascades/haarcascade_frontalface_alt.xml: This is a detection model for a frontal, human face. It should be included with OpenCV at a path such as <opencv_unzip_destination>/data/haarcascades/haarcascade_frontalface_alt.xml, or for a MacPorts installation at /opt/local/share/OpenCV/haarcascades/haarcascade_frontalface_alt.xml. Create a copy of it or a link to it. (Alternatively, you can get it from this chapter's code bundle.)

  • cascades/lbpcascade_frontalface.xml: This is an alternative (faster but less reliable) detection model for a frontal, human face. It should be included with OpenCV at a path such as <opencv_unzip_destination>/data/lbpcascades/lbpcascade_frontalface.xml, or for a MacPorts installation at /opt/local/share/OpenCV/lbpcascades/lbpcascade_frontalface...

Planning the cat detection model


When I said soon, I meant in a day or two. Training a Haar cascade takes a lot of processing time. Training an LBP cascade is relatively quick. However, in either case, we need to download some big collections of images before we even start. Settle down with a reliable Internet connection, a power outlet, at least 4 GB of free disk space, and the fastest CPU and most RAM you can find. Do not attempt this segment of the project on a Raspberry Pi. Keep the computer away from external heat sources or things that might block its fans. My processing time for Haar cascade training was 24 hours (or more for the whisker-friendly version that is sensitive to diagonal patterns), with 100 percent usage on four cores, on a MacBook Pro with a 2.6 GHz Intel Core i7 CPU and 16 GB of RAM.

We will use the following sets of images, which are freely available for research purposes:

  • The PASCAL Visual Object Classes Challenge 2007 (VOC2007) dataset. VOC2007 contains 10,000 images...

Implementing the training script for the cat detection model


 

Praline: I've never seen so many aerials in me life. The man told me, their equipment could pinpoint a purr at 400 yards and Eric, being such a happy cat, was a piece of cake.

 
 -- The Fish License sketch, Monty Python's Flying Circus, Episode 23 (1970)

This segment of the project uses tens of thousands of files including images, annotation files, scripts, and intermediate and final output of the training process. Let's organize all of this new material by giving our project a subfolder, cascade_training, which will ultimately have the following contents:

Planning the Angora Blue app


Angora Blue reuses the same detection and recognition models that we created earlier. It is a relatively linear and simple app because it has no GUI and does not modify any models. It just loads the detection and recognition models from file and then silently runs a camera until a face is recognized with a certain level of confidence. After recognizing a face, the app sends an e-mail alert and exits. To elaborate, we can say the app has the following flow of execution:

  1. Load face detection and face recognition models from file for both human and feline subjects.

  2. Capture live video from a camera. For each frame of video:

    1. Detect all human faces in the frame. Perform recognition on each human face. If a face is recognized with a certain level of confidence, send an e-mail alert and exit the app.

    2. Detect all cat faces in the frame. Discard any cat faces that intersect with human faces. (We assume that such cat faces are false positives, since our cat detector sometimes...

Implementing the Angora Blue app


The Angora Blue app uses three new files: GeomUtils.py, MailUtils.py, and AngoraBlue.py, which should all be in our project's top folder. Given the app's dependencies on our previous work, the following files are relevant to Angora Blue:

  • cascades/haarcascade_frontalface_alt.xml

  • cascades/haarcascade_frontalcatface.xml

  • recognizers/lbph_human_faces.xml

  • recognizers/lbph_cat_faces.xml

  • ResizeUtils.py: This contains the utility functions to resize images, including camera capture dimensions

  • GeomUtils.py: This consists of the utility functions used to perform geometric operations

  • MailUtils.py: This provides the utility functions used to send e-mails

  • AngoraBlue.py: This is the application that sends an e-mail alert when a person or cat is recognized

First, let's create GeomUtils.py. This does not need any import statements. Let's add the following intersects function, which accepts two rectangles as arguments and returns either True (if they intersect) or False...
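A minimal sketch of such a function follows. It assumes that each rectangle is a tuple `(x, y, w, h)` with the origin at the top-left corner, which is the layout of the rectangles returned by OpenCV's `detectMultiScale`. It also assumes that rectangles that merely touch along an edge do not count as intersecting. The chapter's actual implementation may differ. The helper `discard_overlapping` is hypothetical and only illustrates how the app's flow could discard cat faces that intersect human faces.

```python
def intersects(rect0, rect1):
    """Return True if two (x, y, w, h) rectangles overlap, else False."""
    x0, y0, w0, h0 = rect0
    x1, y1, w1, h1 = rect1
    # The rectangles overlap only if their extents overlap on both axes.
    overlap_x = x0 < x1 + w1 and x1 < x0 + w0
    overlap_y = y0 < y1 + h1 and y1 < y0 + h0
    return overlap_x and overlap_y


def discard_overlapping(candidates, others):
    """Keep only the candidates that intersect none of the other rectangles.

    Hypothetical helper, not part of the chapter's code.
    """
    return [candidate for candidate in candidates
            if not any(intersects(candidate, other) for other in others)]


# Example rectangles (made-up values): one human face and two cat faces,
# the first of which overlaps the human face.
human_faces = [(100, 100, 80, 80)]
cat_faces = [(150, 150, 60, 60), (300, 200, 50, 50)]
print(discard_overlapping(cat_faces, human_faces))  # prints [(300, 200, 50, 50)]
```

Testing each axis separately keeps the function free of imports, in keeping with the note above that GeomUtils.py needs none.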

Building Angora Blue for distribution


We can use PyInstaller to bundle Angora Blue along with the detection and recognition models for distribution. Since the build scripts should be quite similar to the ones we used for Luxocator (the previous chapter's project), we will not discuss their implementation here. However, they are included in this chapter's code bundle.

Further fun with finding felines


Kittydar (short for "kitty radar"), by Heather Arthur, is an open source JavaScript library used to detect upright frontal cat faces. You can find its demo application at http://harthur.github.io/kittydar/ and its source code at https://github.com/harthur/kittydar.

Another detector for upright frontal cat faces was developed by Microsoft Research using the Microsoft Cat Dataset 2008. The detector is described in the following research paper but no demo application or source code has been released:

  • Weiwei Zhang, Jian Sun, and Xiaoou Tang. "Cat Head Detection - How to Effectively Exploit Shape and Texture Features", Proc. of European Conference on Computer Vision, vol. 4, pp. 802-816, 2008.

If you know of other works on cat detectors, recognizers, or datasets, please write to me to tell me about them!

Summary


Like the previous chapter, this chapter has dealt with classification tasks, as well as interfaces among OpenCV, a source of images, and a GUI. This time, our classification labels have more objective meanings (a species or an individual's identity), so the classifier's success or failure is more obvious. To meet the challenge, we used much bigger sets of training images, we preprocessed the training images for greater consistency, and we applied two tried-and-true classification techniques in sequence (either Haar cascades or LBP cascades for detection, and then LBPH for recognition).

The methodology presented in this chapter, as well as the entire Interactive Recognizer app and some of the other code, generalizes well to other original work in detection and recognition. With the right training images, you can detect and recognize many more animals in many poses. You can even detect an object such as a car and recognize the Batmobile!

For our next project, we turn our attention...
