You're reading from OpenCV for Secret Agents

Product type: Book

Published in: Jan 2015

Reading level: Intermediate

Publisher: Packt

ISBN-13: 9781783287376

Edition: 1st Edition

Languages

C++

Tools

OpenCV Android

Concepts

Computer Vision

Author (1)

Joseph Howse

Chapter 3. Training a Smart Alarm to Recognize the Villain and His Cat

	The naming of cats is a difficult matter.
	-- T. S. Eliot, Old Possum's Book of Practical Cats (1939)

	Blofeld: I've taught you to love chickens, to love their flesh, their voice.
	-- On Her Majesty's Secret Service (1969)

If you saw Ernst Stavro Blofeld, would you recognize him?

Let me remind you that Blofeld, as the Number 1 man in SPECTRE (SPecial Executive for Counterintelligence, Terrorism, Revenge, and Extortion), is a supervillain who eludes James Bond countless times before being written out of the movies due to an intellectual property dispute. Blofeld last appears as an anonymous character in the introductory sequence of the movie For Your Eyes Only (1981), where we see him fall from a helicopter and down a factory's smokestack as he shouts, "Mr. Boooooooooond!"

Despite this dramatic exit, the evidence of Blofeld's death is unclear. After all, Blofeld is a notoriously difficult man to recognize. His face is seldom...

Understanding machine learning in general

Our work throughout this chapter builds on the techniques of machine learning, meaning that the software makes predictions or decisions based on statistical models. Our approach is one of supervised learning, meaning that we (programmers and users) will provide the software with examples of data and correct responses. The software creates the statistical model to extrapolate from these examples. The human-provided examples are referred to as reference data or training data (or reference images or training images in the context of computer vision). Conversely, the software's extrapolations pertain to test data (or test images or scenes in the context of computer vision).
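To make the split between training data and test data concrete, here is a minimal sketch of supervised learning in plain Python. It uses a nearest-neighbor rule, which is a stand-in chosen for brevity and not the model used in this chapter. The feature vectors, the labels, and the function names `train` and `predict` are all made up for illustration.

```python
import math

def train(examples):
    """For a nearest-neighbor model, training just stores the labeled examples."""
    return list(examples)

def predict(model, sample):
    """Return the label of the stored example that is closest to the sample."""
    _, label = min(model, key=lambda example: math.dist(example[0], sample))
    return label

# Training data: (feature vector, correct response) pairs provided by a human.
training_data = [
    ((1.0, 1.0), "cow"),
    ((1.2, 0.8), "cow"),
    ((5.0, 5.0), "horse"),
    ((4.8, 5.3), "horse"),
]
model = train(training_data)

# Test data: samples that the model has not seen before.
print(predict(model, (0.9, 1.1)))  # cow
print(predict(model, (5.1, 4.9)))  # horse
```

The model never sees a rule such as "cows have small feature values". It only extrapolates from the labeled examples, which is why the quality and variety of the training data matter so much.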

Supervised learning is much like the "flashcard" pedagogy used in early childhood education. The teacher shows the child a series of pictures (training images) and says, "This is a cow. Moo! This is a horse. Neigh!"

Then, on a field trip to a farm (a scene), the child can hopefully...

Planning the Interactive Recognizer app

Let's begin this project with the middle layer, the Interactive Recognizer app, in order to see how all layers connect. Like Luxocator (the previous chapter's project), Interactive Recognizer is a GUI app built with wxPython. Refer to the following screenshot, which features one of my colleagues, Chief Science Officer Sanibel "San" Delphinium Andromeda, Oracle of the Numm:

The app uses a face detection model, which is loaded from disk, and it maintains a face recognition model, which is saved to or loaded from disk. The user can specify the identity of any detected face, and this input is added to the face recognition model. A detection result is shown by outlining the face in the video feed, while a recognition result is shown by displaying the name of the face in the text below. To elaborate, we can say that the app has the following flow of execution:

The app loads a face detection model from a file. The role of the detection model is to distinguish...

Understanding Haar cascades and LBPH

	Cookie Monster: Hey, you know what? A round cookie with one bite out of it looks like a "C". A round donut with one bite out of it also looks like a "C" but it is not as good as a cookie. Oh, and the moon sometimes looks like a "C" but you can't eat that.
	-- "C is for Cookie", Sesame Street

Think about cloud watching. If you lie on the ground and look up at the clouds, maybe you imagine that one cloud is shaped like a mound of mashed potatoes on a plate. If you board an airplane and fly to this cloud, you will still see some resemblance between the cloud's surface and the fluffy, lumpy texture of hearty mashed potatoes. However, if you could slice off a piece of cloud and examine it under a microscope, you might see ice crystals that do not resemble the microscopic structure of mashed potatoes at all.

Similarly, in an image made up of pixels, a person or a computer vision algorithm can see many distinctive shapes or patterns, partly depending on the...
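As a rough illustration of a pixel-level pattern, the following sketch computes local binary pattern (LBP) codes, the building block behind the "LBP" in LBPH, using plain Python lists. It is a simplified sketch under stated assumptions: it uses a fixed 3x3 neighborhood and builds a single histogram for the whole image, whereas LBPH implementations typically compute one histogram per cell in a grid and concatenate them. The function names and the tiny sample image are hypothetical.

```python
def lbp_code(patch):
    """Compute the 8-bit LBP code for the center pixel of a 3x3 patch."""
    center = patch[1][1]
    # Visit the 8 neighbors clockwise, starting at the top-left corner.
    neighbors = [
        patch[0][0], patch[0][1], patch[0][2],
        patch[1][2],
        patch[2][2], patch[2][1], patch[2][0],
        patch[1][0],
    ]
    code = 0
    for value in neighbors:
        # Each neighbor contributes a 1 bit if it is at least as bright
        # as the center pixel, or a 0 bit otherwise.
        code = (code << 1) | (1 if value >= center else 0)
    return code

def lbp_histogram(image):
    """Count the LBP codes of all interior pixels in a grayscale image."""
    histogram = [0] * 256
    for y in range(1, len(image) - 1):
        for x in range(1, len(image[0]) - 1):
            patch = [row[x - 1:x + 2] for row in image[y - 1:y + 2]]
            histogram[lbp_code(patch)] += 1
    return histogram

# A made-up 4x4 grayscale image: a bright square on a dark background.
image = [
    [10, 10, 10, 10],
    [10, 50, 50, 10],
    [10, 50, 50, 10],
    [10, 10, 10, 10],
]
histogram = lbp_histogram(image)
print({code: count for code, count in enumerate(histogram) if count})
```

Because each code depends only on whether neighbors are brighter or darker than the center, the histogram describes local texture and is relatively insensitive to overall brightness.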

Implementing the Interactive Recognizer app

Let's create a new folder, where we will store this chapter's project, including the following subfolders and files that are relevant to the Interactive Recognizer app:

cascades/haarcascade_frontalface_alt.xml: This is a detection model for a frontal, human face. It should be included with OpenCV at a path such as <opencv_unzip_destination>/data/haarcascades/haarcascade_frontalface_alt.xml, or for a MacPorts installation at /opt/local/share/OpenCV/haarcascades/haarcascade_frontalface_alt.xml. Create a copy of it or a link to it. (Alternatively, you can get it from this chapter's code bundle.)
cascades/lbpcascade_frontalface.xml: This is an alternative (faster but less reliable) detection model for a frontal, human face. It should be included with OpenCV at a path such as <opencv_unzip_destination>/data/lbpcascades/lbpcascade_frontalface.xml, or for a MacPorts installation at /opt/local/share/OpenCV/lbpcascades/lbpcascade_frontalface...

Planning the cat detection model

When I said soon, I meant in a day or two. Training a Haar cascade takes a lot of processing time. Training an LBP cascade is relatively quick. However, in either case, we need to download some big collections of images before we even start. Settle down with a reliable Internet connection, a power outlet, at least 4 GB of free disk space, and the fastest CPU and most RAM you can find. Do not attempt this segment of the project on a Raspberry Pi. Keep the computer away from external heat sources or things that might block its fans. My processing time for Haar cascade training was 24 hours (or more for the whisker-friendly version that is sensitive to diagonal patterns), with 100 percent usage on four cores, on a MacBook Pro with a 2.6 GHz Intel Core i7 CPU and 16 GB RAM.

We will use the following sets of images, which are freely available for research purposes:

The PASCAL Visual Object Classes Challenge 2007 (VOC2007) dataset. VOC2007 contains 10,000 images...

Implementing the training script for the cat detection model

	Praline: I've never seen so many aerials in me life. The man told me, their equipment could pinpoint a purr at 400 yards and Eric, being such a happy cat, was a piece of cake.
	-- The Fish License sketch, Monty Python's Flying Circus, Episode 23 (1970)

This segment of the project uses tens of thousands of files including images, annotation files, scripts, and intermediate and final output of the training process. Let's organize all of this new material by giving our project a subfolder, cascade_training, which will ultimately have the following contents:

cascade_training/CAT_DATASET_01: The first half of the Microsoft Cat Dataset 2008. Download it from http://137.189.35.203/WebUI/CatDatabase/Data/CAT_DATASET_01.zip and unzip it.
cascade_training/CAT_DATASET_02: The second half of the Microsoft Cat Dataset 2008. Download it from http://137.189.35.203/WebUI/CatDatabase/Data/CAT_DATASET_02.zip and unzip it.
cascade_training/faces...
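If you prefer to script the download-and-unzip steps, the following is a minimal sketch that uses only the Python standard library. The helper names `download_and_unzip` and `download_cat_datasets` are hypothetical, the URLs are the ones listed above (the server might no longer be reachable), and whether each archive unpacks into its own top-level folder is an assumption you should check after extraction.

```python
import shutil
import urllib.request
import zipfile
from pathlib import Path

def download_and_unzip(url, dest_dir):
    """Download a zip archive from url and extract it into dest_dir."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Name the local archive after the last component of the URL.
    archive_path = dest_dir / url.rsplit("/", 1)[-1]
    with urllib.request.urlopen(url) as response:
        with open(archive_path, "wb") as archive_file:
            # Stream to disk so that a big dataset is not held in memory.
            shutil.copyfileobj(response, archive_file)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(dest_dir)
    return archive_path

def download_cat_datasets(dest_dir="cascade_training"):
    """Fetch both halves of the cat dataset (not called automatically)."""
    base_url = "http://137.189.35.203/WebUI/CatDatabase/Data/"
    for name in ("CAT_DATASET_01.zip", "CAT_DATASET_02.zip"):
        download_and_unzip(base_url + name, dest_dir)
```

Calling `download_cat_datasets()` from the project's top folder would start both downloads. Nothing is downloaded until you call it.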

Planning the Angora Blue app

Angora Blue reuses the same detection and recognition models that we created earlier. It is a relatively linear and simple app because it has no GUI and does not modify any models. It just loads the detection and recognition models from file and then silently runs a camera until a face is recognized with a certain level of confidence. After recognizing a face, the app sends an e-mail alert and exits. To elaborate, we can say the app has the following flow of execution:

Load face detection and face recognition models from file for both human and feline subjects.
Capture live video from a camera. For each frame of video:
1. Detect all human faces in the frame. Perform recognition on each human face. If a face is recognized with a certain level of confidence, send an e-mail alert and exit the app.
2. Detect all cat faces in the frame. Discard any cat faces that intersect with human faces. (We assume that such cat faces are false positives, since our cat detector sometimes...
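The human-face part of this flow (step 1) can be sketched as a simple loop. This is only an outline under stated assumptions: `detect_faces`, `recognize`, and `send_alert` are hypothetical callables that stand in for the real detection model, recognition model, and e-mail code, and confidence is modeled as a distance where a lower value means a better match. The cat-face step is omitted here.

```python
def watch_for_humans(frames, detect_faces, recognize, send_alert, max_distance):
    """Scan frames until a face is recognized with enough confidence.

    Returns the recognized name, or None if the frames run out first.
    """
    for frame in frames:
        for face in detect_faces(frame):
            name, distance = recognize(frame, face)
            if distance <= max_distance:
                send_alert(name)
                return name  # Exit after the first confident recognition.
    return None

# Stand-in data and callables so that the sketch runs without a camera.
frames = ["frame0", "frame1", "frame2"]
faces_by_frame = {
    "frame0": [],
    "frame1": [(10, 10, 40, 40)],
    "frame2": [(12, 11, 40, 40)],
}
matches_by_frame = {
    "frame1": ("Blofeld", 95.0),  # A weak match, which is ignored.
    "frame2": ("Blofeld", 30.0),  # A strong match, which triggers the alert.
}
alerts = []

result = watch_for_humans(
    frames,
    detect_faces=lambda frame: faces_by_frame[frame],
    recognize=lambda frame, face: matches_by_frame[frame],
    send_alert=alerts.append,
    max_distance=50.0,
)
print(result, alerts)
```

Passing the detector, recognizer, and alert function as arguments keeps the loop itself free of camera and network dependencies, so it is easy to try out with fake data like this.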

Implementing the Angora Blue app

The Angora Blue app uses three new files: GeomUtils.py, MailUtils.py, and AngoraBlue.py, which should all be in our project's top folder. Given the app's dependencies on our previous work, the following files are relevant to Angora Blue:

cascades/haarcascade_frontalface_alt.xml
cascades/haarcascade_frontalcatface.xml
recognizers/lbph_human_faces.xml
recognizers/lbph_cat_faces.xml
ResizeUtils.py: This contains the utility functions to resize images, including camera capture dimensions
GeomUtils.py: This consists of the utility functions used to perform geometric operations
MailUtils.py: This provides the utility functions used to send e-mails
AngoraBlue.py: This is the application that sends an e-mail alert when a person or cat is recognized

First, let's create GeomUtils.py. This does not need any import statements. Let's add the following intersects function, which accepts two rectangles as arguments and returns either True (if they intersect) or False...
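As a minimal sketch of such a function, the following version assumes that each rectangle is an `(x, y, w, h)` tuple (the layout OpenCV uses for detection results) and that rectangles which merely touch along an edge do not count as intersecting. Both points are assumptions, and the code in the book might differ. The helper `discard_intersecting` is a hypothetical addition that shows how the function could be used to drop cat faces that overlap human faces.

```python
def intersects(rect0, rect1):
    """Return True if two (x, y, w, h) rectangles overlap, else False."""
    x0, y0, w0, h0 = rect0
    x1, y1, w1, h1 = rect1
    # The rectangles are disjoint if one is wholly to the side of the other.
    if x0 >= x1 + w1 or x1 >= x0 + w0:
        return False
    # Likewise, they are disjoint if one is wholly above the other.
    if y0 >= y1 + h1 or y1 >= y0 + h0:
        return False
    return True

def discard_intersecting(cat_faces, human_faces):
    """Keep only the cat faces that do not overlap any human face."""
    return [
        cat_face for cat_face in cat_faces
        if not any(intersects(cat_face, human_face)
                   for human_face in human_faces)
    ]

human_faces = [(5, 5, 10, 10)]
cat_faces = [(0, 0, 10, 10), (50, 50, 10, 10)]
print(discard_intersecting(cat_faces, human_faces))  # [(50, 50, 10, 10)]
```

The first cat rectangle overlaps the human rectangle, so it is treated as a false positive and dropped, while the second one is far away and kept.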

Building Angora Blue for distribution

We can use PyInstaller to bundle Angora Blue along with the detection and recognition models for distribution. Since the build scripts should be quite similar to the ones we used for Luxocator (the previous chapter's project), we will not discuss their implementation here. However, they are included in this chapter's code bundle.

Further fun with finding felines

Kittydar (short for "kitty radar"), by Heather Arthur, is an open source, JavaScript library used to detect upright frontal cat faces. You can find its demo application at http://harthur.github.io/kittydar/ and its source code at https://github.com/harthur/kittydar.

Another detector for upright frontal cat faces was developed by Microsoft Research using the Microsoft Cat Dataset 2008. The detector is described in the following research paper but no demo application or source code has been released:

Weiwei Zhang, Jian Sun, and Xiaoou Tang. "Cat Head Detection - How to Effectively Exploit Shape and Texture Features", Proc. of European Conference Computer Vision, vol. 4, pp. 802-816, 2008.

If you know of other works on cat detectors, recognizers, or datasets, please write to me to tell me about them!

Summary

Like the previous chapter, this chapter has dealt with classification tasks, as well as interfaces among OpenCV, a source of images, and a GUI. This time, our classification labels have more objective meanings (a species or an individual's identity), so the classifier's success or failure is more obvious. To meet the challenge, we used much bigger sets of training images, we preprocessed the training images for greater consistency, and we applied two tried-and-true classification techniques in sequence (either Haar cascades or LBP cascades for detection and then LBPH for recognition).

The methodology presented in this chapter, along with the entire Interactive Recognizer app and some of the other code, generalizes well to other original work in detection and recognition. With the right training images, you can detect and recognize many more animals in many poses. You can even detect an object such as a car and recognize the Batmobile!

For our next project, we turn our attention...

Author (1)

Joseph Howse

Joseph Howse lives in a Canadian fishing village, where he chats with his cats, crafts his books, and nurtures an orchard of hardy fruit trees. He is President of Nummist Media Corporation, which exists to support his books and to provide mentoring and consulting services, with a specialty in computer vision. On average, in 2015-2022, Joseph has written 1.4 new books or new editions per year for Packt. He also writes fiction, including an upcoming novel about the lives of a group of young people in the last days of the Soviet Union.