OpenCV for Secret Agents

Product type: Book
Published in: Jan 2015
Publisher: Packt
ISBN-13: 9781783287376
Pages: 302
Edition: 1st
Author: Joseph Howse

Table of Contents (15 Chapters)

OpenCV for Secret Agents
Credits
About the Author
Acknowledgments
About the Reviewers
www.PacktPub.com
Preface
Preparing for the Mission
Searching for Luxury Accommodations Worldwide
Training a Smart Alarm to Recognize the Villain and His Cat
Controlling a Phone App with Your Suave Gestures
Equipping Your Car with a Rearview Camera and Hazard Detection
Seeing a Heartbeat with a Motion Amplifying Camera
Creating a Physics Simulation Based on a Pen and Paper Sketch
Index

Chapter 6. Seeing a Heartbeat with a Motion Amplifying Camera

 

Remove everything that has no relevance to the story. If you say in the first chapter that there is a rifle hanging on the wall, in the second or third chapter it absolutely must go off. If it's not going to be fired, it shouldn't be hanging there.

 --Anton Chekhov

King Julian: I don't know why the sacrifice didn't work. The science seemed so solid.

 --Madagascar: Escape 2 Africa (2008)

Despite their strange design and mysterious engineering, Q's gadgets always prove useful and reliable. Bond has such faith in the technology that he never even asks how to charge the batteries.

One of the inventive ideas in the Bond franchise is that even a lightly equipped spy should be able to see and photograph concealed objects, anyplace, anytime. Let's consider a timeline of a few relevant gadgets in the movies:

  • 1967 (You Only Live Twice): An X-ray desk scans guests for hidden firearms.

  • 1979 (Moonraker): A cigarette case contains an X-ray...

Planning the Lazy Eyes app


Of all our apps, Lazy Eyes has the simplest user interface. It just shows a live video feed with a special effect that highlights motion. The effect's parameters are quite complex and, moreover, modifying them at runtime would seriously degrade performance. Thus, we do not provide a user interface to reconfigure the effect, but we do expose many parameters in code to let a programmer create variants of the effect and the app.

The following is a screenshot illustrating one configuration of the app. This image shows me eating cake. My hands and face are moving often and we see an effect that looks like light and dark waves rippling around the places where moving edges have been. (The effect is more graceful in a live video than in a screenshot.)

Note

For more screenshots and an in-depth discussion of the parameters, refer to the section Configuring and testing the app for various motions, later in this chapter.

Regardless of how it is configured, the...

Understanding what Eulerian video magnification can do


Eulerian video magnification is inspired by a model in fluid mechanics called Eulerian specification of the flow field. Let's consider a moving, fluid body, such as a river. The Eulerian specification describes the river's velocity at a given position and time. The velocity would be fast in the mountains in springtime and slow at the river's mouth in winter. Also, the velocity would be slower at a silt-saturated point at the river's bottom, compared to a point where the river's surface hits a rock and sprays. An alternative to the Eulerian specification is the Lagrangian specification, which describes the position of a given particle at a given time. A given bit of silt might make its way down from the mountains to the river's mouth over a period of many years and then spend eons drifting around a tidal basin.

Note

For a more formal description of the Eulerian specification, the Lagrangian specification, and their relationship, refer to...

Extracting repeating signals from video using the Fast Fourier Transform (FFT)


An audio signal is typically visualized as a bar chart or wave. The bar chart or wave is high when the sound is loud and low when it is soft. We recognize that a repetitive sound, such as a metronome's beat, makes repetitive peaks and valleys in the visualization. When audio has multiple channels (being a stereo or surround sound recording), each channel can be considered as a separate signal and can be visualized as a separate bar chart or wave.

Similarly, in a video, every channel of every pixel can be considered as a separate signal, rising and falling (becoming brighter and dimmer) over time. Imagine that we use a stationary camera to capture a video of a metronome. Then, certain pixel values rise and fall at a regular interval as they capture the passage of the metronome's needle. If the camera has an attached microphone, its signal values rise and fall at the same interval. Based on either the audio or the...
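To make this concrete, here is a small sketch of my own (not from the book) that uses SciPy's FFT functions, the same family of signal processing calls the chapter imports, to recover the repetition rate of a single synthetic pixel-channel signal; the 30 fps frame rate and 2 Hz "metronome" are assumptions for the example:

```python
import numpy
from scipy.fftpack import fft, fftfreq

# One pixel channel's signal: 10 seconds of samples at an assumed 30 fps,
# oscillating at 2 Hz as the metronome's needle passes.
fps = 30.0
t = numpy.arange(300) / fps
signal = 128 + 50 * numpy.sin(2 * numpy.pi * 2.0 * t)

# Transform to the frequency domain and label each bin with its frequency.
spectrum = fft(signal)
freqs = fftfreq(len(signal), d=1.0/fps)

# The largest non-DC peak sits at the metronome's frequency.
peak = abs(freqs[numpy.argmax(abs(spectrum[1:])) + 1])
print(round(peak, 6))  # -> 2.0
```

Filtering this spectrum (zeroing unwanted bins) and applying the inverse FFT is what later yields a video of only the selected motion frequencies.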

Compositing two images using image pyramids


Running an FFT on a full-resolution video feed would be slow. Also, the resulting frequencies would reflect localized phenomena at each captured pixel, such that the motion map (the result of filtering the frequencies and then applying the IFFT) might appear noisy and overly sharpened. To address these problems, we want a cheap, blurry downsampling technique. However, we also want the option to enhance edges, which are important to our perception of motion.

Our need for a blurry downsampling technique is fulfilled by a Gaussian image pyramid. A Gaussian filter blurs an image by making each output pixel a weighted average of many input pixels in the neighborhood. An image pyramid is a series in which each image is half the width and height of the previous image. The halving of image dimensions is achieved by decimation, meaning that every other pixel is simply omitted. A Gaussian image pyramid is constructed by applying a Gaussian filter before...

Implementing the Lazy Eyes app


Let's create a new folder for Lazy Eyes and, in this folder, create copies of or links to the ResizeUtils.py and WxUtils.py files from any of our previous Python projects such as the project The Living Headlights in Chapter 5, Equipping Your Car with a Rearview Camera and Hazard Detection. Alongside the copies or links, let's create a new file, LazyEyes.py. Edit it and enter the following import statements:

import collections
import numpy
import cv2
import threading
import timeit
import wx

import pyfftw.interfaces.cache
from pyfftw.interfaces.scipy_fftpack import fft
from pyfftw.interfaces.scipy_fftpack import ifft
from scipy.fftpack import fftfreq

import ResizeUtils
import WxUtils

Besides the modules that we have used in previous projects, we are now using the standard library's collections module for efficient collections and its timeit module for precise timing. Also for the first time, we are using signal processing functionality from PyFFTW and SciPy.
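As a hint of how these standard library modules serve the app, here is a sketch of mine (the variable names are my own, not necessarily the book's): collections.deque gives a bounded sliding history of frames, and timeit.default_timer gives precise timestamps for estimating the frame rate:

```python
import collections
import timeit

# A bounded history: once full, appending silently drops the oldest frame.
maxHistoryLength = 360
history = collections.deque(maxlen=maxHistoryLength)

lastTimestamp = timeit.default_timer()
for frame in range(5):  # a stand-in for captured frames
    history.append(frame)
    timestamp = timeit.default_timer()
    # The interval between frames is the basis for a frame rate estimate.
    timePerFrame = timestamp - lastTimestamp
    lastTimestamp = timestamp

print(len(history))  # -> 5
```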

Like...

Configuring and testing the app for various motions


Currently, our main function initializes the LazyEyes object with the default parameters. By filling in the same parameter values explicitly, we would have this statement:

  lazyEyes = LazyEyes(maxHistoryLength=360,
                      minHz=5.0/6.0, maxHz=1.0,
                      amplification=32.0,
                      numPyramidLevels=2,
                      useLaplacianPyramid=True,
                      useGrayOverlay=True,
                      numFFTThreads=4,
                      numIFFTThreads=4,
                      imageSize=(480, 360))

This recipe calls for a capture resolution of 480 x 360 and a signal processing resolution of 120 x 90 (as we are downsampling by 2 pyramid levels or a factor of 4). We are amplifying the motion only at frequencies of 0.833 Hz to 1.0 Hz, only at edges (as we are using the Laplacian pyramid), only in grayscale, and only over a history of 360 frames (about 20 to 40 seconds, depending on the frame rate). Motion is exaggerated by a factor of 32. These settings are suitable...
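To see how minHz and maxHz relate to the FFT's output, here is a sketch of my own (the frame rate is an assumption, since the real app measures it at runtime) that uses fftfreq to find which frequency bins survive the 0.833 Hz to 1.0 Hz band-pass:

```python
import numpy
from scipy.fftpack import fftfreq

maxHistoryLength = 360
fps = 15.0  # assumed frame rate; the app estimates the real one with timeit

# One frequency label per bin of a 360-sample FFT.
frequencies = fftfreq(maxHistoryLength, d=1.0/fps)

# Bins outside the pass band are the ones a band-pass filter would zero out.
rejected = (numpy.abs(frequencies) < 5.0/6.0) | (numpy.abs(frequencies) > 1.0)

# Only a handful of bins around the target motion frequency survive.
print(frequencies[~rejected])
```

This illustrates why a long history matters: with 360 samples, the bins are finely spaced enough for a band as narrow as 0.167 Hz to contain several of them.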

Seeing things in another light


Although we began this chapter by presenting Eulerian video magnification as a useful technique for visible light, it is also applicable to other kinds of light or radiation. For example, a person's blood (in veins and bruises) is more visible when imaged in ultraviolet (UV) or in near infrared (NIR) than in visible light. (Skin is more transparent to UV and NIR.) Thus, a UV or NIR video is likely to be a better input if we are trying to magnify a person's pulse.

Here are some examples of NIR and UV cameras that might provide useful results, though I have not tested them:

Summary


This chapter has introduced you to the relationship between computer vision and digital signal processing. We have considered a video feed as a collection of many signals—one for each channel value of each pixel—and we have understood that repetitive motions create wave patterns in some of these signals. We have used the Fast Fourier Transform and its inverse to create an alternative video stream that only sees certain frequencies of motion. Finally, we have superimposed this filtered video atop the original to amplify the selected frequencies of motion. There, we summarized Eulerian video magnification in 100 words!

Our implementation adapts Eulerian video magnification to real time by running the FFT repeatedly on a sliding window of recently captured frames rather than running it once on an entire prerecorded video. We have considered optimizations such as limiting our signal processing to grayscale, recycling large data structures rather than recreating them, and using several...
