You're reading from OpenCV with Python Blueprints

Product type: Book
Published in: Oct 2015
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781785282690
Edition: 1st Edition
Author: Michael Beyeler

Michael Beyeler is a postdoctoral fellow in neuroengineering and data science at the University of Washington, where he is working on computational models of bionic vision in order to improve the perceptual experience of blind patients implanted with a retinal prosthesis (bionic eye). His work lies at the intersection of neuroscience, computer engineering, computer vision, and machine learning. He is also an active contributor to several open source software projects, and has professional programming experience in Python, C/C++, CUDA, MATLAB, and Android. Michael received a PhD in computer science from the University of California, Irvine, and an MSc in biomedical engineering and a BSc in electrical engineering from ETH Zurich, Switzerland.

Chapter 2. Hand Gesture Recognition Using a Kinect Depth Sensor

The goal of this chapter is to develop an app that detects and tracks simple hand gestures in real time using the output of a depth sensor, such as that of a Microsoft Kinect 3D sensor or an Asus Xtion. The app will analyze each captured frame to perform the following tasks:

  • Hand region segmentation: The user's hand region will be extracted in each frame by analyzing the depth map output of the Kinect sensor, which is done by thresholding, applying some morphological operations, and finding connected components

  • Hand shape analysis: The shape of the segmented hand region will be analyzed by determining contours, convex hull, and convexity defects

  • Hand gesture recognition: The number of extended fingers will be determined based on the hand contour's convexity defects, and the gesture will be classified accordingly (with no extended fingers corresponding to a fist, and five extended fingers corresponding to an open hand)

Gesture...

Planning the app


The final app will consist of the following modules and scripts:

  • gestures: A module that consists of an algorithm for recognizing hand gestures. We separate this algorithm from the rest of the application so that it can be used as a standalone module without the need for a GUI.

  • gestures.HandGestureRecognition: A class that implements the entire process flow of hand-gesture recognition. It accepts a single-channel depth image (acquired from the Kinect depth sensor) and returns an annotated RGB color image with an estimated number of extended fingers.

  • gui: A module that provides a wxPython GUI application to access the capture device and display the video feed. This is the same module that we used in the last chapter. In order to have it access the Kinect depth sensor instead of a generic camera, we will have to extend some of the base class functionality.

  • gui.BaseLayout: A generic layout from which more complicated layouts can be built.

  • chapter2: The main script for the chapter...

Setting up the app


Before we can get down to the nitty-gritty of our gesture recognition algorithm, we need to make sure that we can access the Kinect sensor and display a stream of depth frames in a simple GUI.

Accessing the Kinect 3D sensor

Accessing Microsoft Kinect from within OpenCV is not much different from accessing a computer's webcam or camera device. The easiest way to integrate a Kinect sensor with OpenCV is by using an OpenKinect module called freenect. For installation instructions, take a look at the preceding information box. The following code snippet grants access to the sensor using cv2.VideoCapture:

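import cv2
import freenect

# cv2.cv.CV_CAP_OPENNI is the OpenCV 2.4 name; OpenCV 3 and later
# expose the same constant as cv2.CAP_OPENNI
device = cv2.cv.CV_CAP_OPENNI
capture = cv2.VideoCapture(device)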

On some platforms, the first call to cv2.VideoCapture fails to open a capture channel. In this case, we provide a workaround by opening the channel ourselves:

# isOpened() takes no arguments; it reports the state of this capture object
if not capture.isOpened():
    capture.open(device)

If you want to connect to your Asus Xtion, the device variable should...

Tracking hand gestures in real time


Hand gestures are analyzed by the HandGestureRecognition class, especially by its recognize method. This class starts off with a few parameter initializations, which will be explained and used later:

class HandGestureRecognition:
    def __init__(self):
        # maximum depth deviation for a pixel to be considered
        # within range
        self.abs_depth_dev = 14

        # cut-off angle (deg): everything below this is a convexity 
        # point that belongs to two extended fingers
        self.thresh_deg = 80.0

The recognize method is where the real magic takes place. This method handles the entire process flow, from the raw grayscale image all the way to a recognized hand gesture. It implements the following procedure:

  1. It extracts the user's hand region by analyzing the depth map (img_gray) and returning a hand region mask (segment):

    def recognize(self, img_gray):
        segment = self._segment_arm(img_gray)
  2. It performs contour analysis on the hand region mask...

Hand region segmentation


The automatic detection of an arm, and later the hand region, could be made arbitrarily complicated, for example by combining information about the shape and color of an arm or hand. However, using skin color as the determining feature to find hands in visual scenes might fail terribly in poor lighting conditions or when the user is wearing gloves. Instead, we choose to recognize the user's hand by its shape in the depth map. Allowing hands of all sorts to be present in any region of the image would unnecessarily complicate the task of this chapter, so we make two simplifying assumptions:

  • We will instruct the user of our app to place their hand in front of the center of the screen, orienting their palm roughly parallel to the orientation of the Kinect sensor so that it is easier to identify the corresponding depth layer of the hand.

  • We will also instruct the user to sit roughly one to two meters away from the Kinect, and to slightly extend their arm in front...
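Under these two assumptions, the thresholding step can be sketched as follows. This is an illustration only, not the book's _segment_arm method: it takes the median depth of a small window around the image center and keeps every pixel within abs_depth_dev of that value. The function name, the half_window parameter, and its default are assumptions, and the morphological cleanup and connected-component steps are left out.

```python
import numpy as np


def segment_by_center_depth(img_gray, abs_depth_dev=14, half_window=10):
    """Return a 0/255 mask of pixels whose depth is close to the center depth."""
    height, width = img_gray.shape
    cy, cx = height // 2, width // 2
    # Median depth of a small window around the image center, where the
    # user is asked to hold their hand (the window size is an assumption)
    center = img_gray[cy - half_window:cy + half_window,
                      cx - half_window:cx + half_window]
    med_depth = int(np.median(center))
    # Cast to a signed type first so the subtraction cannot wrap around
    # for unsigned 8-bit input
    diff = np.abs(img_gray.astype(np.int32) - med_depth)
    # Keep pixels within +/- abs_depth_dev of the center depth
    return np.where(diff <= abs_depth_dev, 255, 0).astype(np.uint8)
```

Using the median rather than the mean keeps the estimate stable when a few background pixels fall inside the center window.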

Hand shape analysis


Now that we know (roughly) where the hand is located, we aim to learn something about its shape.

Determining the contour of the segmented hand region

The first step involves determining the contour of the segmented hand region. Luckily, OpenCV comes with a ready-made implementation of such an algorithm: cv2.findContours. This function acts on a binary image and returns a set of points that are believed to be part of the contour. As there might be multiple contours present in the image, it is possible to retrieve an entire hierarchy of contours:

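def _find_hull_defects(self, segment):
    # OpenCV 2.4 and 4.x return (contours, hierarchy); OpenCV 3.x returns
    # (image, contours, hierarchy), so adjust the unpacking there
    contours, hierarchy = cv2.findContours(segment, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)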

Furthermore, because we do not know which contour we are looking for, we have to make an assumption to clean up the contour result. Since it is possible that some small cavities are left over even after the morphological closing—but we are fairly certain that our mask contains only the segmented area of interest...

Hand gesture recognition


What remains to be done is to classify the hand gesture based on the number of extended fingers. For example, if we find five extended fingers, we assume the hand to be open, whereas no extended fingers implies a fist. All that we are trying to do is count from zero to five and make the app recognize the corresponding number of fingers.

This is actually trickier than it might seem at first. For example, people in Europe might count to three by extending their thumb, index finger, and middle finger. If you do that in the US, people there might get horrendously confused, because they do not tend to use their thumbs when signaling the number two. This might lead to frustration, especially in restaurants (trust me). If we could find a way to generalize these two scenarios—maybe by appropriately counting the number of extended fingers—we would have an algorithm that could teach simple hand gesture recognition to not only a machine but also (maybe) to an average waitress...
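One way to count extended fingers, in line with the thresh_deg parameter introduced earlier, is to measure the angle at each convexity defect and treat sharp angles as the gap between two extended fingers. The sketch below is an assumption-laden illustration, not the chapter's code: it expects the defects as (start, end, far) point triples, and both helper names and the rule of starting the count at one finger are hypothetical choices.

```python
import math


def angle_deg(v1, v2):
    """Angle in degrees between two 2D vectors."""
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    # atan2 of |cross| and dot is numerically stable for all angles
    return math.degrees(math.atan2(abs(cross), dot))


def count_fingers(defect_points, thresh_deg=80.0):
    """Count extended fingers from (start, end, far) point triples."""
    if not defect_points:
        return 0  # no defects at all: treat the hand as a fist
    # Assumed rule: once any defect exists, at least one finger is out
    num_fingers = 1
    for start, end, far in defect_points:
        # Vectors from the deepest point of the defect to both hull points
        to_start = (start[0] - far[0], start[1] - far[1])
        to_end = (end[0] - far[0], end[1] - far[1])
        # A sharp angle suggests the valley between two extended fingers
        if angle_deg(to_start, to_end) < thresh_deg:
            num_fingers += 1
    return min(num_fingers, 5)
```

For example, a defect with hull points (0, 0) and (20, 0) and farthest point (10, 40) spans roughly 28 degrees and counts as a gap between fingers, whereas a shallow defect spanning close to 170 degrees is ignored.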

Summary


This chapter showed a relatively simple and yet surprisingly robust way of recognizing a variety of hand gestures by counting the number of extended fingers.

The chapter first showed how a task-relevant region of the image can be segmented using depth information acquired from a Microsoft Kinect 3D sensor, and how morphological operations can be used to clean up the segmentation result. By analyzing the shape of the segmented hand region, the algorithm then classifies hand gestures based on the types of convexity defects found in the image. Once again, mastering our use of OpenCV to perform a desired task did not require us to produce a large amount of code. Instead, we were challenged to gain an important insight that made us use the built-in functionality of OpenCV in the most effective way possible.

Gesture recognition is a popular but challenging field in computer science, with applications in a large number of areas, such as human-computer interaction, video surveillance...
