Kinect for Windows SDK Programming Guide

By Abhijit Jana

About this book

Kinect has been a game-changer in the world of motion games and applications since its first release. It has been touted as a controller for Microsoft Xbox but is much more than that. The developer version of Kinect, Kinect for Windows SDK, provides developers with the tools to develop applications that run on Windows. You can use this to develop applications that make interaction with your computer hands-free.

This book focuses on developing applications using the Kinect for Windows SDK. It is a complete end-to-end guide that covers the different features of the Kinect for Windows SDK with step-by-step instructions. The book will also help you develop motion-sensitive and speech-recognition-enabled applications. You will also learn about building applications that use multiple Kinects.

The book begins by explaining the different components of Kinect and then moves on to setting up the device and getting the development environment ready. You will be surprised at how quickly the book takes you through the details of the Kinect APIs. You will use the Natural User Interface (NUI) to work with natural inputs such as skeleton tracking, sensing, and speech recognition.

You will capture different types of streams and images, handle stream events, and capture frames. The Kinect device contains a motorized tilt to control the sensor angle, and you will learn how to adjust it automatically. The last part of the book teaches you how to build applications using multiple Kinects and discusses how Kinect can be integrated with other devices, such as Windows Phone and microcontrollers.

Publication date:
December 2012


Chapter 1. Understanding the Kinect Device

Welcome to the world of motion computing with Kinect. Kinect was originally known by the code name "Project Natal". It is a motion-sensing device that was originally developed for the Xbox 360 gaming console. One of the distinguishing factors that makes this device stand out among others in this genre is that it is not a hand-controlled device, but rather detects your body position, motion, and voice. Kinect provides a Natural User Interface (NUI) for interaction using body motion and gesture as well as spoken commands. Although this concept seems straight out of a fairytale, it is very much a reality now. The controller that was once the heart of a gaming device finds itself redundant in this Kinect age. You must be wondering where its replacement is. The answer, my friend, is YOU. It's you who is the replacement for the controller, and from now on, you are the controller for your Xbox. Kinect has ushered in a new revolution in the gaming world, and it has completely changed the perception of a gaming device. Since its inception, it has gone on to shatter several records in the gaming hardware domain. No wonder Kinect holds the Guinness World Record for being the "fastest selling consumer electronics device". One of the key selling points of the Kinect was the idea of "hands-free control", which caught the attention of gamers and tech enthusiasts alike and catapulted the device into instant stardom. This tremendous success has allowed the Kinect to break out of its original boundaries and venture out as an independent, standalone, gesture-controlled device.

It has now outgrown its Xbox roots and the Kinect sensor is no longer limited to only gaming. Kinect for Windows is a specially designed PC-centric sensor that helps developers to write their own code and develop real-life applications with human gestures and body motions. With the launch of the PC-centric Kinect for Windows devices, interest in motion-sensing software development has scaled a new peak.

As Kinect blazed through the market in such a short span of time, it also created a need for resources that help people learn the technology in an appropriate way. As Kinect is still a relatively new entry into the market, the resources for learning how to develop applications for this device are scant. So how does a developer understand the basics of Kinect right from scratch? This is where this book comes in.

This book assumes that you have basic knowledge of C# and a great enthusiasm to program for Kinect devices. This book can be enjoyed by anybody interested in knowing more about the device and learning how to interact with devices using the Kinect for Windows Software Development Kit (SDK). This book will also help you explore how to process video, depth, and audio streams, and build applications that interact with human body motion. The book has deliberately been kept simple and concise, which will help you grasp the concepts quickly.

Before delving into the development process, we need a good understanding of the device and of the different types of applications that we can develop with it. In order to develop standard applications using the Kinect for Windows SDK, it is really important for us to understand the components it interacts with.

In this chapter we will cover the following topics:

  • Identifying the critical components that make up Kinect

  • Looking into the functionalities of each of the components

  • Learning how they interact with each other

  • Choosing between Kinect for Windows and Kinect for Xbox

  • Exploring different application areas where we can use Kinect


Components of Kinect for Windows

Kinect is a horizontal device with depth sensors, a color camera, and a set of microphones, with everything secured inside a small, flat box. The flat box is attached to a small motor working as the base that enables the device to be tilted up and down. The Kinect sensor includes the following key components:

  • Color camera

  • Infrared (IR) emitter

  • IR depth sensor

  • Tilt motor

  • Microphone array

  • LED

Apart from the previously mentioned components, the Kinect device also has a power adapter for external power supply and a USB adapter to connect with a computer. The following figure shows the different components of a Kinect sensor:

Inside the Kinect sensor

From the outside, the Kinect sensor appears to be a plastic case with three cameras visible, but it has very sophisticated components, circuits, and algorithms embedded. If you remove the black plastic cover from the Kinect device, what will you see? The hardware components that make the Kinect sensor work.

The following image shows a front view of a Kinect sensor that's been unwrapped from its black case. Take a look (from left to right) at its IR emitter, color camera, and IR depth sensor:

Let's move further and discuss each component.

The color camera

This color camera is responsible for capturing and streaming the color video data. Its function is to detect the red, blue, and green colors from the source. The stream of data returned by the camera is a succession of still image frames. The Kinect color stream supports a speed of 30 frames per second (FPS) at a resolution of 640 x 480 pixels, and a maximum resolution of 1280 x 960 pixels at up to 12 FPS. The value of frames per second can vary depending on the resolution used for the image frame.
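To get a feel for what these two formats mean for an application, here is a small Python sketch that estimates the raw, uncompressed data rate of each one. The resolutions and frame rates come from the paragraph above; the figure of 4 bytes per pixel is an assumption made for illustration, not a value stated in this chapter.

```python
# Raw (uncompressed) data rate for the two color formats mentioned above.
# BYTES_PER_PIXEL is an assumption (32-bit pixels), not a figure from the text.
BYTES_PER_PIXEL = 4


def raw_rate_mb_per_s(width, height, fps, bytes_per_pixel=BYTES_PER_PIXEL):
    """Return the uncompressed stream rate in megabytes (10**6 bytes) per second."""
    return width * height * bytes_per_pixel * fps / 1_000_000


formats = {
    "640 x 480 @ 30 FPS": (640, 480, 30),
    "1280 x 960 @ 12 FPS": (1280, 960, 12),
}

for name, (width, height, fps) in formats.items():
    print(f"{name}: {raw_rate_mb_per_s(width, height, fps):.1f} MB/s")
```

Under that assumption, the higher resolution carries four times as many pixels per frame, so even at 12 FPS it moves more data per second than the 640 x 480 format does at 30 FPS. This is one way to see why the frame rate drops as the resolution goes up.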

The viewable range for the Kinect cameras is 43 degrees vertical by 57 degrees horizontal. The following figure shows an illustration of the viewable range of the Kinect camera:

The following image shows a color image that was captured using Kinect color sensors with a resolution of 640 x 480 pixels:

IR emitter and IR depth sensor

Kinect depth sensors consist of an IR emitter and an IR depth sensor. Both of them work together to make things happen. The IR emitter may look like a camera from the outside, but it's an IR projector that constantly emits infrared light in a "pseudo-random dot" pattern over everything in front of it. These dots are normally invisible to us, but it is possible to capture their depth information using an IR depth sensor. The dotted light reflects off different objects, and the IR depth sensor reads them from the objects and converts them into depth information by measuring the distance between the sensor and the object from where the IR dot was read. The following figure shows how the overall depth sensing looks:


It is quite fun and entertaining to know that these infrared dots can be seen by you. All we need is a night vision camera or goggles.

The depth data stream supports a resolution of 640 x 480 pixels, 320 x 240 pixels, and 80 x 60 pixels, and the sensor viewable range remains the same as the color camera.
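The viewable range and the stream resolutions can be combined to estimate how much of a room the sensor sees at a given distance. The following Python sketch uses the 57 by 43 degree range from the text and simple trigonometry; it treats the sensor as an ideal pinhole camera, which is a simplifying assumption, and the function names are ours.

```python
import math

H_FOV_DEG = 57.0  # horizontal viewable range, from the text
V_FOV_DEG = 43.0  # vertical viewable range, from the text


def footprint_m(distance_m, fov_deg):
    """Extent (in meters) covered by a viewing angle at a given distance."""
    return 2.0 * distance_m * math.tan(math.radians(fov_deg) / 2.0)


def pixel_size_mm(distance_m, fov_deg, pixels):
    """Approximate size of one pixel's footprint, in millimeters."""
    return footprint_m(distance_m, fov_deg) / pixels * 1000.0


for d in (0.8, 2.0, 4.0):
    width = footprint_m(d, H_FOV_DEG)
    height = footprint_m(d, V_FOV_DEG)
    px = pixel_size_mm(d, H_FOV_DEG, 640)
    print(f"{d:.1f} m: {width:.2f} m wide x {height:.2f} m high, "
          f"~{px:.1f} mm per pixel at 640 columns")
```

The covered area grows linearly with distance, so a person standing farther away occupies fewer pixels in both the color and the depth image.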

The following image shows depth images that are captured from the depth image stream:

How depth data processing works

The Kinect sensor has the ability to capture a raw, 3D view of the objects in front of it, regardless of the lighting conditions of the room. It uses an infrared (IR) emitter and an IR depth sensor that is a monochrome CMOS (Complementary Metal-Oxide-Semiconductor) sensor. The backbone behind this technology is from PrimeSense, and the following diagram shows how this works:

The sequence explained in the diagram is as follows:

When there is a need to capture depth data, the PrimeSense chip sends a signal to the infrared emitter to turn on the infrared light (1), and sends another signal to the IR depth sensor to initiate depth data capture from the current viewable range of the sensor (2). The IR emitter meanwhile starts sending infrared light, which is invisible to human eyes (3), to the objects in front of the device. The IR depth sensor starts reading the infrared data from the object based on the distance of the individual light points of reflection (4) and passes it to the PrimeSense chip (5). The PrimeSense chip then analyzes the captured data, creates a per-frame depth image, and passes it to the output depth stream (6).
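The text does not spell out how the chip turns the reflected dots into distances. One textbook way to recover depth from a projected pattern is triangulation, where the apparent sideways shift of a dot (its disparity) is inversely proportional to the distance. The Python sketch below illustrates only that general idea for steps (4) to (6); the focal length and emitter-to-sensor spacing are hypothetical values chosen for the example, and this should not be read as the actual PrimeSense algorithm.

```python
FOCAL_LENGTH_PX = 580.0  # hypothetical focal length in pixels (illustrative only)
BASELINE_M = 0.075       # hypothetical emitter-to-sensor spacing in meters


def depth_from_disparity(disparity_px, focal_px=FOCAL_LENGTH_PX,
                         baseline_m=BASELINE_M):
    """Textbook triangulation: depth = focal length * baseline / disparity."""
    if disparity_px <= 0:
        return None  # no measurable shift: treat the depth as unknown
    return focal_px * baseline_m / disparity_px


def depth_frame(disparities):
    """Convert a 2D grid of per-pixel dot shifts into a grid of depths in meters."""
    return [[depth_from_disparity(d) for d in row] for row in disparities]


# A tiny 2 x 2 "frame" of made-up disparities, including one unreadable pixel.
frame = depth_frame([[43.5, 21.75], [10.875, 0.0]])
for row in frame:
    print(row)
```

In this model, halving the disparity doubles the estimated distance, so small measurement errors matter more for far objects than for near ones.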


The IR emitter emits electromagnetic radiation. Its wavelength is longer than that of visible light, which makes the sensor's IR light invisible. The wavelength needs to be consistent to minimize the noise within the captured data. Heat generated by the laser diode when the Kinect sensor is running can affect the wavelength. The Kinect sensor has a small, inbuilt fan to normalize the temperature and ensure that the wavelength stays consistent.

Tilt motor

The base and body part of the sensor are connected by a tiny motor. It is used to change the camera and sensor's angles, to get the correct position of the human skeleton within the room. The following image shows the motor along with three gears that enable the sensor to tilt at a specified range of angles:

The motor can be tilted vertically up to 27 degrees, which means that the Kinect sensor's angles can be shifted upwards or downwards by 27 degrees. The following figure shows an illustration of the angle being changed when the motor is tilted:


Do not physically force the device into a specific angle. The Kinect for Windows SDK has a few specific APIs that can help us control the sensor's motor tilting. Do not tilt the Kinect motor frequently; use this as few times as possible and only when it's required.
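The two rules above (stay within 27 degrees and avoid moving the motor too often) are easy to enforce in application code. The following Python sketch shows one possible guard. `TiltController`, its `set_angle` callback, and the one-second minimum interval are all hypothetical names and values introduced for illustration; they are not part of the Kinect for Windows SDK.

```python
import time

MAX_TILT_DEG = 27  # tilt limit in either direction, from the text


class TiltController:
    """Hypothetical wrapper that clamps and rate-limits tilt requests."""

    def __init__(self, set_angle, min_interval_s=1.0, clock=time.monotonic):
        self._set_angle = set_angle            # callable that actually moves the motor
        self._min_interval_s = min_interval_s  # assumed cool-down, not from the text
        self._clock = clock
        self._last_move = None
        self.angle = 0                         # assumes the sensor starts level

    def request(self, angle_deg):
        """Clamp the angle and skip redundant or too-frequent moves.

        Returns True if the motor was asked to move, False otherwise.
        """
        target = max(-MAX_TILT_DEG, min(MAX_TILT_DEG, round(angle_deg)))
        if target == self.angle:
            return False  # already there: do not touch the motor
        now = self._clock()
        if self._last_move is not None and now - self._last_move < self._min_interval_s:
            return False  # too soon after the previous move
        self._set_angle(target)
        self.angle = target
        self._last_move = now
        return True


moves = []
tilt = TiltController(moves.append, min_interval_s=1.0)
tilt.request(40)   # clamped to the 27 degree limit
tilt.request(-10)  # ignored, because it arrives right after the previous move
print(moves)
```

In a real application, the `set_angle` callback would be replaced by whatever call the SDK provides for setting the sensor's tilt angle.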

Microphone array

The Kinect device exhibits great support for audio with the help of a microphone array. The microphone array consists of four different microphones that are placed in a linear order (three of them are spread on the right side and the other one is placed on the left side, as shown in the following image) at the bottom of the Kinect sensor:

The purpose of the microphone array is not just to let the Kinect device capture the sound but also to locate the direction of the audio wave. The main advantages of having an array of microphones over a single microphone are that capturing and recognizing the voice is done more effectively with enhanced noise suppression, echo cancellation, and beam-forming technology. This enables Kinect to act as a highly directional microphone that can identify the source of the sound and recognize the voice irrespective of the noise and echo present in the environment.
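To see why several microphones make it possible to locate a sound, consider just two of them. A sound coming from one side reaches the nearer microphone slightly earlier, and that tiny delay reveals the direction. The Python sketch below shows this time-difference idea under a far-field assumption (the source is much farther away than the microphone spacing). The 20 cm spacing is a hypothetical value, and this is only an illustration of the principle, not the beam-forming method the Kinect itself uses.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air at room temperature
MIC_SPACING_M = 0.20        # hypothetical spacing between two microphones


def arrival_angle_deg(delay_s, mic_spacing_m=MIC_SPACING_M):
    """Angle of a distant source relative to straight ahead, in degrees.

    delay_s is the difference in arrival time between the two microphones;
    its sign tells which side the sound comes from.
    """
    ratio = SPEED_OF_SOUND_M_S * delay_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # guard against noise pushing it out of range
    return math.degrees(math.asin(ratio))


for delay_us in (-300, 0, 150, 300):
    angle = arrival_angle_deg(delay_us * 1e-6)
    print(f"delay {delay_us:+5d} us -> source at {angle:+6.1f} degrees")
```

The delays involved are only a few hundred microseconds, which is why direction finding needs microphones that are sampled in tight synchronization.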


LED

An LED is placed in between the camera and the IR projector. It is used for indicating the status of the Kinect device. The green color of the LED indicates that the Kinect device drivers have loaded properly. If you are plugging Kinect into a computer, the LED will start with a green light once your system detects the device; however, for full functionality of your device, you need to plug the device into an external power source.


Kinect for Windows versus Kinect for Xbox

Although "Kinect for Windows" and "Kinect for Xbox" are similar in many respects, there are several subtle differences from a developer's point of view. We have to keep in mind that the main purpose of Kinect for Xbox was to enhance the gaming experience of the players. Developing applications was not its primary purpose. In contrast, Kinect for Windows is primarily a development device and is not intended for gaming.

You can develop applications that use either the Kinect for Windows sensor or the Kinect for Xbox sensor. The Kinect for Xbox sensor was built to track players that are up to 4.0 meters (about 13 feet) away from the sensor, but it cannot track objects that are closer than 80 cm, and some applications need to track objects at a very close range. The Kinect for Windows sensor has new firmware, which enables Near Mode tracking. Using Near Mode, Kinect for Windows supports the tracking of objects as close as 40 cm in front of the device without losing accuracy or precision. In Default Mode, both sensors cover the same range.


The Kinect for Windows SDK exposes APIs that let our application control the mode of the sensor (Near Mode or Default Mode); however, the core changes for this feature are built into the firmware of the Kinect for Windows sensor.
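The difference between the two modes can be summarized as a pair of distance ranges. The following Python sketch picks a mode for a given distance. The 0.8 m and 4.0 m limits for Default Mode and the 0.4 m minimum for Near Mode come from the text; the 3.0 m upper limit for Near Mode is an assumption, since this chapter does not state one. `RangeMode` and `pick_mode` are hypothetical names, not SDK types.

```python
from enum import Enum


class RangeMode(Enum):
    DEFAULT = "default"
    NEAR = "near"


# (minimum, maximum) trackable distance in meters for each mode.
# The Near Mode maximum of 3.0 m is an assumption, not a value from the text.
LIMITS_M = {
    RangeMode.DEFAULT: (0.8, 4.0),
    RangeMode.NEAR: (0.4, 3.0),
}


def in_range(distance_m, mode):
    """Return True if the distance falls inside the given mode's range."""
    low, high = LIMITS_M[mode]
    return low <= distance_m <= high


def pick_mode(distance_m):
    """Prefer Default Mode; fall back to Near Mode; None if neither covers it."""
    for mode in (RangeMode.DEFAULT, RangeMode.NEAR):
        if in_range(distance_m, mode):
            return mode
    return None


for d in (0.2, 0.5, 2.0, 3.5, 5.0):
    mode = pick_mode(d)
    print(f"{d:.1f} m -> {mode.value if mode else 'out of range'}")
```

Only the Kinect for Windows sensor can honor a switch to Near Mode, so an application written along these lines would also need to check which kind of sensor is attached.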

Both the Kinect for Windows and Kinect for Xbox sensors need additional power to work with your PC. This might not be required when the sensor is connected to the Xbox device, as the Xbox port has enough power to operate it. There is no difference between Xbox Kinect and Kinect for Windows in this respect. However, in Kinect for Windows, the USB cable is shorter and has been improved for more reliability and portability across a wide range of computers.

And finally, the Kinect for Windows sensor is for commercial applications, which means that if you are developing a commercial application, you must use the Kinect for Windows device for production, whereas you can use Kinect for Xbox for general development, learning, and research purposes.


Where can you use Kinect

By now it must have struck you that this is something more than just gaming. The Kinect sensor for Windows and the Kinect for Windows SDK open up a new opportunity for the developer to build a wide range of applications. These can include:

  • Capturing real-time video using the color sensor

  • Tracking a human body and then responding to its movements and gestures as a natural user interface

  • Measuring the distances of objects and responding

  • Analyzing 3D data and making a 3D model and measurement

  • Generating a depth map of the objects tracked

  • Recognizing a human voice and developing hands-free applications that can be controlled by voice

With this, you can build a number of real-world applications that fall under different domains. The following are a few examples, which will help you understand the applicability of Kinect sensors:

  • Healthcare: Using the Kinect sensor, you can build different applications for healthcare, such as exercise measurement, monitoring patients, their body movements, and so on

  • Robotics: Kinect can be used as a navigation system for robots by tracking human gestures, voice commands, or human body movements

  • Education: You can build various applications for students and kids to educate them and help them learn subjects through gestures and voice commands

  • Security system: Kinect can be used for developing security systems where you can track human body movement or faces and send notifications

  • Virtual Reality: With the help of Kinect 3D technology and human gesture tracking, several virtual reality applications can be built using the Kinect sensor

  • Trainer: Kinect can potentially be used as a trainer by measuring the movements of human body joints, providing live feedback to users if the joints are moving in an appropriate manner by comparing the movements with previously stored data

  • Military: Kinect can be used to build intelligent drones to spy on enemy lines

These were just a few specific examples of domains where you can use Kinect, but at the end of the day, where and how you want this device to work is up to your imagination.
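As a taste of what the trainer scenario in the list above involves, here is a minimal Python sketch that computes the angle at a joint from three 3D positions and compares it with a stored reference value. The joint coordinates, the reference angle, and the 10 degree tolerance are made-up values for illustration; in a real application, the positions would come from the skeleton tracking data.

```python
import math


def joint_angle_deg(a, b, c):
    """Angle at joint b (in degrees) formed by the 3D points a, b, and c."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.sqrt(sum(x * x for x in v1)) * math.sqrt(sum(x * x for x in v2))
    if norm == 0:
        raise ValueError("joint positions must not coincide")
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding errors
    return math.degrees(math.acos(cosine))


def feedback(measured_deg, reference_deg, tolerance_deg=10.0):
    """Compare a measured joint angle with previously stored reference data."""
    if abs(measured_deg - reference_deg) <= tolerance_deg:
        return "good"
    return "bend more" if measured_deg > reference_deg else "bend less"


# Made-up shoulder, elbow, and wrist positions in meters.
shoulder, elbow, wrist = (0.0, 0.4, 2.0), (0.0, 0.1, 2.0), (0.3, 0.1, 2.0)
angle = joint_angle_deg(shoulder, elbow, wrist)
print(f"elbow angle: {angle:.1f} degrees -> {feedback(angle, reference_deg=90.0)}")
```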



Summary

This chapter gave you an inside look at the different components of the Kinect sensor. You saw that the major components of a Kinect device are its color sensor, IR depth sensor, IR emitter, microphone array, and a stepper motor that tilts the sensor to change the Kinect camera angles. While the color sensor and depth sensor provide the video and depth data input, which is of prime importance for the functioning of the device, the microphone array ensures that the audio quality is also up to par. Also worth mentioning are the way Kinect processes the depth data and the array of microphones, a design novelty that helps in clear voice recognition through noise suppression and echo cancellation. Kinect for Windows is also capable of tracking humans at a close range of approximately 40 centimeters using Near Mode. It wouldn't be wrong to say that it is this combination of technological innovations that makes Kinect the awe-inspiring device that it is. You have also gone through the different kinds of applications that can be developed using Kinect. In the next chapter, we will walk you through the step-by-step installation and configuration of the development environment, along with troubleshooting tips and tricks that will help you make sure everything is in place before you begin development.

About the Author

  • Abhijit Jana

    Abhijit Jana works with Microsoft as a development consultant as part of Microsoft Services. As a consultant, his job is to help customers design, develop, and deploy enterprise-level secure solutions using Microsoft technologies. Apart from being a former Microsoft MVP (Most Valuable Professional), he is a speaker and author as well as an avid technology evangelist. He has delivered sessions at prestigious Microsoft events, such as TechED, Web Camps, Azure Camps, Community TechDays, Virtual TechDays, DevDays, and developer conferences. He loves to work with different .NET communities and help them with different opportunities.

    He is a well-known author and has published many articles on various .NET community sites. You can follow him on Twitter at @abhijitjana. He has authored the book Kinect for Windows SDK Programming Guide (ISBN: 1849692386 ISBN 13: 9781849692380).

    Abhijit lives in Hyderabad, India, with his wife, Ananya, and a beautiful little angel, Nilova.
