Chapter 6. Object Detection and Recognition
Object recognition has an important role in robotics. It is the process of identifying an object from camera images and finding its location. Using this, a robot can pick an object from the workspace and place it at another location.
This chapter will be useful for those who want to prototype a solution for a vision-related task. We are going to look at some popular ROS packages that perform object detection and recognition in 2D and 3D. We will not go deep into the theory, but you will see short notes about the algorithms as we discuss their applications.
You will learn about the following topics:
Getting started with object detection and recognition
The find_object_2d package in ROS
Installing find_object_2d
Detecting and tracking an object using a webcam
Detecting and tracking using 3D depth sensors
Getting started with 3D object recognition
Introducing the object-recognition package in ROS
Installing object-recognition packages
Detecting...
Getting started with object detection and recognition
So what's the main difference between detection and recognition? Consider face detection and face recognition. In face detection, the algorithm tries to detect a face from an image, but in recognition, the algorithm can also state information about whose face is detected. It may be the person's name, gender, or something else.
Similarly, object detection involves detecting a class of object, and recognition performs the next level of classification, which tells us the name of the object.
There is a vast number of applications that use object detection and recognition techniques. Here is a popular application that is going to be used in Amazon warehouses:
Figure 1: A photo from an Amazon Picking Challenge
Amazon is planning to automate the picking and placing of objects from the shelves inside their warehouses. To retrieve objects from the shelves, they are planning to deploy robotic arms such as the one shown in the previous...
The find_object_2d package in ROS
One of the advantages of ROS is that it has a large number of packages that can be reused in our applications. In our case, what we want is to implement an object recognition and detection system. The find_object_2d package (http://wiki.ros.org/find_object_2d) implements SURF, SIFT, FAST, and BRIEF feature detectors (https://goo.gl/B8H9Zm) and descriptors for object detection. Using the GUI provided by this package, we can mark the objects we want to detect and save them for future detection. The detector node will detect the objects in camera images and publish the details of each object through a topic. Using a 3D sensor, it can also estimate the depth and orientation of the object.
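To make "the details of the object" more concrete, here is a minimal Python sketch of how a detection message could be decoded. It rests on an assumption taken from the package wiki rather than from this chapter: detections are published as a flat list of floats, 12 per object, in the order id, width, height, followed by a 3x3 homography stored in the row-vector convention used by Qt's QTransform. Please check this layout against your installed version. The function names parse_objects, map_point, and corners are hypothetical helpers, and the sketch deliberately avoids ROS imports so that it runs anywhere; in a real node, data would come from the subscriber callback.

```python
# Assumed layout (12 floats per detected object), to be verified against
# the find_object_2d documentation for your version:
# [id, width, height, h11, h12, h13, h21, h22, h23, h31, h32, h33]
FLOATS_PER_OBJECT = 12


def parse_objects(data):
    """Split the flat float list into one dict per detected object."""
    if len(data) % FLOATS_PER_OBJECT != 0:
        raise ValueError("payload length is not a multiple of 12")
    objects = []
    for i in range(0, len(data), FLOATS_PER_OBJECT):
        chunk = list(data[i:i + FLOATS_PER_OBJECT])
        objects.append({
            "id": int(chunk[0]),
            "width": chunk[1],
            "height": chunk[2],
            # Three rows of the homography: [h11 h12 h13], [h21 ...], [h31 ...]
            "h": [chunk[3:6], chunk[6:9], chunk[9:12]],
        })
    return objects


def map_point(h, x, y):
    """Project a reference-image point into the camera image.

    Row-vector convention: [x y 1] * H, then divide by the third component.
    """
    xs = h[0][0] * x + h[1][0] * y + h[2][0]
    ys = h[0][1] * x + h[1][1] * y + h[2][1]
    w = h[0][2] * x + h[1][2] * y + h[2][2]
    return xs / w, ys / w


def corners(obj):
    """Corners of the object outline in the camera image, clockwise."""
    w, ht = obj["width"], obj["height"]
    return [map_point(obj["h"], x, y)
            for x, y in [(0, 0), (w, 0), (w, ht), (0, ht)]]


# Made-up sample: object 7, 100x50 pixels, shifted by (10, 20) in the image.
sample = [7, 100, 50, 1, 0, 0, 0, 1, 0, 10, 20, 1]
for obj in parse_objects(sample):
    print(obj["id"], corners(obj))
```

An empty list simply yields no objects, which is a convenient way to represent a frame in which nothing was detected.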
Installing find_object_2d
Installing this package is pretty easy. Here is the command to install it on Ubuntu 16.04 and ROS Kinetic:
$ sudo apt-get install ros-kinetic-find-object-2d
Installing from source code
Switch into the ROS workspace:
$ cd ~/catkin_ws/src
Clone the source code into the...
Getting started with 3D object recognition
In the previous section, we dealt with 2D object recognition using 2D and 3D sensors. In this section, we will discuss 3D recognition. So what is 3D object recognition? In 3D object recognition, we take the 3D data or point cloud data of the surroundings and a 3D model of the object. Then, we match the scene object with the trained model, and if a match is found, the algorithm marks the area of detection.
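The matching idea can be illustrated with a deliberately simplified Python sketch. This is not how any ROS package performs recognition: it is a toy that aligns only the centroids of two small point sets (translation only, no rotation or scale) and then measures how far each model point is from its nearest scene point. The function names, the sample points, and the 0.01 tolerance (assumed to be in metres) are all hypothetical.

```python
import math


def centroid(points):
    """Average of a list of (x, y, z) points."""
    n = float(len(points))
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def fit_score(model, scene):
    """Mean nearest-neighbour distance from model to scene.

    The model is first shifted so that both centroids coincide
    (translation only; rotation is ignored in this toy).
    """
    mc, sc = centroid(model), centroid(scene)
    shift = [sc[i] - mc[i] for i in range(3)]
    total = 0.0
    for p in model:
        moved = [p[i] + shift[i] for i in range(3)]
        total += min(
            math.sqrt(sum((moved[i] - q[i]) ** 2 for i in range(3)))
            for q in scene
        )
    return total / len(model)


def is_match(model, scene, tolerance=0.01):
    """Declare a match when the mean distance is within the tolerance."""
    return fit_score(model, scene) <= tolerance


# Model: corners of a 10 cm cube. Scene A: the same cube moved elsewhere.
# Scene B: a flat plate, which should not match the cube.
cube = [(x, y, z) for x in (0.0, 0.1) for y in (0.0, 0.1) for z in (0.0, 0.1)]
shifted_cube = [(x + 0.5, y, z + 0.2) for x, y, z in cube]
flat_plate = [(x * 0.1, y * 0.1, 0.0) for x in range(4) for y in range(2)]

print(is_match(cube, shifted_cube))
print(is_match(cube, flat_plate))
```

Real pipelines also have to estimate rotation, cope with partial views and sensor noise, and search a cluttered scene, which is where much of the computational cost discussed in this section comes from.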
In real-world scenarios, 3D object recognition/detection often performs better than 2D because it uses the complete shape information of the object, similar to human perception. But there are many challenges involved in this process too. Some of the main constraints are computational power and sensor cost: we may need more powerful computers to process 3D information, and the sensors for this purpose are more expensive.
Some of the latest applications using 3D object detection and recognition are autonomous robots, especially self-driving...
Introduction to 3D object recognition packages in ROS
ROS has packages for performing 3D object recognition. One of the popular packages we are dealing with in this section is the Object Recognition Kitchen (ORK). This project was started at Willow Garage mainly for 3D object recognition. The ORK is a generic way to detect any kind of object, whether it is textured, nontextured, transparent, and so on. It is a complete kit in which we can run several object-recognition techniques simultaneously. Beyond object recognition itself, it also covers non-vision aspects, such as database management to store 3D models, input/output handling, robot-ROS integration, and code reuse.
Installing ORK packages in ROS
Here are the installation instructions to set up the object_recognition package in ROS. We can install it using prebuilt binaries and source code....
Detecting and recognizing objects from 3D meshes
After installing these packages, let's start the detection. What are the procedures involved? Here are the main steps:
Building a CAD model of the object or capturing its 3D model
Training the model
Detecting the object using the trained model
The first step in the recognition process is building the 3D model of the desired object. We can do it using a CAD tool, or we can capture the real object using depth-sensing cameras. If the object is rigid, then the best procedure is CAD modelling, because the model will have all the 3D information regarding the object. When we try to capture and build a 3D model, it may contain errors, and the mesh may not look like the actual object because errors accumulate at each stage. After building the object model, it is uploaded to the object database. The next phase is training the uploaded object in the database. After training, we can start the detection process. The detection process will start...
There are several commands to start recognition using a trained model.
Starting roscore:
$ roscore
Starting the ROS driver for Kinect:
$ roslaunch openni_launch openni.launch
Setting the ROS parameters for the Kinect driver:
$ rosrun dynamic_reconfigure dynparam set /camera/driver depth_registration True
$ rosrun dynamic_reconfigure dynparam set /camera/driver image_mode 2
$ rosrun dynamic_reconfigure dynparam set /camera/driver depth_mode 2
Republishing the depth and RGB image topics using topic_tools relay:
$ rosrun topic_tools relay /camera/depth_registered/image_raw /camera/depth/image_raw
$ rosrun topic_tools relay /camera/rgb/image_rect_color /camera/rgb/image_raw
Here is the command to start recognition; we can use different pipelines to perform detection. The following command uses the tod pipeline. This will work well for textured objects.
$ rosrun object_recognition_core detection -c `rospack find object_recognition_tod`/conf/detection.ros.ork --visualize
Alternatively...
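The startup sequence listed above for the tod pipeline can be collected in one place. The following is a minimal Python sketch, assuming we only want to assemble and print the commands as a dry run rather than execute them. The function name build_commands and the sample package path are hypothetical; on a real system the path would come from running rospack find object_recognition_tod.

```python
import shlex


def build_commands(tod_pkg_path):
    """Return the startup commands from this section as argv lists.

    tod_pkg_path stands in for the output of
    `rospack find object_recognition_tod`.
    """
    dyn = "rosrun dynamic_reconfigure dynparam set /camera/driver {} {}"
    relay = "rosrun topic_tools relay {} {}"
    lines = [
        "roscore",
        "roslaunch openni_launch openni.launch",
        dyn.format("depth_registration", "True"),
        dyn.format("image_mode", 2),
        dyn.format("depth_mode", 2),
        relay.format("/camera/depth_registered/image_raw",
                     "/camera/depth/image_raw"),
        relay.format("/camera/rgb/image_rect_color",
                     "/camera/rgb/image_raw"),
        "rosrun object_recognition_core detection -c {} --visualize".format(
            shlex.quote(tod_pkg_path + "/conf/detection.ros.ork")),
    ]
    # Split each command line into an argument list, as subprocess expects.
    return [shlex.split(line) for line in lines]


# Hypothetical install location, used only for the dry run.
for argv in build_commands("/opt/ros/kinetic/share/object_recognition_tod"):
    print(" ".join(argv))
```

Note that roscore, the camera driver, the two relays, and the detection node all keep running until they are stopped, so they cannot simply be executed one after another in a single blocking loop. In practice, each would be started in its own terminal or as a background process.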
In this chapter, we dealt with object detection and recognition. Both are used extensively in robotic applications. The chapter started with a popular ROS package for 2D object detection. The package is called find_object_2d, and we covered object detection using a webcam and Kinect. After going through a demo using this package, we discussed 3D object recognition using a ROS package called object_recognition, which is mainly for 3D object recognition. We saw methods to build and capture the object model and its training procedure. After training, we discussed the steps to start detecting the object using a depth camera. Finally, we visualized the object recognition in Rviz.
In the next chapter, we will deal with interfacing ROS and Google TensorFlow.