You're reading from OpenCV Computer Vision Application Programming Cookbook Second Edition
In the previous chapter, we learned how to detect special points in an image with the objective of subsequently performing local image analysis. These keypoints are chosen to be distinctive enough that if a keypoint is detected in an image of an object, the same point is expected to be detected in other images depicting the same object. We also described some more sophisticated interest point detectors that can assign a representative scale factor and/or an orientation to a keypoint. As we will see in this recipe, this additional information can be useful to normalize scene representations with respect to viewpoint variations.
In order to perform image analysis based on interest points, we now need to build rich representations that uniquely describe each of these keypoints. This chapter looks at the different approaches that have been proposed to extract descriptors from interest points. These descriptors are generally 1D or 2D vectors of binary, integer, or floating...
Feature point matching is the operation by which points in one image are put in correspondence with points in another image (or in an image set). Image points should match when they correspond to the same scene element (that is, the same object point) in the real world.
A single pixel is certainly not sufficient to make a decision on the similarity of two keypoints. This is why an image patch around each keypoint must be considered during the matching process. If two patches correspond to the same scene element, then one might expect their pixels to exhibit similar values. A direct pixel-by-pixel comparison of the patches is the solution presented in this recipe. This is probably the simplest approach to feature point matching, but as we will see, it is not the most reliable one. Nevertheless, in several situations, it can give good results.
The SURF and SIFT keypoint detection algorithms, discussed in Chapter 8, Detecting Interest Points, define a location, an orientation, and a scale for each of the detected features. The scale factor information is useful to define the size of the window of analysis around each feature point. Thus, the defined neighborhood will include the same visual information regardless of the scale at which the object to which the feature belongs has been pictured. This recipe will show you how to describe an interest point's neighborhood using feature descriptors. In image analysis, the visual information included in this neighborhood can be used to characterize each feature point in order to make each point distinguishable from the others. Feature descriptors are usually N-dimensional vectors that describe a feature point in a way that is invariant to changes in lighting and to small perspective deformations. Generally, descriptors can be compared using simple distance...
In the previous recipe, we learned how to describe a keypoint using rich descriptors extracted from the image intensity gradient. These descriptors are floating-point vectors that have a dimension of 64, 128, or sometimes even higher. This makes them costly to manipulate. In order to reduce the memory and computational load associated with these descriptors, the idea of using binary descriptors has recently been introduced. The challenge here is to make them easy to compute and yet keep them robust to scene and viewpoint changes. This recipe describes some of these binary descriptors. In particular, we will look at the ORB and BRISK descriptors, whose associated feature point detectors were presented in Chapter 8, Detecting Interest Points.