In this article by Joe Minichino, author of Learning OpenCV 3 Computer Vision with Python, Second Edition, will discuss these kinds of reasonings later.
(For more resources related to this topic, see here.)
Edges play a major role in both human and computer vision. We, as humans, can easily recognize many object types and their positons just by seeing a backlit silhouette or a rough sketch. Indeed, when art emphasizes edges and pose, it often seems to convey the idea of an archetype, such as Rodin's The Thinker or Joe Shuster's Superman. Software, too, can reason about edges, poses, and archetypes. In this article by Joe Minichino, author of Learning OpenCV 3 Computer Vision with Python, Second Edition, will discuss these kinds of reasonings later.
OpenCV provides many edge-finding filters, including Laplacian(), Sobel(), and Scharr(). These filters are supposed to turn non-edge regions to black, while turning edge regions to white or saturated colors. However, they are prone to misidentifying noise as edges. This flaw can be mitigated by blurring an image before trying to find its edges. OpenCV also provides many blurring filters, including blur() (simple average), medianBlur(), and GaussianBlur(). The arguments for the edge-finding and blurring filters vary, but always include ksize, an odd whole number that represents the width and height (in pixels) of the filter's kernel.
For the purpose of blurring, let's use medianBlur(), which is effective in removing digital video noise, especially in color images. For the purpose of edge-finding, let's use Laplacian(), which produces bold edge lines, especially in grayscale images. After applying medianBlur(), but before applying Laplacian(), we should convert the BGR to grayscale.
Once we have the result of Laplacian(), we can invert it to get black edges on a white background. Then, we can normalize (so that its values range from 0 to 1) and multiply it with the source image to darken the edges. Let's implement this approach in filters.py:
def strokeEdges(src, dst, blurKsize = 7, edgeKsize = 5): if blurKsize >= 3: blurredSrc = cv2.medianBlur(src, blurKsize) graySrc = cv2.cvtColor(blurredSrc, cv2.COLOR_BGR2GRAY) else: graySrc = cv2.cvtColor(src, cv2.COLOR_BGR2GRAY) cv2.Laplacian(graySrc, cv2.CV_8U, graySrc, ksize = edgeKsize) normalizedInverseAlpha = (1.0 / 255) * (255 - graySrc) channels = cv2.split(src) for channel in channels: channel[:] = channel * normalizedInverseAlpha cv2.merge(channels, dst)
Note that we allow kernel sizes to be specified as arguments to strokeEdges(). The blurKsize argument is used as ksize for medianBlur(), while edgeKsize is used as ksize for Laplacian(). With my webcams, I find that a blurKsize value of 7 and an edgeKsize value of 5 look best. Unfortunately, medianBlur() is expensive with a large ksize, such as 7.
If you encounter performance problems when running strokeEdges(), try decreasing the blurKsize value. To turn off the blur option, set it to a value less than 3.
Custom kernels – getting convoluted
As we have just seen, many of OpenCV's predefined filters use a kernel. Remember that a kernel is a set of weights that determine how each output pixel is calculated from a neighborhood of input pixels. Another term for a kernel is a convolution matrix. It mixes up or convolvesthe pixels in a region. Similarly, a kernel-based filter may be called a convolution filter.
OpenCV provides a very versatile function, filter2D(), which applies any kernel or convolution matrix that we specify. To understand how to use this function, let's first learn the format of a convolution matrix. This is a 2D array with an odd number of rows and columns. The central element corresponds to a pixel of interest and the other elements correspond to this pixel's neighbors. Each element contains an integer or floating point value, which is a weight that gets applied to an input pixel's value. Consider this example:
kernel = numpy.array([[-1, -1, -1], [-1, 9, -1], [-1, -1, -1]])
Here, the pixel of interest has a weight of 9 and its immediate neighbors each have a weight of -1. For the pixel of interest, the output color will be nine times its input color, minus the input colors of all eight adjacent pixels. If the pixel of interest was already a bit different from its neighbors, this difference becomes intensified. The effect is that the image looks sharper as the contrast between neighbors is increased.
Continuing our example, we can apply this convolution matrix to a source and destination image, respectively, as follows:
cv2.filter2D(src, -1, kernel, dst)
The second argument specifies the per-channel depth of the destination image (such as cv2.CV_8U for 8 bits per channel). A negative value (as used here) means that the destination image has the same depth as the source image.
For color images, note that filter2D() applies the kernel equally to each channel. To use different kernels on different channels, we would also have to use the split() and merge() functions.
Based on this simple example, let's add two classes to filters.py. One class, VConvolutionFilter, will represent a convolution filter in general. A subclass, SharpenFilter, will specifically represent our sharpening filter. Let's edit filters.py to implement these two new classes as follows:
class VConvolutionFilter(object): """A filter that applies a convolution to V (or all of BGR).""" def __init__(self, kernel): self._kernel = kernel def apply(self, src, dst): """Apply the filter with a BGR or gray source/destination.""" cv2.filter2D(src, -1, self._kernel, dst) class SharpenFilter(VConvolutionFilter): """A sharpen filter with a 1-pixel radius.""" def __init__(self): kernel = numpy.array([[-1, -1, -1], [-1, 9, -1], [-1, -1, -1]]) VConvolutionFilter.__init__(self, kernel)
Note that the weights sum up to 1. This should be the case whenever we want to leave the image's overall brightness unchanged. If we modify a sharpening kernel slightly so that its weights sum up to 0 instead, then we have an edge detection kernel that turns edges white and non-edges black. For example, let's add the following edge detection filter to filters.py:
class FindEdgesFilter(VConvolutionFilter): """An edge-finding filter with a 1-pixel radius.""" def __init__(self): kernel = numpy.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]]) VConvolutionFilter.__init__(self, kernel)
Next, let's make a blur filter. Generally, for a blur effect, the weights should sum up to 1 and should be positive throughout the neighborhood. For example, we can take a simple average of the neighborhood as follows:
class BlurFilter(VConvolutionFilter): """A blur filter with a 2-pixel radius.""" def __init__(self): kernel = numpy.array([[0.04, 0.04, 0.04, 0.04, 0.04], [0.04, 0.04, 0.04, 0.04, 0.04], [0.04, 0.04, 0.04, 0.04, 0.04], [0.04, 0.04, 0.04, 0.04, 0.04], [0.04, 0.04, 0.04, 0.04, 0.04]]) VConvolutionFilter.__init__(self, kernel)
Our sharpening, edge detection, and blur filters use kernels that are highly symmetric. Sometimes, though, kernels with less symmetry produce an interesting effect. Let's consider a kernel that blurs on one side (with positive weights) and sharpens on the other (with negative weights). It will produce a ridged or embossed effect. Here is an implementation that we can add to filters.py:
class EmbossFilter(VConvolutionFilter): """An emboss filter with a 1-pixel radius.""" def __init__(self): kernel = numpy.array([[-2, -1, 0], [-1, 1, 1], [ 0, 1, 2]]) VConvolutionFilter.__init__(self, kernel)
This set of custom convolution filters is very basic. Indeed, it is more basic than OpenCV's ready-made set of filters. However, with a bit of experimentation, you will be able to write your own kernels that produce a unique look.
Modifying an application
Now that we have high-level functions and classes for several filters, it is trivial to apply any of them to the captured frames in Cameo. Let's edit cameo.py and add the lines that appear in bold face in the following excerpt:
import cv2 import filters from managers import WindowManager, CaptureManager class Cameo(object): def __init__(self): self._windowManager = WindowManager('Cameo', self.onKeypress) self._captureManager = CaptureManager( cv2.VideoCapture(0), self._windowManager, True) self._curveFilter = filters.BGRPortraCurveFilter() def run(self): """Run the main loop.""" self._windowManager.createWindow() while self._windowManager.isWindowCreated: self._captureManager.enterFrame() frame = self._captureManager.frame filters.strokeEdges(frame, frame) self._curveFilter.apply(frame, frame) self._captureManager.exitFrame() self._windowManager.processEvents()
Here, I have chosen to apply two effects: stroking the edges and emulating Portra film colors. Feel free to modify the code to apply any filters you like.
Here is a screenshot from Cameo, with stroked edges and Portra-like colors:
Edge detection with Canny
OpenCV also offers a very handy function, called Canny, (after the algorithm's inventor, John F. Canny) which is very popular not only because of its effectiveness, but also the simplicity of its implementation in an OpenCV program as it is a one-liner:
import cv2 import numpy as np img = cv2.imread("../images/statue_small.jpg", 0) cv2.imwrite("canny.jpg", cv2.Canny(img, 200, 300)) cv2.imshow("canny", cv2.imread("canny.jpg")) cv2.waitKey() cv2.destroyAllWindows()
The result is a very clear identification of the edges:
The Canny edge detection algorithm is quite complex but also interesting: it's a five-step process that denoises the image with a Gaussian filter, calculates gradients, applies nonmaximum suppression (NMS) on edges and a double threshold on all the detected edges to eliminate false positives, and, lastly, analyzes all the edges and their connection to each other to keep the real edges and discard weaker ones.
Another vital task in computer vision is contour detection, not only because of the obvious aspect of detecting contours of subjects contained in an image or video frame, but because of the derivative operations connected with identifying contours.
These operations are, namely computing bounding polygons, approximating shapes, and, generally, calculating regions of interest, which considerably simplifies the interaction with image data. This is because a rectangular region with numpy is easily defined with an array slice. We will be using this technique a lot when exploring the concept of object detection (including faces) and object tracking.
Let's go in order and familiarize ourselves with the API first with an example:
import cv2 import numpy as np img = np.zeros((200, 200), dtype=np.uint8) img[50:150, 50:150] = 255 ret, thresh = cv2.threshold(img, 127, 255, 0) image, contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE) color = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR) img = cv2.drawContours(color, contours, -1, (0,255,0), 2) cv2.imshow("contours", color) cv2.waitKey() cv2.destroyAllWindows()
Firstly, we create an empty black image that is 200x200 pixels size. Then, we place a white square in the center of it, utilizing ndarray's ability to assign values for a slice.
We then threshold the image, and call the findContours() function. This function takes three parameters: the input image, hierarchy type, and the contour approximation method. There are a number of aspects of particular interest about this function:
- The function modifies the input image, so it would be advisable to use a copy of the original image (for example, by passing img.copy()).
- Secondly, the hierarchy tree returned by the function is quite important: cv2.RETR_TREE will retrieve the entire hierarchy of contours in the image, enabling you to establish "relationships" between contours. If you only want to retrieve the most external contours, use cv2.RETR_EXTERNAL. This is particularly useful when you want to eliminate contours that are entirely contained in other contours (for example, in a vast majority of cases, you won't need to detect an object within another object of the same type).
The findContours function returns three elements: the modified image, contours, and their hierarchy. We use the contours to draw on the color version of the image (so we can draw contours in green) and eventually display it.
The result is a white square, with its contour drawn in green. Spartan, but effective in demonstrating the concept! Let's move on to more meaningful examples.
Contours – bounding box, minimum area rectangle and minimum enclosing circle
Finding the contours of a square is a simple task; irregular, skewed, and rotated shapes bring the best out of the cv2.findContours utility function of OpenCV. Let's take a look at the following image:
In a real-life application, we would be most interested in determining the bounding box of the subject, its minimum enclosing rectangle, and circle. The cv2.findContours function in conjunction with another few OpenCV utilities makes this very easy to accomplish:
import cv2 import numpy as np img = cv2.pyrDown(cv2.imread("hammer.jpg", cv2.IMREAD_UNCHANGED)) ret, thresh = cv2.threshold(cv2.cvtColor(img.copy(), cv2.COLOR_BGR2GRAY) , 127, 255, cv2.THRESH_BINARY) image, contours, hier = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE) for c in contours: # find bounding box coordinates x,y,w,h = cv2.boundingRect(c) cv2.rectangle(img, (x,y), (x+w, y+h), (0, 255, 0), 2) # find minimum area rect = cv2.minAreaRect(c) # calculate coordinates of the minimum area rectangle box = cv2.boxPoints(rect) # normalize coordinates to integers box = np.int0(box) # draw contours cv2.drawContours(img, [box], 0, (0,0, 255), 3)
# calculate center and radius of minimum enclosing circle (x,y),radius = cv2.minEnclosingCircle(c) # cast to integers center = (int(x),int(y)) radius = int(radius) # draw the circle img = cv2.circle(img,center,radius,(0,255,0),2) cv2.drawContours(img, contours, -1, (255, 0, 0), 1) cv2.imshow("contours", img)
After the initial imports, we load the image, and then apply a binary threshold on a grayscale version of the original image. By doing this, we operate all find-contours calculations on a grayscale copy, but we draw on the original so that we can utilize color information.
Firstly, let's calculate a simple bounding box:
x,y,w,h = cv2.boundingRect(c)
This is a pretty straightforward conversion of contour information to x and y coordinates, plus the height and width of the rectangle. Drawing this rectangle is an easy task:
cv2.rectangle(img, (x,y), (x+w, y+h), (0, 255, 0), 2)
Secondly, let's calculate the minimum area enclosing the subject:
rect = cv2.minAreaRect(c) box = cv2.boxPoints(rect) box = np.int0(box)
The mechanism here is particularly interesting: OpenCV does not have a function to calculate the coordinates of the minimum rectangle vertexes directly from the contour information. Instead, we calculate the minimum rectangle area, and then calculate the vertexes of this rectangle. Note that the calculated vertexes are floats, but pixels are accessed with integers (you can't access a "portion" of a pixel), so we'll need to operate this conversion. Next, we draw the box, which gives us the perfect opportunity to introduce the cv2.drawContours function:
cv2.drawContours(img, [box], 0, (0,0, 255), 3)
Firstly, this function—like all drawing functions—modifies the original image. Secondly, it takes an array of contours in its second parameter so that you can draw a number of contours in a single operation. So, if you have a single set of points representing a contour polygon, you need to wrap this into an array, exactly like we did with our box in the preceding example. The third parameter of this function specifies the index of the contour array that we want to draw: a value of -1 will draw all contours; otherwise, a contour at the specified index in the contour array (the second parameter) will be drawn.
Most drawing functions take the color of the drawing and its thickness as the last two parameters.
The last bounding contour we're going to examine is the minimum enclosing circle:
(x,y),radius = cv2.minEnclosingCircle(c) center = (int(x),int(y)) radius = int(radius) img = cv2.circle(img,center,radius,(0,255,0),2)
The only peculiarity of the cv2.minEnclosingCircle function is that it returns a two-element tuple, of which, the first element is a tuple itself, representing the coordinates of a circle's center, and the second element is the radius of this circle. After converting all these values to integers, drawing the circle is quite a trivial operation.
The final result on the original image looks like this:
Contours – convex contours and the Douglas-Peucker algorithm
Most of the time, when working with contours, subjects will have the most diverse shapes, including convex ones. A convex shape is defined as such when there exists two points within that shape whose connecting line goes outside the perimeter of the shape itself.
The first facility OpenCV offers to calculate the approximate bounding polygon of a shape is cv2.approxPolyDP. This function takes three parameters:
- A contour.
- An "epsilon" value representing the maximum discrepancy between the original contour and the approximated polygon (the lower the value, the closer the approximated value will be to the original contour).
- A boolean flag signifying that the polygon is closed.
The epsilon value is of vital importance to obtain a useful contour, so let's understand what it represents. Epsilon is the maximum difference between the approximated polygon's perimeter and the perimeter of the original contour. The lower this difference is, the more the approximated polygon will be similar to the original contour.
You may ask yourself why we need an approximate polygon when we have a contour that is already a precise representation. The answer is that a polygon is a set of straight lines, and the importance of being able to define polygons in a region for further manipulation and processing is paramount in many computer vision tasks.
Now that we know what an epsilon is, we need to obtain contour perimeter information as a reference value; this is obtained with the cv2.arcLength function of OpenCV:
epsilon = 0.01 * cv2.arcLength(cnt, True) approx = cv2.approxPolyDP(cnt, epsilon, True)
Effectively, we're instructing OpenCV to calculate an approximated polygon whose perimeter can only differ from the original contour in an epsilon ratio.
OpenCV also offers a cv2.convexHull function to obtain processed contour information for convex shapes, and this is a straightforward one-line expression:
hull = cv2.convexHull(cnt)
Let's combine the original contour, approximated polygon contour, and the convex hull in one image to observe the difference. To simplify things, I've applied the contours to a black image so that the original subject is not visible, but its contours are:
As you can see, the convex hull surrounds the entire subject, the approximated polygon is the innermost polygon shape, and in between the two is the original contour, mainly composed of arcs.
Detecting lines and circles
Detecting edges and contours are not only common and important tasks, they also constitute the basis for other—more complex—operations. Lines and shape detection walk hand in hand with edge and contour detection, so let's examine how OpenCV implements these.
The theory behind line and shape detection has its foundations in a technique called Hough transform, invented by Richard Duda and Peter Hart, extending (generalizing) the work done by Paul Hough in the early 1960s.
Let's take a look at OpenCV's API for Hough transforms.
First of all, let's detect some lines, which is done with the HoughLines and HoughLinesP functions. The only difference between the two functions is that one uses the standard Hough transform, and the second uses the probabilistic Hough transform (hence the P in the name).
The probabilistic version is called as such because it only analyzes lines as subset of points and estimates the probability of these points to all belong to the same line. This implementation is an optimized version of the standard Hough transform, in that, it's less computationally intensive and executes faster.
Let's take a look at a very simple example:
import cv2 import numpy as np img = cv2.imread('lines.jpg') gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY) edges = cv2.Canny(gray,50,120) minLineLength = 20 maxLineGap = 5 lines = cv2.HoughLinesP(edges,1,np.pi/180,100,minLineLength,maxLineGap) for x1,y1,x2,y2 in lines: cv2.line(img,(x1,y1),(x2,y2),(0,255,0),2) cv2.imshow("edges", edges) cv2.imshow("lines", img) cv2.waitKey() cv2.destroyAllWindows()
The crucial point of this simple script—aside from the HoughLines function call—is the setting of the minimum line length (shorter lines will be discarded) and maximum line gap, which is the maximum size of a gap in a line before the two segments start being considered as separate lines.
Also, note that the HoughLines function takes a single channel binary image, processed through the Canny edge detection filter. Canny is not a strict requirement, but an image that's been denoised and only represents edges is the ideal source for a Hough transform, so you will find this to be a common practice.
The parameters of HoughLinesP are the image, MinLineLength and MaxLineGap, which we mentioned previously, rho and theta which refers to the geometrical representations of the lines, which are usually 1 and np.pi/180, threshold which represents the threshold below which a line is discarded. The Hough transform works with a system of bins and votes, with each bin representing a line, so any line with a minimum of <threshold> votes is retained, and the rest are discarded.
OpenCV also has a function used to detect circles, called HoughCircles. It works in a very similar fashion to HoughLines, but where minLineLength and maxLineGap were the parameters to discard or retain lines, HoughCircles has a minimum distance between the circles' centers and the minimum and maximum radius of the circles. Here's the obligatory example:
import cv2 import numpy as np planets = cv2.imread('planet_glow.jpg') gray_img = cv2.cvtColor(planets, cv2.COLOR_BGR2GRAY) img = cv2.medianBlur(gray_img, 5) cimg = cv2.cvtColor(img,cv2.COLOR_GRAY2BGR) circles = cv2.HoughCircles(img,cv2.HOUGH_GRADIENT,1,120, param1=100,param2=30,minRadius=0,maxRadius=0) circles = np.uint16(np.around(circles)) for i in circles[0,:]: # draw the outer circle cv2.circle(planets,(i,i),i,(0,255,0),2) # draw the center of the circle cv2.circle(planets,(i,i),2,(0,0,255),3) cv2.imwrite("planets_circles.jpg", planets) cv2.imshow("HoughCirlces", planets) cv2.waitKey() cv2.destroyAllWindows()
Here's a visual representation of the result:
The detection of shapes using the Hough transform is limited to circles; however, we've already implicitly explored the detection of shapes of any kind, specifically, when we talked about approxPolyDP. This function allows the approximation of polygons, so if your image contains polygons, they will be quite accurately detected combining the usage of cv2.findContours and cv2.approxPolyDP.
You should also be proficient in detecting edges, lines, circles and shapes in general, additionally you should be able to find contours and exploit the information they provide about the subjects contained in an image. These concepts will serve as the ideal background to explore the topics in the next chapter, Image Segmentation and Depth Estimation.
Resources for Article:
Further resources on this subject:
- Basic Image Processing[article]
- Camera Calibration[article]
- Tracking Faces with Haar Cascades [article]