
You're reading from OpenCV 4 Computer Vision Application Programming Cookbook - Fourth Edition

Product type: Book
Published in: May 2019
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781789340723
Edition: 4th Edition
Authors (2):
David Millán Escrivá

David Millán Escrivá was 8 years old when he wrote his first program, on an 8086 PC in BASIC, which enabled the 2D plotting of basic equations. In 2005, he finished his studies in IT with honors at the Universitat Politécnica de Valencia, specializing in human-computer interaction supported by computer vision with OpenCV (v0.96). He has worked with Blender, an open source 3D software project, and on its first commercial movie, Plumiferos, as a computer graphics software developer. David has more than 10 years' experience in IT, spanning computer vision, computer graphics, pattern recognition, and machine learning across various projects, start-ups, and companies. He currently works as a researcher in computer vision.

Robert Laganiere

Robert Laganiere is a professor at the School of Electrical Engineering and Computer Science of the University of Ottawa, Canada. He is also a faculty member of the VIVA research lab and is the co-author of several scientific publications and patents in content-based video analysis, visual surveillance, driver assistance, object detection, and tracking. Robert authored the OpenCV 2 Computer Vision Application Programming Cookbook in 2011 and co-authored Object Oriented Software Development, published by McGraw-Hill in 2001. He co-founded Visual Cortek, an Ottawa-based video analytics start-up, in 2006; it was later acquired by iwatchlife.com in 2009. He is also a consultant in computer vision and has assumed the role of Chief Scientist in a number of start-up companies, such as Cognivue Corp, iWatchlife, and Tempo Analytics. Robert has a Bachelor of Electrical Engineering degree from Ecole Polytechnique in Montreal (1987) and MSc and PhD degrees from INRS-Telecommunications, Montreal (1996). You can visit the author's website at laganiere.name.


Reconstructing 3D Scenes

In the previous chapter, we learned how a camera captures a 3D scene by projecting light rays on a 2D sensor plane. The image produced is an accurate representation of what the scene looks like from a particular point of view at the instant the image is captured. However, by its nature, the process of image formation eliminates all of the information concerning the depth of the represented scene elements. This chapter will examine how, under specific conditions, the 3D structure of the scene and the 3D pose of the cameras that captured it can be recovered. We will demonstrate how a good understanding of the concepts of projective geometry allows us to devise methods that enable 3D reconstruction. We will, therefore, revisit the principle of image formation that was introduced in the previous chapter; in particular, we will now take into consideration...

Digital image formation

Let's redraw a new version of the diagram shown in Chapter 10, Estimating Projective Relations in Images, describing the pinhole camera model. More specifically, we want to demonstrate the relationship between a point in 3D, at position (X, Y, Z), and its image (x, y) on the camera, specified in pixel coordinates:

Notice the changes that have been made to the original diagram. First, we added a reference frame positioned at the center of projection. Second, we have the y axis pointing downward in order to ensure that the coordinate system is compatible with the usual convention that places the image origin in the upper-left corner of the image. Finally, we also identified a special point on the image plane: since the line coming from the focal point is orthogonal to the image plane, the point (u0, v0) is the pixel position...
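To make the projective relation concrete, here is a minimal sketch in plain Python of how a 3D point expressed in the camera reference frame maps to pixel coordinates under the pinhole model. The focal lengths and principal point below are illustrative values, not the calibrated parameters of any real camera:

```python
# Minimal pinhole projection sketch: maps a 3D point (X, Y, Z), expressed in
# the camera reference frame, to pixel coordinates (x, y). The focal lengths
# (fx, fy) and principal point (u0, v0) are illustrative, not calibrated.

def project_point(X, Y, Z, fx=800.0, fy=800.0, u0=320.0, v0=240.0):
    """Apply the pinhole projective equations: x = fx*X/Z + u0, y = fy*Y/Z + v0."""
    if Z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    x = fx * X / Z + u0
    y = fy * Y / Z + v0
    return x, y

# A point on the optical axis projects exactly to the principal point (u0, v0):
print(project_point(0.0, 0.0, 2.0))   # (320.0, 240.0)
# A point 0.5 units to the right at depth 2 lands 200 pixels right of center:
print(project_point(0.5, 0.0, 2.0))   # (520.0, 240.0)
```

Note the division by Z: this is what discards depth during image formation, since every point along the same ray projects to the same pixel.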

Calibrating a camera

Camera calibration is the process by which the different camera parameters are obtained. You can obviously use the specifications provided by the camera manufacturer, but for some tasks, such as 3D reconstruction, these specifications are not accurate enough. Camera calibration works by showing known patterns to the camera and analyzing the obtained images. An optimization process will then determine the optimal parameter values that explain the observations. This is a complex process that has been made easy by the availability of OpenCV's calibration functions.

Getting ready

To calibrate a camera, you show it a set of scene points where the 3D positions are known. Then, you need to observe where...
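As a sketch of this setup: for a flat calibration target such as a chessboard, the known 3D scene points are simply the grid of inner corners, conventionally placed on the Z = 0 plane of the board's own reference frame. The grid dimensions and square size below are illustrative assumptions:

```python
# Sketch of the known 3D points used when calibrating against a flat
# chessboard target. The inner corners lie on the Z = 0 plane of the board's
# reference frame; a 6x4 inner-corner grid with 3 cm squares is assumed here.

def chessboard_object_points(cols=6, rows=4, square_size=0.03):
    """Return the (X, Y, Z) coordinates of each inner corner, row by row."""
    return [(c * square_size, r * square_size, 0.0)
            for r in range(rows) for c in range(cols)]

points = chessboard_object_points()
print(len(points))   # 24 corners for a 6x4 grid
print(points[0])     # (0.0, 0.0, 0.0) -- the board origin
```

The same list of 3D points is reused for every image of the board; only the detected 2D corner positions change from view to view, and the optimizer searches for the camera parameters that best explain all of them at once.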

Recovering the camera pose

When a camera is calibrated, it becomes possible to relate the captured images to the outside world. We previously explained that if the 3D structure of an object is known, then you can predict how the object will be projected onto the sensor of the camera. The process of image formation is described by the projective equation presented at the beginning of this chapter. When most of the terms of this equation are known, then it becomes possible to infer the value of the other elements (2D or 3D) through the observation of some images. In this recipe, we will look at the camera pose recovery problem when a known 3D structure is observed.
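As an illustration of this inference, the sketch below (plain Python, with an assumed calibration and an assumed pose) applies the full projective equation in the forward direction: a known 3D point is first moved into the camera frame by a rotation R and translation t, and then projected to pixels. Pose recovery runs this reasoning in reverse, searching for the R and t that best explain the observed pixel positions:

```python
import math

# Forward sketch of the projective equation with extrinsics: p_cam = R*P + t,
# followed by pinhole projection. The intrinsics (fx, fy, u0, v0) and the pose
# (a 30-degree rotation about Y plus a translation) are assumed values.

def rotation_y(angle):
    """3x3 rotation matrix about the Y axis, as nested lists."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def transform(R, t, P):
    """Rigid transform into the camera frame: R * P + t."""
    return [sum(R[i][j] * P[j] for j in range(3)) + t[i] for i in range(3)]

def project(P_cam, fx=800.0, fy=800.0, u0=320.0, v0=240.0):
    X, Y, Z = P_cam
    return fx * X / Z + u0, fy * Y / Z + v0

R = rotation_y(math.radians(30))
t = [0.1, 0.0, 2.0]                       # object 2 units in front of the camera
P_world = [0.0, 0.0, 0.0]                 # a known object point at the origin
print(project(transform(R, t, P_world)))  # pixel where this point should appear
```

Given several such known 3D points and their observed pixels, the unknowns R and t are over-determined, which is exactly what makes pose recovery solvable.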

How to do it...

Let's consider a simple object—...

Reconstructing a 3D scene from calibrated cameras

In the previous recipe, we saw that it is possible to recover the position of a camera that is observing a 3D scene when the camera is calibrated. The approach that was described took advantage of the fact that, sometimes, the coordinates of some 3D points visible in the scene might be known. We will now learn that if a scene is observed from more than one point of view, a 3D pose and structure can be reconstructed even if no information about the 3D scene is available. This time, we will use correspondences between image points in the different views in order to infer 3D information. We will introduce a new mathematical entity encompassing the relationship between two views of a calibrated camera, and we will discuss the principle of triangulation in order to reconstruct 3D points from 2D images.
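As a sketch of the triangulation step, the midpoint method below finds the 3D point closest to two back-projected rays; with measurement noise the rays rarely intersect exactly, so the midpoint of their shortest connecting segment is a natural estimate. The camera centers and ray directions are made-up values, not derived from a real calibration:

```python
# Midpoint triangulation sketch: given two back-projected rays (camera center
# plus direction), recover the 3D point closest to both rays. The centers and
# directions below are made-up illustrative values.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def triangulate_midpoint(C1, d1, C2, d2):
    """Return the midpoint of the shortest segment joining the two rays."""
    w = [c1 - c2 for c1, c2 in zip(C1, C2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b            # zero only for parallel rays
    s = (b * e - c * d) / denom      # parameter along the first ray
    u = (a * e - b * d) / denom      # parameter along the second ray
    p1 = [C1[i] + s * d1[i] for i in range(3)]
    p2 = [C2[i] + u * d2[i] for i in range(3)]
    return [(p1[i] + p2[i]) / 2 for i in range(3)]

# Two rays that actually meet at (0, 0, 4):
print(triangulate_midpoint([0, 0, 0], [0, 0, 1], [1, 0, 0], [-1, 0, 4]))
# [0.0, 0.0, 4.0]
```

In a real pipeline, each ray is obtained by back-projecting a matched image point through its camera's center, using the pose relationship between the two views.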

...

Computing depth from a stereo image

Humans view the world in three dimensions using their two eyes. Robots can do the same when they are equipped with two cameras; this is called stereo vision. A stereo rig is a pair of cameras mounted on a device, looking at the same scene and separated by a fixed baseline (that is, the distance between the two cameras). This recipe will demonstrate how a depth map can be computed from two stereo images by establishing correspondences between the two views.
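A sketch of the underlying relation: for two identical, perfectly aligned cameras, the depth Z of a point follows directly from its disparity d (the horizontal shift between its two image positions) via Z = f * B / d, where f is the focal length in pixels and B is the baseline. All numeric values below are illustrative:

```python
# Depth-from-disparity sketch for an ideal, perfectly aligned stereo rig:
# Z = f * B / d, where f is the focal length in pixels, B the baseline in
# meters, and d the disparity in pixels. All numeric values are illustrative.

def depth_from_disparity(d, f=800.0, B=0.1):
    """Depth (in meters) of a point observed with disparity d pixels."""
    if d <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return f * B / d

print(depth_from_disparity(40.0))  # 2.0 m: larger disparity means closer
print(depth_from_disparity(80.0))  # 1.0 m
```

The inverse relationship explains why stereo depth estimates degrade with distance: far points produce disparities of only a fraction of a pixel.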

Getting ready

A stereo vision system is generally made of two side-by-side cameras looking in the same direction. The following diagram illustrates such a stereo system in a perfectly aligned configuration:

Under this ideal...

