Chapter 4. 3D Scene Reconstruction Using Structure from Motion

The goal of this chapter is to study how to reconstruct a scene in 3D by inferring the geometrical features of the scene from camera motion. This technique is sometimes referred to as structure from motion. By looking at the same scene from different angles, we will be able to infer the real-world 3D coordinates of different features in the scene. This process is known as triangulation, which allows us to reconstruct the scene as a 3D point cloud.

In the previous chapter, you learned how to detect and track an object of interest in the video stream of a webcam, even if the object is viewed from different angles or distances, or under partial occlusion. Here, we will take the tracking of interesting features a step further and consider what we can learn about the entire visual scene by studying similarities between image frames. If we take two pictures of the same scene from different angles, we can use feature matching or...

Planning the app


The final app will extract and visualize structure from motion on a pair of images. We will assume that these two images have been taken with the same camera, whose internal camera parameters we know. If these parameters are not known, they need to be estimated first in a camera calibration process.

The final app will then consist of the following modules and scripts (a minimal sketch of the class interface follows the list):

  • chapter4.main: This is the main function routine for starting the application.

  • scene3D.SceneReconstruction3D: This is a class that contains a range of functionalities for calculating and visualizing structure from motion. It includes the following public methods:

    • __init__: This constructor will accept the intrinsic camera matrix and the distortion coefficients

    • load_image_pair: A method used to load, from file, two images that were taken with the camera described earlier

    • plot_optic_flow: A method used to visualize the optic flow between the two image frames

    • draw_epipolar_lines: A method used to draw the...
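
Putting the pieces together, a minimal skeleton of the class interface might look as follows. This is a sketch based only on the method names listed above; the undistortion step in load_image_pair is an assumption about how the calibration data would be used, not necessarily the chapter's exact code:

import cv2
import numpy as np


class SceneReconstruction3D:
    def __init__(self, K, dist):
        """Store the intrinsic camera matrix and the distortion coefficients."""
        self.K = K
        self.K_inv = np.linalg.inv(K)  # handy later for relating F and E
        self.d = dist

    def load_image_pair(self, img_path1, img_path2):
        """Load the two input images and undo the lens distortion."""
        self.img1 = cv2.undistort(cv2.imread(img_path1), self.K, self.d)
        self.img2 = cv2.undistort(cv2.imread(img_path2), self.K, self.d)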

Camera calibration


So far, we have worked with whatever image came straight out of our webcam, without questioning the way in which it was taken. However, every camera lens has unique parameters, such as focal length, principal point, and lens distortion. What happens behind the scenes when a camera takes a picture is that light passes through a lens and an aperture before falling on the surface of a light sensor. This process can be approximated with the pinhole camera model. The process of estimating the parameters of a real-world lens such that it fits the pinhole camera model is called camera calibration (or camera resectioning; it should not be confused with photometric camera calibration).
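
OpenCV makes this procedure straightforward. The following is a minimal sketch of the standard chessboard-based calibration; the grid size and the calib_*.png file names are illustrative placeholders, not files shipped with this chapter:

import glob

import cv2
import numpy as np

# Inner-corner grid of the printed chessboard target (illustrative size)
pattern_size = (9, 6)

# 3D corner coordinates in the chessboard's own frame (all on the z=0 plane)
objp = np.zeros((pattern_size[0] * pattern_size[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern_size[0], 0:pattern_size[1]].T.reshape(-1, 2)

obj_points, img_points = [], []
for fname in glob.glob("calib_*.png"):
    gray = cv2.imread(fname, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern_size)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# Recover the intrinsic matrix K and the distortion coefficients d
rms, K, d, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)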

The pinhole camera model

The pinhole camera model is a simplification of a real camera in which there is no lens and the camera aperture is approximated by a single point (the pinhole). When viewing a real-world 3D scene (such as a tree), light rays pass through the point-sized...
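
For reference, the model can be written compactly: a 3D world point in homogeneous coordinates, \tilde{X}, is projected to homogeneous pixel coordinates \tilde{x} via the intrinsic matrix K and the camera pose [R | t]:

\tilde{x} \sim K \, [R \mid t] \, \tilde{X},
\qquad
K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}

Here, f_x and f_y are the focal lengths in pixel units and (c_x, c_y) is the principal point; these are exactly the intrinsic parameters that calibration recovers.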

Setting up the app


Going forward, we will be using a famous open source dataset called fountain-P11. It depicts a Swiss fountain viewed from various angles. An example of this is shown in the following image:

The dataset consists of 11 high-resolution images and can be downloaded from http://cvlabwww.epfl.ch/data/multiview/denseMVS.html. Had we taken the pictures ourselves, we would have had to go through the entire camera calibration procedure to recover the intrinsic camera matrix and the distortion coefficients. Luckily, these parameters are known for the camera that took the fountain dataset, so we can go ahead and hardcode these values in our code.
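
In code, that might look like the following sketch. The numbers are the intrinsics commonly reported for the fountain-P11 camera, but treat them as placeholders and verify them against the camera files that accompany the dataset:

import numpy as np

# Intrinsic camera matrix (focal lengths and principal point, in pixels)
K = np.array([[2759.48, 0.0, 1520.69],
              [0.0, 2764.16, 1006.81],
              [0.0, 0.0, 1.0]])

# The dataset images are essentially distortion-free, so zero distortion
# coefficients are a reasonable assumption here
d = np.zeros(5)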

The main function routine

Our main function routine will consist of creating and interacting with an instance of the SceneReconstruction3D class. This code can be found in the chapter4.py file, which imports all the necessary modules and instantiates the class:

import numpy as np

from scene3D import SceneReconstruction3D


def main():
    #...
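
The routine is truncated above; completed, it might look something like the following sketch, assuming the class interface described earlier (the image paths are hypothetical):

import numpy as np

from scene3D import SceneReconstruction3D


def main():
    # Illustrative intrinsics and (assumed) zero distortion, as discussed earlier
    K = np.array([[2759.48, 0.0, 1520.69],
                  [0.0, 2764.16, 1006.81],
                  [0.0, 0.0, 1.0]])
    d = np.zeros(5)

    scene = SceneReconstruction3D(K, d)
    scene.load_image_pair("fountain/0004.png", "fountain/0005.png")
    scene.plot_optic_flow()
    scene.plot_point_cloud()


if __name__ == '__main__':
    main()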

Estimating the camera motion from a pair of images


Now that we have loaded two images (self.img1 and self.img2) of the same scene, such as two examples from the fountain dataset, we find ourselves in a similar situation as in the last chapter. We are given two images that supposedly show the same rigid object or static scene, but from different viewpoints. However, this time we want to go a step further; if the only thing that changes between taking the two pictures is the location of the camera, can we infer the relative camera motion by looking at the matching features?

Well, of course we can. Otherwise, this chapter would not make much sense, would it? We will take the location and orientation of the camera in the first image as a given and then find out how much we have to reorient and relocate the camera so that its viewpoint matches that from the second image.

In other words, we need to recover the essential matrix of the camera in the second image. An essential matrix is a 3 x 3 matrix...
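
One common route, sketched here with OpenCV calls, is to estimate the fundamental matrix F from the point matches, fold in the known intrinsics to obtain the essential matrix E, and then decompose it. This sketch assumes pts1 and pts2 are N x 2 arrays of matched image points and uses cv2.recoverPose, which is available in OpenCV 3 and later; it is not necessarily the exact route this chapter's code takes:

import cv2
import numpy as np


def relative_pose(pts1, pts2, K):
    """Estimate [R | t] of the second camera relative to the first."""
    # Fundamental matrix from pixel correspondences; RANSAC rejects outliers
    F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 0.5, 0.99)

    # The essential matrix folds in the known intrinsics: E = K^T F K
    E = K.T @ F @ K

    # Of the four possible decompositions of E, recoverPose returns the one
    # that places the triangulated points in front of both cameras
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)
    return R, t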

Reconstructing the scene


Finally, we can reconstruct the 3D scene by making use of a process called triangulation. We are able to infer the 3D coordinates of a point because of the way epipolar geometry works. By calculating the essential matrix, we get to know more about the geometry of the visual scene than we might think. Because the two cameras depict the same real-world scene, we know that most of the 3D real-world points will be found in both images. Moreover, we know that the mapping from the 2D image points to the corresponding 3D real-world points will follow the rules of geometry. If we study a sufficiently large number of image points, we can construct, and solve, a (large) system of linear equations to recover the real-world coordinates.
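
With the relative pose [R | t] in hand, OpenCV can solve that linear system for us. A minimal sketch, assuming pts1 and pts2 are N x 2 arrays of matched points and K, R, t come from the previous steps:

import cv2
import numpy as np


def triangulate(pts1, pts2, K, R, t):
    """Triangulate matched image points into 3D world coordinates."""
    # Projection matrices: camera 1 sits at the origin, camera 2 at [R | t]
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([R, t.reshape(3, 1)])

    # OpenCV expects 2 x N point arrays and returns homogeneous 4 x N output
    pts4d = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
    return (pts4d[:3] / pts4d[3]).T  # de-homogenize to an N x 3 array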

Let's return to the Swiss fountain dataset. If we ask two photographers to take a picture of the fountain from different viewpoints at the same time, it is not hard to realize that the first photographer might show up in the...

3D point cloud visualization


The last step is visualizing the triangulated 3D real-world points. An easy way of creating 3D scatterplots is by using matplotlib. However, if you are looking for more professional visualization tools, you may be interested in Mayavi (http://docs.enthought.com/mayavi/mayavi), VisPy (http://vispy.org), or the Point Cloud Library (http://pointclouds.org). Although the latter does not have Python support for point cloud visualization yet, it is an excellent tool for point cloud segmentation, filtering, and sample consensus model fitting. For more information, head over to strawlab's GitHub repository at https://github.com/strawlab/python-pcl.

Before we can plot our 3D point cloud, we obviously have to extract the [R | t] matrix and perform the triangulation as explained earlier:

def plot_point_cloud(self, feat_mode="SURF"):
    self._extract_keypoints(feat_mode)
    self._find_fundamental_matrix()
    self._find_essential_matrix()
    self._find_camera_matrices_rt...
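
Once the triangulated points are available, say as an N x 3 array (the pts3D name below is assumed for illustration), the matplotlib scatterplot itself is only a few lines:

import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401, registers the '3d' projection


def plot_scatter(pts3D):
    """Render an N x 3 array of points as a simple 3D scatterplot."""
    fig = plt.figure()
    ax = fig.add_subplot(111, projection='3d')
    ax.scatter(pts3D[:, 0], pts3D[:, 1], pts3D[:, 2], s=2)
    ax.set_xlabel('x')
    ax.set_ylabel('y')
    ax.set_zlabel('z')
    plt.show()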

Summary


In this chapter, we explored a way of reconstructing a scene in 3D—by inferring the geometrical features of 2D images taken by the same camera. We wrote a script to calibrate a camera, and you learned about fundamental and essential matrices. We used this knowledge to perform triangulation. We then went on to visualize the real-world geometry of the scene in a 3D point cloud. Using simple 3D scatterplots in matplotlib, we found a way to convince ourselves that our calculations were accurate and practical.

Going forward from here, it will be possible to store the triangulated 3D points in a file that can be parsed by the Point Cloud Library, or to repeat the procedure for different image pairs so that we can generate a denser and more accurate reconstruction. Although we have covered a lot in this chapter, there is a lot more left to do. Typically, when talking about a structure-from-motion pipeline, we include two additional steps that we have not talked about so far: bundle adjustment...
