
You're reading from 3D Deep Learning with Python

Product type: Book
Published in: Oct 2022
Publisher: Packt
ISBN-13: 9781803247823
Edition: 1st Edition
Authors (3):
Xudong Ma

Xudong Ma is a Staff Machine Learning Engineer with Grabango Inc. in Berkeley, California. He was previously a Senior Machine Learning Engineer at Facebook (Meta) Oculus, where he worked closely with the PyTorch3D team on 3D facial tracking projects. He has many years of experience working on computer vision, machine learning, and deep learning. He holds a Ph.D. in Electrical and Computer Engineering.

Vishakh Hegde

Vishakh Hegde is a Machine Learning and Computer Vision researcher. He has over 7 years of experience in this field, during which he has authored multiple well-cited research papers and published patents. He holds a master's degree from Stanford University, specializing in applied mathematics and machine learning, and a BS and MS in Physics from IIT Madras. He previously worked at Schlumberger and Matroid. He is a Senior Applied Scientist at Ambient.ai, where he helped build their weapon detection system, which is deployed at several Global Fortune 500 companies. He is now leveraging his expertise and passion for solving business challenges to build a technology startup in Silicon Valley. You can learn more about him on his personal website.

Lilit Yolyan

Lilit Yolyan is a machine learning researcher working on her Ph.D. at YSU. Her research focuses on building computer vision solutions for smart cities using remote sensing data. She has 5 years of experience in the field of computer vision and has worked on a complex driver-safety solution to be deployed by many well-known car manufacturers.


Understanding Differentiable Volumetric Rendering

In this chapter, we are going to discuss a new approach to differentiable rendering. Unlike the mesh 3D data representation we used in the last chapter, we are going to use a voxel 3D data representation. The voxel representation has certain advantages over mesh models; for example, it is more flexible and highly structured.

To understand volumetric rendering, we need to understand several important concepts, such as ray sampling, volumes, volume sampling, and ray marching. All these concepts have corresponding PyTorch3D implementations. We will discuss each of these concepts using explanations and coding exercises.

After we understand the preceding basic concepts of volumetric rendering, we can easily see that all the operations mentioned are already differentiable. Volumetric rendering is naturally differentiable. Thus, by then, we will be ready to use differentiable volumetric rendering for some real applications...

Technical requirements

In order to run the example code snippets in this book, you need a computer, ideally with a GPU. However, it is also possible to run the code snippets using only CPUs, just more slowly.

The recommended computer configuration includes the following:

  • A GPU, for example, one from the NVIDIA GTX or RTX series, with at least 8 GB of memory
  • Python 3
  • The PyTorch and PyTorch3D libraries

The code snippets for this chapter can be found at https://github.com/PacktPublishing/3D-Deep-Learning-with-Python.

Overview of volumetric rendering

Volumetric rendering is a collection of techniques used to generate a 2D view of discrete 3D data. This 3D discrete data could be a collection of images, a voxel representation, or any other discrete representation of data. The main goal of volumetric rendering is to render a 2D projection of 3D data, since that is what our eyes can perceive on a flat screen. This method generates such projections without any explicit conversion to a geometric representation (such as meshes). Volumetric rendering is typically used when generating surfaces is difficult or can lead to errors. It can also be used when the content (and not just the geometry and surface) of the volume is important. It is commonly used for data visualization; for example, in brain scans, a visualization of the interior content of the brain is typically very important.

In this section, we will explore the volumetric rendering of a volume. We will get a high-level overview of volumetric...

Understanding ray sampling

Ray sampling is the process of emitting rays from the camera that go through the image pixels and sampling points along these rays. The ray sampling scheme depends on the use case. For example, sometimes we want to randomly sample rays that go through a random subset of image pixels. We typically need such a sampler during training, since we only need a representative sample of the full data. In such cases, we can use MonteCarloRaysampler in PyTorch3D. In other cases, we want to obtain the pixel values for every pixel of the image while maintaining the spatial order, which is useful for rendering and visualization. For such use cases, PyTorch3D provides NDCMultinomialRaysampler.

In the following, we will demonstrate how to use one of the PyTorch3D ray samplers, NDCGridRaysampler. It is similar to NDCMultinomialRaysampler, except that pixels are sampled along a regular grid. The code can be found in the GitHub repository in the file named understand_ray_sampling.py:

    ...
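Since the listing above is abbreviated, here is a small, framework-free sketch of what a grid ray sampler does. This is an illustration of the concept, not the actual PyTorch3D implementation; the image size, depth range, and the simple pinhole camera model are illustrative assumptions:

```python
# Emit one ray per pixel of an image grid and sample n_pts_per_ray points
# at evenly spaced depths between min_depth and max_depth.

def grid_ray_sample(image_width, image_height, n_pts_per_ray,
                    min_depth, max_depth):
    """Return, per pixel, the ray direction and the sampled 3D points."""
    rays = []
    for v in range(image_height):
        for u in range(image_width):
            # Normalized device coordinates in [-1, 1] for pixel centers.
            x = 2.0 * (u + 0.5) / image_width - 1.0
            y = 2.0 * (v + 0.5) / image_height - 1.0
            direction = (x, y, 1.0)  # pinhole camera looking along +z
            # Evenly spaced depths along the ray.
            depths = [min_depth + (max_depth - min_depth) * i / (n_pts_per_ray - 1)
                      for i in range(n_pts_per_ray)]
            points = [(x * d, y * d, d) for d in depths]
            rays.append({"pixel": (u, v), "direction": direction,
                         "depths": depths, "points": points})
    return rays

rays = grid_ray_sample(image_width=4, image_height=3, n_pts_per_ray=5,
                       min_depth=1.0, max_depth=3.0)
print(len(rays))            # one ray per pixel: 4 * 3 = 12
print(rays[0]["depths"])    # [1.0, 1.5, 2.0, 2.5, 3.0]
```

The PyTorch3D samplers do the same thing in batch, returning a bundle of ray origins, directions, and depths as tensors, which the later stages consume.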

Using volume sampling

Volume sampling is the process of obtaining color and occupancy information at the points provided by the ray sampler. The volume representation we are working with is discrete; therefore, the points defined in the ray sampling step typically do not fall exactly on the grid points of the volume. Since the nodes of the volume grid and the points on the rays generally have different spatial locations, we need an interpolation scheme to estimate the densities and colors at the ray points from the densities and colors defined at the volume nodes. We can do that by using the VolumeSampler implemented in PyTorch3D. The following code can be found in the GitHub repository in the understand_volume_sampler.py file:

  1. Import the Python modules that we need:
    import torch
    from pytorch3d.structures import Volumes
    from pytorch3d.renderer.implicit.renderer import VolumeSampler
  2. Set up the device:
    if torch.cuda.is_available():
        device = torch.device("cuda:0")
    else:
        device = torch.device("cpu")
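The interpolation step at the heart of volume sampling can be sketched without any framework. The following is a minimal trilinear interpolation of a density value at an arbitrary 3D point from a discrete voxel grid; it illustrates what a volume sampler must do, not the PyTorch3D VolumeSampler itself, and the 2x2x2 grid and query points are illustrative assumptions:

```python
def trilinear(grid, x, y, z):
    """Interpolate grid[k][j][i] at fractional coordinates (x, y, z).

    `grid` is a nested list indexed as grid[k][j][i]; the query point is
    assumed to lie strictly inside the grid (no boundary handling, for
    brevity).
    """
    i0, j0, k0 = int(x), int(y), int(z)
    i1, j1, k1 = i0 + 1, j0 + 1, k0 + 1
    dx, dy, dz = x - i0, y - j0, z - k0

    def lerp(a, b, t):
        return a * (1 - t) + b * t

    # Interpolate along x on four grid edges, then along y, then along z.
    c00 = lerp(grid[k0][j0][i0], grid[k0][j0][i1], dx)
    c10 = lerp(grid[k0][j1][i0], grid[k0][j1][i1], dx)
    c01 = lerp(grid[k1][j0][i0], grid[k1][j0][i1], dx)
    c11 = lerp(grid[k1][j1][i0], grid[k1][j1][i1], dx)
    c0 = lerp(c00, c10, dy)
    c1 = lerp(c01, c11, dy)
    return lerp(c0, c1, dz)

# A 2x2x2 density grid: density 0.0 at the z=0 corners, 1.0 at the z=1 corners.
density = [[[0.0, 0.0], [0.0, 0.0]],
           [[1.0, 1.0], [1.0, 1.0]]]
print(trilinear(density, 0.5, 0.5, 0.25))  # 0.25: a quarter of the way up in z
```

VolumeSampler applies the same idea in batch to every point of every ray, for both the density grid and the color grid.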

Exploring the ray marcher

Now that we have the color and density values for all the points sampled by the ray sampler, we need to figure out how to use them to finally render the pixel values of the projected image. In this section, we are going to discuss the process of converting the densities and colors at the points of rays to RGB values on images. This process models the physical process of image formation.

In this section, we discuss a very simple model, in which the RGB value of each image pixel is a weighted sum of the colors at the points of the corresponding ray. If we treat the densities p_i as probabilities of occupancy (or opacity), then the incident light intensity at the i-th point of the ray is a_i = product over j < i of (1 - p_j). Given that the probability that this point is occupied by some object is p_i, the expected light intensity reflected from this point is w_i = a_i p_i. We simply use the w_i as the weights in the weighted sum of colors. Usually, we normalize...
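The weighted-sum model above can be written in a few lines of plain Python. This is a minimal sketch of the idea, assuming the densities p_i lie in [0, 1] and act as occupancy probabilities; the densities and colors below are illustrative values, and a single scalar color channel stands in for RGB:

```python
def march_ray(densities, colors):
    """Compose a pixel value from per-point densities and colors.

    The light reaching point i is a_i = prod_{j < i} (1 - p_j), the weight
    of point i is w_i = a_i * p_i, and the pixel value is the weighted sum
    of the point colors, accumulated front to back.
    """
    pixel = 0.0
    transmittance = 1.0  # a_i: fraction of light reaching point i
    for p, c in zip(densities, colors):
        weight = transmittance * p        # w_i = a_i * p_i
        pixel += weight * c
        transmittance *= (1.0 - p)        # light absorbed by this point
    return pixel

# A fully opaque first point returns its own color unchanged:
print(march_ray([1.0, 0.5], [0.8, 0.2]))  # 0.8
# Semi-transparent points blend front to back:
print(march_ray([0.5, 1.0], [0.8, 0.2]))  # 0.5*0.8 + 0.5*1.0*0.2 = 0.5
```

This front-to-back accumulation is the same scheme that an emission-absorption ray marcher implements, applied to every ray of the image in parallel.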

Differentiable volumetric rendering

While standard volumetric rendering is used to render 2D projections of 3D data, differentiable volumetric rendering is used to do the opposite: construct 3D data from 2D images. Here is how it works: we represent the shape and texture of the object as a parametric function. This function can be used to generate 2D projections. Given 2D projections (typically multiple views of the 3D scene), we can then optimize the parameters of these implicit shape and texture functions so that their projections match the observed multi-view 2D images. This optimization is possible because the rendering process is completely differentiable and the implicit functions used are also differentiable.
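A toy illustration of why differentiability matters (not the book's actual pipeline): the rendered pixel value is a differentiable function of the densities, so we can fit a density to an observed pixel by gradient descent. The colors, the target value, and the learning rate below are illustrative assumptions; a real system optimizes millions of parameters over many rays using automatic differentiation:

```python
def render(p1, p2, c1, c2):
    # Same weighted-sum model as the ray marcher, for a two-point ray:
    # w1 = p1 and w2 = (1 - p1) * p2.
    return p1 * c1 + (1.0 - p1) * p2 * c2

c1, c2, p2 = 1.0, 0.5, 0.8   # fixed colors and a fixed second density
target = 0.7                  # observed pixel value to reproduce

p1 = 0.1                      # initial guess for the free density
lr = 0.5
for _ in range(200):
    pred = render(p1, p2, c1, c2)
    # Analytic gradient of the squared error with respect to p1.
    grad = 2.0 * (pred - target) * (c1 - p2 * c2)
    p1 -= lr * grad

print(round(render(p1, p2, c1, c2), 4))  # ≈ 0.7, the target pixel value
```

Because every step of the rendering chain (sampling, interpolation, ray marching) is differentiable, exactly this kind of gradient-based fitting scales up to recovering full 3D volumes from multi-view images.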

Reconstructing 3D models from multi-view images

In this section, we are going to show an example of using differentiable volumetric rendering to reconstruct 3D models from multi-view images. Reconstructing 3D models is a frequently encountered problem. Usually, the direct ways...

Summary

In this chapter, we started with a high-level description of differentiable volumetric rendering. We then dived deep into several important concepts of differentiable volumetric rendering, including ray sampling, volume sampling, and the ray marcher, using both explanations and coding examples. We also walked through a coding example of using differentiable volumetric rendering to reconstruct 3D models from multi-view images.

Using volumes for 3D deep learning has become an interesting direction in recent years. As innovative ideas emerge in this direction, many breakthroughs are being made. One of these breakthroughs, called Neural Radiance Fields (NeRF), will be the topic of our next chapter.

