
You're reading from 3D Deep Learning with Python

Product type: Book
Published in: Oct 2022
Publisher: Packt
ISBN-13: 9781803247823
Edition: 1st
Authors (3):

Xudong Ma

Xudong Ma is a Staff Machine Learning Engineer with Grabango Inc. in Berkeley, California. He was previously a Senior Machine Learning Engineer at Facebook (Meta) Oculus, where he worked closely with the PyTorch3D team on 3D facial tracking projects. He has many years of experience working on computer vision, machine learning, and deep learning. He holds a Ph.D. in Electrical and Computer Engineering.

Vishakh Hegde

Vishakh Hegde is a Machine Learning and Computer Vision researcher. He has over 7 years of experience in this field, during which he has authored multiple well-cited research papers and published patents. He holds a master's degree from Stanford University, specializing in applied mathematics and machine learning, and a BS and MS in Physics from IIT Madras. He previously worked at Schlumberger and Matroid. He is a Senior Applied Scientist at Ambient.ai, where he helped build their weapon detection system, which is deployed at several Global Fortune 500 companies. He is now leveraging his expertise and passion for solving business challenges to build a technology startup in Silicon Valley. You can learn more about him on his personal website.

Lilit Yolyan

Lilit Yolyan is a machine learning researcher working on her Ph.D. at YSU. Her research focuses on building computer vision solutions for smart cities using remote sensing data. She has 5 years of experience in the field of computer vision and has worked on a complex driver-safety solution to be deployed by many well-known car manufacturers.


Exploring Neural Radiance Fields (NeRF)

In the previous chapter, you learned about Differentiable Volume Rendering, where you reconstructed a 3D volume from several multi-view images. With this technique, you modeled a volume consisting of N x N x N voxels, so the space required to store the volume scales as O(N³). This is undesirable, especially if we want to transmit this information over a network. Other methods can overcome such large disk space requirements, but they are prone to smoothing out geometry and texture. Therefore, we cannot use them to reliably model very complex or highly textured scenes.
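To get a feel for how quickly this grows, here is a quick back-of-the-envelope sketch (the four float32 values per voxel and the grid sizes are illustrative assumptions, not figures from the previous chapter):

    # Storage cost of a dense N x N x N voxel grid, assuming four
    # float32 values per voxel (r, g, b, density) -- an illustrative
    # layout, not necessarily the one used in the previous chapter.
    def voxel_grid_bytes(n, values_per_voxel=4, bytes_per_value=4):
        return n ** 3 * values_per_voxel * bytes_per_value

    for n in (128, 256, 512):
        print(f"N={n}: {voxel_grid_bytes(n) / 2**20:.0f} MiB")
    # N=128: 32 MiB, N=256: 256 MiB, N=512: 2048 MiB

Doubling the resolution multiplies the storage by eight, which is exactly the O(N³) behavior that makes dense voxel grids impractical to store and transmit.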

In this chapter, we are going to discuss a breakthrough approach to representing 3D scenes, called Neural Radiance Fields (NeRF). This is one of the first techniques to model a 3D scene that requires only a small, constant amount of disk space while, at the same time, capturing the fine geometry and texture of complex scenes.

In this chapter, you will learn about the following topics...

Technical requirements

In order to run the example code snippets in this book, you need a computer, ideally with a GPU that has about 8 GB of memory. Running the code snippets using only CPUs is possible but will be extremely slow. The recommended computer configuration is as follows:

  • A GPU device – for example, Nvidia GTX series or RTX series with at least 8 GB of memory
  • Python 3.7+
  • The PyTorch and PyTorch3D libraries

The code snippets for this chapter can be found at https://github.com/PacktPublishing/3D-Deep-Learning-with-Python.
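Before going further, it may be worth confirming that PyTorch can actually see your GPU. Here is a minimal check (a generic snippet, not part of the book's repository):

    import torch

    # Quick sanity check of the training environment.
    print("PyTorch:", torch.__version__)
    if torch.cuda.is_available():
        props = torch.cuda.get_device_properties(0)
        print(f"GPU: {props.name}, {props.total_memory / 2**30:.1f} GB")
    else:
        print("No GPU found; training will be extremely slow.")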

Understanding NeRF

View synthesis is a long-standing problem in 3D computer vision. The challenge is to synthesize new views of a 3D scene from a small number of available 2D snapshots of it. It is particularly difficult because the appearance of a complex scene depends on many factors, such as object artifacts, light sources, reflections, opacity, object surface texture, and occlusions. Any good representation should capture this information either implicitly or explicitly. Additionally, many objects have complex structures that are not completely visible from a given viewpoint, so the challenge is to construct complete information about the world from incomplete and noisy observations.

As the name suggests, NeRF uses neural networks to model the world. As we will learn later in the chapter, NeRF uses neural networks in a very unconventional manner. It was a concept first developed by a team of researchers from UC Berkeley, Google Research, and UC San Diego. Because of...

Training a NeRF model

In this section, we are going to train a simple NeRF model on images generated from the synthetic cow model. We will only instantiate the NeRF model here, without worrying about how it is implemented; the implementation details are covered in the next section. A single neural network (the NeRF model) is trained to represent a single 3D scene. The following code can be found in train_nerf.py in this chapter's GitHub repository; it is modified from a PyTorch3D tutorial. Let us go through the code to train a NeRF model on the synthetic cow scene:

  1. First, let us import the standard modules:
    import torch
    import matplotlib.pyplot as plt
  2. Next, let us import the functions and classes used for rendering. These are PyTorch3D renderer components:
    from pytorch3d.renderer import (
        FoVPerspectiveCameras,
        NDCMultinomialRaysampler,
        MonteCarloRaysampler,
        EmissionAbsorptionRaymarcher,
        ImplicitRenderer,
    )
    from utils.helper_functions import (generate_rotating_nerf...
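The listing above is cut off at the helper import. For orientation, here is a rough sketch of how the imported components are typically wired together into renderers, following the public PyTorch3D NeRF tutorial that this script is modified from (the image size, depth bounds, and ray counts below are assumed values, not necessarily the book's settings; the snippet reuses the imports from step 2):

    # Hedged sketch: combining the imports above into two renderers.
    # All hyperparameters here are assumptions borrowed from the public
    # PyTorch3D tutorial, not necessarily the values in train_nerf.py.
    render_size = 128          # rendered image resolution (assumed)
    volume_extent = 3.0        # far bound of the scene volume (assumed)

    # Full-image sampler, used for visualization and evaluation.
    raysampler_grid = NDCMultinomialRaysampler(
        image_height=render_size,
        image_width=render_size,
        n_pts_per_ray=128,
        min_depth=0.1,
        max_depth=volume_extent,
    )

    # Random-ray sampler, used during training (much cheaper per step).
    raysampler_mc = MonteCarloRaysampler(
        min_x=-1.0, max_x=1.0,
        min_y=-1.0, max_y=1.0,
        n_rays_per_image=750,
        n_pts_per_ray=128,
        min_depth=0.1,
        max_depth=volume_extent,
    )

    # Composites per-point colors and densities along each ray.
    raymarcher = EmissionAbsorptionRaymarcher()

    renderer_grid = ImplicitRenderer(raysampler=raysampler_grid, raymarcher=raymarcher)
    renderer_mc = ImplicitRenderer(raysampler=raysampler_mc, raymarcher=raymarcher)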

Understanding the NeRF model architecture

So far, we have used the NeRF model class without fully knowing what it looks like. In this section, we will first visualize what the neural network looks like and then go through the code in detail and understand how it is implemented.

The neural network takes the harmonic embedding of the spatial location (x, y, z) and the harmonic embedding of the viewing direction (θ, φ) as its input, and outputs the predicted density σ and the predicted color (r, g, b). The following figure illustrates the network architecture that we are going to implement in this section:

Figure 6.5: The simplified model architecture of the NeRF model

Note

The model architecture that we are going to implement differs from the original NeRF model architecture: we implement a simplified version of it, which makes the model faster and easier to train.

Let us start defining the NeuralRadianceField...
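Since the full listing is truncated here, the following is a minimal sketch of what such a simplified class could look like in plain PyTorch (the layer sizes, activations, and number of harmonic functions are illustrative assumptions, not the book's exact implementation, and the real class plugs into PyTorch3D's ImplicitRenderer via a ray-bundle interface omitted here):

    import torch
    import torch.nn as nn

    def harmonic_embedding(x, n_harmonic=6):
        # Maps each coordinate to [sin(2^k * x), cos(2^k * x)], k = 0..n-1.
        freqs = 2.0 ** torch.arange(n_harmonic, device=x.device)
        angles = x[..., None] * freqs                 # (..., dim, n_harmonic)
        emb = torch.cat([angles.sin(), angles.cos()], dim=-1)
        return emb.flatten(start_dim=-2)              # (..., dim * 2 * n_harmonic)

    class NeuralRadianceField(nn.Module):
        # Simplified NeRF MLP; layer sizes are illustrative assumptions.
        def __init__(self, n_harmonic=6, hidden=256):
            super().__init__()
            self.n_harmonic = n_harmonic
            embed_dim = 3 * 2 * n_harmonic            # for both location and direction
            self.backbone = nn.Sequential(
                nn.Linear(embed_dim, hidden), nn.Softplus(),
                nn.Linear(hidden, hidden), nn.Softplus(),
            )
            # Density head: sigma >= 0, independent of viewing direction.
            self.density_head = nn.Sequential(nn.Linear(hidden, 1), nn.Softplus())
            # Color head: depends on features and the viewing direction.
            self.color_head = nn.Sequential(
                nn.Linear(hidden + embed_dim, hidden // 2), nn.Softplus(),
                nn.Linear(hidden // 2, 3), nn.Sigmoid(),
            )

        def forward(self, points, directions):
            features = self.backbone(harmonic_embedding(points, self.n_harmonic))
            sigma = self.density_head(features)
            dir_emb = harmonic_embedding(directions, self.n_harmonic)
            rgb = self.color_head(torch.cat([features, dir_emb], dim=-1))
            return rgb, sigma

Note how the density depends only on the 3D location, while the color also depends on the viewing direction; this is what lets the model capture view-dependent effects such as specular highlights.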

Understanding volume rendering with radiance fields

Volume rendering allows you to create a 2D projection of a 3D image or scene. In this section, we will learn about rendering a 3D scene from different viewpoints. For the purposes of this section, assume that the NeRF model is fully trained and that it accurately maps the input coordinates (x, y, z, dx, dy, dz) to an output (r, g, b, σ). Here are the definitions of these input and output coordinates (a sketch of how they combine into a pixel color follows the list):

  • (x, y, z): A point in the 3D scene in world coordinates
  • (dx, dy, dz): A unit vector representing the direction along which we are viewing the point (x, y, z)
  • (r, g, b): The radiance value (or the emitted color) of the point (x, y, z)
  • σ: The volume density at the point (x, y, z)
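To make the role of σ concrete, here is a standalone sketch of the emission-absorption step that turns the per-point (r, g, b, σ) samples along one ray into a single pixel color (this illustrates the standard NeRF quadrature; it is not the book's or PyTorch3D's exact code, and the toy inputs are assumed):

    import torch

    def composite_ray(rgb, sigma, deltas):
        # rgb:    (n_pts, 3) colors at the sampled points along the ray
        # sigma:  (n_pts,)   volume densities at those points
        # deltas: (n_pts,)   distances between consecutive samples
        alpha = 1.0 - torch.exp(-sigma * deltas)       # opacity of each segment
        # Transmittance: probability the ray reaches sample i unoccluded.
        trans = torch.cumprod(
            torch.cat([torch.ones(1), 1.0 - alpha[:-1]]), dim=0)
        weights = trans * alpha                        # each sample's contribution
        return (weights[:, None] * rgb).sum(dim=0)     # final pixel color

    # Toy usage with assumed values:
    n = 64
    pixel = composite_ray(torch.rand(n, 3), torch.rand(n) * 5.0,
                          torch.full((n,), 0.05))

Opaque points (large σ) close to the camera receive high weights and occlude everything behind them, which is exactly the behavior we expect from volume rendering.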

In the previous chapter, you came to understand the concepts underlying volumetric rendering. You used the technique of ray sampling to get volume densities and colors...

Summary

In this chapter, we learned how a neural network can be used to model and represent a 3D scene; this network is called the NeRF model. We trained a simple NeRF model on a synthetic 3D scene, dug deeper into the model architecture and its implementation in code, and examined the main components of the model. We also covered the principles behind rendering volumes with the NeRF model. A NeRF model captures a single scene; once built, it can render that scene from different angles. It is logical to wonder whether there is a way to capture multiple scenes with a single model, and whether we can predictably manipulate certain objects and attributes in the scene. This is our topic of exploration in the next chapter, where we will explore the GIRAFFE model.
