You're reading from 3D Deep Learning with Python

Product type: Book

Published in: Oct 2022

Publisher: Packt

ISBN-13: 9781803247823

Edition: 1st Edition

Concepts

Computer Vision

Authors (3):

Xudong Ma

Vishakh Hegde

Lilit Yolyan

Introducing 3D Computer Vision and Geometry

In this chapter, we will learn about some basic concepts of 3D computer vision and geometry that will be especially useful for later chapters in this book. We will start by discussing what rendering, rasterization, and shading are. We will go through different lighting models and shading models, such as point light sources, directional light sources, ambient lighting, diffusion, highlights, and shininess. We will go through a coding example for rendering a mesh model using different lighting models and parameters.

We will then learn how to use PyTorch for solving optimization problems. In particular, we will go through stochastic gradient descent over heterogeneous mini-batches, which PyTorch3D makes possible. We will also learn about the different formats for mini-batches in PyTorch3D, including the list, padded, and packed formats, and learn how to convert between them.
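To make the three mini-batch formats concrete, here is a minimal plain-Python sketch. It uses nested lists instead of tensors, and the helper names (list_to_padded, list_to_packed, packed_to_list) are hypothetical illustrations, not PyTorch3D functions. It only shows the layout idea: the list format keeps one entry per mesh, the padded format pads every mesh to the same length, and the packed format concatenates everything and remembers where each mesh starts.

```python
# Illustrative only: plain-Python stand-ins for the list, padded, and packed
# layouts. The helper names are hypothetical, not PyTorch3D APIs.

def list_to_padded(items, pad_value=0.0):
    """Pad every element with filler rows up to the longest element's length."""
    max_len = max(len(item) for item in items)
    dim = len(items[0][0])
    padded = []
    for item in items:
        filler = [[pad_value] * dim for _ in range(max_len - len(item))]
        padded.append([list(row) for row in item] + filler)
    return padded


def list_to_packed(items):
    """Concatenate all elements and record the start index of each one."""
    packed, first_idx = [], []
    for item in items:
        first_idx.append(len(packed))
        packed.extend(list(row) for row in item)
    return packed, first_idx


def packed_to_list(packed, first_idx):
    """Split a packed array back into one list per element."""
    bounds = first_idx + [len(packed)]
    return [packed[bounds[i]:bounds[i + 1]] for i in range(len(first_idx))]


# Two "meshes" with different numbers of 3D vertices (a heterogeneous batch).
verts_list = [
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    [[0.0, 0.0, 1.0], [1.0, 1.0, 1.0]],
]
verts_padded = list_to_padded(verts_list)
verts_packed, first_idx = list_to_packed(verts_list)
```

In PyTorch3D itself, a Meshes object exposes these layouts through methods such as verts_list(), verts_padded(), and verts_packed(), so conversions like the ones above normally do not need to be written by hand.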

In the last part of the chapter, we will...

Technical requirements

To run the example code snippets in this book, you need a computer, ideally with a GPU. However, it is also possible to run the code snippets on a CPU alone.

The recommended computer configuration includes the following:

A modern GPU – for example, the Nvidia GTX series or RTX series with at least 8 GB of memory
Python 3
The PyTorch and PyTorch3D libraries

The code snippets for this chapter can be found at https://github.com/PacktPublishing/3D-Deep-Learning-with-Python.

Exploring the basic concepts of rendering, rasterization, and shading

Rendering is a process that takes 3D data models of the world around our camera as input and outputs images. It approximates the physical process by which images are formed in a real-world camera. Typically, the 3D data models are meshes. In this case, rendering is usually done using ray tracing:

Figure 2.1: Rendering by ray tracing (rays are generated from camera origins and go through the image pixels for finding relevant mesh faces)

An example of ray tracing is shown in Figure 2.1. In the example, the world model contains one 3D sphere, which is represented by a mesh model. To form the image of the 3D sphere, for each image pixel, we generate one ray, starting from the camera origin and going through the image pixel. If a ray intersects a mesh face, then we know the mesh face can project its color onto the image pixel. We also need to trace the depth of...
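The ray-and-face test described above can be sketched in a few lines of plain Python. This example uses the Möller-Trumbore ray-triangle intersection test, which is a common choice but is an assumption here: the text does not say which intersection test is used. The function names are hypothetical. For one pixel ray, the sketch returns the nearest face that the ray hits, together with the distance along the ray, which plays the role of the depth.

```python
def sub(a, b):
    return [a[i] - b[i] for i in range(3)]


def dot(a, b):
    return sum(a[i] * b[i] for i in range(3))


def cross(a, b):
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def ray_triangle_intersection(origin, direction, v0, v1, v2, eps=1e-9):
    """Return the distance t along the ray to the triangle, or None on a miss."""
    edge1 = sub(v1, v0)
    edge2 = sub(v2, v0)
    pvec = cross(direction, edge2)
    det = dot(edge1, pvec)
    if abs(det) < eps:
        return None  # ray is parallel to the triangle plane
    inv_det = 1.0 / det
    tvec = sub(origin, v0)
    u = dot(tvec, pvec) * inv_det  # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    qvec = cross(tvec, edge1)
    v = dot(direction, qvec) * inv_det  # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(edge2, qvec) * inv_det
    return t if t > eps else None  # ignore hits behind the camera


def closest_hit(origin, direction, triangles):
    """Return (face_index, t) for the nearest face hit by the ray, or None."""
    best = None
    for idx, (v0, v1, v2) in enumerate(triangles):
        t = ray_triangle_intersection(origin, direction, v0, v1, v2)
        if t is not None and (best is None or t < best[1]):
            best = (idx, t)
    return best


# Camera at the origin looking along +z; two faces at depths 5 and 2.
far_face = ([-1.0, -1.0, 5.0], [1.0, -1.0, 5.0], [0.0, 1.0, 5.0])
near_face = ([-1.0, -1.0, 2.0], [1.0, -1.0, 2.0], [0.0, 1.0, 2.0])
hit = closest_hit([0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [far_face, near_face])
```

Keeping only the smallest distance per ray is what lets the nearer face hide the farther one in the final image.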

Coding exercises for 3D rendering

In this section, we will look at a concrete coding exercise using PyTorch3D for rendering a mesh model. We are going to learn how to define a camera model and how to define a light source in PyTorch3D. We will also learn how to change the incoming light components and material properties so that more realistic images can be rendered by controlling the three light components (ambient, diffuse, and specular or glossy):

First, we need to import all the Python modules that we need:

import open3d
import os
import sys
import torch
import matplotlib.pyplot as plt
from pytorch3d.io import load_objs_as_meshes
from pytorch3d.renderer import (
    look_at_view_transform,
    PerspectiveCameras,
    PointLights,
    Materials,
    RasterizationSettings,
    MeshRenderer,
    MeshRasterizer...
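Independently of the PyTorch3D code, the effect of the three light components can be illustrated with a small plain-Python sketch of the Phong lighting model for a single surface point and a single scalar light intensity. The function name and the default coefficient values are assumptions chosen for illustration, not values taken from the book.

```python
import math


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return [c / length for c in v]


def dot(a, b):
    return sum(a[i] * b[i] for i in range(3))


def phong_intensity(normal, to_light, to_viewer,
                    ambient=0.2, diffuse=0.7, specular=0.5, shininess=32):
    """Scalar Phong intensity: ambient + diffuse + specular terms."""
    n = normalize(normal)
    l = normalize(to_light)
    v = normalize(to_viewer)
    n_dot_l = max(dot(n, l), 0.0)  # Lambertian (diffuse) factor
    if n_dot_l > 0.0:
        # Mirror reflection of the light direction about the normal.
        r = [2.0 * dot(n, l) * n[i] - l[i] for i in range(3)]
        highlight = max(dot(r, v), 0.0) ** shininess
    else:
        highlight = 0.0  # the light is behind the surface
    return ambient + diffuse * n_dot_l + specular * highlight


# Light and viewer straight above the surface: all three terms contribute.
head_on = phong_intensity([0.0, 0.0, 1.0], [0.0, 0.0, 1.0], [0.0, 0.0, 1.0])
# Light behind the surface: only the ambient term remains.
back_lit = phong_intensity([0.0, 0.0, 1.0], [0.0, 0.0, -1.0], [0.0, 0.0, 1.0])
```

A larger shininess value makes the highlight narrower, so an off-axis viewer sees less specular light. In PyTorch3D, the corresponding quantities are set through the ambient_color, diffuse_color, and specular_color arguments of PointLights and Materials, and through the shininess argument of Materials.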

Using PyTorch3D heterogeneous batches and PyTorch optimizers

In this section, we are going to learn how to use the PyTorch optimizer on PyTorch3D heterogeneous mini-batches. In deep learning, we are usually given a list of data examples, such as $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$. Here, $x_i$ are the observations and $y_i$ are the prediction values. For example, $x_i$ may be some images and $y_i$ the ground-truth classification results, for example, “cat” or “dog”. A deep neural network is then trained so that the outputs of the neural network are as close to $y_i$ as possible. Usually, a loss function between the neural network outputs and $y_i$ is defined so that the loss function values decrease as the neural network outputs become closer to $y_i$.

Thus, training a deep learning network is usually done by minimizing the loss function that is evaluated on all training data examples, $x_i$ and $y_i$. A straightforward method used in many optimization algorithms is computing the gradients first...
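The idea of minimizing a loss by following gradients over mini-batches can be sketched without any library. The example below is a plain-Python stand-in, not the PyTorch optimizer itself: it fits a line to noiseless data with mini-batch stochastic gradient descent on a mean squared error loss. The function names, learning rate, batch size, and toy data are all assumptions made for illustration.

```python
import random


def mse_loss(batch, w, b):
    """Mean squared error of the linear model w * x + b on a batch."""
    return sum((w * x + b - y) ** 2 for x, y in batch) / len(batch)


def mse_gradients(batch, w, b):
    """Gradients of the mean squared error with respect to w and b."""
    n = len(batch)
    grad_w = sum(2.0 * (w * x + b - y) * x for x, y in batch) / n
    grad_b = sum(2.0 * (w * x + b - y) for x, y in batch) / n
    return grad_w, grad_b


def sgd_fit(data, lr=0.2, epochs=200, batch_size=2, seed=0):
    """Fit w and b with mini-batch stochastic gradient descent."""
    rng = random.Random(seed)  # fixed seed keeps the run repeatable
    w, b = 0.0, 0.0
    examples = list(data)
    for _ in range(epochs):
        rng.shuffle(examples)
        for start in range(0, len(examples), batch_size):
            batch = examples[start:start + batch_size]
            grad_w, grad_b = mse_gradients(batch, w, b)
            # Step against the gradient to reduce the loss.
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b


# Toy observations x_i with targets y_i = 2 * x_i + 1.
data = [(i / 10.0, 2.0 * (i / 10.0) + 1.0) for i in range(10)]
w, b = sgd_fit(data)
```

With PyTorch, the gradient computation and the update step are typically delegated to autograd and an optimizer such as torch.optim.SGD, using the zero_grad(), backward(), and step() calls in place of the hand-written gradient code above.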

Understanding transformations and rotations

In 3D deep learning and computer vision, we usually need to work with 3D transformations, such as rotations and 3D rigid motions. PyTorch3D provides a high-level encapsulation of these transformations in its pytorch3d.transforms.Transform3d class. One advantage of the Transform3d class is that it is mini-batch based. Thus, as is frequently needed in 3D deep learning, it is possible to apply a mini-batch of transformations to a mini-batch of meshes in just a few lines of code. Another advantage of Transform3d is that gradients can be backpropagated straightforwardly through it.

PyTorch3D also provides many lower-level APIs for computations in the Lie groups SO(3) and SE(3). Here, SO(3) denotes the special orthogonal group in 3D and SE(3) denotes the special Euclidean group in 3D. Informally speaking, SO(3) denotes the set of all the rotation transformations and SE(3) denotes the set of all the rigid transformations in 3D....
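As a small illustration of what membership in SO(3) and SE(3) means, the following plain-Python sketch builds a rotation matrix from an axis and an angle using Rodrigues' formula and then applies a rigid transformation (a rotation followed by a translation) to a point. The helper names are hypothetical and the code works on nested lists rather than tensors. It is meant to show the math, not the PyTorch3D API.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(m):
    return [[m[j][i] for j in range(3)] for i in range(3)]


def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def axis_angle_to_rotation(axis, angle):
    """Rodrigues' formula: R = I + sin(angle) K + (1 - cos(angle)) K^2."""
    norm = math.sqrt(sum(c * c for c in axis))
    x, y, z = (c / norm for c in axis)
    # K is the skew-symmetric (cross-product) matrix of the unit axis.
    K = [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]
    K2 = matmul(K, K)
    s, c = math.sin(angle), math.cos(angle)
    return [[(1.0 if i == j else 0.0) + s * K[i][j] + (1.0 - c) * K2[i][j]
             for j in range(3)] for i in range(3)]


def rigid_transform(R, t, p):
    """Apply an SE(3) element: rotate p by R, then translate by t."""
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]


# Rotate by 90 degrees about the z axis, then shift by (1, 2, 3).
R = axis_angle_to_rotation([0.0, 0.0, 1.0], math.pi / 2.0)
moved = rigid_transform(R, [1.0, 2.0, 3.0], [1.0, 0.0, 0.0])
```

A matrix belongs to SO(3) when its transpose is its inverse and its determinant is 1, which can be checked numerically for R. PyTorch3D provides batched, differentiable counterparts of such conversions in pytorch3d.transforms, for example so3_exp_map and so3_log_map.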

Summary

In this chapter, we learned about the basic concepts of rendering, rasterization, and shading, including light source models, the Lambertian shading model, and the Phong lighting model. We learned how to implement rendering, rasterization, and shading using PyTorch3D. We also learned how to change the parameters in the rendering process, such as ambient lighting, shininess, and specular colors, and how these parameters would affect the rendering results.

We then learned how to use the PyTorch optimizer. We went through a coding example, where the PyTorch optimizer was used on a PyTorch3D mini-batch. In the last part of the chapter, we learned how to use the PyTorch3D APIs for converting between the different representations of rotations and transformations.

In the next chapter, we will learn some more advanced techniques for using deformable mesh models for fitting real-world 3D data.

Authors (3)

Xudong Ma

Xudong Ma is a Staff Machine Learning Engineer with Grabango Inc. in Berkeley, California. He was a Senior Machine Learning Engineer at Facebook (Meta) Oculus and worked closely with the 3D PyTorch Team on 3D facial tracking projects. He has many years of experience working on computer vision, machine learning, and deep learning. He holds a Ph.D. in Electrical and Computer Engineering.

Vishakh Hegde

Vishakh Hegde is a Machine Learning and Computer Vision researcher. He has over 7 years of experience in this field, during which he has authored multiple well-cited research papers and published patents. He holds a master's degree from Stanford University specializing in applied mathematics and machine learning, and a BS and MS in Physics from IIT Madras. He previously worked at Schlumberger and Matroid. He is a Senior Applied Scientist at Ambient.ai, where he helped build their weapon detection system, which is deployed at several Global Fortune 500 companies. He is now leveraging his expertise and passion for solving business challenges to build a technology startup in Silicon Valley. You can learn more about him on his personal website.

Lilit Yolyan

Lilit Yolyan is a machine learning researcher working on her Ph.D. at YSU. Her research focuses on building computer vision solutions for smart cities using remote sensing data. She has 5 years of experience in the field of computer vision and has worked on a complex driver safety solution to be deployed by many well-known car manufacturing companies.