Understanding 3D coordinate systems

In this section, we are going to learn about the coordinate systems frequently used in PyTorch3D. This section is adapted from PyTorch3D's documentation on camera coordinate systems: https://pytorch3d.org/docs/cameras. To understand and use the PyTorch3D rendering system, we usually need to know these coordinate systems and how to work with them. As discussed in the previous sections, 3D data can be represented by points, faces, and voxels. The location of each point can be represented by a set of x, y, and z coordinates with respect to a certain coordinate system. We usually need to define and use multiple coordinate systems, depending on which one is most convenient.

Figure 1.2 – A world coordinate system, where the origin and axes are defined independently of the camera positions

The first coordinate system we frequently use is called the world coordinate system. This is a 3D coordinate system chosen with respect to all the 3D objects in the scene, such that the locations of the 3D objects are easy to specify. Usually, the axes of the world coordinate system do not align with the object orientation or camera orientation. Thus, there exist non-zero rotations and displacements between the world coordinate system and the object and camera orientations. The world coordinate system is shown in Figure 1.2.
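
For example, a camera placed in the world typically has a rotation R and a translation T relative to the world coordinate system. The following is a minimal sketch, assuming PyTorch3D is installed, that obtains such an R and T with the look_at_view_transform helper; the distance, elevation, and azimuth values are purely illustrative:

from pytorch3d.renderer import look_at_view_transform

# Place a camera 2.7 units from the world origin, elevated by 10 degrees and
# rotated by 20 degrees around the vertical axis (illustrative values only).
# R and T describe the rotation and displacement between the world coordinate
# system and the camera, as discussed above.
R, T = look_at_view_transform(dist=2.7, elev=10.0, azim=20.0)
print(R.shape, T.shape)  # torch.Size([1, 3, 3]) torch.Size([1, 3])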

Figure 1.3 – The camera view coordinate system, where the origin is at the camera projection center and the three axes are defined according to the imaging plane

Since the axes of the world coordinate system usually do not align with the camera orientation, in many situations it is more convenient to define and use a camera view coordinate system. In PyTorch3D, the camera view coordinate system is defined such that the origin is at the projection point of the camera, the x axis points to the left, the y axis points upward, and the z axis points to the front.
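
To see how a point given in world coordinates maps into the camera view coordinate system, the following minimal sketch, again assuming PyTorch3D, builds a camera from R and T and applies its world-to-view transform; the point coordinates are made-up illustrative values:

import torch
from pytorch3d.renderer import PerspectiveCameras, look_at_view_transform

R, T = look_at_view_transform(dist=2.7, elev=10.0, azim=20.0)
cameras = PerspectiveCameras(R=R, T=T)

# One illustrative point given in world coordinates (a batch with one point).
points_world = torch.tensor([[[0.5, 0.5, 0.0]]])

# Map the point into the camera view coordinate system, where +x is left,
# +y is up, and +z points away from the camera.
world_to_view = cameras.get_world_to_view_transform()
points_view = world_to_view.transform_points(points_world)
print(points_view)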

Figure 1.4 – The NDC coordinate system, in which the volume is confined to the ranges that the camera can render

The normalized device coordinate (NDC) space confines the volume that a camera can render. The x coordinate values in NDC space range from -1 to +1, as do the y coordinate values. The z coordinate values range from znear to zfar, where znear is the nearest depth and zfar is the farthest depth. Any object outside this znear to zfar range is not rendered by the camera.
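
As a sketch of the NDC space in code, assuming PyTorch3D, a camera defined with znear and zfar can project world points into NDC, where visible points have x and y values in [-1, +1]; the camera settings and the point are illustrative only:

import torch
from pytorch3d.renderer import FoVPerspectiveCameras, look_at_view_transform

R, T = look_at_view_transform(dist=2.7, elev=10.0, azim=20.0)
# znear and zfar bound the renderable depth range (illustrative values).
cameras = FoVPerspectiveCameras(R=R, T=T, znear=0.1, zfar=100.0)

points_world = torch.tensor([[[0.5, 0.5, 0.0]]])

# transform_points applies the full world -> view -> NDC projection, so the
# x and y outputs lie in [-1, 1] for points the camera can see.
points_ndc = cameras.transform_points(points_world)
print(points_ndc)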

Finally, the screen coordinate system is defined in terms of how the rendered images are shown on our screens. In this coordinate system, the x coordinate corresponds to the pixel columns, the y coordinate corresponds to the pixel rows, and the z coordinate corresponds to the depth of the object.
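
The following sketch, assuming PyTorch3D, converts the same illustrative world point into screen coordinates for a hypothetical 512 x 512 image, so that x and y become pixel column and row positions:

import torch
from pytorch3d.renderer import FoVPerspectiveCameras, look_at_view_transform

R, T = look_at_view_transform(dist=2.7, elev=10.0, azim=20.0)
cameras = FoVPerspectiveCameras(R=R, T=T)

points_world = torch.tensor([[[0.5, 0.5, 0.0]]])

# transform_points_screen maps world points to (column, row, depth) pixel
# coordinates for a given image size; 512 x 512 is an illustrative resolution.
points_screen = cameras.transform_points_screen(points_world, image_size=(512, 512))
print(points_screen)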

To render a 3D object correctly on our 2D screens, we need to switch between these coordinate systems. Luckily, these conversions can be carried out easily by using the PyTorch3D camera models. We will discuss coordinate system conversion in more detail after we discuss the camera models.
