You're reading from  3D Deep Learning with Python

Product type: Book
Published in: Oct 2022
Publisher: Packt
ISBN-13: 9781803247823
Edition: 1st Edition
Authors (3):
Xudong Ma

Xudong Ma is a Staff Machine Learning Engineer with Grabango Inc. in Berkeley, California. He was a Senior Machine Learning Engineer at Facebook (Meta) Oculus and worked closely with the 3D PyTorch Team on 3D facial tracking projects. He has many years of experience working on computer vision, machine learning, and deep learning. He holds a Ph.D. in Electrical and Computer Engineering.

Vishakh Hegde

Vishakh Hegde is a Machine Learning and Computer Vision researcher. He has over 7 years of experience in this field, during which he has authored multiple well-cited research papers and published patents. He holds a master's degree from Stanford University specializing in applied mathematics and machine learning, and a BS and MS in Physics from IIT Madras. He previously worked at Schlumberger and Matroid. He is a Senior Applied Scientist at Ambient.ai, where he helped build their weapon detection system, which is deployed at several Global Fortune 500 companies. He is now leveraging his expertise and passion to solve business challenges to build a technology startup in Silicon Valley. You can learn more about him on his personal website.

Lilit Yolyan

Lilit Yolyan is a machine learning researcher working on her Ph.D. at YSU. Her research focuses on building computer vision solutions for smart cities using remote sensing data. She has 5 years of experience in the field of computer vision and has worked on a complex driver safety solution to be deployed by many well-known car manufacturing companies.


Coding for camera models and coordinate systems

In this section, we apply what we have learned to build a concrete camera model and convert between different coordinate systems, using a code example written in Python and PyTorch3D:

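  1. First, we use the following mesh, defined by a cube.obj file. The mesh is a box with eight vertices and twelve triangular faces; going by the vertex list, it measures 100 × 100 × 10 units, so it is a flattened cube rather than a regular one: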
    mtllib ./cube.mtl
    o cube
    # Vertex list
    v -50 -50 20
    v -50 -50 10
    v -50 50 10
    v -50 50 20
    v 50 -50 20
    v 50 -50 10
    v 50 50 10
    v 50 50 20
    # Point/Line/Face list
    usemtl Door
    f 1 2 3
    f 6 5 8
    f 7 3 2
    f 4 8 5
    f 8 4 3
    f 6 2 1
    f 1 3 4
    f 6 8 7
    f 7 2 6
    f 4 5 1
    f 8 3 7
    f 6 1 5
    # End of file
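To check the shape of the mesh without any 3D library, the vertex list can be read directly from the OBJ text. The following is a minimal sketch in plain Python; the helper names parse_vertices and bounding_box are hypothetical and are not part of camera.py, and only the `v` lines of the file above are reproduced:

```python
# Vertex lines copied from cube.obj above (faces and materials omitted).
OBJ_TEXT = """\
o cube
v -50 -50 20
v -50 -50 10
v -50 50 10
v -50 50 20
v 50 -50 20
v 50 -50 10
v 50 50 10
v 50 50 20
"""


def parse_vertices(obj_text):
    """Collect (x, y, z) tuples from the 'v' lines of an OBJ string."""
    vertices = []
    for line in obj_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "v":
            vertices.append(tuple(float(value) for value in parts[1:4]))
    return vertices


def bounding_box(vertices):
    """Return the per-axis minimum and maximum of a list of vertices."""
    mins = tuple(min(v[i] for v in vertices) for i in range(3))
    maxs = tuple(max(v[i] for v in vertices) for i in range(3))
    return mins, maxs


vertices = parse_vertices(OBJ_TEXT)
mins, maxs = bounding_box(vertices)
extents = tuple(hi - lo for lo, hi in zip(mins, maxs))
print(extents)  # (100.0, 100.0, 10.0)
```

The extents confirm that the mesh spans 100 units along x and y but only 10 units along z.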

The example code snippet is camera.py, which can be downloaded from the book’s GitHub repository.

  1. Let us import all the modules that we need:
    import open3d
    import torch
    import pytorch3d
    from pytorch3d.io import load_obj
    from scipy.spatial.transform import Rotation
    from pytorch3d.renderer.cameras import PerspectiveCameras
  2. We can load and visualize the mesh by using Open3D’s draw_geometries function:
    # Load the mesh and visualize it with Open3D
    mesh_file = "cube.obj"
    print('visualizing the mesh using open3D')
    mesh = open3d.io.read_triangle_mesh(mesh_file)
    open3d.visualization.draw_geometries([mesh],
                     mesh_show_wireframe = True,
                     mesh_show_back_face = True)
  3. We define a camera variable as a PyTorch3D PerspectiveCameras object. The camera here is mini-batched: the rotation matrix, R, is a PyTorch tensor with a shape of [8, 3, 3], so it defines eight cameras, one per rotation matrix. The same holds for all other camera parameters, such as image sizes, focal lengths, and principal points:
    # Define a mini-batch of 8 cameras
    image_size = torch.ones(8, 2)
    image_size[:,0] = image_size[:,0] * 1024
    image_size[:,1] = image_size[:,1] * 512
    image_size = image_size.cuda()
    focal_length = torch.ones(8, 2)
    focal_length[:,0] = focal_length[:,0] * 1200
    focal_length[:,1] = focal_length[:,1] * 300
    focal_length = focal_length.cuda()
    principal_point = torch.ones(8, 2)
    principal_point[:,0] = principal_point[:,0] * 512
    principal_point[:,1] = principal_point[:,1] * 256
    principal_point = principal_point.cuda()
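    # One rotation per camera, built from Euler angles given in degrees
    R = Rotation.from_euler('zyx', [
        [n*5, n, n]  for n in range(-4, 4, 1)], degrees=True).as_matrix()
    # as_matrix() returns float64; cast to float32 to match T and the other parameters
    R = torch.from_numpy(R).float().cuda()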
    T = [ [n, 0, 0] for n in range(-4, 4, 1)]
    T = torch.FloatTensor(T).cuda()
    camera = PerspectiveCameras(focal_length = focal_length,
                                principal_point = principal_point,
                                in_ndc = False,
                                image_size = image_size,
                                R = R,
                                T = T,
                                device = 'cuda')
  4. Once we have defined the camera variable, we can call the get_world_to_view_transform class member method to obtain a Transform3d object, world_to_view_transform. We can then use its transform_points member method to convert points from world coordinates to camera view coordinates. Similarly, we can use the get_full_projection_transform member method to obtain a Transform3d object that converts from world coordinates to screen coordinates:
    world_to_view_transform = camera.get_world_to_view_transform()
    world_to_screen_transform = camera.get_full_projection_transform()
    # Load the mesh with PyTorch3D and move its vertices to the GPU
    vertices, faces, aux = load_obj(mesh_file)
    vertices = vertices.cuda()
    # Apply each of the eight cameras' transforms to the vertices
    world_to_view_vertices = world_to_view_transform.transform_points(vertices)
    world_to_screen_vertices = world_to_screen_transform.transform_points(vertices)
    print('world_to_view_vertices = ', world_to_view_vertices)
    print('world_to_screen_vertices = ', world_to_screen_vertices)
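To make the mini-batched world-to-view step more tangible, the following is a minimal sketch in plain Python. It does not call PyTorch3D. It assumes the row-vector convention X_view = X_world R + T that PyTorch3D documents for its cameras, and it uses identity rotations as a simplification so that only the eight translations from step 3 take effect. The function name world_to_view is hypothetical:

```python
def world_to_view(point, R, T):
    """Apply one camera's rigid transform: point (row vector) times R, plus T."""
    return [
        sum(point[k] * R[k][j] for k in range(3)) + T[j]
        for j in range(3)
    ]


# Simplifying assumption: every camera uses the identity rotation.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Same translations as in step 3: one camera per n in -4, ..., 3.
translations = [[float(n), 0.0, 0.0] for n in range(-4, 4, 1)]

# First vertex of cube.obj, in world coordinates.
vertex = [-50.0, -50.0, 20.0]

# One transformed point per camera in the mini-batch.
view_points = [world_to_view(vertex, IDENTITY, T) for T in translations]

print(len(view_points))  # 8
print(view_points[0])    # [-54.0, -50.0, 20.0]
print(view_points[-1])   # [-47.0, -50.0, 20.0]
```

Each camera in the batch produces its own copy of the transformed vertex, which is why the outputs in step 4 carry a leading batch dimension of eight.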

The camera.py example shows the basic ways that PyTorch3D cameras can be used and how a few method calls are enough to convert points between different coordinate systems.
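The projection from camera view coordinates to pixel coordinates can be illustrated with the textbook pinhole model, u = fx · X / Z + px and v = fy · Y / Z + py. The sketch below is plain Python and reuses the focal lengths and principal point from step 3. The function name project_to_screen and the sample point are hypothetical, and PyTorch3D applies its own axis conventions, so the numbers here are not meant to reproduce the output of camera.py:

```python
def project_to_screen(point, focal_length, principal_point):
    """Project a camera-view point (X, Y, Z) to pixel coordinates (u, v)."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    fx, fy = focal_length
    px, py = principal_point
    # Perspective divide by depth, then scale by focal length and shift
    # by the principal point.
    u = fx * x / z + px
    v = fy * y / z + py
    return u, v


# Camera intrinsics taken from step 3; the view-space point is illustrative.
focal_length = (1200.0, 300.0)
principal_point = (512.0, 256.0)

u, v = project_to_screen((5.0, -10.0, 20.0), focal_length, principal_point)
print(u, v)  # 812.0 106.0
```

A point on the optical axis (X = Y = 0) always lands on the principal point, which is a quick sanity check for any set of intrinsics.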
