Geometric Deep Learning

Throughout this book, we have learned about various types of neural networks used in deep learning, such as convolutional neural networks and recurrent neural networks, and we have seen them achieve tremendous results on a variety of tasks, including computer vision, image reconstruction, synthetic data generation, speech recognition, and language translation. All of the models we have looked at so far have been trained on Euclidean data, that is, data that can be represented in a grid (matrix) format, such as images, text, and audio.

However, many of the tasks that we would like to apply deep learning to involve non-Euclidean data (more on this shortly), the kind that the neural networks we have come across so far are unable to process. This includes sensor networks, mesh surfaces, point clouds, objects (the...

Comparing Euclidean and non-Euclidean data

Before we learn about geometric deep learning techniques, it is important to understand the differences between Euclidean and non-Euclidean data, and why the latter needs a separate approach.

Over the last eight years, deep learning architectures such as FNNs, CNNs, and RNNs have proven successful at a variety of tasks, such as speech recognition, machine translation, image reconstruction, object recognition and segmentation, and motion tracking. This is because of their ability to exploit the local statistical properties that exist within data: stationarity, locality, and compositionality. In the case of CNNs, the data they take as input can be represented in grid form (such as images, which can be represented by matrices and tensors).

The stationarity, in this case (images), comes from the...

Graph neural networks

Graph neural networks are the quintessential neural networks of geometric deep learning and, as the name suggests, they work particularly well on graph-structured data such as meshes.

Now, let's assume we have a graph, G, with a binary adjacency matrix, A. We also have another matrix, X, that contains all the node features. These features could be text, images, categorical attributes, node degrees, clustering coefficients, indicator vectors, and so on. The goal here is to generate node embeddings using local neighborhoods.

As we know, nodes in a graph have neighboring nodes, and, in this case, each node aggregates the information from its neighbors using a neural network. We can think of each node's neighborhood as a computation graph, and since each node is connected to a different set of nodes, each node has a unique computation graph.
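To make this aggregation step concrete, here is a minimal sketch (not the book's code) of a single neighborhood-aggregation layer in NumPy. It assumes the binary adjacency matrix A and node feature matrix X described above; the mean aggregator, the weight matrices W and B, and the ReLU nonlinearity are illustrative choices.

```python
import numpy as np

def gnn_layer(A, H, W, B):
    """One neighborhood-aggregation (message-passing) step.

    A: (n, n) binary adjacency matrix
    H: (n, d_in) current node embeddings (X at the first layer)
    W: (d_in, d_out) weights applied to the aggregated neighbor messages
    B: (d_in, d_out) weights applied to each node's own embedding
    """
    # Mean-aggregate each node's neighbors (illustrative choice of aggregator).
    deg = A.sum(axis=1, keepdims=True)            # node degrees
    neighbor_mean = (A @ H) / np.maximum(deg, 1)  # avoid division by zero
    # Combine neighbor information with the node's own features,
    # then apply a nonlinearity (ReLU here).
    return np.maximum(0, neighbor_mean @ W + H @ B)

# Toy example: a 4-node graph with 3-dimensional node features.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = rng.normal(size=(4, 3))
W = rng.normal(size=(3, 8))
B = rng.normal(size=(3, 8))

H1 = gnn_layer(A, X, W, B)   # first round of neighborhood aggregation
H2 = gnn_layer(A, H1, rng.normal(size=(8, 8)), rng.normal(size=(8, 8)))
print(H2.shape)              # (4, 8) node embeddings after two hops
```

Stacking two such layers means each node's embedding depends on its two-hop neighborhood, which is exactly the per-node computation graph described above.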

If we think back to convolutional...

Spectral graph CNNs

Spectral graph CNNs, as the name suggests, use a spectral convolution, which we defined as follows:

$$f \star g = \Phi \left( (\Phi^{\top} g) \odot (\Phi^{\top} f) \right)$$

Here, $\Phi$ is the matrix of eigenvectors of the graph Laplacian and $\odot$ denotes element-wise (Hadamard) multiplication. We can rewrite this in matrix form as follows:

$$f \star g = \Phi\, G\, \Phi^{\top} f, \qquad G = \mathrm{diag}\left(\hat{g}_1, \dots, \hat{g}_n\right), \qquad \hat{g} = \Phi^{\top} g$$

This is not shift-invariant since G does not have a circulant structure.

Now, in the spectral domain, we define a convolutional layer as follows:

$$f_l^{\text{out}} = \xi \left( \sum_{l'=1}^{p} \Phi\, G_{l,l'}\, \Phi^{\top} f_{l'}^{\text{in}} \right), \qquad l = 1, \dots, q$$

Here, $f^{\text{in}} = (f_1^{\text{in}}, \dots, f_p^{\text{in}})$ is the p-channel input signal on the vertices, $f^{\text{out}} = (f_1^{\text{out}}, \dots, f_q^{\text{out}})$ is the q-channel output, $G_{l,l'}$ is an n×n diagonal matrix of spectral filter coefficients (which are basis-dependent, meaning that they don't generalize over different graphs and are limited to a single domain), and ξ is the nonlinearity that's applied to the vertex-wise function values.
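As an illustration only (not the author's implementation), the following sketch applies one spectral convolutional layer of this form. It builds the Fourier basis Φ by eigendecomposing a toy graph Laplacian with numpy.linalg.eigh, and uses random numbers as stand-ins for the learned diagonal filter coefficients G_{l,l'}.

```python
import numpy as np

def spectral_conv_layer(L, F_in, filters):
    """One spectral graph convolutional layer.

    L:       (n, n) symmetric graph Laplacian
    F_in:    (n, p) input signal with p channels on the n vertices
    filters: (q, p, n) spectral filter coefficients; filters[l, lp] holds the
             diagonal entries of the n x n matrix G_{l,l'} from the text
    """
    # Eigendecomposition of the Laplacian gives the graph Fourier basis Phi.
    _, Phi = np.linalg.eigh(L)
    q, p, n = filters.shape
    F_hat = Phi.T @ F_in                        # forward graph Fourier transform
    F_out = np.zeros((n, q))
    for l in range(q):                          # each output channel
        for lp in range(p):                     # sum over input channels
            F_out[:, l] += Phi @ (filters[l, lp] * F_hat[:, lp])
    return np.maximum(0, F_out)                 # xi: vertex-wise ReLU

# Toy graph: 5 vertices on a path, 2 input channels, 3 output channels.
A = np.diag(np.ones(4), 1) + np.diag(np.ones(4), -1)
L = np.diag(A.sum(axis=1)) - A                  # combinatorial Laplacian
rng = np.random.default_rng(1)
F_in = rng.normal(size=(5, 2))
G = rng.normal(size=(3, 2, 5))                  # stand-in for learned filters
print(spectral_conv_layer(L, F_in, G).shape)    # (5, 3)
```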

What this means is that if we learn a convolutional filter with the basis Φ on one domain, it will not be transferable or applicable to another domain that has a different basis, Ψ. This isn't to say we can't create bases that can be used for different domains...

Mixture model networks

Now that we've seen a few examples of how GNNs work, let's go a step further and see how we can apply neural networks to meshes.

First, we use a patch that is defined at each point in a local system of d-dimensional pseudo-coordinates, $u(x, y)$, around x. This is referred to as a geodesic polar coordinate system. On each of these coordinates, we apply a set of parametric kernels, $w_1(u), \dots, w_J(u)$, that produce local weights.

The kernels here differ in that they are Gaussian and not fixed, and are produced using the following equation:

$$w_j(u) = \exp\left( -\frac{1}{2} (u - \mu_j)^{\top} \Sigma_j^{-1} (u - \mu_j) \right)$$

These parameters ($\mu_j$ and $\Sigma_j$) are trainable and learned.

A spatial convolution with a filter, g, can be defined as follows:

$$(f \star g)_i = \sum_{j=1}^{J} g_j \sum_{k \in \mathcal{N}(i)} w_j\big(u(i, k)\big)\, f_k$$

Here, $f_i$ is a feature at vertex i and $\mathcal{N}(i)$ is the neighborhood of vertex i.
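The following is a minimal sketch, under assumptions, of how these pieces fit together: it evaluates the Gaussian kernels $w_j(u)$ from (hypothetical) learned parameters $\mu_j$ and $\Sigma_j$ and applies the spatial convolution above at a single vertex. The toy pseudo-coordinates, features, and filter values are made up for illustration.

```python
import numpy as np

def gaussian_weights(u, mu, sigma_inv):
    """Evaluate the J Gaussian kernels w_j(u) at pseudo-coordinates u.

    u:         (m, d) pseudo-coordinates of the m neighbors
    mu:        (J, d) trainable kernel means mu_j
    sigma_inv: (J, d, d) inverses of the trainable kernel covariances Sigma_j
    """
    diff = u[None, :, :] - mu[:, None, :]                  # (J, m, d)
    mahal = np.einsum('jmd,jde,jme->jm', diff, sigma_inv, diff)
    return np.exp(-0.5 * mahal)                            # (J, m)

def monet_conv_at_vertex(u, f_neighbors, g, mu, sigma_inv):
    """(f * g) at one vertex: sum_j g_j * sum_k w_j(u(i, k)) f_k."""
    w = gaussian_weights(u, mu, sigma_inv)                 # (J, m)
    patches = w @ f_neighbors                              # (J, channels)
    return g @ patches                                     # (channels,)

# Toy example: a vertex with 4 neighbors, 2D pseudo-coordinates (rho, theta),
# 3 Gaussian kernels, and a single feature channel.
rng = np.random.default_rng(2)
u = rng.normal(size=(4, 2))            # pseudo-coordinates u(i, k)
f_neighbors = rng.normal(size=(4, 1))  # neighbor features f_k
mu = rng.normal(size=(3, 2))           # learned means mu_j
sigma_inv = np.stack([np.eye(2)] * 3)  # learned (inverse) covariances Sigma_j
g = rng.normal(size=(3,))              # filter coefficients g_j
print(monet_conv_at_vertex(u, f_neighbors, g, mu, sigma_inv))
```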

Previously, we mentioned geodesic polar coordinates, but what are they? Let's define them and find out. We can write them as follows:

$$u(i, j) = \big( \rho(i, j),\, \theta(i, j) \big)$$

Here, $\rho(i, j)$ is the geodesic distance between i and j and $\theta(i, j)$ is the...

Facial recognition in 3D

Let's go ahead and see how this translates to a real-world problem such as 3D facial recognition, which is used in phones, security systems, and so on. With 2D images, recognition depends heavily on pose and illumination, and we don't have access to depth information. Because of this limitation, we use 3D faces instead, so that we don't have to worry about lighting conditions, head orientation, or varying facial expressions. For this task, the data we will be using consists of meshes.

In this case, our meshes make up an undirected, connected graph, G = (V, E, A), where |V| = n is the number of vertices, E is the set of edges, and A contains the d-dimensional pseudo-coordinates, $u(i, j)$, where $(i, j) \in E$. The node feature matrix is denoted as X, where each of the n nodes contains d-dimensional features. We then define the lth channel of the feature map as $f_l$, of which the ith node...

Summary

In this chapter, we learned about some important mathematical topics, such as the difference between Euclidean and non-Euclidean data, and manifolds. We then went on to learn about a few fascinating and emerging topics in the field of deep learning that have widespread applications in a plethora of domains in which traditional deep learning algorithms have proved to be ineffective. This new class of neural networks, known as graph neural networks, greatly expands the usefulness of deep learning by extending it to work on non-Euclidean data. Toward the end of this chapter, we saw an example use case for graph neural networks: facial recognition in 3D.

This brings us to the end of this book. Congratulations on successfully completing the lessons that were provided!
