You're reading from Hands-On Graph Neural Networks Using Python
Published in Apr 2023 · Publisher: Packt · ISBN-13: 9781804617526 · 1st Edition

Author: Maxime Labonne
Maxime Labonne is currently a senior applied researcher at Airbus. He received an M.Sc. degree in computer science from INSA CVL and a Ph.D. in machine learning and cyber security from the Polytechnic Institute of Paris. During his career, he worked on computer networks and the problem of representation learning, which led him to explore graph neural networks. He has applied this knowledge to various industrial projects, including intrusion detection, satellite communications, quantum networks, and AI-powered aircraft. He is now an active graph neural network evangelist on Twitter and through his personal blog.

Graph Attention Networks

Graph Attention Networks (GATs) are a theoretical improvement over GCNs. Instead of static normalization coefficients, they use weighting factors computed by a process called self-attention. The same process is at the core of one of the most successful deep learning architectures: the transformer, popularized by BERT and GPT-3. Introduced by Veličković et al. in 2017, GATs have become one of the most popular GNN architectures thanks to their excellent out-of-the-box performance.

In this chapter, we will learn how the graph attention layer works in four steps. This is actually the perfect example for understanding how self-attention works in general. This theoretical background will allow us to implement a graph attention layer from scratch in NumPy. We will build the matrices by ourselves to understand how their values are calculated at each step.

In the last section, we’ll use a GAT on two node classification datasets: Cora, and a...

Technical requirements

All the code examples from this chapter can be found on GitHub at https://github.com/PacktPublishing/Hands-On-Graph-Neural-Networks-Using-Python/tree/main/Chapter07.

The installation steps required to run the code on your local machine can be found in the Preface section of this book.

Introducing the graph attention layer

The main idea behind GATs is that some nodes are more important than others. In fact, this was already the case with the graph convolutional layer: nodes with few neighbors were more important than others, thanks to the normalization coefficient $\frac{1}{\sqrt{\deg(i)\,\deg(j)}}$. This approach is limiting because it only takes node degrees into account. The goal of the graph attention layer, on the other hand, is to produce weighting factors that also consider the importance of node features.

Let’s call our weighting factors attention scores and denote by $\alpha_{ij}$ the attention score between the nodes $i$ and $j$. We can define the graph attention operator as follows:

$$h_i = \sum_{j \in \mathcal{N}_i} \alpha_{ij} W x_j$$

An important characteristic of GATs is that the attention scores are calculated implicitly by comparing inputs to each other (hence the name self-attention). In this section, we will see how to calculate these attention scores in four steps and also how to make an improvement to the graph attention...
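
Although the rest of this section is not shown in this excerpt, it is worth previewing how the original GAT paper (Veličković et al.) defines these attention scores. The formula below condenses the first three of the four steps mentioned above (linear transformation, LeakyReLU activation, and softmax normalization), written with this chapter's notation, where $a$ is a learnable attention vector and $\|$ denotes concatenation:

$$\alpha_{ij} = \frac{\exp\left(\mathrm{LeakyReLU}\left(a^\top \left[ W x_i \,\|\, W x_j \right]\right)\right)}{\sum_{k \in \mathcal{N}_i} \exp\left(\mathrm{LeakyReLU}\left(a^\top \left[ W x_i \,\|\, W x_k \right]\right)\right)}$$

The fourth step, multi-head attention, repeats this computation with $K$ independent sets of weights $W^k$ and $a^k$ and concatenates (or averages) the resulting embeddings.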

Implementing the graph attention layer in NumPy

As previously stated, neural networks rely on matrix multiplications. Therefore, we need to translate the per-node embedding computations into operations over the entire graph. In this section, we will implement the original graph attention layer from scratch to properly understand the inner workings of self-attention. Naturally, this process can be repeated several times to create multi-head attention.

The first step consists of translating the original graph attention operator in terms of matrices. This is how we defined it in the last section:

$$h_i = \sum_{j \in \mathcal{N}_i} \alpha_{ij} W x_j$$

By taking inspiration from the graph linear layer, we can write the following:

$$H = W_\alpha X W^\top$$

Where $W_\alpha$ is a matrix that stores every attention score $\alpha_{ij}$ (with $\alpha_{ij} = 0$ whenever $j$ is not a neighbor of $i$, so the adjacency structure is absorbed into $W_\alpha$).
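
As a quick sanity check (not taken from the book), the following NumPy snippet confirms that this matrix form matches the per-node sum, using random attention scores masked by an arbitrary 4-node adjacency matrix; every name and dimension here is purely illustrative:

    import numpy as np

    np.random.seed(0)
    N, F_in, F_out = 4, 3, 2                     # nodes, input features, output features

    A = np.array([[1, 1, 1, 1],                  # arbitrary adjacency matrix with self-loops
                  [1, 1, 0, 0],
                  [1, 0, 1, 1],
                  [1, 0, 1, 1]])
    X = np.random.uniform(-1, 1, (N, F_in))      # node features
    W = np.random.uniform(-1, 1, (F_out, F_in))  # shared weight matrix

    # Random attention scores, masked by the adjacency matrix and normalized per row
    W_alpha = np.random.uniform(0, 1, (N, N)) * A
    W_alpha = W_alpha / W_alpha.sum(axis=1, keepdims=True)

    # Matrix form: H = W_alpha X W^T
    H = W_alpha @ X @ W.T

    # Per-node form: h_i = sum over j of alpha_ij * (W x_j)
    H_loop = np.array([sum(W_alpha[i, j] * (W @ X[j]) for j in range(N)) for i in range(N)])

    assert np.allclose(H, H_loop)                # both formulations agree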

In this example, we will use the following graph from the previous chapter:

Figure 7.3 – Simple graph where nodes have different numbers of neighbors

The graph must provide two important...
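
As a preview of where this section is heading, here is a minimal, self-contained NumPy sketch of a single graph attention layer: a shared linear transformation, concatenation-based edge scores, a LeakyReLU activation, and softmax normalization over each neighborhood. It reuses the same arbitrary 4-node graph as in the snippet above, and all shapes and variable names are assumptions rather than the book's exact code; multi-head attention would simply repeat this computation with several independent W and W_att matrices:

    import numpy as np

    np.random.seed(0)

    # Arbitrary 4-node graph (adjacency matrix with self-loops) and random node features
    A = np.array([[1, 1, 1, 1],
                  [1, 1, 0, 0],
                  [1, 0, 1, 1],
                  [1, 0, 1, 1]])
    X = np.random.uniform(-1, 1, (4, 4))

    F_out = 2
    W = np.random.uniform(-1, 1, (F_out, X.shape[1]))   # shared linear transformation
    W_att = np.random.uniform(-1, 1, (1, 2 * F_out))    # attention vector a

    def leaky_relu(x, alpha=0.2):
        return np.where(x > 0, x, alpha * x)

    # 1) Linear transformation of every node
    Z = X @ W.T                                          # shape (4, F_out)

    # 2) Unnormalized attention scores e_ij for every edge (i, j)
    src, dst = np.where(A > 0)
    e = leaky_relu(W_att @ np.concatenate([Z[src], Z[dst]], axis=1).T).flatten()

    # 3) Softmax normalization over each node's neighborhood
    E = np.full(A.shape, -np.inf)
    E[src, dst] = e
    W_alpha = np.exp(E - E.max(axis=1, keepdims=True))
    W_alpha = W_alpha / W_alpha.sum(axis=1, keepdims=True)

    # 4) Weighted aggregation of the transformed neighbors
    H = W_alpha @ Z
    print(H.shape)                                       # (4, 2)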

Implementing a GAT in PyTorch Geometric

We now have a complete picture of how the graph attention layer works. These layers can be stacked to create our new architecture of choice: the GAT. In this section, we will follow the guidelines from the original GAT paper to implement our own model using PyG. We will use it to perform node classification on the Cora and CiteSeer datasets. Finally, we will comment on these results and compare them.

Let’s start with the Cora dataset:

  1. We import the Cora dataset from the Planetoid class in PyG and print the resulting Data object:
    from torch_geometric.datasets import Planetoid
    dataset = Planetoid(root=".", name="Cora")
    data = dataset[0]
    print(data)
    # Data(x=[2708, 1433], edge_index=[2, 10556], y=[2708], train_mask=[2708], val_mask=[2708], test_mask=[2708])
  2. We import the necessary libraries to create our own GAT class, using the GATv2 layer (a sketch of such a class follows these steps):
    import torch
    import torch.nn.functional as F
    from torch_geometric.nn import GATv2Conv
    from torch.nn import Linear, Dropout
  3. ...
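
The remaining steps are not shown in this excerpt. As a rough sketch of where they lead, the following GAT class stacks two GATv2 layers with the hyperparameters suggested in the original GAT paper for Cora (eight attention heads of size eight in the first layer, a single head in the output layer, and a dropout rate of 0.6). The class name, training loop, and hyperparameter values below are illustrative assumptions rather than the book's exact code:

    class GAT(torch.nn.Module):
        """A two-layer graph attention network built with GATv2 layers."""
        def __init__(self, dim_in, dim_h, dim_out, heads=8):
            super().__init__()
            self.gat1 = GATv2Conv(dim_in, dim_h, heads=heads)
            self.gat2 = GATv2Conv(dim_h * heads, dim_out, heads=1)

        def forward(self, x, edge_index):
            x = F.dropout(x, p=0.6, training=self.training)
            x = F.elu(self.gat1(x, edge_index))
            x = F.dropout(x, p=0.6, training=self.training)
            x = self.gat2(x, edge_index)
            return F.log_softmax(x, dim=1)

    # Illustrative training loop on Cora, using the masks provided by Planetoid
    gat = GAT(dataset.num_features, 8, dataset.num_classes)
    optimizer = torch.optim.Adam(gat.parameters(), lr=0.01, weight_decay=5e-4)

    gat.train()
    for epoch in range(100):
        optimizer.zero_grad()
        out = gat(data.x, data.edge_index)
        loss = F.nll_loss(out[data.train_mask], data.y[data.train_mask])
        loss.backward()
        optimizer.step()

    gat.eval()
    pred = gat(data.x, data.edge_index).argmax(dim=1)
    acc = (pred[data.test_mask] == data.y[data.test_mask]).float().mean()
    print(f"Test accuracy: {acc:.4f}")

The same class can be reused for CiteSeer by loading Planetoid(root=".", name="CiteSeer") and letting dataset.num_features and dataset.num_classes adapt the input and output dimensions.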

Summary

In this chapter, we introduced a new essential architecture: the GAT. We walked through its inner workings in four main steps, from linear transformation to multi-head attention. We then saw how it works in practice by implementing a graph attention layer in NumPy. Finally, we applied a GAT model (with GATv2) to the Cora and CiteSeer datasets, where it provided excellent accuracy scores. We showed that these scores were dependent on the number of neighbors, which is a first step toward error analysis.

In Chapter 8, Scaling Graph Neural Networks with GraphSAGE, we will introduce a new architecture dedicated to managing large graphs. To test this claim, we will apply it to a new dataset several times bigger than anything we’ve seen so far. We will also discuss transductive and inductive learning, an important distinction for GNN practitioners.
