
You're reading from Hands-On Graph Neural Networks Using Python

Product type: Book
Published in: Apr 2023
Publisher: Packt
ISBN-13: 9781804617526
Edition: 1st
Author: Maxime Labonne

Maxime Labonne is currently a senior applied researcher at Airbus. He received an M.Sc. degree in computer science from INSA CVL and a Ph.D. in machine learning and cyber security from the Polytechnic Institute of Paris. During his career, he worked on computer networks and the problem of representation learning, which led him to explore graph neural networks. He applied this knowledge to various industrial projects, including intrusion detection, satellite communications, quantum networks, and AI-powered aircraft. He is now an active graph neural network evangelist through Twitter and his personal blog.

Including Node Features with Vanilla Neural Networks

So far, the only type of information we’ve considered is the graph topology. However, graph datasets tend to be richer than a mere set of connections: nodes and edges can also have features to represent scores, colors, words, and so on. Including this additional information in our input data is essential to produce the best embeddings possible. In fact, this is something natural in machine learning: node and edge features have the same structure as a tabular (non-graph) dataset. This means that traditional techniques can be applied to this data, such as neural networks.

In this chapter, we will introduce two new graph datasets: Cora and Facebook Page-Page. We will see how vanilla neural networks perform on node features alone by treating them as tabular datasets. We will then experiment with including topological information in our neural networks. This will give us our first GNN architecture: a simple model that considers...

Technical requirements

All the code examples from this chapter can be found on GitHub at https://github.com/PacktPublishing/Hands-On-Graph-Neural-Networks-Using-Python/tree/main/Chapter05.

Installation steps required to run the code on your local machine can be found in the Preface of this book.

Introducing graph datasets

The graph datasets we’re going to use in this chapter are richer than Zachary’s Karate Club: they have more nodes, more edges, and include node features. In this section, we will introduce them to build a good understanding of these graphs and see how to process them with PyTorch Geometric. Here are the two datasets we will use:

  • The Cora dataset
  • The Facebook Page-Page dataset

Let’s start with the smaller one: the popular Cora dataset.

The Cora dataset

Introduced by Sen et al. in 2008 [1], Cora (no license) is the most popular dataset for node classification in the scientific literature. It represents a network of 2,708 publications, where each connection is a reference. Each publication is described as a binary vector of 1,433 unique words, where 0 and 1 indicate the absence or presence of the corresponding word, respectively. This representation is also called a binary bag of words in natural language processing...
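The binary bag-of-words idea can be illustrated with a tiny, made-up vocabulary (Cora's real vocabulary contains 1,433 unique words; in PyTorch Geometric, the dataset itself can be downloaded with the `Planetoid` class):

```python
# Toy illustration of a binary bag-of-words encoding like Cora's.
# The vocabulary and document below are hypothetical, chosen for brevity.
vocabulary = ["graph", "neural", "network", "learning"]
document_words = {"graph", "learning"}  # words present in one publication

# 1 if the word appears in the publication, 0 otherwise
features = [1 if word in document_words else 0 for word in vocabulary]
print(features)  # [1, 0, 0, 1]
```

Each publication in Cora is encoded as one such vector of length 1,433, so the whole dataset forms a 2,708 × 1,433 feature matrix.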

Classifying nodes with vanilla neural networks

Compared to Zachary’s Karate Club, these two datasets include a new type of information: node features. They provide additional information about the nodes in a graph, such as a user’s age, gender, or interests in a social network. In a vanilla neural network (also called a multilayer perceptron), these features are used directly as model inputs to perform downstream tasks such as node classification.

In this section, we will consider node features as a regular tabular dataset. We will train a simple neural network on this dataset to classify our nodes. Note that this architecture does not take into account the topology of the network. We will try to fix this issue in the next section and compare our results.
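As a sketch of this approach, here is a minimal multilayer perceptron whose input and output sizes match Cora (1,433 bag-of-words features, 7 classes); the hidden size of 16 is an arbitrary illustrative choice, not necessarily the book's exact model:

```python
import torch
import torch.nn as nn

# Minimal MLP sketch: node features in, class scores out.
# No graph structure is used anywhere in this model.
mlp = nn.Sequential(
    nn.Linear(1433, 16),  # Cora's 1,433 bag-of-words features -> 16 hidden units
    nn.ReLU(),
    nn.Linear(16, 7),     # 7 publication classes in Cora
)

out = mlp(torch.rand(8, 1433))  # 8 dummy nodes with random features
print(out.shape)                # torch.Size([8, 7])
```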

The tabular dataset of node features can be easily accessed through the data object we created. First, I would like to convert this object into a regular pandas DataFrame by merging data.x (containing the node features...
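A sketch of that conversion, using small NumPy arrays as stand-ins for `data.x` and `data.y` (with real PyTorch Geometric tensors, a `.numpy()` call would be needed first):

```python
import numpy as np
import pandas as pd

# Stand-ins for data.x (binary node features) and data.y (class labels).
x = np.array([[0, 1, 1],
              [1, 0, 0]])
y = np.array([0, 1])

# One row per node, one column per feature, plus a label column.
df = pd.DataFrame(x)
df["label"] = y
print(df.shape)  # (2, 4)
```

The resulting DataFrame is an ordinary tabular dataset that any classical model can consume.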

Classifying nodes with vanilla graph neural networks

Instead of directly introducing well-known GNN architectures, let’s try to build our own model to understand the thought process behind GNNs. First, we need to go back to the definition of a simple linear layer.

A basic neural network layer corresponds to a linear transformation h_i = W·x_i, where x_i is the input vector of node i and W is the weight matrix. In PyTorch, this equation can be implemented with the torch.mm() function, or with the nn.Linear class, which adds other parameters such as biases.
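The two equivalent formulations can be sketched as follows (the dimensions here are arbitrary, chosen only for illustration):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.rand(4, 8)   # 4 nodes, 8 features each (one row per node)
W = torch.rand(8, 3)   # weight matrix mapping 8 features -> 3

# Explicit matrix multiplication: h = xW
h = torch.mm(x, W)

# Equivalent layer with a learned weight matrix (bias disabled here
# to match the plain linear transformation above)
linear = nn.Linear(8, 3, bias=False)
h2 = linear(x)
print(h.shape, h2.shape)  # torch.Size([4, 3]) torch.Size([4, 3])
```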

With our graph datasets, the input vectors are node features. This means that nodes are treated completely independently of each other. That is not enough to capture a good understanding of the graph: like a pixel in an image, the context of a node is essential to understanding it. If you look at a group of pixels instead of a single one, you can recognize edges, patterns, and so on. Likewise, to understand a node, you need to look at its neighborhood...
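One simple way to express this neighborhood aggregation (a sketch of the intuition, not necessarily the book's final architecture) is to multiply by an adjacency matrix with self-loops added, so that each node sums its own and its neighbors' transformed features:

```python
import torch

torch.manual_seed(0)
# Toy graph with 3 nodes: adjacency matrix with self-loops (A + I),
# so every node also keeps its own features during aggregation.
A = torch.tensor([[1., 1., 0.],
                  [1., 1., 1.],
                  [0., 1., 1.]])
X = torch.rand(3, 4)   # node features (3 nodes, 4 features)
W = torch.rand(4, 2)   # weight matrix mapping 4 features -> 2

# H = A(XW): each row of H sums the transformed features of a
# node and its neighbors, injecting topology into the layer.
H = torch.mm(A, torch.mm(X, W))
print(H.shape)  # torch.Size([3, 2])
```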

Summary

In this chapter, we learned about the missing link between vanilla neural networks and GNNs. We built our own GNN architecture using our intuition and a bit of linear algebra. We explored two popular graph datasets from the scientific literature to compare our two architectures. Finally, we implemented them in PyTorch and evaluated their performance. The result is clear: even our intuitive version of a GNN completely outperforms the MLP on both datasets.

In Chapter 6, Normalizing Embeddings with Graph Convolutional Networks, we refine our vanilla GNN architecture to correctly normalize its inputs. This graph convolutional network model is an incredibly efficient baseline we’ll keep using in the rest of the book. We will compare its results on our two previous datasets and introduce a new interesting task: node regression.

Further reading

