You're reading from Hands-On Graph Neural Networks Using Python

Product type: Book
Published in: Apr 2023
Publisher: Packt
ISBN-13: 9781804617526
Edition: 1st

Author: Maxime Labonne

Maxime Labonne is currently a senior applied researcher at Airbus. He received an M.Sc. degree in computer science from INSA CVL and a Ph.D. in machine learning and cybersecurity from the Polytechnic Institute of Paris. During his career, he worked on computer networks and the problem of representation learning, which led him to explore graph neural networks. He has applied this knowledge to various industrial projects, including intrusion detection, satellite communications, quantum networks, and AI-powered aircraft. He is now an active graph neural network evangelist through Twitter and his personal blog.

Learning from Heterogeneous Graphs

In the previous chapter, we tried to generate realistic molecules that contain different types of nodes (atoms) and edges (bonds). We observe this kind of structure in other applications too, such as recommender systems (users and items), social networks (followers and followees), and cybersecurity (routers and servers). We call these graphs heterogeneous, as opposed to homogeneous graphs, which involve only one type of node and one type of edge.

In this chapter, we will recap everything we know about homogeneous GNNs. We will introduce the message passing neural network framework to generalize the architectures we have seen so far. This summary will allow us to understand how to expand our framework to heterogeneous networks. We will start by creating our own heterogeneous dataset. Then, we will transform homogeneous architectures into heterogeneous ones.

In the last section, we will take a different approach and discuss an architecture...

Technical requirements

All the code examples from this chapter can be found on GitHub at https://github.com/PacktPublishing/Hands-On-Graph-Neural-Networks-Using-Python/tree/main/Chapter12.

The installation steps required to run the code on your local machine can be found in the Preface of this book.

The message passing neural network framework

Before exploring heterogeneous graphs, let’s recap what we have learned about homogeneous GNNs. In the previous chapters, we saw different functions for aggregating and combining features from different nodes. As seen in Chapter 5, the simplest GNN layer sums the linear combinations (with a weight matrix) of the features of neighboring nodes, including the target node itself. The output of this sum then replaces the previous embedding of the target node.

The node-level operator can be written as follows:

$h_i' = \sum_{j \in \tilde{\mathcal{N}}_i} W x_j$

Here, $\tilde{\mathcal{N}}_i$ is the set of neighboring nodes of node $i$ (including $i$ itself), $x_j$ is the embedding of node $j$, and $W$ is a weight matrix.

GCN and GAT layers added fixed (normalization) and dynamic (attention) weights, respectively, to node features but kept the same idea. Even GraphSAGE’s LSTM aggregator or GIN’s sum aggregator did not change the main concept of a GNN layer. If we look at all these variants, we can generalize GNN layers...
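The three-step view of a GNN layer (message, aggregate, update, following the MPNN terminology [1]) can be sketched in plain Python. This is a toy illustration, not the book's code: the graph, embeddings, and the choice of averaging in `update` are all invented for the example.

```python
# Toy message passing step on a homogeneous graph (plain Python, no GNN library).
# Node embeddings are lists of floats; the graph is an adjacency dict.

def message(x_j):
    # Message function: here, simply forward the neighbor's embedding
    return x_j

def aggregate(messages):
    # Permutation-invariant aggregation: element-wise sum over all messages
    return [sum(col) for col in zip(*messages)]

def update(x_i, agg):
    # Update function: here, average the node's embedding with the aggregate
    return [(a + b) / 2 for a, b in zip(x_i, agg)]

def mpnn_layer(x, adj):
    """One generic message passing layer: message -> aggregate -> update."""
    out = {}
    for i, neighbors in adj.items():
        msgs = [message(x[j]) for j in neighbors]
        out[i] = update(x[i], aggregate(msgs)) if msgs else x[i]
    return out

x = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
adj = {0: [1, 2], 1: [0], 2: [0]}
h = mpnn_layer(x, adj)
```

Swapping the bodies of `message`, `aggregate`, and `update` recovers the variants above: attention coefficients in `message` give a GAT-like layer, a sum aggregator a GIN-like one, and so on.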

Introducing heterogeneous graphs

Heterogeneous graphs are a powerful tool to represent general relationships between different entities. Having different types of nodes and edges creates graph structures that are more complex but also more difficult to learn from. In particular, one of the main problems with heterogeneous networks is that features from different types of nodes or edges do not necessarily have the same meaning or dimensionality. Therefore, naively merging different features would destroy a lot of information. This is not the case with homogeneous graphs, where each dimension has the exact same meaning for every node or edge.

Heterogeneous graphs are a more general kind of network that can represent different types of nodes and edges. Formally, a heterogeneous graph is defined as a graph, $G = (V, E)$, comprising $V$, a set of nodes, and $E$, a set of edges. In the heterogeneous setting, it is associated with a node-type mapping function, $\phi: V \rightarrow A$ (where $A$ denotes the set of node types), and a link-type mapping function,...
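This definition can be made concrete in a few lines of Python. The sketch below is illustrative only — the node names, types, and relations are invented — but it shows the two type-mapping functions as plain dictionaries:

```python
# A tiny heterogeneous graph: nodes V, edges E, and the two type-mapping
# functions from the formal definition (here, illustrative toy data)
V = ['alice', 'bob', 'paper1']
E = [('alice', 'paper1'), ('bob', 'paper1')]

# Node-type mapping function phi: V -> A
phi = {'alice': 'author', 'bob': 'author', 'paper1': 'paper'}
A = set(phi.values())  # set of node types

# Link-type mapping function psi: E -> R
psi = {('alice', 'paper1'): 'writes', ('bob', 'paper1'): 'writes'}
R = set(psi.values())  # set of edge types

# A graph is heterogeneous when it has more than one node or edge type,
# i.e., |A| + |R| > 2
is_heterogeneous = len(A) + len(R) > 2
```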

Transforming homogeneous GNNs to heterogeneous GNNs

To better understand the problem, let’s take a real dataset as an example. The DBLP computer science bibliography offers a dataset, [2-3], that contains four types of nodes – papers (14,328), terms (7,723), authors (4,057), and conferences (20). This dataset’s goal is to correctly classify the authors into four categories – database, data mining, artificial intelligence, and information retrieval. The authors’ node features are a bag-of-words (“0” or “1”) of 334 keywords they might have used in their publications. The following figure summarizes the relations between the different node types.

Figure 12.3 – Relationships between node types in the DBLP dataset

These node types do not have the same dimensionalities and semantic relationships. In heterogeneous graphs, relations between nodes are essential, which is why we want to consider...
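The core idea behind converting a homogeneous GNN into a heterogeneous one (which PyTorch Geometric can automate) can be sketched without any library: the layer is duplicated once per edge type, messages are aggregated separately for each relation, and the per-relation results are summed. The toy version below uses scalar features and weights instead of weight matrices, and the edge types and values are invented for illustration:

```python
# Toy heterogeneous layer: one weight per edge type instead of a single
# shared weight, so each relation learns its own transformation.

def hetero_layer(x, edges, weights):
    """x: {node: feature}, edges: {edge_type: [(src, dst), ...]},
    weights: {edge_type: scalar} (weight matrices in a real model)."""
    out = {node: 0.0 for node in x}
    for edge_type, edge_list in edges.items():
        w = weights[edge_type]
        for src, dst in edge_list:
            # Each relation applies its own transformation to incoming messages
            out[dst] += w * x[src]
    return out

x = {'a1': 1.0, 'a2': 2.0, 'p1': 3.0}
edges = {
    ('author', 'writes', 'paper'): [('a1', 'p1'), ('a2', 'p1')],
    ('paper', 'written_by', 'author'): [('p1', 'a1'), ('p1', 'a2')],
}
weights = {('author', 'writes', 'paper'): 0.5,
           ('paper', 'written_by', 'author'): 1.0}
h = hetero_layer(x, edges, weights)
```

Because each relation has its own weight, nodes of different types can keep features with different meanings and dimensionalities, which is exactly the problem raised above.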

Implementing a hierarchical self-attention network

In this section, we will implement a GNN model designed to handle heterogeneous graphs – the hierarchical self-attention network (HAN). This architecture was introduced by Liu et al. in 2021 [5]. HAN uses self-attention at two different levels:

  • Node-level attention to understand the importance of neighboring nodes in a given meta-path (such as a GAT in a homogeneous setting).
  • Semantic-level attention to learn the importance of each meta-path. This is the main feature of HAN, allowing us to select the best meta-paths for a given task automatically – for example, the meta-path game-user-game might be more relevant than game-dev-game in some tasks, such as predicting the number of players.
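The semantic-level step can be sketched as a softmax-weighted sum over the embeddings produced by each meta-path. This is a toy version only: in HAN the attention scores are learned from the data, whereas here they are fixed inputs, and the embeddings are invented for illustration.

```python
import math

def semantic_attention(meta_path_embeddings, scores):
    """Combine per-meta-path embeddings with softmax-normalized weights.
    In HAN the scores are learned; here they are given for illustration."""
    exp_scores = [math.exp(s) for s in scores]
    total = sum(exp_scores)
    betas = [e / total for e in exp_scores]  # importance of each meta-path
    dim = len(meta_path_embeddings[0])
    # Weighted sum of the meta-path-specific embeddings
    z = [sum(b * emb[d] for b, emb in zip(betas, meta_path_embeddings))
         for d in range(dim)]
    return z, betas

# Two meta-paths (e.g., game-user-game and game-dev-game), 2-dim embeddings;
# equal scores give each meta-path the same importance (beta = 0.5)
z, betas = semantic_attention([[1.0, 0.0], [0.0, 1.0]], scores=[0.0, 0.0])
```

The learned `betas` are what makes HAN interpretable: they directly report which meta-path mattered most for the task.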

In the following section, we will detail the three main components – node-level attention, semantic-level attention, and the prediction module. This architecture is illustrated in Figure 12.5.

Figure 12.5 – HAN’s architecture with its three main modules ...

Summary

In this chapter, we introduced the MPNN framework to generalize GNN layers using three steps – message, aggregate, and update. In the rest of the chapter, we expanded this framework to consider heterogeneous networks, composed of different types of nodes and edges. This particular kind of graph allows us to represent various relations between entities, which are more insightful than a single type of connection.

Moreover, we saw how to transform homogeneous GNNs into heterogeneous ones thanks to PyTorch Geometric. We described the different layers in our heterogeneous GAT, which take node pairs as inputs to model their relations. Finally, we implemented a heterogeneous-specific architecture with HAN and compared the results of three techniques on the DBLP dataset. This comparison demonstrated the importance of exploiting the heterogeneous information represented in this kind of network.

In Chapter 13, Temporal Graph Neural Networks, we will see how to consider time in GNNs...

Further reading

  • [1] J. Gilmer, S. S. Schoenholz, P. F. Riley, O. Vinyals, and G. E. Dahl. Neural Message Passing for Quantum Chemistry. arXiv, 2017. DOI: 10.48550/ARXIV.1704.01212. Available: https://arxiv.org/abs/1704.01212.
  • [2] Jie Tang, Jing Zhang, Limin Yao, Juanzi Li, Li Zhang, and Zhong Su. ArnetMiner: Extraction and Mining of Academic Social Networks. In Proceedings of the Fourteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD’2008), pp. 990–998. Available: https://dl.acm.org/doi/abs/10.1145/1401890.1402008.
  • [3] X. Fu, J. Zhang, Z. Meng, and I. King. MAGNN: Metapath Aggregated Graph Neural Network for Heterogeneous Graph Embedding. Apr. 2020. DOI: 10.1145/3366423.3380297. Available: https://arxiv.org/abs/2002.01680.
  • [4] M. Schlichtkrull, T. N. Kipf, P. Bloem, R. van den Berg, I. Titov, and M. Welling. Modeling Relational Data with Graph Convolutional Networks. arXiv, 2017. DOI: 10.48550/ARXIV.1703.06103. Available...
