You're reading from Hands-On Graph Neural Networks Using Python

Product type Book

Published in Apr 2023

Publisher Packt

ISBN-13 9781804617526

Pages 354 pages

Edition 1st Edition

Concepts

Neural Networks

Author (1):

Maxime Labonne

Table of Contents (25) Chapters

Preface

Part 1: Introduction to Graph Learning

Chapter 1: Getting Started with Graph Learning

Chapter 2: Graph Theory for Graph Neural Networks

Chapter 3: Creating Node Representations with DeepWalk

Part 2: Fundamentals

Chapter 4: Improving Embeddings with Biased Random Walks in Node2Vec

Chapter 5: Including Node Features with Vanilla Neural Networks

Chapter 6: Introducing Graph Convolutional Networks

Chapter 7: Graph Attention Networks

Part 3: Advanced Techniques

Chapter 8: Scaling Up Graph Neural Networks with GraphSAGE

Chapter 9: Defining Expressiveness for Graph Classification

Chapter 10: Predicting Links with Graph Neural Networks

Chapter 11: Generating Graphs Using Graph Neural Networks

Chapter 12: Learning from Heterogeneous Graphs

Chapter 13: Temporal Graph Neural Networks

Chapter 14: Explaining Graph Neural Networks

Part 4: Applications

Chapter 15: Forecasting Traffic Using A3T-GCN

Chapter 16: Detecting Anomalies Using Heterogeneous GNNs

Chapter 17: Building a Recommender System Using LightGCN

Chapter 18: Unlocking the Potential of Graph Neural Networks for Real-World Applications

Index

Why subscribe?

Other Books You May Enjoy

Preface

In just ten years, Graph Neural Networks (GNNs) have become an essential and popular deep learning architecture. They have already had a significant impact on various industries: in drug discovery, GNNs predicted a new antibiotic named halicin, and in navigation, they have improved estimated time of arrival calculations on Google Maps. Tech companies and universities are exploring the potential of GNNs in various applications, including recommender systems, fake news detection, and chip design. GNNs have enormous potential and many yet-to-be-discovered applications, making them a critical tool for solving global problems.

In this book, we aim to provide a comprehensive and practical overview of the world of GNNs. We will begin by exploring the fundamental concepts of graph theory and graph learning and then delve into the most widely used and well-established GNN architectures. As we progress, we will also cover the latest advances in GNNs and introduce specialized architectures that are designed to tackle specific tasks, such as graph generation, link prediction, and more.

In addition to these specialized chapters, we will provide hands-on experience through three practical projects. These projects will cover critical real-world applications of GNNs, including traffic forecasting, anomaly detection, and recommender systems. Through these projects, you will gain a deeper understanding of how GNNs work and also develop the skills to implement them in practical scenarios.

Finally, this book provides a hands-on learning experience with readable code for every chapter’s techniques and relevant applications, which are readily accessible on GitHub and Google Colab.

By the end of this book, you will have a comprehensive understanding of the field of graph learning and GNNs and will be well-equipped to design and implement these models for a wide range of applications.

Who this book is for

This book is intended for anyone interested in learning about GNNs and how they can be applied to real-world problems. It is ideal for data scientists, machine learning engineers, and artificial intelligence (AI) professionals who want to gain practical experience in designing and implementing GNNs. It assumes prior knowledge of deep learning and machine learning, but provides a comprehensive introduction to the fundamental concepts of graph theory and graph learning for those new to the field. It will also be useful for researchers and students in computer science, mathematics, and engineering who want to expand their knowledge in this rapidly growing area of research.

What this book covers

Chapter 1, Getting Started with Graph Learning, provides a comprehensive introduction to GNNs, including their importance in modern data analysis and machine learning. The chapter starts by exploring the relevance of graphs as a representation of data and their widespread use in various domains. It then delves into the importance of graph learning, including different applications and techniques. Finally, the chapter focuses on the GNN architecture and highlights its unique features and performance compared to other methods.

Chapter 2, Graph Theory for Graph Neural Networks, covers the basics of graph theory and introduces various types of graphs, including their properties and applications. This chapter also covers fundamental graph concepts, such as the adjacency matrix; graph measures, such as centrality; and graph algorithms, such as Breadth-First Search (BFS) and Depth-First Search (DFS).

Chapter 3, Creating Node Representations with DeepWalk, focuses on DeepWalk, a pioneer in applying machine learning to graph data. The main objective of the DeepWalk architecture is to generate node representations that other models can utilize for downstream tasks such as node classification. The chapter covers two key components of DeepWalk – Word2Vec and random walks – with a particular emphasis on the Word2Vec skip-gram model.

Chapter 4, Improving Embeddings with Biased Random Walks in Node2Vec, focuses on the Node2Vec architecture, which is based on the DeepWalk architecture covered in the previous chapter. The chapter covers the modifications made to the random walk generation in Node2Vec and how to select the best parameters for a specific graph. The implementation of Node2Vec is compared to DeepWalk on Zachary’s Karate Club to highlight the differences between the two architectures. The chapter concludes with a practical application of Node2Vec, building a movie recommendation system.

Chapter 5, Including Node Features with Vanilla Neural Networks, explores the integration of additional information, such as node and edge features, into the graph embeddings to produce more accurate results. The chapter starts with a comparison of vanilla neural networks’ performance on node features only, treated as tabular datasets. Then, we will experiment with adding topological information to the neural networks, leading to the creation of a simple vanilla GNN architecture.

Chapter 6, Introducing Graph Convolutional Networks, focuses on the Graph Convolutional Network (GCN) architecture and its importance as a blueprint for GNNs. It covers the limitations of previous vanilla GNN layers and explains the motivation behind GCNs. The chapter details how the GCN layer works, its performance improvements over the vanilla GNN layer, and its implementation on the Cora and Facebook Page-Page datasets using PyTorch Geometric. The chapter also touches upon the task of node regression and the benefits of transforming tabular data into a graph.

Chapter 7, Graph Attention Networks, focuses on Graph Attention Networks (GATs), which are an improvement over GCNs. The chapter explains how GATs work by using the concept of self-attention and provides a step-by-step understanding of the graph attention layer. The chapter also implements a graph attention layer from scratch using NumPy. The final section of the chapter discusses the use of a GAT on two node classification datasets, Cora and CiteSeer, and compares the accuracy with that of a GCN.

Chapter 8, Scaling Up Graph Neural Networks with GraphSAGE, focuses on the GraphSAGE architecture and its ability to handle large graphs effectively. The chapter covers the two main ideas behind GraphSAGE: its neighbor sampling technique and its aggregation operators. You will learn about the variants proposed by tech companies such as Uber Eats and Pinterest, as well as the benefits of GraphSAGE’s inductive approach. The chapter concludes by implementing GraphSAGE for node classification and multi-label classification tasks.

Chapter 9, Defining Expressiveness for Graph Classification, explores the concept of expressiveness in GNNs and how it can be used to design better models. It introduces the Weisfeiler-Leman (WL) test, which provides the framework for understanding expressiveness in GNNs. The chapter uses the WL test to compare different GNN layers and determine the most expressive one. Based on this result, a more powerful GNN is designed and implemented using PyTorch Geometric. The chapter concludes with a comparison of different methods for graph classification on the PROTEINS dataset.

Chapter 10, Predicting Links with Graph Neural Networks, focuses on link prediction in graphs. It covers both traditional techniques, such as matrix factorization, and GNN-based methods. The chapter explains the concept of link prediction and its importance in social networks and recommender systems. You will learn about the limitations of traditional techniques and the benefits of using GNN-based methods. We will explore three GNN-based techniques from two different families: node embeddings and subgraph representation. Finally, you will implement various link prediction techniques in PyTorch Geometric and choose the best method for a given problem.

Chapter 11, Generating Graphs Using Graph Neural Networks, explores the field of graph generation, which involves finding methods to create new graphs. The chapter first introduces you to traditional techniques such as Erdős–Rényi and small-world models. Then you will focus on three families of solutions for GNN-based graph generation: VAE-based, autoregressive, and GAN-based models. The chapter concludes with an implementation of a GAN-based framework with Reinforcement Learning (RL) to generate new chemical compounds using the DeepChem library with TensorFlow.

Chapter 12, Learning from Heterogeneous Graphs, focuses on heterogeneous GNNs. Heterogeneous graphs contain different types of nodes and edges, in contrast to homogeneous graphs, which only involve one type of node and one type of edge. The chapter begins by reviewing the Message Passing Neural Network (MPNN) framework for homogeneous GNNs, then expands the framework to heterogeneous networks. Finally, we introduce a technique for creating a heterogeneous dataset, show how to transform homogeneous architectures into heterogeneous ones, and discuss an architecture specifically designed for processing heterogeneous networks.

Chapter 13, Temporal Graph Neural Networks, focuses on Temporal GNNs, or Spatio-Temporal GNNs, which are a type of GNN that can handle graphs with changing edges and features over time. The chapter first explains the concept of dynamic graphs and the applications of temporal GNNs, focusing on time series forecasting. The chapter then moves on to the application of temporal GNNs to web traffic forecasting to improve results using temporal information. Finally, the chapter describes another temporal GNN architecture specifically designed for dynamic graphs and applies it to the task of epidemic forecasting.

Chapter 14, Explaining Graph Neural Networks, covers various techniques to better understand the predictions and behavior of a GNN model. The chapter highlights two popular explanation methods: GNNExplainer and integrated gradients. Then, you will see the application of these techniques on a graph classification task using the MUTAG dataset and a node classification task using the Twitch social network.

Chapter 15, Forecasting Traffic Using A3T-GCN, focuses on the application of Temporal Graph Neural Networks in the field of traffic forecasting. It highlights the importance of accurate traffic forecasts in smart cities and the challenges of traffic forecasting due to complex spatial and temporal dependencies. The chapter covers the steps involved in processing a new dataset to create a temporal graph and the implementation of a new type of temporal GNN to predict future traffic speed. Finally, the results are compared to a baseline solution to verify the relevance of the architecture.

Chapter 16, Detecting Anomalies Using Heterogeneous GNNs, focuses on the application of GNNs in anomaly detection. With their ability to capture complex relationships and handle large amounts of data efficiently, GNNs are well-suited for detecting anomalies. In this chapter, you will learn how to implement a GNN for intrusion detection in computer networks using the CIDDS-001 dataset. The chapter covers processing the dataset, building relevant features, implementing a heterogeneous GNN, and evaluating the results to determine its effectiveness in detecting anomalies in network traffic.

Chapter 17, Building a Recommender System Using LightGCN, focuses on the application of GNNs in recommender systems. The goal of recommender systems is to provide personalized recommendations to users based on their interests and past interactions. GNNs are well-suited for this task as they can effectively incorporate complex relationships between users and items. In this chapter, the LightGCN architecture is introduced as a GNN specifically designed for recommender systems. Using the Book-Crossing dataset, the chapter demonstrates how to build a book recommender system with collaborative filtering using the LightGCN architecture.

Chapter 18, Unlocking the Potential of Graph Neural Networks for Real-World Applications, summarizes what we have learned throughout the book and looks ahead to the future of GNNs.

To get the most out of this book

To maximize your learning experience, you should have a basic understanding of graph theory and of machine learning concepts such as supervised and unsupervised learning, training, and model evaluation. Familiarity with deep learning frameworks, such as PyTorch, will also be useful, although not essential, as the book will provide a comprehensive introduction to the mathematical concepts and their implementation.

Software covered in the book	Operating system requirements
Python 3.8.15	Windows, macOS, or Linux
PyTorch 1.13.1	Windows, macOS, or Linux
PyTorch Geometric 2.2.0	Windows, macOS, or Linux

To install Python 3.8.15, you can download it from the official Python website: https://www.python.org/downloads/. We strongly recommend using a virtual environment, such as venv or conda.

Optionally, if you want to use a Graphics Processing Unit (GPU) from NVIDIA to accelerate training and inference, you will need to install CUDA and cuDNN:

CUDA is a parallel computing platform and API developed by NVIDIA for general computing on GPUs. To install CUDA, you can follow the instructions on the NVIDIA website: https://developer.nvidia.com/cuda-downloads.

cuDNN is a library developed by NVIDIA, which provides highly optimized GPU implementations of primitives for deep learning algorithms. To install cuDNN, you need to create an account on the NVIDIA website and download the library from the cuDNN download page: https://developer.nvidia.com/cudnn.

You can check out the list of CUDA-enabled GPU products on the NVIDIA website: https://developer.nvidia.com/cuda-gpus.

To install PyTorch 1.13.1, you can follow the instructions on the official PyTorch website: https://pytorch.org/. You can choose the installation method that is most appropriate for your system (including CUDA and cuDNN).
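Once PyTorch is installed, a quick way to confirm that it can see the GPU is to ask it directly. The following is a minimal sketch rather than code from the book; the helper name pick_device is hypothetical, and it returns None when PyTorch is not installed so that the script does not crash.

```python
def pick_device():
    """Return "cuda" if PyTorch sees a usable NVIDIA GPU, "cpu" if it
    does not, and None if PyTorch is not installed at all.

    pick_device is a hypothetical helper name used for illustration.
    """
    try:
        import torch
    except ImportError:
        return None  # PyTorch is missing from this environment
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
if device is None:
    print("PyTorch is not installed in this environment.")
else:
    print(f"PyTorch would run on: {device}")
```

If this reports cpu on a machine with an NVIDIA GPU, the usual suspects are a CPU-only PyTorch build or a driver/CUDA mismatch.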

To install PyTorch Geometric 2.2.0, you can follow the instructions in the official documentation: https://pytorch-geometric.readthedocs.io/en/2.2.0/notes/installation.html. You will need to have PyTorch installed on your system first.

Chapter 11 requires TensorFlow 2.4. To install it, you can follow the instructions on the official TensorFlow website: https://www.tensorflow.org/install. You can choose the installation method that is most appropriate for your system and the version of TensorFlow you want to use.

Chapter 14 requires an older version of PyTorch Geometric (version 2.0.4). It is recommended to create a specific virtual environment for this chapter.
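One way to create such a dedicated environment without leaving Python is the standard library's venv module. This is only a sketch under assumptions: the helper name create_env and the directory name ch14-env are hypothetical, and conda or the python -m venv command line would work equally well.

```python
import tempfile
import venv
from pathlib import Path


def create_env(env_dir, with_pip=False):
    """Create a virtual environment in env_dir and return the path to its
    pyvenv.cfg file, which venv writes into every environment it creates.

    create_env is a hypothetical helper; pass with_pip=True to also
    bootstrap pip inside the new environment.
    """
    venv.create(env_dir, with_pip=with_pip)
    return Path(env_dir) / "pyvenv.cfg"


if __name__ == "__main__":
    # Demo in a throwaway directory; use a real path to keep the environment.
    with tempfile.TemporaryDirectory() as tmp:
        cfg = create_env(Path(tmp) / "ch14-env")
        print(cfg.read_text())
```

After creating a permanent environment this way, you would activate it and install PyTorch Geometric 2.0.4 there, leaving the main environment on version 2.2.0.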

Chapter 15, Chapter 16, and Chapter 17 require a large amount of GPU memory. You can lower the memory usage by decreasing the size of the training set in the code.

Other Python libraries are required in some or most chapters. You can install them using pip install <name==version>, or using another installer depending on your configuration (such as conda). Here is the complete list of required packages with the corresponding versions:

pandas==1.5.2
gensim==4.3.0
networkx==2.8.8
matplotlib==3.6.3
node2vec==0.4.6
seaborn==0.12.2
scikit-learn==1.2.0
deepchem==2.7.1
torch-geometric-temporal==0.54.0
captum==0.6.0
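To see whether an existing environment already matches these pins, the installed versions can be compared against the list. The sketch below uses only the standard library's importlib.metadata; the names PINNED and check_versions are hypothetical, and the dictionary simply copies the versions listed above.

```python
from importlib.metadata import PackageNotFoundError, version

# Pinned versions copied from the list above.
PINNED = {
    "pandas": "1.5.2",
    "gensim": "4.3.0",
    "networkx": "2.8.8",
    "matplotlib": "3.6.3",
    "node2vec": "0.4.6",
    "seaborn": "0.12.2",
    "scikit-learn": "1.2.0",
    "deepchem": "2.7.1",
    "torch-geometric-temporal": "0.54.0",
    "captum": "0.6.0",
}


def check_versions(pinned):
    """Return {package: (expected, installed)} for every package whose
    installed version differs from the pin; installed is None when the
    package is not installed at all."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in check_versions(PINNED).items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

An empty result means every listed package is installed at the pinned version.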

The complete list of requirements is available on GitHub at https://github.com/PacktPublishing/Hands-On-Graph-Neural-Networks-Using-Python. Alternatively, you can directly import notebooks in Google Colab at https://colab.research.google.com.

If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book’s GitHub repository (a link is available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.

Download the example code files

You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Hands-On-Graph-Neural-Networks-Using-Python. If there’s an update to the code, it will be updated in the GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Download the color images

We also provide a PDF file that has color images of the screenshots and diagrams used in this book. You can download it here: https://packt.link/gaFU6.

Conventions used

There are a number of text conventions used throughout this book.

Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: “We initialize two lists (visited and queue) and add the starting node.”

A block of code is set as follows:

import networkx as nx

DG = nx.DiGraph()
DG.add_edges_from([('A', 'B'), ('A', 'C'), ('B', 'D'), ('B', 'E'), ('C', 'F'), ('C', 'G')])
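The inline example quoted above ("We initialize two lists (visited and queue) and add the starting node") describes the first step of a breadth-first search. As a rough illustration, and not the book's own listing, here is a dependency-free sketch of BFS over the same edges. It stores the graph in a plain dictionary instead of a NetworkX DiGraph and uses a deque for the queue; the function name bfs is an assumption.

```python
from collections import deque

# Same directed edges as in the block above, kept as a plain adjacency dict
# so the sketch runs without NetworkX.
edges = [('A', 'B'), ('A', 'C'), ('B', 'D'), ('B', 'E'), ('C', 'F'), ('C', 'G')]
graph = {}
for src, dst in edges:
    graph.setdefault(src, []).append(dst)
    graph.setdefault(dst, [])  # make sure leaf nodes appear as keys too


def bfs(graph, start):
    """Return nodes reachable from start in breadth-first visiting order."""
    visited = [start]
    queue = deque([start])  # a deque gives O(1) pops from the left
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.append(neighbor)
                queue.append(neighbor)
    return visited


print(bfs(graph, 'A'))  # ['A', 'B', 'C', 'D', 'E', 'F', 'G']
```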

Tips or important notes

Appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, email us at customercare@packtpub.com and mention the book title in the subject of your message.

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.

Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at copyright@packt.com with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Share your thoughts

Once you’ve read Hands-On Graph Neural Networks Using Python, we’d love to hear your thoughts! Please visit the Amazon review page for this book and share your feedback.

Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.