
You're reading from Hands-On Graph Neural Networks Using Python (1st edition, Packt, April 2023, ISBN-13: 9781804617526), by Maxime Labonne.
Maxime Labonne is currently a senior applied researcher at Airbus. He received an M.Sc. degree in computer science from INSA CVL and a Ph.D. in machine learning and cybersecurity from the Polytechnic Institute of Paris. During his career, he worked on computer networks and the problem of representation learning, which led him to explore graph neural networks. He applied this knowledge to various industrial projects, including intrusion detection, satellite communications, quantum networks, and AI-powered aircraft. He is now an active graph neural network evangelist on Twitter and his personal blog.

Improving Embeddings with Biased Random Walks in Node2Vec

Node2Vec is an architecture largely based on DeepWalk. In the previous chapter, we saw the two main components of this architecture: random walks and Word2Vec. How can we improve the quality of our embeddings? Interestingly enough, not with more machine learning. Instead, Node2Vec brings critical modifications to the way random walks themselves are generated.

In this chapter, we will discuss these modifications and how to find the best parameters for a given graph. We will implement the Node2Vec architecture and compare it to DeepWalk on Zachary’s Karate Club, which will give you a good understanding of the differences between the two architectures. Finally, we will use this technology to build a real application: a movie recommender system (RecSys) powered by Node2Vec.

By the end of this chapter, you will know how to implement Node2Vec on any graph dataset and how to select good parameters. You will understand...

Technical requirements

All the code examples from this chapter can be found on GitHub at https://github.com/PacktPublishing/Hands-On-Graph-Neural-Networks-Using-Python/tree/main/Chapter04.

Installation steps required to run the code on your local machine can be found in the Preface of this book.

Introducing Node2Vec

Node2Vec was introduced in 2016 by Grover and Leskovec from Stanford University [1]. It keeps the same two main components as DeepWalk: random walks and Word2Vec. The difference is that instead of sampling node sequences from a uniform distribution, Node2Vec carefully biases the random walks. We will see why these biased walks perform better and how to implement them in the following two sections:

  • Defining a neighborhood
  • Introducing biases in random walks

Let’s start by questioning our intuitive concept of neighborhoods.
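To make the upcoming discussion concrete, here is a minimal sketch of the second-order (biased) walk step that Node2Vec relies on. It uses a plain adjacency dictionary and is not the book’s own implementation; the function names and parameter defaults are illustrative:

```python
import random

def next_node(adj, previous, current, p=1.0, q=1.0):
    """One step of a Node2Vec-style second-order biased walk.

    Unnormalized transition weights: 1/p to return to `previous`,
    1 for nodes at distance 1 from `previous`, and 1/q otherwise.
    """
    neighbors = sorted(adj[current])
    weights = []
    for node in neighbors:
        if node == previous:
            weights.append(1 / p)  # return parameter p
        elif previous is not None and node in adj[previous]:
            weights.append(1.0)    # stays at distance 1 from `previous`
        else:
            weights.append(1 / q)  # in-out parameter q
    return random.choices(neighbors, weights=weights, k=1)[0]

def biased_walk(adj, start, length, p=1.0, q=1.0):
    """Generate one biased random walk of `length` nodes."""
    walk = [start]
    previous = None
    for _ in range(length - 1):
        nxt = next_node(adj, previous, walk[-1], p, q)
        previous = walk[-1]
        walk.append(nxt)
    return walk
```

Note that with p = q = 1, every weight equals 1 and the walk reduces to the uniform random walk used by DeepWalk.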

Defining a neighborhood

How do you define the neighborhood of a node? The key concept introduced in Node2Vec is the flexible notion of a neighborhood. Intuitively, we think of it as something close to the initial node, but what does “close” mean in the context of a graph? Let’s take the following graph as an example:

Figure 4.1 – Example of a random graph
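The Node2Vec paper frames “closeness” through two sampling strategies: breadth-first sampling stays in the immediate vicinity of the start node, while depth-first sampling wanders away from it. A small dependency-free sketch of the two notions (the helper names and the toy adjacency are ours, not the book’s):

```python
from collections import deque

def bfs_neighborhood(adj, start, k=3):
    """First k nodes reached in breadth-first order (local notion of 'close')."""
    seen, queue, result = {start}, deque([start]), []
    while queue and len(result) < k:
        node = queue.popleft()
        for nb in sorted(adj[node]):
            if nb not in seen:
                seen.add(nb)
                queue.append(nb)
                result.append(nb)
                if len(result) == k:
                    break
    return result

def dfs_neighborhood(adj, start, k=3):
    """First k nodes reached in depth-first order (far-reaching notion of 'close')."""
    seen, stack, result = {start}, [start], []
    while stack and len(result) < k:
        node = stack.pop()
        if node != start:
            result.append(node)
            if len(result) == k:
                break
        for nb in sorted(adj[node]):
            if nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return result
```

On a small graph where node 0 links to leaves 1 and 2 and to a chain 3-4-5, the BFS neighborhood of 0 is its direct neighbors, while the DFS neighborhood follows the chain away from 0 — two very different answers to “what is close to node 0?”.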

Implementing Node2Vec

Now that we have the functions to generate biased random walks, implementing Node2Vec is very similar to implementing DeepWalk. It is so similar that we can reuse the same code and create sequences with p = 1 and q = 1 to implement DeepWalk as a special case of Node2Vec. Let’s reuse Zachary’s Karate Club for this task:

As in the previous chapter, our goal is to correctly classify each member of the club as part of one of the two groups (“Mr. Hi” and “Officer”). We will use the node embeddings provided by Node2Vec as input to a machine learning classifier (Random Forest in this case).

Let’s see how to implement it step by step:

  1. First, we want to install the gensim library to use Word2Vec. This time, we will use version 3.8.0 for compatibility reasons:
    !pip install -qI gensim==3.8.0
  2. We import the required libraries:
    from gensim.models.word2vec import Word2Vec
    from sklearn.ensemble import RandomForestClassifier...

Building a movie RecSys

One of the most popular applications of GNNs is RecSys. If you think about the foundation of Word2Vec (and, thus, DeepWalk and Node2Vec), the goal is to produce vectors with the ability to measure their similarity. Encode movies instead of words, and you can suddenly ask for movies that are the most similar to a given input title. It sounds a lot like a RecSys, right?

But how do we encode movies? We want to create (biased) random walks of movies, but this requires a graph dataset where similar movies are connected to each other. This is not easy to find.

Another approach is to look at user ratings. There are different techniques to build a graph based on ratings: bipartite graphs, edges based on pointwise mutual information, and so on. In this section, we’ll implement a simple and intuitive approach: movies that are liked by the same users are connected. We’ll then use this graph to learn movie embeddings using Node2Vec:

  1. First, let’s...
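The listing above is truncated, but the graph-construction idea just described — connecting movies that are liked by the same users — can be sketched without any dependencies. The users, titles, ratings, and the “liked” threshold below are invented for illustration:

```python
from collections import defaultdict
from itertools import combinations

# Toy (user, movie, rating) triples, invented for illustration
ratings = [
    ("u1", "Toy Story", 5), ("u1", "Aliens", 4), ("u1", "Heat", 2),
    ("u2", "Toy Story", 4), ("u2", "Aliens", 5),
    ("u3", "Heat", 5), ("u3", "Casino", 4),
    ("u4", "Toy Story", 5), ("u4", "Casino", 2),
]

LIKE_THRESHOLD = 4  # a rating >= 4 counts as "liked"

# Collect the set of movies each user liked
liked = defaultdict(set)
for user, movie, rating in ratings:
    if rating >= LIKE_THRESHOLD:
        liked[user].add(movie)

# Connect every pair of movies liked by the same user; the edge
# weight counts how many users liked both movies
edges = defaultdict(int)
for movies in liked.values():
    for a, b in combinations(sorted(movies), 2):
        edges[(a, b)] += 1
```

The resulting weighted edge list is exactly the kind of graph the section describes: feeding it to the biased random walks and Word2Vec from earlier yields movie embeddings, and nearest neighbors in that embedding space become recommendations.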

Summary

In this chapter, we learned about Node2Vec, a second architecture based on the popular Word2Vec. We implemented functions to generate biased random walks and explained the connection between their parameters and two network properties: homophily and structural equivalence. We showed their usefulness by comparing Node2Vec’s results to DeepWalk’s for Zachary’s Karate Club. Finally, we built our first RecSys using a custom graph dataset and another implementation of Node2Vec. It gave us correct recommendations that we will improve even more in later chapters.

In Chapter 5, Including Node Features with Vanilla Neural Networks, we will talk about one overlooked issue concerning DeepWalk and Node2Vec: the lack of proper node features. We will try to address this problem using traditional neural networks, which cannot understand the network topology. This dilemma is important to understand before we finally introduce the answer: graph neural networks.

Further reading

[1] A. Grover and J. Leskovec, node2vec: Scalable Feature Learning for Networks, Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2016.
