You're reading from Graph Machine Learning

Product type: Book
Published in: Jun 2021
Publisher: Packt
ISBN-13: 9781800204492
Edition: 1st Edition
Authors (3):
Claudio Stamile

Claudio Stamile received an M.Sc. degree in computer science from the University of Calabria (Cosenza, Italy) in September 2013 and, in September 2017, he received his joint Ph.D. from KU Leuven (Leuven, Belgium) and Université Claude Bernard Lyon 1 (Lyon, France). During his career, he has developed a solid background in artificial intelligence, graph theory, and machine learning, with a focus on the biomedical field. He is currently a senior data scientist at CGnal, a consulting firm fully committed to helping its top-tier clients implement data-driven strategies and build AI-powered solutions to promote efficiency and support new business models.

Aldo Marzullo

Aldo Marzullo received an M.Sc. degree in computer science from the University of Calabria (Cosenza, Italy) in September 2016. During his studies, he developed a solid background in several areas, including algorithm design, graph theory, and machine learning. In January 2020, he received his joint Ph.D. from the University of Calabria and Université Claude Bernard Lyon 1 (Lyon, France), with a thesis entitled Deep Learning and Graph Theory for Brain Connectivity Analysis in Multiple Sclerosis. He is currently a postdoctoral researcher at the University of Calabria and collaborates with several international institutions.

Enrico Deusebio

Enrico Deusebio is currently the chief operating officer at CGnal, a consulting firm that helps its top-tier clients implement data-driven strategies and build AI-powered solutions. He has been working with data and large-scale simulations using high-performance facilities and large-scale computing centers for over 10 years, in both academic and industrial contexts. He has collaborated and worked with top-tier universities, such as the University of Cambridge, the University of Turin, and the Royal Institute of Technology (KTH) in Stockholm, where he obtained a Ph.D. in 2014. He also holds B.Sc. and M.Sc. degrees in aerospace engineering from Politecnico di Torino.

Chapter 2: Graph Machine Learning

Machine learning is a subset of artificial intelligence that aims to provide systems with the ability to learn and improve from data. It has achieved impressive results in many different applications, especially where it is difficult or unfeasible to explicitly define rules to solve a specific task. For instance, we can train algorithms to recognize spam emails, translate sentences into other languages, recognize objects in an image, and so on.

In recent years, there has been increasing interest in applying machine learning to graph-structured data. Here, the primary objective is to automatically learn suitable representations in order to make predictions, discover new patterns, and understand complex dynamics better than "traditional" machine learning approaches can.

This chapter will first review some of the basic machine learning concepts. Then, an introduction to graph machine learning will be provided, with a particular...

Technical requirements

We will be using Jupyter notebooks with Python 3.8 for all of our exercises. The following is a list of the Python libraries that need to be installed for this chapter using pip. For example, run pip install networkx==2.5 on the command line, and so on:

Jupyter==1.0.0
networkx==2.5
matplotlib==3.2.2
node2vec==0.3.3
karateclub==1.0.19
scipy==1.6.2

All the code files relevant to this chapter are available at https://github.com/PacktPublishing/Graph-Machine-Learning/tree/main/Chapter02.
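
As a quick sanity check, the pinned versions listed above can be compared against what is actually installed. The following is a minimal sketch that uses only the standard library (importlib.metadata, available from Python 3.8). The helper name check_requirements is our own and is not part of the book's code repository.

```python
from importlib import metadata

# Pinned versions listed in this chapter's technical requirements.
REQUIRED = {
    "jupyter": "1.0.0",
    "networkx": "2.5",
    "matplotlib": "3.2.2",
    "node2vec": "0.3.3",
    "karateclub": "1.0.19",
    "scipy": "1.6.2",
}


def check_requirements(required):
    """Return {package: (expected, installed or None)} for every mismatch."""
    problems = {}
    for name, expected in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # the package is not installed at all
        if installed != expected:
            problems[name] = (expected, installed)
    return problems


if __name__ == "__main__":
    # Hypothetical usage: report every package whose version differs.
    for name, (expected, installed) in check_requirements(REQUIRED).items():
        print(f"{name}: expected {expected}, found {installed or 'nothing'}")
```

An empty result means that every listed package is installed at exactly the pinned version.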

Understanding machine learning on graphs

Of the branches of artificial intelligence, machine learning is the one that has attracted the most attention in recent years. It refers to a class of computer algorithms that automatically learn and improve their skills through experience without being explicitly programmed. Such an approach takes inspiration from nature. Imagine an athlete who faces a novel movement for the first time: they start slowly, carefully imitating the gesture of a coach, trying, making mistakes, and trying again. Eventually, they will improve, becoming more and more confident.

Now, how does this concept translate to machines? It is essentially an optimization problem. The goal is to find a mathematical model that is able to achieve the best possible performance on a particular task. Performance is measured using a task-specific metric; the quantity that is actually optimized during training is usually called the loss function or cost function. In a common learning task, the algorithm is provided with data, possibly...
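
To make the optimization view concrete, here is a minimal sketch, not taken from the book, that fits a one-variable linear model by gradient descent on a mean squared error loss. The toy data, the function names (mse, fit_linear), the learning rate, and the number of steps are all illustrative assumptions.

```python
def mse(w, b, xs, ys):
    """Mean squared error of the linear model y = w * x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_linear(xs, ys, lr=0.05, steps=2000):
    """Fit w and b by plain gradient descent on the MSE loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the MSE with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1, without noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2 * x + 1 for x in xs]

w, b = fit_linear(xs, ys)
print(f"w={w:.3f}, b={b:.3f}, loss={mse(w, b, xs, ys):.6f}")
```

The model "learns" only in the sense that each step moves the parameters in the direction that lowers the loss on the data it was given.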

The generalized graph embedding problem

In classical machine learning applications, a common way to process the input data is to build a set of features from it, in a process called feature engineering, which gives a compact and meaningful representation of each instance present in the dataset.

The dataset obtained from the feature engineering step is then used as input for the machine learning algorithm. Although this process usually works well for a large range of problems, it may not be the optimal solution when we are dealing with graphs. Indeed, because of their well-defined structure, finding a suitable representation capable of incorporating all the useful information might not be an easy task.

The first, and most straightforward, way of creating features capable of representing structural information from graphs is the extraction of certain statistics. For instance, a graph could be represented by its degree distribution, efficiency, and all the metrics we described...
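
The statistics-as-features idea can be sketched in a few lines. The example below is a dependency-free illustration in plain Python rather than the book's own code. It turns a toy graph into a fixed-length vector made of its degree distribution plus its density. Density is used here as a simple stand-in for a second statistic, and the helper names and the max_degree cutoff are assumptions made for the example.

```python
from collections import Counter

# A small undirected graph as an edge list (toy data, not from the book).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]


def build_adjacency(edges):
    """Return {node: set(neighbors)} for an undirected graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_distribution(adj):
    """Fraction of nodes having each degree, as {degree: fraction}."""
    counts = Counter(len(neighbors) for neighbors in adj.values())
    n = len(adj)
    return {deg: counts[deg] / n for deg in sorted(counts)}


def density(adj):
    """Number of edges divided by the number of possible edges."""
    n = len(adj)
    m = sum(len(neighbors) for neighbors in adj.values()) / 2
    return 2 * m / (n * (n - 1)) if n > 1 else 0.0


def graph_features(edges, max_degree=4):
    """Degree fractions for degrees 0..max_degree, followed by the density."""
    adj = build_adjacency(edges)
    dist = degree_distribution(adj)
    return [dist.get(d, 0.0) for d in range(max_degree + 1)] + [density(adj)]


features = graph_features(edges)
print(features)
```

Every graph processed this way yields a vector of the same length, so the result can be fed to a standard machine learning algorithm. The price is that two structurally different graphs can share the same statistics.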

The taxonomy of graph embedding machine learning algorithms

A wide variety of methods for generating a compact representation space for graphs have been developed. In recent years, researchers and machine learning practitioners have been converging toward a unified notation that provides a common way to describe such algorithms. In this section, we introduce a simplified version of the taxonomy defined in the paper Machine Learning on Graphs: A Model and Comprehensive Taxonomy (https://arxiv.org/abs/2005.03675).

In this formal representation, every graph, node, or edge embedding method can be described by two fundamental components, named the encoder and the decoder. The encoder (ENC) maps the input into the embedding space, while the decoder (DEC) decodes structural information about the graph from the learned embedding (Figure 2.7).

The framework described in the paper follows an intuitive idea: if we are able to encode a graph such that the...
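
The encoder and decoder roles can be illustrated with a deliberately small sketch. It is an assumption-laden toy, not the formulation used in the paper or the book. Here ENC is a lookup table holding one trainable vector per node, DEC is an inner product between two node vectors, and training adjusts the vectors so that decoded scores approach the true adjacency. The graph, the function names, and the hyperparameters are all hypothetical.

```python
import random

# Toy undirected graph: a path 0 - 1 - 2 - 3 (illustrative data).
nodes = [0, 1, 2, 3]
edges = {(0, 1), (1, 2), (2, 3)}
pairs = [(u, v) for u in nodes for v in nodes if u < v]


def adjacency(u, v):
    """Return 1.0 if u and v are connected, else 0.0."""
    return 1.0 if (u, v) in edges or (v, u) in edges else 0.0


def make_encoder(nodes, dim=2, seed=0):
    """ENC as a lookup table: one trainable vector per node."""
    rng = random.Random(seed)
    return {n: [rng.uniform(-0.5, 0.5) for _ in range(dim)] for n in nodes}


def decode(z_u, z_v):
    """DEC as an inner product: a score for how strongly two nodes are linked."""
    return sum(a * b for a, b in zip(z_u, z_v))


def reconstruction_loss(emb):
    """Mean squared error between decoded scores and the true adjacency."""
    errors = [(decode(emb[u], emb[v]) - adjacency(u, v)) ** 2 for u, v in pairs]
    return sum(errors) / len(errors)


def train(emb, lr=0.05, epochs=300):
    """Per-pair gradient descent on the node vectors (updates emb in place)."""
    for _ in range(epochs):
        for u, v in pairs:
            err = decode(emb[u], emb[v]) - adjacency(u, v)
            z_u, z_v = emb[u][:], emb[v][:]  # copies, so both updates use old values
            for k in range(len(z_u)):
                emb[u][k] -= lr * 2 * err * z_v[k]
                emb[v][k] -= lr * 2 * err * z_u[k]
    return emb


emb = make_encoder(nodes)
initial = reconstruction_loss(emb)
train(emb)
final = reconstruction_loss(emb)
print(f"loss before: {initial:.3f}, after: {final:.3f}")
```

A lower reconstruction loss means that more of the graph's structure can be read back from the vectors alone, which is what makes them usable as features for downstream tasks.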

Summary 

In this chapter, we refreshed some basic machine learning concepts and discovered how they can be applied to graphs. We defined basic graph machine learning terminology with a particular focus on graph representation learning. A taxonomy of the main graph machine learning algorithms was presented in order to clarify what differentiates the various families of solutions developed over the years. Finally, practical examples were provided to show how the theory can be applied to real problems.

In the next chapter, we will review the main graph-based machine learning algorithms. We will analyze their behavior and see how they can be used in practice.
