Graph Data Science with Neo4j

Product type Book
Published in Jan 2023
Publisher Packt
ISBN-13 9781804612743
Pages 288 pages
Edition 1st Edition
Author: Estelle Scifo

Table of Contents (16 chapters)

  • Preface
  • Part 1 – Creating Graph Data in Neo4j
    • Chapter 1: Introducing and Installing Neo4j
    • Chapter 2: Importing Data into Neo4j to Build a Knowledge Graph
  • Part 2 – Exploring and Characterizing Graph Data with Neo4j
    • Chapter 3: Characterizing a Graph Dataset
    • Chapter 4: Using Graph Algorithms to Characterize a Graph Dataset
    • Chapter 5: Visualizing Graph Data
  • Part 3 – Making Predictions on a Graph
    • Chapter 6: Building a Machine Learning Model with Graph Features
    • Chapter 7: Automatically Extracting Features with Graph Embeddings for Machine Learning
    • Chapter 8: Building a GDS Pipeline for Node Classification Model Training
    • Chapter 9: Predicting Future Edges
    • Chapter 10: Writing Your Custom Graph Algorithms with the Pregel API in Java
  • Index
  • Other Books You May Enjoy

Visualizing Graph Data

Graphs are special objects. Unlike images, there is no simple way to visualize them. The preceding chapters have demonstrated how we can extract information from a graph dataset: node importance using centrality metrics (for example, degree) or node clusters with community detection algorithms (for example, the Louvain algorithm). We have also already used some tools to visualize the content of our graph: neodash to draw charts from data stored in Neo4j, and Neo4j Browser, which can draw a graph with nodes and relationships dynamically. Neo4j Browser is very convenient for inspecting the result of a Cypher query, but it is not intended for data analysis visualization. For instance, it does not let us configure node color based on a node property.

In this chapter, we will focus on graph data visualization. We will first learn why it is challenging and what the graph visualization techniques are. We will then create static but customizable images of a graph...

Technical requirements

In order to be able to reproduce the examples given in this chapter, you’ll need the following tools:

  • Neo4j 5.x installed on your computer (see the installation instructions in Chapter 1, Introducing and Installing Neo4j) with the following plugins installed:
    • Awesome Procedures on Cypher (APOC)
    • GDS plugin (version >= 2.2)
  • A Python environment with Jupyter to run notebooks
  • An internet connection to download the plugins and the datasets
  • Any code listed in the book will be available in the associated GitHub repository (https://github.com/PacktPublishing/Graph-Data-Science-with-Neo4j) in the corresponding chapter folder

The complexity of graph data visualization

In order for us to understand why graph visualization is so challenging, we are first going to investigate the easiest networks that can be visualized: physical networks.

Physical networks

By physical networks, I mean networks whose nodes (and sometimes edges) have fixed spatial positions (coordinates). That includes the following:

  • Road networks: Street intersections (nodes) have spatial coordinates (latitude and longitude). Edges (the roads themselves) also have a shape or geometry (linestring) that can be stored using a geospatial data format (shapefile or GeoJSON, for instance) and drawn on a map.
  • Public transport networks: Nodes are bus/train stops with defined positions; edges are the bus paths between these stops.
  • Electric networks: We can imagine these as containing nodes of different types (power station, transformer, consumer, etc.). Each of them also has a precise location, and the distance between them can...

Visualizing a small graph with networkx and matplotlib

When a graph is small enough, such as the ones represented in the previous screenshots (Figures 5.2 and 5.3), it can be convenient to visualize it using the matplotlib plotting library. In this section, we are going to reproduce the visualizations displayed previously.

When dealing with graphs in Python, fortunately, we do not have to create our own data structure and implement our algorithms. As with many other tasks, we can just pip install a package developed by the fantastic open source community around Python. For graphs, the most used package is called networkx. Let’s go ahead and go through our next Jupyter notebook.
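As a minimal sketch of what drawing a small graph with networkx and matplotlib can look like, the following example colors nodes by community. It rests on several assumptions that are not taken from the chapter's notebook: the karate club graph bundled with networkx stands in for the book's data, the communities come from networkx's own Louvain implementation (networkx 2.8 or later) rather than from GDS, and the variable names are illustrative.

```python
import matplotlib.pyplot as plt
import networkx as nx
from networkx.algorithms.community import louvain_communities

# Stand-in dataset (assumption): Zachary's karate club, a 34-node graph
# that ships with networkx.
G = nx.karate_club_graph()

# Community detection with networkx's Louvain implementation.
# The seed makes the partition reproducible between runs.
communities = louvain_communities(G, seed=42)

# Map each node to the index of its community so it can be used as a color value.
community_of = {
    node: index
    for index, members in enumerate(communities)
    for node in members
}
node_colors = [community_of[node] for node in G.nodes()]

# A graph has no intrinsic coordinates, so a layout algorithm computes them.
# Here we use a force-directed (spring) layout with a fixed seed.
pos = nx.spring_layout(G, seed=42)

fig, ax = plt.subplots(figsize=(6, 6))
nx.draw_networkx(
    G,
    pos=pos,
    node_color=node_colors,
    cmap=plt.get_cmap("tab10"),
    with_labels=True,
    ax=ax,
)
ax.set_axis_off()
```

In a Jupyter notebook, the figure is rendered inline at the end of the cell; in a script, it could be written to disk with fig.savefig instead.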

Visualizing a graph with known coordinates

In this section, we are going to draw a graph representing a part of the road network around the Colosseum in Rome. This data was extracted using the osmnx package, but we are not going to detail its extraction process here, even if osmnx makes it...
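Before working with the real road network, the idea of drawing a graph with known coordinates can be sketched on a toy example. Everything in it is an assumption made for illustration: the four intersections, their longitude/latitude values, and the streets between them are invented and are not the data extracted with osmnx. The coordinates are stored in x and y node attributes, which is also how osmnx stores them.

```python
import matplotlib.pyplot as plt
import networkx as nx

# Hypothetical intersections with invented (longitude, latitude) values.
intersections = {
    "A": (12.4910, 41.8900),
    "B": (12.4925, 41.8910),
    "C": (12.4940, 41.8898),
    "D": (12.4922, 41.8888),
}
# Hypothetical street segments connecting the intersections.
streets = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A"), ("B", "D")]

G = nx.Graph()
for name, (lon, lat) in intersections.items():
    G.add_node(name, x=lon, y=lat)
G.add_edges_from(streets)

# No layout algorithm is needed: positions are read from the node attributes.
pos = {node: (data["x"], data["y"]) for node, data in G.nodes(data=True)}

fig, ax = plt.subplots(figsize=(5, 5))
nx.draw_networkx(G, pos=pos, node_size=300, node_color="lightblue", ax=ax)
ax.set_xlabel("longitude")
ax.set_ylabel("latitude")
```

The only difference from a graph without coordinates is the pos argument: instead of the output of a layout function, it receives a dictionary built from the stored positions, so every node is drawn where it actually is.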

Discovering the Neo4j Bloom graph application

Neo4j Bloom is a professional-looking application, distributed by Neo4j, designed specifically for working with data stored in a Neo4j database. In this section, we are going to discover its features and use its ability to configure node display color based on a property to visualize graph communities.

What is Bloom?

What can Bloom do for us? It can do all of the following:

  • Graph querying:
    • By label
    • By path
    • With a parametrized Cypher query
  • Graph visualization:
    • Filtered nodes, customized sizes, color, displayed properties, and so on
  • Graph editing:
    • Adding nodes and relationships
    • Adding/editing properties
  • Graph exploration:
    • Shortest path
    • GDS integration

In this chapter, I’ll set aside the graph editing functionalities to focus on the graph querying and graph data visualization features, with a short tour of the GDS integration.

Bloom installation

If you are using Neo4j Desktop, chances are that it is already installed...

Visualizing large graphs with Gephi

Gephi is a powerful open source graph visualization tool, able to deal with very large graphs. In the following sections, we are going to install Gephi and the required plugins, set up our Neo4j database, and draw a graph using this software.

Installing Gephi and its required plugin

In order to install Gephi and the plugins we need to connect it with Neo4j, follow these steps:

  1. Download Gephi from https://gephi.org/.
  2. To start it, follow your OS-specific instructions. For Linux, the following commands should work (make the necessary changes depending on the version of Gephi you downloaded):
    cd Downloads/
    tar xzvf gephi-0.9.7-linux-x64.tar.gz
    cd gephi-0.9.7/bin
    ./gephi
  3. Install the streaming plugin from the Gephi UI.
  4. Open the plugins wizard from the Tools | Plugins menu.
  5. Go to the Available Plugins tab.
  6. Search for the Graph Streaming plugin in the list and select it.
  7. Click Install.

The following screenshot...

Summary

In this chapter, we’ve explored a few helpful tools for graph data visualization. First, networkx has helped us visualize relatively small graphs in a Jupyter notebook. We have explored the challenge of graph data visualization and learned about graph layout. In the second part, we have used another great tool, part of the Neo4j ecosystem, called Neo4j Bloom. It has many features that allow us to work with graph data stored in Neo4j without writing any Cypher query. We have focused on how to customize the appearance of the graph, choosing the node’s color and size.

Finally, we have discovered a very powerful tool we have to know about when dealing with GDS: Gephi. Here, again, we have focused on node appearance configuration.

In all cases, you are highly encouraged to dig deeper into these tools by yourself, using your own data and/or exploring the features we can’t talk about in this book (unless we double its length, but then nobody would...

Further reading

To dig deeper into the concepts covered in this chapter, you can start with the following resources:

Exercises

To practice what you have learned in this chapter, you can use the following ideas to explore your data:

  1. Try to build the street network of your own location. For this, you will need to find the central location coordinates (which you can do using Google Maps, for instance) and update the Geospatial_Network_Creation notebook.
  2. In Neo4j Bloom, use the Filter toolbox to visualize only the nodes in the biggest Louvain community (use Cypher to find out the ID of the biggest community).
  3. Still in Bloom, configure the node color to be a function of its degree (the value stored in the degree property).