You're reading from Learning Neo4j 3.x - Second Edition

Product type: Book
Published: Oct 2017
Publisher: Packt
ISBN-13: 9781786466143
Pages: 316
Edition: 2nd Edition
Languages: Java
Concepts: Databases
Author: Jerome Baton

Chapter 3. Modeling Data for Neo4j

In this chapter, we will get started with some graph database modeling in Neo4j. As this type of modeling can be quite different from what we are typically used to with our relational database backgrounds, we will start by explaining the fundamental constructs first and then move on to explore some recommended approaches.

We will cover the following topics in this chapter:

The four fundamental data constructs
A graph model--a simple, high-fidelity model of reality
Modeling pitfalls and best practices

The four fundamental data constructs

As you may already know, graph theory gives us many different types of graphs to work with. Graphs come in many shapes and sizes, so Neo4j needed to choose a specific type of data structure that is flexible enough to support the versatility required by real-world datasets. This is why the underlying data model of Neo4j, the labeled property graph, is one of the most generic and versatile of all graph models.

This graph data model gives us four different fundamental building blocks to structure and store our data. Let's go through them:

Figure: The labeled property graph model

Nodes: These are typically used to store entity information. In the preceding example, these are individual books, readers, and authors that are present in the library data model.
Relationships: These are used to connect nodes to one another explicitly and therefore provide a means of structuring your entities. They are the equivalent of an explicitly stored and precalculated...
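To make these building blocks a little more concrete, here is a minimal in-memory sketch in Python. It is not how Neo4j stores data, and it assumes the usual reading of the labeled property graph: nodes carry labels and properties, and relationships are directed, typed, and can carry properties of their own. The class names, the `neighbours` helper, the relationship types `WROTE` and `READ`, and the sample library data are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An entity such as a book, a reader, or an author."""
    labels: set                                      # labels group nodes, e.g. {"Book"}
    properties: dict = field(default_factory=dict)   # key-value pairs on the node


@dataclass
class Relationship:
    """An explicit, directed, typed connection between two nodes."""
    start: Node
    rel_type: str
    end: Node
    properties: dict = field(default_factory=dict)   # relationships can hold properties too


# Hypothetical library data, invented for this sketch.
author = Node({"Author"}, {"name": "Example Author"})
book = Node({"Book"}, {"title": "Example Title"})
reader = Node({"Reader"}, {"name": "Example Reader"})

relationships = [
    Relationship(author, "WROTE", book),
    Relationship(reader, "READ", book, {"rating": 4}),
]


def neighbours(node, rel_type):
    """Follow the outgoing relationships of one type from a node."""
    return [r.end for r in relationships
            if r.start is node and r.rel_type == rel_type]
```

Calling `neighbours(reader, "READ")` walks from the reader to the book through the stored relationship, without looking anything up in a separate linking structure.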

How to start modeling for graph databases

In this section, we will spend some time going through what a graph database model is. Specifically, we would like to clarify a common misunderstanding that stems from our familiarity with relational database systems.

What we know – ER diagrams and relational schemas

In a relational system, we have been taught to start our modeling with an Entity-Relationship (ER) diagram. Using this technique, we can start from a problem or domain description (what we call a user story in today's agile development methodologies) and extract the meaningful entities and relationships. We will come back to this later, but essentially, from such a domain description we can usually do the following:

Extract the entities by looking at the nouns of the description
Extract the properties by looking at the adjectives of the description
Extract the relationships by looking at the operating verbs in the description

These are, of course, generic guidelines that...

A graph model – a simple, high-fidelity model of reality

Let's take a quick look at how we can avoid the complexity mentioned previously in the graph world. In the following figure, you will find the graph model and relational model side by side:

Figure: The relational model versus the graph model

On the right-hand side of the image, you will see the three tables in the relational model:

A Customers table with a number of customer records
An Accounts table with a number of accounts belonging to these customers
A typical join table that links customers to accounts

What is important here is the implication of this construction--every single time we want to find the accounts of a customer, we need to perform the following:

Look up the customer by their key in the customers table.
Use this key in the join table to find the keys of the customer's accounts.
Look up the customer's accounts in the accounts table using the account keys that we found in the previous step.
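The three steps above can be sketched with plain Python dictionaries standing in for the tables. This is only an illustration under stated assumptions: the table contents, the `(customer_id, account_id)` layout of the join table, and the `OWNS` relationship name are all hypothetical, and a real relational engine would use indexes rather than the linear scan shown in step 2.

```python
# Hypothetical sample data standing in for the three relational tables.
customers = {1: {"name": "Customer A"}, 2: {"name": "Customer B"}}
accounts = {10: {"balance": 100}, 11: {"balance": 250}, 12: {"balance": 75}}
customer_accounts = [(1, 10), (1, 11), (2, 12)]  # join table: (customer_id, account_id)


def accounts_of(customer_id):
    """Relational style: three lookups every time we ask the question."""
    # Step 1: look up the customer by their key in the customers table.
    customer = customers[customer_id]
    # Step 2: use that key in the join table to find the account keys.
    account_ids = [a for (c, a) in customer_accounts if c == customer_id]
    # Step 3: look up each account in the accounts table by its key.
    return customer["name"], [accounts[a] for a in account_ids]


# Graph style (assumed shape): the customer node holds direct references
# to its account nodes through a hypothetical OWNS relationship.
graph_customer = {"name": "Customer A", "OWNS": [accounts[10], accounts[11]]}


def accounts_of_node(customer_node):
    """Graph style: follow the stored relationships from the node."""
    return customer_node["OWNS"]
```

Both functions return the same accounts for the same customer. The difference is that the first one recomputes the connection through the join table on every call, whereas the second one simply follows connections that were stored with the customer.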

Compare this with the left-hand side of the figure and you will see that the model...

Graph modeling – best practices and pitfalls

In this section, we will give you an overview of the generic recommendations and best practices for graph database modeling, and we will also provide some insight into common pitfalls to avoid. It goes without saying that all of these recommendations are generic and that there may be exceptions to these rules in your specific domain, just as there were for your relational database models.

Graph modeling best practices

In the upcoming sections, we will discuss several practices that have been applied successfully in a number of Neo4j projects.

Designing for query-ability

As with any database management system, but perhaps even more so for a graph database management system such as Neo4j, your queries will drive your model. What we mean by this is that, exactly as with any type of database that you may have used in the past or may still be using today, you will need to make...

Test questions

Q1. The four fundamental data constructs of Neo4j are:

Table, record, field, and constraint
Node, relationship, property, and schema
Node, relationship, property, and label
Document, relationship, property, and collection

Q2. Normalization is expensive in a graph database model.

True
False

Q3. If you have a few entities in your dataset that have lots of relationships to other entities, then you can't use a graph database because of the dense node problem.

True--you will have to use a relational system.
True--but there is no alternative, so you will have to live with it.
False--you can still use a graph database but it will be painfully slow for all queries.
False--you can very effectively use a graph database, but you should take precautions, such as applying a fan-out pattern to your data.

Summary

In this chapter, we discussed a number of topics that will help you get started when modeling your domain for a graph database management system. We talked about the fundamental building blocks of the model, compared and contrasted them with the way we do things in a relational database management system, and then discussed some frequently recurring patterns, both good and bad, for the modeling work.

You should also know that modeling is best done as a team in front of a whiteboard, and that you should include non-technical people as well.

With the model behind us, we can now start tackling more technical matters such as Neo4j's fantastic query language: Cypher.