You're reading from  Learning Neo4j

Product type: Book
Published in: Aug 2014
Reading level: Beginner
ISBN-13: 9781849517164
Edition: 1st Edition
Author: Rik Van Bruggen

Rik Van Bruggen is the VP of Sales for Neo Technology for Benelux, UK, and the Nordic region. He has been working for startup companies for most of his career, including eCom Interactive Expertise, SilverStream Software, Imprivata, and Courion. While he has an interest in technology, his real passion is business and how to make technology work for a business. He lives in Antwerp, Belgium, with his wife and three lovely kids, and enjoys technology, orienteering, jogging, and Belgian beer.

Chapter 4. Modeling Data for Neo4j

In this chapter, we will get started with some graph database modeling in Neo4j. As this type of modeling can be quite different from what we are typically used to with our relational database backgrounds, we will start by explaining the fundamental constructs first and then explore some recommended approaches.

We will cover the following topics in this chapter:

  • Modeling principles and how-to's

  • Modeling pitfalls and best practices

The four fundamental data constructs


As you may already know by now, graph theory gives us many different graphs to work with. Graphs come in many different shapes and sizes, and therefore, Neo4j needed to choose a very specific type of data structure that is flexible enough to support the versatility required by real-world datasets. This is why the underlying data model of Neo4j, the labeled property graph, is one of the most generic and versatile of all graph models.

This graph data model gives us four different fundamental building blocks to structure and store our data. Let's go through them:

Figure: The labeled property graph model

  • Nodes: These are typically used to store entity information. In the preceding example, these are the individual books, readers, and authors that are present in the library data model.

  • Relationships: These are used to connect nodes to one another explicitly and therefore provide a means of structuring your entities. They are the equivalent of an explicitly stored, and...
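To make these building blocks more concrete, here is a minimal in-memory sketch of a labeled property graph in Python. It is only an illustration of the concepts, not how Neo4j stores data internally. The class names, the relationship types (`HAS_READ`, `WROTE`), and the sample library data are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An entity such as a book, reader, or author."""
    node_id: int
    labels: set = field(default_factory=set)        # e.g. {"Book"}
    properties: dict = field(default_factory=dict)  # key/value pairs


@dataclass
class Relationship:
    """An explicit, directed, typed connection between two nodes."""
    start: Node
    end: Node
    rel_type: str
    properties: dict = field(default_factory=dict)


# Hypothetical library data: three nodes, each with one label.
book = Node(1, {"Book"}, {"title": "Learning Neo4j"})
reader = Node(2, {"Reader"}, {"name": "Alice"})
author = Node(3, {"Author"}, {"name": "Rik Van Bruggen"})

# Relationships connect the nodes and can carry properties of their own.
relationships = [
    Relationship(reader, book, "HAS_READ", {"year": 2014}),
    Relationship(author, book, "WROTE"),
]


def neighbours(node, rels, rel_type):
    """Follow outgoing relationships of one type from a node."""
    return [r.end for r in rels if r.start is node and r.rel_type == rel_type]


books_read = neighbours(reader, relationships, "HAS_READ")
```

Here `books_read` holds the single `Book` node that the reader is connected to. All four constructs appear in the sketch: nodes, relationships, properties on both, and labels on nodes.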

How to start modeling for graph databases


In this section, we will spend some time going through what a graph database model is. Specifically, we would like to clarify a common misunderstanding that stems from our familiarity with relational database systems.

What we know – ER diagrams and relational schemas

In a relational system, we have been taught to start out modeling with an Entity-Relationship diagram. Using these techniques, we can start from a problem/domain description (what we call a user story in today's agile development methodologies) and extract the meaningful entities and relationships. We will come back to this later, but essentially, we usually find that from such a domain description, we can:

  • Extract the entities by looking at the nouns of the description

  • Extract the properties by looking at the adjectives of the description

  • Extract the relationships by looking at the operating verbs in the description

These are, of course, generic guidelines that will need to be tried...

A graph model – a simple, high-fidelity model of reality


Let's take a quick look at how we can avoid the complexity mentioned previously in the graph world. In the following figure, you will find the graph model and the relational model side by side:

Figure: The relational model versus the graph model

On the right-hand side of the image, you will see the three tables in the relational model:

  • A customers table with a number of customer records

  • An accounts table with a number of accounts belonging to these customers

  • A typical join table that links customers to accounts

What is important here is the implication of this construction: every single time we want to find the accounts of a customer, we need to perform the following:

  1. Look up the customer by their key in the customers table.

  2. Use this key to join the customer to their accounts through the join table.

  3. Look up the customer's accounts in the accounts table using the account keys that we found in the previous step.
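The three steps above can be sketched in a few lines of Python, with plain dictionaries and a list standing in for the three tables. The table contents, key values, and the `accounts_of` function are hypothetical and serve only to show the shape of the lookup. A real relational database would use indexes rather than the simple scan shown in step 2.

```python
# Hypothetical tables, modelled as plain Python structures.
customers = {1: {"name": "Alice"}, 2: {"name": "Bob"}}
accounts = {10: {"balance": 250}, 11: {"balance": 40}, 12: {"balance": 900}}
# Join table: one (customer_key, account_key) row per link.
customer_accounts = [(1, 10), (1, 12), (2, 11)]


def accounts_of(customer_key):
    # Step 1: look up the customer by key in the customers table.
    customer = customers[customer_key]
    # Step 2: go through the join table to collect this customer's account keys.
    account_keys = [a for (c, a) in customer_accounts if c == customer_key]
    # Step 3: look up each account in the accounts table by its key.
    return customer["name"], [accounts[k] for k in account_keys]


name, found = accounts_of(1)
```

Every call to `accounts_of` repeats all three steps, which is the cost that the join table construction implies.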

Contrast this with the left-hand side of the figure, and you will see that...
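As a rough illustration of the graph side, the following sketch assumes that each customer node holds direct relationships to its account nodes. The `Node` class, the `OWNS` relationship type, and the sample data are hypothetical, and the code is an in-memory analogy rather than a description of how Neo4j works internally.

```python
class Node:
    def __init__(self, label, **properties):
        self.label = label
        self.properties = properties
        self.outgoing = []  # list of (relationship_type, target_node) pairs

    def relate(self, rel_type, target):
        """Store an explicit relationship from this node to the target node."""
        self.outgoing.append((rel_type, target))


# Hypothetical data: one customer connected directly to two accounts.
alice = Node("Customer", name="Alice")
checking = Node("Account", balance=250)
savings = Node("Account", balance=900)
alice.relate("OWNS", checking)
alice.relate("OWNS", savings)

# Finding the accounts means following the stored relationships;
# there is no join table and no second key lookup.
alice_accounts = [target for (rel_type, target) in alice.outgoing
                  if rel_type == "OWNS"]
```

In this sketch, the connection between a customer and an account is stored once, as part of the data, instead of being recomputed from keys on every query.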

Graph modeling – best practices and pitfalls


In this section, we will give an overview of generic recommendations and best practices for graph database modeling, and we will also provide some insight into common pitfalls to avoid. All of these recommendations are generic, and there may be exceptions to them in your specific domain, just as there were for your relational database design models.

Graph modeling best practices

In the upcoming sections, we will share and discuss several practices that have been applied successfully in a number of Neo4j projects.

Design for query-ability

As with any database management system, but perhaps even more so for a graph database management system such as Neo4j, your queries will drive your model. What we mean by this is that, exactly as with any type of database that you may have used in the past or may still be using today, you...

Test questions


Q1. The four fundamental data constructs of Neo4j are:

  1. Table, record, field, and constraint

  2. Node, relationship, property, and schema

  3. Node, relationship, property, and label

  4. Document, relationship, property, and collection

Q2. Normalization is expensive in a graph database model.

  1. True

  2. False

Q3. If you have a few entities in your dataset that have lots of relationships to other entities, then you can't use a graph database because of the dense node problem.

  1. True—you will have to use a relational system

  2. True—but there is no alternative, so you will have to live with it

  3. False—you can still use a graph database but it will be painfully slow for all queries

  4. False—you can very effectively use a graph database, but you should take precautions, for example, applying a fan-out pattern to your data

Summary


In this chapter, we discussed a number of topics that will help you get started when modeling your domain for a graph database management system. We talked about the fundamental building blocks of the model, compared and contrasted this with the way we do things in a relational database management system, and then discussed some often recurring patterns, both good and bad, for doing the modeling work.

With the model behind us, we can now start tackling specific business problems using Neo4j. In the next chapter, we will start discussing the different data import strategies that will fill the Neo4j database with domain-specific datasets.
