Search icon
Arrow left icon
All Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletters
Free Learning
Arrow right icon
Learning Neo4j
Learning Neo4j

Learning Neo4j: Run blazingly fast queries on complex graph datasets with the power of the Neo4j graph database

By Rik Van Bruggen
$54.99
Book Aug 2014 222 pages 1st Edition
eBook
$32.99 $22.99
Print
$54.99
Subscription
$15.99 Monthly
eBook
$32.99 $22.99
Print
$54.99
Subscription
$15.99 Monthly

What do you get with Print?

Product feature icon Instant access to your digital eBook copy whilst your Print order is Shipped
Product feature icon Black & white paperback book shipped to your address
Product feature icon Download this book in EPUB and PDF formats
Product feature icon Access this title in our online reader with advanced features
Product feature icon DRM FREE - Read whenever, wherever and however you want
Buy Now

Product Details


Publication date : Aug 25, 2014
Length 222 pages
Edition : 1st Edition
Language : English
ISBN-13 : 9781849517164
Category :
Languages :
Concepts :
Table of content icon View table of contents Preview book icon Preview Book

Learning Neo4j

Chapter 1. Graphs and Graph Theory – an Introduction

People have different ways of learning new topics. We know that background information can contribute greatly to a better understanding of new topics. That is why, in this chapter of our Learning Neo4j book, we will start with quite a bit of background information, not to recount the tales of history, but to give you the necessary context that can lead to a better understanding of topics.

In order to do so, we will address the following topics:

  • Graphs: What they are and where they came from. This section will aim to set the record straight on what exactly our subject will contain and what it won't.

  • Graph theory: What it is and what it is used for. This section will give you quite a few examples of graph theory applications, and it will also start hinting at applications for graph databases such as Neo4j later on.

So, let's dig right in.

Introduction to and history of graphs


Many people might have used the word graph at some point in their professional or personal lives. However, chances are that they did not use it in the way that we will be using it in this book. Most people—obviously not you, my dear reader, otherwise you probably would not have picked up this book—actually think about something very different when talking about a graph. They think about pie charts and bar charts. They think about graphics, not graphs.

In this book, we will be working with a completely different type of subject—the graphs that you might know from your math classes. I, for once, distinctly remember being taught the basics of discrete Mathematics in one of my university classes, and I also remember finding it terribly complex and difficult to work with. Little did I know that my later professional career will use these techniques in a software context, let alone that I would be writing a book on this topic.

So, what are graphs? To explain this, I think it is useful to put a little historic context around the concept. Graphs are actually quite old as a concept. They were invented, or at least first described, in an academic paper by the well-known Swiss mathematician Leonhard Euler. He was trying to solve an age-old problem that we now know as the 7 bridges of Königsberg. The problem at hand was pretty simple to understand.

Königsberg has a beautiful medieval city in the Prussian empire, situated on the river Pregel. It is located between Poland and Lithuania in today's Russia. If you try to look it up on any modern-day map, you will most likely not find it as it is currently known as Kaliningrad. The Pregel not only cut Königsberg into a left- and right-bank side of the city, but it also created an island in the middle of the river, which was known as the Kneiphof. The result of this peculiar situation was a city that was cut into four parts. We will refer to them as A, B, C and D, which were connected by seven bridges (labeled a, b, c, d, e, f, and g in the following diagram).This gives us the following situation:

  • The seven bridges are connected to the four different parts of the city

  • The essence of the problem that people were trying to solve was to take a tour of the city, visiting every one of its parts and crossing every single one of its bridges, without having to walk a single bridge or street twice

In the following diagram, you can see how Euler illustrated this problem in his original 1736 paper:

Illustration of the mentioned problem as mentioned by Euler in his paper in 1736

Essentially, it was a pathfinding problem, like there are many others (for example, the knight's ride problem or the travelling salesman problem). It does not seem like a very difficult assignment at all now does it? However, at the time, people really struggled with it and were trying to figure it out for the longest time. It was not until Euler got involved and took a very different, mathematical approach to the problem that it got solved once and for all.

Euler did the following two things that I find really interesting:

  1. First and foremost, he decided not to take the traditional brute force method to solve the problem (that is, in this case, drawing a number of different route options on the map and trying to figure out—essentially by trial and error—if there was such a route through the city), but decided to do something different. He took a step back and took a different look at the problem by creating what I call an abstract version of the problem at hand, which is essentially a model of the problem domain that he was trying to work with. In his mind at least, Euler must have realized that the citizens of Königsberg were focusing their attention on the wrong part of the problem—the streets. Euler quickly came to the conclusion that the streets of Königsberg really did not matter to find a solution to the problem. The only things that mattered for his pathfinding operation were the following:

    • The parts of the city

    • The bridges connecting the parts of the city

    Now, all of a sudden, we seem to have a very different problem at hand, which can be accurately represented in what is often regarded as "the world's first graph":

  2. Secondly, Euler solved the puzzle at hand by applying a mathematical algorithm on the model that he created. Euler's logic was simple: if I want to take a walk in the town of Königsberg, then:

    • I will have to start somewhere in any one of the four parts of the city

    • I will have to leave that part of the city; in other words, cross one of the bridges to go to another part of the city

    • I will then have to cross another five bridges, leaving and entering different parts of the city

    • Finally, I will end the walk through Königsberg in another part of the city

Therefore, Euler argues, the case must be that the first and last parts of the city have an odd number of bridges that connect them to other parts of the city (because you leave from the first part and you arrive at the last part of the city), but the other two parts of the city must have an even number of bridges connecting them to the first and last parts of the city because you will arrive and leave from these parts of the city.

This "number of bridges connecting the parts of the city" has a very special meaning in the model that Euler created, the graph representation of the model. We call this the degree of the nodes in the graph. In order for there to be a path through Königsberg that only crossed every bridge once, Euler proved that all he had to do was to apply a very simple algorithm that will establish the degree (in other words, count the number of bridges) of every part of the city. This is shown in the following diagram:

This is how Euler solved the famous "Seven bridges of Königsberg" problem. By proving that there was no part of the city that had an even number of bridges, he also proved that the required walk in the city cannot be done. Adding one more bridge would immediately make it possible, but with the current state of the city, and its bridges at the time, there was no way one could take such an Eulerian Walk of the city. By doing so, Euler created the world's first graph. The concepts and techniques of his research, however, are universally applicable; in order to do such a walk on any graph, the graph must have zero or two vertices with an odd degree, and all intermediate vertices must have an even degree.

To summarize, a graph is nothing more than an abstract, mathematical representation of two or more entities, which are somehow connected or related to each other. Graphs model pairwise relations between objects. They are, therefore, always made up of the following components:

  • The nodes of the graph, usually representing the objects mentioned previously: In math, we usually refer to these structures as vertices; but for this book, and in the context of graph databases such as Neo4j, we will always refer to vertices as nodes.

  • The links between the nodes of the graph: In math, we refer to these structures as edges, but again, for the purpose of this book, we will refer to these links as relationships.

  • The structure of how nodes and relationships are connected to each other makes a graph: Many important qualities, such as the number of edges connected to a node, what we referred to as degree, can be assessed. Many other such indicators also exist.

Now that we have graphs and understand a bit more about their nature and history, it's time to look at the discipline that was created on top of these concepts, often referred to as the graph theory.

Definition and usage of graph theory


When Euler invented the first graph, he was trying to solve a very specific problem of the citizens of Königsberg, with a very specific representation/model and a very specific algorithm. It turns out that there are quite a few problems that can be:

  • Described using the graph metaphor of objects and pairwise relations between these objects

  • Solved by applying a mathematical algorithm to this structure

The mechanism is the same, and the scientific discipline that studies these modeling and solution patterns, using graphs, is often referred to as the graph theory, and it is considered to be a part of discrete Mathematics.

There are lots of different types of graphs that have been analyzed in this discipline, as you can see from the following diagram.

Graph theory, the study of graph models and algorithms, has turned out to be a fascinating field of study, which has been used in many different disciplines to solve some of the most interesting questions facing mankind. Interestingly enough, it has seldom really been applied with rigor in the different fields of science that can benefit from it; maybe scientists today don't have the multidisciplinary approach required (providing expertise from graph theory and their specific field of study) to do so.

So, let's talk about some of these fields of study a bit, without wanting to give you an exhaustive list of all applicable fields. Still, I do believe that some of these examples will be of interest for our future discussions in this book and work up an appetite for what types of applications we will use a graph-based database such as Neo4j for.

Social studies

For the longest time, people have understood that the way humans interact with one another is actually very easy to describe in a network. People interact with people every day. People influence one another every day. People exchange ideas every day. As they do, these interactions cause ripple effects through the social environment that they inhabit. Modeling these interactions as a graph has been of primary importance to better understand global demographics, political movements, and—last but not least—commercial adoption of certain products by certain groups. With the advent of online social networks, this graph-based approach to social understanding has taken a whole new direction. Companies such as Google, Facebook, Twitter, LinkedIn (see the following diagram featuring a visualization of my LinkedIn network), and many others have undertaken very specific efforts to include graph-based systems in the way they target their customers and users, and in doing so, they have changed many of our daily lives quite fundamentally.

Biological studies

We sometimes say it in marketing taglines: "Graphs Are Everywhere". When we do so, we are actually describing reality in a very real and fascinating way. Also, in this field, researchers have known for quite some time that biological components (proteins, molecules, genes, and so on) and their interactions can accurately be modeled and described by means of a graph structure, and doing so yields many practical advantages. In metabolic pathways (see the following diagram for the human metabolic system), for example, graphs can help us to understand how the different parts of the human body interact with each other. In metaproteomics, researchers analyze how different kinds of proteins interact with one another and are used in order to better steer chemical and biological production processes.

A diagram representing the human metabolic system

Computer science

Some of the earliest computers were built with graphs in mind. Graph Compute Engines solved scheduling problems for railroads as early as the late 19th century, and the usage of graphs in computer science has only accelerated since then. In today's applications, the use cases vary from chip design, network management, recommendation systems, and UML modeling to algorithm generation and dependency analysis. The following is an example of such a UML diagram:

An example of a UML diagram

The latter is probably one of the more interesting use cases. Using pathfinding algorithms, software and hardware engineers have been analyzing the effects of changes in the design of their artifacts on the rest of the system. If a change is made to one part of the code, for example, a particular object is renamed; the dependency analysis algorithms can easily walk the graph of the system to find out what other classes will be affected by the former change.

Flow problems

Another really interesting field of graph theory applications is flow problems, also known as maximum flow problems. In essence, this field is part of a larger field of optimization problems, which is trying to establish the best possible path across a flow network. Flow networks are a type of graph in which the nodes/vertices of the graph are connected by relationships/edges that specify the capacity of that particular relationship. Examples can be found in fields such as telecom networks, gas networks, airline networks, package delivery networks, and many others, where graph-based models are then used in combination with complex algorithms. The following diagram is an example of such a network, as you can find it on http://enipedia.tudelft.nl/.

An example of a flow network

These algorithms are then used to identify the calculated optimal path, find bottlenecks, plan maintenance activities, conduct long-term capacity planning, and many other operations.

Route problems

The original problem that Euler set out to solve in 18th century Königsberg was in fact a route planning / pathfinding problem. Today, many graph applications leverage the extraordinary capability of graphs and graph algorithms to calculate—as opposed to finding with trial and error—the optimal route between two nodes on a network. In the following diagram, you will find a simple route planning example as a graph:

A simple route planning example between cities to choose roads versus highways

A very simple example will be from the domain of logistics. When trying to plan for the best way to get a package from one city to another, one will need the following:

  1. A list of all routes available between the cities

  2. The most optimal of the available routes, which depends on various parameters in the network, such as capacity, distance, cost, CO2 exhaust, speed, and so on

This type of operation is a very nice use case for graph algorithms. There are a couple of very well-known algorithms that we can briefly highlight:

  • The Dijkstra algorithm: This is one of the best-known algorithms to calculate the shortest weighted path between two points in a graph, using the properties of the edges as weights or costs of that link.

  • The A* (A-star) algorithm: This is a variation of Dijkstra's original ideas, but it uses heuristics to predict more efficiently the shortest path explorations. As A* explores potential graph paths, it holds a sorted priority queue of alternate path segments along the way, since it calculates the "past path" cost and the "future path" cost of the different options that are possible during the route exploration.

Depending on the required result, the specific dataset, and the speed requirements, different algorithms will yield different returns.

Web search

No book chapter treating graphs and graph theory—even at the highest level—will be complete without mentioning one of the most powerful and widely-used graph algorithms on the planet, PageRank. PageRank is the original graph algorithm, invented by Google founder Larry Page in 1996 at Stanford University, to provide better web search results. For those of us old enough to remember the early days of web searching (using tools such as Lycos, AltaVista, and so on), it provided a true revolution in the way the Web was made accessible to end users. The following diagram represents the PageRank graph:

The older tools did keyword matching on web pages, but Google revolutionized this by no longer focusing on keywords alone, but by doing link analysis on the hyperlinks between different web pages. PageRank, and many of the other algorithms that Google uses today, assumes that more important web pages, which should appear higher in your search results, will have more incoming links from other pages, and therefore, it is able to score these pages by analyzing the graph of links to the web page. History has shown us the importance of PageRank. Not only has Google, Inc. built quite an empire on top of this graph algorithm, but its principles have also been applied to other fields such as cancer research and chemical reactions.

Test questions


Q1. Graph theory is a very recent field in modern Mathematics, invented in the late 20th century by Leonard Euler:

  1. True

  2. False

Q2. Name one field that graphs are NOT used for in today's science/application fields:

  1. Route problems

  2. Social studies

  3. Accounting systems

  4. Biological studies

Q3. Graphs are a very niche phenomenon that can only be applied to a very limited set of applications/research fields:

  1. True

  2. False

Summary


In the first chapter of this book, we wanted to give you a first look at some of the concepts that underpin the subject of this book, the graph database Neo4j. We introduced the history of graphs, explained some of the principles that are being explored in the fascinating mathematical field of graph theory, and provided some examples of other academic and functional domains that have been benefiting from this rich, century-long history. The conclusion of this is plain and simple: Graphs Are Everywhere. Much of our world is in reality dependent on and related to many other things—it is densely connected, as we call it in graph terms. This of course has implications on how we work with the reality in our computer systems, how we store the data that describes reality in a database management system, and how we interact with the system in different kinds of applications.

In the next chapter, we will start applying this context to the specific part of computer science that deals with graph structures in the field of database management systems.

Left arrow icon Right arrow icon

Key benefits

What you will learn

Background and specifications of graph databases Install Neo4j on a variety of different platforms, locally and in the cloud Model data for a graph database such as Neo4j Import data into Neo4j Learn about sample use cases for Neo4j Discover the advantages of graph databases versus other database models Find out where you can find additional information on Neo4j

What do you get with Print?

Product feature icon Instant access to your digital eBook copy whilst your Print order is Shipped
Product feature icon Black & white paperback book shipped to your address
Product feature icon Download this book in EPUB and PDF formats
Product feature icon Access this title in our online reader with advanced features
Product feature icon DRM FREE - Read whenever, wherever and however you want
Buy Now

Product Details


Publication date : Aug 25, 2014
Length 222 pages
Edition : 1st Edition
Language : English
ISBN-13 : 9781849517164
Category :
Languages :
Concepts :

Table of Contents

18 Chapters
Learning Neo4j Chevron down icon Chevron up icon
Credits Chevron down icon Chevron up icon
About the Author Chevron down icon Chevron up icon
About the Reviewers Chevron down icon Chevron up icon
www.PacktPub.com Chevron down icon Chevron up icon
Preface Chevron down icon Chevron up icon
Graphs and Graph Theory – an Introduction Chevron down icon Chevron up icon
Graph Databases – Overview Chevron down icon Chevron up icon
Getting Started with Neo4j Chevron down icon Chevron up icon
Modeling Data for Neo4j Chevron down icon Chevron up icon
Importing Data into Neo4j Chevron down icon Chevron up icon
Use Case Example – Recommendations Chevron down icon Chevron up icon
Use Case Example – Impact Analysis and Simulation Chevron down icon Chevron up icon
Visualizations for Neo4j Chevron down icon Chevron up icon
Other Tools Related to Neo4j Chevron down icon Chevron up icon
Where to Find More Information Related to Neo4j Chevron down icon Chevron up icon
Getting Started with Cypher Chevron down icon Chevron up icon
Index Chevron down icon Chevron up icon

Customer reviews

Filter icon Filter
Top Reviews
Rating distribution
Empty star icon Empty star icon Empty star icon Empty star icon Empty star icon 0
(0 Ratings)
5 star 0%
4 star 0%
3 star 0%
2 star 0%
1 star 0%

Filter reviews by


No reviews found
Get free access to Packt library with over 7500+ books and video courses for 7 days!
Start Free Trial

FAQs

What is the delivery time and cost of print book? Chevron down icon Chevron up icon

Shipping Details

USA:

'

Economy: Delivery to most addresses in the US within 10-15 business days

Premium: Trackable Delivery to most addresses in the US within 3-8 business days

UK:

Economy: Delivery to most addresses in the U.K. within 7-9 business days.
Shipments are not trackable

Premium: Trackable delivery to most addresses in the U.K. within 3-4 business days!
Add one extra business day for deliveries to Northern Ireland and Scottish Highlands and islands

EU:

Premium: Trackable delivery to most EU destinations within 4-9 business days.

Australia:

Economy: Can deliver to P. O. Boxes and private residences.
Trackable service with delivery to addresses in Australia only.
Delivery time ranges from 7-9 business days for VIC and 8-10 business days for Interstate metro
Delivery time is up to 15 business days for remote areas of WA, NT & QLD.

Premium: Delivery to addresses in Australia only
Trackable delivery to most P. O. Boxes and private residences in Australia within 4-5 days based on the distance to a destination following dispatch.

India:

Premium: Delivery to most Indian addresses within 5-6 business days

Rest of the World:

Premium: Countries in the American continent: Trackable delivery to most countries within 4-7 business days

Asia:

Premium: Delivery to most Asian addresses within 5-9 business days

Disclaimer:
All orders received before 5 PM U.K time would start printing from the next business day. So the estimated delivery times start from the next day as well. Orders received after 5 PM U.K time (in our internal systems) on a business day or anytime on the weekend will begin printing the second to next business day. For example, an order placed at 11 AM today will begin printing tomorrow, whereas an order placed at 9 PM tonight will begin printing the day after tomorrow.


Unfortunately, due to several restrictions, we are unable to ship to the following countries:

  1. Afghanistan
  2. American Samoa
  3. Belarus
  4. Brunei Darussalam
  5. Central African Republic
  6. The Democratic Republic of Congo
  7. Eritrea
  8. Guinea-bissau
  9. Iran
  10. Lebanon
  11. Libiya Arab Jamahriya
  12. Somalia
  13. Sudan
  14. Russian Federation
  15. Syrian Arab Republic
  16. Ukraine
  17. Venezuela
What is custom duty/charge? Chevron down icon Chevron up icon

Customs duty are charges levied on goods when they cross international borders. It is a tax that is imposed on imported goods. These duties are charged by special authorities and bodies created by local governments and are meant to protect local industries, economies, and businesses.

Do I have to pay customs charges for the print book order? Chevron down icon Chevron up icon

The orders shipped to the countries that are listed under EU27 will not bear custom charges. They are paid by Packt as part of the order.

List of EU27 countries: www.gov.uk/eu-eea:

A custom duty or localized taxes may be applicable on the shipment and would be charged by the recipient country outside of the EU27 which should be paid by the customer and these duties are not included in the shipping charges been charged on the order.

How do I know my custom duty charges? Chevron down icon Chevron up icon

The amount of duty payable varies greatly depending on the imported goods, the country of origin and several other factors like the total invoice amount or dimensions like weight, and other such criteria applicable in your country.

For example:

  • If you live in Mexico, and the declared value of your ordered items is over $ 50, for you to receive a package, you will have to pay additional import tax of 19% which will be $ 9.50 to the courier service.
  • Whereas if you live in Turkey, and the declared value of your ordered items is over € 22, for you to receive a package, you will have to pay additional import tax of 18% which will be € 3.96 to the courier service.
How can I cancel my order? Chevron down icon Chevron up icon

Cancellation Policy for Published Printed Books:

You can cancel any order within 1 hour of placing the order. Simply contact customercare@packt.com with your order details or payment transaction id. If your order has already started the shipment process, we will do our best to stop it. However, if it is already on the way to you then when you receive it, you can contact us at customercare@packt.com using the returns and refund process.

Please understand that Packt Publishing cannot provide refunds or cancel any order except for the cases described in our Return Policy (i.e. Packt Publishing agrees to replace your printed book because it arrives damaged or material defect in book), Packt Publishing will not accept returns.

What is your returns and refunds policy? Chevron down icon Chevron up icon

Return Policy:

We want you to be happy with your purchase from Packtpub.com. We will not hassle you with returning print books to us. If the print book you receive from us is incorrect, damaged, doesn't work or is unacceptably late, please contact Customer Relations Team on customercare@packt.com with the order number and issue details as explained below:

  1. If you ordered (eBook, Video or Print Book) incorrectly or accidentally, please contact Customer Relations Team on customercare@packt.com within one hour of placing the order and we will replace/refund you the item cost.
  2. Sadly, if your eBook or Video file is faulty or a fault occurs during the eBook or Video being made available to you, i.e. during download then you should contact Customer Relations Team within 14 days of purchase on customercare@packt.com who will be able to resolve this issue for you.
  3. You will have a choice of replacement or refund of the problem items.(damaged, defective or incorrect)
  4. Once Customer Care Team confirms that you will be refunded, you should receive the refund within 10 to 12 working days.
  5. If you are only requesting a refund of one book from a multiple order, then we will refund you the appropriate single item.
  6. Where the items were shipped under a free shipping offer, there will be no shipping costs to refund.

On the off chance your printed book arrives damaged, with book material defect, contact our Customer Relation Team on customercare@packt.com within 14 days of receipt of the book with appropriate evidence of damage and we will work with you to secure a replacement copy, if necessary. Please note that each printed book you order from us is individually made by Packt's professional book-printing partner which is on a print-on-demand basis.

What tax is charged? Chevron down icon Chevron up icon

Currently, no tax is charged on the purchase of any print book (subject to change based on the laws and regulations). A localized VAT fee is charged only to our European and UK customers on eBooks, Video and subscriptions that they buy. GST is charged to Indian customers for eBooks and video purchases.

What payment methods can I use? Chevron down icon Chevron up icon

You can pay with the following card types:

  1. Visa Debit
  2. Visa Credit
  3. MasterCard
  4. PayPal
What is the delivery time and cost of print books? Chevron down icon Chevron up icon

Shipping Details

USA:

'

Economy: Delivery to most addresses in the US within 10-15 business days

Premium: Trackable Delivery to most addresses in the US within 3-8 business days

UK:

Economy: Delivery to most addresses in the U.K. within 7-9 business days.
Shipments are not trackable

Premium: Trackable delivery to most addresses in the U.K. within 3-4 business days!
Add one extra business day for deliveries to Northern Ireland and Scottish Highlands and islands

EU:

Premium: Trackable delivery to most EU destinations within 4-9 business days.

Australia:

Economy: Can deliver to P. O. Boxes and private residences.
Trackable service with delivery to addresses in Australia only.
Delivery time ranges from 7-9 business days for VIC and 8-10 business days for Interstate metro
Delivery time is up to 15 business days for remote areas of WA, NT & QLD.

Premium: Delivery to addresses in Australia only
Trackable delivery to most P. O. Boxes and private residences in Australia within 4-5 days based on the distance to a destination following dispatch.

India:

Premium: Delivery to most Indian addresses within 5-6 business days

Rest of the World:

Premium: Countries in the American continent: Trackable delivery to most countries within 4-7 business days

Asia:

Premium: Delivery to most Asian addresses within 5-9 business days

Disclaimer:
All orders received before 5 PM U.K time would start printing from the next business day. So the estimated delivery times start from the next day as well. Orders received after 5 PM U.K time (in our internal systems) on a business day or anytime on the weekend will begin printing the second to next business day. For example, an order placed at 11 AM today will begin printing tomorrow, whereas an order placed at 9 PM tonight will begin printing the day after tomorrow.


Unfortunately, due to several restrictions, we are unable to ship to the following countries:

  1. Afghanistan
  2. American Samoa
  3. Belarus
  4. Brunei Darussalam
  5. Central African Republic
  6. The Democratic Republic of Congo
  7. Eritrea
  8. Guinea-bissau
  9. Iran
  10. Lebanon
  11. Libiya Arab Jamahriya
  12. Somalia
  13. Sudan
  14. Russian Federation
  15. Syrian Arab Republic
  16. Ukraine
  17. Venezuela