Working with a Neo4j Embedded Database

Exclusive offer: get 50% off this eBook here
Learning Cypher

Learning Cypher — Save 50%

Write powerful and efficient queries for Neo4j with Cypher, its official query language with this book and ebook

€12.99    €6.50
by Onofrio Panzarino | May 2014 | Open Source

In this article, we'll see how to create a Neo4j database. Onofrio Panzarino, the author of this book, Learning Cypher, is a programmer with 15 years experience with various languages (mostly with Java), platforms, and technologies. He has gained a lot of experience with graph databases, particularly with Neo4j. This article throws light on setting up a new Neo4j database.

(For more resources related to this topic, see here.)

Neo4j is a graph database, which means that it does not use tables and rows to represent data logically; instead, it uses nodes and relationships. Both nodes and relationships can have a number of properties. While relationships must have one direction and one type, nodes can have a number of labels. For example, the following diagram shows three nodes and their relationships, where every node has a label (language or graph database), while relationships have a type (QUERY_LANGUAGE_OF and WRITTEN_IN).

The properties used in the graph shown in the following diagram are: name, type, and from. Note that every relation must have exactly one type and one direction, whereas labels for nodes are optional and can be multiple.

Neo4j running modes

Neo4j can be used in two modes:

  • An embedded database in a Java application;

  • A standalone server via REST

In any case, this choice does not affect the way you query and work with the database. It's only an architectural choice driven by the nature of the application (whether a standalone server or a client-server), performance, monitoring, and safety of data.

An embedded database

An embedded Neo4j database is the best choice for performance. It runs in the same process of the client application that hosts it and stores data in the given path. Thus, an embedded database must be created programmatically. We choose an embedded database for the following reasons:

  • When we use Java as the programming language for our project

  • When our application is standalone

Preparing the development environment

The fastest way to prepare the IDE for Neo4j is using Maven. Maven is a dependency management and automated building tool. In the following procedure, we will use NetBeans 7.4, but it works in a very similar way with the other IDEs (for Eclipse, you would need the m2eclipse plugin). The procedure is described as follows:

  1. Create a new Maven project as shown in the following screenshot:

  2. In the next page of the wizard, name the project, set a valid project location, and then click on Finish.

  3. After NetBeans has created the project, expand Project Files in the project tree and open the pom.xml file. In the <dependencies> tag, insert the following XML code:

    <dependencies> <dependency> <groupId>org.neo4j</groupId> <artifactId>neo4j</artifactId> <version>2.0.1</version> </dependency> </dependencies> <repositories> <repository> <id>neo4j</id> <url>http://m2.neo4j.org/content/repositories/releases/</url> <releases> <enabled>true</enabled> </releases> </repository> </repositories>

This code instructs Maven the dependency we are using on our project, that is, Neo4j. The version we have used here is 2.0.1. Of course, you can specify the latest available version.

Once saved, the Maven file resolves the dependency, downloads the JAR files needed, and updates the Java build path. Now, the project is ready to use Neo4j and Cypher.

Creating an embedded database

Creating an embedded database is straightforward. First of all, to create a database, we need a GraphDatabaseFactory class, which can be done with the following code:

GraphDatabaseFactory graphDbFactory = new GraphDatabaseFactory();

Then, we can invoke the newEmbeddedDatabase method with the following code:

GraphDatabaseService graphDb = graphDbFactory .newEmbeddedDatabase("data/dbName");

Now, with the GraphDatabaseService class, we can fully interact with the database, create nodes, create relationships, set properties and indexes.

Invoking Cypher from Java

To execute Cypher queries on a Neo4j database, you need an instance of ExecutionEngine; this class is responsible for parsing and running Cypher queries, returning results in a ExecutionResult instance:

import org.neo4j.cypher.javacompat.ExecutionEngine; import org.neo4j.cypher.javacompat.ExecutionResult; // ... ExecutionEngine engine = new ExecutionEngine(graphDb); ExecutionResult result = engine.execute("MATCH (e:Employee) RETURN e");

Note that we use the org.neo4j.cypher.javacompat package and not the org.neo4j.cypher package even though they are almost the same. The reason is that Cypher is written in Scala, and Cypher authors provide us with the former package for better Java compatibility.

Now with the results, we can do one of the following options:

  • Dumping to a string value

  • Converting to a single column iterator

  • Iterating over the full row

Dumping to a string is useful for testing purposes:

String dumped = result.dumpToString();

If we print the dumped string to the standard output stream, we will get the following result:

Here, we have a single column (e) that contains the nodes. Each node is dumped with all its properties. The numbers between the square brackets are the node IDs, which are the long and unique values assigned by Neo4j on the creation of the node.

When the result is single column, or we need only one column of our result, we can get an iterator over one column with the following code:

import org.neo4j.graphdb.ResourceIterator; // ... ResourceIterator<Node> nodes = result.columnAs("e");

Then, we can iterate that column in the usual way, as shown in the following code:

while(nodes.hasNext()) { Node node = nodes.next(); // do something with node }

However, Neo4j provides a syntax-sugar utility to shorten the code that is to be iterated:

import org.neo4j.helpers.collection.IteratorUtil; // ... for (Node node : IteratorUtil.asIterable(nodes)) { // do something with node }

If we need to iterate over a multiple-column result, we would write this code in the following way:

ResourceIterator<Map<String, Object>> rows = result.iterator(); for(Map<String,Object> row : IteratorUtil.asIterable(rows)) { Node n = (Node) row.get("e"); try(Transaction t = n.getGraphDatabase().beginTx()) { // do something with node } }

The iterator function returns an iterator of maps, where keys are the names of the columns. Note that when we have to work with nodes, even if they are returned by a Cypher query, we have to work in transaction. In fact, Neo4j requires that every time we work with the database, either reading or writing to the database, we must be in a transaction. The only exception is when we launch a Cypher query. If we launch the query within an existing transaction, Cypher will work as any other operation. No change will be persisted on the database until we commit the transaction, but if we run the query outside any transaction, Cypher will open a transaction for us and will commit changes at the end of the query.

Summary

We have now completed the setting up of a Neo4j database. We also learned about Cypher pattern matching.

Resources for Article:


Further resources on this subject:


Learning Cypher Write powerful and efficient queries for Neo4j with Cypher, its official query language with this book and ebook
Published: May 2014
eBook Price: €12.99
Book Price: €20.99
See more
Select your format and quantity:

About the Author :


Onofrio Panzarino

Onofrio Panzarino is a programmer with 15 years experience working with various languages (mostly with Java), platforms, and technologies. Before obtaining his Master of Science degree in Electronics Engineering, he worked as a digital signal processor programmer. Around the same time, he started working as a C++ developer for embedded systems and PCs. Currently, he is working with Android, ASP.NET or C#, and JavaScript for Wolters Kluwer Italia. During these years, he gained a lot of experience with graph databases, particularly with Neo4j.

Onofrio resides in Ancona, Italy. His Twitter handle is (@onof80). He is a speaker in the local Java user group and also a technical writer, mostly for Scala and NoSQL. In his spare time, he loves playing the piano with his family and programming with functional languages.

Books From Packt


Network Graph Analysis and Visualization with Gephi
Network Graph Analysis and Visualization with Gephi

Storm Blueprints: Patterns for Distributed Real-time Computation
Storm Blueprints: Patterns for Distributed Real-time Computation

Storm Real-time Processing Cookbook
Storm Real-time Processing Cookbook

Master Data Services Essentials with SQL Server 2012
Master Data Services Essentials with SQL Server 2012

OpenSceneGraph 3 Cookbook
OpenSceneGraph 3 Cookbook

Microsoft SQL Server 2012 with Hadoop
Microsoft SQL Server 2012 with Hadoop

OpenSceneGraph 3.0: Beginner's Guide
OpenSceneGraph 3.0: Beginner's Guide

R Graphs Cookbook
R Graphs Cookbook


Code Download and Errata
Packt Anytime, Anywhere
Register Books
Print Upgrades
eBook Downloads
Video Support
Contact Us
Awards Voting Nominations Previous Winners
Judges Open Source CMS Hall Of Fame CMS Most Promising Open Source Project Open Source E-Commerce Applications Open Source JavaScript Library Open Source Graphics Software
Resources
Open Source CMS Hall Of Fame CMS Most Promising Open Source Project Open Source E-Commerce Applications Open Source JavaScript Library Open Source Graphics Software