You're reading from Learning Neo4j 3.x - Second Edition

Product type: Book
Published in: Oct 2017
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781786466143
Edition: 2nd Edition
Author (1)
Jerome Baton

Jérôme Baton started hacking computers at the age of skin problems, gaming first then continued his trip by self-learning Basic on Amstrad CPC, peaking on coding a full screen horizontal starfield, and messing the interlace of the video controller so that sprites appeared twice as high in horizontal beat'em up games. Disks were three inches for 178 Kb then. Then, for gaming reasons, he switched to Commodore Amiga and its fantastic AMOS Basic. Later caught by seriousness and studies, he wrote Turbo Pascal, C, COBOL, Visual C++, and Java on PCs and mainframes at university, and even Logo in high school. Then, Java happened and he became a consultant, mostly on backend code of websites in many different businesses. Jérôme authored several articles in French on Neo4j, JBoss Forge, an Arduino workshop for Devoxx4Kids, and reviewed kilos of books on Android. He has a weakness for wordplay, puns, spoonerisms, and Neo4j that relieves him from join(t) pains. Jérôme also has the joy to teach in French universities, currently at I.U.T de Paris, Université Paris V - René Descartes (Neo4j, Android), and Université de Troyes (Neo4j), where he does his best to enterTRain the students. When not programming, Jérôme enjoys photography, doing electronics, everything DIY, understanding how things work, trying to be clever or funny on Twitter, and spends a lot of time trying to understand his kids and life in general.

Chapter 13. Clustering

This chapter is about clustering servers, and particularly about the causal clustering introduced in Neo4j 3.2 Enterprise.

A cluster is a group of servers used to apply the proverbs "United we stand, divided we fall" and "E pluribus unum" (Latin for "out of many, one"). To illustrate this, I will use several types of servers: two Linux laptops and several Raspberry Pis.

As clustering is an Enterprise edition feature, it is not available in the Community edition. I will use version 3.2.3 Enterprise in this chapter (available for free for 30 days on the Neo4j website).

This chapter covers the following topics:

  • The need for clustering
  • The concept of clustering
  • Building a cluster
  • Disaster recovery

Why set up a cluster?


The obvious reason to set up a cluster is that sometimes a single server is not enough to serve the whole world: scaling is needed. Setting up several servers in several regions of the world could solve this, but separate servers imply other tasks, such as replicating reference data and sharing users. A cluster is the solution to these business needs. There are three main advantages to setting up a cluster:

  • High throughput
  • Data redundancy
  • High availability (HA)

If you are working on a business project and you want 24/7 availability, you need a cluster.

Concepts


In a causal cluster, there are two categories of servers: core servers and read replica servers. Core servers are the more important of the two, as replicas are disposable and can be added or removed as the cluster scales. Let's go through their roles.

Core servers

Core servers are the most important servers of a cluster. The minimum number of core servers in a cluster is three, because a transaction is reported as successful to the user who started it only once the update it generated has been propagated to a majority of the core servers (half the size of the cluster, plus one).
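
To make the majority rule concrete, here is a small sketch in Python. The function names are mine, chosen for illustration, and are not part of Neo4j. It computes how many core servers must hold an update before it counts as committed, and how many cores can therefore be lost while the remaining ones can still form that majority.

```python
def majority(core_count):
    """Core servers that must hold an update: half the cluster size, plus one."""
    return core_count // 2 + 1


def tolerated_failures(core_count):
    """Core servers that can be lost while a majority can still be formed."""
    return core_count - majority(core_count)


# Print the figures for a few cluster sizes.
for cores in (2, 3, 4, 5):
    print("cores=%d majority=%d tolerated_failures=%d"
          % (cores, majority(cores), tolerated_failures(cores)))
```

With two cores, the majority is two, so losing a single server already blocks commits. With three cores, the majority is still two, so one server can fail. This is the arithmetic behind the minimum of three.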

The roles of core servers are as follows:

  • They are the targets of all the updating queries; they uphold data consistency via the consensus commit
  • They are responsible for data safety
  • They do the query routing, along with the Bolt protocol drivers
  • They copy data to read replicas (when polled)

Although this is transparent for the users, the admins know that, in a cluster, one server is the leader and is elected by the other core servers.

Read replica servers

Read replica...

Building a cluster


The hardware elements that we will use are as follows:

  • Two laptops as core servers, running Linux, connected to a home WiFi network
  • Two Raspberry Pis (a Pi 2 and a Pi 3), running as core servers

Read replica servers will be running as Docker containers. There will be five replicas running on my main laptop.

In a typical deployment, there are more read replicas than core servers. Of course, you can use any machine you want as long as it can connect to a network and run Neo4j. This can be more PCs, Macs, or any brand of credit card-sized computer running Linux.
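
As a rough sketch of what distinguishes the two kinds of servers at configuration time, the Python helper below renders the handful of neo4j.conf lines that typically differ between a core server and a read replica. The setting names are the Neo4j 3.x ones as I understand them, so check the operations manual for your exact version. The helper name, the IP addresses, and the port in the example are assumptions made for illustration only.

```python
def render_cluster_conf(mode, discovery_members, advertised_address,
                        expected_core_size):
    """Return neo4j.conf lines for one cluster member (illustrative sketch)."""
    if mode not in ("CORE", "READ_REPLICA"):
        raise ValueError("mode must be CORE or READ_REPLICA")
    lines = [
        "dbms.mode=" + mode,
        # Listen on all network interfaces so other machines can connect.
        "dbms.connectors.default_listen_address=0.0.0.0",
        # Address that this server tells the other members to use.
        "dbms.connectors.default_advertised_address=" + advertised_address,
        # Core servers to contact in order to join the cluster.
        "causal_clustering.initial_discovery_members="
        + ",".join(discovery_members),
    ]
    if mode == "CORE":
        # Only core servers take part in forming the core cluster.
        lines.append("causal_clustering.expected_core_cluster_size=%d"
                     % expected_core_size)
    return "\n".join(lines) + "\n"


# Hypothetical addresses for four cores: two laptops and two Raspberry Pis.
CORES = ["192.168.1.10:5000", "192.168.1.11:5000",
         "192.168.1.12:5000", "192.168.1.13:5000"]
print(render_cluster_conf("CORE", CORES, "192.168.1.10", len(CORES)))
```

Generating the lines from one function keeps the discovery list identical on every machine, which is easy to get wrong when editing several files by hand.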

Here, in a home environment, all servers are on the same network, plugged into the same router, and mostly in the same room (or not, thanks to WiFi and long cables). However, it is still relevant as an example, and is probably one of the cheapest ways to build a cluster with real hardware.

In a business environment, you may have to talk to your IT service, declare the machines, list which ports they use, and ask for safe passage of the network...

Disaster recovery


Servers sometimes fail, for many possible reasons. If a read replica fails, so be it: replace it or add another one, and it will get the data as it joins the cluster. If a core server fails, the cluster handles that too: the graph is recreated from the other cores. You can see it for yourself:

  1. Stop a core server.
  2. Go through its folders, and remove data/databases and data/cluster-state.
  3. Restart it (the more data, the more time it will take).
  4. Watch the log file to see when it is ready.
  5. Go to its Neo4j browser.
  6. Be amazed.
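
Step 2 is the only destructive one, so here is a cautious sketch of it in Python. The two folder names come from the list above. The function name, and the idea of passing the Neo4j installation folder as a parameter, are my own assumptions. Only run it against a core server that has been stopped.

```python
import shutil
from pathlib import Path

# Folders named in step 2, relative to the Neo4j installation folder.
FOLDERS_TO_REMOVE = ("data/databases", "data/cluster-state")


def wipe_core_state(neo4j_home):
    """Remove the local graph and cluster state of a stopped core server.

    Returns the relative paths that were actually removed.
    """
    removed = []
    for relative in FOLDERS_TO_REMOVE:
        target = Path(neo4j_home) / relative
        if target.is_dir():
            shutil.rmtree(target)  # deletes the folder and all its content
            removed.append(relative)
    return removed
```

Calling `wipe_core_state` with the installation folder of the stopped server returns the list of folders it removed. Anything else under the data folder is left untouched.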

This is so simple that it reminds me of what a speaker said at a GraphConnect conference: "We do the hard things so you don't have to."

If all your core servers fail, first find the reason, for this is very unlikely on a real state-of-the-art deployment. If you want to roll your graph back to the date of a previous backup, the steps are as follows:

  1. Stop all instances (core and replicas).
  2. Perform a backup restore on each instance (with the same backup file).
  3. Restart all instances...

Summary


This chapter taught us why and how to create a Neo4j cluster with different kinds of servers. After the concepts, we saw how to build a cluster. We also saw the magic of the bolt+routing protocol.

You learned that we can grow from a cluster of several credit card-sized computers like Raspberry Pis (https://www.raspberrypi.org/) to a cluster made from several clouds (you'll need a strong credit card for that).

We also saw a disaster recovery procedure to which I'll add this tip:

Note

With any software, disaster recovery must always be teamwork. Otherwise, it is too easy to put the blame on one person: non-technical people need to act, because inactivity makes them feel 'not in power', and blaming is easy when they believe the failure should never have happened in the first place, Murphy's law notwithstanding. If you are the one who understood what failed and how to solve the issue, and you get the blame instead of the credit anyway, change shops. I saw firsthand that a client lost their best sysadmin as he got bored...
