You're reading from Learning Neo4j 3.x - Second Edition

Product type: Book
Published in: Oct 2017
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781786466143
Edition: 2nd Edition
Languages: Java
Tools: Neo4j
Concepts: Databases
Author: Jerome Baton

Chapter 13. Clustering

This chapter is about clustering servers, and particularly about causal clustering, the clustering architecture of Neo4j 3.2 Enterprise.

A cluster is a group of servers used to apply the proverbs "United we stand, divided we fall" and "E pluribus unum" (Latin for "out of many, one"). To illustrate this, I will use several types of servers: two Linux laptops and several Raspberry Pis.

As clustering is an Enterprise edition feature, it is not available in the Community edition. I will use version 3.2.3 Enterprise in this chapter (available for free for 30 days on the Neo4j website).

This chapter covers the following topics:

The need for clustering
The concept of clustering
Building a cluster
Disaster recovery

Why set up a cluster?

The obvious reason to set up a cluster is that sometimes a single server is not enough to serve the whole world. Scaling is needed. Setting up several independent servers in several regions of the world could solve this, but separate servers imply other tasks, such as replicating reference data and sharing users. A cluster is the solution to these business needs. There are three main advantages to setting up a cluster:

High throughput
Data redundancy
High availability (HA)

If you are working on a business project and you want 24/7 availability, you need a cluster.

Concepts

In a causal cluster, there are two categories of servers: core servers and read replica servers. Core servers are the more important of the two, as read replicas are disposable and can be added or removed as the cluster is scaled. Let's go through their roles.

Core servers

Core servers are the most important servers of a cluster. The minimum number of core servers in a cluster is three, because a transaction is reported as successful to the user who started it only once the update it generated has been propagated to a majority of the core servers (half the size of the cluster plus 1).
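The "half the size of the cluster plus 1" rule is easy to check with a little arithmetic. The following is only an illustrative sketch, not Neo4j code; the function names are my own.

```python
def majority(core_count: int) -> int:
    """Smallest number of core servers that forms a majority."""
    if core_count < 1:
        raise ValueError("a cluster needs at least one core server")
    return core_count // 2 + 1


def tolerated_failures(core_count: int) -> int:
    """Core servers that can be lost while a majority can still acknowledge a commit."""
    return core_count - majority(core_count)


for n in (2, 3, 4, 5):
    print(f"{n} cores: majority={majority(n)}, tolerated failures={tolerated_failures(n)}")
```

With two core servers, the majority is two, so losing either one blocks every commit. Three is the smallest size that survives the loss of one core server.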

The roles of core servers are as follows:

They are the targets of all the updating queries; they uphold data consistency via the consensus commit
They are responsible for data safety
They do the query routing, together with the Bolt protocol drivers
They copy data to read replicas (when polled)

Although this is transparent for the users, the admins know that, in a cluster, one server is the leader and is elected by the other core servers.
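To picture what routing means, here is a toy model, under the assumption that updating queries go to the leader and reading queries are spread over the other servers. It is not how the drivers are implemented (a real driver obtains its routing information from the cluster itself); the class, the server names, and the access mode strings are all made up for the illustration.

```python
from itertools import cycle


class ToyRouter:
    """Toy model of routing: writes go to the leader, reads rotate over the other servers."""

    def __init__(self, leader, read_servers):
        if not read_servers:
            raise ValueError("at least one server must accept reads")
        self.leader = leader
        self._readers = cycle(read_servers)

    def route(self, access_mode):
        """Return the name of the server that should receive the query."""
        if access_mode == "WRITE":
            return self.leader
        if access_mode == "READ":
            return next(self._readers)
        raise ValueError(f"unknown access mode: {access_mode!r}")


# Hypothetical server names.
router = ToyRouter("core1", ["core2", "replica1", "replica2"])
print(router.route("WRITE"))
print([router.route("READ") for _ in range(4)])
```

If the leader changes after an election, only the routing information changes; the application code that asks for a read or a write stays the same, which is why the election is transparent for the users.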

Read replica servers

Read replica...

Building a cluster

The hardware elements that we will use are as follows:

Two laptops as core servers, running Linux, connected to a home Wi-Fi network
Two Raspberry Pis (a model 2 and a model 3), running as core servers

Read replica servers will be running as Docker containers. There will be five replicas running on my main laptop.

In a typical deployment, there are more read replicas than core servers. Of course, you can use any machine you want as long as it can connect to a network and run Neo4j. This can be more PCs, Macs, or any brand of credit card-sized computer running Linux.

Here, in a home environment, all servers are on the same network, plugged into the same router, and mostly in the same room (or not, thanks to Wi-Fi and long cables). However, it is still relevant as an example, and it is probably one of the cheapest ways to build a cluster on real hardware.
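Each member of such a cluster needs a few lines in its neo4j.conf file to tell it which role it plays and where to find the core servers. The sketch below renders such a fragment. The setting names are the ones I recall from the Neo4j 3.x operations manual, so check them against the documentation of your version; the host names are hypothetical, and 5000 is assumed to be the discovery port.

```python
def cluster_conf(mode, discovery_members, expected_cores=None):
    """Render a neo4j.conf fragment for one cluster member.

    mode is "CORE" or "READ_REPLICA"; discovery_members is a list of
    "host:port" strings naming core servers to contact at startup.
    """
    if mode not in ("CORE", "READ_REPLICA"):
        raise ValueError(f"unsupported mode: {mode}")
    lines = [f"dbms.mode={mode}"]
    if mode == "CORE":
        # Only core servers take part in the consensus, so only they get the size.
        size = expected_cores if expected_cores is not None else len(discovery_members)
        lines.append(f"causal_clustering.expected_core_cluster_size={size}")
    lines.append(
        "causal_clustering.initial_discovery_members=" + ",".join(discovery_members)
    )
    return "\n".join(lines)


# Hypothetical host names for the two laptops and the two Raspberry Pis.
cores = ["laptop1:5000", "laptop2:5000", "pi2:5000", "pi3:5000"]
print(cluster_conf("CORE", cores))
print()
print(cluster_conf("READ_REPLICA", cores))
```

The read replica fragment differs only in its mode and in the absence of the cluster size, which reflects how little a replica needs to know in order to join.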

In a business environment, you may have to talk to your IT service, declare the machines, list which ports they use, and ask for safe passage of the network...
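Before blaming Neo4j for a cluster that does not form, it is worth checking that the ports are actually reachable from one machine to another. This is a minimal sketch using only the Python standard library. The port numbers are what I assume to be the defaults of a Neo4j 3.x causal cluster, so adjust them to your configuration.

```python
import socket

# Assumed default ports; change them to match your neo4j.conf.
CLUSTER_PORTS = {
    "discovery": 5000,
    "transaction": 6000,
    "raft": 7000,
    "http": 7474,
    "bolt": 7687,
}


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_host(host, ports=None):
    """Map each port name to whether that port is reachable on host."""
    ports = CLUSTER_PORTS if ports is None else ports
    return {name: port_is_open(host, port) for name, port in ports.items()}
```

Calling `check_host("192.168.1.10")` (a hypothetical address) from another machine of the cluster returns a dictionary such as `{"discovery": True, ...}`, which is also a convenient list to hand over to the IT service.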

Disaster recovery

Servers sometimes fail, for many possible reasons. If a read replica fails, so be it; replace it or add another one, and it will get the data as it joins the cluster. If a core server fails, the cluster handles it too: the graph is recreated on it from the other cores. You can see it for yourself:

Stop a core server.
Go through its folders, and remove data/databases and data/cluster-state.
Restart it (the more data, the more time it will take).
Watch the log file to see when it is ready.
Go to its Neo4j browser.
Be amazed.
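The second step of the list above can be scripted. The sketch below only deletes the two folders named in that step; stopping and restarting the server is left to you. The function name and the example path are hypothetical, and the folder layout is assumed to be the one given above, relative to the Neo4j installation folder.

```python
import shutil
from pathlib import Path


def wipe_core_state(neo4j_home):
    """Delete data/databases and data/cluster-state under neo4j_home.

    Call this only while the server is stopped. Returns the list of
    directories that were actually removed.
    """
    home = Path(neo4j_home)
    removed = []
    for relative in ("data/databases", "data/cluster-state"):
        target = home / relative
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(target)
    return removed
```

For example, `wipe_core_state("/opt/neo4j")` (a hypothetical installation path) returns the folders it removed, or an empty list if there was nothing to remove. Try it on a test cluster first, as the deletion cannot be undone.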

This is so simple that it reminds me of what a speaker said at a GraphConnect conference: "We do the hard things so you don't have to."

If all your core servers fail, first find the reason, for this is very unlikely in a real state-of-the-art deployment. If you want to roll your graph back to the date of a previous backup, the steps are as follows:

Stop all instances (core and replicas).
Perform a backup restore on each instance (with the same backup file).
Restart all instances...
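As a sketch of the restore step, the function below builds the commands to run on one stopped instance without executing them. It assumes the `neo4j-admin restore` and `neo4j-admin unbind` commands as I recall them from the Neo4j 3.x operations manual, and it assumes that core servers should be unbound before restoring; `graph.db` is assumed to be the database name, and the backup path is hypothetical. Verify all of this against the manual of your version before relying on it.

```python
def restore_commands(backup_dir, database="graph.db", is_core=True):
    """Return the commands to run on one stopped instance, as argument lists."""
    commands = []
    if is_core:
        # Assumption: a core server drops its old cluster state before it rejoins.
        commands.append(["neo4j-admin", "unbind"])
    commands.append([
        "neo4j-admin", "restore",
        f"--from={backup_dir}",
        f"--database={database}",
        "--force",  # overwrite the existing database
    ])
    return commands


# Hypothetical backup location, identical on every instance.
for command in restore_commands("/backups/graph.db-backup"):
    print(" ".join(command))
```

Returning argument lists rather than running anything keeps the sketch safe to try; each list can be passed to `subprocess.run(command, check=True)` once you are confident in it.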

Summary

This chapter taught us why and how to create a Neo4j cluster with different kinds of servers. After the concepts, we saw how to build a cluster. We also saw the magic of the bolt+routing protocol.

You learned that we can grow from a cluster of several credit card-sized computers like Raspberry Pis (https://www.raspberrypi.org/) to a cluster made from several clouds (you'll need a strong credit card for that).

We also saw a disaster recovery procedure to which I'll add this tip:

Note

With any software, disaster recovery must always be teamwork. Otherwise, it is too easy to put the blame on one person: non-technical people need to act, because inactivity makes them feel 'not in power', and blaming is easy when they believe the failure should not have happened in the first place, Murphy's law notwithstanding. If you are the one who understood what failed and how to solve the issue, and you get the blame anyway instead of the credit, change shops. I saw firsthand that a client lost their best sysadmin as he got bored...


Author (1)

Jerome Baton

Jérôme Baton started hacking computers at the age of skin problems, gaming first then continued his trip by self-learning Basic on Amstrad CPC, peaking on coding a full screen horizontal starfield, and messing the interlace of the video controller so that sprites appeared twice as high in horizontal beat'em up games. Disks were three inches for 178 Kb then. Then, for gaming reasons, he switched to Commodore Amiga and its fantastic AMOS Basic. Later caught by seriousness and studies, he wrote Turbo Pascal, C, COBOL, Visual C++, and Java on PCs and mainframes at university, and even Logo in high school. Then, Java happened and he became a consultant, mostly on backend code of websites in many different businesses. Jérôme authored several articles in French on Neo4j, JBoss Forge, an Arduino workshop for Devoxx4Kids, and reviewed kilos of books on Android. He has a weakness for wordplay, puns, spoonerisms, and Neo4j that relieves him from join(t) pains. Jérôme also has the joy to teach in French universities, currently at I.U.T de Paris, Université Paris V - René Descartes (Neo4j, Android), and Université de Troyes (Neo4j), where he does his best to enterTRain the students. When not programming, Jérôme enjoys photography, doing electronics, everything DIY, understanding how things work, trying to be clever or funny on Twitter, and spends a lot of time trying to understand his kids and life in general.