Network partitioning
We have seen that Hazelcast can handle individual node outages, reacting to restore resilience where possible. However, node failures are not the only problem we need to handle. Issues in the underlying network fabric can just as easily lead to a situation known as split-brain syndrome. As this happens away from our application, at the infrastructure layer, there is very little we can do to prevent it. However, we should understand how the problem can affect the application and how it is handled once the underlying outage is resolved.
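To make the effect on the application concrete, the following is a minimal sketch of how data can drift apart while a cluster is split. It deliberately uses plain java.util maps rather than any Hazelcast API; the class name SplitBrainSketch, the method divergentKeys, and the sample entries are all hypothetical and chosen purely for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only: two plain maps stand in for the two
// sides of a partitioned cluster. No Hazelcast classes are involved.
public class SplitBrainSketch {

    // Returns, in sorted order, every key whose value differs between the
    // two sides, including keys that exist on only one side.
    public static Set<String> divergentKeys(Map<String, String> sideA,
                                            Map<String, String> sideB) {
        Set<String> allKeys = new TreeSet<>(sideA.keySet());
        allKeys.addAll(sideB.keySet());

        Set<String> divergent = new TreeSet<>();
        for (String key : allKeys) {
            if (!Objects.equals(sideA.get(key), sideB.get(key))) {
                divergent.add(key);
            }
        }
        return divergent;
    }

    public static void main(String[] args) {
        // State shared by every node before the network outage.
        Map<String, String> beforeSplit = new HashMap<>();
        beforeSplit.put("GB", "London");

        // Each side starts from the same data, then runs in isolation.
        Map<String, String> sideA = new HashMap<>(beforeSplit);
        Map<String, String> sideB = new HashMap<>(beforeSplit);

        sideA.put("GB", "Winchester"); // update seen only by side A
        sideB.put("GB", "York");       // competing update on side B
        sideB.put("FR", "Paris");      // new entry seen only by side B

        // Prints [FR, GB]: the entries that disagree once the sides
        // can see each other again.
        System.out.println(divergentKeys(sideA, sideB));
    }
}
```

Both sides accept writes without error, so neither can tell on its own that its view has become inconsistent with the other; the disagreement only surfaces when the two views are compared.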
The primary issue for the application arises when two (or more) sides of a network outage are able to operate perfectly in isolation. In theory, assuming backup copies of our data are held on both sides of the split, each side will continue to operate normally as an independent cluster. However, what happens when the sides become visible to each other again, especially in...