
You're reading from Getting Started with Hazelcast

Product type: Book
Published in: Aug 2013
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781782167303
Edition: 1st
Author: Matthew Johns

Matthew Johns is an agile software engineer and hands-on technical/solution architect, specialising in designing and delivering highly scaled and available distributed systems, with broad experience across the whole stack. He is the solution architect and lead engineer at Sky.

Chapter 6. Spreading the Load

In addition to distributed data storage, Hazelcast also provides us with the ability to share out computational power in the form of a distributed executor. In this chapter, we shall:

  • Learn about the distributed executor service

  • Use futures for response retrieval

  • Run single-node and multi-node tasks

  • Force the location of execution

  • Align data with compute

All power to the compute


So far, we have focused on data storage, which for many applications makes up most of the story of scaling up. However, other types of applications require a great deal of computational and data processing power. To cater for this use case, Hazelcast provides a distributed executor service. As relatively experienced Java developers, we are hopefully already familiar with the ExecutorService introduced in Java 5. Extending this concept further, the distributed execution capabilities allow us to run Runnable and Callable tasks on the cluster. However, as a task is distributed to other nodes, we must ensure that it is also Serializable.

We can think of Hazelcast as providing scheduling and task management capabilities on top of a number of executors, each holding a number of worker threads.
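Since the distributed executor builds directly on the java.util.concurrent model, a quick stdlib-only refresher (no Hazelcast involved) of submitting a Callable and retrieving its response via a Future may help:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FutureRefresher {
    public static void main(String[] args) throws Exception {
        ExecutorService exec = Executors.newFixedThreadPool(2);

        // Submit a Callable; the returned Future is our handle
        // to the eventual result
        Future<Integer> answer = exec.submit(new Callable<Integer>() {
            public Integer call() {
                return 6 * 7;
            }
        });

        // get() blocks until the task has completed
        System.out.println(answer.get()); // prints 42

        exec.shutdown();
    }
}
```

Hazelcast's executor keeps this same programming model; the difference is that the submitted task may be serialized and run on another node entirely.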

As with the data storage capabilities, should we need to add further capacity to the cluster, we can start more nodes. This will...

Running once, running everywhere


So far we've seen how we can gain access to a distributed executor service and submit our own tasks to it for execution; however, we might need a little more control over where a task runs. Should we want to pin a particular task to a specific node, we can use the wrapper class DistributedTask to provide signaling logic to the task manager so that it can control which node the task is delegated to. You can find the details of the members in the cluster from the Cluster class, which is accessible from the HazelcastInstance class.

Config conf = new Config();
HazelcastInstance hz = Hazelcast.newHazelcastInstance(conf);

Member thisMember = hz.getCluster().getLocalMember();
Set<Member> clusterMembers = hz.getCluster().getMembers();
ExecutorService exec = hz.getExecutorService("exec");

Callable<String> timeTask = new TimeInstanceAwareCallable();

Member member = <target member>;
FutureTask<String> specificTask =
  new DistributedTask...
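The snippet above is truncated. The following is a minimal sketch of how the pinning might be completed, assuming the Hazelcast 2.x DistributedTask(Callable, Member) constructor, and standing in a trivial Serializable callable for the chapter's TimeInstanceAwareCallable:

```java
import java.io.Serializable;
import java.util.Date;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;

import com.hazelcast.config.Config;
import com.hazelcast.core.DistributedTask;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.Member;

public class PinnedTaskExample {

    // Stand-in for the chapter's TimeInstanceAwareCallable: a trivial
    // task that reports the time on whichever node it runs on
    static class TimeCallable implements Callable<String>, Serializable {
        public String call() {
            return "The time is: " + new Date();
        }
    }

    public static void main(String[] args) throws Exception {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance(new Config());
        ExecutorService exec = hz.getExecutorService("exec");

        // Pin the task to a chosen member; here simply the local one
        Member target = hz.getCluster().getLocalMember();
        DistributedTask<String> specificTask =
            new DistributedTask<String>(new TimeCallable(), target);

        // DistributedTask extends FutureTask, so we execute it and
        // then block on get() for the response
        exec.execute(specificTask);
        System.out.println(specificTask.get());

        hz.getLifecycleService().shutdown();
    }
}
```

For running one task across several or all members at once, Hazelcast 2.x offers a MultiTask wrapper along the same lines.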

Placing tasks next to the data


Our capability to run a task in a specific target location becomes much more useful when it comes to data affinity. If we are going to interact with the distributed data held within the cluster, it is optimal to co-locate the task execution close to where the required data is actually held. This reduces both the latency of a task and the networking cost of retrieving the dependent data from other nodes across the cluster before processing can occur. By making our task PartitionAware, we can return a key with which our task is going to interact. From this, Hazelcast establishes which partition the key belongs to, and hence which member node holds that data; the task is then automatically submitted to execute on that specific node, minimizing the network latency incurred in obtaining or manipulating the data.
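As a sketch of this pattern (the map name and key are purely illustrative, and we assume the PartitionAware and HazelcastInstanceAware interfaces as found in Hazelcast 2.x), here is a task that gets routed to the member owning the data it reads:

```java
import java.io.Serializable;
import java.util.concurrent.Callable;

import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.HazelcastInstanceAware;
import com.hazelcast.core.PartitionAware;

public class StockLevelTask
        implements Callable<Integer>, PartitionAware<Object>,
                   HazelcastInstanceAware, Serializable {

    private final String productId;          // illustrative key
    private transient HazelcastInstance hz;  // injected on the executing node

    public StockLevelTask(String productId) {
        this.productId = productId;
    }

    // Hazelcast inspects this key to decide which member
    // the task should be executed on
    public Object getPartitionKey() {
        return productId;
    }

    public void setHazelcastInstance(HazelcastInstance hz) {
        this.hz = hz;
    }

    public Integer call() {
        // Runs on the member owning productId's partition, so this
        // map read should not require a network hop
        return (Integer) hz.getMap("stock").get(productId);
    }
}
```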

We might also need to interact with multiple related entries, which might belong to different partitions,...

Summary


As we can see, this is a technology that deals with many aspects of distribution, be it data persistence or even computation. By leveraging these capabilities in our architecture, we provide ourselves with a very simple scaling mechanism: just add more nodes. We are scaling up multiple aspects of our application simultaneously; in this way, we should hopefully not introduce any scaling imbalances that might have been present had we scaled just one aspect independently.

In the next chapter, we will examine the different architectural setups that Hazelcast can operate in, the situations that suit the various options, and how to use them.

