Introduction to JBoss Clustering

by Francesco Marchioni | December 2010 | JBoss Java Open Source

Clustering allows us to run applications on several parallel instances (also known as cluster nodes). The load is distributed across different servers, and even if any of the servers fails, the application is still accessible via other cluster nodes. Clustering is crucial for scalable Enterprise applications, as you can improve performance by simply adding more nodes to the cluster.

In this article by Francesco Marchioni, author of the book JBoss AS 5 Performance Tuning, we will cover the basic building blocks of JBoss Clustering with the following outline:

  • A short introduction to the JBoss Clustering platform
  • A look at the low-level details of the JGroups library, which is used for all clustering-related communications between nodes


Clustering plays an important role in Enterprise applications as it lets you split the load across several nodes, granting robustness to your applications. As we discussed earlier, for optimal results it's better to limit the size of your JVM to a maximum of 2-2.5GB, otherwise the dynamics of the garbage collector will decrease your application's performance.

Combining relatively smaller Java heaps with a solid clustering configuration can lead to a better, scalable configuration plus significant hardware savings.

The only drawback to scaling out your applications is an increased complexity in the programming model, which needs to be correctly understood by aspiring architects.

JBoss AS comes out of the box with clustering support. There is no all-in-one library that deals with clustering but rather a set of libraries, which cover different kinds of aspects. The following picture shows how these libraries are arranged:

The backbone of JBoss Clustering is the JGroups library, which provides communication between members of the cluster. Built on top of JGroups are two building blocks: the JBoss Cache framework and the HAPartition service.

JBoss Cache handles the consistency of your application across the cluster by means of a replicated and transactional cache.

On the other hand, HAPartition is an abstraction built on top of a JGroups Channel that provides support for making and receiving RPC invocations from one or more cluster members. For example, HA-JNDI (High Availability JNDI) and HA Singleton (High Availability Singleton) both use HAPartition to share a single Channel and multiplex RPC invocations over it, eliminating the configuration complexity and runtime overhead of having each service create its own Channel. (If you need more information about the HAPartition service, you can consult the JBoss AS documentation at http://community.jboss.org/wiki/jBossAS5ClusteringGuide.)
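To give a concrete feel for how a client consumes an HAPartition-based service, here is a minimal sketch of an HA-JNDI lookup. This is a hedged example: the node names and the SomeService binding are hypothetical, while 1100 is the default HA-JNDI port in JBoss AS 5.

import java.util.Properties;
import javax.naming.Context;
import javax.naming.InitialContext;

// Minimal HA-JNDI lookup sketch; node1/node2 and "SomeService" are hypothetical.
public class HaJndiLookup {
    public static void main(String[] args) throws Exception {
        Properties env = new Properties();
        env.put(Context.INITIAL_CONTEXT_FACTORY,
                "org.jnp.interfaces.NamingContextFactory");
        env.put(Context.URL_PKG_PREFIXES, "org.jboss.naming:org.jnp.interfaces");
        // Comma-separated list of cluster nodes; 1100 is the default HA-JNDI port
        env.put(Context.PROVIDER_URL, "node1:1100,node2:1100");
        Context ctx = new InitialContext(env);
        Object service = ctx.lookup("SomeService"); // resolved cluster-wide
        System.out.println("Looked up: " + service);
    }
}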

In the next section we will learn more about the JGroups library and how to configure it to reach the best performance for clustering communication.

Configuring JGroups transport

Clustering requires communication between nodes to synchronize the state of running applications or to notify changes in the cluster definition.

JGroups (http://jgroups.org/manual/html/index.html) is a reliable group communication toolkit written entirely in Java. It is based on IP multicast, but extends it with reliability and group membership management.

Member processes of a group can be located on the same host, within the same Local Area Network (LAN), or across a Wide Area Network (WAN). A member can be in turn part of multiple groups.

The following picture illustrates a detailed view of JGroups architecture:

A JGroups process consists basically of three parts, namely the Channel, Building blocks, and the Protocol stack. The Channel is a simple socket-like interface used by application programmers to build reliable group communication applications. Building blocks are an abstraction interface layered on top of Channels, which can be used instead of Channels whenever a higher-level interface is required. Finally we have the Protocol stack, which implements the properties specified for a given channel.
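To give a feel for the Channel API, the following is a minimal sketch of a standalone JGroups application written against the JGroups 2.x API that ships with JBoss AS 5; the stack file name ("udp.xml") and the cluster name are assumptions:

import org.jgroups.JChannel;
import org.jgroups.Message;
import org.jgroups.ReceiverAdapter;
import org.jgroups.View;

public class ChannelDemo {
    public static void main(String[] args) throws Exception {
        // "udp.xml" is an assumption: any stack definition reachable on the classpath works
        JChannel channel = new JChannel("udp.xml");
        channel.setReceiver(new ReceiverAdapter() {
            public void receive(Message msg) {
                System.out.println("received: " + msg.getObject());
            }
            public void viewAccepted(View view) {
                System.out.println("cluster members: " + view.getMembers());
            }
        });
        channel.connect("demo-cluster");                // join (or form) the group
        channel.send(new Message(null, null, "hello")); // null destination = multicast
        Thread.sleep(1000);                             // give the message time to arrive
        channel.close();
    }
}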

In theory, you could configure every service to bind to a different Channel. However, this would require a complex thread infrastructure with too many thread context switches. For this reason, JBoss AS is configured by default to use a single Channel to multiplex all the traffic across the cluster.

The Protocol stack contains a number of layers in a bi-directional list. All messages sent and received over the channel have to pass through all protocols. Every layer may modify, reorder, pass or drop a message, or add a header to a message. A fragmentation layer might break up a message into several smaller messages, adding a header with an ID to each fragment, and re-assemble the fragments on the receiver's side.
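As a toy illustration of what the fragmentation layer does (a sketch of the idea behind FRAG2, not actual JGroups code):

import java.util.ArrayList;
import java.util.List;

// Toy fragmentation: split a payload into fragSize chunks. A real layer such as
// FRAG2 would also prepend a header (message ID, fragment number) to each chunk
// so that the receiving side can reassemble the original message in order.
public class Fragmenter {
    static List<byte[]> fragment(byte[] payload, int fragSize) {
        List<byte[]> fragments = new ArrayList<byte[]>();
        for (int offset = 0; offset < payload.length; offset += fragSize) {
            int length = Math.min(fragSize, payload.length - offset);
            byte[] chunk = new byte[length];
            System.arraycopy(payload, offset, chunk, 0, length);
            fragments.add(chunk);
        }
        return fragments;
    }
}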

The composition of the Protocol stack (that is, its layers) is determined by the creator of the channel: an XML file defines the layers to be used (and the parameters for each layer).

Knowledge about the Protocol stack is not necessary when just using Channels in an application. However, when an application wishes to override the default properties of a Protocol stack and configure its own, then knowledge of what the individual layers do is required.

In JBoss AS, the configuration of the Protocol stack is located in the file <server>\deploy\cluster\jgroups-channelfactory.sar\META-INF\jgroups-channelfactory-stacks.xml.

The file is too large to reproduce here in full; in a nutshell, it contains the following basic elements:

The first part of the file includes the UDP transport configuration. UDP is the default protocol for JGroups and uses multicast (or, if not available, multiple unicast messages) to send and receive messages.

A multicast UDP socket can send and receive datagrams from multiple clients. The interesting and useful feature of multicast is that a client can contact multiple servers with a single packet, without knowing the specific IP address of any of the hosts.
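For readers unfamiliar with multicast sockets, here is a plain-Java sketch of the mechanism JGroups builds upon; the group address and port mirror the defaults from the UDP transport configuration shown later in this article:

import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.MulticastSocket;

public class MulticastListener {
    public static void main(String[] args) throws Exception {
        InetAddress group = InetAddress.getByName("228.11.11.11"); // default mcast_addr
        MulticastSocket socket = new MulticastSocket(45688);       // default mcast_port
        socket.joinGroup(group);            // start receiving datagrams sent to the group
        byte[] buffer = new byte[8192];
        DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
        socket.receive(packet);             // a single packet can arrive from any member
        System.out.println("packet from " + packet.getAddress());
        socket.leaveGroup(group);
        socket.close();
    }
}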

Next to the UDP transport configuration, three protocol stacks are defined:

  • udp: The default IP multicast based stack, with flow control
  • udp-async: The protocol stack optimized for high-volume asynchronous RPCs
  • udp-sync: The stack optimized for low-volume synchronous RPCs

Thereafter, the TCP transport configuration is defined. TCP stacks are typically used when IP multicasting cannot be used in a network (for example, because it is disabled), or because you want to create a network over a WAN (conceivably possible, but sharing data across remote geographical sites is a scary option from the performance point of view).

You can opt for two TCP protocol stacks:

  • tcp: Addresses the default TCP Protocol stack, which is best suited to high-volume asynchronous calls.
  • tcp-sync: Addresses the TCP Protocol stack which can be used for low-volume synchronous calls.

If you need to switch to a TCP stack, you can simply include the following in the command-line arguments that you pass to JBoss:

-Djboss.default.jgroups.stack=tcp

Since you are not using multicast in your TCP communication, you also need to configure the addresses/ports of all the possible nodes in the cluster. You can do this with the property -Djgroups.tcpping.initial_hosts. For example:

-Djgroups.tcpping.initial_hosts=host1[7600],host2[7600]

Finally, the configuration file contains two stacks which can be used for optimising the JBoss Messaging Control Channel (jbm-control) and Data Channel (jbm-data).


How to optimize the UDP transport configuration

The default UDP transport configuration ships with a list of attributes, which can be tweaked once you know what they are for. A complete reference to the UDP transport configuration can be found in the JBoss clustering guide (http://docs.jboss.org/jbossclustering/cluster_guide/5.1/html/jgroups.chapt.html).

The biggest performance gain can be achieved by properly tuning the attributes concerning buffer sizes (ucast_recv_buf_size, ucast_send_buf_size, mcast_recv_buf_size, and mcast_send_buf_size). Here's the core section of the UDP transport configuration:

<UDP
     singleton_name="shared-udp"
     mcast_port="${jboss.jgroups.udp.mcast_port:45688}"
     mcast_addr="${jboss.partition.udpGroup:228.11.11.11}"
     tos="8"
     ucast_recv_buf_size="20000000"
     ucast_send_buf_size="640000"
     mcast_recv_buf_size="25000000"
     mcast_send_buf_size="640000"
     loopback="true"
     discard_incompatible_packets="true"
     enable_bundling="false"
     max_bundle_size="64000"
     max_bundle_timeout="30"
     . . . .
/>

As a matter of fact, in order to guarantee optimal performance and adequate reliability of UDP multicast, it is essential to size network buffers correctly. With inappropriately sized network buffers, the chances are that you will experience a high frequency of UDP packets being dropped in the network layers, which then need to be retransmitted.

The default values for JGroups' UDP transport are a 20MB receive buffer and a 640KB send buffer for unicast transmission, and a 25MB receive buffer and a 640KB send buffer for multicast transmission. While these values sound appropriate for most cases, they can be insufficient for applications sending lots of cluster messages. Think about an application sending a thousand 1KB messages in a burst: if the effective receive buffer is smaller than that megabyte of data, we will not be able to buffer all packets, thus increasing the chance of packet loss and costly retransmission.

Monitoring the intra-cluster traffic can be done through the jboss.jgroups domain MBeans. For example, in order to monitor the amount of bytes sent and received with the UDP transmission protocol, just open your jmx-console and point at the jboss.jgroups domain, then select your cluster partition (DefaultPartition, if you are running with the default cluster settings). In the following snapshot (we are including only the relevant properties) we can see the number of messages sent/received along with their size (in bytes).
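If you prefer to collect these figures programmatically rather than through the jmx-console, the following hedged sketch lists the MBeans registered under the jboss.jgroups domain; the JMX service URL is an assumption and depends on how your installation exposes remote JMX:

import java.util.Set;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class JGroupsStats {
    public static void main(String[] args) throws Exception {
        // The URL below is an assumption; adjust it to your JMX connector settings
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:1090/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        MBeanServerConnection server = connector.getMBeanServerConnection();
        // List every MBean registered under the jboss.jgroups domain
        Set<ObjectName> names = server.queryNames(
                new ObjectName("jboss.jgroups:*"), null);
        for (ObjectName name : names) {
            System.out.println(name); // inspect its attributes in the jmx-console
        }
        connector.close();
    }
}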

Besides increasing the JGroups' buffer sizes, another important aspect to consider is that most operating systems impose a maximum UDP buffer size, which is generally lower than JGroups' defaults. For completeness, we include here a list of default maximum UDP buffer sizes:

Operating System     Default Max UDP Buffer (in bytes)
Linux                131071
Windows              No known limit
Solaris              262144
FreeBSD, Darwin      262144
AIX                  1048576
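Since the JVM silently caps a buffer request that exceeds the kernel limit, a quick way to check what your operating system actually grants is a plain DatagramSocket probe:

import java.net.DatagramSocket;

public class UdpBufferCheck {
    public static void main(String[] args) throws Exception {
        DatagramSocket socket = new DatagramSocket();
        socket.setReceiveBufferSize(25000000); // ask for JGroups' mcast_recv_buf_size
        // If the OS limit is lower, the effective size is silently capped
        System.out.println("effective receive buffer: "
                + socket.getReceiveBufferSize() + " bytes");
        socket.close();
    }
}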

So, as a rule of thumb, you should always configure your operating system so that it can take advantage of the JGroups' transport configuration. The following table shows the command required to increase the maximum buffer size to 25 megabytes. You will need root privileges in order to modify these kernel parameters:

Operating System     Command
Linux                sysctl -w net.core.rmem_max=26214400
Solaris              ndd -set /dev/udp udp_max_buf 26214400
FreeBSD, Darwin      sysctl -w kern.ipc.maxsockbuf=26214400
AIX                  no -o sb_max=8388608 (AIX only allows 1MB, 4MB, or 8MB)

Another option that is worth trying is enable_bundling, which specifies whether to enable message bundling. If true, the transport protocol queues outgoing messages until max_bundle_size bytes have accumulated or max_bundle_timeout milliseconds have elapsed, whichever occurs first.

The advantage of this approach is that the transport protocol sends the bundled queue as one single, larger message. Message bundling can have significant performance benefits for channels using asynchronous, high-volume messages (for example, JBoss Cache components configured for REPL_ASYNC; JBoss Cache will be covered in the next section, Tuning JBoss Cache).

On the other hand, for applications based on a synchronous exchange of RPCs, message bundling would introduce considerable latency, so it is not recommended in that case (which is the case with JBoss Cache components configured for REPL_SYNC).
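To make the bundling semantics concrete, here is a toy model of the queueing logic (a sketch of the concept, not the actual JGroups implementation):

import java.util.ArrayList;
import java.util.List;

// Toy bundler: queue outgoing messages until either max_bundle_size bytes
// accumulate or max_bundle_timeout milliseconds elapse, whichever comes first.
public class Bundler {
    private final int maxBundleSize;      // e.g. 64000, as in the UDP config above
    private final long maxBundleTimeout;  // e.g. 30 milliseconds
    private final List<byte[]> queue = new ArrayList<byte[]>();
    private int queuedBytes;
    private long firstQueuedAt = -1;

    public Bundler(int maxBundleSize, long maxBundleTimeout) {
        this.maxBundleSize = maxBundleSize;
        this.maxBundleTimeout = maxBundleTimeout;
    }

    public synchronized void send(byte[] message) {
        queue.add(message);
        queuedBytes += message.length;
        if (firstQueuedAt < 0) {
            firstQueuedAt = System.currentTimeMillis();
        }
        if (queuedBytes >= maxBundleSize
                || System.currentTimeMillis() - firstQueuedAt >= maxBundleTimeout) {
            flush();
        }
    }

    private void flush() {
        // A real transport would now write all queued messages as one larger packet
        queue.clear();
        queuedBytes = 0;
        firstQueuedAt = -1;
    }
}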

How to optimize the JGroups' Protocol stack

The Protocol stack contains a list of protocol layers, which every message needs to cross. A layer does not necessarily correspond to a transport protocol: for example, a layer might take care of fragmenting a message or of assembling it. What's important to understand is that when a message is sent, it travels down the stack, while a received message travels back up.

For example, in the next picture, the FLUSH protocol would be executed first, then STATE, GMS, and so on. Vice versa, when a message is received, it would meet the PING protocol first, then MERGE2, up to FLUSH.

Here is the definition of the default udp Protocol stack:

<stack name="udp"
       description="Default: IP multicast based stack, with flow control.">
  <config>
    <PING timeout="2000" num_initial_members="3"/>
    <MERGE2 max_interval="100000" min_interval="20000"/>
    <FD_SOCK/>
    <FD timeout="6000" max_tries="5" shun="true"/>
    <VERIFY_SUSPECT timeout="1500"/>
    <pbcast.NAKACK use_mcast_xmit="false" gc_lag="0"
                   retransmit_timeout="300,600,1200,2400,4800"
                   discard_delivered_msgs="true"/>
    <UNICAST timeout="300,600,1200,2400,3600"/>
    <pbcast.STABLE stability_delay="1000"
                   desired_avg_gossip="50000"
                   max_bytes="400000"/>
    <pbcast.GMS print_local_addr="true" join_timeout="3000"
                shun="true"
                view_bundling="true"
                view_ack_collection_timeout="5000"/>
    <FC max_credits="2000000" min_threshold="0.10"
        ignore_synchronous_response="true"/>
    <FRAG2 frag_size="60000"/>
    <pbcast.STATE_TRANSFER/>
    <pbcast.FLUSH timeout="0"/>
  </config>
</stack>

The following list sheds some light on the above cryptic configuration:

  • Transport: Responsible for sending and receiving messages across the network. Protocols: UDP, TCP, and TUNNEL.
  • Discovery: Used to discover active nodes in the cluster and determine which is the coordinator. Protocols: PING, MPING, TCPPING, and TCPGOSSIP.
  • Failure Detection: Used to poll cluster nodes to detect node failures. Protocols: FD, FD_SIMPLE, FD_PING, FD_ICMP, FD_SOCK, and VERIFY_SUSPECT.
  • Reliable Delivery: Ensures that messages are actually delivered, and delivered in the right order (FIFO), to the destination node. Protocols: CAUSAL, NAKACK, pbcast.NAKACK, SMACK, UNICAST, and PBCAST.
  • Group Membership: Used to notify the cluster when a node joins, leaves, or crashes. Protocols: pbcast.GMS, MERGE, MERGE2, and VIEW_SYNC.
  • Flow Control: Used to adapt the data-sending rate to the data-receipt rate among nodes. Protocol: FC.
  • Fragmentation: Fragments messages larger than a certain size and reassembles them at the receiver's side. Protocol: FRAG2.
  • State Transfer: Synchronizes the application state (serialized as a byte array) from an existing node to a newly joining node. Protocols: pbcast.STATE_TRANSFER and pbcast.STREAMING_STATE_TRANSFER.
  • Distributed Garbage Collection: Periodically removes messages that have been seen by all nodes from the memory of each node. Protocol: pbcast.STABLE.

While all the above protocols play a role in message exchange, you don't need to know the inner details of all of them to tune your applications. So we will focus on just a few interesting ones.

The FC protocol, for example, can be used to adapt the rate of messages sent to the rate of messages received. This has the advantage of creating a homogeneous rate of exchange, where no sending member overwhelms the receiving nodes, thus preventing problems like buffers filling up and packets being dropped. Here's an example of FC configuration:

<FC max_credits="2000000"
    min_threshold="0.10"
    ignore_synchronous_response="true"/>

The message rate adaptation is done with a simple credit system: each time a sender sends a message, credits are subtracted (equal to the number of bytes sent). Conversely, when a receiver collects a message, credits are replenished to the sender. A toy model of this accounting is sketched after the following list.

  • max_credits specifies the maximum number of credits (in bytes) and should obviously be smaller than the JVM heap size
  • min_threshold specifies the value of min_credits as a percentage of the max_credits element
  • ignore_synchronous_response specifies whether threads that have carried messages up to the application should be allowed to carry outgoing messages back down through FC without blocking for credits
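Here is the toy model of the credit accounting mentioned above (again a sketch of the concept, not JGroups code):

// Toy model of FC's credit system: a sender starts with max_credits bytes of
// credit and blocks when the remaining credits would fall below min_credits
// (min_threshold * max_credits); receivers replenish credits as they deliver.
public class CreditFlowControl {
    private long credits;
    private final long minCredits;

    public CreditFlowControl(long maxCredits, double minThreshold) {
        this.credits = maxCredits;
        this.minCredits = (long) (maxCredits * minThreshold);
    }

    public synchronized void send(int messageBytes) throws InterruptedException {
        while (credits - messageBytes < minCredits) {
            wait(); // block the sender until receivers replenish credits
        }
        credits -= messageBytes;
    }

    public synchronized void replenish(long bytes) {
        credits += bytes;
        notifyAll(); // wake up any blocked senders
    }
}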

The following image depicts a simple scenario where HostA is sending messages (and thus its credits are reduced) to HostB and HostC, which replenish the credits accordingly.

The FC protocol, while providing control over the flow of messages, can be a bad choice for applications that issue synchronous group RPC calls. In this kind of application, if you have fast senders issuing messages but some slow receivers across the cluster, the overall rate of calls will be slowed down. For this reason, remove FC from your protocol stack if you are sending synchronous messages, or simply switch to the udp-sync protocol stack.

Besides JGroups, some network interface cards (NICs) and switches perform ethernet flow control (IEEE 802.3x), which pauses senders and adds overhead when receivers are overwhelmed. In order to avoid redundant flow control, you are advised to disable ethernet flow control. For managed switches, you can usually achieve this via a web or Telnet/SSH interface. For unmanaged switches, unfortunately, the only options are to hope that ethernet flow control is disabled, or to replace the switch.
On NICs, you can disable ethernet flow control by means of a simple shell command, for example, on Linux with ethtool:

/sbin/ethtool -A eth0 autoneg off rx off tx off

If you simply want to verify whether ethernet flow control is off:

/sbin/ethtool -a eth0

One more thing you must be aware of is that, by using JGroups, cluster nodes must store all messages received for potential retransmission in case of a failure. However, if we store all messages forever, we will run out of memory. The distributed garbage collection service in JGroups periodically removes messages that have been seen by all nodes from the memory in each node. The distributed garbage collection service is configured in the pbcast.STABLE sub-element like so:

<pbcast.STABLE stability_delay="1000"
               desired_avg_gossip="5000"
               max_bytes="400000"/>

The configurable attributes are as follows:

  • desired_avg_gossip: Specifies the interval (in milliseconds) between garbage collection runs. Setting this parameter to 0 disables this service.
  • max_bytes: Specifies the maximum number of bytes to receive before triggering a garbage collection run. Setting this parameter to 0 disables this service.

You are advised to set a max_bytes value if you have a high-traffic cluster.

Summary

In this article we took a look at the building blocks of JBoss Clustering and at configuring the JGroups transport.


About the Author


Francesco Marchioni

Francesco Marchioni is a Sun Certified Enterprise Architect employed by an Italian company based in Rome. He started learning Java in 1997, and since then he has followed the path of the newest application programming interfaces released by Sun. He joined the JBoss community in 2000, when the application server was running release 2.x.

He has spent many years as a software consultant, where he has envisioned many successful software migrations from vendor platforms to open source products such as JBoss AS, fulfilling the tight budget requirements of current times.

Over the past 5 years, he has been authoring technical articles for O'Reilly Media and running an IT portal focused on JBoss products (http://www.mastertheboss.com).


He has also co-authored the book Infinispan Data Grid Platform, Packt Publishing (August 2012), with Manik Surtani, which covers all the aspects related to the configuration and development of applications using the Infinispan Data Grid Platform (http://www.packtpub.com/infinispan-data-grid-platform/book).
