




















































Cassandra replicates data to multiple nodes; because of this, a read operation can be served by multiple nodes. If a read at QUORUM or higher is submitted, a Read Repair is executed, and the read operation will involve more than a single server. In a simple flat network, which nodes are chosen for digest reads is of little consequence. However, in multiple-datacenter or multiple-switch environments, having a read cross a switch or a slower WAN link between datacenters can add milliseconds of latency. This recipe shows how to debug the read path to see if reads are being routed as expected.
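To watch this routing happen, raise Cassandra's logging level to DEBUG. On a standard install this is done in conf/log4j-server.properties (the file name assumes a 0.7-era layout) by changing the root logger line: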
log4j.rootLogger=DEBUG,stdout,R
[default@ks1] set cf1['e']['mycolumn']='value';
Value inserted.
DEBUG 06:07:35,060 insert writing local RowMutation(keyspace='ks1', key='65', modifications=[cf1])
DEBUG 06:07:35,062 applying mutation of row 65
[default@ks1] get cf1['e']['mycolumn'];
Debugging messages should be displayed in the log.
DEBUG 06:08:35,917 weakread reading SliceByNamesReadCommand(table='ks1', key=65, columnParent='QueryPath(columnFamilyName='cf1', superColumnName='null', columnName='null')', columns=[6d79636f6c756d6e,]) locally
...
DEBUG 06:08:35,919 weakreadlocal reading SliceByNamesReadCommand(table='ks1', key=65, columnParent='QueryPath(columnFamilyName='cf1', superColumnName='null', columnName='null')', columns=[6d79636f6c756d6e,])
Changing the logging level to DEBUG causes Cassandra to print information as it handles reads internally. This is helpful when troubleshooting a snitch or when using consistency levels such as LOCAL_QUORUM or EACH_QUORUM, which route requests based on network topology.
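As an illustration of a read that exercises this routing, the following minimal sketch issues a get at LOCAL_QUORUM through the raw Thrift API against the column written in the earlier cassandra-cli example. The host, port, and class name here are assumptions, and LOCAL_QUORUM is only meaningful when the keyspace uses a datacenter-aware replication strategy such as NetworkTopologyStrategy:

import java.nio.ByteBuffer;

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.ColumnOrSuperColumn;
import org.apache.cassandra.thrift.ColumnPath;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;

public class LocalQuorumRead {
  public static void main(String[] args) throws Exception {
    // Connect to one node over Thrift; 9160 is the default rpc_port, the host is an example
    TFramedTransport transport = new TFramedTransport(new TSocket("10.0.1.1", 9160));
    Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
    transport.open();
    client.set_keyspace("ks1");

    // Read cf1['e']['mycolumn'] at LOCAL_QUORUM; with DEBUG logging enabled, the
    // coordinator's log shows which replicas receive the data and digest reads
    ColumnPath path = new ColumnPath("cf1");
    path.setColumn(ByteBuffer.wrap("mycolumn".getBytes("UTF-8")));
    ColumnOrSuperColumn cosc = client.get(
        ByteBuffer.wrap("e".getBytes("UTF-8")), path, ConsistencyLevel.LOCAL_QUORUM);
    System.out.println(cosc);

    transport.close();
  }
}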
While it is possible to simulate network failures by shutting down Cassandra instances, you may also wish to simulate a failure that partitions your network. A failure in which multiple systems are UP but cannot communicate with each other is commonly referred to as a split brain scenario. This state could happen if the uplink between switches fails or the connectivity between two datacenters is lost.
When editing any firewall, it is important to have a backup copy. Testing on a remote machine is risky as an incorrect configuration could render your system unreachable.
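A sketch of what this can look like on a RHEL/CentOS-style system, where the rules live in /etc/sysconfig/iptables and use the default RH-Firewall-1-INPUT chain (the addresses are examples): add an ACCEPT rule for the traffic that should still flow, then restart the iptables service.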
:RH-Firewall-1-INPUT - [0:0]
-A INPUT -j RH-Firewall-1-INPUT
-A FORWARD -j RH-Firewall-1-INPUT
-A RH-Firewall-1-INPUT -i lo -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state NEW -s 10.0.1.1 -d 10.0.1.2 -j ACCEPT
#/etc/init.d/iptables restart
iptables is a complete firewall that is a standard part of the modern Linux kernel. It has extensible rules that can permit or deny traffic based on many attributes, including, but not limited to, source IP, destination IP, source port, and destination port. This recipe uses its traffic-blocking features to simulate network failures and test how Cassandra operates when they occur.
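For instance, a partition between two nodes could be simulated by rejecting their inter-node traffic. The rule below is a sketch that assumes the default storage port 7000 and example addresses:
-A RH-Firewall-1-INPUT -p tcp -m tcp -s 10.0.2.1 -d 10.0.1.1 --dport 7000 -j REJECT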
A snitch is Cassandra's way of mapping a node to a physical location in the network. It helps determine the location of a node relative to other nodes so that requests can be routed efficiently. The RackInferringSnitch can only be used if your network IP allocation is divided along the octets of your IP addresses.
The following network diagram demonstrates a network layout that would be ideal for RackInferringSnitch.
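Selecting the snitch is done in conf/cassandra.yaml: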
endpoint_snitch: org.apache.cassandra.locator.RackInferringSnitch
The RackInferringSnitch requires no extra configuration as long as your network adheres to a specific subnetting scheme. In this scheme, the first octet, Y.X.X.X, is the private network number 10. The second octet, X.Y.X.X, represents the datacenter. The third octet, X.X.Y.X, represents the rack. The final octet, X.X.X.Y, represents the host. Cassandra uses this information to determine which hosts are 'closest'. It is assumed that 'closer' nodes will have more bandwidth and lower latency between them. Cassandra uses this information to send Digest Reads to the closest nodes and route requests efficiently.
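For example, under this scheme a node at 10.2.1.5 (an example address) would be read as host 5 in rack 1 of datacenter 2; 10.2.1.9 would be treated as sharing that rack, while 10.3.1.5 would belong to a different datacenter and therefore be considered 'farther' away.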
While it is ideal if the network conforms to what the RackInferringSnitch requires, this is not always practical or possible. The scheme is also rigid: if a single machine does not adhere to the convention, the snitch will not work properly.