MySQL Cluster Management : Part 1

Exclusive offer: get 50% off this eBook here
High Availability MySQL Cookbook

High Availability MySQL Cookbook — Save 50%

Over 60 simple but incredibly effective recipes focusing on different methods of achieving high availability for MySQL database

€20.99    €10.50
by Alex Davies | May 2010 | MySQL

In this article series by Alex Davies, author of High Availability MySQL Cookbook, we will cover:

  • Configuring multiple management nodes
  • Obtaining usage information
  • Adding storage nodes online
  • Replication between MySQL Clusters
  • Replication between MySQL Clusters with a backup channel
  • User-defined partitioning
  • Disk-based tables
  • Calculating DataMemory and IndexMemory

(Read more interesting articles on MySQL High Availability here.)

Configuring multiple management nodes

Every MySQL Cluster must have a management node to start and also to carry out critical tasks such as allowing other nodes to restart, running online backups, and monitoring the status of the cluster. However, it is strongly recommended for a production cluster to ensure that a management node is always available, and this requires more than one node. In this recipe, we will discuss the minor complications that more than one management node will bring before showing the configuration of a new cluster with two management nodes. Finally, the modification of an existing cluster to add a second management node will be shown.

Getting ready

In a single management node cluster, everything is simple. Nodes connect to the management node, get a node ID, and join the cluster. When the management node starts, it reads the config.ini file, starts and prepares to give the cluster information contained within the config.ini file out to the cluster nodes as and when they join.

This process can become slightly more complicated when there are multiple management nodes, and it is important that each management node takes a different ID. Therefore, the first additional complication is that it is an extremely good idea to specify node IDs and ensure that the HostName parameter is set for each management node in the config.ini file.

It is technically possible to start two management nodes with different cluster configuration files in a cluster with multiple management nodes. It is not difficult to see that this can cause all sorts of bizarre behavior including a likely cluster shutdown in the case of the primary management node failing. Ensure that every time the config.ini file is changed, the change is correctly replicated to all management nodes. You should also ensure that all management nodes are always using the same version of the config.ini file.

It is possible to hold the config.ini file on a shared location such as a NFS share, although to avoid introducing complexity and a single point of failure, the best practice would be to store the configuration file in a configuration management system such as Puppet (http://www.puppetlabs.com/) or Cfengine (http://www.cfengine.org/).

How to do it...

The following process should be followed to configure a cluster for multiple management nodes. In this recipe, High Availability with MySQL Cluster. Initially, this recipe will cover the procedure to be followed in order to configure a new cluster with two management nodes. Thereafter, the procedure for adding a second management node to an already running single management node cluster will be covered.

The first step is to define two management nodes in the global configuration file config.ini on both management nodes.

In this example, we are using IP addresses 10.0.0.5 and 10.0.0.6 for the two management nodes that require the following two entries of [ndb_mgmd] in the config.ini file:

[ndb_mgmd]
Id=1
HostName=10.0.0.5
DataDir=/var/lib/mysql-cluster
[ndb_mgmd]
Id=2
HostName=10.0.0.6
DataDir=/var/lib/mysql-cluster

Update the [mysqld] section of each SQL node's /etc/my.cnf to point to both management nodes:

[mysqld]
ndb-connectstring=10.0.0.5,10.0.0.6

Update the [mysqld] section of each SQL node's /etc/my.cnf to point to both management nodes:

[mysqld]
ndb-connectstring=10.0.0.5,10.0.0.6

Now, prepare to start both the management nodes. Install the management node on both nodes, if it does not already exist.

Before proceeding, ensure that you have copied the updated config.ini file to both management nodes.

Start the first management node by changing to the correct directory and running the management node binary (ndb_mgmd) with the following flags:

  • --initial: Deletes the local cache of the config.ini file and updates it (you must do this every time the config.ini file is changed).
  • --ndb-nodeid=X: Tells the node to connect as this nodeid, as we specified in the config.ini file. This is technically unnecessary if there is no ambiguity as to which nodeid this particular node may connect to (in this case, both nodes have a HostName defined). However, defining it reduces the possibility of confusion.
  • --config-file=config.ini: This is used to specify the configuration file. In theory, passing a value of the config.ini file in the local directory is unnecessary because it is the default value. But in certain situations, it seems that passing this in any case avoids issues, and again this reduces the possibility of confusion.
[root@node6 mysql-cluster]# cd /usr/local/mysql-cluster
[root@node6 mysql-cluster]# ndb_mgmd --config-file=config.ini --initial --ndb-nodeid=2
2009-08-15 20:49:21 [MgmSrvr] INFO -- NDB Cluster Management Server. mysql-5.1.34 ndb-7.0.6
2009-08-15 20:49:21 [MgmSrvr] INFO -- Reading cluster configuration from 'config.ini'

Repeat this command on the other node using the correct node ID:

[root@node5 mysql-cluster]# cd /usr/local/mysql-cluster
[root@node5 mysql-cluster]# ndb_mgmd --config-file=config.ini --initial --ndb-nodeid=1

Now, start each storage node in turn. Use the storage management client's show command to show that both management nodes are connected and that all storage nodes have been reconnected:

ndb_mgm> show
Connected to Management Server at: 10.0.0.5:1186
Cluster Configuration
---------------------
[ndbd(NDB)] 4 node(s)
id=3 @10.0.0.1 (mysql-5.1.34 ndb-7.0.6, Nodegroup: 0, Master)
id=4 @10.0.0.2 (mysql-5.1.34 ndb-7.0.6, Nodegroup: 0)
id=5 @10.0.0.3 (mysql-5.1.34 ndb-7.0.6, Nodegroup: 1)
id=6 @10.0.0.4 (mysql-5.1.34 ndb-7.0.6, Nodegroup: 1)
[ndb_mgmd(MGM)] 2 node(s)
id=1 @10.0.0.5 (mysql-5.1.34 ndb-7.0.6)
id=2 @10.0.0.6 (mysql-5.1.34 ndb-7.0.6)
[mysqld(API)] 4 node(s)
id=11 @10.0.0.1 (mysql-5.1.34 ndb-7.0.6)
id=12 @10.0.0.2 (mysql-5.1.34 ndb-7.0.6)
id=13 @10.0.0.3 (mysql-5.1.34 ndb-7.0.6)
id=14 @10.0.0.4 (mysql-5.1.34 ndb-7.0.6)

Finally, restart all SQL nodes (mysqld processes). On RedHat-based systems, this can be achieved using the service command:

[root@node1 ~]# service mysqld restart

Congratulations! Your cluster is now configured with multiple management nodes. Test that failover works by killing a management node, in turn, the remaining management nodes should continue to work.

There's more...

It is sometimes necessary to add a management node to an existing cluster if for example, due to a lack of hardware or time, an initial cluster only has a single management node.

Adding a management node is simple. Firstly, install the management client on the new node . Secondly, modify the config.ini file, as shown earlier in this recipe for adding the new management node, and copy this new config.ini file to both management nodes. Finally, stop the existing management node and start the new one using the following commands:

For the existing management node, type:

[root@node6 mysql-cluster]# killall ndb_mgmd [root@node6 mysql-cluster]# ndb_mgmd --config-file=config.ini --

initial --ndb-nodeid=2
2009-08-15 21:29:53 [MgmSrvr] INFO -- NDB Cluster Management Server. mysql-5.1.34 ndb-7.0.6
2009-08-15 21:29:53 [MgmSrvr] INFO -- Reading cluster configuration from 'config.ini'

Then type the following command for the new management node:

[root@node5 mysql-cluster]# ndb_mgmd --config-file=config.ini --initial --ndb-nodeid=1
2009-08-15 21:29:53 [MgmSrvr] INFO -- NDB Cluster Management Server. mysql-5.1.34 ndb-7.0.6
2009-08-15 21:29:53 [MgmSrvr] INFO -- Reading cluster configuration from 'config.ini'torage node one at a time. Ensure that you only stop one node per nodegroup at a time and wait for it to fully restart before taking another node in the nodegroup, when offline, in order to avoid any downtime.

See also

Look at the section for the online addition of storage nodes (discussed later in this article) for further details on restarting storage nodes one at a time.

Obtaining usage information

This recipe explains how to monitor the usage of a MySQL Cluster, looking at the memory, CPU, IO, and network utilization on storage nodes.

Getting ready

MySQL Cluster is extremely memory-intensive. When a MySQL Cluster starts, the storage nodes will start using the entire DataMemory and IndexMemory allocated to them. In a production cluster with a large amount of RAM, it is likely that this will include a large proportion of the physical memory on the server.

How to do it...

An essential part of managing a MySQL Cluster is looking into what is happening inside each storage node. In this section, we will cover the vital commands used to monitor a cluster.

To monitor the memory (RAM) usage of the nodes within the cluster, execute the &ltnodeid> REPORT MemoryUsage command within the management client as follows:

 ndb_mgm> 3 REPORT MemoryUsage
Node 3: Data usage is 0%(21 32K pages of total 98304)
Node 3: Index usage is 0%(13 8K pages of total 131104)

This command can be executed for all storage nodes rather than just one by using ALL nodeid:

ndb_mgm> ALL REPORT MemoryUsage
Node 3: Data usage is 0%(21 32K pages of total 98304)
Node 3: Index usage is 0%(13 8K pages of total 131104)
Node 4: Data usage is 0%(21 32K pages of total 98304)
Node 4: Index usage is 0%(13 8K pages of total 131104)
Node 5: Data usage is 0%(21 32K pages of total 98304)
Node 5: Index usage is 0%(13 8K pages of total 131104)
Node 6: Data usage is 0%(21 32K pages of total 98304)
Node 6: Index usage is 0%(13 8K pages of total 131104)

This information shows that these nodes are actually using 0% of their DataMemory and IndexMemory.

Memory allocation is important and unfortunately a little more complicated than a percentage used on each node. There is more detail about this in the How it works... section of this recipe, but the vital points to remember are:

  • It is a good idea never to go over 80 percent of memory usage (particularly not for DataMemory)
  • In the case of a cluster with a very high memory usage, it is possible that a cluster will not restart correctly

MySQL Cluster storage nodes make extensive use of disk storage unless specifically configured not to, regardless of whether a cluster is using disk-based tables. It is important to ensure the following:

  • There is sufficient storage available
  • There is sufficient IO bandwidth for the storage node and the latency is not too high

To confirm the disk usage on Linux, use the command df –h as follows:

[root@node1 mysql-cluster]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/system-root
7.6G 2.0G 5.3G 28% /
/dev/xvda1 99M 21M 74M 22% /boot
tmpfs 2.0G 0 2.0G 0% /dev/shm
/dev/mapper/system-ndb_data
2.0G 83M 1.8G 5% /var/lib/mysql-cluster
/dev/mapper/system-ndb_backups
2.0G 68M 1.9G 4% /var/lib/mysql-cluster/BACKUPS

In this example, the cluster data directory and backup directory are on different logical volumes. This provides the following benefits:

  • It is easy to see their usage (5% for data and 4% for backups)
  • Each volume is isolated from other partitions or logical volumes—it means that they are protected from, let's say, a logfile growing in the logs directory

To confirm the rate at which the kernel is writing to and reading from the disk, use the vmstat command:

[root@node1 ~]# vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- --
---cpu------
r b swpd free buff cache si so bi bo in cs us
sy id wa st
0 0 0 2978804 324784 353856 0 0 1 121 39 15 0
0 100 0 0
3 0 0 2978804 324784 353856 0 0 0 0 497 620 0
0 99 0 1
0 0 0 2978804 324784 353856 0 0 0 172 529 665 0
0 100 0 0

The bi and bo columns represent the blocks read from a disk and blocks written to a disk, respectively. The first line can be ignored (it's the average since boot), and the number passed to the command, in this case, the refresh rate in seconds. By using a tool such as bonnie (refer to the See also section at the end of this recipe) to establish the potential of each block device, you can then check to see the maximum proportion of each block device is currently being used.

At times of high stress, like during a hot backup, if the disk utilization is too high it is potentially possible that the storage node will start spending a lot of time in the iowait state—this will reduce performance and should be avoided. One way to avoid this is by using a separate block device (that is, disk or raid controller) for the backups mount point.

High Availability MySQL Cookbook Over 60 simple but incredibly effective recipes focusing on different methods of achieving high availability for MySQL database
Published: April 2010
eBook Price: €20.99
Book Price: €34.99
See more
Select your format and quantity:

How it works...

Data within the MySQL Cluster is stored in two parts. In broader terms, the fixed part of a row (fields with a fixed width, such as INT, CHAR, and so on) is stored separately from variable fields (for example, VARCHAR).

As data is stored in 32 KB pages, it is possible for variable-length data to become quite fragmented in cases where a cluster only has free space in existing pages that are available because data has been deleted.

Fragmentation is clearly bad. To reduce it, run the SQL command optimize table as follows:

mysql> optimize table City;
+------------+----------+----------+----------+
| Table | Op | Msg_type | Msg_text |
+------------+----------+----------+----------+
| world.City | optimize | status | OK |
+------------+----------+----------+----------+
1 row in set (0.02 sec)

To know more about fragmentation, check out the GPL tool chkfrag at http://www.severalnines.com/chkfrag/index.php.

There's more...

It is also essential to monitor network utilization because latency will dramatically increase as utilization gets close to 100 percent of either an individual network card or a network device like a switch. If network latency increases by a very small amount, then its effect on performance will be significant. This article will not discuss the many techniques for monitoring the overall network health. However, we will see a tool called iptraf that is very useful inside clusters for working out which node is interacting with which node and what proportion of network resources it is using.

A command such as iptraf –i eth0 will show the network utilization broken down by connection, which can be extremely useful when trying to identify connections on a node that are causing problems. The screenshot for the iptraf command is as follows:

The previous screenshot shows the connections on the second interface (dedicated to cluster traffic) for the first node in a four-storage node cluster. The connection that each node makes with the others (10.0.0.2, 10.0.0.3, and 10.0.0.4 are other storage nodes) is obvious as well as the not entirely obvious ports selected for each connection. There is also a connection to the management node. The Bytes column gives a clear indication of which connections are most utilized.

See also

Bonnie—disk reporting and benchmarking tool at: http://www.garloff.de/kurt/linux/bonnie/

Adding storage nodes online

The ability to add a new node without any downtime is a relatively new feature of MySQL Cluster which dramatically improves long-term uptime in cases where the regular addition of nodes is required, for example, where data volume or query load is continually increasing.

Getting ready

In this recipe, we will show an example of how to add two nodes to an existing two-node cluster (while maintaining NoOfReplicas=2 or two copies of each fragment of data).

The start point for this recipe is a cluster with two storage nodes and one management node running successfully with some data imported . Ensure that the world database has been imported as an NDB table.

How to do it...

Firstly, ensure that your cluster is fully running (that is, all management and storage nodes are running). The command to do this is as follows:

[root@node5 mysql-cluster]# ndb_mgm
ndb_mgm> show
Cluster Configuration
---------------------
[ndbd(NDB)] 2 node(s)
id=2 @10.0.0.1 (mysql-5.1.34 ndb-7.0.6, Nodegroup: 0, Master)
id=3 @10.0.0.2 (mysql-5.1.34 ndb-7.0.6, Nodegroup: 0)
[ndb_mgmd(MGM)] 1 node(s)
id=1 @10.0.0.5 (mysql-5.1.34 ndb-7.0.6)
[mysqld(API)] 4 node(s)
id=4 @10.0.0.1 (mysql-5.1.34 ndb-7.0.6)
id=5 @10.0.0.2 (mysql-5.1.34 ndb-7.0.6)
id=6 (not connected, accepting connect from any host)
id=7 (not connected, accepting connect from any host)

Edit the global cluster configuration file on the management node (/usr/local/mysql-cluster/config.ini) with your favorite text editor to add the new nodes:

[ndb_mgmd]
Id=1
HostName=10.0.0.5
DataDir=/var/lib/mysql-cluster
[ndbd default]
DataDir=/var/lib/mysql-cluster
MaxNoOfConcurrentOperations = 150000
MaxNoOfAttributes = 10000
MaxNoOfOrderedIndexes=512
DataMemory=3G
IndexMemory=1G
NoOfReplicas=2
[ndbd]
HostName=10.0.0.1
[ndbd]
HostName=10.0.0.2
[ndbd]
HostName=10.0.0.3
[ndbd]
HostName=10.0.0.4
[mysqld]
HostName=10.0.0.1
[mysqld]
HostName=10.0.0.2

Now, perform a rolling cluster management node restart by copying the new config.ini file to all management nodes and executing the following commands on all management nodes as follows:

[root@node5 mysql-cluster]# killall ndb_mgmd
[root@node5 mysql-cluster]# ndb_mgmd --initial --config-file=/usr/local/mysql-cluster/config.ini

At this point, you should see the storage node status as follows:

[root@node5 mysql-cluster]# ndb_mgm
-- NDB Cluster -- Management Client --
ndb_mgm> show
Connected to Management Server at: 10.0.0.5:1186
Cluster Configuration
---------------------
[ndbd(NDB)] 4 node(s)
id=2 @10.0.0.1 (mysql-5.1.34 ndb-7.0.6, Nodegroup: 0, Master)
id=3 @10.0.0.2 (mysql-5.1.34 ndb-7.0.6, Nodegroup: 0)
id=4 (not connected, accepting connect from 10.0.0.3)
id=5 (not connected, accepting connect from 10.0.0.4)

Now, restart the active current nodes—in this case, the nodes with id 2 and 3 (10.0.0.1 and 10.0.0.2). This can be done with the management client command &ltnodeid> RESTART or by killing the ndbd process and restarting (there is no need for --initial):

ndb_mgm> 3 restart;
Node 3: Node shutdown initiated
Node 3: Node shutdown completed, restarting, no start.
Node 3 is being restarted
Node 3: Start initiated (version 7.0.6)
Node 3: Data usage decreased to 0%(0 32K pages of total 98304)
Node 3: Started (version 7.0.6)
ndb_mgm> 2 restart;
Node 2: Node shutdown initiated
Node 2: Node shutdown completed, restarting, no start.
Node 2 is being restarted
Node 2: Start initiated (version 7.0.6)
Node 2: Data usage decreased to 0%(0 32K pages of total 98304)
Node 2: Started (version 7.0.6)

At this point, the new nodes have still not joined the cluster. Now, run ndbd –initial on both these nodes (10.0.0.3 and 10.0.0.4) as follows:

[root@node1 ~]# ndbd
2009-08-18 20:39:32 [ndbd] INFO -- Configuration fetched from '10.0.0.5:1186', generation: 1

If you check the status of the show command in the management client, shortly after starting the new storage nodes, you will notice that the newly-started storage nodes move to a started state very rapidly (when compared to other nodes in the cluster). However, but they are shown as belonging to "no nodegroup" as shown in the following output:

ndb_mgm> show
Cluster Configuration
---------------------
[ndbd(NDB)] 4 node(s)
id=2 @10.0.0.1 (mysql-5.1.34 ndb-7.0.6, Nodegroup: 0)
id=3 @10.0.0.2 (mysql-5.1.34 ndb-7.0.6, Nodegroup: 0, Master)
id=4 @10.0.0.3 (mysql-5.1.34 ndb-7.0.6, no nodegroup)
id=5 @10.0.0.4 (mysql-5.1.34 ndb-7.0.6, no nodegroup)

Now, we need to create a new nodegroup for these nodes. We have set NoOfReplicas=2 in the config.ini file, so each nodegroup must contain two nodes. We use the CREATE NODEGROUP &ltnodeID>,&ltnodeID> command to add a nodegroup.

If we had NoOfReplicas=4, we would pass four comma-separated nodeIDs to this command.

ndb_mgm> CREATE NODEGROUP 4,5
Nodegroup 1 created<

Nodegroup 1 now exists. To see the information, use the show command as follows:

ndb_mgm> show
Cluster Configuration
---------------------
[ndbd(NDB)] 4 node(s)
id=2 @10.0.0.1 (mysql-5.1.34 ndb-7.0.6, Nodegroup: 0)
id=3 @10.0.0.2 (mysql-5.1.34 ndb-7.0.6, Nodegroup: 0, Master)
id=4 @10.0.0.3 (mysql-5.1.34 ndb-7.0.6, Nodegroup: 1)
id=5 @10.0.0.4 (mysql-5.1.34 ndb-7.0.6, Nodegroup: 1)

Congratulations! You have now added two new nodes to your cluster, which will be used by the cluster for new fragments of data. Look at the There's more… section of this recipe to see how you can get these nodes used right away and the How it works… section for a brief explanation of what is going on behind the scenes.

How it works...

After you have added the new nodes, it is possible to take a look at how a table is being stored within the cluster. If you used the world sample database imported in , then you will have a City table inside the world database. Running the ndb_desc binary as follows on a storage or management node shows you where the data is stored.

The first parameter, after –d, is the database name and the second is the table name. If a [mysql_cluster] section is not defined in /etc/my.cnf, the management node IP address may be passed with -c.

[root@node1 ~]# ndb_desc -d world City -p
-- City --
Version: 1
Fragment type: 9
K Value: 6
Min load factor: 78
Max load factor: 80
Temporary table: no
Number of attributes: 5
Number of primary keys: 1
Length of frm data: 324
Row Checksum: 1
Row GCI: 1
SingleUserMode: 0
ForceVarPart: 1
FragmentCount: 2
TableStatus: Retrieved
-- Attributes --
ID Int PRIMARY KEY DISTRIBUTION KEY AT=FIXED ST=MEMORY AUTO_INCR
Name Char(35;latin1_swedish_ci) NOT NULL AT=FIXED ST=MEMORY
CountryCode Char(3;latin1_swedish_ci) NOT NULL AT=FIXED ST=MEMORY
District Char(20;latin1_swedish_ci) NOT NULL AT=FIXED ST=MEMORY
Population Int NOT NULL AT=FIXED ST=MEMORY
-- Indexes --
PRIMARY KEY(ID) - UniqueHashIndex
PRIMARY(ID) - OrderedIndex
-- Per partition info --
Partition Row count Commit count Frag fixed memory Frag varsized memory
0 2084 2084 196608 0
1 1995 1995 196608 0
NDBT_ProgramExit: 0 - OK

There are two partitions—one is active on one of the initial nodes, and the other is active on the second of the initial nodes. The new node is not being used at all.

If you import exactly the same table with the new cluster into a new database (four nodes), then you will notice that there are four partitions, and they are as follows:

-- Per partition info --
Partition Row count Commit count Frag fixed memory Frag varsized memory
0 1058 1058 98304 0
2 1026 1026 98304 0
1 1018 1018 98304 0
3 977 977 98304 0

Therefore, when we add a new nodegroup, it is important to reorganize the data in he existing nodes to ensure that it is spread out across the whole cluster and this does not happen automatically. New data, however, is automatically spread out across the whole cluster.

The process to reorganize data in the cluster to use all storage nodes is outlined in the next section.

High Availability MySQL Cookbook Over 60 simple but incredibly effective recipes focusing on different methods of achieving high availability for MySQL database
Published: April 2010
eBook Price: €20.99
Book Price: €34.99
See more
Select your format and quantity:

There's more...

To reorganize the data within a cluster to use all new storage nodes, run the ALTER TABLE x REORGANIZE PARTITION query in a SQL node, substituting x for a table name. This command must be run once per table in the cluster.

In NDB 7.0, the redistribution does not include unique indexes (only ordered indexes are redistributed) or BLOB table data. This is a limitation that is likely to be removed in later releases. If you have a large amount of these two forms of data, then it is likely to that you will notice unequal loadings on your new nodes even after this process. Newly inserted data will, however, be distributed across all nodes correctly.

This query can be executed on any storage node and should not affect the execution of other queries—although it will, of course, increase the load on the storage nodes involved:

[root@node1 ~]# mysql
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 5
Server version: 5.1.34-ndb-7.0.6-cluster-gpl MySQL Cluster Server (GPL)
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> use world;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> ALTER ONLINE TABLE City REORGANIZE PARTITION;
Query OK, 0 rows affected (9.04 sec)
Records: 0 Duplicates: 0 Warnings: 0

After this, run an OPTIMIZE TABLE query to reduce fragmentation significantly, as follows:

mysql> OPTIMIZE TABLE City;
+------------+----------+----------+----------+
| Table | Op | Msg_type | Msg_text |
+------------+----------+----------+----------+
| world.City | optimize | status | OK |
+------------+----------+----------+----------+
1 row in set (0.03 sec)

Now, use the ndb_desc command as follows—it shows four partitions and our data spread across all the new storage nodes:

[root@node1 ~]# ndb_desc -d world City -p
-- Per partition info --
Partition Row count Commit count Frag fixed memory Frag varsized memory
0 1058 4136 196608 0
3 977 977 98304 0
1 1018 3949 196608 0
2 1026 1026 98304 0

Replicating between MySQL Clusters

Replication is commonly used for single MySQL servers. In this recipe, we will explain how to use this technique with MySQL Cluster—replicating from one MySQL Cluster to another and replicating from a MySQL Cluster to a standalone server.

Getting ready

Replication is often used to provide a Disaster Recovery site, some distance away from a primary location, which is asynchronous (in contrast with the synchronous nature of the information flows within a MySQL Cluster). The asynchronous nature of replication means that the main cluster does not experience any performance degradation at the expense of a potential loss of a small amount of data in the event of the master cluster failing.

Replication involving a MySQL Cluster introduces the concept of replication channels. A replication channel is made up of two replication nodes. One of these nodes is in the source machine or cluster, and the other in the destination machine or cluster. It is good practice to have more than one replication channel for redundancy.

The following diagram illustrates the replication channel:

Note that this diagram shows two replication channels. Currently, with Cluster Replication, only one channel can be active at any one time. It is good practice to have another channel set up almost ready to go, so that in the event one of the nodes involved in the primary channel fails, it is very quick to bring up a new channel.

In general, all replication nodes should be of the same, or very similar, MySQL version.

How to do it...

Firstly, prepare the two parts of the replication channel. In this example, we will replicate from one cluster to another. The source end of the channel is referred to as the master and the destination as the slave.

All mysqld processes (SQL nodes or standalone MySQL servers) involved as a replication agent (either as master or slave) must be configured to have a unique server-ID. Additionally, the master must also have some additional configuration in the [mysqld] section of /etc/my.cnf. Start by adding this to the master SQL node's /etc/my.cnf file as follows:

# Enable cluster replication
log-bin
binlog-format=ROW
server-id=3

Add the server-id parameter only to all MySQL servers that are acting as slave nodes, and restart all SQL nodes that have had my.cnf modified:

[root@node4 ~]# service mysql restart
Shutting down MySQL... [ OK ]
Starting MySQL. [ OK ]

On the master node, add an account for the slave node as follows:

[root@node1 ~]# mysql
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.1.34-ndb-7.0.6-cluster-gpl-log MySQL Cluster Server (GPL)
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> GRANT REPLICATION SLAVE ON *.* TO 'slave'@'10.0.0.3' IDENTIFIED BY 'password';
Query OK, 0 rows affected (0.00 sec)
mysql> FLUSH PRIVILEGES;
Query OK, 0 rows affected (0.00 sec)

On the master, run the command SHOW MASTER STATUS to establish the current logfile name and position as follows:

mysql> SHOW MASTER STATUS;
+------------------+----------+--------------+------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| node1-bin.000001 | 318 | | |
+------------------+----------+--------------+------------------+
1 row in set (0.00 sec)

On the slave, issue a CHANGE MASTER TO command as follows to tell the slave where the master is, what user and password to use to log in, what logfile it is currently at, and what logfile position to start from:

mysql> CHANGE MASTER TO MASTER_HOST='10.0.0.1', MASTER_USER='slave', MASTER_PASSWORD='password', 

MASTER_LOG_FILE='node1-bin.000001', MASTER_LOG_POS=318
Query OK, 0 rows affected (0.00 sec)
mysql> start slave;
Query OK, 0 rows affected (0.00 sec)

Now, check the status, as follows, to ensure that the node has connected correctly:

mysql> SHOW SLAVE STATUS\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 10.0.0.1
Master_User: slave
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: node1-bin.000001
Read_Master_Log_Pos: 318
Relay_Log_File: node3-relay-bin.000002
Relay_Log_Pos: 253
Relay_Master_Log_File: node1-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 318
Relay_Log_Space: 409
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Master_Bind:
1 row in set (0.00 sec)
ERROR:
No query specified

Replication is now working. To be sure, connect to the master node and run some SQL queries. In this example, we will create a database as follows:

[master node] mysql> CREATE DATABASE test1;
Query OK, 1 row affected (0.26 sec)

Ensure that this database is created on the slave node:

[slave node] mysql> SHOW DATABASES;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| test1 |
+--------------------+
8 rows in set (0.00 sec)

Now, from another node in the same cluster as the master, create another database as follows:

[node in master cluster] mysql> CREATE DATABASE test2;
Query OK, 1 row affected (0.26 sec)

And ensure that it appears on the slave node:

[slave_node] mysql> SHOW DATABASES;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| test1 |
| test2 |
+--------------------+
8 rows in set (0.00 sec)

If your slave node is also a member of a (different) cluster, then check that this new database has appeared on all nodes in that cluster too:

[node in slave cluster] mysql> SHOW DATABASES;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| test1 |
| test2 |
+--------------------+
8 rows in setp----------+
1 row in set (0.00 sec)

You can see that the new database has been correctly replicated to the slave cluster. It is good practice to test this replication channel with some real data, perhaps by importing the world dataset into the new database on the master cluster.

How it works...

MySQL Cluster replication is implemented by a dedicated thread—the NDB binlog injector thread that runs on each SQL node and produces a standard binary log (binlog), which a slave can connect to normally. This binlog injector thread ensures that all changes within the cluster that the SQL node is a member of are inserted into the binary log and not just the queries that were executed on that specific SQL node. This thread additionally ensures that these transactions are inserted in the correct serialization order. Therefore, the vast majority of the process is identical to the standard MySQL Replication.

If you only had a single SQL node in a cluster, then there would be no need for this thread, and standard MySQL Replication would work perfectly. Unfortunately, there is very little use in a MySQL Cluster with one SQL node.

There's more...

Cluster replication is an extremely powerful tool. In the following section, we will cover a couple of the most useful and more advanced techniques of cluster replication.

In this article series, we have learned:

  • Configuring multiple management nodes
  • Obtaining usage information
  • Adding storage nodes online
  • Replication between MySQL Clusters

We will discuss the following the next article- MySQL Cluster Management : Part 2

  • Replication between MySQL Clusters with a backup channel
  • User-defined partitioning
  • Disk-based tables
  • Calculating DataMemory and IndexMemory

If you have read this article you may be interested to view :


About the Author :


Alex Davies

Alex Davies has a wealth of experience working with MySQL, becoming a founding member of the MySQL Guilds before the age of 18. Author of the first book on MySQL Clustering, published in 2006, Alex has significant experience working on MySQL servers, often working on scalability and high availability problems.

Books From Packt

MySQL for Python: Database Access Made Easy
MySQL for Python: Database Access Made Easy

MySQL Admin Cookbook
MySQL Admin Cookbook

Mastering phpMyAdmin 3.1 for Effective MySQL Management
Mastering phpMyAdmin 3.1 for Effective MySQL Management

MySQL 5.1 Plugins Development
MySQL 5.1 Plugins Development

Magento 1.3: PHP Developer's Guide
Magento 1.3: PHP Developer's Guide

jQuery 1.4 Reference Guide
jQuery 1.4 Reference Guide

Plone 3 Products Development Cookbook
Plone 3 Products Development Cookbook

Wordpress and Flash 10x Cookbook
Wordpress and Flash 10x Cookbook

Code Download and Errata
Packt Anytime, Anywhere
Register Books
Print Upgrades
eBook Downloads
Video Support
Contact Us
Awards Voting Nominations Previous Winners
Judges Open Source CMS Hall Of Fame CMS Most Promising Open Source Project Open Source E-Commerce Applications Open Source JavaScript Library Open Source Graphics Software
Resources
Open Source CMS Hall Of Fame CMS Most Promising Open Source Project Open Source E-Commerce Applications Open Source JavaScript Library Open Source Graphics Software