In this article by Chitij Chauhan and Dinesh Kumar, the authors of the book PostgreSQL High Performance Cookbook, we will talk about various high availability and replication solutions, including popular third-party replication tools such as Slony.
Setting up hot streaming replication
In this recipe, we are going to set up master/slave streaming replication.
For this exercise you will need two Linux machines, each with the latest version of PostgreSQL 9.6 installed. We will be using the following IP addresses for the master and slave servers:
- Master IP address: 192.168.0.4
- Slave IP Address: 192.168.0.5
How to do it…
The following is the sequence of steps for setting up master/slave streaming replication:
- Set up passwordless SSH authentication between the master and slave for the postgres user.
- First, we are going to create a user ID on the master that will be used by the slave server to connect to the PostgreSQL database on the master server:
psql -c "CREATE USER repuser REPLICATION LOGIN ENCRYPTED PASSWORD 'charlie';"
- Next, we allow the replication user created in the previous step to access the master PostgreSQL server. This is done by making the necessary changes in the pg_hba.conf file:
vi pg_hba.conf

host    replication    repuser    192.168.0.5/32    md5
- In the next step we are going to configure parameters in the postgresql.conf file. These parameters must be set in order to get streaming replication working:
vi /var/lib/pgsql/9.6/data/postgresql.conf

listen_addresses = '*'
wal_level = hot_standby
max_wal_senders = 3
wal_keep_segments = 8
archive_mode = on
archive_command = 'cp %p /var/lib/pgsql/archive/%f && scp %p firstname.lastname@example.org:/var/lib/pgsql/archive/%f'
- Once the parameter changes have been made in the postgresql.conf file, the next step is to restart the PostgreSQL server on the master in order for the changes to take effect:
pg_ctl -D /var/lib/pgsql/9.6/data restart
- Before the slave can replicate the master, we need to give it the initial database to build from. For this purpose we will make a base backup by copying the primary server's data directory to the standby:
psql -U postgres -h 192.168.0.4 -c "SELECT pg_start_backup('label', true)"
rsync -a /var/lib/pgsql/9.6/data/ 192.168.0.5:/var/lib/pgsql/9.6/data/ --exclude postmaster.pid
psql -U postgres -h 192.168.0.4 -c "SELECT pg_stop_backup()"
- Once the data directory on the slave has been populated, the next step is to configure the following parameter in the postgresql.conf file on the slave server:
hot_standby = on
- Next, copy the recovery.conf.sample file to the $PGDATA location on the slave server as recovery.conf, and then configure the following parameters:
cp /usr/pgsql-9.6/share/recovery.conf.sample /var/lib/pgsql/9.6/data/recovery.conf

standby_mode = on
primary_conninfo = 'host=192.168.0.4 port=5432 user=repuser password=charlie'
trigger_file = '/tmp/trigger.replication'
restore_command = 'cp /var/lib/pgsql/archive/%f "%p"'
- Next, start the slave server:
service postgresql-9.6 start
- Now that replication is set up, we will test it. On the master server, log in and issue the following SQL commands:
psql -h 192.168.0.4 -d postgres -U postgres -W

postgres=# create database test;
postgres=# \c test;
test=# create table testtable ( testint int, testchar varchar(40) );
CREATE TABLE
test=# insert into testtable values ( 1, 'What A Sight.' );
INSERT 0 1
- On the slave server we will now check whether the database and table created in the previous step have been replicated:
psql -h 192.168.0.5 -d test -U postgres -W

test=# select * from testtable;
 testint |    testchar
---------+----------------
       1 | What A Sight.
(1 row)
- The wal_keep_segments parameter determines how many WAL files are retained in the master's pg_xlog directory in case of network delays. However, if you do not want to assume a value for this, you can create a replication slot, which ensures that the master does not remove WAL files from pg_xlog until they have been received by the standbys. For more information refer to: https://www.postgresql.org/docs/9.6/static/warm-standby.html#STREAMING-REPLICATION-SLOTS.
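The archive_command configured in step 4 relies on PostgreSQL substituting %p (path of the completed WAL segment) and %f (its file name only) before running the shell command. Here is a minimal Python sketch of that substitution, just to illustrate the mechanics; the helper name is made up and the %% handling is simplified:

```python
import os

def expand_archive_command(template: str, wal_path: str) -> str:
    """Mimic PostgreSQL's placeholder substitution in archive_command:
    %p -> path of the completed WAL file, %f -> its file name only,
    %% -> a literal % (swapped out first so it is not re-substituted)."""
    fname = os.path.basename(wal_path)
    out = template.replace("%%", "\x00")
    out = out.replace("%p", wal_path).replace("%f", fname)
    return out.replace("\x00", "%")

# The local-copy half of the recipe's command, applied to one segment:
cmd = expand_archive_command("cp %p /var/lib/pgsql/archive/%f",
                             "pg_xlog/000000010000000000000001")
print(cmd)
# cp pg_xlog/000000010000000000000001 /var/lib/pgsql/archive/000000010000000000000001
```

The restore_command on the slave uses the same placeholders in reverse, copying %f out of the archive back into the path %p that the server requests.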
How it works…
The following is an explanation of the steps performed in the preceding section:
- In step 2 of the preceding section we create a user called repuser, which will be used by the slave server to make a connection to the primary server. In step 3 we make the necessary changes in the pg_hba.conf file to allow the master server to be accessed by the slave server using the repuser user ID.
- We then make the necessary parameter changes on the master in step 4 of the preceding section to configure streaming replication. The following is a description of these parameters:
- listen_addresses: This parameter specifies the IP addresses that you want PostgreSQL to listen on. A value of * means all available IP addresses.
- wal_level: This parameter determines the level of WAL logging done. Specify hot_standby for streaming replication.
- wal_keep_segments: This parameter specifies the number of 16 MB WAL files to retain in the pg_xlog directory. The rule of thumb is that more such files may be required when the slave falls behind, for example during a large checkpoint or a network delay.
- archive_mode: Setting this parameter enables completed WAL segments to be sent to archive storage.
- archive_command: This parameter is a shell command that is executed whenever a WAL segment is completed. In our case, we first copy the file to the local archive directory and then use the secure copy command to send it across to the slave.
- max_wal_senders: This parameter specifies the maximum number of concurrent replication connections allowed from the slave servers.
- Once the necessary configuration changes have been made on the master server, we restart the PostgreSQL server on the master for the new configuration to take effect. This is done in step 5 of the preceding section.
- In step 6 of the preceding section, we build the slave by copying the primary's data directory to the slave.
- With the data directory available on the slave, the next step is to configure it. In step 7 we make the necessary replication-related parameter change in the postgresql.conf file on the slave server:
- hot_standby: This parameter determines whether we can connect and run queries while the server is in archive recovery or standby mode.
- In the next step we configure the recovery.conf file. This is required so that the slave can start receiving logs from the master. The following parameters are configured in the recovery.conf file on the slave:
- standby_mode: When enabled, this parameter causes PostgreSQL to work as a standby in a replication configuration.
- primary_conninfo: This parameter specifies the connection information used by the slave to connect to the master. For our scenario, the master server is at 192.168.0.4 on port 5432, and we use the user ID repuser with password charlie to make the connection. Remember that repuser is the user ID that was created in step 2 of the preceding section for exactly this purpose, that is, connecting to the master from the slave.
- trigger_file: When the slave is configured as a standby, it continues to restore the XLOG records coming from the master. The trigger_file parameter specifies the file whose presence triggers the slave to end standby mode and take over as the master, or primary, server.
- At this stage the slave has been fully configured; we then start the slave server and the replication process begins.
- In steps 10 and 11 of the preceding section we test our replication. We first create a database named test, log into it, create a table named testtable, and insert some records into it. Our purpose is to see whether these changes are replicated across to the slave. To test this, we log into the slave on the test database and query the records from the testtable table, as seen in step 11 of the preceding section. The final result is that all the records changed or inserted on the primary are visible on the slave. This completes our streaming replication setup and configuration.
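As a rough sizing check for the wal_keep_segments setting described above: each retained segment is 16 MB in PostgreSQL 9.6, so the minimum disk space this setting pins in pg_xlog is simple arithmetic. The helper below is purely illustrative, not part of PostgreSQL:

```python
WAL_SEGMENT_MB = 16  # default WAL segment size in PostgreSQL 9.6

def pg_xlog_floor_mb(wal_keep_segments: int) -> int:
    """Minimum space (MB) that pg_xlog keeps occupied solely so that
    lagging standbys can catch up."""
    return wal_keep_segments * WAL_SEGMENT_MB

# The recipe sets wal_keep_segments = 8:
print(pg_xlog_floor_mb(8))  # 128
```

So the recipe's configuration reserves at least 128 MB of WAL on the master; a replication slot removes this fixed cap but can retain unbounded WAL if a standby stays disconnected.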
Replication using Slony
In this recipe we are going to set up replication using Slony, which is a widely used replication engine. It replicates a desired set of table data from one database to another. This replication approach is based on event triggers created on the source set of tables, which log the DML statements into Slony catalog tables. Using Slony, we can also set up cascading replication among multiple nodes.
The steps in this recipe are carried out on a CentOS version 6 machine. We first need to install Slony. The following are the steps to install Slony:
- First, download the Slony software from http://slony.info/downloads/2.2/source/.
- Once you have downloaded the software, the next step is to extract the tarball and go to the newly created directory:
tar xvfj slony1-2.2.3.tar.bz2
cd slony1-2.2.3
- In the next step we are going to configure, compile, and build the software:
./configure --with-pgconfigdir=/usr/pgsql-9.6/bin/
make
make install
How to do it…
The following is the sequence of steps required to replicate data between two tables using Slony replication:
- First start the PostgreSQL server if not already started:
pg_ctl -D $PGDATA start
- In the next step we will create two databases, test1 and test2, which will be used as the source and target databases:
createdb test1
createdb test2
- In the next step we will create the table t_test on the source database test1 and insert some records into it:
psql -d test1

test1=# create table t_test (id numeric primary key, name varchar);
test1=# insert into t_test values(1,'A'),(2,'B'),(3,'C');
- We will now set up the target database by copying the table definitions from the source database test1:
pg_dump -s -p 5432 -h localhost test1 | psql -h localhost -p 5432 test2
- We will now connect to the target database test2 and verify that there is no data in the tables of the test2 database:
psql -d test2

test2=# select * from t_test;
- We will now set up a slonik script for the master/slave, that is, source/target, setup:
vi init_master.slonik

#! /bin/slonik
cluster name = mycluster;
node 1 admin conninfo = 'dbname=test1 host=localhost port=5432 user=postgres password=postgres';
node 2 admin conninfo = 'dbname=test2 host=localhost port=5432 user=postgres password=postgres';
init cluster (id=1);
create set (id=1, origin=1);
set add table (set id=1, origin=1, id=1, fully qualified name = 'public.t_test');
store node (id=2, event node = 1);
store path (server=1, client=2, conninfo='dbname=test1 host=localhost port=5432 user=postgres password=postgres');
store path (server=2, client=1, conninfo='dbname=test2 host=localhost port=5432 user=postgres password=postgres');
store listen (origin=1, provider = 1, receiver = 2);
store listen (origin=2, provider = 2, receiver = 1);
- We will now create a slonik script for subscription on the slave, that is, the target:
vi init_slave.slonik

#! /bin/slonik
cluster name = mycluster;
node 1 admin conninfo = 'dbname=test1 host=localhost port=5432 user=postgres password=postgres';
node 2 admin conninfo = 'dbname=test2 host=localhost port=5432 user=postgres password=postgres';
subscribe set (id = 1, provider = 1, receiver = 2, forward = no);
- We will now run the init_master.slonik script created in step 6 on the master:
cd /usr/pgsql-9.6/bin
slonik init_master.slonik
- We will now run the init_slave.slonik script created in step 7 on the slave, that is, the target:
cd /usr/pgsql-9.6/bin
slonik init_slave.slonik
- In the next step we will start the master slon daemon:
nohup slon mycluster "dbname=test1 host=localhost port=5432 user=postgres password=postgres" &
- In the next step we will start the slave slon daemon:
nohup slon mycluster "dbname=test2 host=localhost port=5432 user=postgres password=postgres" &
- In the next step we will connect to the master, that is, the source database test1, and insert some records into the t_test table:
psql -d test1

test1=# insert into t_test values (5,'E');
- We will now test for replication by logging into the slave, that is, the target database test2, and checking whether the records inserted into the t_test table in the previous step are visible:
psql -d test2

test2=# select * from t_test;
 id | name
----+------
  1 | A
  2 | B
  3 | C
  5 | E
(4 rows)
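The same space-separated key=value conninfo format appears in both slonik scripts and in the slon daemon invocations above. A small Python sketch of how such a string breaks down into its parts; it assumes simple unquoted values like the ones in this recipe (real libpq conninfo strings may also contain quoted values, which this ignores):

```python
def parse_conninfo(conninfo: str) -> dict:
    """Split a simple libpq-style conninfo string into a dict.
    Handles only unquoted key=value pairs, as used in this recipe."""
    return dict(pair.split("=", 1) for pair in conninfo.split())

info = parse_conninfo("dbname=test1 host=localhost port=5432 "
                      "user=postgres password=postgres")
print(info["dbname"], info["port"])  # test1 5432
```

Every node entry, store path, and slon daemon in the cluster uses a string of this shape, differing only in the dbname (and, in a multi-machine setup, the host).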
How it works…
We will now discuss the steps followed in the preceding section:
- In step 1, we first start the PostgreSQL server if it is not already running. In step 2 we create two databases, namely test1 and test2, that will serve as our source (master) and target (slave) databases.
- In step 3 of the preceding section we log into the source database test1, create a table t_test, and insert some records into it.
- In step 4 of the preceding section we set up the target database test2 by copying the table definitions present in the source database and loading them into the target database test2 using the pg_dump utility.
- In step 5 of the preceding section we log into the target database test2 and verify that there are no records present in the t_test table, because in step 4 we only copied the table definitions from the test1 database into the test2 database.
- In step 6 we set up a slonik script for the master/slave replication setup. In the init_master.slonik file we first define the cluster name as mycluster. We then define the nodes in the cluster. Each node is given a number and an associated connection string that contains the database connection information. A node entry is defined for both the source and the target databases. The store path commands are necessary so that each node knows how to communicate with the other.
- In step 7 we set up a slonik script for the subscription of the slave, that is, the target database test2. Once again the script contains information such as the cluster name and node entries, each assigned a unique number tied to its connection string information. It also contains a subscribe set command.
- In step 8 of the preceding section we run the init_master.slonik on the master. Similarly in step 9 we run the init_slave.slonik on the slave.
- In step 10 of the preceding section we start the master slon daemon. In step 11 of the preceding section we start the slave slon daemon.
- Steps 12 and 13 of the preceding section test the replication. In step 12 we log into the source database test1 and insert some records into the t_test table. To check whether the newly inserted records have been replicated to the target database test2, we log into test2 in step 13; the result set returned by the query confirms that the records changed/inserted in the t_test table in test1 have been successfully replicated to the target database test2. For more information regarding Slony replication, refer to http://slony.info/documentation/tutorial.html.
We have seen how to set up streaming replication, and then we looked at how to install and replicate using one popular third-party replication tool, Slony.