PostgreSQL Cookbook - High Availability and Replication

Chitij Chauhan

February 2015

In this article by Chitij Chauhan, author of the book PostgreSQL Cookbook, we will talk about various high availability and replication solutions, including some popular third-party replication tools such as Slony-I and Londiste.

In this article, we will cover the following recipes:

  • Setting up hot streaming replication
  • Replication using Slony-I
  • Replication using Londiste

The important requirements for any production database are fault tolerance, 24/7 availability, and redundancy. It is for this purpose that different high availability and replication solutions are available for PostgreSQL.

From a business perspective, it is important to ensure 24/7 data availability in the event of a disaster situation or a database crash due to disk or hardware failure. In such situations, it becomes critical to ensure that a duplicate copy of the data is available on a different server or a different database, so that seamless failover can be achieved even when the primary server/database is unavailable.

Setting up hot streaming replication

In this recipe, we are going to set up a master-slave streaming replication.

Getting ready

For this exercise, you will need two Linux machines, each with the latest version of PostgreSQL installed. We will be using the following IP addresses for the master and slave servers:

  • Master IP address: 192.168.0.4
  • Slave IP address: 192.168.0.5

Before you start with the master-slave streaming setup, it is important that the SSH connectivity between the master and slave is set up.
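One common way to set this up is key-based SSH authentication for the postgres user. The following is a minimal sketch: the key directory is generated here purely for illustration, and the ssh-copy-id step is shown as a comment because it requires the slave to be reachable.

```shell
# Sketch: generate a passwordless key pair for the postgres user.
# The key location is an example; adjust to your environment.
KEYDIR="$(mktemp -d)"
ssh-keygen -q -t rsa -N '' -f "$KEYDIR/id_rsa"

# Push the public key to the slave so that the scp/rsync commands used
# later in this recipe work without a password prompt:
#   ssh-copy-id -i "$KEYDIR/id_rsa.pub" postgres@192.168.0.5
echo "public key: $KEYDIR/id_rsa.pub"
```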

How to do it...

Perform the following sequence of steps to set up a master-slave streaming replication:

  1. First, we are going to create a user on the master, which will be used by the slave server to connect to the PostgreSQL database on the master server:
    psql -c "CREATE USER repuser REPLICATION LOGIN ENCRYPTED PASSWORD 'charlie';"
  2. Next, we will grant the replication user created in the previous step access to the master PostgreSQL server.

    This is done by making the necessary changes as mentioned in the pg_hba.conf file:

    vi pg_hba.conf
    host   replication   repuser   192.168.0.5/32   md5
  3. In the next step, we are going to configure parameters in the postgresql.conf file. These parameters need to be set in order to get the streaming replication working:
    vi /var/lib/pgsql/9.3/data/postgresql.conf
    listen_addresses = '*'
    wal_level = hot_standby
    max_wal_senders = 3
    wal_keep_segments = 8
    archive_mode = on      
    archive_command = 'cp %p /var/lib/pgsql/archive/%f && scp %p postgres@192.168.0.5:/var/lib/pgsql/archive/%f'
    checkpoint_segments = 8
  4. Once the parameter changes have been made in the postgresql.conf file in the previous step, the next step will be to restart the PostgreSQL server on the master server, in order to let the changes take effect:
    pg_ctl -D /var/lib/pgsql/9.3/data restart
  5. Before the slave can replicate from the master, we need to give it an initial copy of the database to build on. For this purpose, we make a base backup by copying the primary server's data directory to the standby. The rsync command needs to be run as the root user:
    psql -U postgres -h 192.168.0.4 -c "SELECT pg_start_backup('label', true)"
    rsync -a /var/lib/pgsql/9.3/data/ 192.168.0.5:/var/lib/pgsql/9.3/data/ --exclude postmaster.pid
    psql -U postgres -h 192.168.0.4 -c "SELECT pg_stop_backup()"
  6. Once the data directory, mentioned in the previous step, is populated, the next step is to enable the following parameter in the postgresql.conf file on the slave server:
    hot_standby = on
  7. The next step will be to copy the recovery.conf.sample file in the $PGDATA location on the slave server and then configure the following parameters:
    cp /usr/pgsql-9.3/share/recovery.conf.sample /var/lib/pgsql/9.3/data/recovery.conf
    standby_mode = on
    primary_conninfo = 'host=192.168.0.4 port=5432 user=repuser password=charlie'
    trigger_file = '/tmp/trigger.replication'
    restore_command = 'cp /var/lib/pgsql/archive/%f "%p"'
  8. The next step will be to start the slave server:
    service postgresql-9.3 start
  9. Now that the replication setup steps are complete, we will test the replication. On the master server, log in and issue the following SQL commands:
    psql -h 192.168.0.4 -d postgres -U postgres -W 
    postgres=# create database test;
     
    postgres=# \c test;
     
    test=# create table testtable ( testint int, testchar varchar(40) );
     
    CREATE TABLE
    test=# insert into testtable values ( 1, 'What A Sight.' );
    INSERT 0 1
  10. On the slave server, we will now check whether the newly created database and the corresponding table, created in the previous step, are replicated:
    psql -h 192.168.0.5 -d test -U postgres -W 
    test=# select * from testtable;
    testint | testchar
    ---------+---------------------------
    1 | What A Sight.
    (1 row)
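Rather than editing recovery.conf by hand, the values used in step 7 can be generated from shell variables, which keeps the host, user, and password in one place. A minimal sketch (the variable names and the preview path are ours):

```shell
# Sketch: emit the recovery.conf contents from step 7 out of variables.
MASTER_IP="192.168.0.4"
REPL_USER="repuser"
REPL_PASSWORD="charlie"

make_recovery_conf() {
  cat <<EOF
standby_mode = on
primary_conninfo = 'host=$MASTER_IP port=5432 user=$REPL_USER password=$REPL_PASSWORD'
trigger_file = '/tmp/trigger.replication'
restore_command = 'cp /var/lib/pgsql/archive/%f "%p"'
EOF
}

# Preview the file; on the slave you would write it into $PGDATA instead.
make_recovery_conf > /tmp/recovery.conf.preview
```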

How it works...

The following is the explanation for the steps performed in the preceding section.

In the initial step of the preceding section, we create a user called repuser, which the slave server will use to make a connection to the primary server. In the second step, we make the necessary changes in the pg_hba.conf file to allow the master server to be accessed by the slave server using the repuser user ID that was created in step 1. We then make the necessary parameter changes on the master in step 3 to configure streaming replication. The following is a description of these parameters:

  • listen_addresses: This parameter specifies the IP addresses of the interfaces on which you want PostgreSQL to listen. A value of * indicates all available IP addresses.
  • wal_level: This parameter determines the level of WAL logging done. Specify hot_standby for streaming replication.
  • wal_keep_segments: This parameter specifies the minimum number of 16 MB WAL files to be retained in the pg_xlog directory. The larger the value, the further behind a standby can fall and still catch up without needing archived WAL.
  • archive_mode: Setting this parameter enables completed WAL segments to be sent to the archive storage.
  • archive_command: This parameter is a shell command that is executed whenever a WAL segment is completed. In our case, we copy the file to the local archive directory and then use the secure copy command to send it across to the slave.
  • max_wal_senders: This parameter specifies the total number of concurrent connections allowed from the slave servers.
  • checkpoint_segments: This parameter specifies the maximum number of logfile segments between automatic WAL checkpoints.

Once the necessary configuration changes have been made on the master server, we restart the PostgreSQL server on the master in order to let the new configuration changes take effect. This is done in step 4 of the preceding section. In step 5 of the preceding section, we build the slave by copying the primary server's data directory to the slave.
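As a quick sanity check on the wal_keep_segments setting from step 3, the disk space pinned in pg_xlog can be estimated as the segment count times the 16 MB default segment size. A sketch:

```shell
# Rough estimate of the pg_xlog space retained by wal_keep_segments
# (the default WAL segment size in PostgreSQL 9.3 is 16 MB).
WAL_KEEP_SEGMENTS=8
SEGMENT_MB=16
RETAINED_MB=$((WAL_KEEP_SEGMENTS * SEGMENT_MB))
echo "wal_keep_segments=$WAL_KEEP_SEGMENTS retains about ${RETAINED_MB} MB of WAL"
```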

Now, with the data directory available on the slave, the next step is to configure it. We make the necessary replication-related parameter changes in the postgresql.conf file on the slave server. We set the following parameters on the slave:

  • hot_standby: This parameter determines whether you can connect and run queries while the server is in archive recovery or standby mode.

In the next step, we configure the recovery.conf file. This is required so that the slave can start receiving logs from the master. The parameters explained next are configured in the recovery.conf file on the slave:
  • standby_mode: This parameter, when enabled, causes PostgreSQL to work as a standby in a replication configuration.
  • primary_conninfo: This parameter specifies the connection information used by the slave to connect to the master. For our scenario, our master server is set as 192.168.0.4 on port 5432 and we are using the repuser user ID with the password charlie to make a connection to the master. Remember that repuser is the user ID that was created in the initial step of the preceding section for this purpose, that is, connecting to the master from the slave.
  • trigger_file: When a slave is configured as a standby, it continues to restore the XLOG records from the master. The trigger_file parameter specifies the file that is used to trigger a slave to switch over from its standby duties and take over as the master or primary server.

At this stage, the slave is fully configured and we can start the slave server; the replication process then begins. This is shown in step 8 of the preceding section. In steps 9 and 10 of the preceding section, we test our replication. We first create a test database, log in to it, create a table by the name testtable, and insert some records into the testtable table. Our purpose is to see whether these changes are replicated on the slave. To test this, we log in to the slave on the test database and query the records from the testtable table, as seen in step 10 of the preceding section. The final result is that all the records changed/inserted on the primary server are visible on the slave. This completes our streaming replication setup and configuration.
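The promotion mechanism behind trigger_file amounts to creating a single file; a live standby watching this path would then exit recovery and start accepting writes. A failover sketch (here we only create the file, which is harmless on a machine that is not running a standby):

```shell
# Promote the standby by creating the trigger file named in recovery.conf.
# The path matches the trigger_file value configured in step 7 of this recipe.
TRIGGER_FILE="/tmp/trigger.replication"
touch "$TRIGGER_FILE"
echo "the standby will exit recovery once it sees $TRIGGER_FILE"
```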

Replication using Slony-I

Here, we are going to set up replication using Slony-I. We will be setting up the replication of table data between two databases on the same server.

Getting ready

The steps performed in this recipe are carried out on a CentOS Version 6 machine. It is also important to remove the directives related to hot streaming replication prior to setting up replication using Slony-I.

We will first need to install Slony-I. The following steps need to be performed in order to install Slony-I:

  1. First, go to http://slony.info/downloads/2.2/source/ and download the Slony-I source tarball.
  2. Once you have downloaded the Slony-I software, the next step is to unzip the .tar file and then go to the newly created directory. Before doing this, please ensure that you have the postgresql-devel package for the corresponding PostgreSQL version installed:
    tar xvfj slony1-2.2.3.tar.bz2
    cd slony1-2.2.3
  3. In the next step, we are going to configure, compile, and build the software:
    ./configure --with-pgconfigdir=/usr/pgsql-9.3/bin/
    make
    make install

How to do it...

You need to perform the following sequence of steps, in order to replicate data between two tables using Slony-I replication:

  1. First, start the PostgreSQL server if you have not already started it:
    pg_ctl -D $PGDATA start
  2. In the next step, we will be creating two databases, test1 and test2, which will be used as the source and target databases respectively:
    createdb test1
    createdb test2
  3. In the next step, we will create the t_test table on the source database, test1, and insert some records into it:

    psql -d test1
    test1=# create table t_test (id numeric primary key, name varchar);
     
    test1=# insert into t_test values(1,'A'),(2,'B'), (3,'C');
  4. We will now set up the target database by copying the table definitions from the test1 source database:
    pg_dump -s -p 5432 -h localhost test1 | psql -h localhost -p 5432 test2
  5. We will now connect to the target database, test2, and verify that there is no data in the tables of the test2 database:
    psql -d test2
    test2=# select * from t_test;
  6. We will now set up a slonik script for the master-slave (that is, source/target) setup. In this scenario, since we are replicating between two different databases on the same server, the only connection string option that differs is the database name:
    cd /usr/pgsql-9.3/bin
    vi init_master.slonik
     
    #!/bin/sh
    cluster name = mycluster;
    node 1 admin conninfo = 'dbname=test1 host=localhost
    port=5432 user=postgres password=postgres';
    node 2 admin conninfo = 'dbname=test2 host=localhost
    port=5432 user=postgres password=postgres';
    init cluster ( id=1);
    create set (id=1, origin=1);
    set add table(set id=1, origin=1, id=1, fully qualified
    name = 'public.t_test');
    store node (id=2, event node = 1);
    store path (server=1, client=2, conninfo='dbname=test1
    host=localhost port=5432 user=postgres password=postgres');
    store path (server=2, client=1, conninfo='dbname=test2
    host=localhost port=5432 user=postgres password=postgres');
    store listen (origin=1, provider = 1, receiver = 2);
    store listen (origin=2, provider = 2, receiver = 1);
  7. We will now create a slonik script for subscription to the slave, that is, target:
    cd /usr/pgsql-9.3/bin
    vi init_slave.slonik
    #!/bin/sh
    cluster name = mycluster;
    node 1 admin conninfo = 'dbname=test1 host=localhost
    port=5432 user=postgres password=postgres';
    node 2 admin conninfo = 'dbname=test2 host=localhost
    port=5432 user=postgres password=postgres';
    subscribe set ( id = 1, provider = 1, receiver = 2, forward
    = no);
  8. We will now run the init_master.slonik script created in step 6 and run this on the master, as follows:
    cd /usr/pgsql-9.3/bin
     
    slonik init_master.slonik
  9. We will now run the init_slave.slonik script created in step 7 and run this on the slave, that is, target:
    cd /usr/pgsql-9.3/bin
     
    slonik init_slave.slonik
  10. In the next step, we will start the master slon daemon:
    nohup slon mycluster "dbname=test1 host=localhost port=5432 
    user=postgres password=postgres" &
  11. In the next step, we will start the slave slon daemon:
    nohup slon mycluster "dbname=test2 host=localhost port=5432 
    user=postgres password=postgres" &
  12. Next, we will connect to the master, that is, the test1 source database, and insert some records in the t_test table:
    psql -d test1
    test1=# insert into t_test values (5,'E');
  13. We will now test for the replication by logging on to the slave, that is, the test2 target database, and see whether the inserted records in the t_test table are visible:
    psql -d test2
     
    test2=# select * from t_test;
    id | name
    ----+------
    1 | A
    2 | B
    3 | C
    5 | E
    (4 rows)
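Since the conninfo strings appear several times in the slonik scripts from steps 6 and 7, it can help to generate them from shell variables so that each connection string is defined once. A sketch that emits an equivalent of init_master.slonik (the function name and output path are ours):

```shell
# Sketch: build the init_master.slonik script from step 6 out of variables.
CLUSTER=mycluster
CONN1="dbname=test1 host=localhost port=5432 user=postgres password=postgres"
CONN2="dbname=test2 host=localhost port=5432 user=postgres password=postgres"

make_init_master() {
  cat <<EOF
cluster name = $CLUSTER;
node 1 admin conninfo = '$CONN1';
node 2 admin conninfo = '$CONN2';
init cluster ( id=1 );
create set ( id=1, origin=1 );
set add table ( set id=1, origin=1, id=1, fully qualified name = 'public.t_test' );
store node ( id=2, event node=1 );
store path ( server=1, client=2, conninfo='$CONN1' );
store path ( server=2, client=1, conninfo='$CONN2' );
store listen ( origin=1, provider=1, receiver=2 );
store listen ( origin=2, provider=2, receiver=1 );
EOF
}

make_init_master > /tmp/init_master.slonik   # then: slonik /tmp/init_master.slonik
```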

How it works...

We will now discuss the steps performed in the preceding section:

  • In step 1, we first start the PostgreSQL server if it is not already started. In step 2, we create two databases, namely test1 and test2, that will serve as our source (master) and target (slave) databases.
  • In step 3, we log in to the test1 source database, create a t_test table, and insert some records into the table.
  • In step 4, we set up the target database, test2, by copying the table definitions present in the source database and loading them into test2 using the pg_dump utility.
  • In step 5, we log in to the target database, test2, and verify that there are no records present in the t_test table because in step 4, we only extracted the table definitions into the test2 database from the test1 database.
  • In step 6, we set up a slonik script for the master-slave replication setup. In the init_master.slonik file, we first define the cluster name as mycluster. We then define the nodes in the cluster. Each node will have a number associated with a connection string, which contains database connection information. The node entry is defined both for the source and target databases. The store path commands are necessary, so that each node knows how to communicate with the other.
  • In step 7, we set up a slonik script for the subscription of the slave, that is, the test2 target database. Once again, the script contains information such as the cluster name and the node entries that are designated a unique number related to connection string information. It also contains a subscriber set.
  • In step 8, we run the init_master.slonik file on the master. Similarly, in step 9, we run the init_slave.slonik file on the slave.
  • In step 10, we start the master slon daemon. In step 11, we start the slave slon daemon.
  • The subsequent steps, 12 and 13, are used to test for replication. For this purpose, in step 12 of the preceding section, we first log in to the test1 source database and insert some records into the t_test table. To check whether the newly inserted records have been replicated in the target database, test2, we log in to the test2 database in step 13. The result set obtained from the output of the query confirms that the changed/inserted records on the t_test table in the test1 database are successfully replicated across the target database, test2.

For more information on Slony-I replication, go to http://slony.info/documentation/tutorial.html.

There's more...

If you are using Slony-I for replication between two different servers, in addition to the steps mentioned in the How to do it… section, you will also have to enable authentication information in the pg_hba.conf file existing on both the source and target servers. For example, let's assume that the source server's IP is 192.168.16.44 and the target server's IP is 192.168.16.56 and we are using a user named super to replicate the data.

If this is the situation, then in the source server's pg_hba.conf file, we will have to enter an entry allowing the target server to connect, as follows:

host         postgres         super     192.168.16.56/32           md5

Similarly, in the target server's pg_hba.conf file, we will have to enter an entry allowing the source server to connect, as follows:

host         postgres         super     192.168.16.44/32           md5

Also, in the scripts that were used for Slony-I, wherever the host connection information is localhost, that entry will need to be replaced by the source and target servers' IP addresses.
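That substitution can be scripted with sed rather than edited by hand. A sketch, demonstrated on a throwaway file (the real targets would be the init_master.slonik and init_slave.slonik files from this recipe):

```shell
# Sketch: replace host=localhost with a real IP address in a slonik script.
replace_host() {  # usage: replace_host <file> <ip>
  sed -i "s/host=localhost/host=$2/g" "$1"
}

# Demonstrate on a throwaway copy with one example conninfo line:
TMP=$(mktemp)
echo "node 1 admin conninfo = 'dbname=test1 host=localhost port=5432';" > "$TMP"
replace_host "$TMP" 192.168.16.44
cat "$TMP"
```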

Replication using Londiste

In this recipe, we are going to show you how to replicate data using Londiste.

Getting ready

For this setup, we are using the same host CentOS Linux machine to replicate data between two databases. This can also be set up using two separate Linux machines running on VMware, VirtualBox, or any other virtualization software. It is assumed that the latest version of PostgreSQL, version 9.3, is installed. We used CentOS Version 6 as the Linux operating system for this exercise.

To set up Londiste replication on the Linux machine, perform the following steps:

  1. Go to http://pgfoundry.org/projects/skytools/ and download the latest version of Skytools 3.2, that is, tarball skytools-3.2.tar.gz.
  2. Extract the tarball file, as follows:
    tar -xvzf skytools-3.2.tar.gz
  3. Go to the new location and build and compile the software:
    cd skytools-3.2
    ./configure --prefix=/var/lib/pgsql/9.3/Sky --with-pgconfig=/usr/pgsql-9.3/bin/pg_config
     
    make
     
    make install
    
  4. Also, set the PYTHONPATH environment variable, as shown here. Alternatively, you can also set it in the .bash_profile script:
    export PYTHONPATH=/var/lib/pgsql/9.3/Sky/lib64/python2.6/site-packages/

How to do it...

Perform the following sequence of steps to set up replication between two different databases using Londiste:

  1. First, create the two databases between which replication has to occur:
    createdb node1
    createdb node2
  2. Populate the node1 database with data using the pgbench utility:
    pgbench -i -s 2 -F 80 node1
  3. Add any primary key and foreign keys to the tables in the node1 database that are needed for replication. Create the following .sql file and add the following lines to it:
    vi /tmp/prepare_pgbenchdb_for_londiste.sql
    -- add primary key to history table
    ALTER TABLE pgbench_history ADD COLUMN hid SERIAL PRIMARY KEY;
     
    -- add foreign keys
    ALTER TABLE pgbench_tellers ADD CONSTRAINT pgbench_tellers_branches_fk FOREIGN KEY(bid) REFERENCES pgbench_branches;
    ALTER TABLE pgbench_accounts ADD CONSTRAINT pgbench_accounts_branches_fk FOREIGN KEY(bid) REFERENCES pgbench_branches;
    ALTER TABLE pgbench_history ADD CONSTRAINT pgbench_history_branches_fk FOREIGN KEY(bid) REFERENCES pgbench_branches;
    ALTER TABLE pgbench_history ADD CONSTRAINT pgbench_history_tellers_fk FOREIGN KEY(tid) REFERENCES pgbench_tellers;
    ALTER TABLE pgbench_history ADD CONSTRAINT pgbench_history_accounts_fk FOREIGN KEY(aid) REFERENCES pgbench_accounts;
  4. We will now run the .sql file created in the previous step against the node1 database:
    psql node1 -f /tmp/prepare_pgbenchdb_for_londiste.sql
  5. We will now populate the node2 database with table definitions from the tables in the node1 database:
    pg_dump -s -t 'pgbench*' node1 > /tmp/tables.sql
    psql -f /tmp/tables.sql node2
  6. Now the replication process starts. We will first create the londiste3.ini configuration file with the following parameters in order to set up the root node for the source database, node1:
    vi londiste3.ini
     
    [londiste3]
    job_name = first_table
    db = dbname=node1
    queue_name = replication_queue
    logfile = /home/postgres/log/londiste.log
    pidfile = /home/postgres/pid/londiste.pid
  7. In the next step, we are going to use the londiste3.ini configuration file created in the previous step to set up the root node for the node1 database, as shown here:
    [postgres@localhost bin]$ ./londiste3 londiste3.ini create-root node1 dbname=node1
     
    2014-12-09 18:54:34,723 2335 WARNING No host= in public connect string, bad idea
    2014-12-09 18:54:35,210 2335 INFO plpgsql is installed
    2014-12-09 18:54:35,217 2335 INFO pgq is installed
    2014-12-09 18:54:35,225 2335 INFO pgq.get_batch_cursor is installed
    2014-12-09 18:54:35,227 2335 INFO pgq_ext is installed
    2014-12-09 18:54:35,228 2335 INFO pgq_node is installed
    2014-12-09 18:54:35,230 2335 INFO londiste is installed
    2014-12-09 18:54:35,232 2335 INFO londiste.global_add_table is installed
    2014-12-09 18:54:35,281 2335 INFO Initializing node
    2014-12-09 18:54:35,285 2335 INFO Location registered
    2014-12-09 18:54:35,447 2335 INFO Node "node1" initialized for queue "replication_queue" with type "root"
    2014-12-09 18:54:35,465 2335 INFO Done
  8. We will now run the worker daemon for the root node:
    [postgres@localhost bin]$ ./londiste3 londiste3.ini worker
    2014-12-09 18:55:17,008 2342 INFO Consumer uptodate = 1
  9. In the next step, we will create a slave.ini configuration file in order to make a leaf node for the node2 target database:
    vi slave.ini
    [londiste3]
    job_name = first_table_slave
    db = dbname=node2
    queue_name = replication_queue
    logfile = /home/postgres/log/londiste_slave.log
    pidfile = /home/postgres/pid/londiste_slave.pid
  10. We will now initialize the node in the target database:
    ./londiste3 slave.ini create-leaf node2 dbname=node2 --provider=dbname=node1
    2014-12-09 18:57:22,769 2408 WARNING No host= in public connect string, bad idea
    2014-12-09 18:57:22,778 2408 INFO plpgsql is installed
    2014-12-09 18:57:22,778 2408 INFO Installing pgq
    2014-12-09 18:57:22,778 2408 INFO   Reading from /var/lib/pgsql/9.3/Sky/share/skytools3/pgq.sql
    2014-12-09 18:57:23,211 2408 INFO pgq.get_batch_cursor is installed
    2014-12-09 18:57:23,212 2408 INFO Installing pgq_ext
    2014-12-09 18:57:23,213 2408 INFO   Reading from /var/lib/pgsql/9.3/Sky/share/skytools3/pgq_ext.sql
    2014-12-09 18:57:23,454 2408 INFO Installing pgq_node
    2014-12-09 18:57:23,455 2408 INFO   Reading from /var/lib/pgsql/9.3/Sky/share/skytools3/pgq_node.sql
    2014-12-09 18:57:23,729 2408 INFO Installing londiste
    2014-12-09 18:57:23,730 2408 INFO   Reading from /var/lib/pgsql/9.3/Sky/share/skytools3/londiste.sql
    2014-12-09 18:57:24,391 2408 INFO londiste.global_add_table is installed
    2014-12-09 18:57:24,575 2408 INFO Initializing node
    2014-12-09 18:57:24,705 2408 INFO Location registered
    2014-12-09 18:57:24,715 2408 INFO Location registered
    2014-12-09 18:57:24,744 2408 INFO Subscriber registered: node2
    2014-12-09 18:57:24,748 2408 INFO Location registered
    2014-12-09 18:57:24,750 2408 INFO Location registered
    2014-12-09 18:57:24,757 2408 INFO Node "node2" initialized for queue "replication_queue" with type "leaf"
    2014-12-09 18:57:24,761 2408 INFO Done
  11. We will now launch the worker daemon for the target database, that is, node2:
    [postgres@localhost bin]$ ./londiste3 slave.ini worker
    2014-12-09 18:58:53,411 2423 INFO Consumer uptodate = 1
  12. We will now create the configuration file, that is pgqd.ini, for the ticker daemon:
    vi pgqd.ini
     
    [pgqd]
    logfile = /home/postgres/log/pgqd.log
    pidfile = /home/postgres/pid/pgqd.pid
  13. Using the configuration file created in the previous step, we will launch the ticker daemon:
    [postgres@localhost bin]$ ./pgqd pgqd.ini
    2014-12-09 19:05:56.843 2542 LOG Starting pgqd 3.2
    2014-12-09 19:05:56.844 2542 LOG auto-detecting dbs ...
    2014-12-09 19:05:57.257 2542 LOG node1: pgq version ok: 3.2
    2014-12-09 19:05:58.130 2542 LOG node2: pgq version ok: 3.2
  14. We will now add all the tables to the replication on the root node:
    [postgres@localhost bin]$ ./londiste3 londiste3.ini add-table --all
    2014-12-09 19:07:26,064 2614 INFO Table added: public.pgbench_accounts
    2014-12-09 19:07:26,161 2614 INFO Table added: public.pgbench_branches
    2014-12-09 19:07:26,238 2614 INFO Table added: public.pgbench_history
    2014-12-09 19:07:26,287 2614 INFO Table added: public.pgbench_tellers
  15. Similarly, add all the tables to the replication on the leaf node:
    [postgres@localhost bin]$ ./londiste3 slave.ini add-table --all
  16. We will now generate some traffic on the node1 source database:
    pgbench -T 10 -c 5 node1
  17. We will now use the compare utility available with the londiste3 command to check that the tables in both nodes, that is, the source database (node1) and the destination database (node2), contain the same data:
    [postgres@localhost bin]$ ./londiste3 slave.ini compare
     
    2014-12-09 19:26:16,421 2982 INFO Checking if node1 can be used for copy
    2014-12-09 19:26:16,424 2982 INFO Node node1 seems good source, using it
    2014-12-09 19:26:16,425 2982 INFO public.pgbench_accounts: Using node node1 as provider
    2014-12-09 19:26:16,441 2982 INFO Provider: node1 (root)
    2014-12-09 19:26:16,446 2982 INFO Locking public.pgbench_accounts
    2014-12-09 19:26:16,447 2982 INFO Syncing public.pgbench_accounts
    2014-12-09 19:26:18,975 2982 INFO Counting public.pgbench_accounts
    2014-12-09 19:26:19,401 2982 INFO srcdb: 200000 rows, checksum=167607238449
    2014-12-09 19:26:19,706 2982 INFO dstdb: 200000 rows, checksum=167607238449
    2014-12-09 19:26:19,715 2982 INFO Checking if node1 can be used for copy
    2014-12-09 19:26:19,716 2982 INFO Node node1 seems good source, using it
    2014-12-09 19:26:19,716 2982 INFO public.pgbench_branches: Using node node1 as provider
    2014-12-09 19:26:19,730 2982 INFO Provider: node1 (root)
    2014-12-09 19:26:19,734 2982 INFO Locking public.pgbench_branches
    2014-12-09 19:26:19,734 2982 INFO Syncing public.pgbench_branches
    2014-12-09 19:26:22,772 2982 INFO Counting public.pgbench_branches
    2014-12-09 19:26:22,804 2982 INFO srcdb: 2 rows, checksum=-3078609798
    2014-12-09 19:26:22,812 2982 INFO dstdb: 2 rows, checksum=-3078609798
    2014-12-09 19:26:22,866 2982 INFO Checking if node1 can be used for copy
    2014-12-09 19:26:22,877 2982 INFO Node node1 seems good source, using it
    2014-12-09 19:26:22,878 2982 INFO public.pgbench_history: Using node node1 as provider
    2014-12-09 19:26:22,919 2982 INFO Provider: node1 (root)
    2014-12-09 19:26:22,931 2982 INFO Locking public.pgbench_history
    2014-12-09 19:26:22,932 2982 INFO Syncing public.pgbench_history
    2014-12-09 19:26:25,963 2982 INFO Counting public.pgbench_history
    2014-12-09 19:26:26,008 2982 INFO srcdb: 715 rows, checksum=9467587272
    2014-12-09 19:26:26,020 2982 INFO dstdb: 715 rows, checksum=9467587272
    2014-12-09 19:26:26,056 2982 INFO Checking if node1 can be used for copy
    2014-12-09 19:26:26,063 2982 INFO Node node1 seems good source, using it
    2014-12-09 19:26:26,064 2982 INFO public.pgbench_tellers: Using node node1 as provider
    2014-12-09 19:26:26,100 2982 INFO Provider: node1 (root)
    2014-12-09 19:26:26,108 2982 INFO Locking public.pgbench_tellers
    2014-12-09 19:26:26,109 2982 INFO Syncing public.pgbench_tellers
    2014-12-09 19:26:29,144 2982 INFO Counting public.pgbench_tellers
    2014-12-09 19:26:29,176 2982 INFO srcdb: 20 rows, checksum=4814381032
    2014-12-09 19:26:29,182 2982 INFO dstdb: 20 rows, checksum=4814381032
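The two job files from steps 6 and 9 differ only in a few fields, so they can be emitted from a single helper to keep root and leaf consistent. A sketch (the function name is ours; paths follow this recipe):

```shell
# Sketch: generate both Londiste job files from one template.
make_londiste_ini() {  # usage: make_londiste_ini <job_name> <dbname> <logtag>
  cat <<EOF
[londiste3]
job_name = $1
db = dbname=$2
queue_name = replication_queue
logfile = /home/postgres/log/$3.log
pidfile = /home/postgres/pid/$3.pid
EOF
}

make_londiste_ini first_table       node1 londiste       > /tmp/londiste3.ini
make_londiste_ini first_table_slave node2 londiste_slave > /tmp/slave.ini
```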
    

How it works...

The following is an explanation of the steps performed in the preceding section:

  • Initially, in step 1, we create two databases, that is node1 and node2, that are used as the source and target databases, respectively, from a replication perspective.
  • In step 2, we populate the node1 database using the pgbench utility.
  • In step 3 of the preceding section, we add and define the respective primary key and foreign key relationships on different tables and put these DDL commands in a .sql file.
  • In step 4, we execute the DDL commands stated in step 3 on the node1 database, thus enforcing the primary key and foreign key definitions on the tables in the pgbench schema in the node1 database.
  • In step 5, we extract the table definitions from the tables in the pgbench schema in the node1 database and load these definitions into the node2 database.
  • In step 6, we create the configuration file, which is then used in step 7 to create the root node for the node1 source database.
  • In step 8, we launch the worker daemon for the root node. Regarding the entries in the configuration file in step 6: we first define a job name so that the process can be easily identified; then a connect string with the information needed to connect to the source database, that is node1; then the name of the replication queue involved; and finally the location of the log and pid files.
  • We will now discuss steps 9 to 11 of the preceding section. In step 9, we define the configuration file, which is then used in step 10 to create the leaf node for the target database, that is node2.
  • In step 11, we launch the worker daemon for the leaf node. The entries in the configuration file in step 9 contain the job name, the connect string used to connect to the target database, that is node2, the name of the replication queue involved, and the location of the log and pid files. The key part of step 11 is that the slave, that is the target database, finds the master or provider, that is, the node1 source database.
  • We will now talk about steps 12 and 13 of the preceding section. In step 12, we define the ticker configuration file, with the help of which we launch the ticker daemon in step 13. Once the ticker daemon has started successfully, all the components and processes needed for replication are set up; however, we have not yet defined what the system needs to replicate.
  • In steps 14 and 15, we add the tables to the replication set on both the source and target databases, that is node1 and node2, respectively.
  • Finally, we will talk about steps 16 and 17 of the preceding section. At this stage, we test the replication that was set up between the node1 source database and the node2 target database.
  • In step 16, we generate some traffic on the node1 source database by running pgbench with five parallel database connections and generating traffic for 10 seconds.
  • In step 17, we check whether the tables on both the source and target databases contain the same data. For this purpose, we use the compare command on the provider and subscriber nodes, which counts and checksums the rows on both sides. The partial output shown here confirms that the data has been successfully replicated between all the tables in the replication set, as the row counts and checksums match on the source and target databases for every table:
    2014-12-09 19:26:18,975 2982 INFO Counting public.pgbench_accounts
    2014-12-09 19:26:19,401 2982 INFO srcdb: 200000 rows, checksum=167607238449
    2014-12-09 19:26:19,706 2982 INFO dstdb: 200000 rows, checksum=167607238449
     
    2014-12-09 19:26:22,772 2982 INFO Counting public.pgbench_branches
    2014-12-09 19:26:22,804 2982 INFO srcdb: 2 rows, checksum=-3078609798
    2014-12-09 19:26:22,812 2982 INFO dstdb: 2 rows, checksum=-3078609798
     
    2014-12-09 19:26:25,963 2982 INFO Counting public.pgbench_history
    2014-12-09 19:26:26,008 2982 INFO srcdb: 715 rows, checksum=9467587272
    2014-12-09 19:26:26,020 2982 INFO dstdb: 715 rows, checksum=9467587272
     
    2014-12-09 19:26:29,144 2982 INFO Counting public.pgbench_tellers
    2014-12-09 19:26:29,176 2982 INFO srcdb: 20 rows, checksum=4814381032
    2014-12-09 19:26:29,182 2982 INFO dstdb: 20 rows, checksum=4814381032
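A quick way to machine-check this output is to pair each srcdb line with the dstdb line that follows it and fail on any checksum mismatch. A sketch, assuming the log format printed by londiste3 as shown above:

```shell
# Sketch: scan a londiste3 compare log and exit non-zero if any
# srcdb/dstdb checksum pair differs.
check_compare_log() {
  awk '/srcdb:/ { src = $NF }
       /dstdb:/ { if ($NF != src) bad = 1 }
       END      { exit bad }' "$1"
}

# Demonstrate on a sample of the output above:
cat > /tmp/compare.log <<'EOF'
2014-12-09 19:26:19,401 2982 INFO srcdb: 200000 rows, checksum=167607238449
2014-12-09 19:26:19,706 2982 INFO dstdb: 200000 rows, checksum=167607238449
EOF
check_compare_log /tmp/compare.log && echo "checksums match"
```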

Summary

This article demonstrated high availability and replication concepts in PostgreSQL. After reading it, you will be able to implement high availability and replication using different techniques, including streaming replication, Slony-I replication, and replication using Londiste.
