In this article by the author, Hans-Jürgen Schönig, of the book, PostgreSQL Replication, Second Edition, we learn how to set up synchronous replication.
In asynchronous replication, data is submitted and received by the slave (or slaves) after the transaction has been committed on the master. During the time between the master's commit and the point when the slave actually has fully received the data, it can still be lost.
Here, you will learn about the following topics:
- Making sure that no single transaction can be lost
- Configuring PostgreSQL for synchronous replication
- Understanding and using application_name
- The performance impact of synchronous replication
- Optimizing replication for speed
Synchronous replication can be the cornerstone of your replication setup, providing a system that ensures zero data loss.
(For more resources related to this topic, see here.)
Synchronous replication setup
Synchronous replication has been made to protect your data at all costs. The core idea of synchronous replication is that a transaction must be on at least two servers before the master returns success to the client. Making sure that data is on at least two nodes is a key requirement to ensure no data loss in the event of a crash.
Setting up synchronous replication works just like setting up asynchronous replication. Just a handful of parameters discussed here have to be changed to enjoy the blessings of synchronous replication. However, if you are about to create a setup based on synchronous replication, we recommend getting started with an asynchronous setup and gradually extending your configuration and turning it into synchronous replication. This will allow you to debug things more easily and avoid problems down the road.
Understanding the downside to synchronous replication
The most important thing you have to know about synchronous replication is that it is simply expensive. Synchronous replication and its downsides are two of the core reasons for which we have decided to include all this background information in this book. It is essential to understand the physical limitations of synchronous replication, otherwise you could end up in deep trouble.
When setting up synchronous replication, try to keep the following things in mind:
- Minimize the latency
- Make sure you have redundant connections
- Synchronous replication is more expensive than asynchronous replication
- Always cross-check twice whether there is a real need for synchronous replication
In many cases, it is perfectly fine to lose a couple of rows in the event of a crash. Synchronous replication can safely be skipped in this case. However, if there is zero tolerance, synchronous replication is a tool that should be used.
Understanding the application_name parameter
In order to understand a synchronous setup, a config variable called application_name is essential, and it plays an important role in a synchronous setup. In a typical application, people use the application_name parameter for debugging purposes, as it allows users to assign a name to their database connection. It can help track bugs, identify what an application is doing, and so on:
test=# SHOW application_name; application_name ------------------ psql (1 row) test=# SET application_name TO 'whatever'; SET test=# SHOW application_name; application_name ------------------ whatever (1 row)
As you can see, it is possible to set the application_name parameter freely. The setting is valid for the session we are in, and will be gone as soon as we disconnect. The question now is: What does application_name have to do with synchronous replication?
Well, the story goes like this: if this application_name value happens to be part of synchronous_standby_names, the slave will be a synchronous one. In addition to that, to be a synchronous standby, it has to be:
- streaming data in real-time (that is, not fetching old WAL records)
Once a standby becomes synced, it remains in that position until disconnection.
In the case of cascaded replication (which means that a slave is again connected to a slave), the cascaded slave is not treated synchronously anymore. Only the first server is considered to be synchronous.
With all of this information in mind, we can move forward and configure our first synchronous replication.
Making synchronous replication work
To show you how synchronous replication works, this article will include a full, working example outlining all the relevant configuration parameters.
A couple of changes have to be made to the master. The following settings will be needed in postgresql.conf on the master:
wal_level = hot_standby max_wal_senders = 5 # or any number synchronous_standby_names = 'book_sample' hot_standby = on # on the slave to make it readable
Then we have to adapt pg_hba.conf. After that, the server can be restarted and the master is ready for action.
We recommend that you set wal_keep_segments as well to keep more transaction logs. We also recommend setting wal_keep_segments to keep more transaction logs on the master database. This makes the entire setup way more robust.
It is also possible to utilize replication slots.
In the next step, we can perform a base backup just as we have done before. We have to call pg_basebackup on the slave. Ideally, we already include the transaction log when doing the base backup. The --xlog-method=stream parameter allows us to fire things up quickly and without any greater risks.
The --xlog-method=stream and wal_keep_segments parameters are a good combo, and in our opinion, should be used in most cases to ensure that a setup works flawlessly and safely.
We have already recommended setting hot_standby on the master. The config file will be replicated anyway, so you save yourself one trip to postgresql.conf to change this setting. Of course, this is not fine art but an easy and pragmatic approach.
Once the base backup has been performed, we can move ahead and write a simple recovery.conf file suitable for synchronous replication, as follows:
iMac:slavehs$ cat recovery.conf primary_conninfo = 'host=localhost application_name=book_sample port=5432' standby_mode = on
The config file looks just like before. The only difference is that we have added application_name to the scenery. Note that the application_name parameter must be identical to the synchronous_standby_names setting on the master.
Once we have finished writing recovery.conf, we can fire up the slave.
In our example, the slave is on the same server as the master. In this case, you have to ensure that those two instances will use different TCP ports, otherwise the instance that starts second will not be able to fire up. The port can easily be changed in postgresql.conf.
After these steps, the database instance can be started. The slave will check out its connection information and connect to the master. Once it has replayed all the relevant transaction logs, it will be in synchronous state. The master and the slave will hold exactly the same data from then on.
Checking the replication
Now that we have started the database instance, we can connect to the system and see whether things are working properly.
To check for replication, we can connect to the master and take a look at pg_stat_replication. For this check, we can connect to any database inside our (master) instance, as follows:
postgres=# \x Expanded display is on. postgres=# SELECT * FROM pg_stat_replication; -[ RECORD 1 ]----+------------------------------ pid | 62871 usesysid | 10 usename | hs application_name | book_sample client_addr | ::1 client_hostname | client_port | 59235 backend_start | 2013-03-29 14:53:52.352741+01 state | streaming sent_location | 0/30001E8 write_location | 0/30001E8 flush_location | 0/30001E8 replay_location | 0/30001E8 sync_priority | 1 sync_state | sync
This system view will show exactly one line per slave attached to your master system.
The \x command will make the output more readable for you. If you don't use \x to transpose the output, the lines will be so long that it will be pretty hard for you to comprehend the content of this table. In expanded display mode, each column will be in one line instead.
You can see that the application_name parameter has been taken from the connect string passed to the master by the slave (which is book_sample in our example). As the application_name parameter matches the master's synchronous_standby_names setting, we have convinced the system to replicate synchronously. No transaction can be lost anymore because every transaction will end up on two servers instantly. The sync_state setting will tell you precisely how data is moving from the master to the slave.
You can also use a list of application names, or simply a * sign in synchronous_standby_names to indicate that the first slave has to be synchronous.
Understanding performance issues
At various points in this book, we have already pointed out that synchronous replication is an expensive thing to do. Remember that we have to wait for a remote server and not just the local system. The network between those two nodes is definitely not something that is going to speed things up. Writing to more than one node is always more expensive than writing to only one node. Therefore, we definitely have to keep an eye on speed, otherwise we might face some pretty nasty surprises.
Consider what you have learned about the CAP theory earlier in this book. Synchronous replication is exactly where it should be, with the serious impact that the physical limitations will have on performance.
The main question you really have to ask yourself is: do I really want to replicate all transactions synchronously? In many cases, you don't. To prove our point, let's imagine a typical scenario: a bank wants to store accounting-related data as well as some logging data. We definitely don't want to lose a couple of million dollars just because a database node goes down. This kind of data might be worth the effort of replicating synchronously. The logging data is quite different, however. It might be far too expensive to cope with the overhead of synchronous replication. So, we want to replicate this data in an asynchronous way to ensure maximum throughput.
How can we configure a system to handle important as well as not-so-important transactions nicely? The answer lies in a variable you have already seen earlier in the book—the synchronous_commit variable.
Setting synchronous_commit to on
In the default PostgreSQL configuration, synchronous_commit has been set to on. In this case, commits will wait until a reply from the current synchronous standby indicates that it has received the commit record of the transaction and has flushed it to the disk. In other words, both servers must report that the data has been written safely. Unless both servers crash at the same time, your data will survive potential problems (crashing of both servers should be pretty unlikely).
Setting synchronous_commit to remote_write
Flushing to both disks can be highly expensive. In many cases, it is enough to know that the remote server has accepted the XLOG and passed it on to the operating system without flushing things to the disk on the slave. As we can be pretty certain that we don't lose two servers at the very same time, this is a reasonable compromise between performance and consistency with respect to data protection.
Setting synchronous_commit to off
The idea is to delay WAL writing to reduce disk flushes. This can be used if performance is more important than durability. In the case of replication, it means that we are not replicating in a fully synchronous way.
Keep in mind that this can have a serious impact on your application. Imagine a transaction committing on the master and you wanting to query that data instantly on one of the slaves. There would still be a tiny window during which you can actually get outdated data.
Setting synchronous_commit to local
The local value will flush locally but not wait for the replica to respond. In other words, it will turn your transaction into an asynchronous one.
Setting synchronous_commit to local can also cause a small time delay window, during which the slave can actually return slightly outdated data. This phenomenon has to be kept in mind when you decide to offload reads to the slave.
In short, if you want to replicate synchronously, you have to ensure that synchronous_commit is set to either on or remote_write.
Changing durability settings on the fly
Changing the way data is replicated on the fly is easy and highly important to many applications, as it allows the user to control durability on the fly. Not all data has been created equal, and therefore, more important data should be written in a safer way than data that is not as important (such as log files). We have already set up a full synchronous replication infrastructure by adjusting synchronous_standby_names (master) along with the application_name (slave) parameter. The good thing about PostgreSQL is that you can change your durability requirements on the fly:
test=# BEGIN; BEGIN test=# CREATE TABLE t_test (id int4); CREATE TABLE test=# SET synchronous_commit TO local; SET test=# \x Expanded display is on. test=# SELECT * FROM pg_stat_replication; -[ RECORD 1 ]----+------------------------------ pid | 62871 usesysid | 10 usename | hs application_name | book_sample client_addr | ::1 client_hostname | client_port | 59235 backend_start | 2013-03-29 14:53:52.352741+01 state | streaming sent_location | 0/3026258 write_location | 0/3026258 flush_location | 0/3026258 replay_location | 0/3026258 sync_priority | 1 sync_state | sync test=# COMMIT; COMMIT
In this example, we changed the durability requirements on the fly. This will make sure that this very specific transaction will not wait for the slave to flush to the disk. Note, as you can see, sync_state has not changed. Don't be fooled by what you see here; you can completely rely on the behavior outlined in this section. PostgreSQL is perfectly able to handle each transaction separately. This is a unique feature of this wonderful open source database; it puts you in control and lets you decide which kind of durability requirements you want.
Understanding the practical implications and performance
We have already talked about practical implications as well as performance implications. But what good is a theoretical example? Let's do a simple benchmark and see how replication behaves. We are performing this kind of testing to show you that various levels of durability are not just a minor topic; they are the key to performance.
Let's assume a simple test: in the following scenario, we have connected two equally powerful machines (3 GHz, 8 GB RAM) over a 1 Gbit network. The two machines are next to each other. To demonstrate the impact of synchronous replication, we have left shared_buffers and all other memory parameters as default, and only changed fsync to off to make sure that the effect of disk wait is reduced to practically zero.
The test is simple: we use a one-column table with only one integer field and 10,000 single transactions consisting of just one INSERT statement:
INSERT INTO t_test VALUES (1);
We can try this with full, synchronous replication (synchronous_commit = on):
real 0m6.043s user 0m0.131s sys 0m0.169s
As you can see, the test has taken around 6 seconds to complete. This test can be repeated with synchronous_commit = local now (which effectively means asynchronous replication):
real 0m0.909s user 0m0.101s sys 0m0.142s
In this simple test, you can see that the speed has gone up by us much as six times. Of course, this is a brute-force example, which does not fully reflect reality (this was not the goal anyway). What is important to understand, however, is that synchronous versus asynchronous replication is not a matter of a couple of percentage points or so. This should stress our point even more: replicate synchronously only if it is really needed, and if you really have to use synchronous replication, make sure that you limit the number of synchronous transactions to an absolute minimum.
Also, please make sure that your network is up to the job. Replicating data synchronously over network connections with high latency will kill your system performance like nothing else. Keep in mind that throwing expensive hardware at the problem will not solve the problem. Doubling the clock speed of your servers will do practically nothing for you because the real limitation will always come from network latency.
The performance penalty with just one connection is definitely a lot larger than that with many connections. Remember that things can be done in parallel, and network latency does not make us more I/O or CPU bound, so we can reduce the impact of slow transactions by firing up more concurrent work.
When synchronous replication is used, how can you still make sure that performance does not suffer too much? Basically, there are a couple of important suggestions that have proven to be helpful:
- Use longer transactions: Remember that the system must ensure on commit that the data is available on two servers. We don't care what happens in the middle of a transaction, because anybody outside our transaction cannot see the data anyway. A longer transaction will dramatically reduce network communication.
- Run stuff concurrently: If you have more than one transaction going on at the same time, it will be beneficial to performance. The reason for this is that the remote server will return the position inside the XLOG that is considered to be processed safely (flushed or accepted). This method ensures that many transactions can be confirmed at the same time.
Redundancy and stopping replication
When talking about synchronous replication, there is one phenomenon that must not be left out. Imagine we have a two-node cluster replicating synchronously. What happens if the slave dies? The answer is that the master cannot distinguish between a slow and a dead slave easily, so it will start waiting for the slave to come back.
At first glance, this looks like nonsense, but if you think about it more deeply, you will figure out that synchronous replication is actually the only correct thing to do. If somebody decides to go for synchronous replication, the data in the system must be worth something, so it must not be at risk. It is better to refuse data and cry out to the end user than to risk data and silently ignore the requirements of high durability.
If you decide to use synchronous replication, you must consider using at least three nodes in your cluster. Otherwise, it will be very risky, and you cannot afford to lose a single node without facing significant downtime or risking data loss.
Here, we outlined the basic concept of synchronous replication, and showed how data can be replicated synchronously. We also showed how durability requirements can be changed on the fly by modifying PostgreSQL runtime parameters. PostgreSQL gives users the choice of how a transaction should be replicated, and which level of durability is necessary for a certain transaction.
Resources for Article:
- Introducing PostgreSQL 9 [article]
- PostgreSQL – New Features [article]
- Installing PostgreSQL [article]