PostgreSQL's Transaction Model

September 2008

On Databases

Databases come in many forms. The simplest definition of a database is any system of storing, organizing, and retrieving data. With this definition, things like memory, hard drives, file systems, files on those file systems (stored in plain text, tab-delimited, XML, JSON, or even BDB formats), and even applications like MySQL, PostgreSQL, and Oracle are considered databases.

Databases allow users to:

  • Store Data
  • Organize Data
  • Retrieve Data

It is important to keep a broad perspective on what data and databases really are so that you can always choose the best solution for your particular problem.

The SQL databases (MySQL, PostgreSQL, Oracle, and others) are remarkable because of the flexibility and performance they provide. In my work, I look to them first when developing an application, with an eye towards getting the data model right before optimization. Once the application is solid, and once I fully understand what parts of the data system are too slow or fast enough, then I can start building my own database on top of the file system or other existing technologies that will give me the kind of performance I need.

Among the SQL databases, which one is best? There are many criteria I use to evaluate SQL databases, and the one I pay attention to most is how they comply (if at all) with the ACID model.

And given the technical merits of the various SQL databases, I consistently choose PostgreSQL above all other SQL databases when given a choice. Allow me to explain why.

The ACID Model

ACID is an acronym, standing for the four words Atomicity, Consistency, Isolation, and Durability. These are fancy words for some very basic and essential concepts.

Atomicity means that you either do all of the changes you want, or none of them, without leaving the database in some weird in-between state. When you take into account catastrophes like power failures or corruption, atomicity isn't as simple as it first seems.

Consistency means that any state of the database will be internally consistent with the rules that constrain the data. That is, if you have a table with a primary key, then that table will not contain any violations of the primary key constraints after any transaction.

Isolation means that you can be modifying many different parts of the database at the same time without affecting each other. (As a higher feature, there is Serialization, which requires that transactions occur one after the other, or at least the results of transactions.)

Durability means that once a transaction completes, it is never lost, ever.

  • Atomicity: All or nothing
  • Consistency: Rules kept
  • Isolation: No partials seen
  • Durability: Doesn't disappear

ACID compliance isn't rocket science, but it isn't trivial either. These requirements form a minimum standard absolutely necessary to provide a database for a reasonable application.

That is, if you can't guarantee these things, then the users of your application are going to be frustrated since they assume, naturally, that the ACID model is followed. And if the users of the application get frustrated, then the developers of the application will get frustrated as they try to comply with the user's expectations.

A lot of frustration can be avoided if the database simply complies with the principles of the ACID model. If the database gets it right, then the rest of the application will have no problem getting it right as well. Our users will be happy since their expectations of ACID compliance will be met.

Remember: Users expect ACID!

What Violating the ACID Model Looks Like

To consider the importance of the ACID model, let's examine, briefly, what happens when the model is violated.

When Atomicity isn't adhered to, users will see their data partially committed. For instance, they might find their online profile only partially modified, or their bank transfer partially transferred. This is, of course, devastating to the unwary user.

When Consistency is violated, the rules that the data should follow aren't adhered to. Perhaps the number of friends shown doesn't match the friends they actually have in a social networking application. Or perhaps they see their bank balance doesn't match what the numbers add up to. Or worse, perhaps your order system is counting orders that don't even exist and not counting orders that do.

When Isolation isn't guaranteed, they will either have to use a system where only one person can change something at a time, locking out all others, or they will see inconsistencies throughout the world of data, inconsistencies resulting from transactions that are in progress elsewhere. This will make the data unreliable just like violating Atomicity or Consistency. A bank user, for instance, will believe their transfer of funds was successful when in reality their money was simultaneously being withdrawn by another transaction.

When Durability is lost, then users will never know if their transaction really went through, and won't mysteriously disappear down the road with all the trouble that entails.

I am sure we have all had experiences dealing with data systems that didn't follow the ACID model. I remember the days when you had to save your files frequently, and even then you still weren't ensured that all of your data would be properly saved. I also recall applications that would make partial changes, or incomplete changes, and expose these inconsistent states to the user.

In today's world, writing applications with faults like the above is simply inexcusable. There are too many tools out there that are readily available that make writing ACID compliant systems easy. One of those tools, probably the most popular of all, is the SQL database.

Satisfying ACID with Transactions

The principle way that databases comply with ACID requirements is through the concept of transactions.

Ideally, each transaction would occur in an instant, updating the database according to the state of the database at that moment. In reality, this isn't possible. It takes time to accumulate the data and apply the changes.

Typical transaction SQL commands:

  • BEGIN: Start a new transaction
  • COMMIT: Commit the transaction
  • ROLLBACK: Roll back the transaction in progress

Since multiple sessions can each be creating and applying a transaction simultaneously, special precautions have to be taken to ensure that the data that each transaction “sees” is consistent, and that the effects of each transaction appear all together or not at all. Special care is also taken to ensure that when a transaction is committed, the database will be put in a state where catastrophic events will not leave the transaction partially committed.

Contrary to popular belief, there are a variety of ways that databases support transactions. It is well worth the time to read and understand PostgreSQL's two levels of transaction isolation and the four possible isolation levels in Section 12.2 of the PostgreSQL documentation.

Note that some of the inferior levels of transaction isolation violate some extreme cases of ACID compliance for the sake of performance. These edge cases can be properly handled with appropriate use of row-locking techniques. Row-locking is an issue beyond this article.

Keep in mind that the levels of transaction isolation are only what appear to users of the database. Inside the database, there is a remarkable variety of methods on actually implementing transactions.

Consider that while you are in a transaction, making changes to the database, every other transaction has to see one version of the database while you see another. In effect, you have to have copies of some of the data lying around somewhere. Queries to that data have to know which version of the data to retrieve the copy, the original, or the modified version (and which modified version?) Changes to the data have to go somewhere the original, a copy, or some modified version (again, which?) Answering these questions leads to the various implementations of transactions in ACID compliant databases.

For the purposes of this article, I will examine only two: Oracle's and PostgreSQL's implementations. If you are only familiar with Oracle, then hopefully you will learn something new and fascinating as you investigate PostgreSQL's method.

Rollback Segments in Oracle

A simple implementation of an advanced transaction system is to simply store the modified data of each transaction in the data tables themselves. Meanwhile, information required to rollback any particular transaction is stored in the rollback segment. Since rollbacks are rare, it is a good guess to allow this data to accumulate and then to delete it once the transaction has fully committed.

Transactions that are concurrent need to see a consistent view of data. This means that instead of live rows in the table which have been updated by a more recent transaction, they may have to instead retrieve rows from the rollback segment.

Of course, this kind of scheme lead to dreaded errors with the rollback segment, such as ORA-1555 “Snapshot too old”. The net result of such errors is that long-running or large transactions have to be run when the database is effectively out of service, or not at all.

PostgreSQL's Transaction Model

Multi-Version Concurrency Control (MVCC) in PostgreSQL

But there is another way, the way that PostgreSQL handles it.

PostgreSQL's MVCC keeps all of the versions of the data together in the same partition in the same table. By identifying which rows were added by which transactions, which rows were deleted by which transactions, and which transactions have actually committed, it becomes a straightforward check to see which rows are visible for which transactions.

The specific details of this MVCC are incredibly simple. Rows of a table are stored in PostgreSQL as a tuple. Two fields of each tuple are xmin and xmax. Xmin is the transaction ID of the transaction that created the tuple. Xmax is the transaction ID of the transaction that deleted it (if any).

Along with the tuples in each table, a record of each transaction and its current state (in progress, committed, aborted) is kept in a universal transaction log.

When data in a table is selected, only those rows that are created and not destroyed are seen. That is, each row's xmin is observed. If the xmin is a transaction that is in progress or aborted (but not the transaction doing the observing), then the row is invisible. If the xmin is a transaction that has committed or the current transaction, however, then the xmax is observed. If the xmax is a transaction that is in progress or aborted and not the current transaction, or if there is no xmax at all, then the row is seen. Otherwise, the row is considered as already deleted.

Insertions are straightforward. The transaction that inserts the tuple simply creates it with the xmax blank and the xmin set to its transaction ID. Whether the transaction is committed or aborted is irrelevant only the transaction's state needs to be updated.

Deletions are also straightforward. The tuple's xmax is set to the current transaction. Like insertions, whether the transaction is committed or aborted is irrelevant.

Updates are no more than a concurrent insert and delete.

PostgreSQL's MVCC method isn't intuitive, but it is simple and powerful. Instead of filling a rollback segment with long-running transactions, the table space itself is consumed.

PostgreSQL's Transaction Model


PostgreSQL's MVCC is remarkably simply and remarkably fast, except for one fatal flaw. Over time, data accumulates in the table space. Old rows that were deleted or updated don't go away. There is no automatic elimination process once a transaction commits. Not only will this data fill disks, but the database will slow down as the majority of tuples are ignored. Seemingly empty tables can be filled with millions and millions of no longer relevant tuples. Indexes can help in these scenarios, but table scans will take much longer than expected.

That's why you have to vacuum PostgreSQL tables. The vacuum process simple removes the tuples that are no longer needed, freeing up valuable space on the hard drive and increasing the performance of the database by limiting the number of tuples that are checked for each query.

How do you identify tuples that are expired? Xmax will be a transaction that has long ago committed, committed before any running transaction.

How often should a table be vacuumed? If none of the rows are ever deleted or updated, then vacuuming is never necessary. However, if a table is frequently updated or rows deleted, then vacuuming should be done regularly, depending on how frequent those changes occur.

Nowadays PostgreSQL ships with an auto-vacuum daemon that can automate vacuuming of tables to appropriate intervals depending on how each table is used.

Rollback Segments or MVCC?

Too many people think that Oracle's solution to ACID compliance is the best solution out there. I beg to differ, of course. Oracle's approach is one of many, and depending on your data needs, may not be the best.

I have seen PostgreSQL perform much better in areas that Oracle isn't very suitable. In general, PostgreSQL does well in cases where Oracle would fill its rollback partition. However, in other use cases Oracle is much better optimized for the task. I can't tell you precisely which situations are better or worse. It's something you'll have to discover for yourself.


ACID compliance is a standard we as developers need to strive for in all of our applications. We can no longer excuse ourselves from it claiming that it is too hard to get right. There are many choices out there that can give us ACID compliance, or something very close to it, and two of them are Oracle and PostgreSQL. But Oracle and PostgreSQL approach ACID compliance in different ways, giving them different performance characteristics. It's up to us to figure out which way will best suit our needs.

You've been reading an excerpt of:

Creating your MySQL Database: Practical Design Tips and Techniques

Explore Title