Handling Backup and Recovery

In Chapter 8, Managing PostgreSQL Security, we took a look at everything we need to know to secure PostgreSQL in the simplest and most beneficial way possible. In this chapter, we will turn to backup and recovery. Taking backups should be a regular task, and every administrator should keep an eye on this vital exercise. Fortunately, PostgreSQL provides easy means of creating backups.

Therefore, in this chapter, we will cover the following topics:

  • Performing simple dumps
  • Handling various formats
  • Replaying backups
  • Handling global data

By the end of this chapter, you will be able to set up proper backup mechanisms.

Performing simple dumps

Backups and data exports are important because, without backups, you are effectively putting your database at risk in the case of crashes and storage-related issues. In a PostgreSQL setup, there are basically two main methods of performing backups:

  • Logical dumps (extracting a SQL script that represents your data)
  • Transaction log shipping

The idea behind transaction log shipping is to archive binary changes made to the database. Most people claim that transaction log shipping is the only real way to create backups. However, in my opinion, this is not necessarily true.

Many people rely on pg_dump to simply extract a textual representation of the data. Interestingly, pg_dump is also the oldest method of creating a backup and has been around since the very early days of the PostgreSQL project (transaction log shipping was added much later). Every PostgreSQL administrator becomes familiar with pg_dump sooner or later,...
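
As a minimal sketch (using a database called test, the same name used in the examples later in this chapter; the target path is just an example), a plain-text dump boils down to a single command:

[hs@linuxpc ~]$ pg_dump test > /tmp/dump.sql

The resulting file contains ordinary SQL statements (CREATE TABLE, COPY, and so on), which makes it easy to inspect, or even edit, before replaying.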

Handling various formats

So far, we have seen that pg_dump can be used to create text files. The problem here is that a text file can only be replayed completely, so if we have saved an entire database, we can only replay the entire thing. In most cases, this is not what we want to do. Therefore, PostgreSQL has additional formats that offer more functionality.

At this point, four formats are supported:

-F, --format=c|d|t|p    output file format (custom, directory, tar, plain text (default))

We have already seen plain text, which is just normal text. On top of that, we can use the custom format. The idea behind the custom format is to get a compressed dump, including a table of contents. Here are two equivalent ways to create a custom format dump:

[hs@linuxpc ~]$ pg_dump -Fc test > /tmp/dump.fc
[hs@linuxpc ~]$ pg_dump -Fc test -f /tmp/dump.fc

In addition to the table of contents, the compressed dump has one more advantage: it is a lot smaller...
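
To see what that table of contents actually looks like, pg_restore can list it without restoring anything. The following sketch also shows the directory format, which, unlike the custom format, lets pg_dump use multiple cores (the paths and the job count are just examples):

[hs@linuxpc ~]$ pg_restore -l /tmp/dump.fc
[hs@linuxpc ~]$ pg_dump -Fd test -f /tmp/dump_dir -j 4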

Replaying backups

Having a backup is pointless unless you have tried to actually replay it. Fortunately, this is easy to do. If you have created a plaintext backup, simply take the SQL file and execute it. The following example shows how that can be done:

psql your_db < your_file.sql

A plain-text backup is simply a text file containing everything; such a file can always be replayed this way.
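
If the replay should abort on the first error, or run as a single transaction, psql provides switches for exactly that; here is a sketch (your_db and your_file.sql are placeholders, as before):

psql -1 -v ON_ERROR_STOP=1 -d your_db -f your_file.sql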

If you have decided on a custom format or directory format, you can use pg_restore to replay the backup. Additionally, pg_restore allows you to do all kinds of fancy things, such as replaying just part of a database. In most cases, however, you will simply replay the entire database. In this example, we will create an empty database and just replay a custom format dump:

[hs@linuxpc backup]$ createdb new_db
[hs@linuxpc backup]$ pg_restore -d new_db -j 4 /tmp/dump.fc

Note that pg_restore will add data to an existing database. If your database is not empty, pg_restore might error out...
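
One way to handle a non-empty database is to let pg_restore drop existing objects before recreating them. The following sketch uses the --clean and --if-exists switches; be careful, as this really does drop objects in the target database:

[hs@linuxpc backup]$ pg_restore -d new_db --clean --if-exists /tmp/dump.fc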

Handling global data

In the previous sections, we learned about pg_dump and pg_restore, which are two vital programs when it comes to creating backups. The thing is, pg_dump creates database dumps—it works at the database level. If we want to back up an entire instance, we have to use pg_dumpall or dump all of the databases separately. Before we dig into that, it makes sense to see how pg_dumpall works:

pg_dumpall > /tmp/all.sql

pg_dumpall will connect to one database after the other and send the output to stdout, where you can process it with standard Unix tools. Note that pg_dumpall can be used just like pg_dump. However, it has some downsides: it supports neither the custom nor the directory format and therefore offers no multi-core support. This means that we will be stuck with a single thread.
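
Because everything is written to stdout, the dump can be piped straight through those Unix tools. A simple sketch is compressing on the fly (the target path is just an example):

pg_dumpall | gzip > /tmp/all.sql.gz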

However, there is more to pg_dumpall. Keep in mind that users live at the instance level. If you create a normal database dump, you will get all of the permissions, but you won’t get all...
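
To extract just those instance-level objects (roles and tablespaces), pg_dumpall provides the -g (--globals-only) switch; a typical sketch looks like this:

pg_dumpall -g > /tmp/globals.sql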

Summary

In this chapter, we learned about creating backups and dumps in general. Binary backups have not been covered so far, but you are already able to extract textual backups from the server so that you can save and replay your data in the simplest way possible. Data protection is important, and backups are vital to ensuring data security. Remember, without backups, you are running the risk of total data loss.

In Chapter 10, Making Sense of Backups and Replication, you will learn about transaction log shipping, streaming replication, and binary backups. You will also learn how to use PostgreSQL’s onboard tools to replicate instances.

Questions

Let us focus on some of the most important questions people often ask:

  • Should everybody create dumps? Actually, no. Dumps are often used as backups and are therefore considered an administrative task, so a proper security system is needed to ensure that not everyone can read everything.
  • Why are dumps so small? A binary copy is a lot larger because it contains the data “as it is” on disk. In a dump, things such as indexes are merely definitions, which are, of course, a lot smaller.
  • Do you have to dump the globals too? Forgetting to dump globals is often a core problem. Make sure that you always have a copy of those globals around.
  • Is it safe to have a .pgpass file? It is safe as long as you have made 100% sure that the .pgpass file (and the directory containing it) is only readable by the desired user.

The answers to these questions can also be found in the GitHub repository (https://github.com/PacktPublishing/Mastering-PostgreSQL-15-).

...