You're reading from Learn PostgreSQL Use, manage, and build secure and scalable databases with PostgreSQL 16

Product type Paperback

Published in Oct 2023

Publisher Packt

ISBN-13 9781837635641

Length 744 pages

Edition 2nd Edition

Languages

SQL

Tools

PostgreSQL

Concepts

Data Analysis

Authors (2):

Ferrari

Enrico Pirozzi

View More author details

Table of Contents (22) Chapters

Preface

1. Introduction to PostgreSQL

2. Getting to Know Your Cluster FREE CHAPTER

3. Managing Users and Connections

4. Basic Statements

5. Advanced Statements

6. Window Functions

7. Server-Side Programming

8. Triggers and Rules

9. Partitioning

10. Users, Roles, and Database Security

11. Transactions, MVCC, WALs, and Checkpoints

12. Extending the Database – the Extension Ecosystem

13. Query Tuning, Indexes, and Performance Optimization

14. Logging and Auditing

15. Backup and Restore

16. Configuration and Monitoring

17. Physical Replication

18. Logical Replication

19. Useful Tools and Extensions

20. Other Books You May Enjoy

21. Index

Exploring the disk layout of PGDATA

In the previous sections, you have seen how to install PostgreSQL and connect to it, but we have not looked at the storage part of a cluster. Since the aim of PostgreSQL, as well as the aim of any relational database, is to permanently store data, the cluster needs some sort of permanent storage. In particular, PostgreSQL exploits the underlying filesystem to store its own data. All of the PostgreSQL-related stuff is contained in a directory known as PGDATA.

The PGDATA directory acts as the disk container that stores all the data of the cluster, including the users’ data and cluster configuration.

The following is an example of the content of PGDATA for a running PostgreSQL 16 cluster:

$  ls -1 /postgres/16/data
base
global
pg_commit_ts
pg_dynshmem
pg_hba.conf
pg_ident.conf
pg_logical
pg_multixact
pg_notify
pg_replslot
pg_serial
pg_snapshots
pg_stat
pg_stat_tmp
pg_subtrans
pg_tblspc
pg_twophase
PG_VERSION
pg_wal
pg_xact
postgresql.auto.conf
postgresql.conf
postmaster.opts
postmaster.pid

The PGDATA directory is structured in several files and subdirectories. The main files are as follows:

postgresql.conf is the main configuration file, used by default when the service is started.
postgresql.auto.conf is the automatically included configuration file used to store dynamically changed settings via SQL instructions.
pg_hba.conf is the HBA file that provides the configuration regarding available database connections.
PG_VERSION is a text file that contains the major version number (useful when inspecting the directory to understand which version of the cluster has managed the PGDATA directory).
postmaster.pid is the PID of the postmaster process, the first launched process in the cluster.

The main directories available in PGDATA are as follows:

base is a directory that contains all the users’ data, including databases, tables, and other objects.
global is a directory containing cluster-wide objects.
pg_wal is the directory containing the WAL files.
pg_stat and pg_stat_tmp are, respectively, the storage of permanent and temporary statistical information about the status and health of the cluster.

Of course, all files and directories in PGDATA are important for the cluster to work properly, but so far, the preceding is the “core” list of objects that are fundamental in PGDATA itself. Other files and directories will be discussed in later chapters.

Objects in the PGDATA directory

PostgreSQL does not name objects on disk, such as tables, in a mnemonic or human-readable way; instead, every file is named after a numeric identifier. You can see this by having a look, for instance, at the base subdirectory:

$  ls -1 /postgres/16/data/base
1
16386
4
5

As you can see from the preceding code, the base directory contains four objects, named 1,4, 5, and 16386. Please note that these numbers could be different on your machine. In particular, each of the preceding is a directory that contains other files, as shown here:

$ ls -1 /postgres/16/data/base/16386 | head
112
113
1247
1247_fsm
1247_vm
1249
1249_fsm
1249_vm
1255
1255_fsm

As you can see, each file is named with a numeric identifier. Internally, PostgreSQL holds a specific catalog that allows the database to match a mnemonic name to a numeric identifier, and vice versa. The integer identifier is named OID (or, Object Identifier); this name is a historical term that today corresponds to the so-called filenode. The two terms will be used interchangeably in this section.

There is a specific utility that allows you to inspect a PGDATA directory and extract mnemonic names: oid2name. For example, if you executed the oid2name utility, you’d get a list of all available databases similar to the following one:

$ oid2name
All databases:
   Oid  Database Name  Tablespace
----------------------------------
 16390        forumdb  pg_default
     5       postgres  pg_default
     4      template0  pg_default
     1      template1  pg_default

As you can see, the Oid numbers in the oid2name output reflect the same directory names listed in the base directory; every subdirectory has a name corresponding to the database.

You can even go further and inspect a single file going into the database directory, specifying the database where you are going to search for an object name with the -d flag:

$ cd /postgres/16/data/base/1
$ oid2name -d template1 -f 3395
From database "template1":
  Filenode                 Table Name
-------------------------------------
      3395  pg_init_privs_o_c_o_index

As you can see from the preceding example, the 3395 file in the /postgres/16/data/base/1 directory corresponds to the table named pg_init_privs_o_c_o_index. Therefore, when PostgreSQL needs to interact with a table like this, it will seek the disk to the /postgres/16/data/base/1/3395 file.

From the preceding example, it should be clear that every SQL table is stored as a file with a numeric name. However, PostgreSQL does not allow a single file to be greater than 1 GB in size, so what happens if a table grows beyond that limit? PostgreSQL “attaches” another file with a numeric extension that indicates the next chunk of 1 GB of data. In other words, if your table is stored in the 123 file, the second gigabyte will be stored in the 123.1 file, and if another gigabyte of storage is needed, another file, 123.2, will be created. Therefore, the filenode refers to the very first file related to a specific table, but more than one file can be stored on disk.

Tablespaces

PostgreSQL pretends to find all its data within the PGDATA directory, but that does not mean that your cluster is “jailed” in this directory. In fact, PostgreSQL allows “escaping” the PGDATA directory by means of tablespaces. A tablespace is a directory that can be outside the PGDATA directory and can also belong to different storage. Tablespaces are mapped into the PGDATA directory by means of symbolic links stored in the pg_tblspc subdirectory. In this way, the PostgreSQL processes do not have to look outside PGDATA, but are still able to access “external” storage. A tablespace can be used to achieve different aims, such as enlarging the storage data or providing different storage performances for specific objects. For instance, you can create a tablespace on a slow disk to contain infrequently accessed objects and tables, keeping fast storage within another tablespace for frequently accessed objects.

You don’t have to make links by yourself: PostgreSQL provides the TABLESPACE feature to manage this and the cluster will create and manage the appropriate links under the pg_tblspc subdirectory.

For instance, the following is a PGDATA directory that has three different tablespaces:

$ ls -l /postgres/16/data/pg_tblspc/
lrwxrwxrwx 1 postgres postgres 22 Jan 19 13:08 16384 -> /data/tablespaces/ts_a
lrwxrwxrwx 1 postgres postgres 22 Jan 19 13:08 16385 -> /data/tablespaces/ts_b
lrwxrwxrwx 1 postgres postgres 22 Jan 19 13:08 16386 -> /data/tablespaces/ts_c

As you can see from the preceding example, there are three tablespaces that are attached to the /data storage. You can inspect them with oid2name and the -s flag:

$ oid2name -s
All tablespaces:
    Oid  Tablespace Name
------------------------
   1663       pg_default
   1664        pg_global
  16384             ts_a
  16385             ts_b
  16386             ts_c

As you can see, the numeric identifiers of the symbolic links are mapped to the mnemonic names of the tablespaces. From the preceding example, you can observe that there are also two particular tablespaces:

pg_default is the default tablespace corresponding to “none,” the default storage to be used for every object when nothing is explicitly specified. In other words, every object stored directly under the PGDATA directory is attached to the pg_default tablespace.
pg_global is the tablespace used for system-wide objects.

By default, both of the preceding tablespaces refer directly to the PGDATA directory, meaning any cluster without a custom tablespace is totally contained within the PGDATA directory.