Home Data PostgreSQL 16 Administration Cookbook

PostgreSQL 16 Administration Cookbook

By Gianni Ciolli , Boriss Mejías , Jimmy Angelakos and 2 more
ai-assist-svg-icon Book + AI Assistant
eBook + AI Assistant $43.99 $29.99
Print $54.99
Subscription $15.99 $10 p/m for three months
ai-assist-svg-icon NEW: AI Assistant (beta) Available with eBook, Print, and Subscription.
ai-assist-svg-icon NEW: AI Assistant (beta) Available with eBook, Print, and Subscription. $10 p/m for first 3 months. $15.99 p/m after that. Cancel Anytime! ai-assist-svg-icon NEW: AI Assistant (beta) Available with eBook, Print, and Subscription.
What do you get with a Packt Subscription?
Gain access to our AI Assistant (beta) for an exclusive selection of 500 books, available during your subscription period. Enjoy a personalized, interactive, and narrative experience to engage with the book content on a deeper level.
This book & 7000+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with a Packt Subscription?
This book & 6500+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with eBook + Subscription?
Download this book in EPUB and PDF formats, plus a monthly download credit
This book & 6500+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with a Packt Subscription?
Gain access to our AI Assistant (beta) for an exclusive selection of 500 books, available during your subscription period. Enjoy a personalized, interactive, and narrative experience to engage with the book content on a deeper level.
This book & 6500+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with eBook?
Along with your eBook purchase, enjoy AI Assistant (beta) access in our online reader for a personalized, interactive reading experience.
Download this book in EPUB and PDF formats
Access this title in our online reader
DRM FREE - Read whenever, wherever and however you want
Online reader with customised display settings for better reading experience
What do you get with video?
Download this video in MP4 format
Access this title in our online reader
DRM FREE - Watch whenever, wherever and however you want
Online reader with customised display settings for better learning experience
What do you get with video?
Stream this video
Access this title in our online reader
DRM FREE - Watch whenever, wherever and however you want
Online reader with customised display settings for better learning experience
What do you get with Audiobook?
Download a zip folder consisting of audio files (in MP3 Format) along with supplementary PDF
What do you get with Exam Trainer?
Flashcards, Mock exams, Exam Tips, Practice Questions
Access these resources with our interactive certification platform
Mobile compatible-Practice whenever, wherever, however you want
ai-assist-svg-icon NEW: AI Assistant (beta) Available with eBook, Print, and Subscription. ai-assist-svg-icon NEW: AI Assistant (beta) Available with eBook, Print, and Subscription. BUY NOW $10 p/m for first 3 months. $15.99 p/m after that. Cancel Anytime! ai-assist-svg-icon NEW: AI Assistant (beta) Available with eBook, Print, and Subscription.
eBook + AI Assistant $43.99 $29.99
Print $54.99
Subscription $15.99 $10 p/m for three months
What do you get with a Packt Subscription?
Gain access to our AI Assistant (beta) for an exclusive selection of 500 books, available during your subscription period. Enjoy a personalized, interactive, and narrative experience to engage with the book content on a deeper level.
This book & 7000+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with a Packt Subscription?
This book & 6500+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with eBook + Subscription?
Download this book in EPUB and PDF formats, plus a monthly download credit
This book & 6500+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with a Packt Subscription?
Gain access to our AI Assistant (beta) for an exclusive selection of 500 books, available during your subscription period. Enjoy a personalized, interactive, and narrative experience to engage with the book content on a deeper level.
This book & 6500+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with eBook?
Along with your eBook purchase, enjoy AI Assistant (beta) access in our online reader for a personalized, interactive reading experience.
Download this book in EPUB and PDF formats
Access this title in our online reader
DRM FREE - Read whenever, wherever and however you want
Online reader with customised display settings for better reading experience
What do you get with video?
Download this video in MP4 format
Access this title in our online reader
DRM FREE - Watch whenever, wherever and however you want
Online reader with customised display settings for better learning experience
What do you get with video?
Stream this video
Access this title in our online reader
DRM FREE - Watch whenever, wherever and however you want
Online reader with customised display settings for better learning experience
What do you get with Audiobook?
Download a zip folder consisting of audio files (in MP3 Format) along with supplementary PDF
What do you get with Exam Trainer?
Flashcards, Mock exams, Exam Tips, Practice Questions
Access these resources with our interactive certification platform
Mobile compatible-Practice whenever, wherever, however you want
  1. Free Chapter
    First Steps
About this book
PostgreSQL has seen a huge increase in its customer base in the past few years and is becoming one of the go-to solutions for anyone who has a database-specific challenge. This PostgreSQL book touches on all the fundamentals of Database Administration in a problem-solution format. It is intended to be the perfect desk reference guide. This new edition focuses on recipes based on the new PostgreSQL 16 release. The additions include handling complex batch loading scenarios with the SQL MERGE statement, security improvements, running Postgres on Kubernetes or with TPA and Ansible, and more. This edition also focuses on certain performance gains, such as query optimization, and the acceleration of specific operations, such as sort. It will help you understand roles, ensuring high availability, concurrency, and replication. It also draws your attention to aspects like validating backups, recovery, monitoring, and scaling aspects. This book will act as a one-stop solution to all your real-world database administration challenges. By the end of this book, you will be able to manage, monitor, and replicate your PostgreSQL 16 database for efficient administration and maintenance with the best practices from experts.
Publication date:
December 2023
Publisher
Packt
Pages
636
ISBN
9781835460580

 

First Steps

PostgreSQL is a feature-rich, general-purpose database-management system. It’s a complex piece of software, but every journey begins with the first step.

We’ll start with your first connection. Many people fall at the first hurdle, so we’ll try not to skip past that too swiftly. We’ll quickly move on to enabling remote users, and from there, we will move on to getting access through GUI administration tools.

We will also introduce the psql query tool, which is the tool used to load our sample database, as well as many other examples in the book.

For additional help, we’ve included a few useful recipes that you may need for reference.

In this chapter, we will cover the following recipes:

  • Introducing PostgreSQL
  • How to get PostgreSQL
  • Connecting to the PostgreSQL server
  • Enabling access for network/remote users
  • Using the pgAdmin GUI tool
  • Using the psql query and scripting tool
  • Changing your password securely
  • Avoiding hardcoding your password
  • Using a connection service file
  • Troubleshooting a failed connection
  • PostgreSQL in the cloud
  • PostgreSQL with Kubernetes
  • PostgreSQL with TPA
 

Introducing PostgreSQL 16

PostgreSQL is an advanced SQL database server, available on a wide range of platforms. One of the clearest benefits of PostgreSQL is that it is open source, meaning that you have a very permissive license to install, use, and distribute it without paying anyone any fees or royalties. On top of that, PostgreSQL is known as a database that stays up for long periods and requires little or no maintenance, in most cases. Overall, PostgreSQL provides a very low total cost of ownership.

PostgreSQL is also known for its huge range of advanced features, developed over the course of more than 30 years of continuous development and enhancement. Originally developed by the Database Research Group at the University of California, Berkeley, PostgreSQL is now developed and maintained by a huge army of developers and contributors. Many of these contributors have full-time jobs related to PostgreSQL, working as designers, developers, database administrators, and trainers. Some, but not many, of these contributors work for companies that specialize in support for PostgreSQL. No single company owns PostgreSQL, nor are you required (or even encouraged) to register your usage.

PostgreSQL has the following main features:

  • Excellent SQL standards compliance, up to SQL:2023
  • Client-server architecture
  • A highly concurrent design, where readers and writers don’t block each other
  • Highly configurable and extensible for many types of applications
  • Excellent scalability and performance, with extensive tuning features
  • Support for many kinds of data models, such as relational, post-relational (arrays and nested relations via record types), document (JSON and XML), and key/value

What makes PostgreSQL different?

The PostgreSQL project focuses on the following objectives:

  • Robust, high-quality software with maintainable, well-commented code
  • Low-maintenance administration for both embedded and enterprise use
  • Standards-compliant SQL, interoperability, and compatibility
  • Performance, security, and high availability

What surprises many people is that PostgreSQL’s feature set is more similar to Oracle or SQL Server than it is to MySQL. The only connection between MySQL and PostgreSQL is that these two projects are open source; apart from that, the features and philosophies are almost totally different.

One of the key features of Oracle, since Oracle 7, has been snapshot isolation, where readers don’t block writers and writers don’t block readers. You may be surprised to learn that PostgreSQL was the first database to be designed with this feature, and it offers a complete implementation. In PostgreSQL, this feature is called Multiversion Concurrency Control (MVCC), and we will discuss this in more detail later in the book.

PostgreSQL is a general-purpose database management system. You define the database that you want to manage with it. PostgreSQL offers you many ways in which to work. You can either use a normalized database model, augmented with features such as arrays and record subtypes, or use a fully dynamic schema with the help of JSONB and an extension named hstore. PostgreSQL also allows you to create your own server-side functions in any of a dozen different languages, including a formal notion of transform to ensure data is properly converted.

PostgreSQL is highly extensible, so you can add your own data types, operators, index types, and functional languages. You can even override different parts of the system, using plugins to alter the execution of commands, or add a new query optimizer.

All of these features offer a huge range of implementation options to software architects. There are many ways out of trouble when building applications and maintaining them over long periods of time. Regrettably, we simply don’t have space in this book for all the cool features for developers; this book is about administration, maintenance, and backup.

In the early days, when PostgreSQL was still a research database, the focus was solely on the cool new features. Over the last 20 years, enormous amounts of code have been rewritten and improved, giving us one of the largest and most stable software servers available for operational use.

Who is using PostgreSQL? Prominent users include Apple, BASF, Genentech, Heroku, IMDB, Skype, McAfee, NTT, the UK Met Office, and the US National Weather Service. Early in 2010, PostgreSQL received well in excess of 1,000,000 downloads per year, according to data submitted to the European Commission, which concluded that “PostgreSQL is considered by many database users to be a credible alternative.” PostgreSQL has gone on from there to be even more popular.

We need to mention one last thing: when PostgreSQL was first developed, it was named Postgres, and therefore, many aspects of the project still refer to the word Postgres – for example, the default database is named postgres, and the software is frequently installed using the postgres user ID. As a result, people shorten the name PostgreSQL to simply Postgres and, in many cases, use the two names interchangeably.

PostgreSQL is pronounced post-grez-q-l. Postgres is pronounced post-grez.

Some people get confused and refer to it as Postgre or Postgre SQL, which are hard to say and likely to confuse people. Two names are enough, so don’t use a third one!

The following sections explain the key areas in more detail.

Robustness

PostgreSQL is robust, high-quality software, supported by testing for both features and concurrency. By default, the database provides strong disk-write guarantees, and developers take the risk of data loss very seriously in everything they do. Options to trade robustness for performance exist, although they are not enabled by default.

All actions on the database are performed within transactions, protected by a transaction log that will perform automatic crash recovery in case of software failure.

Databases may optionally be created with data block checksums to help diagnose hardware faults. Multiple backup mechanisms exist, with full and detailed Point-in-Time Recovery (PITR) if you need a detailed recovery. A variety of diagnostic tools are available as well.

Database replication is supported natively. Synchronous replication can provide greater than 5 nines (99.999%) of availability and data protection, if properly configured and managed, or even higher with appropriate redundancy.

Security

Access to PostgreSQL is controllable via host-based access rules. Authentication is flexible and pluggable, allowing for easy integration with any external security architecture. The latest Salted Challenge Response Authentication Mechanism (SCRAM) provides full 256-bit protection.

Full SSL-encrypted access is supported natively for both user access and replication. A full-featured cryptographic function library is available for database users.

PostgreSQL provides role-based access privileges to access data, by command type. PostgreSQL also provides Row-Level Security (RLS) for privacy, medical, and military-grade security.

Functions can execute with the permissions of the definer, while views may be defined with security barriers to ensure that security is enforced ahead of other processing.

All aspects of PostgreSQL are assessed by an active security team, while known exploits are categorized and reported at http://www.postgresql.org/support/security/.

Ease of use

Clear, full, and accurate documentation exists as a result of a development process where documentation changes are required.

The documentation can easily be found on the PostgreSQL website at https://www.postgresql.org/docs/. In this book, we will refer many times to the URLs of specific sections of that documentation.

Another option is to install a copy of the exact same documentation on your laptop, in the PDF or HTML format, for offline use. You can do it easily on most operating systems by installing the appropriate package, as in this Ubuntu/Debian example:

$ sudo apt-get install postgresql-doc-16

Hundreds of small changes occur with each release, which smooth off any rough edges of usage, supplied directly by knowledgeable users.

PostgreSQL works on small and large systems in the same way and across operating systems.

Client access and drivers exist for every language and environment, so there is no restriction on what type of development environment is chosen now or in the future.

The SQL standard is followed very closely; there is no weird behavior, such as silent truncation of data.

Text data is supported via a single data type that allows the storage of anything from 1 byte to 1 gigabyte. This storage is optimized in multiple ways, so 1 byte is stored efficiently, and much larger values are automatically managed and compressed.

PostgreSQL has a clear policy of minimizing the number of configuration parameters, and with each release, we work out ways to auto-tune the settings.

Extensibility

PostgreSQL is designed to be highly extensible. Database extensions can be easily loaded by using CREATE EXTENSION, which automates version checks, dependencies, and other aspects of configuration.

PostgreSQL supports user-defined data types, operators, indexes, functions, and languages.

Many extensions are available for PostgreSQL, including the PostGIS extension, which provides world-class Geographical Information System (GIS) features.

Performance and concurrency

PostgreSQL 16 can achieve significantly more than 1,000,000 reads per second on a 4-socket server, and it benchmarks at more than 50,000 write transactions per second with full durability, depending upon your hardware. With advanced hardware, even higher levels of performance are possible.

PostgreSQL has an advanced optimizer that considers a variety of join types, utilizing user data statistics to guide its choices. PostgreSQL provides the widest range of index types of any commonly available database server, fully supporting all data types.

PostgreSQL provides MVCC, which enables readers and writers to avoid blocking each other.

Taken together, the performance features of PostgreSQL allow a mixed workload of transactional systems and complex search and analytical tasks. This is important because it means we don’t always need to unload our data from production systems and reload it into analytical data stores just to execute a few ad hoc queries. PostgreSQL’s capabilities make it the database of choice for new systems, as well as the correct long term choice in almost every case.

Scalability

PostgreSQL 16 scales well on a single node, with multiple CPU sockets. PostgreSQL efficiently runs up to hundreds of active sessions and thousands of connected sessions when using a session pool. Further scalability is achieved in each annual release.

PostgreSQL provides multi-node read scalability using the Hot Standby feature. Transparent multi-node write scalability is under active development. The starting point for this is EDB Postgres Distributed (formerly Bi-directional replication, which will be discussed in Chapter 12, Replication and Upgrades), as it allows transparent and efficient synchronization of reference data across multiple servers. Other forms of write scalability have existed for more than a decade, starting from the PL/Proxy language, Greenplum and Citus.

SQL and NoSQL data models

PostgreSQL follows the SQL standard very closely. SQL itself does not force any particular type of model to be used, so PostgreSQL can easily be used for many types of models at the same time, in the same database.

PostgreSQL can be used as a relational database, in which case we can utilize any level of denormalization, from the full Third Normal Form (3NF) to the more normalized star schema models. PostgreSQL extends the relational model to provide arrays, row types, and range types.

A document-centric database is also possible using PostgreSQL’s text, XML, and binary JSON (JSONB) data types, supported by indexes optimized for documents and by full-text search capabilities.

Key/value stores are supported using the hstore extension.

Popularity

When MySQL was taken over by a commercial database vendor some years back, it was agreed in the EU monopoly investigation that followed that PostgreSQL was a viable competitor. That’s certainly been true, with the PostgreSQL user base expanding consistently for more than a decade.

Various polls have indicated that PostgreSQL is the favorite database for building new, enterprise-class applications. The PostgreSQL feature set attracts serious users who have serious applications. Financial services companies may be PostgreSQL’s largest user group, although governments, telecommunication companies, and many other segments are strong users as well. This popularity extends across the world; Japan, Ecuador, Argentina, and Russia have very large user groups, as do the US, Europe, and Australasia.

Amazon Web Services’ chief technology officer, Dr. Werner Vogels, described PostgreSQL as “an amazing database,” going on to say that “PostgreSQL has become the preferred open source relational database for many enterprise developers and start-ups, powering leading geospatial and mobile applications.” More recently, AWS has revealed that PostgreSQL is their fastest-growing service.

Commercial support

Many people have commented that strong commercial support is what enterprises need before they can invest in open source technology. Strong support is available worldwide from a number of companies.

The authors of this book work for EnterpriseDB (EDB), the largest company providing commercial support for open source PostgreSQL, offering 24/7 support in English with bug-fix resolution times.

Many other companies provide strong and knowledgeable support to specific geographic regions, vertical markets, and specialized technology stacks.

PostgreSQL is also available as a hosted or cloud solution from a variety of companies, since it runs very well in cloud environments.

A full list of companies is kept up to date at the following URL: http://www.postgresql.org/support/professional_support/.

Research and development funding

PostgreSQL was originally developed as a research project at the University of California, Berkeley in the late 1980s and early 1990s. Further work was carried out by volunteers until the late 1990s. Then, the first professional developer became involved. Over time, more and more companies and research groups became involved, supporting many professional contributors. Further funding for research and development was provided by the National Science Foundation.

The project also received funding from the EU FP7 Programme in the form of the 4CaaST project for cloud computing and the AXLE project for scalable data analytics. AXLE deserves a special mention because it was a three-year project aimed at enhancing PostgreSQL’s business intelligence capabilities, specifically for very large databases. The project covered security, privacy, integration with data mining, and visualization tools and interfaces for new hardware.

Other funding for PostgreSQL development comes from users who directly sponsor features and companies that sell products and services based around PostgreSQL.

Many features are contributed regularly by larger commercial companies, such as EDB.

 

How to get PostgreSQL

PostgreSQL is 100% open source software and is freely available to use, alter, or redistribute in any way you choose. Its license is an approved open source license, very similar to the Berkeley Software Distribution (BSD) license, although only just different enough that it is now known as The PostgreSQL License (TPL). You can see the license here: https://opensource.org/licenses/PostgreSQL.

How to do it...

PostgreSQL is already being used by many different application packages, so you may find it already installed on your servers. Many Linux distributions include PostgreSQL as part of the basic installation or include it with the installation disk.

One thing to be wary of is that the included version of PostgreSQL may not be the latest release. It will typically be the latest major release that was available when that operating system release was published. There is usually no good reason to stick to that version – there is no increased stability implied there—and later production versions are just as well supported by the various Linux distributions as the earlier versions.

If you don’t have a copy yet or the latest version, you can download the source code or binary packages for a variety of operating systems from http://www.postgresql.org/download/.

Installation details vary significantly from platform to platform, and there aren’t any special tricks or recipes to mention. Just follow the installation guide, and away you go! We’ve consciously avoided describing the installation processes here to make sure we don’t garble or override the information published to assist you.

EDB has provided the main macOS/Windows installer for PostgreSQL for many years, which can be accessed here: https://www.enterprisedb.com/downloads/postgres-postgresql-downloads. This gives you the option of installing both client and server software so that you can try it out on your laptop:

Figure 1.1: The PostgreSQL Setup Wizard

The installer shown in Figure 1.2 also allows you to install just the client software, allowing you to work with remote database servers, such as PostgreSQL in the cloud:

Figure 1.2: Selecting components to install

If you would like to receive email updates of the latest news, you can subscribe to the PostgreSQL announce mailing list, which contains updates from all the vendors that support PostgreSQL. You’ll get a few emails each month about new releases of core PostgreSQL, related software, conferences, and user group information. It’s worth keeping in touch with these developments.

Note

For more information about the PostgreSQL announcement mailing list, visit http://archives.postgresql.org/pgsql-announce/.

How it works...

Many people ask questions such as, How can this be free? Are you sure I don’t have to pay someone? Who gives this stuff away for nothing?

Open source applications such as PostgreSQL work on a community basis, where many contributors perform tasks that make the whole process work. For many of these people, their involvement is professional, rather than merely a hobby, and they can do this because there is generally great value for both the contributors and their employers alike.

You might not believe it. You don’t have to, because it just works!

There’s more…

Remember that PostgreSQL is more than just the core software. There is a huge range of websites that offer add-ons, extensions, and tools for PostgreSQL. You’ll also find an army of bloggers who provide useful tricks and discoveries that will help you in your work.

Besides these, a range of professional companies can offer you help when you need it.

 

Connecting to the PostgreSQL server

How do we access PostgreSQL?

Connecting to the database is the first experience of PostgreSQL for most people, so we want to make it a good one. Let’s do it now and fix any problems we have along the way. Remember that a connection needs to be made secure, so there may be some hoops for us to jump through to ensure that the data we wish to access is secure.

Before we can execute commands against the database, we need to connect to the database server to give us a session.

Sessions are designed to be long-lived, so you connect once, perform many requests, and eventually disconnect. There is a small overhead during the connection. It may become noticeable if you connect and disconnect repeatedly, so you may wish to investigate the use of connection pools. Connection pools allow pre-connected sessions to be quickly served to you when you wish to reconnect. We will discuss them in Chapter 4, Server Control.

Getting ready

First, cache your database. If you don’t know where it is, you’ll probably have difficulty accessing it. There may be more than one database, and you’ll need to know the right one to access and have the authority to connect to it.

You need to specify the following parameters to connect to PostgreSQL:

  • A host or host address
  • A port
  • A database name
  • A user
  • A password (or other means of authentication; but only if requested)

To connect, there must be a PostgreSQL server running on that host and listening to the port with that number. On that server, a database and a user with the specified names must also exist. Furthermore, the host must explicitly allow connections from your client (as explained in the Enabling access for network/remote users recipe), and you must also pass the authentication step using the method the server specifies – for example, specifying a password won’t work if the server has requested a different form of authentication. Note that you might not need to provide a password at all if PostgreSQL can recognize that your user is already authenticated by the OS; this is called peer authentication. After showing an example in this recipe, we will discuss it fully in the next recipe: Enabling access for network/remote users (despite not being a network/remote connection method).

Almost all PostgreSQL interfaces use the libpq interface library. When using libpq, most of the connection parameter handling is identical, so we can discuss that just once.

If you don’t specify the preceding parameters, PostgreSQL looks for values set through environment variables, which are as follows:

  • PGHOST or PGHOSTADDR
  • PGPORT (set this to 5432 if it is not set already)
  • PGDATABASE
  • PGUSER
  • PGPASSWORD (this is definitely not recommended by us, nor by the PostgreSQL documentation, even if it still exists)

If you somehow specify the first four parameters but not the password, PostgreSQL looks for a password file, as discussed in the Avoiding hardcoding your password recipe.

Some PostgreSQL interfaces use the client-server protocol directly, so the ways in which the defaults are handled may differ. The information we need to supply won’t vary significantly, so check the exact syntax for that interface.

Connection details can also be specified using a connection string, as in this example:

psql "user=myuser host=myhost port=5432 dbname=mydb password=mypasswd"

or alternatively using a Uniform Resource Identifier (URI) format, as follows:

psql postgresql://myuser:mypasswd@myhost:5432/mydb

Both examples specify that we will connect the psql client application to the PostgreSQL server at the myhost host, on port 5432, with the database name mydb, user myuser and password mypasswd.

Note

If you do not specify mypasswd in the preceding URI, you may be prompted to enter the password.

How to do it...

In this example, Afroditi is a database administrator who needs to connect to PostgreSQL to perform some maintenance activities. She can SSH to the database server using her own username afroditi, and DBAs are given sudo privileges to become the postgres user, so she can simply launch psql as the postgres user:

afroditi@dbserver1:~$ sudo -u postgres psql
psql (16.0 (Debian 16.0-1.pgdg120+1))
Type "help" for help.
postgres=#

Note that psql was launched as the postgres user, so it used the postgres user for the database connection, and that psql on Linux attempts a Unix socket connection by default. Hence, this matches peer authentication.

How it works…

PostgreSQL is a client-server database. The system it runs on is known as the host. We can access the PostgreSQL server remotely, through the network. However, we must specify host, which is a hostname, or hostaddr, which is an IP address. We can specify a host as localhost if we wish to make a TCP/IP connection to the same system. Rather than using TCP/IP to localhost, it is usually better to use a Unix socket connection, which is attempted if the host begins with a slash (/) and the name is presumed to be a directory name (the default is /tmp).

On any system, there can be more than one database server. Each database server listens to exactly one well-known network port, which cannot be shared between servers on the same system. The default port number for PostgreSQL is 5432, which has been registered with the Internet Assigned Numbers Authority (IANA) and is uniquely assigned to PostgreSQL (you can see it used in the /etc/services file on most *nix servers). The port number can be used to uniquely identify a specific database server, if any exist. IANA (http://www.iana.org) is the organization that coordinates the allocation of available numbers for various internet protocols.

A database server is also sometimes known as a database cluster because the PostgreSQL server allows you to define one or more databases on each server. Each connection request must identify exactly one database, identified by its dbname. When you connect, you will only be able to see the database objects created within that database.

A database user is used to identify the connection. By default, there is no limit on the number of connections for a particular user. In the Enabling access for network/remote users recipe, we will cover how to restrict that. In more recent versions of PostgreSQL, users are referred to as login roles, although many clues remind us of the earlier nomenclature, and that still makes sense in many ways. A login role is a role that has been assigned the CONNECT privilege.

Each connection will typically be authenticated in some way. This is defined at the server level: client authentication will not be optional at connection time if the administrator has configured the server to require it.

Once you’ve connected, each connection can have one active transaction at a time and one fully active statement at any time.

The server will have a defined limit on the number of connections it can serve, so a connection request can be refused if the server is oversubscribed.

There’s more…

If you are already connected to a database server with psql and you want to confirm that you’ve connected to the right place and in the right way, you can execute some, or all, of the following commands. Here is the command that shows the current_database:

SELECT current_database();

The following command shows the current_user ID:

SELECT current_user;

The next command shows the IP address and port of the current connection, unless you are using Unix sockets, in which case both values are NULL:

SELECT inet_server_addr(), inet_server_port();

A user’s password is not accessible using general SQL, for obvious reasons.

You may also need the following:

SELECT version();

This is just one of several ways to check the database software version; please refer to the What version is the server? recipe in Chapter 2, Exploring the Database. You can also use the new psql meta-command, \conninfo. This displays most of the preceding information in a single line:

postgres=# \conninfo
You are connected to database postgres, as user postgres, via socket in /var/run/postgresql, at port 5432.

See also

There are many other snippets of information required to understand connections. Some of them are mentioned in this chapter, and others are discussed in Chapter 6, Security. For further details, refer to the PostgreSQL server documentation, which we provided a link to earlier in this chapter.

 

Enabling access for network/remote users

PostgreSQL comes in a variety of distributions. In many of these, you will note that remote access is initially disabled as a security measure. You can do this quickly, as described here, but you really should read the chapter on security soon.

How to do it…

By default, on Linux PostgreSQL gives access to clients who connect using Unix sockets, provided that the database user is the same as the system’s username, as in the example from the previous recipe.

A socket is effectively a filesystem path that processes running on the same host can use for two-way communication. The PostgreSQL server process can see the OS username under which the client is running, and authenticate the client based on that. This is great, but unfortunately only applies to the special case when the client and the server are running on the same host. For all the remaining cases, we need to show you how to enable all the other connection methods.

Note

In this recipe, we mention configuration files, which can be located as shown in the Finding the configuration settings for your session recipe in Chapter 3, Server Configuration.

The steps are as follows:

  1. Add or edit this line in your postgresql.conf file:
    listen_addresses = '*'
    
  2. Add the following line as the first line of pg_hba.conf to allow access to all databases for all users with an encrypted password:
    # TYPE   DATABASE   USER   CIDR-ADDRESS   METHOD
    host      all       all     0.0.0.0/0    scram-sha-256
    
  3. After changing listen_addresses, we restart the PostgreSQL server, as explained in the Updating the parameter file recipe in Chapter 3, Server Configuration.

    Note

    This recipe assumes that postgresql.conf does not include any other configuration files, which is the case in a default installation. If changing listen_addresses in postgresql.conf does not seem to work, perhaps that setting is overridden by another configuration file. Check out the recipe we just mentioned for more details.

How it works…

The listen_addresses parameter specifies which IP addresses to listen to. This allows you to flexibly enable and disable listening on interfaces of multiple Network Interface Cards (NICs) or virtual networks on the same system. In most cases, we want to accept connections on all NICs, so we use *, meaning all IP addresses. But the user could also specify the IP address of a given interface. For instance, you might decide to be listening only on connections coming through a specific VPN.

The pg_hba.conf file contains a set of host-based authentication rules. Each rule is considered in sequence until one rule matches the incoming connection and is applied for authentication, or the attempt is specifically rejected with a reject method, which is also implemented as a rule.

The rule that we added to the pg_hba.conf file means that a remote connection that specifies any user or database on any IP address will be asked to authenticate using a SCRAM-SHA-256-encrypted password. The following are the parameters required for SCRAM-SHA-256-encrypted passwords:

  • Type: For this, host means a remote connection.
  • Database: For this, all means for all databases. Other names match exactly, except when prefixed with a plus (+) symbol, in which case we mean a group role rather than a single user. You can also specify a comma-separated list of users or use the @ symbol to include a file with a list of users. You can even specify sameuser so that the rule matches when you specify the same name for the user and database.
  • User: For this, all means for all users. Other names match exactly, except when prefixed with a plus (+) symbol, in which case we mean a group role rather than a single user. You can also specify a comma-separated list of users, or use the @ symbol to include a file with a list of users.
  • CIDR-ADDRESS: This consists of two parts: an IP address and a subnet mask. The subnet mask is specified as the number of leading bits of the IP address that make up the mask. Thus, /0 means 0 bits of the IP address so that all IP addresses will be matched. For example, 192.168.0.0/24 would mean matching the first 24 bits, so any IP address of the 192.168.0.x form would match. You can also use samenet or samehost.
  • Method: For this, scram-sha-256 means that PostgreSQL will ask the client to provide a password encrypted with SCRAM-SHA-256. A common choice is peer, which is enabled by default and described in the There’s more… section of this recipe. Another common (and discouraged!) setting is trust, which effectively means no authentication. Other authentication methods include GSSAPI, SSPI, LDAP, RADIUS, and PAM. PostgreSQL connections can also be made using SSL, in which case client SSL certificates provide authentication. See the Using SSL certificates to authenticate the client recipe in Chapter 6, Security, for more details.

Don’t use the password authentication method in pg_hba.conf as this sends the password in plain text (it has been deprecated for years). This is not a real security issue if your connection is encrypted with SSL, but there are normally no downsides with SCRAM-SHA-256 anyway, and you have extra security for non-SSL connections.

There’s more…

We have mentioned peer authentication as a method that allows password-less connections via Unix sockets. It is enabled by default, but only on systems that have Unix sockets, meaning that it does not exist on Windows: this is why the Windows installer asks you to insert the administrator password during installation.

When using a Unix socket connection, the client is another process running on the same host; therefore, Postgres can reliably get the OS username under which the client is running. The logic of peer authentication is to allow a connection attempt if the client’s OS username is identical to the database username being used for the connection. Hence, if there is a database user with exactly the same name as an OS user, then that user can benefit from password-less authentication.

It is a safe technique, which is why it is enabled by default. In the special case of the postgres user, you can connect as a database superuser in a password-less way. This is not a security breach because PostgreSQL actually runs as the postgres OS user, so if you log in to the server with that user, then you are allowed to access all the data files.

When installing PostgreSQL on your Linux laptop, an easy way to enable your own access is to create a database user identical to your OS username, and also a database with the same name. This way, you can use psql with your normal OS user for password-less connections. Of course, to create the user and the database you need first to connect as the predefined postgres superuser, which you can do by running psql as the postgres OS user.

In earlier versions of PostgreSQL, access through the network was enabled by adding the -i command-line switch when you started the server. This is still a valid option, but now it is just equivalent to specifying the following:

listen_addresses = '*'

So, if you’re reading some notes about how to set things up and this is mentioned, be warned that those notes are probably long out of date. They are not necessarily wrong, but it’s worth looking further to see whether anything else has changed.

See also

Look at the installer and/or OS-specific documentation to find the standard location of the files.

 

Using the pgAdmin 4 GUI tool

Graphical administration tools are often requested by system administrators. PostgreSQL has a range of tool options. In this book, we’ll cover pgAdmin 4.

pgAdmin 4 is a client application that sends and receives SQL to and from PostgreSQL, displaying the results for you. The admin client can access many database servers, allowing you to manage a fleet of servers. The tool works in both standalone app mode and within web browsers.

How to do it…

pgAdmin 4 is usually named just pgAdmin. The 4 at the end has a long history but isn’t that important. It is more of an “epoch” than a release level; pgAdmin 4 replaces the earlier pgAdmin 3. Instructions to download and install it can be found at https://www.pgadmin.org/.

When you start pgAdmin, you will be prompted to register a new server.

Give your server a name on the General tab, and then click the Connection tab, as shown in the screenshot, and fill in the five basic connection parameters, as well as the other information. You should uncheck the Save password? option:

Figure 1.3: The server connection properties

If you have many database servers, you can group them together. I suggest keeping any replicated servers together in the same server group. Give each server a sensible name.

Once you’ve added a server, pgAdmin will connect to it and display information about it, using the information that you have added.

The default screen is Dashboard, which presents a few interesting graphs based on the data it polls from the server. That’s not very useful, so click on the Statistics tab.

You will then get access to the main browser screen, with the object tree view on the left and statistics on the right, as shown in the following screenshot:

Figure 1.4: The pgAdmin tree view with the Statistics tab

pgAdmin easily displays much of the data that is available from PostgreSQL. The information is context-sensitive, allowing you to navigate and see everything quickly and easily. Except for the dashboard, the information is not dynamically updated; this will occur only when you navigate the application, where every click will refresh the data, so bear this in mind when using the application.

pgAdmin also provides Grant Wizard. This is useful for DBAs for review and immediate maintenance. In the example shown in the screenshot, the user first selected the Sequences on the navigation tree and then selected Tools Grant Wizard. This will open a pop-up window to select the objects on which privileges will be granted to a selected role:

Figure 1.5: Grant Wizard: selecting sequences to grant privileges to a role

The pgAdmin query tool allows you to have multiple active sessions. The query tool has a good-looking visual Explain feature, which displays the EXPLAIN plan for your query. To do this, go to Tools Query Tool. Write your query in the Query text box, and then click on the E (Explain) button, as shown in the following screenshot. The graphical execution tree is shown below the query:

Figure 1.6: The visual Explain feature

How it works…

pgAdmin provides a wide range of features, many of which are provided by other tools as well. This gives us the opportunity to choose which of those tools we want. For many reasons, it is best to use the right tool for the right job, and that is always a matter of expertise, experience, and personal taste.

pgAdmin submits SQL to the PostgreSQL server and displays the results quickly and easily. As a database browser, it is fantastic. For performing small DBA tasks, it is ideal. As you might’ve guessed from these comments, I don’t recommend GUI tools for every task.

Scripting is an important technique for DBAs. You keep an exact copy of the task executed, so you document all the actions in a way that is automatically repeatable, and you can edit and resubmit if problems occur. It’s also easy to put all the tasks in a script into a single transaction, which isn’t possible using the current GUI tools. For scripting, I strongly recommend the psql utility, which has many additional features that you’ll increasingly appreciate over time.

Although I recommend psql as a scripting tool, many people find it convenient as a query tool. Some people may find this strange and assume that it is a choice for experts only. Two great features of psql as an interactive query tool are the online help for SQL and the tab completion feature, which allows you to build up SQL quickly without having to remember the syntax. This is why the Using the psql query and scripting tool recipe (which we recommend particularly for more information) is named that way.

pgAdmin provides the PSQL Tool in the Tools menu, which allows you to run psql alongside pgAdmin. This is a great innovation and allows you to get the power of a GUI alongside the power of psql.

pgAdmin also provides pgAgent, a job scheduler, which we will discuss in Chapter 7, Database Administration.

A quick warning! When you create an object in pgAdmin, the object will be created with a mixed-case name if you use capitals or spaces anywhere in the object name. If I ask for a table named MyTable, the only way to access that table is by referring to it in double quotes as "MyTable". See the Handling objects with quoted names recipe in Chapter 5, Tables and Data:

Figure 1.7: Table options

See also

You may also be interested in commercial tools of various kinds for PostgreSQL. A full listing is given in the PostgreSQL software catalog at http://www.postgresql.org/download/products/1.

 

Using the psql query and scripting tool

psql is the query tool supplied as a part of the core distribution of PostgreSQL, so it is available in all environments and works similarly in all of them. This makes it an ideal choice for developing portable applications and techniques.

psql provides features for use as both an interactive query tool and as a scripting tool.

Getting ready

From here on, we will assume that the psql command alone is enough to allow you access to the PostgreSQL server. This assumes that all your connection parameters are defaults, or that you have set environment variables appropriately, as previously explained in the Enabling access for remote/network users recipe.

Written in full, the connection parameters will be either of these options:

psql -h myhost -p 5432 -d mydb -U myuser
psql postgresql://myuser@myhost:5432/mydb

The default value for the port (-p) is 5432. By default, mydb and myuser are both identical to the operating system’s username. The default myhost on Windows is localhost, while on Unix, we use the default directory for Unix socket connections. The location of such directories varies across distributions and is set at compile time. However, note that you don’t actually need to know its value because, on local connections, both the server and the client are normally compiled together, so they use the same default.

How to do it…

The command that executes a single SQL command and prints the output is the easiest, as shown here:

$ psql -c "SELECT current_time"
     timetz
-----------------
18:48:32.484+01
(1 row)

The -c command is non-interactive. If we want to execute multiple commands, we can write those commands in a text file and then execute them using the -f option. This command loads a very small and simple set of examples:

$ psql -f examples.sql

The contents of the examples.sql file are as follows:

SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SET check_function_bodies = false;
SET xmloption = content;
SET client_min_messages = warning;
SET row_security = off;
DROP SCHEMA IF EXISTS myschema CASCADE;
CREATE SCHEMA myschema;
SET default_tablespace = '';
SET default_table_access_method = heap;
SET search_path = myschema;
CREATE TABLE mytable (
    id integer PRIMARY KEY,
    col1 text
);
CREATE TABLE mytable2 (
    id integer,
    fid integer REFERENCES mytable(id),
    col2 timestamp with time zone DEFAULT clock_timestamp(),
    PRIMARY KEY (id, fid)
);
COPY mytable (id, col1) FROM stdin;
1	Ananas
2	Banana
3	Cucumber
4	Dasheen
5	Endive
\.
COPY mytable2 (id, fid, col2) FROM stdin;
1001	1	2023-11-15 18:49:14.84806+01
1001	2	2023-11-15 18:49:14.848334+01
1002	5	2023-11-15 18:49:14.848344+01
\.

The above command produces the following output when successful, which is a list of command tags that show the command that was executed, and how many rows were affected:

SET
SET
SET
SET
SET
SET
DROP SCHEMA
CREATE SCHEMA
SET
SET
SET
CREATE TABLE
CREATE TABLE
COPY 5
COPY 3

The examples.sql script is very similar to a dump file produced by PostgreSQL backup tools, so this type of file and the output it produces are very common; in fact, we produced it by creating a dump file and then removing some parts that were not needed by this example.

When a command is executed successfully, PostgreSQL outputs a command tag equal to the name of that command; this is how the preceding output was produced.

The psql tool can also be used with both the -c and -f modes together; each one can be used multiple times. In this case, it will execute all the commands consecutively:

$ psql -c "SELECT current_time" –f examples.sql -c "SELECT current_time"
     timetz
-----------------
18:52:15.287+01
(1 row)
   ...output removed for clarity...
     timetz
-----------------
18:58:23.554+01
(1 row)

The psql tool can also be used in interactive mode, which is the default, so it requires no option:

$ psql
postgres=#

The first interactive command you’ll need is the following:

postgres=# help

You can then enter SQL or other commands. The following is the last interactive command you’ll need:

postgres=# \quit

Unfortunately, you cannot type quit on its own, nor can you type \exit or other options. Sorry – it’s just \quit, or \q for short!

How it works…

In psql, you can enter the following two types of command:

  • psql meta-commands
  • SQL

A meta-command is a command for the psql client, which may (or may not) send SQL to the database server, depending on what it actually does, whereas an SQL command is always sent to the database server. An example of a meta-command is \q, which tells the client to disconnect. All lines that begin with \ (a backslash) as the first non-blank character are presumed to be meta-commands of some kind.

If it isn’t a meta-command, it’s SQL, in which case psql keeps reading SQL until we find a semicolon, so we can spread SQL across many lines and format it any way we find convenient.

The help command is the only exception. We provide this for people who are completely lost, which is a good thought; so let’s start from there ourselves.

There are two types of help commands, which are as follows:

  • \?: This provides help on psql meta-commands.
  • \h: This provides help on specific SQL commands.

Consider the following snippet as an example:

postgres=# \h DELETE
Command: DELETE
Description: delete rows of a table
Syntax:
[ WITH [ RECURSIVE ] with_query [, ...] ]
DELETE FROM [ ONLY ] table [ [ AS ] alias ]
    [ USING usinglist ]
    [ WHERE condition | WHERE CURRENT OF cursor_name ]
    [ RETURNING * | output_expression [ AS output_name ] [,]]

I find this a great way to discover and remember options and syntax. You’ll also appreciate having the ability to scroll back through the previous command history if your terminal allows it.

You’ll get a lot of benefits from tab completion, which will fill in the next part of the syntax when you press the Tab key. This also works for object names, so you can type in just the first few letters and then press Tab; all the options will be displayed. Thus, you can type in just enough letters to make the object name unique and then hit Tab to get the rest of the name.

Like most programming languages, SQL also supports comments. One-line comments begin with two dashes, as follows:

-- This is a single-line comment

Multiline comments are similar to those in C and Java:

/*
Multiline comment
line 2
line 3
*/

You’ll probably agree that psql looks a little daunting at first, with strange backslash commands. I do hope you’ll take a few moments to understand the interface and keep digging for more information. The psql tool is one of the most surprising parts of PostgreSQL, and it is incredibly useful for database administration tasks when used alongside other tools.

There’s more…

psql works across releases and works well with older versions. It may not work at all with newer server versions, so use the latest client level of the server you are accessing.

See also

Check out some other useful features of psql, which are as follows:

  • Informational metacommands, such as \d, \dn, and more
  • Formatting, for output, such as \x
  • Execution timing using the \timing command
  • Input/output and editing commands, such as \copy, \i, and \o
  • Automatic startup files, such as .psqlrc
  • Substitutable parameters (variables), such as \set and \unset
  • Access to the OS command line using \!
  • Crosstab views with \crosstabview
  • Conditional execution, such as \if, \elif, \else, and \endif
About the Authors
  • Gianni Ciolli

    Gianni Ciolli is Vice President and Field CTO at EDB; he was Global Head of Professional Services at 2ndQuadrant until it was acquired by EDB. Gianni has been a PostgreSQL consultant, trainer, and speaker at many PostgreSQL conferences in Europe and abroad over more than 10 years. He has a PhD in Mathematics from the University of Florence. He has worked with free and Open-Source software since the 1990s and is active in the community. He lives between Frankfurt and London and plays the piano in his spare time

    Browse publications by this author
  • Boriss Mejías

    Boriss Mejias is a Senior Solutions Architect at EDB, building on his experience as PostgreSQL consultant and trainer at 2ndQuadrant. He has been working with open source software since the beginning of the century contributing to several projects both with code and community work. He has a PhD in Computer Science from the Université catholique de Louvain, and an Engineering degree from Universidad de Chile.Complementary to his role as Solutions Architect, he gives PostgreSQL training and is a regular speaker at PostgreSQL conferences. He loves spending time with his family and playing air guitar

    Browse publications by this author
  • Jimmy Angelakos

    Jimmy Angelakos is a Systems and Database Architect and recognized PostgreSQL expert. He studied Computer Science at the University of Aberdeen, has worked with Open-Source tools for 25+ years, is an active member of PostgreSQL Europe and occasional contributor to the PostgreSQL project. He also speaks frequently at database and Free & Open-Source Software conferences

    Browse publications by this author
  • Vibhor Kumar

    Vibhor Kumar, Global VP at EDB, is a pioneering data tech leader. He manages a global team of engineers, optimizing clients' Postgres databases for peak performance and scalability. He advises Fortune 500 clients, including many Financial Institutes, in innovating and transforming their data platforms. His past experience spans IBM, BMC Software, and CMC Ltd. He holds a BSc in Computer Science from the University of Lucknow and a Master's from the Army Institute of Management. As a certified expert in numerous technologies, he often shares his insights on DevOps, cloud, and database optimization through blogging and speaking at events

    Browse publications by this author
  • Simon Riggs

    Simon Riggs is a Major Developer of PostgreSQL since 2004. Formerly, Simon was the Founder and CEO of 2ndQuadrant, acquired by EDB in 2020. Simon has contributed widely to PostgreSQL, initiating new projects, contributing ideas, committing many important features as well and working directly with database architects and users on advanced solutions

    Browse publications by this author
Latest Reviews (1 reviews total)
PostgreSQL 16 Administration Cookbook
Unlock this book and the full library FREE for 7 days
Start now