In this chapter, we will cover the installation process of PostgreSQL. You will learn how to install binary packages and see how PostgreSQL can be compiled from source. Compiling from source is especially important if you happen to be using platforms on which no binaries are available (for example, AIX, HPUX, and so on).
We will cover the following topics in this chapter:
Preparing your setup
Installing binary packages
Compiling PostgreSQL from source
Understanding the existing databases
You will see how easy it is to make PostgreSQL work in a couple of minutes.
Before we dig into the actual installation process and see how things can be put to work, it makes sense to talk a little bit about the PostgreSQL version numbers. Understanding the PostgreSQL versioning policy will give you valuable insights, especially with respect to your upgrade policy, downtime management, and so on.
Minor releases: 9.4.2, 9.4.1, 9.4.1
Major releases: 9.4.0, 9.3.0, 9.2.0
N.0.0 releases (super major): 9.0.0, 8.0.0, 7.0.0
The distinction between the preceding three types of releases is pretty important. Why is that? Well, if you happen to upgrade to a new minor release (say, from 9.4.1 to 9.4.3), all you have to do is stop the database and start the new binaries. There is no need to touch the data. In short, the amount of downtime needed is basically close to zero.
You can safely update to a more recent minor release to improve reliability. The risk involved is negligible.
In case of a major version change, you definitely have to plan things a little better because updates are a bit more complicated (
pg_dump / pg_restore or
When I am training people, they ask me on a regular basis which version of PostgreSQL they should use. The answer to this question is simple; if you have the ability to decide freely, it is absolutely safe to use the latest stable release of PostgreSQL, even if it is a "zero" release (which is equal to 9.4.0, 9.3.0, and so on).
After this little introduction to PostgreSQL versioning, we can move forward and see how binary packages can be installed. Nowadays, most people use binary packages that are shipped with their preferred Linux distribution. These packages are tested, easy to use, and readily available.
In this chapter, we will show you how to install PostgreSQL on Debian or Ubuntu and on Red-Hat-based systems.
Let's focus on installing PostgreSQL on Debian or Ubuntu first. The key point here is that it is recommended to add the PostgreSQL repositories to Ubuntu. The reason is that many Linux distributions, including Ubuntu, ship very old and outdated versions of PostgreSQL in their standard setup. If you don't want to miss a couple of years of PostgreSQL development, adding the current repositories will be highly beneficial to you. The process of adding the repositories is as follows:
Create a file called
/etc/apt/sources.list.d/pgdg.list, and add a line for the PostgreSQL repository (the following steps can be done as a root user or by using
/etc/apt/sources.listis a place to put the line:
deb http://apt.postgresql.org/pub/repos/apt/ YOUR_DEBIAN_VERSION_HERE-pgdg main
deb http://apt.postgresql.org/pub/repos/apt/ wheezy-pgdg main
$# wget --quiet -O - \ https://www.postgresql.org/media/keys/ACCC4CF8.asc | \ apt-key add - OK
In our case, we will install PostgreSQL 9.4. Of course, you can also decide to use 9.3 or any other recent version you desire:
apt-get install "postgresql-9.4"
All relevant packages will be downloaded automatically, and the system will instantly fire up PostgreSQL.
Once all these steps have been performed, you are ready for action. You can try to connect to the database:
root@chantal:~# su - postgres $ psql postgres psql (9.4.1) Type "help" for help. postgres=#
The installation process on Red Hat-based distributions works in a pretty similar way. Many distributions use RPM packages. The following URL shows the distributions for which we are currently ready to use RPMs: http://yum.postgresql.org/repopackages.php.
The first thing to do is to install an RPM package containing all the repository information. Once this is done, we can easily fetch PostgreSQL RPMs from the repository and fire things up in almost no time.
yum install http://yum.postgresql.org/9.4/fedora/fedora-20-x86_64/pgdg-fedora94-9.4-1.noarch.rpm
Once the repository has been added, we can install PostgreSQL by using the following commands:
yum install postgresql94-server postgresql94-contrib /usr/pgsql-9.4/bin/postgresql94-setup initdb systemctl enable postgresql-9.4.service systemctl start postgresql-9.4.service
The first command (
yum install) will fetch the packages from the repository and install them on your server. Once this is done, we can prepare a database instance and initialize it.
Finally, we enable the service and start it up. Our database server is now ready for action.
SLAs: You might have to provide an old version, which is not available as package anymore, to fulfill some SLA agreements.
No packages available: On your favorite flavor of Linux, there is most likely a package containing PostgreSQL available always. However, what about AIX, Solaris, HPUX, and others?
Custom patches: Some people write custom patches to enhance PostgreSQL.
Split directories: You might want to split the binary and library directories and make sure that PostgreSQL does not integrate tightly into the existing OS.
Before we get started, we have to download the tarball from http://ftp.postgresql.org/pub/source/. There, you will find one directory per version of PostgreSQL. In our case, we have downloaded PostgreSQL 9.4.1, and we will use it throughout this chapter.
The first thing we have to do is to extract the tar archive:
tar xvfz postgresql-9.4.1.tar.gz
This will create a directory containing the PostgreSQL source code. Once we have entered this directory, we can call
configure, which will then check your system to see if all libraries you need are present. It generates vital parts of the build infrastructure.
Here is how it works:
In our example, we used the most simplistic of all configurations. We want to install the binaries to a directory called
/usr/local/pg941. Note that this is not where the data will end up; it is where the executables will reside. If you don't define
–prefix, the default installation path will be
Of course, there is a lot more. Try running the following command:
If you run the preceding command, you will see that there are some more features that can be turned on (for example,
--with-python) in case you are planning to write stored procedures in Perl or Python.
In some cases, you might find that our operating system lacks libraries needed to compile PostgreSQL properly. Some of the most common candidates are
zlib-dev (of course there are some more). These two libraries are needed to enable the command-line history as well as to give support for compression. We highly recommend providing both libraries to PostgreSQL.
Keep in mind that the two previously defined libraries have slightly different names on different Linux distributions because every Linux distribution uses slightly different naming conventions.
If you are compiling on a more exotic Unix operating system such as Solaris, AIX, and so on, we recommend you to check out the documentation regarding the platform specifications.
make make install
You just have to call
make install (as root) and wait for a few seconds. In this case, we simply use one CPU core to build PostgreSQL. If you want to scale out the build process to many CPU cores, you can use
–j, shown as follows:
make -j 8
-j 8 command will tell
make to do up to
8 things in parallel, if possible. Adding parallelism to the build process will definitely speed up the process. It is not uncommon to build PostgreSQL in 30 seconds or less if there are enough CPU cores on board.
It is highly recommended to install the PostgreSQL
contrib packages as well. Contrib is a set of additional modules that can be used for different purposes such as creating database links from PostgreSQL, to PostgreSQL, or for adding an additional indexing functionality.
If you are installing PostgreSQL from binary packages, you can simply install one more package (for example,
postgresql-9.3-contrib). If you happen to install from source, you have to perform the following steps:
cd contrib make make install
Of course, you can also use the
-j flag again to scale out to more than just one CPU. The
make install command will need root permissions again (for example, via
Creating and configuring a user to run PostgreSQL
Creating a database instance
If you have installed PostgreSQL from binary packages, the system will automatically create a user for PostgreSQL. If you happen to compile it yourself, you have to create the operating system user yourself too.
Depending on the operating system you are using, this works in a slightly different way. On Ubuntu, for instance, you can call
adduser on Red Hat and
useradd on CentOS. I really recommend looking up the procedure to create a user in your operating system manual.
In general, it's best practice to create a user named
postgres; however, a nonroot user will also do. I just recommend sticking to the standard to make life easier on the administration front.
Once the user has been created, it is, in general, a good idea to prepare your infrastructure for PostgreSQL. This implies adjusting your
$PATH environment variable. On most Linux systems, this can be done in your
.bashrc file. Having your favorite PostgreSQL tools in your path will make life simple and a lot easier.
Finally, we can add the
init scripts to the system. In
postgresql-9.4.1/contrib/start-scripts, you will find
init scripts for Linux, Mac OS X, and FreeBSD. These scripts are a good framework to make your
init process work as expected.
Once we compile PostgreSQL and prepare ourselves to launch PostgreSQL, we can create a so-called PostgreSQL database instance. What is a database instance? Well, whenever you start PostgreSQL, you are actually firing up a database instance. So, the instance is really a central thing; it is that which contains all the databases, users, tablespaces, and so on.
In PostgreSQL, a database instance always resides in a
database directory. In our example, we want to create the instance under
mkdir /data chown postgres.postgres /data su - postgres initdb -D /data -E unicode
First, we created the directory and assigned it to the
postgres user. Then, we created the database instance. The important part here is that we explicitly stated (
-E unicode) that we want UTF-8 to be the default character set in our system. If we don't explicitly tell the system what to use, it will check out the locale settings and use the Unix locale as the default for the instance. This might not be the desired configuration for your setup, so it is better to explicitly define the character set.
Also, instead of using
-D here, we can set
$PGDATA to tell PostgreSQL where the desired place for the database instance is going to be. There's also an
initdb --help command that will reveal a handful of additional configuration options.
-A: This defines the default authentication method of local connections. Many people use trust, md5, or peer for this option.
--locale: This defines your desired character set and locale settings.
-k: This setting will require PostgreSQL to create data page checksums. It is highly recommended to use this setting for mission critical data. The overhead of the page checksums is virtually zero, so you will get a lot more protection for your data at virtually no cost.
Once we create our database instance, we can start our database server.
Note that on some Linux distros, it might be necessary to add a version number to the service (for example,
/etc/init.d/postgresql-9.4 start). On non-Linux systems, you have to check out your corresponding
In case you have not installed the
start scripts, you can fire up PostgreSQL manually. Assuming that our database instance resides in
/data, it works like this:
pg_ctl -D /data -l /dev/null start
In the preceding command,
pg_ctl is the tool to control PostgreSQL,
-D tells the system where to find the database instance,
-l /dev/null tells our database server to send the log information to
start will simply make the instance fire up.
Note that we use
-l here for simplicity reasons. In later chapters, you will learn how to set up proper logging using the PostgreSQL onboard infrastructure.
Installing PostgreSQL is as simple as that.
Ideally, you connect to a database called
postgres, which can be found in any database instance. Some systems don't encourage people to log in as a
postgres user. Therefore, you might want to use
sudo here as well to log in to PostgreSQL. If this works for you, you can make PostgreSQL display a list of existing databases, where
\l will do the job:
postgres=# \x Expanded display is on. postgres=# \l List of databases -[ RECORD 1 ]-----+---------------------- Name | postgres Owner | postgres Encoding | UTF8 Collate | en_US.UTF-8 Ctype | en_US.UTF-8 Access privileges | -[ RECORD 2 ]-----+---------------------- Name | template0 Owner | postgres Encoding | UTF8 Collate | en_US.UTF-8 Ctype | en_US.UTF-8 Access privileges | =c/postgres | postgres=CTc/postgres -[ RECORD 3 ]-----+---------------------- Name | template1 Owner | postgres Encoding | UTF8 Collate | en_US.UTF-8 Ctype | en_US.UTF-8 Access privileges | =c/postgres | postgres=CTc/postgres
Congratulations, you've completed your first task using the
psql shell. If you executed
\l as proposed, you might have seen that the table is too wide to be displayed properly. To avoid this, you can use
\x to transpose the output and display each column as a separate line. This little feature can come in handy whenever you have to read a wide table.
If you want to customize the
psql shell for your needs, you might want to consider writing a
.psqlrc file. It can automatically set things such as
\x for you on every login.
In an empty database instance, you will already find the three existing databases
postgres. The rule for you as an end user is simple: always connect to the
postgres database and try to avoid connections to the
template databases (
template0 does not allow connections anyway); these databases are only here to act as a role model in case you create a new database. Make sure that no useless objects are in
template1 because whenever you create an additional database, these useless objects are cloned.
postgres=# \h CREATE DATABASE Command: CREATE DATABASE Description: create a new database Syntax: CREATE DATABASE name [ [ WITH ] [ OWNER [=] user_name ] [ TEMPLATE [=] template ] [ ENCODING [=] encoding ] [ LC_COLLATE [=] lc_collate ] [ LC_CTYPE [=] lc_ctype ] [ TABLESPACE [=] tablespace_name ] [ CONNECTION LIMIT [=] connlimit ] ]
\h command is very convenient to use; it provides you with the syntax of basically every command in the system. In short,
\h makes life really easy.
In our case, we can see which options
CREATE DATABASE provides. First, we can define the name of the newly created database. The
TEMPLATE parameter can be used to physically clone an existing database (if you don't use this one,
template1 will be cloned by default). The
LC_* parameters are needed in case you want to use encodings and locales different from the default one. Finally, we can make use of a tablespace (which will be dealt with later on in this book), and PostgreSQL provides a way to limit the number of concurrent connections to the database.
In our example, we create a simple database:
postgres=# CREATE DATABASE test; CREATE DATABASE
If this succeeds, we are ready to connect to our newly created database. We can even do so without
postgres=# \c test psql (9.4.1, server 9.4.1) You are now connected to database "test" as user "postgres".
In this chapter, you learned how to install PostgreSQL binary packages and compile PostgreSQL from source. You also learned how to create database instances and create simple databases inside your instance.
The next chapter will be dedicated to some of the biggest problems in the database world. You will be guided through indexing and detecting performance bottlenecks.