In this chapter, we will cover the following recipes:
Creating and dropping tablespaces
Moving objects between tablespaces
Initializing a database cluster
Starting the server
Stopping the server
Displaying the server status
Reloading the server configuration files
PostgreSQL is an advanced database server available on a wide range of platforms, ranging from Unix-based operating systems such as Oracle Solaris, IBM AIX, and HP-UX; Windows; and Mac OS X to Red Hat Linux and other Linux-based platforms.
We start with showing how to create databases in PostgreSQL. During the course of this chapter, we will cover schemas, users, groups, and tablespaces, and show how to create these entities. We will also show how to start and stop the PostgreSQL server services.
A database is a systematic and organized collection of data which can be easily accessed, managed, and updated. It provides an efficient way of retrieving stored information. PostgreSQL is a powerful open source database. It is portable because it written in ANSI C. As a result, it is available for different platforms and is reliable. It is also ACID (short for Atomicity, Consistency, Isolation, Durability) compliant, supports transactions, is scalable as it supports multi version concurrency control (MVCC) and table partitioning, is secure as it employs host based access control and supports SSL, and provides high availability and replication by implementing features such as streaming replication and its support for point in time recovery.
Before you start creating databases, you would need to install PostgreSQL on your computer. For Red Hat or CentOS Linux environments, you can download the correct rpm for the PostgreSQL 9.3 version from yum.postgresql.org.
Alternatively, you may download the graphical PostgreSQL installer available from the EnterpriseDB website, at http://www.enterprisedb.com/products-services-training/pgdownload.
For details on how to install PostgreSQL using the graphical PostgreSQL installer from the EnterpriseDB website, you can refer to the following link for further instructions:
Once you have downloaded and installed PostgreSQL, you will need to define the data directory, which is the storage location for all of the data files for the database. You will then need to initialize the data directory. Initialization of the data directory is covered under the recipe titled Initializing a database cluster. After this, you are ready to create the database.
To connect to a database using the
psql utility, you can use the following command:
psql -h localhost -d postgres –p 5432
Here, we are basically connecting to the
postgres database, which is resident on the localhost, that is the same server on which PostgreSQL was installed, and the connection is taking place on port
In the following code, we are creating a user,
hr. Basically, this user is being created because in the next section, it is being used as the owner of the
CREATE USER hr with PASSWORD 'hr';
More details regarding creating users will be covered in the Creating users recipe.
The first method relies on using the
CREATE DATABASESQL statement:
CREATE DATABASE hrdb WITH ENCODING='UTF8' OWNER=hr CONNECTION LIMIT=25;
The second method requires using the
createdb –h localhost –p 5432 –U postgres testdb1
A database is a named collection of objects such as tables, functions, and so on. In order to create a database, the user must be either a superuser or must have the special
createdb command-line executable connects to the
postgres database when triggered, and then issues the
CREATE DATABASE command.
You can view the list of existing databases by querying the
pg_database catalog table, as shown in the following screenshot:
Schemas are among the most important objects within a database. A schema is a named collection of tables. A schema may also contain views, indexes, sequences, data types, operators, and functions. Schemas help organize database objects into logical groups, which helps make these objects more manageable.
You can use the
CREATE SCHEMA statement to create a new schema in PostgreSQL:
CREATE SCHEMA employee;
Alternatively, it is also possible to create a schema for a particular user:
CREATE SCHEMA university AUTHORIZATION bob;
Here, a schema called
university is created and is owned by
A schema is a logical entity that helps organize objects and data in the database.
By default, if you don't create any schemas, any new objects will be created in the public schema.
In order to create a schema, the user must either be a superuser or must have the
CREATE privilege for the current database.
Once a schema is created, it can be used to create new objects such as tables and views within that schema.
You may use the
\dn switch of
psql to list all of the schemas in a database as shown in the following screenshot:
To identify the schema in which you are currently working, you can use the following command:
While searching for objects in the database, you can define the search schemas preferences for where those searches should start. You can use the
search_path parameter for this, as follows:
ALTER DATABASE hrd SET search_path TO hr,hrms, public, pg_catalog;
A user is a login role that is allowed to log in to the PostgreSQL server. The login roles section is where you define accounts for individual users for the PostgreSQL system. Each database user should have an individual account to log in to the PostgreSQL system. Each user has an internal system identifier in PostgreSQL, which is known as a sysid. The user's system ID is used to associate objects in a database with their owner. Users may also have global rights assigned to them when they are created. These rights determine whether a user is allowed to create or drop databases and whether the existing user is a superuser or not.
PostgreSQL provides two methods by which database users are created:
The first method requires using the
CREATE USERSQL statement to create a new user in the database. You can create a new user with the
CREATE USERSQL statement, like this:
CREATE user agovil WITH PASSWORD '[email protected]';
Here, we created the
agoviluser and provided a password for the user to log in with.
The second method requires executing the
createuserscript from the command line.
We may also use the
createdbscript to create a user called
nchabbraon the same host (port
5432), and the
–Soption specifies that the created user will not have the superuser privileges:
$ createuser -h localhost -p 5432 -S nchabbra
CREATE USER SQL statement requires one mandatory parameter which is the name of the new user. Other parameters, which are optional, however, are passwords for the user or group, the system ID, and a set of privileges that may be explicitly allocated.
createuser script can be invoked without arguments. In that case, it will prompt us to provide the username and the set of rights and will attempt to make a local connection to PostgreSQL. It can also be invoked with options and the username to be created on the command line, and you will need to give the user access to a database explicitly if he/she is not the owner of the database.
We can use the
\du switch of
psql to display the list of existing users, inclusive of roles in the PostgreSQL server, as shown in this screenshot:
A group in the PostgreSQL server is similar to the groups that exist in Unix and Linux. A group in PostgreSQL serves to simplify the assignment of rights. It simply requires a name and may be created empty. Once it is created, users who are intended to share common access rights are added into the group together, and are thus associated by their membership within that group. Grants on the database objects are then given to the group instead of each individual group member.
Groups in the PostgreSQL server can be created by using the
CREATE GROUP SQL statement. The following command will create a group. However, no users are currently a part of this group:
hrdb=# CREATE GROUP dept;
In order to assign members/users to the group, we can use the
ALTER GROUP statement as follows:
hrdb=# ALTER GROUP dept ADD USER agovil,nchabbra;
It is also possible to create a group and assign users upon its creation, as shown in the following
CREATE GROUP statement:
hrdb=# CREATE GROUP admins WITH user agovil,nchabbra;
A group is a system-wide database object that can be assigned privileges and have users added to it as members. A group is a role that cannot be used to log in to any database.
It is also possible to grant membership in a group to another group, thereby allowing the member role use of privileges assigned to the group it is a member of.
Database groups are global across a database cluster installation.
Every major RDBMS vendor offers the ability to drop databases just as it allows you to create databases. However, one should exercise caution when dealing with situations like dropping databases. Once a database is dropped, all of the information residing in it is lost forever. It is only for a valid business purpose that we should drop databases. In normal circumstances, a database is only dropped when it gets decommissioned and is no longer required for business operations.
There are two methods to drop a database in the PostgreSQL server instance:
You can use the
DROP DATABASEstatement to drop a database from PostgreSQL, as follows:
hrdb=# DROP DATABASE hrdb;
You can use the
dropdbcommand line-utility, which is a wrapper around the
$ dropdb hrdb;
DROP DATABASE statement permanently deletes catalog entries and the data directory. Only the owner of the database can issue the
DROP DATABASE statement.
Also, it is not possible to drop a database to which you are connected. In order to delete the database, the database owner will have to make a connection to another database of which he is an owner.
In order to drop a database that has active connections to it, you will have to follow these steps:
Identify all of the active sessions on the database. To identify all of the active sessions on the database, you need to query the
pg_stat_activitycatalog table as follows:
SELECT * from pg_stat_activity where datname='testdb1';
Terminate all of the active connections to the database. To terminate all of the active connections, you will need to use the
pg_terminate_backendfunction as follows:
SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = 'testdb1';
Once all of the connections are terminated, you may proceed with dropping the database using the
PostgreSQL stores data files consisting of database objects such as tables and indices on the disk. The tablespace is defined as the location of these objects on the disk. A tablespace is used to map a logical name to a physical location on the disk.
A tablespace is a location on the disk where PostgreSQL stores data files containing database objects, for example indexes, tables, and so on.
Before you create the tablespace, the directory location must be physically created and the directory must be empty:
mkdir –p /var/lib/pgsql/data/dbs
To create a tablespace in PostgreSQL, you need to use the
CREATE TABLESPACE statement.
CREATE TABLESPACE data_tbs OWNER agovil LOCATION '/var/lib/pgsql/data/dbs';
Similarly, a tablespace in PostgreSQL can be dropped using the
DROP TABLESPACE statement, as follows:
DROP TABLESPACE data_tbs;
A tablespace allows you to control the disk layout of PostgreSQL. The owner of the tablespace, by default, would be the user who executed the
CREATE TABLESPACE statement. This statement also gives you the option of assigning the ownership of the tablespace to a new user. This option is the part of the
OWNER clause in the
CREATE TABLESPACE statement.
The name of the tablespace should not begin with a
pg_ prefix because this is reserved for the system tablespaces.
Before deleting a tablespace, ensure that it is empty, which means there should be no database objects inside it. If the user tries to delete the tablespace when it is not empty, the command will fail.
There are two options that will aid in deleting the tablespace when it is not empty:
You may drop the database
You may alter the database to move it to a different tablespace
After any of the preceding actions have been completed, then the corresponding tablespace may be dropped.
A tablespace can contain both permanent and temporary objects. You will need to define and create a secondary tablespace to serve as the target destination of objects that might get moved from the primary tablespace. Moving objects between tablespaces is a mechanism of copying bulk data in which copying happens sequentially, block by block. Moving a table to another tablespace locks it for the duration of the move.
Here, we will first create a new tablespace,
hrms, using the following command:
mkdir –p /var/lib/pgsql/data/hrms
Then we set the default tablespace for the
testdb1 database to
hrms using the following statement:
CREATE TABLESPACE HRMS OWNER agovil LOCATION '/var/lib/pgsql/data/hrms';
We will also create a table, insert some records into it, and create a corresponding index for it. This is being done because the table and its index will be used in the How to do it… section of this recipe:
CREATE TABLE EMPLOYEES(id integer PRIMARY KEY , name varchar(40)); INSERT INTO EMPLOYEES VALUES (1, 'Mike Johansson'); INSERT INTO EMPLOYEES VALUES(2, 'Rajat Arora'); CREATE INDEX emp_idx on employees(name);
Moving a complete database to a different tablespace involves three steps:
You will change the tablespace for the given database so that new objects for the associated database are created in the new tablespace:
ALTER DATABASE testdb1 SET default_tablespace='hrms';
ALTER TABLE employee SET TABLESPACE hrms;
You will also have to move any existing indexes to the new tablespace:
ALTER INDEX emp_idx SET TABLESPACE hrms;
You will have to query the
pg_tables catalog table to find out which tables from the current database need to be moved to a different tablespace.
Similarly for the indexes, you will have to query the
pg_indexes catalog table to find out which indexes need to be moved to a different tablespace.
To create the database cluster, use the
$ initdb -D /var/lib/pgsql/data
Another way of initializing the database cluster is by calling the
initdb command via the
$ pg_ctl -D /var/lib/pgsql/data initdb
A database cluster is a collection of databases that are managed by a single server instance.
initdb command is triggered, the directories in which the database data will reside are created, shared catalog tables are generated, and the
postgres databases are created, out of which the default database is
initdb command initializes the database cluster default locale and the character set encoding.
You can refer to http://www.postgresql.org/docs/9.3/static/creating-cluster.html for more information on initializing a database cluster.
Before anyone can access the database, the database server must be started. Then you will be able to start all of the instances of the
postgres database in the cluster using the different commands with options as mentioned in this recipe.
The term "server" refers to the database and the associated backend processes. The term "service" refers to the operating system wrapper through which the server gets invoked. In normal circumstances, the PostgreSQL server will usually start automatically when the system boots up. However, there will be situations where you may have to start the server manually for different reasons.
There are a couple of methods through which the PostgreSQL server can be started on Unix or Linux platforms:
The first method relies on passing the start argument to the
pg_ctlutility to get the postmaster backend process started, which effectively means starting the PostgreSQL server.
The next method relies on using the service commands, which, if supported by the operating system, can be used as a wrapper to the installed PostgreSQL script.
The last method involves invoking the installed PostgreSQL script directly using its complete path.
On most Unix distributions and Red Hat-based Linux distributions, the
pg_ctl utility can be used as follows:
pg_ctl -D /var/lib/pgsql/data start
If you are using the service command, the service can be started like this:
service postgresql<version> start
For PostgreSQL version 9.3, the service command to start the PostgreSQL server is as follows:
service postgresql-9.3 start
On Windows-based systems, the PostgreSQL service can be started using the following command:
NET START postgresql-9.3
The start argument of the
pg_ctl utility will first start PostgreSQL's postmaster backend process using the path of the data directory.
The database system will then start up successfully, report the last time the database system was shut down, and provide various debugging statements before returning the
postgres user to the shell prompt.
Sometimes in emergency situations, you might have to bring down the PostgreSQL server's services. There are certain situations in which you may need to stop the database services. For instance, during an operating system migration, you might need to stop the running services, take a filesystem backup, and then proceed with OS migration.
There are a couple of ways by which the PostgreSQL server can be stopped.
On Unix distributions and Red Hat-based Linux distributions, we can use the
stop argument of the
pg_ctl utility to stop the postmaster:
pg_ctl -D /var/lib/pgsql/data stop -m fast
service command, the PostgreSQL server can be stopped like this:
service postgresql stop
You may also stop the server by manually invoking the installed PostgreSQL script using its complete path:
On Windows-based systems, you may stop the postmaster service in this manner:
NET STOP postgresql-9.3
By default, the PostgreSQL server will wait for clients to first cancel their connections before shutting down.
However, with the use of a fast shutdown, there is no wait time involved as all of the user transactions will be aborted and all connections will be disconnected.
There may be situations where one needs to stop the PostgreSQL server in an emergency situation, and for this, PostgreSQL provides the immediate shutdown mode.
In case of immediate shutdown, a process will receive a harsher signal and will not be able to respond to the server anymore.
The consequence of this type of shutdown is that PostgreSQL is not able to finish its disk I/O, and therefore has to do a crash recovery the next time it is started.
The immediate shutdown mode can be invoked like this:
pg_ctl -D /var/lib/pgsql/data stop -m immediate
Another way to shut down the server would be to send the signal directly using the
kill command. The PID of the
postgres process can be found using the
ps command or from the
postmaster.pid file in the
data directory. In order to initiate a fast shutdown, you can issue the following command:
$ kill -INT head -1 /usr/local/pgsql/data/postmaster.pid
Many a times, there will be situations where end users complain that the database performance is sluggish and they are not able to log in to the database. In such situations, it is often helpful to take a quick glance through the status of the PostgreSQL backend
postmaster process and confirm whether the PostgreSQL server services are up and running.
There are a couple of ways by which the status of the PostgreSQL server can be checked.
On Unix and on Red Hat-based Linux distributions, the status argument of the
pg_ctl utility can be used to check the status of a running
pg_ctl -D /var/lib/pgsql/data status
On Unix-based and Linux-based platforms supporting the
service command, the status of a
postgresql process can be checked as follows:
service postgresql status
You may also check the server status by manually invoking the installed PostgreSQL script using its complete path:
Changes made to certain PostgreSQL configuration parameters come into effect when the server configuration files, such as
postgresql.conf, are reloaded. Reloading the server configuration files becomes necessary in such cases.
Some of the configuration parameters in PostgreSQL can be changed on the fly. However, changes to other configurations can only be reflected once the server configuration files are reloaded.
On most Unix-based and Linux-based platforms, the command to reload the server configuration file is as follows:
pg_ctl -D /var/lib/pgsql/data reload
It is also possible to reload the configuration file while being connected to a PostgreSQL session. However, this can be done by the superuser only:
postgres=# select pg_reload_conf();
On Red Hat and other Linux-based systems that support the
service command, the
postgresql command to reload the configuration file is as follows:
service postgresql reload
To ensure that changes made to the parameters in the configuration file take effect, a reload of the configuration file is needed. Reloading the configuration files requires sending the
sighup signal to the postmaster process, which in turn will forward it to the other connected backend sessions.
There are some configuration parameters whose changed values can only be reflected by a server reload. These configuration parameters have a value known as
sighup for the attribute context in the
pg_settings catalog table:
SELECT name, setting, unit ,(source = 'default') as is_default FROM pg_settings WHERE context = 'sighup' AND (name like '%delay' or name like '%timeout') AND setting != '0';
Output for the preceding query is as shown below:
Every major RDBMS, including PostgreSQL, allows simultaneous and concurrent database connections in order for users to run transactions. Due to such concurrent processing of databases, it may be during peak transaction hours that database performance becomes slow or that there are some blocking sessions. In order to deal with such situations, we might have to terminate some specific sessions or sessions coming from a particular user so that we can get database performance back to normal.
PostgreSQL provides the
pg_terminate_backend function to kill a specific session. Even though the
pg_terminate_backend function acts on a single connection at a time, we can embed
pg_terminate_backend by wrapping it around the
SELECT query to kill multiple connections, based on the filter criteria specified in the
To terminate all of the connections from a particular database, we can use the
pg_terminate_backend function as follows:
SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = 'testdb1';
To terminate all of the connections for a particular user, we can use
pg_terminate_backend like this:
SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE usename = 'agovil';
pg_terminate_backend function requires the
pid column or process ID as input. The value of
pid can be obtained from the
pg_stat_activity catalog table. Once
pid is passed as input to the
pg_terminate_backend function, all running queries will automatically be canceled and it will terminate a specific connection corresponding to the process ID as found in the
Terminating backends is also useful to free memory from idle
postgres processes that was not released for whatever reason and was hogging system resources.
If the requirement is to cancel running queries and not to terminate existing sessions, then we can use the
pg_cancel_backend function to cancel all active queries on a connection. However, with the
pg_cancel_backend function, we can only kill runaway queries issued in a database or by a specific user. It does not have the ability to terminate connections.
To cancel all of the running queries issued against a database, we can use the
pg_cancel_backend function as follows:
SELECT pg_cancel_backend(pid) FROM pg_stat_activity WHERE datname = 'testdb1';
To cancel all of the running queries issued by a specific user, we can use the
pg_cancel_backend function like this:
SELECT pg_cancel_backend(pid) FROM pg_stat_activity WHERE usename = 'agovil';
In versions before PostgreSQL 9.2, the
procpid column has to be passed as input to the
pg_cancel_backend functions to terminate running sessions and cancel queries. The
pid column replaced the
procpid column from PostgreSQL version 9.2 onwards.
You may refer to https://blog.sleeplessbeastie.eu/2014/07/23/how-to-terminate-postgresql-sessions/ and http://www.devopsderek.com/blog/2012/11/13/list-and-disconnect-postgresql-db-sessions/ for more information regarding terminating backend connections.