You're reading from Learn PostgreSQL

Product type: Book

Published in: Oct 2020

Reading level: Intermediate

Publisher: Packt

ISBN-13: 9781838985288

Edition: 1st Edition

Languages

SQL

Tools

PostgreSQL

Concepts

Databases

Authors (2):

Luca Ferrari

Enrico Pirozzi

Indexes and Performance Optimization

Performance tuning is one of the most complex tasks in the daily job of a database administrator. SQL is a declarative language: it does not define how to access the underlying data, and that responsibility is left to the database engine. For every statement, PostgreSQL must therefore select the best available access path to the data.

A particular component, the planner (also known as the optimizer), is responsible for choosing the best among all the available paths to the underlying data, and another component, the executor, is responsible for executing the statement with that access plan.

The aim of this chapter is to teach you how PostgreSQL executes a query, how the planner computes the best execution plan, and how you can help improve performance by means of indexes.

You will learn about the following topics in this chapter...

Technical requirements

You need to know the following:

How to execute queries against the database
How to execute data definition language (DDL) statements

The code for this chapter can be found in the following GitHub repository: https://github.com/PacktPublishing/Learn-PostgreSQL.

Execution of a statement

SQL is a declarative language: you ask the database to execute something on the data it contains, but you do not specify how the database is supposed to carry out the SQL statement. For instance, when you want to get back some data, you execute a SELECT statement, but its clauses only describe which subset of data you need, not how the database is supposed to pull that data from its persistent storage. You have to trust the database (in particular, PostgreSQL) to do its job and find the fastest path to the data under whatever workload it is currently facing. The good news is that PostgreSQL is really good at this: it is able to understand (and, to some extent, interpret) your SQL statements and its current workload in order to give you access to the data in the fastest way it can find.

However, finding the fastest path to the data often requires an equilibrium between searching for the absolute fastest path and the time spent in...

Indexes

An index is a data structure that allows faster access to the underlying table so that specific tuples can be found quickly. Here, "quickly" means faster than scanning the whole underlying table and analyzing every single tuple.

PostgreSQL supports different types of indexes, and not all types are optimal for every scenario and workload. In the following sections, you will discover the main types of indexes that PostgreSQL provides, but in any case, you can extend PostgreSQL with your own indexes or indexes provided by extensions.

An index in PostgreSQL can be built on a single column or multiple columns at once; PostgreSQL supports indexes with up to 32 columns.

An index can cover all the data in the underlying table, or can index specific values only – in that case, the index is known as "partial." For example, you can decide to index only those values of certain columns that you are going to use the most.
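As a minimal sketch of the index flavors described above, the following statements build a single-column, a multi-column, and a partial index. The demo_posts table, its columns, and the index names are hypothetical, made up for illustration only; they are not taken from the book's forumdb schema.

```sql
-- Hypothetical demo table; every name here is an illustrative assumption.
DROP TABLE IF EXISTS demo_posts;
CREATE TABLE demo_posts (
    pk          integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    author      integer     NOT NULL,
    category    integer     NOT NULL,
    title       text        NOT NULL,
    is_archived boolean     NOT NULL DEFAULT false,
    created_on  timestamptz NOT NULL DEFAULT now()
);

-- Single-column index (B-tree is the default index type).
CREATE INDEX idx_demo_posts_author
    ON demo_posts (author);

-- Multi-column index: one index built on two columns at once.
CREATE INDEX idx_demo_posts_category_created
    ON demo_posts (category, created_on);

-- Partial index: only the rows matching the WHERE predicate are indexed.
CREATE INDEX idx_demo_posts_active
    ON demo_posts (created_on)
    WHERE NOT is_archived;
```

The partial index stays smaller than a full one because archived rows are left out, which fits the case where most queries only ever look at the non-archived rows.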

An index can also be unique, meaning that...

The EXPLAIN statement

EXPLAIN is the statement that allows you to see how PostgreSQL is going to execute a specific query. You have to pass the statement you want to analyze to EXPLAIN, and the execution plan will be shown.

There are a few important things to know before using EXPLAIN:

It will only show the best plan, which is the one with the lowest cost among all the evaluated plans.
It will not execute the statement you are asking the plan for, so running EXPLAIN is fast and takes roughly the same time on every run.
It will present you with all the execution nodes that the executor will use to provide you with the dataset.
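The second point is easy to check with a data-changing statement. The sketch below assumes a hypothetical scratch table named demo_explain, which is not part of the book's example database; asking for the plan of a DELETE leaves the rows where they are.

```sql
-- Hypothetical scratch table, used only for this demonstration.
DROP TABLE IF EXISTS demo_explain;
CREATE TABLE demo_explain AS
    SELECT n AS pk
    FROM generate_series(1, 1000) AS n;

-- Plain EXPLAIN only plans the DELETE; no row is removed.
EXPLAIN DELETE FROM demo_explain;

-- The table still holds all 1000 rows created above.
SELECT count(*) FROM demo_explain;
```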

Let's see EXPLAIN in action to better understand how it works. Imagine we need to understand the execution plan of the SELECT * FROM categories statement. In this case, you need to prefix the statement with the EXPLAIN command, as follows:



forumdb=> EXPLAIN SELECT * FROM categories;
                        QUERY PLAN                         
--------...

An example of query tuning

In the previous section, you learned how to use EXPLAIN to understand how PostgreSQL is going to execute a query; it is now time to put EXPLAIN into action to tune some slow queries and improve performance.

This section shows some basic concepts of the day-to-day usage of EXPLAIN as a powerful tool to determine where and how to help PostgreSQL access data faster. Of course, query tuning is a very complex subject and often requires repeated trial-and-error optimization, so the aim of this section is not to provide exhaustive knowledge about query tuning but rather a basic understanding of how to improve your own database and queries.

Sometimes, tuning a query simply involves rewriting it in a way that is more comfortable (or better, more comprehensible) to PostgreSQL, but most often it means using an appropriate index to speed up access to the underlying data.
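The index-based approach can be sketched as a before-and-after comparison of plans. The demo_tuning table, its generated data, and the index name are assumptions invented for this illustration. The plan PostgreSQL actually picks depends on the data and on the collected statistics, so no output is shown here.

```sql
-- Hypothetical table with generated data; all names are illustrative.
DROP TABLE IF EXISTS demo_tuning;
CREATE TABLE demo_tuning AS
    SELECT n              AS pk,
           n % 1000       AS author,
           'Post #' || n  AS title
    FROM generate_series(1, 100000) AS n;

-- Make sure the planner has statistics about the new table.
ANALYZE demo_tuning;

-- Before: no index exists, so the table itself has to be scanned.
EXPLAIN SELECT * FROM demo_tuning WHERE author = 42;

-- Add an index on the column used in the WHERE clause.
CREATE INDEX idx_demo_tuning_author ON demo_tuning (author);

-- After: the planner can now consider an index-based access path
-- and will use it if it estimates a lower cost.
EXPLAIN SELECT * FROM demo_tuning WHERE author = 42;
```

Comparing the estimated cost reported by the two EXPLAIN runs is one simple way to judge whether the new index is worth keeping.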

Let's start with a simple example: we want to extract all...

ANALYZE and how to update statistics

PostgreSQL uses a statistical approach to evaluate different execution plans. This means that PostgreSQL does not know exactly how many tuples there are in a table, but it has a good approximation that allows the planner to compute the cost of an execution plan.

Statistics are not only related to the quantity (how many tuples) but also to the quality of the underlying data – for example, how many distinct values, which values are more frequent in a column, and so on. Thanks to the combination of all of this data, PostgreSQL is able to make a good decision.
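Both kinds of statistics can be inspected directly in the system catalogs. The following is a minimal sketch, assuming a hypothetical demo_stats table created just for this purpose: pg_class holds the estimated number of tuples, while the pg_stats view exposes the per-column data distribution.

```sql
-- Hypothetical table with a low-cardinality column; names are illustrative.
DROP TABLE IF EXISTS demo_stats;
CREATE TABLE demo_stats AS
    SELECT n      AS pk,
           n % 10 AS category
    FROM generate_series(1, 10000) AS n;

-- Collect statistics for the table right now.
ANALYZE demo_stats;

-- Quantity: the planner's estimate of tuples and disk pages.
SELECT relname, reltuples, relpages
FROM pg_class
WHERE relname = 'demo_stats';

-- Quality: fraction of NULLs, distinct values, and most common values per column.
SELECT attname, null_frac, n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'demo_stats';
```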

There are times, however, when the quality of the statistical data is not good enough for PostgreSQL to choose the best plan, a problem commonly known as "out-of-date statistics." In fact, statistics are not updated in real time; rather, PostgreSQL keeps track of what is going on in every table in every database and summarizes the number of new tuples, updated ones, and deleted ones, as...
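The per-table counters of inserted, updated, and deleted tuples are visible in the pg_stat_user_tables view. As a hedged sketch, the query below lists the tables of the current database by the number of modifications since their last ANALYZE, which gives a rough idea of where statistics may be stale. How many changes count as "too many" is a judgment call that depends on the table.

```sql
-- Tables with the most changes since statistics were last collected.
SELECT relname,
       n_tup_ins,            -- tuples inserted
       n_tup_upd,            -- tuples updated
       n_tup_del,            -- tuples deleted
       n_mod_since_analyze,  -- changes since the last ANALYZE
       last_analyze,         -- last manual ANALYZE
       last_autoanalyze      -- last ANALYZE run by autovacuum
FROM pg_stat_user_tables
ORDER BY n_mod_since_analyze DESC;
```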

Auto-explain

Auto-explain is an extension that helps the database administrator get an idea of slow queries and their execution plans. Essentially, auto-explain triggers when a running query is slower than a specified threshold, and then dumps the execution plan of the query into the PostgreSQL logs (refer to Chapter 14, Logging and Auditing).

In this way, the database administrator can get an insight into slow queries and their execution plan without having to re-execute these queries. Thanks to this, the database administrator can inspect the execution plans and decide if and where to apply indexes or perform a deeper analysis.

The auto-explain module is configured via a set of auto_explain.* parameters that can be placed in the PostgreSQL configuration (the postgresql.conf file). Remember that activating the module for the whole cluster, by listing it in shared_preload_libraries, requires a restart of the cluster.
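For a quick experiment, the module can also be loaded into a single session, which is a different route from the postgresql.conf configuration described above. The sketch below assumes a superuser connection. The 500 ms threshold is an arbitrary value chosen for illustration, and the same parameter names can be used in postgresql.conf.

```sql
-- Load the module for the current session only (requires superuser).
LOAD 'auto_explain';

-- Log the plan of every statement slower than the (arbitrary) threshold.
SET auto_explain.log_min_duration = '500ms';

-- Include actual run times and row counts, as EXPLAIN ANALYZE would.
SET auto_explain.log_analyze = on;

-- A deliberately slow statement: its plan should show up in the server log.
SELECT pg_sleep(1);
```

Settings applied this way disappear when the session ends, so nothing has to be undone afterwards.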

The auto-explain module can do pretty much the same things that a manual EXPLAIN command can do, including...

Summary

PostgreSQL provides very rich features for creating and managing indexes, both single-column and multi-column, as well as multiple types of indexes that can be built.

Thanks to the EXPLAIN command, a database administrator can inspect a slow query and see which access to the underlying data the optimizer considered best; with an understanding of how PostgreSQL works, the administrator can then decide which indexes to create in order to tune performance.

PostgreSQL also provides a rich set of statistics that describe both the quality and the quantity of the data within every table, which the planner uses to generate an execution plan, and that let you monitor which indexes are used and when. Auto-explain is another useful module that can be used to silently monitor slow queries and execution plans and see how the cluster is performing without any need to manually execute every suspect statement.

It is important to emphasize that performance tuning...

References

PostgreSQL official documentation about CREATE INDEX: https://www.postgresql.org/docs/12/sql-createindex.html
PostgreSQL official documentation about pg_stats: https://www.postgresql.org/docs/12/view-pg-stats.html
PostgreSQL official documentation about EXPLAIN: https://www.postgresql.org/docs/12/using-explain.html
PostgreSQL official documentation about ANALYZE: https://www.postgresql.org/docs/12/sql-analyze.html
Auto-explain official documentation: https://www.postgresql.org/docs/12/auto-explain.html

Authors (2)

Luca Ferrari

Luca Ferrari has been passionate about computer science since the Commodore 64 era, and today holds a master's degree (with honors) and a Ph.D. from the University of Modena and Reggio Emilia. He has written several research papers, technical articles, and book chapters. In 2011, he was named an Adjunct Professor by Nipissing University. An avid Unix user, he is a strong advocate of open source, and in his free time, he collaborates with a few projects. He met PostgreSQL back in release 7.3; he was a founder and former president of the Italian PostgreSQL Community (ITPUG). He also talks regularly at technical conferences and events and delivers professional training.

Enrico Pirozzi

Enrico Pirozzi, EnterpriseDB certified on implementation management and tuning, with a master's in computer science, has been a PostgreSQL DBA since 2003. Based in Italy, he has been providing database advice to clients in industries such as manufacturing and web development for 10 years. He has been training others on PostgreSQL since 2008. Dedicated to open source technology since early in his career, he is a cofounder of the PostgreSQL Italian mailing list, PostgreSQL-it, and of the PostgreSQL Italian community site, PSQL