Reader small image

You're reading from  PostgreSQL 14 Administration Cookbook

Product typeBook
Published inMar 2022
PublisherPackt
ISBN-139781803248974
Edition1st Edition
Concepts
Right arrow
Authors (2):
Simon Riggs
Simon Riggs
author image
Simon Riggs

Simon Riggs is the CTO of 2ndQuadrant, having contributed to PostgreSQL as a major developer and committer for 14 years. He has written and designed features for replication, performance, BI, management, and security. Under his guidance, 2ndQuadrant is now a leading developer of open source PostgreSQL, serving hundreds of clients in USA, Europe, and worldwide. Simon is a frequent speaker at many conferences on PostgreSQL Futures. He has worked as a database architect for 30 years.
Read more about Simon Riggs

Gianni Ciolli
Gianni Ciolli
author image
Gianni Ciolli

Gianni Ciolli is the Vice President for Solutions Architecture at EnterpriseDB (EDB). As a PostgreSQL consultant, he has driven many successful enterprise deployments for customers in every part of the globe.Gianni is respected worldwide as a popular speaker and trainer at many PostgreSQL conferences in Europe and abroad over the last 14 years. He has worked with free and open-source software since the 1990s as an active member of the community (Prato Linux User Group, and Italian PostgreSQL Users Group). Gianni has a Ph.D. in Mathematics from the University of Florence. He lives in London with his son. His other interests include music, drama, poetry and athletics.
Read more about Gianni Ciolli

View More author details
Right arrow

How many rows are there in a table?

There is no limit on the number of rows in a table, but the table is limited to available disk space and memory/swap space. If you are storing rows that exceed an aggregated data size of 2 KB, then the maximum number of rows may be limited to 4 billion or fewer.

Counting is one of the easiest SQL statements, so it is also many people's first experience of a PostgreSQL query.

How to do it…

From any interface, the SQL command used to count rows is as follows:

SELECT count(*) FROM table;

This will return a single integer value as the result.

In psql, the command looks like the following:

cookbook=# select count(*) from orders;
 count 
-------
   345
(1 row)

How it works...

PostgreSQL can choose between two techniques available to compute the SQL count(*) function. Both are available in all the currently supported versions:

  • The first is called sequential scan. We access every data block in the table one after the other, reading the number of rows in each block. If the table is on the disk, it will cause a beneficial disk access pattern, and the statement will be fairly fast.
  • The other technique is known as an index-only scan. It requires an index on the table, and it covers a more general case than optimizing SQL queries with count(*), so we will cover it in more detail in Chapter 10Performance and Concurrency.

Some people think that the count SQL statement is a good test of the performance of a DBMS. Some DBMSs have specific tuning features for the count SQL statement, and Postgres optimizes this using index-only scans. The PostgreSQL project has talked about this many times, but few people thought we should try to optimize this. Yes, the count function is frequently used within applications, but without any WHERE clause, it is not that useful. Therefore, the index-only scans feature has been implemented, which applies to more real-world situations, as well as this recipe.

We scan every block of the table because of a major feature of Postgres, named Multiversion Concurrency Control (MVCC). MVCC allows us to run the count SQL statement at the same time that we are inserting, updating, or deleting data from the table. That's a very cool feature, and we went to a lot of trouble in Postgres to provide it for you.

MVCC requires us to record information on each row of a table, stating when that change was made. If the changes were made after the SQL statement began to execute, then we just ignore those changes. This means that we need to carry out visibility checks on each row in the table to allow us to work out the results of the count SQL statement. The optimization provided by index-only scans is the ability to skip such checks on the table blocks that are already known to be visible to all current sessions. Rows in these blocks can be counted directly on the index, which is normally smaller than the table, and is, therefore, faster.

If you think a little deeper about this, you'll see that the result of the count SQL statement is just the value at a moment in time. Depending on what happens to the table, that value could change a little or a lot while the count SQL statement is executing. So, once you've executed this, all you really know is that, at a particular point in the past, there were exactly x rows in the table.

Previous PageNext Page
You have been reading a chapter from
PostgreSQL 14 Administration Cookbook
Published in: Mar 2022Publisher: PacktISBN-13: 9781803248974
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
undefined
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $15.99/month. Cancel anytime

Authors (2)

author image
Simon Riggs

Simon Riggs is the CTO of 2ndQuadrant, having contributed to PostgreSQL as a major developer and committer for 14 years. He has written and designed features for replication, performance, BI, management, and security. Under his guidance, 2ndQuadrant is now a leading developer of open source PostgreSQL, serving hundreds of clients in USA, Europe, and worldwide. Simon is a frequent speaker at many conferences on PostgreSQL Futures. He has worked as a database architect for 30 years.
Read more about Simon Riggs

author image
Gianni Ciolli

Gianni Ciolli is the Vice President for Solutions Architecture at EnterpriseDB (EDB). As a PostgreSQL consultant, he has driven many successful enterprise deployments for customers in every part of the globe.Gianni is respected worldwide as a popular speaker and trainer at many PostgreSQL conferences in Europe and abroad over the last 14 years. He has worked with free and open-source software since the 1990s as an active member of the community (Prato Linux User Group, and Italian PostgreSQL Users Group). Gianni has a Ph.D. in Mathematics from the University of Florence. He lives in London with his son. His other interests include music, drama, poetry and athletics.
Read more about Gianni Ciolli