Reader small image

You're reading from  PostgreSQL 14 Administration Cookbook

Product typeBook
Published inMar 2022
PublisherPackt
ISBN-139781803248974
Edition1st Edition
Concepts
Right arrow
Authors (2):
Simon Riggs
Simon Riggs
author image
Simon Riggs

Simon Riggs is the CTO of 2ndQuadrant, having contributed to PostgreSQL as a major developer and committer for 14 years. He has written and designed features for replication, performance, BI, management, and security. Under his guidance, 2ndQuadrant is now a leading developer of open source PostgreSQL, serving hundreds of clients in USA, Europe, and worldwide. Simon is a frequent speaker at many conferences on PostgreSQL Futures. He has worked as a database architect for 30 years.
Read more about Simon Riggs

Gianni Ciolli
Gianni Ciolli
author image
Gianni Ciolli

Gianni Ciolli is the Vice President for Solutions Architecture at EnterpriseDB (EDB). As a PostgreSQL consultant, he has driven many successful enterprise deployments for customers in every part of the globe.Gianni is respected worldwide as a popular speaker and trainer at many PostgreSQL conferences in Europe and abroad over the last 14 years. He has worked with free and open-source software since the 1990s as an active member of the community (Prato Linux User Group, and Italian PostgreSQL Users Group). Gianni has a Ph.D. in Mathematics from the University of Florence. He lives in London with his son. His other interests include music, drama, poetry and athletics.
Read more about Gianni Ciolli

View More author details
Right arrow

Randomly sampling data

DBAs may be asked to set up a test server and populate it with test data. Often, that server will be old hardware, possibly with smaller disk sizes. So, the subject of data sampling raises its head.

The purpose of sampling is to reduce the size of the dataset and improve the speed of later analysis. Some statisticians are so used to the idea of sampling that they may not even question whether its use is valid or if it might cause further complications.

The SQL standard way to perform sampling is by adding the TABLESAMPLE clause to the SELECT statement. 

How to do it…

In this section, we will take a random sample of a given collection of data (for example, a given table). First, you should realize that there isn't a simple tool to slice off a sample of your database. It would be neat if there were, but there isn't. You'll need to read all of this to understand why:

  1. We first...
lock icon
The rest of the page is locked
Previous PageNext Page
You have been reading a chapter from
PostgreSQL 14 Administration Cookbook
Published in: Mar 2022Publisher: PacktISBN-13: 9781803248974

Authors (2)

author image
Simon Riggs

Simon Riggs is the CTO of 2ndQuadrant, having contributed to PostgreSQL as a major developer and committer for 14 years. He has written and designed features for replication, performance, BI, management, and security. Under his guidance, 2ndQuadrant is now a leading developer of open source PostgreSQL, serving hundreds of clients in USA, Europe, and worldwide. Simon is a frequent speaker at many conferences on PostgreSQL Futures. He has worked as a database architect for 30 years.
Read more about Simon Riggs

author image
Gianni Ciolli

Gianni Ciolli is the Vice President for Solutions Architecture at EnterpriseDB (EDB). As a PostgreSQL consultant, he has driven many successful enterprise deployments for customers in every part of the globe.Gianni is respected worldwide as a popular speaker and trainer at many PostgreSQL conferences in Europe and abroad over the last 14 years. He has worked with free and open-source software since the 1990s as an active member of the community (Prato Linux User Group, and Italian PostgreSQL Users Group). Gianni has a Ph.D. in Mathematics from the University of Florence. He lives in London with his son. His other interests include music, drama, poetry and athletics.
Read more about Gianni Ciolli