PostgreSQL 9.0 High Performance


PostgreSQL 9.0 High Performance
eBook: $29.99
Formats: PDF, PacktLib, ePub and Mobi formats
$25.49
save 15%!
Print + free eBook + free PacktLib access to the book: $79.98    Print cover: $49.99
$44.99
save 44%!
Free Shipping!
UK, US, Europe and selected countries in Asia.
Also available on:
Overview
Table of Contents
Author
Reviews
Support
Sample Chapters
  • Learn the right techniques to obtain optimal PostgreSQL database performance, from initial design to routine maintenance
  • Discover the techniques used to scale successful database installations
  • Avoid the common pitfalls that can slow your system down
  • Filled with advice about what you should be doing; how to build experimental databases to explore performance topics, and then move what you've learned into a production database environment
  • Covers versions 8.1 through 9.0

 

Book Details

Language : English
Paperback : 468 pages [ 235mm x 191mm ]
Release Date : October 2010
ISBN : 184951030X
ISBN 13 : 9781849510301
Author(s) : Gregory Smith
Topics and Technologies : All Books, Big Data and Business Intelligence, Open Source


Table of Contents

Preface
Chapter 1: PostgreSQL Versions
Chapter 2: Database Hardware
Chapter 3: Database Hardware Benchmarking
Chapter 4: Disk Setup
Chapter 5: Memory for Database Caching
Chapter 6: Server Configuration Tuning
Chapter 7: Routine Maintenance
Chapter 8: Database Benchmarking
Chapter 9: Database Indexing
Chapter 10: Query Optimization
Chapter 11: Database Activity and Statistics
Chapter 12: Monitoring and Trending
Chapter 13: Pooling and Caching
Chapter 14: Scaling with Replication
Chapter 15: Partitioning Data
Chapter 16: Avoiding Common Problems
Index
  • Chapter 1: PostgreSQL Versions
    • Performance of historical PostgreSQL releases
      • Choosing a version to deploy
      • Upgrading to a newer major version
        • Upgrades to PostgreSQL 8.3+ from earlier ones
        • Minor version upgrades
    • PostgreSQL or another database?
    • PostgreSQL tools
      • PostgreSQL contrib
        • Finding contrib modules on your system
        • Installing a contrib module from source
        • Using a contrib module
      • pgFoundry
      • Additional PostgreSQL-related software
    • PostgreSQL application scaling lifecycle
    • Performance tuning as a practice
    • Summary
  • Chapter 2: Database Hardware
    • Balancing hardware spending
      • CPUs
      • Memory
      • Disks
        • RAID
        • Drive error handling
        • Hard drive reliability studies
        • Drive firmware and RAID
        • SSDs
      • Disk controllers
        • Hardware and Software RAID
        • Recommended disk controllers
        • Attached storage—SAN and NAS
    • Reliable controller and disk setup
      • Write-back caches
        • Sources of write-back caching
        • Disk controller monitoring
        • Disabling drive write caches
      • Performance impact of write-through caching
    • Summary
  • Chapter 3: Database Hardware Benchmarking
    • CPU and memory benchmarking
      • memtest86+
      • STREAM memory testing
        • STREAM and Intel vs. AMD
      • CPU benchmarking
      • Sources of slow memory and processors
    • Physical disk performance
      • Random access and I/Os Per Second
      • Sequential access and ZCAV
        • Short stroking
      • Commit rate
        • PostgreSQL test_fsync
        • INSERT rate
        • Windows commit rate
    • Disk benchmarking tools
      • hdtune
        • Short stroking tests
        • IOPS
        • Unpredictable performance and Windows
      • dd
      • bonnie++
        • bonnie++ 2.0
        • bonnie++ ZCAV
      • sysbench
        • Seek rate
        • fsync commit rate
      • Complicated disk benchmarks
    • Sample disk results
      • Disk performance expectations
        • Sources of slow disk and array performance
    • Summary
  • Chapter 4: Disk Setup
    • Maximum filesystem sizes
    • Filesystem crash recovery
      • Journaling filesystems
    • Linux filesystems
      • ext2
      • ext3
      • ext4
      • XFS
      • Other Linux filesystems
      • Write barriers
        • Drive support for barriers
        • Filesystem support for barriers
      • General Linux filesystem tuning
        • Read-ahead
        • File access times
        • Read caching and swapping
        • Write cache sizing
        • I/O scheduler elevator
    • Solaris and FreeBSD filesystems
      • Solaris UFS
      • FreeBSD UFS2
      • ZFS
    • Windows filesystems
      • FAT32
      • NTFS
        • Adjusting mounting behaviour
    • Disk layout for PostgreSQL
      • Symbolic links
      • Tablespaces
      • Database directory tree
        • Temporary files
      • Disk arrays, RAID, and disk layout
        • Disk layout guidelines
    • Summary
  • Chapter 5: Memory for Database Caching
    • Memory units in the postgresql.conf
    • Increasing UNIX shared memory parameters for larger buffer sizes
      • Kernel semaphores
      • Estimating shared memory allocation
  • Inspecting the database cache
    • Installing pg_buffercache into a database
    • Database disk layout
    • Creating a new block in a database
    • Writing dirty blocks to disk
  • Crash recovery and the buffer cache
    • Checkpoint processing basics
    • Write-ahead log and recovery processing
    • Checkpoint timing
      • Checkpoint spikes
      • Spread checkpoints
    • Database block lifecycle
      • Dirty block write paths
  • Database buffer cache versus operating system cache
    • Doubly cached data
      • Inspecting the OS cache
    • Checkpoint overhead
    • Starting size guidelines
      • Platform, version, and workload limitations
  • Analyzing buffer cache contents
    • Inspection of the buffer cache queries
      • Top relations in the cache
      • Summary by usage count
      • Buffer contents summary, with percentages
      • Buffer usage count distribution
    • Using buffer cache inspection for sizing feedback
  • Summary
  • Chapter 6: Server Configuration Tuning
    • Interacting with the live configuration
      • Defaults and reset values
      • Allowed change context
      • Reloading the configuration file
        • Commented out settings
    • Server-wide settings
      • Database connections
        • listen_addresses
        • max_connections
      • Shared memory
        • shared_buffers
        • Free space map (FSM) settings
      • Logging
        • log_line_prefix
        • log_statement
        • log_min_duration_statement
      • Vacuuming and statistics
        • autovacuum
        • Enabling autovacuum on older versions
        • maintainance_work_mem
        • default_statistics_target
      • Checkpoints
        • checkpoint_segments
        • checkpoint_timeout
        • checkpoint_completion_target
      • WAL settings
        • wal_buffers
        • wal_sync_method
      • PITR and WAL Replication
    • Per-client settings
      • effective_cache_size
      • synchronous_commit
      • work_mem
      • random_page_cost
      • constraint_exclusion
    • Tunables to avoid
      • fsync
      • full_page_writes
      • commit_delay and commit_siblings
      • max_prepared_transactions
      • Query enable parameters
  • New server tuning
  • Dedicated server guidelines
  • Shared server guidelines
  • pgtune
  • Summary
  • Chapter 7: Routine Maintenance
    • Transaction visibility with multiversion concurrency control
      • Visibility computation internals
      • Updates
      • Row lock conflicts
        • Serialization
      • Deletions
      • Advantages of MVCC
      • Disadvantages of MVCC
      • Transaction ID wraparound
    • Vacuum
      • Vacuum Implementation
        • Regular vacuum
        • Returning free disk space
        • Full vacuum
        • HOT
      • Cost-based vacuuming
      • autovacuum
        • autovacuum logging
        • autovacuum monitoring
        • autovacuum triggering
        • Per-table adjustments
      • Common vacuum and autovacuum problems
        • autovacuum is running even though it was turned off
        • autovacuum is constantly running
        • Out of memory errors
        • Not keeping up on a busy server
        • autovacuum is too disruptive
        • Long running transactions
        • Free Space Map exhaustion
        • Recovering from major problems
    • Autoanalyze
    • Index bloat
      • Measuring index bloat
    • Detailed data and index page monitoring
    • Monitoring query logs
      • Basic PostgreSQL log setup
        • Log collection
        • log_line_prefix
        • Multi-line queries
        • Using syslog for log messages
        • CSV logging
      • Logging difficult queries
        • auto_explain
      • Log file analysis
        • Normalized query fingerprints
        • pg_stat_statements
        • pgFouine
        • PQA
        • EPQA
        • pgsi
        • mk-query-digest
    • Summary
  • Chapter 8: Database Benchmarking
    • pgbench default tests
      • Table definition
      • Scale detection
      • Query script definition
      • Configuring the database server for pgbench
        • Sample server configuration
    • Running pgbench manually
    • Graphing results with pgbench-tools
      • Configuring pgbench-tools
        • Customizing for 8.3
    • Sample pgbench test results
      • SELECT-only test
      • TPC-B-like test
      • Latency analysis
    • Sources for bad results and variation
      • Developer PostgreSQL builds
      • Worker threads and pgbench program limitations
  • pgbench custom tests
    • Insert speed test
  • Transaction Processing Performance Council benchmarks
  • Summary
  • Chapter 9: Database Indexing
    • Indexing example walkthrough
      • Measuring query disk and index block statistics
      • Running the example
      • Sample data setup
      • Simple index lookups
      • Full table scans
      • Index creation
      • Lookup with an inefficient index
      • Combining indexes
      • Switching from indexed to sequential scans
        • Planning for plan changes
      • Clustering against an index
      • Explain with buffer counts
    • Index creation and maintenance
      • Unique indexes
      • Concurrent index creation
      • Clustering an index
        • Fill factor
      • Reindexing
    • Index types
      • B-tree
        • Text operator classes
      • Hash
      • GIN
      • GiST
    • Advanced index use
      • Multicolumn indexes
      • Indexes for sorting
      • Partial indexes
      • Expression-based indexes
      • Indexing for full-text search
    • Summary
  • Chapter 10: Query Optimization
    • Sample data sets
      • Pagila
      • Dell Store 2
    • EXPLAIN basics
      • Timing overhead
      • Hot and cold cache behavior
        • Clearing the cache
    • Query plan node structure
      • Basic cost computation
        • Estimated costs and real world costs
    • Explain analysis tools
      • Visual explain
      • Verbose output
      • Machine readable explain output
      • Plan analysis tools
    • Assembling row sets
      • Tuple id
        • Object id
      • Sequential scan
      • Index scan
      • Bitmap heap and index scans
    • Processing nodes
      • Sort
      • Limit
        • Offsets
      • Aggregate
      • HashAggregate
      • Unique
        • WindowAgg
      • Result
      • Append
      • Group
      • Subquery Scan and Subplan
        • Subquery conversion and IN lists
      • Set operations
      • Materialize
      • CTE Scan
    • Joins
      • Nested loop
        • Nested loop with inner Index Scan
      • Merge Join
        • Nested loop and Merge Join materialization
      • Hash Joins
        • Hash semi and anti joins
        • Join ordering
        • Forcing join order
        • Join removal
        • Genetic query optimizer
    • Statistics
      • Viewing and estimating with statistics
      • Statistics targets
        • Adjusting a column target
        • Distinct values
      • Difficult areas to estimate
    • Other query planning parameters
      • effective_cache_size
      • work_mem
      • constraint_exclusion
      • cursor_tuple_fraction
    • Executing other statement types
    • Improving queries
      • Optimizing for fully cached data sets
      • Testing for query equivalence
      • Disabling optimizer features
      • Working around optimizer bugs
      • Avoiding plan restructuring with OFFSET
      • External trouble spots
    • SQL Limitations
      • Numbering rows in SQL
      • Using Window functions for numbering
      • Using Window functions for cumulatives
    • Summary
  • Chapter 11: Database Activity and Statistics
    • Statistics views
    • Cumulative and live views
    • Table statistics
      • Table I/O
    • Index statistics
      • Index I/O
    • Database wide totals
    • Connections and activity
    • Locks
      • Virtual transactions
      • Decoding lock information
      • Transaction lock waits
      • Table lock waits
      • Logging lock information
        • Deadlocks
    • Disk usage
    • Buffer, background writer, and checkpoint activity
      • Saving pg_stat_bgwriter snapshots
      • Tuning using background writer statistics
    • Summary
  • Chapter 12: Monitoring and Trending
    • UNIX monitoring tools
      • Sample setup
      • vmstat
      • iostat
        • iotop for Linux
        • Examples of good performance
        • Overloaded system samples
      • top
        • Solaris top replacements
        • htop for Linux
      • sysstat and sar
        • Enabling sysstat and its optional features
        • Graphing with kSar
    • Windows monitoring tools
      • Task Manager
        • Sysinternals tools
      • Windows System Monitor
        • Saving Windows System Monitor data
    • Trending software
      • Types of monitoring and trending software
        • Storing historical trend data
      • Nagios
        • Nagios and PostgreSQL
        • Nagios and Windows
      • Cacti
        • Cacti and PostgreSQL
        • Cacti and Windows
      • Munin
      • Other trending packages
        • pgstatspack
        • Zenoss
        • Hyperic HQ
        • Reconnoiter
        • Staplr
        • SNMP tools
    • Summary
  • Chapter 13: Pooling and Caching
    • Connection pooling
      • Pooling connection counts
      • pgpool-II
        • pgpool-II load balancing for replication scaling
      • pgBouncer
        • Application server pooling
    • Database caching
      • memcached
      • pgmemcache
    • Summary
  • Chapter 14: Scaling with Replication
    • Hot Standby
      • Terminology
      • Setting up WAL shipping
      • Streaming Replication
      • Tuning Hot Standby
    • Replication queue managers
      • Slony
      • Londiste
      • Read scaling with replication queue software
    • Special application requirements
      • Bucardo
      • pgpool-II
    • Other interesting replication projects
    • Summary
  • Chapter 15: Partitioning Data
    • Table range partitioning
      • Determining a key field to partition over
      • Sizing the partitions
        • List partitioning
      • Creating the partitions
      • Redirecting INSERT statements to the partitions
        • Dynamic trigger functions
        • Partition rules
      • Empty partition query plans
      • Date change update trigger
      • Live migration of a partitioned table
      • Partitioned queries
      • Creating new partitions
        • Scheduled creation
        • Dynamic creation
      • Partitioning advantages
      • Common partitioning mistakes
    • Horizontal partitioning with PL/Proxy
      • Hash generation
      • Scaling with PL/Proxy
        • Sharding
      • Scaling with GridSQL
    • Summary
  • Chapter 16: Avoiding Common Problems
    • Bulk loading
      • Loading methods
        • External loading programs
      • Tuning for bulk loads
      • Skipping WAL acceleration
      • Recreating indexes and adding constraints
      • Parallel restore
      • Post load cleanup
    • Common performance issues
      • Counting rows
      • Unexplained writes
      • Slow function and prepared statement execution
      • PL/pgSQL benchmarking
      • High foreign key overhead
      • Trigger memory use
      • Heavy statistics collector overhead
        • Targeted statistics resets
      • Materialized views
    • Profiling the database
      • gprof
      • OProfile
      • Visual Studio
      • DTrace
        • DTrace on FreeBSD
        • Linux SystemTap emulation of DTrace
    • Performance related features by version
      • Aggressive PostgreSQL version upgrades
      • 8.1
      • 8.2
      • 8.3
      • 8.4
      • 9.0
        • Replication
        • Queries and EXPLAIN
        • Database development
        • Configuration and monitoring
        • Tools
        • Internals
    • Summary

Gregory Smith

Gregory Smith is the principal consultant in the United States for international PostgreSQL services firm 2ndQuadrant. Based in Baltimore, MD, he's been providing database advice to clients in industries like manufacturing, finance, and web development for twenty years. Dedicated to open-source technology since early in his career, Greg turned to full-time PostgreSQL work by diving in at the source code level. He contributes regular feature patches to the core database and has written a variety of database tools. This book reflects the perspective that he's gained over the last few years as one of the most prolific sources of advice on the popular PostgreSQL community mailing lists.

Code Downloads

Download the code and support files for this book.


Submit Errata

Please let us know if you have found any errors not listed on this list by completing our errata submission form. Our editors will check them and add them to this list. Thank you.


Errata

- 1 submitted: last submission 02 May 2013

Errata type: Typo | Page number: 126

"archive_command | sihup" should be "archive_command | sighup"

 

Errata type: Typo | Page number: 256

"This plan is dramatically easier to read than the earlier behavior, and can be forced by turning off the new optimization."
should be
"This plan is dramatically easier to read than the earlier behavior, and can be enforced by turning off the new optimization."

 

Errata type: Code | Page number: 90

The database configuration parameter name is incorrectly stated as:
'open_datasync=fsync_writethrough'
it should be
'wal_sync_method=fsync_writethrough'."

 

Sample chapters

You can view our sample chapters and prefaces of this title on PacktLib or download sample chapters in PDF format.

Frequently bought together

PostgreSQL 9.0 High Performance +    Windows Server 2012 Automation with PowerShell Cookbook =
50% Off
the second eBook
Price for both: £27.35

Buy both these recommended eBooks together and get 50% off the cheapest eBook.

What you will learn from this book

  • Learn best practices culled from years of work scaling PostgreSQL installations to handle demanding applications
  • Uncover what makes hardware good or bad for high-performance database applications
  • Investigate making informed speed and reliability trade-offs
  • Tweak your operating system for best database performance
  • Benchmark your whole system from hardware to application
  • Learn how the server parameters that impact performance work using real examples
  • Understand how the query optimizer makes its decisions, and what to do when it makes the wrong ones
  • Monitor what happens on your server, both in and outside of the database
  • Find the best add-on tools that extend the core PostgreSQL database
  • Discover how to architect replicated systems using the latest features added to PostgreSQL 9.0

 

In Detail

PostgreSQL database servers have a common set of problems they encounter as their usage gets heavier and requirements more demanding. You could spend years discovering solutions to them all, step by step as you encounter them. Or you can just look in here.

All successful database applications are destined to eventually run into issues scaling up their performance. Peek into the future of your PostgreSQL database's problems today. Know the warning signs to look for, and how to avoid the most common issues before they even happen.

Surprisingly, most PostgreSQL database applications evolve in the same way: Choose the right hardware. Tune the operating system and server memory use. Optimize queries against the database, with the right indexes. Monitor every layer, from hardware to queries, using some tools that are inside PostgreSQL and others that are external.

Using monitoring insight, continuously rework the design and configuration. On reaching the limits of a single server, break things up; connection pooling, caching, partitioning, and replication can all help handle increasing database workloads.

The path to a high performance database system isn't always easy. But it doesn't have to be mysterious with the right guide.

A clear, step-by-step guide to optimizing and scaling up PostgreSQL database servers

Approach

Improving database performance requires an equal mix of understanding theoretical concepts and working through hands-on examples. You'll find both here. Many of the examples given will be immediately useful for monitoring and improving your PostgreSQL deployments, providing insight into hard-to-obtain information about your database.

Who this book is for

This book is aimed at intermediate to advanced database administrators using or planning to use PostgreSQL. Portions will also interest systems administrators looking to build or monitor a PostgreSQL installation, as well as developers interested in advanced database internals that impact application design.

Code Download and Errata
Packt Anytime, Anywhere
Register Books
Print Upgrades
eBook Downloads
Video Support
Contact Us
Awards Voting Nominations Previous Winners
Judges Open Source CMS Hall Of Fame CMS Most Promising Open Source Project Open Source E-Commerce Applications Open Source JavaScript Library Open Source Graphics Software
Resources
Open Source CMS Hall Of Fame CMS Most Promising Open Source Project Open Source E-Commerce Applications Open Source JavaScript Library Open Source Graphics Software