You're reading from Learn T-SQL Querying - Second Edition

Product typeBook

Published inFeb 2024

PublisherPackt

ISBN-139781837638994

Edition2nd Edition

Tools

SQL Server

Concepts

Database Administration

Authors (2):

Pedro Lopes

Pam Lahoud

View More author details

Indexing for T-SQL Performance

In the previous chapter, we explored execution plans and the various operators that the SQL Database Engine uses to retrieve the data requested by a query. While the Query Optimizer does most of the heavy lifting when choosing the best way to retrieve the data required to satisfy the query, it can only do so efficiently if the proper indexes are in place.

An index is a structure in the database that speeds up access to data by organizing it in a specific way based on the type of index. The data structure that works best for your application will depend on many factors, including the type of data being stored, the volatility of the data, and the data access patterns that will be used to retrieve the data. The SQL Database Engine offers a few different index types, such as rowstore, columnstore, XML, and others. Rowstore indexes are the most common indexes in the SQL Database Engine and are what most people think of when considering indexes for query...

Technical requirements

The examples used in this chapter are designed for use on SQL Server 2022 and Azure SQL Database, but they should work on any version of SQL Server, 2012 or later. The Developer Edition of SQL Server is free for development environments and can be used to run all the code samples. There is also a free tier of Azure SQL Database you can use for testing at https://aka.ms/freedb.

You will need the AdventureWorks2016_EXT (referred to as AdventureWorks) and AdventureWorksDW2016_EXT (referred to as AdventureWorksDW) sample databases, which can be found on GitHub at https://github.com/Microsoft/sql-server-samples/releases/tag/adventureworks. The code samples for this chapter can also be found on GitHub at https://github.com/PacktPublishing/Learn-T-SQL-Querying-Second-Edition/tree/main/ch4.

Understanding predicate SARGability

A predicate is a filter that can be used to determine the set of conditions to apply to a query to trim the result set. As we have discussed in previous chapters, these are typically applicable to the following clauses:

JOIN clauses, which filter the rows matching the type of join
WHERE clauses, which filter source rows from a table or an index
HAVING clauses, which filter the results

Most queries will make use of predicates, usually through a WHERE clause. When a predicate is serviceable by an index, it is said the predicate is SARGable, which is an acronym for Search ARGument-able. Having SARGable predicates should be a goal for our T-SQL queries because it can reduce the number of rows that need to be processed by a plan earlier in the execution – that is, when the data is being read by the SQL Database Engine. The implementation of this early row count reduction is called predicate pushdown; it is the action of using...

Data access using indexes

Now that we have discussed how the Query Optimizer uses indexes to facilitate predicate pushdown and make queries more efficient, let’s explore how indexes are structured and why they are so important for query performance.

Before we begin discussing the structure of indexes, it’s worth understanding how data is stored and accessed in the SQL Database Engine. Data is stored on 8 KB pages. An object such as a table or an index is essentially a collection of pages, along with metadata that maps out the structure of the object. The SQL Database Engine uses a special metadata page called an Index Allocation Map (IAM) page to locate the pages in an object. IAM pages contain a list of all the pages in a database file that belong to an object. Each object will have at least one IAM page but depending on the size of the object and the file structure of the database, there may be more than one IAM page, forming a chain.

Tables that do not have a...

Indexing strategy using rowstore indexes

Now that we’ve covered the basics of how rowstore indexes are structured and how they are used to access data, let’s move on to where and when they should be used, along with some best practices for efficient index design.

The goal of an indexing strategy is to minimize the amount of I/O required to satisfy the queries being generated against the database. This translates into a few simple guidelines:

Keep indexes as small as possible. The more rows that fit on a page, the fewer page reads that are required to access the data.
Avoid lookups – they add unnecessary I/O and can sometimes lead to suboptimal query plans.
Choose index keys that support query predicates so that indexes can be used for seeks rather than scans.
When creating indexes with multiple key columns, columns used for equality comparisons should be first, followed by columns used for inequality comparisons. The leading column should...

Index maintenance

While index maintenance is more of a database administration topic than a developer topic, it’s worth discussing the importance of index maintenance. As we discussed in the section on index structure, over time, INSERT, UPDATE, and DELETE operations can cause an index to become fragmented. Once the data is in memory, fragmentation doesn’t cause a noticeable performance issue, so the main concern is I/O. The SQL Database Engine has a few I/O optimizations, such as the readahead mechanism that’s used when scanning an index, that rely on the data being stored contiguously. When the data is fragmented, I/O may not be as efficient.

Another side effect of fragmentation is lower page density. A page is the smallest unit of I/O in the SQL Database Engine, so an index that contains a lot of partially empty pages will generate a lot more I/O than necessary. If the pages are full, it will take fewer of them to store the same amount of data. This is a...

Summary

This chapter covered a lot of ground, so let’s review the overall indexing strategy guidance:

Clustered index data access is generally more efficient than heaps and every table in the database should have a clustered index, except for short-lived tables such as staging tables.
Create clustered indexes first based on the data structure. These should generally be primary keys unless there’s a specific reason to cluster a different column or columns (for example, surrogate versus natural keys).
Create non-clustered indexes on all foreign key columns.
Once you begin writing queries, create additional non-clustered indexes to support the application queries, or add additional columns to existing foreign key indexes.
Create covering indexes where practical, balancing overhead with performance.
Do not over-index heavily updated tables; balance the cost of index maintenance with the benefit to queries. Just because the SQL Database Engine allows...

The rest of the chapter is locked

You have been reading a chapter from

Learn T-SQL Querying - Second Edition

Published in: Feb 2024Publisher: PacktISBN-13: 9781837638994

A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.

undefined

Unlock this book and the full library FREE for 7 days

Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of

Start free trial

Renews at $15.99/month. Cancel anytime

Authors (2)

Pedro Lopes

Pedro Lopes is a Program Manager in the Database Systems group, based in Redmond, WA, USA. He has over 19 years of industry experience and has been with Microsoft for 9 years. He is currently responsible for program management of Database Engine features for in-market and vNext versions of SQL Server, with a special focus on the Relational Engine. He has extensive experience with query performance troubleshooting and is a regular speaker at numerous conferences such as SQLBits, PASS Summit, SQLIntersection, Microsoft Ignite, and Microsoft Build. He blogs about SQL on the SQL Server Team blog. He has authored several tools in the Tiger toolbox on GitHub: AdaptiveIndexDefrag maintenance solution, BPCheck, and usp_WhatsUp.
Read more about Pedro Lopes

Pam Lahoud

Pam Lahoud is a Program Manager in the Database Systems group, based in Redmond, WA, USA. She has been with Microsoft for 13 years and is currently responsible for program management of Database Engine features for in-market and vNext versions of SQL Server, with a special focus on the Storage Engine area. She is passionate about SQL Server performance and has focused on performance tuning and optimization, particularly from the developer's perspective, throughout her career. She is a SQL Server 2008 Microsoft Certified Master (MCM) with over 20 years of experience working with SQL Server.
Read more about Pam Lahoud

Personalised recommendations for you

Based on your interests and search pattern

Et al.

Ever wonder why speech recognition systems don't understand the Scottish accent, or what would happen if an astronaut only ate mac 'n' cheese, or other spurious reflections you'd have at a bar? We did, then collated those deliberations into absurd research articles with fake figures and methodologies inspired by even more fictionally absurd studies.

BookAug 2023230 pages5

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages4

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages5

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages1

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages5

Mastering Tableau 2023

This book is a comprehensive resource to mastering your Tableau skills and becoming a BI expert. As you progress, you will learn how to build advanced dashboards and improve your storytelling to derive key business insight, as well as make you well-versed with advanced functionalities of Tableau in the business intelligence domain.

BookAug 2023684 pages

Building AI Applications with ChatGPT APIs

This guide covers all ChatGPT API features for effortless creation of robust AI powered apps. With its help, you’ll be able to leverage ChatGPT’s cutting-edge NLP models to take your app development skills to the next level. You’ll also work on ten exciting projects that will give you the practical know-how that you can apply to your existing applications.

BookSep 2023258 pages5

Building AI Applications with ChatGPT APIs

This guide covers all ChatGPT API features for effortless creation of robust AI powered apps. With its help, you’ll be able to leverage ChatGPT’s cutting-edge NLP models to take your app development skills to the next level. You’ll also work on ten exciting projects that will give you the practical know-how that you can apply to your existing applications.

BookSep 2023258 pages2

Data Engineering with AWS

Embark on a journey to master data engineering pipelines on AWS! Our book offers a hands-on experience of AWS services for ingesting, transforming, and consuming data. Whether you're an absolute beginner or someone with basic data engineering experience, this guide is an indispensable resource.

BookOct 2023636 pages5

Modern Data Architecture on AWS

Every organization wants an agile, performant, and cost-effective data platform that meets all their current and future business needs. Purpose-built AWS analytics services and their features play a big part in building such a modern data platform. This book brings to you all the design and architectural patterns that’ll help you achieve this goal.

BookAug 2023420 pages5

Practical Guide to Applied Conformal Prediction in Python

Discover the power of Conformal Prediction with the "Practical Guide to Applied Conformal Prediction in Python." Master the latest techniques to quantify uncertainty in machine learning and computer vision models, and seamlessly apply them to your industry applications.

BookDec 2023240 pages

TinyML Cookbook

With over 70 project-based recipes, the TinyML Cookbook is a practical guide that will help you to get the most out of your microcontrollers. It provides a comprehensive understanding of the theoretical foundations while giving you hands-on experience training ML models for deployment on Arduino Nano 33 BLE Sense, Raspberry Pi Pico, and SparkFun RedBoard Artemis Nano microcontrollers.

BookNov 2023664 pages