Fixing Bottlenecks for Better Database Access in ASP.Net

Matt Perdeck

October 2010


ASP.NET Site Performance Secrets

Simple and proven techniques to quickly speed up your ASP.NET website

  • Speed up your ASP.NET website by identifying performance bottlenecks that hold back your site's performance and fixing them
  • Tips and tricks for writing faster code and pinpointing those areas in the code that matter most, thus saving time and energy
  • Drastically reduce page load times
  • Configure and improve compression – the single most important way to improve your site's performance
  • Written in a simple problem-solving manner – with a practical hands-on approach and just the right amount of theory you need to make sense of it all



This article builds on the previous article, Pinpointing bottlenecks for better Database Access in ASP.Net.

Now that you have pinpointed the bottlenecks to prioritize, skip to the appropriate subsection to find out how to fix those bottlenecks.

Missing indexes

Just as using an index in a book to find a particular bit of information is often much faster than reading all pages, SQL Server indexes can make finding a particular row in a table dramatically faster by cutting down the number of read operations.

This section first discusses the two types of indexes supported by SQL Server: clustered and non-clustered. It also goes into included columns, a feature of nonclustered indexes. After that, we'll look at when to use each type of index.

Clustered index

Take the following table (missingindexes.sql in the downloaded code bundle):

CREATE TABLE [dbo].[Book](
[BookId] [int] IDENTITY(1,1) NOT NULL,
[Title] [nvarchar](50) NULL,
[Author] [nvarchar](50) NULL,
[Price] [decimal](4, 2) NULL)

Because this table has no clustered index, it is called a heap table. Its records are unordered, and to get all books with a given title, SQL Server has to read all the records.

Let's see how long it takes to locate a record in this table. That way, we can compare against the performance of a table with an index.

To do that in a meaningful way, first insert a million records into the table (code to do this is in missingindexes.sql in the downloaded code bundle).

Tell SQL Server to show I/O and timing details of each query we run:
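In T-SQL, that is done with the STATISTICS session settings; a minimal sketch, to be run in the same session as the test queries:

```sql
SET STATISTICS IO ON
SET STATISTICS TIME ON
```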


Also, before each query, flush the SQL Server memory cache:
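One way to do that is with the following commands. Only do this on a development machine, never in production, because it empties the buffer cache for the whole server:

```sql
-- Write dirty pages to disk, then drop the clean buffers
CHECKPOINT
DBCC DROPCLEANBUFFERS
```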


Now, run the query below with a million records in the Book table:

SELECT Title, Author, Price FROM dbo.Book WHERE BookId = 5000

The results on my machine are: reads: 9564, CPU time: 109 ms, elapsed time: 808 ms.

SQL Server stores all data in 8-KB pages. This shows that it read 9564 pages, that is, the entire table.

Now, add a clustered index:
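One way to do this is via a clustered primary key; a sketch, where the constraint name PK_Book is an assumption:

```sql
ALTER TABLE dbo.Book
ADD CONSTRAINT PK_Book PRIMARY KEY CLUSTERED (BookId ASC)
```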


This puts the index on column BookId, making WHERE and JOIN clauses on BookId faster. It sorts the table by BookId and adds a structure called a B-tree to speed up access.

BookId is now used the same way as a page number in a book. Because the pages in a book are sorted by page number, finding a page by page number is very fast.

Now, run the same query again to see the difference:

SELECT Title, Author, Price FROM dbo.Book WHERE BookId = 5000

The results are: reads: 2, CPU time: 0 ms, elapsed time: 32 ms.

The number of reads of 8-KB pages has gone from 9564 to 2, CPU time from 109 ms to less than 1 ms, and elapsed time from 808 ms to 32 ms. That's a dramatic improvement.

Non-clustered index

Now let's select by Title instead of BookId:

SELECT Title, Author FROM dbo.Book WHERE Title = 'Don Quixote'

The results are: reads: 9146, CPU time: 156 ms, elapsed time: 1653 ms. That is pretty similar to what we got with the heap table, which is no wonder, seeing that there is no index on Title.

The solution obviously is to put an index on Title. However, because a clustered index involves sorting the table records on the index field, there can be only one clustered index. We've already sorted on BookId, and the table can't be sorted on Title at the same time.

The solution is to create a non-clustered index. This is essentially a duplicate of the table records, this time sorted by Title. To save space, SQL Server leaves out the other columns, such as Author and Price. You can have up to 249 non-clustered indexes on a table.

Because we still want to access those other columns in queries though, we need a way to get from the non-clustered index records to the actual table records. The solution is to add the BookId to the non-clustered records. Because BookId has the clustered index, once we have found a BookId via the non-clustered index, we can use the clustered index to get to the actual table record. This second step is called a key lookup.

Why go through the clustered index? Why not put the physical address of the table record in the non-clustered index record? The answer is that when you update a table record, it may get bigger, causing SQL Server to move subsequent records to make space. If non-clustered indexes contained physical addresses, they would all have to be updated when this happens. It's a tradeoff between slightly slower reads and much slower updates. If there is no clustered index or if it is not unique, then non-clustered index records do have the physical address.

To see what a non-clustered index will do for us, first create it as follows:
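A minimal version of that statement; the index name IX_Title is an assumption:

```sql
CREATE NONCLUSTERED INDEX IX_Title
ON dbo.Book(Title ASC)
```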


Now, run the same query again:

SELECT Title, Author FROM dbo.Book WHERE Title = 'Don Quixote'

The results are: reads: 4, CPU time: 0 ms, elapsed time: 46 ms.

The number of reads has gone from 9146 to 4, CPU time from 156 ms to less than 1 ms, and elapsed time from 1653 ms to 46 ms. This means that having a non-clustered index is not quite as good as having a clustered index, but still dramatically better than having no index at all.

Included columns

You can squeeze a bit more performance out of a non-clustered index by cutting out the key lookup—the second step where SQL Server uses the clustered index to find the actual record.

Have another look at the test query: it simply returns Title and Author. Title is already present in the non-clustered index record. If you were to add Author to the non-clustered index record as well, there would no longer be any need for SQL Server to access the table record, enabling it to skip the key lookup.

This can be done by including Author in the non-clustered index, recreating the index in place (the index name IX_Title is an assumption):

CREATE NONCLUSTERED INDEX IX_Title
ON dbo.Book(Title ASC)
INCLUDE (Author)
WITH (DROP_EXISTING = ON)

Now, run the query again:

SELECT Title, Author FROM dbo.Book WHERE Title = 'Don Quixote'

The results are: reads: 2, CPU time: 0 ms, elapsed time: 26 ms.

The number of reads has gone from 4 to 2, and elapsed time from 46 ms to 26 ms; that's almost 50 percent improvement. In absolute terms, the gain isn't all that great, but for a query that is executed very frequently, this may be worthwhile. Don't overdo this—the bigger you make the non-clustered index records, the fewer fit on an 8KB page, forcing SQL Server to read more pages.

Selecting columns to give an index

Because indexes do create overhead, you want to carefully select the columns to give indexes. Before starting the selection process, keep the following in mind:

  • Putting a Primary Key on a column by default gives it a clustered index (unless you override the default). So, you may already have many columns in your database with an index. As you'll see later in the When to use a clustered index section, putting the clustered index on the ID column of a record is almost always a good idea.
  • Putting an index on a table column affects all queries that use that table. Don't focus on just one query.
  • Before introducing an index on your live database, test the index in development to make sure it really does improve performance.

Let's look at when and when not to use an index, and when to use a clustered index.

When to use an index

You can follow this decision process when selecting columns to give an index:

  • Start by looking at the most expensive queries.
  • Look at putting an index on at least one column involved in every JOIN.
  • Consider columns used in ORDER BY and GROUP BY clauses. If there is an index on such a column, then SQL Server doesn't have to sort the column again, because the index already keeps the column values in sorted order.
  • Consider columns used in WHERE clauses, especially if the WHERE will select a small number of records. However, keep in mind the following:
    • A WHERE clause that applies a function to the column value can't use an index on that column, because the output of the function is not in the index. Take for example the following:

      SELECT Title, Author FROM dbo.Book
      WHERE LEFT(Title, 3) = 'Don'

      Putting an index on the Title column won't make this query any faster.

    • Likewise, SQL Server can't use an index if you use LIKE in a WHERE clause with a wild card at the start of the search string, as in the following:

      SELECT Title, Author FROM dbo.Book
      WHERE Title LIKE '%Quixote'

      However, if the search string starts with constant text instead of a wild card, an index can be used:

      SELECT Title, Author FROM dbo.Book
      WHERE Title LIKE 'Don%'

  • Consider columns that have a UNIQUE constraint. Having an index on the column makes it easier for SQL Server to check whether a new value is unique.
  • The MIN and MAX functions benefit from working on a column with an index. Because the values are sorted, there is no need to go through the entire table to find the minimum or maximum.
  • Think twice before putting an index on a column that takes a lot of space. If you use a non-clustered index, the column values will be duplicated in the index. If you use a clustered index, the column values will be used in all nonclustered indexes. The increased sizes of the index records means fewer fit in each 8-KB page, forcing SQL Server to read more pages. The same applies to including columns in non-clustered indexes.

When not to use an index

Having too many indexes can actually hurt performance. Here are the main reasons not to use an index on a column:

  • The column gets updated often
  • The column has low specificity, meaning it has lots of duplicate values

Let's look at each reason in turn.

Column updated often

When you update a column without an index, SQL Server needs to write one 8KB page to disk, provided there are no page splits.

However, if the column has a non-clustered index, or if it is included in a nonclustered index, SQL Server needs to update the index as well, so it has to write at least one additional page to disk. It also has to update the B-tree structure used in the index, potentially leading to more page writes.

If you update a column with a clustered index, the non-clustered index records that use the old value need to be updated too, because the clustered index key is used in the non-clustered indexes to navigate to the actual table records. Secondly, remember that the table records themselves are sorted based on the clustered index. If the update causes the sort order of a record to change, that may mean more writes. Finally, the clustered index needs to keep its B-tree up-to-date.

This doesn't mean you cannot have indexes on columns that get updated; just be aware that indexes slow down updates. Test the effect of any indexes you add.

If an index is critical but rarely used, for example only for overnight report generation, consider dropping the index and recreating it when it is needed.
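As a sketch, assuming a non-clustered index named IX_Title that only the overnight reports need:

```sql
-- Before the overnight run: create the index the reports need
CREATE NONCLUSTERED INDEX IX_Title ON dbo.Book(Title)

-- ... run the reports ...

-- Afterwards: drop it again so daytime updates stay fast
DROP INDEX IX_Title ON dbo.Book
```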

Low specificity

Even if there is an index on a column, the query optimizer won't always use it. Remember, each time SQL Server accesses a record via an index, it has to go through the index structure. In the case of a non-clustered index, it may have to do a key lookup as well. If you're selecting all books with price $20, and lots of books happen to have that price, then it might be quicker to simply read all book records rather than going through an index over and over again. In that case, it is said that the $20 price has low specificity.

You can use a simple query to determine the average selectivity of the values in a column. For example, to find the average selectivity of the Price column in the Book table, use (missingindexes.sql in downloaded code bundle):

SELECT
COUNT(DISTINCT Price) AS 'Unique prices',
COUNT(*) AS 'Number of rows',
CAST(100 * COUNT(DISTINCT Price) / CAST(COUNT(*) AS real)
AS nvarchar(10)) + '%' AS 'Selectivity'
FROM dbo.Book

If every book has a unique price, selectivity will be 100 percent. However, if half the books cost $20 and the other half $30, then average selectivity will be only 50 percent. If the selectivity is 85 percent or less, an index is likely to incur more overhead than it would save.

Some prices may occur a lot more often than other prices. To see the specificity of each individual price, you would run (missingindexes.sql in downloaded code bundle):

DECLARE @c real
SELECT @c = CAST(COUNT(*) AS real) FROM dbo.Book
SELECT
Price,
COUNT(BookId) AS 'Number of rows',
CAST(100 * (1 - COUNT(BookId) / @c)
AS nvarchar(20)) + '%' AS 'Specificity'
FROM dbo.Book
GROUP BY Price
ORDER BY COUNT(BookId) DESC

The query optimizer is unlikely to use a non-clustered index for a price whose specificity is below 85 percent. It figures out the specificity of each price by keeping statistics on the values in the table.

When to use a clustered index

You saw that there are two types of indexes, clustered and non-clustered, and that you can have only one clustered index. How do you determine the lucky column that will have the clustered index?

To work this out, let's first look at the characteristics of a clustered index against a non-clustered index:

  • Reading: Faster, because there is no need for key lookups. No difference if all the required columns are included in the non-clustered index.
  • Updating: Slower. Not only the table record, but also all non-clustered index records potentially need to be updated.
  • Inserting/Deleting: Faster. With a non-clustered index, inserting a new record in the table means inserting a new record in the non-clustered index as well. With a clustered index, the table is effectively part of the index, so there is no need for the second insert. The same goes for deleting a record. On the other hand, when a record is inserted anywhere but at the very end of the table, the insert may cause a page split, where half the content of the 8-KB page is moved to another page. A page split in a non-clustered index is less likely, because its records are smaller (they normally don't have all the columns that a table record has), so more of them fit on a page. When the record is inserted at the end of the table, there won't be a page split.
  • Column size: Needs to be kept short and fast. Every non-clustered index contains the clustered index value, to do the key lookup. Every access via a non-clustered index has to use that value, so you want it to be fast for the server to process. That makes a column of type int a lot better to put a clustered index on than a column of type nvarchar(50).


If only one column requires an index, this comparison shows that you'll probably want to give it the clustered index rather than a non-clustered index.

If multiple columns need indexes, you'll probably want to put the clustered index on the primary key column:

  • Reading: The primary key tends to be involved in a lot of JOIN clauses, making read performance important.
  • Updating: The primary key should never or rarely get updated, because that would mean changing referring foreign keys as well.
  • Inserting/Deleting: Most often you'll make the primary key an IDENTITY column, so each new record is assigned a unique, ever increasing number. This means that if you put the clustered index on the primary key, new records are always added at the end of the table. When a record is added at the end of a table with a clustered index and there is no space in the current page, the new record goes into a new page but the rest of the data in the current page stays in the page. In other words, there is no expensive page split.
  • Size: Most often, the primary key is of type int, which is short and fast.

Indeed, when you set the primary key on a column in the SSMS table designer, SSMS gives that column the clustered index by default, unless another column already has the clustered index.


Maintaining indexes

Do the following to keep your indexes working efficiently:

  • Defragment indexes: Repeated updates cause indexes and tables to become fragmented, decreasing performance.
  • Keep statistics updated: SQL Server maintains statistics to figure out whether to use an index for a given query. These statistics are normally kept up-to-date automatically, but this can be switched off. If you did, make sure statistics are kept up-to-date.
  • Remove unused indexes: As you saw, indexes speed up read access, but slow down updates.
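The corresponding maintenance commands look like this sketch (the index name IX_Title is an assumption; REORGANIZE suits light fragmentation, REBUILD heavy fragmentation):

```sql
-- Defragment an index
ALTER INDEX IX_Title ON dbo.Book REORGANIZE
-- or, for heavy fragmentation:
ALTER INDEX IX_Title ON dbo.Book REBUILD

-- Refresh statistics for a table
UPDATE STATISTICS dbo.Book
```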

Expensive queries

It makes sense to try and optimize those queries that are most expensive because they are used heavily, or because each single execution is just plain expensive. You already saw how to identify and create missing indexes. Here are some more ways to optimize your queries and stored procedures.

Cache aggregation queries

Aggregation statements such as COUNT and AVG are expensive, because they need to access lots of records. If you need aggregated data for a web page, consider caching the aggregation results in a table instead of regenerating them for each web page request. Provided you read the aggregates more often than you update the underlying columns, this will reduce your response time and CPU usage. For example, this code stores a COUNT aggregate in a table Aggregates:

DECLARE @n int
SELECT @n = COUNT(*) FROM dbo.Book
UPDATE Aggregates SET BookCount = @n

You could update the aggregations whenever the underlying data changes, using a trigger or as part of the stored procedure that makes the update. You could also recalculate the aggregations periodically with a SQL Server job.
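A trigger-based version might look like this sketch; the trigger name and the layout of the Aggregates table are assumptions:

```sql
CREATE TRIGGER dbo.trg_Book_UpdateAggregates
ON dbo.Book
AFTER INSERT, DELETE
AS
BEGIN
    SET NOCOUNT ON
    -- Recalculate the cached count whenever books are added or removed
    UPDATE Aggregates SET BookCount = (SELECT COUNT(*) FROM dbo.Book)
END
```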

Keeping records short

Reducing the amount of space taken per table record speeds up access. Records are stored in 8-KB pages on disk. The more records fit on a page, the fewer pages SQL Server needs to read to retrieve a given set of records.

Here are ways to keep your records short:

  • Use short data types. If your values fit in a one-byte TinyInt, don't use a four-byte Int. If you store simple ASCII characters, use varchar(n), which uses one byte per character, instead of nvarchar(n), which uses two. If you store strings of fixed length, use char(n) or nchar(n) instead of varchar(n) or nvarchar(n), saving the two-byte length field.
  • Consider storing large, rarely-used columns off-row. Large object fields such as nvarchar(max), varchar(max), varbinary(max), and XML fields are normally stored in-row if smaller than 8,000 bytes, and replaced by a 16-byte pointer to an off-row area if larger than 8,000 bytes. Storing off-row means that accessing the field takes at least two reads instead of one, but also makes for a much shorter record, which may be desirable if the field is rarely accessed. To force large object fields in a table to be always off-row, use:
    EXEC sp_tableoption 'mytable', 'large value types out of row', '1'
  • Consider vertical partitioning. If some columns in a table are much more frequently accessed than others, put the rarely-accessed columns in a separate table. Access to the frequently-used columns will be faster, at the expense of having to JOIN to the second table when it does get used.
  • Avoid repeating columns. For example, don't do the following:
    AuthorId | Author | Country | Book Title 1 | Book Title 2
    1 | Charles Dickens | United Kingdom | Oliver Twist | The Pickwick Papers
    2 | Herman Melville | United States | Moby-Dick |
    3 | Leo Tolstoy | Russia | Anna Karenina | War and Peace

    This solution not only creates long records, but also makes it hard to update book titles, and makes it impossible to have more than two titles per author. Instead, store the book titles in a separate Book table, and give each book an AuthorId column that refers back to its author.

  • Avoid duplicate values. For example, don't do the following:
    BookId | Book Title | Author | Country
    1 | Oliver Twist | Charles Dickens | United Kingdom
    2 | The Pickwick Papers | Charles Dickens | United Kingdom
    3 | Moby-Dick | Herman Melville | United States
    4 | Anna Karenina | Leo Tolstoy | Russia
    5 | War and Peace | Leo Tolstoy | Russia

    Here the author's name and country are duplicated for each of their books. In addition to resulting in long records, updating author details now requires multiple record updates and an increased risk of inconsistencies. Store authors and books in separate tables, and have the book records refer back to their Author records.

Considering denormalization

Denormalization is essentially the reverse of the last two points in the previous section—avoid repeating columns and avoid duplicate values.

The issue is that while these recommendations improve update speed, consistency, and record sizes, they also spread data across tables, meaning more JOINs.

For example, say you have 100 addresses spread over 50 cities, with the cities stored in a separate table. This will shorten the address records and make updating a city name easier, but also means having to do a JOIN each time you retrieve an address. If a city name is unlikely to change and you always retrieve the city along with the rest of the address, then you may be better off including the city name in the address record itself. This solution implies having repeated content (the city name), but on the other hand, you'll have one less JOIN.

Being careful with triggers

Triggers can be very convenient, and great for data integrity. On the other hand, they tend to be hidden from view of developers, so they may not realize that an additional INSERT, UPDATE, or DELETE carries the overhead of a trigger.

Keep your triggers short. They run inside the transaction that caused them to fire, so locks held by that transaction continue to be held while the trigger runs. Remember that even if you do not explicitly create a transaction using BEGIN TRAN, each individual INSERT, UPDATE, or DELETE creates its own transaction for the duration of the operation.

When deciding what indexes to use, don't forget to look at your triggers as well as your stored procedures and functions.

Using table variables for small temporary result sets

Consider replacing temporary tables in your stored procedures with table variables.

For example, instead of writing the following:

CREATE TABLE #temp (Id INT, Name nvarchar(100))

You would write the following instead:

DECLARE @temp TABLE(Id INT, Name nvarchar(100))

Table variables have these advantages over temporary tables:

  • SQL Server is more likely to store them in memory rather than tempdb. That means less traffic and locking in tempdb.
  • No transaction log overhead.
  • Fewer stored procedure recompilations.

However, there are disadvantages as well:

  • You can't add indexes or constraints to a table variable after it has been created. If you need an index, it needs to be created as part of the DECLARE statement, for example via a PRIMARY KEY or UNIQUE constraint:
    DECLARE @temp TABLE(Id INT PRIMARY KEY, Name nvarchar(100))
  • They are less efficient than temporary tables when they have more than about 100 rows, because no statistics are created for a table variable. This makes it more difficult for the query optimizer to come up with an optimal execution plan.

Using full-text search instead of LIKE

You may be using LIKE to search for substrings in text columns, as shown:

SELECT Title, Author FROM dbo.Book WHERE Title LIKE '%Quixote%'

However, unless the search pattern starts with constant text, SQL Server will not be able to use any index on the column, and so will do a full-table scan instead. Not good.

To improve this situation, consider using SQL Server's Full Text Search feature. This automatically creates an index for all words in the text column, leading to much faster searches. To see how to use Full Text Search, visit:

Replacing cursors with set-based code

If you use cursors, consider replacing them with set-based code. Performance improvements of a 1000 times are not uncommon. Set-based code uses internal algorithms that are much better optimized than you could ever hope to achieve with a cursor.
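As an illustration, here is a hypothetical price update done first with a cursor and then as a single set-based statement; the discount scenario is an assumption, not from the book:

```sql
-- Cursor version: processes one row at a time (slow)
DECLARE @id int
DECLARE bookCursor CURSOR FOR
    SELECT BookId FROM dbo.Book WHERE Price > 20
OPEN bookCursor
FETCH NEXT FROM bookCursor INTO @id
WHILE @@FETCH_STATUS = 0
BEGIN
    UPDATE dbo.Book SET Price = Price * 0.9 WHERE BookId = @id
    FETCH NEXT FROM bookCursor INTO @id
END
CLOSE bookCursor
DEALLOCATE bookCursor

-- Set-based version: one statement, same result (fast)
UPDATE dbo.Book SET Price = Price * 0.9 WHERE Price > 20
```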

For more information about converting cursors to set based code, visit:

Minimizing traffic from SQL server to web server

Do not use SELECT *. This will return all columns. Instead, only list the specific columns you actually need.

If the website needs only part of a long text value, only send that part, not the entire value. For example:

SELECT LEFT(longtext, 100) AS excerpt FROM dbo.Articles WHERE ...

Object naming

Do not start stored procedure names with sp_. SQL Server assumes stored procedure names starting with sp_ belong to system-stored procedures, and always looks in the master database first to find them even when you prefix the name with your database name.

Prefix object names with the schema owner. This saves SQL Server time identifying objects, and improves execution plan reusability. For example, use the following:

SELECT Title, Author FROM dbo.Book

Instead of the following:

SELECT Title, Author FROM Book


Always include the command SET NOCOUNT ON at the start of stored procedures and triggers. This prevents SQL Server from sending the number of rows affected after execution of every SQL statement.
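For example, a sketch of a stored procedure with SET NOCOUNT ON at the start (the procedure name is an assumption):

```sql
CREATE PROCEDURE dbo.GetBook
    @BookId int
AS
BEGIN
    -- Suppress the "N rows affected" messages sent back to the client
    SET NOCOUNT ON
    SELECT Title, Author FROM dbo.Book WHERE BookId = @BookId
END
```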

Use FILESTREAM for values over 1 MB

Store BLOBs over 1 MB in size in a FILESTREAM column. This stores the objects directly on the NTFS file system instead of in the database data file. To see how to make this work, visit:

Avoid functions on columns in WHERE clauses

Using a function on a column in a WHERE clause prevents SQL Server from using an index on that column.

Take for example the following query:

SELECT Title, Author FROM dbo.Book WHERE LEFT(Title, 1) = 'D'

SQL Server doesn't know what values the LEFT function returns, so has no choice but to scan the entire table, executing LEFT for each column value.

However, it does know how to interpret LIKE. Rewrite the query to:

SELECT Title, Author FROM dbo.Book WHERE Title LIKE 'D%'

SQL Server can now use an index on Title, because the LIKE string starts with constant text.

Use UNION ALL instead of UNION

The UNION clause combines the results of two SELECT statements, removing duplicates from the final result. This is expensive; it uses a work table and executes a DISTINCT select to provide this functionality.

If you don't mind duplicates, or if you know there will be no duplicates, use UNION ALL instead. This simply concatenates the SELECT results together.

If the optimizer determines there will be no duplicates, it chooses UNION ALL even if you write UNION. For example, the select statements in the following query will never return overlapping records, and so the optimizer will replace the UNION clause with UNION ALL:

SELECT BookId, Title, Author FROM dbo.Book WHERE Author LIKE 'J%'
UNION
SELECT BookId, Title, Author FROM dbo.Book WHERE Author LIKE 'M%'

Use EXISTS instead of COUNT to find existence of records

If you need to establish whether there are records in a result set, don't use COUNT:

DECLARE @n int
SELECT @n = COUNT(*) FROM dbo.Book
IF @n > 0
print 'Records found'

This reads the entire table to find the number of records. Instead, use EXISTS:

IF EXISTS(SELECT * FROM dbo.Book)
print 'Records found'

This allows SQL Server to stop reading the moment it finds a record.


Combining SELECT and UPDATE

Sometimes, you need to SELECT and UPDATE the same record. For example, you may need to update a "LastAccessed" column whenever you retrieve a record. You can do this with a SELECT and an UPDATE:

UPDATE dbo.Book
SET LastAccess = GETDATE()
WHERE BookId=@BookId

SELECT Title, Author
FROM dbo.Book
WHERE BookId=@BookId

However, you can combine the SELECT into the UPDATE, as follows:

DECLARE @title nvarchar(50)
DECLARE @author nvarchar(50)

UPDATE dbo.Book
SET LastAccess = GETDATE(),
@title = Title,
@author = Author
WHERE BookId=@BookId

SELECT @title, @author

That saves you some elapsed time, and it reduces the time that locks are held on the record.


Summary

In this article, we dug into fixing the bottlenecks we found. We discussed the two types of indexes, clustered and non-clustered, how they work, and when to use them. We also looked at the advantages and disadvantages of reducing the size of table records, and at other ways to speed up queries, such as caching the results of aggregation queries.

In the next article, ASP.Net Site Performance: Speeding up Database Access, we will take a look at the other ways for fixing bottlenecks.

You've been reading an excerpt of:

ASP.NET Site Performance Secrets
