
What’s New in SQL Server 2022 Release Candidate 0: Undocumented Stuff


Microsoft has an official list of what’s new in 2022 overall, but here I’m specifically focusing on system objects that might be interesting to script developers like you and me, dear reader.

New stored procedure sp_get_table_card_est_and_avg_col_len – I assume the “card” refers to statistics and cardinality, not Hallmark. SQL Server has historically struggled with memory grants because it uses datatype size to budget memory grants, and bigger-than-necessary data types (like NVARCHAR(MAX)) have led to larger-than-necessary grants. It’d be cool if this proc was a down payment to mitigate that problem, but I have a sinking feeling it has to do with external data sources. I would tell you more about it, but when I run it, I get:
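In the meantime, since the proc is undocumented, you can at least list whatever parameters the metadata reports for it before trying to call it:

```sql
-- List the undocumented proc's parameters straight from the metadata:
SELECT p.parameter_id,
       p.name,
       TYPE_NAME(p.user_type_id) AS data_type
FROM sys.all_parameters AS p
WHERE p.object_id = OBJECT_ID('sys.sp_get_table_card_est_and_avg_col_len')
ORDER BY p.parameter_id;
```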

New view sys.dm_tran_orphaned_distributed_transactions – every now and then, I’ve run across Availability Groups replicas with problems due to orphaned distributed transactions consuming DTC resources or holding locks. This new undocumented DMV might be a down payment to resolve that problem. I don’t have an easy way to reproduce the problem quickly, so I can’t demo it.
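If you want to check your own replicas, the columns are undocumented and may change between builds, so start broad:

```sql
-- On a healthy server this should come back empty:
SELECT *
FROM sys.dm_tran_orphaned_distributed_transactions;
```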

New view sys.database_automatic_tuning_configurations – this one’s a little odd because Books Online tells me it’s been around since SQL Server 2017, but I don’t remember seeing it before, and it’s not in my 2019 test instances. It tells you whether Force_Last_Good_Plan is on, and I’d imagine that as more automatic tuning options come out over the next several releases, this view will pick up more info.
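Checking it is a one-liner per database:

```sql
-- Shows automatic tuning settings (like Force_Last_Good_Plan)
-- for the current database:
SELECT *
FROM sys.database_automatic_tuning_configurations;
```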

New Query Store DMV columns – now that Query Store is starting to work on read-only replicas, it looks like they added replica_group_id columns to plan_persist_plan_feedback and plan_persist_query_hints to support that. Plus, plan_persist_plan_forcing_locations gets columns for timestamp and plan_forcing_flags.

New spinlock troubleshooting – sys.dm_os_workers gets columns for spinlock_wait_time_ms, spinlock_max_wait_time_ms, and spinlock_wait_count.
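A quick way to see which workers have been hit hardest, sorted by the new columns:

```sql
-- Top workers by cumulative spinlock wait time:
SELECT TOP (10)
       worker_address,
       spinlock_wait_count,
       spinlock_wait_time_ms,
       spinlock_max_wait_time_ms
FROM sys.dm_os_workers
ORDER BY spinlock_wait_time_ms DESC;
```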

New stuff to support offloaded compression

This stuff needs its own section. RC0 introduced the ability to offload compression to Intel processors equipped with QuickAssist.

We get new sp_configure options for ‘hardware offload mode’ and ‘backup compression algorithm’. By default, these are off. To turn on offloaded compression, install the Intel QAT drivers, then do an alter:
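A sketch of what that alter looks like – the sp_configure option name is from RC0, and I’m assuming the ALTER SERVER CONFIGURATION syntax matches what later builds document:

```sql
-- Enable the advanced option, then point SQL Server at the QAT accelerator:
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'hardware offload mode', 1;
RECONFIGURE;
ALTER SERVER CONFIGURATION SET HARDWARE_OFFLOAD = ON (ACCELERATOR = QAT);
```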

Which returns:

After restarting the SQL Server, check this brand spankin’ new DMV:
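Assuming the DMV name survives to RTM, that check looks like:

```sql
-- Reports whether the accelerator is present and in hardware or software mode:
SELECT *
FROM sys.dm_server_accelerator_status;
```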

And, uh, on my VM, it’s still not enabled:

Because you can enable it even on processors that don’t support it, which strikes me as kinda odd. I suppose you would want to make it part of your standard build, and then whenever it’s available, it’ll get used, assuming you call for offloaded backup compression in the right way.
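That “right way,” assuming the WITH COMPRESSION algorithm syntax holds up in later builds, looks something like this (database name and backup path are hypothetical):

```sql
-- Request the QAT deflate algorithm explicitly; if the hardware
-- isn't present, SQL Server can fall back to software mode:
BACKUP DATABASE StackOverflow
TO DISK = N'Z:\Backups\StackOverflow.bak'
WITH COMPRESSION (ALGORITHM = QAT_DEFLATE);
```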

New messages in RC0

In each release, I check sys.messages for new stuff. Some of it gets added for the cloud, like Azure SQL DB or Managed Instances, so take these with a grain of salt. Here’s what’s new in RC0 since the last CTP:

  • 1136: The tempdb has reached its storage service limit. The storage usage of the tempdb on the current tier cannot exceed (%d) MBs.
  • 5373: All the input parameters should be of the same type. Supported types are tinyint, smallint, int, bigint, decimal and numeric.
  • 5374: WITH clause is not supported for locations with ‘%ls’ connector when specified FORMAT is ‘%ls’.
  • 16722: Cannot change service objective for %ls to %ls as long-term retention is not supported yet on Hyperscale. Please disable long-term retention on the database and retry
  • 17414: Retrieving the address of an exported function %.*ls in accelerator library %.*ls failed with error 0x%x.
  • 17415: %.*ls component enumeration failed with zero component count.
  • 17416: %.*ls component enumeration failed with mismatch in component count.
  • 17417: %.*ls %.*ls not compatible with SQL Server.
  • 17418: Detected %.*ls %.*ls.
  • 17419: %.*ls hardware detected on the system.
  • 17420: %.*ls hardware not found on the system.
  • 17431: %.*ls initialization failed with error %d.
  • 17432: %.*ls initialization succeeded.
  • 17433: %.*ls session creation failed with error %d.
  • 17434: %.*ls session sucessfully created.
  • 17435: %.*ls will be used in hardware mode.
  • 17436: This edition of SQL Server supports only software mode. %.*ls will be used in software mode.
  • 17437: %.*ls will be used in software mode.
  • 17438: %.*ls session alive check failed with error %d.
  • 17439: %.*ls session tear down failed with error %d.
  • 17440: %.*ls session close failed with error %d.
  • 17441: This operation requires %.*ls libraries to be loaded.
  • 19713: Statistics on virtual column are not avalable.
  • 19714: Number of columns in PARTITION clause does not match number of partition columns in Delta schema.
  • 21093: Only members of the sysadmin fixed server role or db_owner fixed database role or user with control db permission can perform this operation. Contact an administrator with sufficient permissions to perform this operation.
  • 22786: Synapse workspace FQDN is not in the list of Outbound Firewall Rules on the server. Please add this to the list of Outbound Firewall Rules on your server and retry the operation.
  • 22787: Change feed table group limit of %d groups exceeded
  • 22788: Could not enable Change Feed for database ‘%s’. Change Feed can not be enabled on a DB with delayed durability set.
  • 25755: Could not create live session target because live session targets are disabled.
  • 31633: The length of the provided %ls exceeds the maximum allowed length of %u bytes.
  • 31634: The %ls must contain a ‘%ls’ for use with managed identity.
  • 31635: The %ls’s ‘%ls’ value must be a %ls for use with managed identity.
  • 31636: Error retrieving the managed identity access token for the resource id ‘%ls’
  • 33547: Enclave comparator cache failed to initialize during enclave load.
  • 39057: The value provided for the ‘%.*ls’ parameter is too large.
  • 39058: The parameter ‘%.*ls’ has a type that is not supported.
  • 45770: Failed to move the database into elastic pool due to internal resource constraints. This may be a transient condition, please retry.
  • 46552: Writing into an external table is disabled. See ‘https://go.microsoft.com/fwlink/?linkid=2201073’ for more information.
  • 46553: Create External Table as Select is disabled. See sp_configure ‘allow polybase export’ option to enable.
  • 46953: Pass through authorization using S3 temporary credentials is not supported. Please use S3 credentials to access storage.
  • 47507: Adding memory optimized files to the database replicated to Azure SQL Managed Instance is not supported because its service tier does not support In-memory OLTP capabilities. Consider replicating database to managed instance service tier supporting In-memory OLTP capabilities.
  • 47508: Adding multiple log files to the database replicated to Azure SQL Managed Instance is not supported because managed instance does not support multiple log files.
  • 47509: Adding FileStream or FileTables to the database replicated to Azure SQL Managed Instance is not supported because managed instance does not support FileStream or FileTables.
  • 47510: Adding multiple memory optimized files to the database replicated to Azure SQL Managed Instance is not supported because managed instance does not support multiple memory optimized files.

If any of those messages are interesting to you, feel free to leave a comment about it.

New database-scoped configuration options

These are all new since SQL Server 2019 – some were introduced in prior CTPs, but I’m mentioning them all here because there’s good stuff in here for query tuners:

  • 25 – PAUSED_RESUMABLE_INDEX_ABORT_DURATION_MINUTES
  • 26 – DW_COMPATIBILITY_LEVEL
  • 27 – EXEC_QUERY_STATS_FOR_SCALAR_FUNCTIONS
  • 28 – PARAMETER_SENSITIVE_PLAN_OPTIMIZATION
  • 29 – ASYNC_STATS_UPDATE_WAIT_AT_LOW_PRIORITY
  • 31 – CE_FEEDBACK
  • 33 – MEMORY_GRANT_FEEDBACK_PERSISTENCE
  • 34 – MEMORY_GRANT_FEEDBACK_PERCENTILE_GRANT
  • 35 – OPTIMIZED_PLAN_FORCING
  • 37 – DOP_FEEDBACK
  • 38 – LEDGER_DIGEST_STORAGE_ENDPOINT
  • 39 – FORCE_SHOWPLAN_RUNTIME_PARAMETER_COLLECTION

That last one’s particularly interesting to me because SQL Server 2019 originally shipped in a way that you could see runtime parameters in sys.dm_exec_query_statistics_xml, and then they turned it off around CU11-12 without documenting the changed behavior. That was a total bummer, because that feature was a lifesaver for troubleshooting parameter sniffing. I’m hoping we can get that back again.
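Once you’re on a build that supports it, flipping that last one on is a one-liner, and you can verify the setting afterward:

```sql
-- Turn on runtime parameter collection for showplan in this database:
ALTER DATABASE SCOPED CONFIGURATION
    SET FORCE_SHOWPLAN_RUNTIME_PARAMETER_COLLECTION = ON;

-- Verify the current value:
SELECT name, value
FROM sys.database_scoped_configurations
WHERE name = N'FORCE_SHOWPLAN_RUNTIME_PARAMETER_COLLECTION';
```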


How to Install SQL Server 2022 Release Candidate 0


SQL Server 2014’s end of support date is coming in less than two years: July 9, 2024. I know that sounds far off right now, but consider the work you have to do between now and then:

  • Pick a new version to migrate to
  • Set up development SQL Servers for that new version
  • Start developing & testing with it
  • Get users to sign off that it’s working as expected
  • Design your high availability & disaster recovery strategies for it (something I recommend folks revisit with each new release)
  • Set up production/HA/DR SQL Servers
  • Migrate over to the new servers

With that in mind, if you’re still running SQL Server 2014 today in production, now is the time to start moving your development servers over to the newer SQL Server version. I’ve got a post on which SQL Server version you should use, but if you’re just now starting a 2014 replacement plan, then you’re the kind of shop that doesn’t upgrade very often. (Which is cool! Change = risk.) In that case, I’d think about using SQL Server 2022 to get the longest lifespan you can out of your SQL Servers.

The first release candidate for Microsoft SQL Server 2022 is out, so let’s see what’s involved with installing it.

Start with a fresh, empty Windows VM for testing. Never install test/evaluation bits on your laptop or an existing SQL Server – it’s a recipe for trouble. Pre-release bits (and even release bits!) can cause side-by-side existence problems that you don’t want to have to waste time troubleshooting.

When the Windows VM is ready, download the installer here. When you run it, the Installation Center opens:

SQL Server Installation Center

Click the Installation tab at left:

And for a standalone evaluation/development server, click the first line for a new standalone installation. (Most of the rest are just hyperlinks to go download other things that aren’t included in the SQL Server installer.)

The installer launches, and it’s time to choose your character:

Your choices:

  • Evaluation Edition – just like Enterprise, but it times out after 180 days.
  • Developer Edition – just like Enterprise, no time bomb, but isn’t allowed to be used for production purposes. Frankly, you’d be suicidal to use Evaluation Edition for production purposes too, since it has that time bomb. You wanna choose Developer because your evaluation period is probably going to extend beyond 180 days, and you don’t wanna have to worry about time bombs.
  • Express Edition – only for tiny databases.

Choose your character – I’m going with Developer – and then hit next.

Let’s not pretend you’re going to read that, but I’ll point out one amusing section:

The $5 limit is because starting with SQL Server 2022, Microsoft is using Fiverr for development, testing, and support. The most they can get back from Fiverr for a refund is the $5 they spent on the contractor, so they gotta cap their losses. Otherwise, they’d go bankrupt from KB4538581 alone.

Accept the terms, and hit Next, and the installer will do some basic validation to make sure the VM is in a safe place:

In my case, the Windows Firewall warning is because Windows Firewall is enabled. If you click the Warning link, you’ll get instructions to set up an exclusion rule to let SQL Server traffic pass through the firewall. If you plan to test apps hitting this server, make a note that you’ll need to change the Windows Firewall settings later – but you can ignore this for now during setup, and circle back to it later.

When you hit Next, you get the first new screen:

As strange as it may seem, Microsoft is on a mission to get even more money out of you than just the SQL Server licensing costs. They want you to pay ongoing fees to use Azure services to manage your SQL Servers, regardless of where they live. If you want to burn money, buy candles.

Me personally, I can’t afford that and it’s irrelevant to my testing, so I’m going to uncheck that and click Next. Now, it’s time to pick features.

Install as few features as practical. Every box you check here has a performance cost. Many of them will consume CPU and memory even when they’re not being actively used.

The Features list in that screenshot looks like it’s cut off, like there’s more stuff below “Redistributable Features”. In RC0, there’s nothing else to see – there’s nothing below that. It’s a bug. What, you thought the product was ready? Buckle up, bud – it’s the first of many.

In that feature list, the only thing I’m going to check is Database Engine Services. I’m not a fan of SQL Server’s full text search, although I understand why people use it. Wanna check any other box on a production SQL Server? You should pause for a moment and reconsider the life choices that brought you here. Wanna check them just because it’s a dev box? Bad idea: if you make something available in development, then developers will come to rely on it, and then they’ll build code atop it, and need it in production. In terms of features, treat your dev servers like production.

After checking the Database Engine Services box, the Instance Root Directory option will light up:

You can leave that one on the C drive – that’s fine. We’ll set the data & backup locations in subsequent steps. Click Next.

You can install multiple instances of SQL Server on the same Windows VM. I’m actually going to choose Named Instance because I need this for something else I’m doing, but you should leave it on Default.

When you click Next, there will be a significant pause before the next screen appears:

If you’re installing just for the purposes of testing 2022, you can stick with these service account names. Check the box for Grant Perform Volume Maintenance Tasks – that’s Instant File Initialization, which helps restore databases much more quickly. That’s important when you’re doing testing on new versions.

If your application needs case sensitivity or a different collation, click the Collation tab:

If you’re testing to replace an existing server, connect to that server with SQL Server Management Studio. Right-click on the server name, click Properties, and the existing server’s collation will be listed on the properties screen:
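If you’d rather script it than click around, this works on any supported version:

```sql
-- Returns the server-level collation, e.g. SQL_Latin1_General_CP1_CI_AS:
SELECT SERVERPROPERTY('Collation') AS server_collation;
```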

Back on your SQL Server 2022 setup, choose the collation that matches your existing production server, and click Next.

On the Server Configuration tab, the first decision you have to make is whether to allow SQL authentication. Over time, trends have gone back & forth as to whether this was a good idea. We were trying to stamp it out for a while, but then Azure SQL DB came out and it didn’t support Windows auth, so Microsoft had to backtrack the idea that SQL authentication was bad. These days, Azure SQL DB supports Windows auth (although it’s often a pain in the rear), so “experts” are all over the place as to whether or not you should have it on.

Most of the apps I interact with require it, so I’m going to set Mixed Mode, and then set a strong password for the SA account.

You have to take action in the “Specify SQL Server administrators” box. If you don’t, nobody’s going to be an admin on the instance, and you’re going to be in a world of hurt when you go to actually use this server. At bare minimum, add yourself by clicking the “Add Current User” button, which will take a few seconds. For real production purposes, you’ll want to use an Active Directory group consisting of database administrators.

Click the Data Directories tab, and you’ve got some work to do:

You have to take action here too. If you leave the “Data root directory” set to the OS boot drive, then sooner or later someone’s going to create or restore a database, pour a lot of data into it, and run the OS boot drive out of space. Choose a different volume for the database files to live on.

I’m using a server with a Z drive, so after I type in “Z:\” in that top box and hit Tab, here’s what my screen looks like:

For my purposes, that’s fine. Click the TempDB tab:

If you have standard file sizes that you use, feel free to change those. For me, 8MB file size and 64MB autogrowth is awfully small. Even on dev/test servers, I go with 1024MB initial size and autogrowth sizes for both data and log files, like this:
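If you’re adjusting an already-installed instance rather than the setup screen, the T-SQL equivalent is a couple of ALTER DATABASE commands – tempdev and templog are the default logical file names, but yours may differ:

```sql
-- Bump tempdb's first data file and log file to 1GB size and growth:
ALTER DATABASE tempdb
    MODIFY FILE (NAME = tempdev, SIZE = 1024MB, FILEGROWTH = 1024MB);
ALTER DATABASE tempdb
    MODIFY FILE (NAME = templog, SIZE = 1024MB, FILEGROWTH = 1024MB);
```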

Then click the MaxDOP tab:

SQL Server’s gotten a lot better over the last couple of releases at setting the right Maximum Degree of Parallelism (MaxDOP or MAXDOP, depending on where in that screenshot you look – consistency is the hobgoblin of little minds.)

Click the Memory tab:

Click the Recommended radio button, or else SQL Server will drain your memory dry. Just as a side note, Max Server Memory doesn’t actually mean the maximum – there are things for which SQL Server will use memory above and beyond that cap. We’re just setting a ceiling to make sure things don’t go absolutely crazy.
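If you skip this during setup, you can set it afterward with sp_configure – the 16384 here is just an example value, so size it for your own server:

```sql
-- Set max server memory to 16GB (value is in megabytes):
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 16384;
RECONFIGURE;
```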

Don’t click the FILESTREAM tab. I know it’s in all caps, which means Microsoft is yelling at you, imploring you to use SQL Server as a really expensive file server. It’s a bad idea. Save your lunch money, and put files where they belong.

Click Next, and decision time is over – just click Install:

After installation finishes, you can use SQL Server Management Studio to connect. You don’t have to download a new version – SSMS 19 isn’t required.


[Video] Office Hours: Ask Me Anything About SQL Server


Post your Microsoft data platform questions at https://pollgab.com/room/brento and upvote the ones you’d like to see me answer. I sat down by the pool in Cabo with a cup of coffee to go through ’em:

Here’s what we covered in this episode:

  • 00:00 Introductions
  • 00:43 MergeItLikeItsHot: Hi Brent, do you have any good resources to look at when planning DR procedure for sql server and for Azure sql database specifically?
  • 01:46 Neil: whats your approach to automating restores from prod to test ? how do you select the last backup ? currently i set up a linked server to query msdb on the prod server and find latest backup history. but i want to get rid of all linked servers. is there a better way ?
  • 04:00 Donovan: Is SQL lock pages in memory a best practice or edge case for modern SQL Server and Windows OS? If edge case, when should it be used?
  • 06:06 Trushit: Were you always fluent and comfortable speaking in front of a camera? Any tips for someone who feels awkward ? Even when I listen to the recording of own voice, it sounds so different, in a bad way, than what I hear when I am speaking to someone else.
  • 08:20 CJ Morgan: Have you ever worked w/Bidirectional replication? I ask because we have a client that wants an updateable copy of their database up in Azure and aside from a VM running SQL in Azure, the only replication we see as being supported for and updateable “subscriber” is Bidirectional.
  • 10:00 Eduardo: Do you keep any interesting stats on the questions that are asked each episode (e.g. percent of questions that are new each episode, percent of questions for topic ABC, etc)?
  • 10:20 ILoveData: Hello Brent, just curious what the ad revenue on your small YouTube channel looks like. Is that something you would be willing to share?
  • 11:15 Haydar: Does high VLF count / large transaction log file size affect log shipping performance?
  • 13:13 Seshat: What are the pros / cons of using native SQL backup with multi terabyte DB’s (11+ terabytes)?

Your Turn: I’m Not Answering These 13 Office Hours Questions

Normally, after I do a round of Office Hours going through the questions that got posted at https://pollgab.com/room/brento, I answer the highly upvoted ones and then clear the queue. If y’all don’t upvote questions, I don’t answer ’em.

However, today I’m trying something different: I’ll post the non-upvoted questions here for y’all to see if there’s something you want to answer in the comments. I’ll number ’em so y’all can refer to ’em easier in the comments.

  1. GI Joe DBA: Thoughts on contract to hire jobs? I don’t like them, I want to be hired FT immediately. I think agencies use them to squeeze $$ from the placement but, there’s no guarantee I will be hired and it’s a hassle switching benefits. I’ll only consider it if pay is higher than average.
  2. lockDown: Hi brent, we work on a multitenancy model sql DB and we need to split the data to 2 Dbs, since there multiple apps writing to the DB & the Devs are asking for a schema lock on the (2k) schemas that will be moved. I feel like stoping writes should be done on the App am i right?
  3. Jarkko: Should there be any concerns when a query runs in 2 seconds but consumes 16 seconds worth of CPU time (SQL 2019 Enterprise, 64 cores)?
  4. RonS: Hello Brent, Moving to 2019 SS Standard/HA. 7 databases that need to talk with each other. Standard edition requires each DB to have its own IP address. Current environment all the DB’s live in one server and can see each other (db.schema.table). Linked servers right direction?
  5. CPL: Hi Brent .. What is the process that actually updates indexes? e.g. table with 3 NC indexes & all three indexes contain Col1. If one of these indexes is used where we then update Col1 how are the other 2 indexes subsequently updated? Is it still via the transaction log?
  6. Wenenu: What are the top SQL index naming schemes you see in the field? How did the various schemes originate?
  7. Anatoli: CONTAINSTABLE and FREETEXTTABLE seem to hide the logical reads when statistics io is enabled. Is there an alternate way to see the logical reads incurred by these TVF’s? Using SQL 2019 – 2014 compat mode
  8. Sebastian: How would you change SQL Sentry Plan Explorer to make it more usable / friendly?
  9. DBA_preparing_for_jobsearch: Could you suggest a good strategy on improving left anti semi joins?
  10. Least Significant Who In Whoville: Hola mi amigo. I am the sole DBA at a recently acquired adult beverage manufacturer and our new controlling company. Do you have any suggestion on how to break the news that their adhoc architecture is one mistake away from disaster when their shadetree DBAs know more than me?
  11. Itching to Emigrate: Is it true that Americans in Europe working remotely for American companies fall under GDPR? Can you point us to resources for this question? Everything I see only looks at the consumer side of GDPR.
  12. reps_for_bulk_deletes: For “Full” recovery model, how much bigger than a row of data is the log entries associated with the deletion of that record? Trying to come up with a heuristic for when to approach alternate deletion methods by looking at remaining space in log and on log disk.
  13. BlitzFan: Hi Brent, At our shop we make all kind of backups, SQL backups, Avamar backups of SQL backup shares and of entire databaseservers. But we never test those Avamar backups, so no disaster recovery tests and no disaster recovery plan. What would you say to management?

[Video] 300-Level Guide to Career Internals at SQLBits

You’re overwhelmed with choices: so many things you could learn, so many ways you could specialize in your career. Which one should you choose? What are the safe bets, and what are the risky bets? Should you be a contractor? A consultant? A freelancer? Specialize in the cloud, and if so, what products? There’s no guide for data professionals here, and it’s kinda scary.

About ten years ago, I sat down with a pen and paper to analyze the ways that I could make a living with data. I devoured a lot of business books, thought about the kinds of things companies pay money for, and laid out a simple grid to help decide what kind of career I wanted to build for myself.

Today, in 2022, you’re facing the same problems. In one session, I’ll explain what companies pay for, what individuals pay for, and how you could build different careers with that knowledge. I won’t tell you what’s right for you – I just want to give you a map of different choices, like a guidance counselor would. I’ll finish the session by explaining the choice I made, and why.


[Video] Office Hours at the End of the World


I hopped down to Cabo San Lucas and sat at the End of the World to answer the questions you posted & upvoted at https://pollgab.com/room/brento.

Here’s what we covered:

  • 00:00 Intros
  • 00:44 Kirk Saunders: Hey Brent! When did you determine your knowledge/skillset/etc.. was sufficient to teach your various classes? I think I’m doing well at my job, but I’m concerned I’m overestimating my skillset (big fish, small pond). Any insight on a more objective measure is greatly appreciated!
  • 02:42 Monkey: Howdy, Brent! When I specify RETAINDAYS or EXPIREDATE when doing backups and set either to 30 days, does it mean after 30 days some job checks whether my backup expired, and deletes it ? if no, how can I make so older than 30 days backups be deleted automatically ?
  • 03:31 Yakov: What was office hours like prior to the introduction of PollGab?
  • 04:54 Rizzo: How should you deal with a SQL power user that forgets to commit their transaction on a busy server before leaving for the day?
  • 06:43 Gary Kendall: Hey Brent! We have a commercial app that creates many many temp tables (not in TempDB) for ad-hoc reporting and other things but doesn’t clean up after itself. Is there a tipping point where too many tables in one DB might cause overall performance problems with SQL Server? Thx!
  • 09:30 Select_Star_not_Asterisk: What’s your though about Contained Availability Group? For me, the new features, contained master and MSDB is a big check for me.
  • 11:16 Uncle Kenny G: We have 1000 databases on Azure SQL Server. We are going to adopt some third party monitoring software (Red-Gate, SolarWinds, checkmk, etc…). Does using these monitoring software will put pressure on the DBs and increase the overall costs at the end of the month?
  • 12:05 SQL Helper: In the consulting world, how do you handle an engagement that you cannot solve or is technically not possible as outlined in the SOW? Do you get compensated for your time working on the issue?
  • 16:34 Haddaway: What is the best tool for monitoring Always On replication progress? Do we need third party monitoring for good Always On Monitoring?
  • 17:24 Eduardo: What are the keys to becoming a good public speaker?

Download the Current Stack Overflow Database for Free (2022-06)


Stack Overflow, the place where most of your production code comes from, shares a version of their data in XML format from time to time, and then I import it into SQL Server format.

Stack Overflow’s database makes for great blog post examples because it’s real-world data: real data distributions, lots of different data types, easy to understand tables, simple joins. Some of the tables include:

  • Comments: 85M rows, 16GB data
  • PostHistory: 150M rows, 250GB (most of which is text, though)
  • Posts: 56M rows; 150GB
  • Users: 18M rows, 2GB
  • Votes: 230M rows; 4.5GB, making for fun calculations and grouping demos

This isn’t the exact same data structure as Stack Overflow’s current database – they’ve changed their own database over the years, but they still provide the data dump in the same style as the original site’s database, so your demo queries still work over time. If you’d like to find demo queries or find inspiration on queries to write, check out Data.StackExchange.com, a public query repository.

I distribute the database over BitTorrent because it’s so large. To get it, open the torrent file or magnet URL in your preferred BitTorrent client, and the 54GB download will start. After that finishes, you can extract it with 7Zip to get the SQL Server 2016 database. It’s 4 data files and 1 log file, adding up to a ~430GB database.

Want a smaller version to play around with?

  • Small: 10GB database as of 2010: 1GB direct download, or torrent or magnet. Expands to a ~10GB database called StackOverflow2010 with data from the years 2008 to 2010. If all you need is a quick, easy, friendly database for demos, and to follow along with code samples here on the blog, this is all you probably need.
  • Medium: 50GB database as of 2013: 10GB direct download, or torrent or magnet. Expands to a ~50GB database called StackOverflow2013 with data from 2008 to 2013. I use this in my Fundamentals classes because it’s big enough that slow queries will actually be kinda slow.
  • For my training classes: specialized copy as of 2018/06: 47GB torrent (magnet.) Expands to a ~180GB SQL Server 2016 database with queries and indexes specific to my training classes. Because it’s so large, I only distribute it with BitTorrent, not direct download links.

As with the original data dump, these are provided under cc-by-sa 4.0 license. That means you are free to share it and adapt it for any purpose, even commercially, but you must attribute it to the original authors (not me):


Happy demoing!


[Video] Office Hours: Speed Round Edition


Not all of the questions you post at https://pollgab.com/room/brento require long answers. Here’s a quick speed round:

Here’s what we discussed:

  • 00:00 Introductions
  • 00:24 EngineHorror: Hey Brent! What’s your opinion on page- and row-level compression in general? Is it true it increases locking besides burning extra CPU?
  • 01:08 Pony: Is Microsoft Assessment and Planning tool (MSAP) still best tool for discovering SQL Servers within a company? are there better or alternative tools for this ? We need this to build inventory list of SQL servers we have
  • 01:21 DBA Champion: What monitor would you recommend for a DBA ? Inches, resolution, number monitors (1,2,3…) ?
  • 02:00 Monkey: Hi Brent! Have you ever considered working with Microsoft on SQL Server improvement?
  • 03:20 Monkey: Do you recommend performing transaction log backups into 1 single TRN file (append them), or each t-log backup should be performed into separate file (1 backup file = 1 trn) ?
  • 03:43 Mike: Have you ever implemented Dynamic Data Masking, and what is your opinion on it ? Do you think in-house developers who has read access to Production, should or should NOT be able to read Personally Identifiable Information such as Customer Name, Email and Address ?
  • 04:22 James Adams: Will the Senior DBA class ever come back in stock?
  • 05:07 B-treehouse: I must optimize the database of a multitenant app. The first col of most NC indexes is the ‘TenantId’. This causes param sniffing problems due to the mix of very small and very large tenants. Would it be an acceptable use case for ‘OPTIMIZE FOR(@TenantId UNKNOWN)’ ?
  • 06:02 Tefnut: What types of perf issues do you like to use sp_HumanEvents for?
  • 06:25: Lilandra Neramani: Who is the Brent Ozar for all things Power BI?
  • 06:45 Can I join you: Hi, I use azure sql db and running a query against a view with a WHERE clause with a variable (Id =@id). Query runs slow and performs a full table scan. When I hard code the value of @id it uses a seek and runs fast. I thought Sp_executesql would fix it but it runs slow also. Why
  • 07:27 Dom: How would you describe your driving style ? Are you slowly cruising around looking at the scenic view or do you drive a bit more “sporty” ? Asking cause some of your car have been ingeneered to be driven pretty fast 😮
  • 08:43 sol: Howdy, sir! How come you haven’t gone bald still? Based on recent driving videos your hair is gorgeous!

Office Hours Speed Round, Text Edition

Got questions for me? Post ’em at https://pollgab.com/room/brento and upvote the ones you’d like to see me cover. I filter out the ones that are too short for video answers, and here’s the latest batch:

Corrupted: Should DBCC CheckDB be run on secondary replicas in AG as well ? Is it recommended to attach corrupted database to Prod server, to check how your integrity check job will react ?

You should run it anywhere you want to fail over to. Read this.

Khaled Budajaja: Brent Ozar Office Hours is more informative then many hours spent on reading blogs and technical articles. Did you create this? I don’t recall any body else doing office hours. Many thanks Brent

Aww, thanks. It’s definitely not my idea though – it’s common for college professors to have open office hours.

David: Where can I find information about the master and msdb databases similar to what you present in Fundamentals of TempDB?

What are the pain points you’re having with those databases that you need to solve? Leave ’em in the comments.

Bill Hicks still lives: God created the Earth in 7 days. Do you think it’s possible to migrate from Azure SQL Database to CosmosDB or Table Storage in 7 days? Apart from the joke have you ever done that and what are the pros and cons? The goal is to have reports go faster.

CosmosDB or Table Storage for fast reports?!? That is not even wrong.

Unut: Is indexing bit fields like “IsActive” on large tables a futile effort when the large majority of records are active?

Generally, single-column indexes are less effective than multi-column indexes, regardless of the data type.
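As a rough sketch of what that means in practice (table and column names here are hypothetical), a multi-column index tied to a real query pattern, or a filtered index targeting the rare inactive rows, usually beats indexing the bit alone:

```sql
-- Indexing the bit by itself rarely helps when most rows are active:
CREATE INDEX IsActive ON dbo.Accounts (IsActive);

-- A multi-column index matching an actual query pattern is more useful:
CREATE INDEX IsActive_LastLoginDate ON dbo.Accounts (IsActive, LastLoginDate);

-- Or, if queries only ever hunt for the rare inactive rows,
-- a filtered index keeps the structure tiny:
CREATE INDEX Inactive_LastLoginDate ON dbo.Accounts (LastLoginDate)
    WHERE IsActive = 0;
```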

Khepri: Does FORCE ORDER query hint ever have any practical uses?

Yes, if you’re low on time, and it gets you across the finish line quickly, and you’re willing to live with the technical debt.

Not_a_DBA_but_I_play_one_on_tv: I restore a 600gb database backup to a test server every morning. Typically the restore takes 45 minutes to run. One day last week, it only took 15 minutes. Everything appeared to have run correctly, size didn’t change. What could cause that one restore to run quicker?

Less competition for shared hardware resources.

Unut: What types of issues do you like to use the SQL default trace for?

Check out the sp_Blitz source code.

James Adams: What are the powershell’ish things a senior SQL DBA should know?

I stopped teaching production database administration work years ago because I focus on development DBA work now. I’d start with DBAtools.io. I will note, though, that it’s interesting that a lot of these questions are still about production database administration. I just don’t wanna do that work anymore – I don’t find it enjoyable because it’s basically on-call break/fix stuff. Like, I could go the rest of my life without getting woken up because somebody ran a server out of drive space. It’s 2022. Quit treating drive space like it’s unobtainium.

sqldeo: Hi Brent Any suggestion on tempdb growth,not clearing the space even though developer dropping the temp table after doing ETL stuff SQl 2019, initials size same for all 8 files, auto growth enabled. db are aprox 5 TB in total size, snapshot iso. enable for 1 db. any suggestion?

Attend my Fundamentals of TempDB class.

T: Brent, currently we have SQL Server Failover Cluster with 2 nodes. We are planning to migrate it to Availability Group solution. Do we need to reinstall instances on both nodes or can we reuse it ?

Before going live on any HA/DR platform, you should test it repeatedly. That means you should build a new environment from scratch and test the bejeezus out of it before you go live.

Murad: What is the widest varchar column you would ever consider as a key for a new non-clustered index?

I wouldn’t set limits on that without knowing more about the business problem we’re facing. I can totally imagine scenarios where I’d index a couple of NVARCHAR(2000) columns, for example.

Milind: Hi Brent, how are you? How to do 2.5M+ records update without table lock? It is about account balance limit reset overnight daily. During midnight schedule, other processes are getting hampered due to table lock. How to do such operation without impact and possibly quick? Thxs

Read this series from Michael J. Swart.
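The general shape of that approach looks like this – a sketch only, with hypothetical table, column, and batch size choices; Swart's series covers the important details like lock escalation and indexing the predicate:

```sql
DECLARE @RowsAffected INT = 1;
WHILE @RowsAffected > 0
BEGIN
    -- Small batches stay under the ~5,000-lock threshold where
    -- SQL Server considers escalating to a table lock:
    UPDATE TOP (1000) dbo.AccountBalances
        SET BalanceLimit = 0
        WHERE BalanceLimit <> 0;   -- only touch rows still needing the reset
    SET @RowsAffected = @@ROWCOUNT;
END;
```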

Mekhit: Is there a recommended way to identify the hot (or not) columns for a given NC index?

There’s no instrumentation in SQL Server for that. You’ll be forced to use common sense. I know, terrifying, right?

Sothis: What are the signs that a given operator’s estimated number of rows was a hard coded estimate provided by the engine?

Take the estimated number of rows, divide that by the number of rows in the table, and look for suspiciously round numbers that don’t change regardless of the predicates you’re searching for.

Sothis: Is there a good way to determine the appx tipping point (row count) for index seek + key lookup vs index scan operation?

No.

Livnat: Any tips for tuning for “PARTITION BY” performance in TSQL?

Read these.

Waldemar: Do you have any query tuning / index tuning tips for linked server queries?

Don’t run them. Connect to the server that has the data you want. If you want to get something done quickly, you don’t turn to someone else and say, “Hey, can you ask that person over there to do this for me?” It’s just idiotic.

Nephthys: When passing multiple args to a stored procedure via a single delimited string and using STRING_SPLIT, is there a max number of args you recommend to not exceed for performance reasons? 

There are even gotchas with just one row. Read this.
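One gotcha worth calling out: SQL Server uses a fixed cardinality guess for STRING_SPLIT's output (commonly cited as 50 rows) no matter how many values you actually pass, so the rest of the plan is built on that guess. A sketch, with hypothetical names:

```sql
DECLARE @Ids NVARCHAR(MAX) = N'1,2,3';

-- The optimizer guesses a fixed row count for the split output,
-- whether @Ids holds 3 values or 3 million:
SELECT u.*
FROM STRING_SPLIT(@Ids, N',') s
JOIN dbo.Users u ON u.Id = CAST(s.value AS INT);
```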

Right Said Fred: What are the top 3 DB communities in order?

That’s a weird question – can you rephrase it for the kind of thing you’re looking for out of a community? I’m not sure if you mean places where general database people hang out, or where SQL Server people hang out, or where questions get answered, or where presentations happen, or what.

SQL Crooner: Can we safely ignore the page life expectancy alerts from Idera SQL DM?

I cover that in detail in my Mastering Server Tuning class. I wish I could do justice to it quickly, but that’s why I have training classes – some topics require deeper discussions and details.

Jose: What is your opinion of the various commercial apps that re-write your queries a hundred different ways to find the optimal query syntax?

Really cool, but usually really expensive.

Nephthys: Does clustered index fragmentation matter for INSERT performance?

Run an experiment with your particular table’s structure, your nonclustered index structures, and your typical insert statement to find out.

CakeAndEatItToo: When running a query through the Actual Execution Plan, is it more important to be looking at time taken or cost%?

Neither. Time to head to my free How to Think Like the Engine class.

Alberto: Hi Brent, I read that default isolation level for SQL Azure DB is Read Committed Snapshot. In your experience, what should we pay attention to when migrating from an on-prem with Read Committed isolation ?

Start here.

DBGeek: Like you, I use a Macbook Pro as my primary work computer. I use a virtual machine where I install VPN clients etc. But SSMS sometimes behaves really strangely on the Macbook’s screen resolution. What is your solution/workaround to handle high DPI settings?

In your hypervisor, change the Retina settings. Whatever you’ve got it set to, use the opposite one. Then, in Windows, don’t use the HighDPI settings at all.

Reshep: What are the top conventions used by your clients when creating different availability groups (business unit, region, etc)?

Business units and RPO/RTO.

Heryshaf: How do you know when insert contention is happening and what are the common ways of dealing with it?

Review your server’s top wait types. For common ways of dealing with it, check out my training classes.

Pietro: What naming convention do you recommend for clustered and non-clustered indexes? We currently use IX_TableName_Co1_Col2_Inc everywhere.

When you’re looking at an execution plan, why waste valuable pixels on IX? You know it’s an index. Why waste space on the table name? The table name’s already shown to the left of the index name. Just use the columns.
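In other words, something like this (table and column names hypothetical):

```sql
-- Instead of IX_Users_LastName_FirstName_Inc_Location:
CREATE INDEX LastName_FirstName_Includes
    ON dbo.Users (LastName, FirstName)
    INCLUDE (Location);
```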

Ptahhotep: Do you have any kind of tab organization strategy in SSMS when many tabs are open?

Yes, save files and close them. The kinds of psychopaths who leave multiple SSMS tabs open are the same kinds of nut jobs who leave hundreds of browser tabs open instead of using bookmarks. They should be hunted down and eliminated before they harm others.

Tenenit: What data profiling queries do you like to run against client tables when identifying new index key candidates?

I don’t. The contents of the data don’t matter – the contents of the query do. For the mechanics of how that works, and why you should never select a table’s data to determine selectivity, check out my Fundamentals of Index Tuning class.

OMC: Is it possible to optimize a SQL table both for high INSERTS and SELECTS or must you pick one or the other?

That’s a great question, and it’s a sign that you’re ready for my Mastering Index Tuning class.

Chris May: How does the optimiser know which set of statistics to use for a given column if there are multiple statistics available? e.g. default system statistics, statistics as part of an index, manual created/partitioned statistics etc

I’ve heard someone say it’s the most recently updated one, but I have no idea if that’s true. I was this close to recording a video of me testing that hypothesis, but I ran out of time before heading down to Cabo for vacation.

Len: What are the pros / con’s of creating a clustered index on an identity int field vs a datetime field based upon GETDATE() ?

The datetime column will take up more space due to its data type size and the fact that SQL Server will have to add a tie-breaking uniqueifier behind the scenes. The clustering keys get added to all of the nonclustered indexes, too, so more space there as well.
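A quick sketch of the two designs (names hypothetical; the uniqueifier only shows up because the datetime clustering key isn't unique):

```sql
-- 4-byte, ever-increasing, naturally unique clustering key:
CREATE TABLE dbo.Orders (
    Id INT IDENTITY(1,1) NOT NULL,
    OrderDate DATETIME NOT NULL DEFAULT GETDATE(),
    CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (Id)
);

-- 8-byte key, plus a hidden 4-byte uniqueifier on duplicate values,
-- and the whole clustering key is copied into every nonclustered index:
CREATE TABLE dbo.Orders_ByDate (
    Id INT IDENTITY(1,1) NOT NULL,
    OrderDate DATETIME NOT NULL DEFAULT GETDATE()
);
CREATE CLUSTERED INDEX OrderDate ON dbo.Orders_ByDate (OrderDate);
```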

Waldemar: What are the top areas of specialized consulting for SQL Server?

Watch this. I explain the different kinds of specialized consulting, what they do, and how to get into those careers. I know it’s kinda long, like 43 minutes, but if you don’t have 43 minutes to learn, well, we don’t have to worry about you getting a career as a specialist, do we? 😉

Horus: What is the killer feature Microsoft should add to SQL table partitioning that would lead to more wide scale adoption?

Why does it need to be adopted more widely? That feature is a pain in the butt. Instead, I wish they could build automatic sharding, automatically splitting data across multiple database servers for faster queries. That’s kind of a pipe dream for relational databases though.


Estimated and Actual Plans Can Have Different Shapes.

Execution Plans
15 Comments

A reader posted a question for Office Hours:

Hi Brent, What is your take on Hugo Kornelis’s explanation of execution plan naming. As per his explanation, estimated exec plan is simply an execution plan whereas actual execution plan = execution plan+run-time stats. Do you agree that the naming is flawed and confusing? – Yourbiggestfan

I like Hugo a lot – brilliant fella, and he knows way more than I do about execution plans – but he’s wrong on this one. Estimated and actual plans don’t always have the same shape.

I’ll use the training version of the Stack Overflow database, but any size will work here. Start with a few indexes:
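The original index definitions were shown as a screenshot; something along these lines fits the plans described below (my reconstruction, not the exact script):

```sql
USE StackOverflow;
GO
CREATE INDEX Location ON dbo.Users (Location);
CREATE INDEX UserId   ON dbo.Comments (UserId);
CREATE INDEX Score    ON dbo.Comments (Score);
```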

Then create a stored procedure – yes, doing this with a temp table is silly, but I need a short, simple example to show the problem:
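The proc itself was also shown as a screenshot; here's a reconstruction consistent with the plans described below (the proc and temp table names are my assumptions):

```sql
CREATE OR ALTER PROC dbo.usp_TopCommentsByLocation
    @Location NVARCHAR(100) AS
BEGIN
    /* Query 1: straightforward thanks to the Location index */
    SELECT Id
        INTO #LocationUsers
        FROM dbo.Users
        WHERE Location = @Location;

    /* Query 2: the one whose shape changes between estimated and actual */
    SELECT TOP 100 c.*
        FROM #LocationUsers lu
        JOIN dbo.Users u    ON u.Id = lu.Id
        JOIN dbo.Comments c ON c.UserId = u.Id
        ORDER BY c.Score DESC;
END
```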

Then get the estimated execution plan for India:
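That means highlighting the call and pressing Ctrl+L (Display Estimated Execution Plan) rather than executing it; the proc name here is my hypothetical stand-in:

```sql
EXEC dbo.usp_TopCommentsByLocation @Location = N'India';
```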

The estimated plans for procs are a little weird in that they look like they’re a single statement:

But let’s zoom in on the second query in the proc. I’m not concerned about the plan for the insert into the temp table – that one’s pretty straightforward since we have an index on Location. Pay particular attention to the second query’s plan, though:

Read right to left, top to bottom to see that SQL Server estimates it will use a single CPU core (no parallelism) to:

  1. Scan the temp table
  2. Do a series of clustered index seeks on Users, then
  3. Do a series of UserId index seeks on Comments, then
  4. Sort the comments by Score, descending

But when you actually run the query and include the actual plan, the shape is totally different:

The query went parallel, and it chose a completely different query plan. Because there were so many rows in the temp table, SQL Server decided to scan the Score index on the Comments table, from highest-ranking Comments to lowest. It figured it wouldn’t have to read too many Comments before it stumbled across 100 that were written by people in the temp table. There were tons of differences, and here are just a few:

  • The estimated plan was serial, but it actually went parallel
  • The estimated plan used the Comments.UserId index, the actual one used Comments.Score
  • The estimated plan started with the temp table, the actual one started with Comments

The root cause on this one: when the estimated plan was generated, SQL Server hadn't created the temp table yet, so it had no statistics on its contents. At runtime, the creation of those new statistics caused SQL Server to recompile the statement while the proc ran, so the actual plan had a wildly different shape.

SQL Server 2022 is even worse.

I’ll do a classic parameter sniffing demo that I do all the time:
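The demo script was shown as a screenshot; it's roughly this shape (proc name and TOP count are my assumptions, and an index on Reputation is what makes the sniffing visible):

```sql
CREATE INDEX Reputation ON dbo.Users (Reputation);
GO
CREATE OR ALTER PROC dbo.usp_UsersByReputation
    @Reputation INT AS
SELECT TOP 10000 *
    FROM dbo.Users
    WHERE Reputation = @Reputation
    ORDER BY DisplayName;
```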

Execute it for Reputation = 2 – no need to get the actual plan, just run it:
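Along these lines (hypothetical proc name):

```sql
EXEC dbo.usp_UsersByReputation @Reputation = 2;
```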

And then try to get the estimated plan for Reputation = 1:
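Highlight this and press Ctrl+L (again, the proc name is my stand-in):

```sql
EXEC dbo.usp_UsersByReputation @Reputation = 1;
```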

SQL Server 2022 is all, “Estimated plan? I could tell you – but then I’d have to kill you.”

That’s why I keep using the terms estimated plan & actual plan. For another example, check out Erik Darling’s recent post about estimated plan parallelism. (I’d already written & scheduled mine when his published, and I had to link to it here because the timing’s great.)


[Video] Office Hours: Long Answers Edition

Videos
0

This time on Office Hours, I let a few questions pile up at https://pollgab.com/room/brento that required in-depth answers to really do ’em justice. In particular, there was a statistics question that needed demos.

Here’s what we covered:

  • 00:00 Introductions
  • 00:35 Grogu: Did DBA Brent ever have to support a version of SQL Server that Microsoft had end of lifed? Any tips on discussing this issue with management?
  • 04:19 Ahsoka Tano: What are your favorite PASS Summit presentations from prior years?
  • 07:11 Can: I have 3000 DB per SQL Server. When I need to restart SQL Services, the service is waiting on Stopping mode.
  • 09:22 i_use_uppercase_for_SELECT: When doing filtered statistics do you basically have to use full scan? It appears the where clause for filtered statistics is applied after the tablesample system is applied so you end up sampling the same data as the main statistics which wasn’t seeing this data originally.
  • 29:02 Shaheen: Is there info in SQL Server that will tell us why it picked one index over another for a given query?
  • 31:58 muppet#1: What’s your favourite SSMS plugin (assuming you use any)?
  • 33:02 DBAInProgress: What are some of the major gotchas of using TDE?
  • 34:22 Dmitriy: How would I begin to optimize a reporting query that has no WHERE clause, but has a bunch of LEFT JOINs to a bunch of other tables via foreign keys?
  • 35:55 Columnstore newbie: I have a clustered columnstore index. The segments look good but selecting TOP 1000 order by UtcDate is still very slow. Why?

Let’s Make September Our Free Community Tools Awareness Month.

Last week, I was reading a brand new article from a Microsoft employee about how you should directly query sys.dm_exec_requests in order to find out what’s running on your system.

Brent Reading Book
“Step 1: get a stone that looks round.”

I lost my mind.

There was a lot of yelling at the monitor.

In the year 2022, nobody should be reinventing the wheel. There are plenty of free wheels available for you to choose from. You’re literally wasting your time if you start from scratch with a boulder and chisel, and then try to turn it into a wheel.

The Microsoft data platform community is amazing, and has been that way for years. There are so many free resources to help you do your job faster, easier, and more accurately.

And sure, I’ve been around for quite a while, and I take for granted that everybody in the database business knows about all this cool free stuff. I’m not talking about the First Responder Kit, either – I’m talking about a stunning list of resources so large that it’s intimidating just to get started.

That’s where you come in.
What do you rely on every week?

In September, I want you to improve community knowledge about one free tool that you rely on every week in order to get your job done.

Your first reaction is gonna be that everybody already knows it, but trust me, they do not. Just by reading this blog post, you’re already ahead of many folks out there who don’t have the time to keep up with the industry. Imagine that you’re talking to a brand new hire at your organization who needs to get up to speed on how you’re able to do your job so effectively.

Pick one of these things to share:

  • Introduce the tool to readers for the first time
  • Tell a story about how it saved your bacon
  • Share a non-default configuration option that you use, and why
  • Write a review – explain what you like about a tool and what you wish was different
  • Compare several free tools that do the same thing – explain the pros & cons of each one
  • Put together a list of learning resources for a free tool – maybe you like the tool, but it isn’t easy to use, and you want to put together a set of links to show a new user where to begin

And there are any number of ways you can share it:

  • Write a blog post (if you don’t have a blog, write on LinkedIn, SQLServerCentral, MSSQLTips)
  • Record a short video
  • Improve the tool’s documentation

You can schedule it anytime you want during September. When it goes live, leave a comment here with a link to it. I’ll post a roundup post, and I’ll set up social media re-sharing so that I can keep driving new folks to your work over time. I’ll be working on it too – most of my September blog posts will be focused on free community tools.

Let’s make sure that nobody in our industry has to reinvent the wheel again!


Who’s Hiring in the Database Community? August 2022 Edition

Who's Hiring
15 Comments

Is your company hiring for a database position as of August 2022? Do you wanna work with the kinds of people who read this blog? Let’s set up some rapid networking here. If your company is hiring, leave a comment.

The rules:

  • Your comment must include the job title, and either a link to the full job description, or the text of it. It doesn’t have to be a SQL Server DBA job, but it does have to be related to databases. (We get a pretty broad readership here – it can be any database.)
  • An email address to send resumes to, or a link to the application process – if I were you, I’d put an email address, because you may want to know which applicants are readers here; they might be more qualified than the applicants you regularly get.
  • Please state the location and include REMOTE and/or VISA when that sort of candidate is welcome. When remote work is not an option, include ONSITE.
  • Please only post if you personally are part of the hiring company—no recruiting firms or job boards. Only one post per company. If it isn’t a household name, please explain what your company does.
  • Commenters: please don’t reply to job posts to complain about something. It’s off topic here.
  • Readers: please only email if you are personally interested in the job.

If your comment isn’t relevant or smells fishy, I’ll delete it. If you have questions about why your comment got deleted, or how to maximize the effectiveness of your comment, contact me.

Each month, I publish a new post in the Who’s Hiring category here so y’all can get the latest opportunities.


Office Hours Speed Round: Text Edition

Got questions for me? Post ’em at https://pollgab.com/room/brento and upvote the ones you’d like to see me cover. I filter out the ones that are too short for video answers, and here’s the latest batch:

Thomas Franz: When I have a very large table with no statistics (or not on a specific column in the WHERE) and want to show the estimated execution plan it takes minutes because it builds the statistics first. How can I prevent / speed up it (want the create stats asynchron as stats updates)?

There’s no such thing as asynchronous stats creation, only updates. SQL Server needs the stats first to create the plan, or else the plan would literally be based on random guesses.
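If it's the *updates* you want out of the way of query compilation, that much is configurable; initial creation is not:

```sql
-- Real option, but it only covers updates to existing statistics,
-- never the first-time creation that's blocking your estimated plan:
ALTER DATABASE CURRENT SET AUTO_UPDATE_STATISTICS_ASYNC ON;
```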

Kinneret: Is table ordering (Table B appears before Table A) within a view definition ever important for query performance?

Yes, read this all the way through.

SQLSteve: I have inherited a system that has an unusual index setup. Most tables have a clustered index but also have a PK non-clustered index which is the same as the clustered. IndexUsageStats show 0 for some NCI’s but can also show usage up to 50% of the Clustered.

Cool. As long as we’re just sharing stuff we got from others, I got a metal pineapple with an unusual inside compartment:

A metal pineapple - shout out to the fine folks at é

I don’t have any questions about what I have either.

Luca: Are Azure DR nodes more prone to “replica not healthy” errors? Our nodes are in a different region & every ~2 wks we see: Always On Availability Groups connection…terminated / The local replica of availability group … is preparing to transition to the resolving role

My gut feeling is that sysadmins who use the cloud are just more reckless when it comes to network config. They’re more likely to just up and change firewalls, subnet configs, routing, etc., and wing it as they go.

Midwest DBA: Hi Brent, Does a Senior Development DBA need to be better at writing T-SQL than developers? Do you need to be some sort of T-SQL Guru, or just know the most common anti-patterns. Also, any good book suggestions on T-sql Anti-patterns?

I think they need to better understand the execution plan ramifications of the T-SQL they write. For T-SQL books, get Itzik Ben-Gan’s.

WhyIsItAlwaysOnFire: Any recommendations to prevent Non Yielding Scheduler errors? Seeing occurrences of this and subsequent thread pool starvation occurring during index maintenance.

Patch SQL Server, Windows, and your hardware drivers/firmware/BIOS.

Film_Buff: Hello! I’ve heard you say that doing sorting in SQL Server is expensive since one pays for licensing, and that it should be offloaded to the app. Is this a difficult task for developers to workout? What are the things to consider in regards to design and implementation? Thanks!

Your developers should be able to sort a list or an array fairly easily. That isn’t something that requires a lot of design planning. You’re overthinking it.

Midwest DBA: Will my SAN admin try and get me fired if I make all Identity columns Big Int going forward? Or is there a case when you can trust the business and set Identity to Tiny INT or INT?

Read how row compression works.

Martin: Our company has 3rd part DBA management. They do one size fits all index maintenance( reorg >5% and rebuild >30%) for all user DBs once a week. Is it better to allow them to do this, or not? They also run a nightly update stats job against all user DBs. Is this helpful or harmful

If you don’t trust your remote DBA firm, I can help you find a better one.

Garry Kasparov: I have a query that calculates Instance wide signal wait time as a % of clock time per period. if (instance wide signal wait time / Clock time )*100 = 100, then I’m thinking the instance would benefit from an additional CPU core. Is this a easy way to determine more cpu?

Without looking at the queries using the most CPU to see if you can tune them? That seems like a pretty expensive script.

DBA douglas: Hello Brent, I’m currently in the USA. I’m wondering if I can move to Europe or New Zealand, and provide American Companies off hours DBA support? Do you think companies would find it valuable to have an American Citizen Working in a different time zone?

Why would their DBA need to be an American citizen?

I_wish_I’d_done_better_in_school: What are the 3 most common issues that clients hire you to fix?

Slow performance (#1 by far), unreliable SQL Servers, and planning for the future.

I_wish_I’d_done_better_in_school: Hey Brent, Do you see many productions DBA’s skilling up in powershell still? Or are there new tools that make this unnecessary? I’m trying to get my team to adopt DBAtools, but my manager says that all the DBAs would have to learn powershell and that is too much to ask.

Be the change you want to see in the world. Consistently solve problems with PowerShell faster than your teammates, and they’ll want to be more like you. (Assuming, that is, that you can – if you can’t, then there’s your answer too, but you don’t need them to adopt it first in order to improve your own skills.)

Greef Karga: What are your pros / cons for generating / executing dynamic SQL in the C# client app vs doing so in a stored proc?

Developers are best at the language they use the most. What language do your developers use the most?

Qaiser: Rolling upgrade or Backup Restore for database migration in AG from sql 2012 SP4 to sql 2019 Enterprise. Some experts say that Rolling upgrade is risky as no rollback option.

Read this.

sqldeo: Hi Brent,I got unusual issue for my tempdb,i have 8 tempfile for 8 core cpu but not growing all equally (initial size and growing size equal across all files)using sql 2019 so traceflag 1117 also not recommended by MS,any solution to look the problem !thanks

Grow ’em out manually.

Seriously, just do it once and be done. Grow ’em out to 25% of the total size of the databases on the server for starters, and just be done with it. Why waste a minute of your life trying to solve this problem? If they continue to grow beyond that size, take my Fundamentals of TempDB class to understand what’s using the space and what you can do about it.
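A sketch of the one-time fix (logical file names and sizes are hypothetical – check sys.master_files for yours, and repeat for each of the eight files):

```sql
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, SIZE = 50GB);
ALTER DATABASE tempdb MODIFY FILE (NAME = temp2,   SIZE = 50GB);
-- ...and so on for temp3 through temp8.
```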

Valdemar: Should stored procs that create local temp tables also drop the local temp tables upon completion?

You can if you want, but it doesn’t really matter, as Paul White discusses in these posts. SQL Server keeps the structure & statistics around for other sessions. The drop only affects your ability to access the temp table after your proc finishes.

WBurgh: Good old MDS, any word on the street on when it’s being killed off? Asking since I’m not seeing anything in the Azure pipeline from MS.

No, but I certainly wouldn’t do any new implementations of it.

LegalEagle: Good Morning Brent! I have an availability group running on physical hardware and wish to add an additional node but add it as a VM to offload reporting. Can I mix a physical SQL with a VM SQL in the same AG? Google doesn’t seem to have information on that. See you at PASS!

Yes, you can, period. See you in Seattle!

Scrootch: My friend is struggling with PAGELATCH on the tempDB, leading to GAM contention. Server spec: 2019 Enterprise, 24 CPU/tempDB files, 320GB memory, 760 batch req/s. Would adding more tempDB files than CPUs have a negative effect to address this issue? Eg adding 32 file vs 24 CPUs.

Check out my Fundamentals of TempDB class.

Maksim: Has moving lob data off row for a given table ever moved you across the performance finish line?

No.

CakeAndEatItToo: The ‘missing index’ dmv is limited to 600 results, but for a SaaSy server with 1000+ db’s there’s a lot of cpu cycles spent cleaning up every couple of minutes. Is it possible to increase the limit or scale back the cleanup or do ANYthing without turning the whole thing off?

I’d be curious as to how you’re measuring that. Post a question on DBA.StackExchange.com with your evidence showing the cleanup of that DMV is causing overhead on your system.

Bill: I have an OLTP system with cache instability, but plenty of cpu to cover it. I had a frequently run proc that runs in <10ms., been stable for years, start blocking, and queries to timeout after 30s., seemingly due to it being Recompiled. indexstats are current. Where to start?

Run sp_BlitzCache, look at the Priority 1 warnings, and read the URLs in each of those.

Kinneret: Have you ever written an epic SQL query only to see SSMS freeze up or crash?

Click File, Save. Save your work every few minutes. No different than the advice I’d give to anyone using any computer application, ever.

Mollusk: What are your thoughts on running SQL HA technology like AG’s or a failover cluster instance in a public cloud. Issues are that VMs could be vmotioned (a lot). How will this affect auto-failovers or cluster health? Can I just set healthcheck and timeout settings higher?

I haven’t done hands-on AG work in years, sorry. I got out of production DBA work because complexity kept rising, fragility kept getting worse, and I kept hitting weird undocumented problems. I got tired of being on-call to fix that kind of thing.

J Katz: How to get a job starting out with perf tuning? Im a Senior infrastructure DBA, and I want to get experience perf tuning –I love your courses BTW, best investment ever!. Does Microsoft have positions like this? Should I expect to take a pay cut?

Use the Buddy System. Call people in your network who know you know your stuff. They’ll get you past the stack of resumes. Generally speaking, Microsoft doesn’t tune the T-SQL or indexes that run on their customers’ servers.

DataPayload: Hello Brent, My friend understand that 1=1 help to comment out and 1=2 in create table without data in where clause. He is not sure if these helps the query logic or plan any other way. Can you please help him. Thank you.

Read this.

SQLServile: Our server is to be replaced with a newer one, which is fine. The staff doing this though, say it will be “too hard” to retain the original servername in this exercise. A new name will break countless connections (but hey, not their problem). Is there a less disruptive way?

They’re right. Instead, immediately after migrating to the new server, rename the old server as something else, like PROD_OLD. Create a DNS CNAME (alias) for the old name, and have it point to the new server’s name.

Yousef: Is it ever OK to specify an additional / redundant WHERE condition value to get a better index selection?

I would want to see an example.

Dan Griswold: Should Managed Service accounts like NT SERVICE\…. be disabled for security purposes? Why or why not? I have searched all over and can’t seem to find a clear answer.

I don’t do security work.

Raguel: Do you know of anyone that has ported the Stackoverflow or AdventureWorks DBs to other platforms (PosgreSQL, etc) for bakeoff performance testing purposes against SQL Server?

Read this.

Galina: What are your pros/cons of streaming on Youtube vs Twitch?

In my experience, Twitch pays better. When I was streaming regularly, I was getting $500-$750/month from Twitch, whereas I don’t think I ever cracked $200/mo from YouTube.

Stacey Q: What is the largest DB you have seen that had enough server memory to cache the entire DB in RAM?

About 1.5TB. I’ve seen much larger databases, but they just weren’t cached in RAM.

Shaheen: How does index tuning for data warehouse queries differ from index tuning for OLTP queries?

These days, data warehouse tuning tends to start with clustered columnstore and go from there. OLTP tuning starts with multiple rowstore indexes.

Don’t Blame Anthony: How does SQL server determine row order for “Select Top” or “Delete Top” when no order by clause is specified? I see this pattern frequently.

Whatever rows the query plan pulls first, that’s what comes out. Order isn’t guaranteed without an order by. It can change based on parallelism, available indexes, merry-go-round reads, all kinds of stuff.

Stockburn: Hi Brent, my company has 600K+ databases in Azure SQL DB. How would you change your perf tuning methods and investigation when faced with so many DBs? They all use a similar schema but may be used more or less heavily. We have been using Query Store; interested in your advice.

Read this.

Marc Spector: Sometimes when we run sp_whoisactive we see long duration sessions from our .NET apps with a sql_command of “sys.sp_reset_connection;1” and status of “dormant”. What causes this? Is this anything to be concerned with?

Read this.

TheCaptain: Hi Brent, Some developers at my organisation have written a module to encapsulate some common client parameters, making them optional and default: timeout=0, connection_timeout=0. What’s your opinion and do you know of any articles with guidance on best practices?

No.

MikeNM: I was watching your talk about debugging stored procedures, and your laments about the death of debugging in SSMS. Have you worked with SQL debugging in VS Code? If not, what do you recommend?

I use the technique I describe in the video.

Cats_Everywhere: Brent, Do you know any DBAs who have taken advantage of the WFH movement to move to a low cost of living city/area? Any suggestions? Must have great restaurants!

Yes, but it’s typically been to move closer to family.

Tefnut: When SQL connections are re-used via .NET connection pooling, does the sproc author need to take care to reset connection-level settings that were enabled in the executed stored procedures? Ex: at the end of the sproc, do SET NOCOUNT OFF, SET XACT_ABORT OFF, etc.

No, but remember that the procs might be called by other things, not just .NET.

Yourbiggestfan: Hi Brent, Can you point me to any website/blogsite which has performance tuning challenges/exercises for SQL Server where we can test our knowledge? Your invisible indexes challenge is the kind I am looking for.

Yes, click Training at the top of the site.

Emelio: What is the best book on Data Warehouse Design?

I have no idea.

NotCloseEnought2RetirementToStopLearning: Hi Brent, my shop has been mainly SQL 2012 and 2016 (nagging my manager to upgrade). Can you recommend key\cool features released in SQL 2017, 2019, and 2022 that I should focus on so I look super smart when we do upgrade? One feature per version please.

Read this. 

Taweret: What are the top rules / guidelines from older SQL Servers that you see used on new SQL Server versions but shouldn’t be?

To monitor Page Life Expectancy and to lower Fill Factor in order to reduce fragmentation.

Khonsu: What are the pros / cons of creating NC temp table indexes inline vs a separate step after table creation?

Read this and/or attend my Fundamentals of TempDB class.

Steven Grant: Do you have a recommended sample DBA calendar that shows admin functions to perform daily, weekly, monthly, yearly for SQL shops with little previous DBA structure?

No. Neat question, but no.

James: Hi Brent – with the non-stop surge toward the cloud, companies desiring and hiring more toward BI/data teams/engineers etc., and being adamant that we are not needed, is it time to accept the inevitable that all respect and understanding of a DBA is gone and we need to move on?

Yes, as you can tell by the complete lack of database questions above, there’s no work left to be done. You should move on. You go on ahead without me. Best wishes, hugs, thoughts, and prayers.

Midwest DBA: Who is the Brent Ozar of Data Warehousing? Do you have any recommendations on training resources?

Data warehousing is being reinvented, and there are tons of approaches. For example, check out James Serra’s video on what a modern data warehouse entails:


Office Hours: Dodging Work Edition

Videos
8 Comments

Post your questions at https://pollgab.com/room/brento and upvote the ones you’d like to see me cover. Today, I’m dodging work, so I went through your questions while I waited for the coffee shop to open:

Here’s what we covered:

  • 00:00 Introductions
  • 03:28 Ivan: What are the top signs that a table has a poor clustered index?
  • 05:39 NotCloseEnough2RetirementToStopLearning: Hi Brent, it seems the job market wants and is paying more for data generalists (wider but less deep skillset) than data specialists (DBA, deep but less wide skillset). This seems to be limiting salaries and the number of Senior positions. Any thoughts or comments?
  • 07:52 Gennady: When should you use a table per year of data (and union them together in a view) vs using formal sql table partitioning?
  • 12:32 Mr. Ed: How would I determine if my IT team needs a DBA?
  • 14:28 Tom: Can you recommend a live T-SQL class?
  • 15:23 Jorriss: If you were a woodchuck, would you prefer brick and mortar structures?
  • 17:18 J.P.: Why should I learn noSQL if I know SQL? Is it only because of the speed?
  • 19:06 Rob: We have a 4.8 TB SharePoint database that is causing a lot of issues, like backups, due to the file size. Has anyone run across this issue?

[Video] Fundamentals of Stored Procedures at SQLBits

T-SQL, Videos
12 Comments

Anybody can write a stored procedure with a little help from Google. This session is about how to write stored procedures that have a high likelihood of performing well and are easy to troubleshoot.

This fast-paced, all-demo session from SQLBits will NOT cover how to write a query, syntax, or performance tuning. This is about good best practices after you’ve written the first one – things like how to catch errors, how to pass in multiple values, how to debug without the debugger, and more.

If you enjoyed this session, check out SQLBits 2022’s free video library with the other sessions from this year, and for all years.


Why Adding Some Memory Doesn’t Fix All Caching Problems

Wait Stats
18 Comments

It seems obvious: add some memory, and SQL Server should be able to cache data, thereby making queries run faster … right?

Well, if you can’t cache everything you need for a query, you might be surprised to find that SQL Server may still read the entire table up from disk, regardless of what parts of the table are currently cached in memory.

Let’s start with a little setup. I’m going to set the server’s max memory at about 8GB, which isn’t enough to cache the entire Comments table from the Stack Overflow database. (I’m using the 2018-06 version of the database, in which the Comments table is about 21GB.) I’m also going to drop all of my nonclustered indexes to force SQL Server to do a table scan.
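Here’s roughly what that setup looks like. (The memory value is in MB, and the index name is a placeholder for whatever nonclustered indexes exist on your copy.)

```sql
-- Cap max server memory at roughly 8GB (value is in MB):
EXEC sys.sp_configure N'show advanced options', 1;
RECONFIGURE;
EXEC sys.sp_configure N'max server memory (MB)', 8192;
RECONFIGURE;

-- Drop the nonclustered indexes on Comments so the query is forced
-- to scan the clustered index. (IX_UserId is a placeholder name.)
DROP INDEX IF EXISTS IX_UserId ON dbo.Comments;
```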

Then in order to demo the problem as quickly and easily as possible, I’m going to write a query that you probably wouldn’t normally write. I’ll take the Comments table of the Stack Overflow database, which has a clustered primary key on the Id column, and query it to find all the comments Jon Skeet has ever made:
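The query looks something like this. (Jon Skeet’s Stack Overflow UserId is 22656; the column list is just an example.)

```sql
SELECT Id, CreationDate, Score, Text
FROM dbo.Comments
WHERE UserId = 22656   -- Jon Skeet
ORDER BY Id;
```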

A couple of things to recognize:

  • There’s no index on UserId, so SQL Server will need to scan the entire clustered index
  • The ORDER BY Id actually pays off a little for the clustered index scan – because we asked for the data sorted by Id anyway

The first time the query runs on 8GB RAM…

If we look at the actual execution plan, we spent 48 seconds waiting on PAGEIOLATCH (storage reads) in a 22-second query. Wait time can exceed query duration because multiple parallel threads wait on storage at the same time:

During that 22-second query, we read the entirety of the Comments table up from storage. We can prove that by checking sys.dm_io_virtual_file_stats before & after the query runs. That’s where SQL Server tracks how much it’s read & written from the various data & log files:
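A snapshot query might look like this. The counters in sys.dm_io_virtual_file_stats are cumulative since the last restart, so run it before and after the query, then diff the numbers. (The database name StackOverflow is an assumption here.)

```sql
SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.name AS logical_file_name,
       vfs.num_of_bytes_read,
       vfs.num_of_bytes_written
FROM sys.dm_io_virtual_file_stats(DB_ID(N'StackOverflow'), NULL) AS vfs
INNER JOIN sys.master_files AS mf
    ON  vfs.database_id = mf.database_id
    AND vfs.file_id     = mf.file_id;
```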

SQL Server read 21GB of data from the Stack Overflow database, which is spread across 4 data files. That’s the size of the Comments table. Okay, so far so good.

The second time the query runs on 8GB RAM…

We’ve already got at least some of the table cached in memory, right? If we rerun the query again, surely our PAGEIOLATCH waits will drop a little because the buffer pool is warmed up, right? Right?

And we read the same 21GB of data up from disk again:

The problem is that this query needs the data sorted in the order of the clustered index’s Id, so it’s going to do a scan from the beginning of the clustered index to the end. When the query completes, sure, some of the Comments table is cached in memory – but only the end of it, the highest Ids. So when our query starts again, it’s like Groundhog Day – we begin reading the table from the beginning again.

Adding just some memory doesn’t fix this.

Let’s amp our server’s memory up to 16GB and try the query again:
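Bumping memory is just another sp_configure call (again, the value is in MB):

```sql
EXEC sys.sp_configure N'max server memory (MB)', 16384;
RECONFIGURE;
```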

I’m also running the query a couple of times just to make doggone sure the buffer pool is warmed up and that SQL Server’s total memory has risen up to match its target. Then, run it a third time, and check wait stats and storage reads:

The query still takes 22 seconds, and we still wait about a minute on storage because we’re reading it all back up from disk again:

But cache the whole table, and magic happens.

Raise the server’s memory up to 24GB, enough to cache the 21GB Comments table, and run the query a couple times to prime the pump:

Suddenly the query finishes in 3 seconds instead of 22, and we don’t spend any time waiting on storage whatsoever:

Because we did no physical reads at all – the only activity during this span was a little writing to TempDB by sp_BlitzFirst itself:

Keep in mind that this is a simplified demo to illustrate a complex problem:

  • Real-life queries are more complex than this, especially in how we have to dig into the plan to understand why they’re scanning an index in order.
  • Adding an index to get a seek is a great solution – but only if you’re reading a relatively small portion of the table. If you seek in and read a range of rows, especially on real-world-sized data sets, you can still hit this problem.
  • Real-life servers serve multiple databases at a time, each running multiple queries. It’s hard as hell to cache everything you need.
  • SQL Server uses memory for lots of things, not just caching data, and they interact in tricky ways.
  • This is especially tricky in cloud VMs where you might want to make a giant leap up in memory, but you’re required to buy way more CPUs than you need in order to get more memory.

The point of the story here was to explain why adding memory – even doubling or tripling it – might not be enough to put a dent in PAGEIOLATCH waits. If SQL Server wants to read pages in a specific order, and the pages it wants aren’t in memory, it will throw out cached pages on the very same object – only to read those pages back up from disk again later while executing the query.


How to Add Invisible Indexes in SQL Server Enterprise Edition

Indexing
19 Comments

Every now and then, a client says, “We want to add indexes to this third party application, but if we do, we’ll lose support.”

No problem – enter indexed views.

Say our vendor app is the Stack Overflow database, and the app constantly queries Users by display name:
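For example, a lookup like this (the specific DisplayName value is just an illustration):

```sql
SELECT Id, DisplayName, Location, Reputation
FROM dbo.Users
WHERE DisplayName = N'Brent Ozar';
```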

That query plan scans the Users table because there’s no index on DisplayName:

And our mean ol’ nasty vendor won’t let us add any indexes. No problem – let’s add a view:
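A sketch of the view (the name is just a suggestion). Note the SCHEMABINDING option: it’s required if we want to index the view in the next step:

```sql
CREATE VIEW dbo.vwUsers_DisplayName
WITH SCHEMABINDING
AS
SELECT Id, DisplayName
FROM dbo.Users;
GO
```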

A regular view like that is just syntactic sugar that makes it easier to write queries. You hide the complexity in the view, and then people can easily select stuff from the view without understanding everything that the view is pulling together. A view is just a T-SQL shortcut, but it doesn’t change the database structure.

However, you can turn a regular view into an indexed view (or a materialized view, different terms, same thing) by creating a unique clustered index atop the view.

Normally, when we think about clustered indexes, we would assume that the Id column is the right one to use here, since it’s unique. However, the entire purpose of what we’re about to do is just to give ourselves an index on DisplayName – so let’s cluster on that:
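Something like this, assuming the schemabound view above. DisplayName alone isn’t unique, so we tack Id onto the key to satisfy the unique requirement (the index and view names are placeholders):

```sql
CREATE UNIQUE CLUSTERED INDEX CLIX_DisplayName
ON dbo.vwUsers_DisplayName (DisplayName, Id);
GO
```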

Then run our query again and check out its new execution plan:

SQL Server Enterprise Edition says, “Hey, you’re looking for Users with a specific DisplayName – it just so happens I’ve got a copy of that over here, sorted in order!” (Standard Edition doesn’t know this trick.) It produces a nifty execution plan with a key lookup – without actually having an index operation on the Users table itself.

The real beauty, though, is that when our mean ol’ nasty vendor looks at the indexes on the Users table, they have no idea that we’ve done anything. If we need to open a support case with the vendor or upgrade the software, we can temporarily drop our indexed views, and then recreate ’em after the support call is over. (That’s actually important to do, too – if the vendor tries to alter the table, their scripts can fail due to the presence of the indexed view.)

Indexed views have tons of gotchas, and explaining ’em all is way outside of the scope of this blog post. I demo ’em in the live Mastering Index Tuning class next month, or if you’ve got the recordings, head over here now.

Speaking of which, you know who else has the recordings? Paul White and Michael J. Swart, both of whom had good answers to the challenge in Monday’s blog post. Paul used the indexed view, whereas Michael had a more, shall we say, “creative” solution.


Updated First Responder Kit and Consultant Toolkit for July 2022

Summer has turned the corner. Sure, right now the First Responder Kit is wearing its swimsuit, basking in the sun, but already as we speak, the days are getting shorter. It’s only going to be a matter of time before the Pumpkin Spice release of the First Responder Kit. Go outside and take a walk, work on your tan while you still can.

How I Use the First Responder Kit
Wanna watch me use it? Take the class.

To get the new version:

Consultant Toolkit Changes

I updated it to this month’s First Responder Kit, but no changes to querymanifest.json or the spreadsheet. If you’ve customized those, no changes are necessary this month: just copy your spreadsheet and querymanifest.json into the new release’s folder.

sp_Blitz Changes

  • Enhancement: ignore SQL Server 2022’s default In-Memory OLTP usage for TempDB if it’s low. (#3110)
  • Enhancement: ignore SQL Server 2022’s idle POPULATE_LOCK_ORDINALS wait. (#3105)
  • Enhancement: ignore queries in system databases that have recompile hints. (#3119, thanks Erik Darling.)
  • Fix: typo on a check. (#3116, thanks Andreas Jordan.)

sp_BlitzCache Changes

  • Fix: should no longer get arithmetic overflow errors on servers with huge numbers of reads and writes that overflowed bigints. (#2980, thanks sunsickteck and RihoA.)
  • Fix: the Average Max Memory Grant column contents formula was wrong. (#3120, thanks MrTCS.)
  • Fix: make warning capitalization more consistent. (#3096)

sp_BlitzFirst Changes

  • Enhancement: the file stats output section now shows the database name at the far right. (#3118)
  • Fix: only repopulate the ##WaitCategories table if its contents are older, not newer. (#3092)

sp_BlitzIndex Changes

  • Fix: lower memory usage on SQL Server 2019 instances that are facing a bug in an out-of-control number of entries in sys.dm_db_missing_index_group_stats_query. This DMV is only supposed to have 600 rows, but it looks like in some older 2019 CUs, the number of recommended index plans wasn’t capped the way BOL says it should be. (#3085, thanks Paul Neering.)

sp_BlitzLock Changes

  • Enhancement: add spid and wait_resource columns to output. (#3101, thanks David Hooey.)

sp_DatabaseRestore Changes

  • Enhancement: look for Ola’s CommandExecute proc in the current database, so it should be supported in more non-system-database scenarios. (#3094 and #3095, thanks Ben Wiggins.)
  • Fix: case sensitivity typo involving BackUpFile. (#3100, thanks Maarten Clardij.)

Bonus changes: Anthony Green kept the SQL Server versions file up to date.

For Support

When you have questions about how the tools work, talk with the community in the #FirstResponderKit Slack channel. Be patient: it’s staffed by volunteers with day jobs. If it’s your first time in the community Slack, get started here.

When you find a bug or want something changed, read the contributing.md file.

When you have a question about what the scripts found, first make sure you read the “More Details” URL for any warning you find. We put a lot of work into documentation, and we wouldn’t want someone to yell at you to go read the fine manual. After that, when you’ve still got questions about how something works in SQL Server, post a question at DBA.StackExchange.com and the community (that includes me!) will help. Include exact errors and any applicable screenshots, your SQL Server version number (including the build #), and the version of the tool you’re working with.


SQL Server Pop Quiz: A Key Lookup Without the Index

Execution Plans
14 Comments

How do we get this to show up in an execution plan:

Without having an index seek or index scan operation on the same table?

Take any copy of the Stack Overflow database and write a query that will produce a key lookup on the Users table, but it’s not allowed to have an index seek or index scan operation on the Users table. You’re allowed to change the database structure – for example, if you need to change the clustered index, drop it, add nonclustered indexes, whatever – and of course include those changes in your comment as well.

Post your query in the comments here by end of day Wednesday, and I’ll pick my favorites and put ’em in the blog on Friday. My favorite creative answers will get 1 year access to the Recorded Class Season Pass: Masters Classes. My own answer blog post is written & scheduled for Thursday.