Say you’ve got a table – in this case, I’m using the Stack Overflow Posts table, which holds all of the questions and answers:
See the clustered index at the bottom? It’s on the Id field, which is an identity – starts at one and goes up to a bajillion.
So it would be really stupid to do this, right?
CREATE INDEX IX_Id ON dbo.Posts(Id);
Why would someone ever create yet another index on Id when you already have the clustered index? I’ll show you why:
SELECT COUNT(*) FROM dbo.Posts;
SELECT COUNT(DISTINCT Id) FROM dbo.Posts;
SELECT COUNT(*) FROM dbo.Posts WHERE Body IS NOT NULL;
Here are the execution plans – all 3 of which use our Id index:
The performance differences are pretty eye-opening:
- Clustered index scan: 50 seconds, 7,736,170 logical reads
- Nonclustered index scan: 0.5 seconds, 36,767 logical reads
When SQL Server has to scan the table, it says, “Is there a faster object that I could scan instead? Is there another copy of the table I could use that has less data in it?” Indeed, our nonclustered index just on Id is way smaller, as sp_BlitzIndex shows:
The clustered index (row 2) is 70GB because it has all the fields, whereas the nonclustered index on Id is a tiny 285MB in comparison. Turns out all those big string fields take up a lot of space on the clustered index. Who knew?
My favorite query plan of the 3, though, is the count WHERE Body IS NOT NULL. You’d think SQL Server would have to go check the Body field, right? But no – the Body field was specified as non-nullable (scroll back up to check the table definition) so SQL Server can simply ignore it. Ahh, the sweet benefits of the right null-vs-not-null table definitions.
The index on dbo.Posts.Id can even be used to speed up foreign key creation on other tables. Illustrating this is just a little trickier since foreign key creations don’t show execution plans (nor do they show up in the plan cache.) To show it, I restarted my instance, created a foreign key between Comments.PostId and Posts.Id, and then ran sp_BlitzCache. (You could also probably show this with Extended Events, but I’m doing this fast, dear reader.)
I’m not saying that those queries are a good idea – I’m just explaining why someone might have created that index, and what might backfire when you drop it. I’m also not saying you can never drop it, either – I’m just warning you to keep an eye out for what might slow down.
And by the way, if this comes as a surprise, I bet you haven’t seen my free course, How to Think Like the SQL Server Engine. It teaches you the differences between clustered & nonclustered indexes – something everybody thinks they know until I start asking them questions.