Performance Tuning

Does an Index on Just The Clustering Key Ever Make Sense?

By Brent Ozar · April 20, 2018 · 10 comments

Say you’ve got a table – in this case, I’m using the Stack Overflow Posts table, which holds all of the questions and answers:

See the clustered index at the bottom? It’s on the Id field, which is an identity – starts at one and goes up to a bajillion.

So it would be really stupid to do this, right?

CREATE INDEX IX_Id ON dbo.Posts(Id);

1	CREATE INDEX IX_Id ON dbo.Posts(Id);

Why would someone ever create yet another index on Id when you already have the clustered index? I’ll show you why:

SELECT COUNT(*) FROM dbo.Posts;
SELECT COUNT(DISTINCT Id) FROM dbo.Posts;
SELECT COUNT(*) FROM dbo.Posts WHERE Body IS NOT NULL;

SELECT COUNT(*) FROM dbo.Posts;

SELECT COUNT(DISTINCT Id) FROM dbo.Posts;

SELECT COUNT(*) FROM dbo.Posts WHERE Body IS NOT NULL;

Here are the execution plans – all 3 of which use our Id index:

The performance differences are pretty eye-opening:

Clustered index scan: 50 seconds, 7,736,170 logical reads
Nonclustered index scan: 0.5 seconds, 36,767 logical reads

When SQL Server has to scan the table, it says, “Is there a faster object that I could scan instead? Is there another copy of the table I could use that has less data in it?” Indeed, our nonclustered index just on Id is way smaller, as sp_BlitzIndex shows:

The clustered index (row 2) is 70GB because it has all the fields, whereas the nonclustered index on Id is a tiny 285MB in comparison. Turns out all those big string fields take up a lot of space on the clustered index. Who knew?

My favorite query plan of the 3, though, is the count WHERE Body IS NOT NULL. You’d think SQL Server would have to go check the Body field, right? But no – the Body field was specified as non-nullable (scroll back up to check the table definition) so SQL Server can simply ignore it. Ahh, the sweet benefits of the right null-vs-not-null table definitions.

The index on dbo.Posts.Id can even be used to speed up foreign key creation on other tables. Illustrating this is just a little trickier since foreign key creations don’t show execution plans (nor do they show up in the plan cache.) To show it, I restarted my instance, created a foreign key between Comments.PostId and Posts.Id, and then ran sp_BlitzCache. (You could also probably show this with Extended Events, but I’m doing this fast, dear reader.)

Score another one for our nonclustered index

I’m not saying that those queries are a good idea – I’m just explaining why someone might have created that index, and what might backfire when you drop it. I’m also not saying you can never drop it, either – I’m just warning you to keep an eye out for what might slow down.

And by the way, if this comes as a surprise, I bet you haven’t seen my free course, How to Think Like the SQL Server Engine. It teaches you the differences between clustered & nonclustered indexes – something everybody thinks they know until I start asking them questions.

Free, 3× a week

Get my new posts by email

Three posts a week, plus a Monday roundup of the best database news from around the web.

10 comments

alen teplitsky

April 20, 2018 at 8:59 am

I don’t automatically hate clustered PK’s on an identity column, but i’ve seen tables where the index to column ratio is close to 1:1 because of this

Reply
Eryk Seppanen Student since 2017

April 20, 2018 at 11:26 am

That 3rd one had me scratching my head. Sneaky!

Reply
Ron S

April 20, 2018 at 11:31 am

I was reading through it and suddenly I saw….FREE COURSE!!! 🙂
Good article also. I would be doing that free course right now because it doesn’t hurt at all to get back up to speed on some of that essential information, but the company wants us to learn more about AWS. Even though I consider your free course and the information it contains may actually be more valuable!

Reply
1. Minh Student since 2018
  
  April 24, 2018 at 4:51 am
  
  Same like me 😉
  
  Reply
Marc Kent Student since 2017

April 20, 2018 at 11:33 am

Great post! We use these a lot for super fast EXISTS and NOT EXISTS checks.

Reply
Aaron Morelli

April 20, 2018 at 1:06 pm

“Stuck on 2008 R2? Roll your own column store indexes! Put a single-col b-tree on every column!”

Reply
Tom Norman Student since 2019

April 20, 2018 at 2:01 pm

I saw this on a database I was working on the other day. The LOB kills you and it is there because of the nvarchar(max) field, Body. Our (max) field has a picture in it. The clustered index has that picture in it causing the CPU to go crazy. Told them to get that picture out of the database. In my case, the execution plans using the cluster key is using 86 percent of the query time.

Reply
Darrell

April 23, 2018 at 11:28 am

Mind Blown!
Ah Ha!
When you know it but then Brent paints it together for you, again, and makes you slap yourself and go, “YEAH!”
Thank You, Brent.

Reply
1. Brent Ozar
  
  April 23, 2018 at 12:08 pm
  
  Hahaha, you’re welcome. Posted that after answering a couple of DBA.se questions about it where everybody seemed to be surprised, so figured I should probably blog it.
  
  Reply
  1. dw
    
    April 27, 2018 at 5:40 pm
    
    Wait, there’s a site for Swedish DBAs? https://imgflip.com/i/298rcl
    
    Reply

Does an Index on Just The Clustering Key Ever Make Sense?

Get my new posts by email

Keep digging

10 comments

Leave a comment Cancel reply