Can Adding an Index Make SQL Server 2016…Worse?

Last Updated 7 years ago

Using the StackOverflow database, let’s check out Krock’s query. He’s a competitive fella, and he’s looking to find users who signed up for StackOverflow after he did, but who have a higher reputation than he does.

I’m going to simplify the query a little here:

SELECT me.Id, u.Id
FROM dbo.Users me
JOIN dbo.Users u ON
    u.CreationDate > me.CreationDate
    AND u.Reputation > me.Reputation
WHERE me.Id = 557499;

The Users table has a clustered index on the Id field, so the initial seek on “me” (Krock’s row, Id = 557499) is super-fast. However, to find all the users with a higher creation date and reputation – man, that sucks scanning the clustered index for that, and it brings back a lot of rows. Here’s the execution plan:

[Execution plan image: clustered index scan bringing back 5.3 million rows]

Hover your mouse over the clustered index scan, and you get:

Estimated number of rows = 5,277,830
Actual number of rows = 5,277,831

Nice job, SQL Server, close enough for government work. Now it suggests a missing index – rightfully so – and let’s add it, because we want this query to go faster.
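The post doesn't show the exact index SQL Server suggested, so treat the following as a rough sketch. It assumes the suggestion keys on the two columns used in the join predicates; the index name is made up for illustration.

```sql
-- Assumed definition: the original post does not show the suggested index.
-- The index name and key column order are illustrative only.
CREATE NONCLUSTERED INDEX IX_Users_CreationDate_Reputation
    ON dbo.Users (CreationDate, Reputation);
-- Id is the clustering key, so it rides along in the nonclustered index
-- automatically and does not need to be listed as an included column.
```

With something like this in place, the query can seek on CreationDate instead of scanning the whole clustered index.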

After adding the index, here’s the execution plan:

The query runs faster, make no mistake – but check out the estimates:

Estimated number of rows = 1
Actual number of rows = 165,367

Those estimates are built by SQL Server’s cardinality estimator (CE), and there have been major changes to it over the last couple of versions. You can control which CE you’re using by changing the database’s compatibility level. This particular StackOverflow database is running in 2016 compat mode – so what happens if we switch it back to 2012 compat mode?
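If you want to flip the setting yourself, something along these lines should do it. The database name StackOverflow is an assumption here, so swap in whatever you named your copy.

```sql
-- Check the current compatibility level.
-- 130 = SQL Server 2016, 120 = SQL Server 2014, 110 = SQL Server 2012.
SELECT name, compatibility_level
FROM sys.databases
WHERE name = N'StackOverflow';  -- assumed database name

-- Switch to 2012 compat mode, which brings back the older cardinality estimator.
ALTER DATABASE StackOverflow SET COMPATIBILITY_LEVEL = 110;

-- Switch back to 2016 compat mode when you're done experimenting.
ALTER DATABASE StackOverflow SET COMPATIBILITY_LEVEL = 130;
```

Changing the compatibility level affects every query in that database, so try it on a test copy first.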

It’s a differently shaped plan – 2012 decided to go parallel with this query. Its estimated subtree cost is 7.37 – higher than my SQL Server’s Cost Threshold for Parallelism.
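To compare that 7.37 against your own server's setting, you can read the configured threshold like this. The post doesn't say what value this server uses, so the query just shows whatever yours is set to.

```sql
-- Show the server-wide Cost Threshold for Parallelism.
-- A plan whose estimated cost exceeds this value is eligible to go parallel.
SELECT name, value_in_use
FROM sys.configurations
WHERE name = N'cost threshold for parallelism';
```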

The motivating factor can be discovered by hovering over that index seek:

SQL Server 2012’s cardinality estimator guessed that 475,005 users would have a newer creation date and higher reputation than Krock.

This query has so many fun lessons to share.

Your database options matter.
What database you run a query in matters.
Neither 2012 nor 2016’s cardinality estimator is “right” in this case; they’re just differently wrong.
Indexes create statistics on their columns, but that doesn’t mean SQL Server has perfect information on what’s inside those columns.
Besides, SQL Server can’t know Krock’s creation date or reputation until the plan is built and the query’s already running.
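To see what statistics SQL Server actually has on the Users table, a query like this is one place to start. It's a sketch, not part of the original demo, and it uses the sys.stats catalog view with the sys.dm_db_stats_properties function.

```sql
-- List every statistics object on dbo.Users, with when it was last updated
-- and how many rows were sampled to build it.
SELECT s.name AS stats_name,
       sp.last_updated,
       sp.rows,
       sp.rows_sampled,
       sp.steps,
       sp.modification_counter
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID(N'dbo.Users');
```

A histogram tops out at 200 steps, which is part of why the information is never perfect on a table with millions of rows.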

I find statistics to be one of the coolest topics inside the engine, and that’s why I love Doug’s newest course, Statistics: SQL Server’s Guessing Game. I bet you’re gonna love it too – go check it out.

9 Comments

Brandon M.
July 13, 2016 9:03 am

Hey Bizarro-Brent,

Is it safe to assume the NC index you added was a covering one? What did the IO and time stats look like before the new index, after the new index in 2016 compat mode, and after the new index in 2012 compat mode? Were there no appreciable differences in 2014 compat mode?

Keep fighting the good fight!

- Brent Ozar
  July 13, 2016 10:42 am
  
  Brandon – the cool part is, you can actually try out all these questions yourself! Go get yourself the Stack database linked to in the post, and you can follow along with these kinds of experiments.
  
Thomas Pullen
July 14, 2016 4:14 am

Last week for the first time ever I used OPTION (QUERYTRACEON 9481) in anger, in Production, and it fixed my bad plan. How cool is that? I think I deserve a holiday now. There’s only so much excellence a man can be expected to excrete in one month, after all.

Zane
July 14, 2016 8:04 am

You mention the database compatibility level; however, can’t you still use trace flag 9481 to revert to the old CE or 2312 to force the new CE?

I genuinely don’t know since I’m at work and don’t have a SQL 2016 machine to test on here.

- Brent Ozar
  July 14, 2016 8:42 am
  
  Zane – yes, but that requires modifying the query, something I don’t usually have the opportunity to do.
  
- Thomas Franz
  July 14, 2016 10:23 am
  
  Don’t forget – to use trace flags you will need SA privileges. And I hope, for your sake, that your application does not connect to the server with a SysAdmin account…
  
  - Zane
    July 18, 2016 12:06 pm
    
    This is just being called out as a general add-on. I’m still on 2012 for my applications and therefore none of this applies to my application. 🙂 lol
    
Ivan
July 18, 2016 2:36 am

Thanks Brent. Wonderful. But was it worse? I see the point that the estimates were off, but you did say the query ran faster? It appears to me the decision to parallelize the query (by adding an index) resulted in better execution times (in spite of the wrong estimates).

- Brent Ozar
  July 18, 2016 7:54 am
  
  Ivan – when these tables are joined to other tables, the query gets spectacularly worse due to the 1-row estimate. (In order to illustrate concepts in blog posts, I try to use the simplest query possible. I discovered this when tuning a much larger query at a client.)
  