Can Adding an Index Make SQL Server 2016…Worse?

Using the StackOverflow database, let’s check out Krock’s query. He’s a competitive fella, and he’s looking to find users who signed up for StackOverflow after he did, but who have a higher reputation than he does.

I’m going to simplify the query a little here:

The Users table has a clustered index on the Id field, so the initial seek on “me” (Krock’s row, Id = 557499) is super-fast. However, to find all the users with a higher creation date and reputation – man, that sucks scanning the clustered index for that, and it brings back a lot of rows. Here’s the execution plan:

execution_plan_1_details
Bringing back 5.3mm rows

Hover your mouse over the clustered index scan, and you get:

  • Estimated number of rows = 5,277,830
  • Actual number of rows = 5,277,831

Nice job, SQL Server, close enough for government work. Now it suggests a missing index – rightfully so – and let’s add it, because we want this query to go faster.

After adding the index, here’s the execution plan:

New and improved execution plan
New and improved execution plan

The query runs faster, make no mistake – but check out the estimates:

  • Estimated number of rows = 1
  • Actual number of rows = 165,367

Those estimates are built by SQL Server’s cardinality estimator (CE), and there have been major changes to it over the last couple of versions. You can control which CE you’re using by changing the database’s compatibility level. This particular StackOverflow database is running in 2016 compat mode – so what happens if we switch it back to 2012 compat mode?

Running in 2012 compatibility
Running in 2012 compatibility

It’s a differently shaped plan – 2012 decided to go parallel with this query. Its estimated subtree cost is 7.37 – higher than my SQL Server’s Cost Threshold for Parallelism.

The motivating factor can be discovered by hovering over that index seek:

Higher number of estimated rows
Higher number of estimated rows

SQL Server 2012’s cardinality estimator guessed that 475,005 users would have a newer creation date and higher reputation than Krock.

This query has so many fun lessons to share.

  • Your database options matter.
  • What database you run a query in matters.
  • Neither 2012 nor 2016’s cardinality estimator is “right” in this case, they’re just differently wrong
  • Indexes create statistics on their columns, but that doesn’t mean SQL Server has perfect information on what’s inside those columns
  • Besides, SQL Server can’t know Krock’s creation date or reputation until the plan is built and the query’s already running

I find statistics to be one of the coolest topics inside the engine, and that’s why I love Doug’s newest course, Statistics: SQL Server’s Guessing Game. I bet you’re gonna love it too – go check it out.

Previous Post
New Course: Statistics – SQL Server’s Guessing Game
Next Post
Why Not Just Create Statistics?

9 Comments. Leave new

  • Hey Bizarro-Brent,

    Is it safe to assume the NC index you added was a covering one? What did the IO and time stats look like before the new index, after the new index in 2016 compat mode, and after the new index in 2012 compat mode? Were there no appreciable differences in 2014 compat mode?

    Keep fighting the good fight!

    Reply
    • Brandon – the cool part is, you can actually try out all these questions yourself! Go get yourself the Stack database linked to in the post, and you can follow along with these kinds of experiments.

      Reply
  • Last week for the first time ever I used OPTION (QUERYTRACEON 9481) in anger, in Production, and it fixed my bad plan. How cool is that? I think I deserve a holiday now. There’s only so much excellence a man can be expected to excrete in one month, after all.

    Reply
  • You mention the database compatibility level however, can’t you still use Trace Flags 9481 to revert to the OLD CE or 2312 to force the new CE?

    I genuinely don’t know since I’m at work and don’t have a SQL 2016 machine to test on here.

    Reply
    • Zane – yes, but that requires modifying the query, something I don’t usually have the opportunity to do.

      Reply
    • Thomas Franz
      July 14, 2016 10:23 am

      Don’t forget – to use trace flags you will need SA privileges. And I hope for you, that your application does not connect to the server with and SysAdmin account…

      Reply
      • This is just being called out as a general add on. I’m still on 2012 for my applications and there fore non of this applies to my application. 🙂 lol

        Reply
  • Thanks Brent. Wonderful. But so was is worse? I see the point the estimates were off, but you did say the query ran faster? It appears to me the decision to parallelize the query (by adding an index) resulted better execution times (in spite the wrong estimates).

    Reply
    • Ivan – when these tables are joined to other tables, the query gets spectacularly worse due to the 1-row estimate. (In order to illustrate concepts in blog posts, I try to use the simplest query possible. I discovered this when tuning a much larger query at a client.)

      Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.