Estimated and Actual Plans Can Have Different Shapes

A reader posted a question for Office Hours:

Hi Brent, what is your take on Hugo Kornelis’s explanation of execution plan naming? As per his explanation, an estimated execution plan is simply an execution plan, whereas an actual execution plan = execution plan + run-time stats. Do you agree that the naming is flawed and confusing? – Yourbiggestfan

I like Hugo a lot – brilliant fella, and he knows way more than I do about execution plans – but he’s wrong on this one. Estimated and actual plans don’t always have the same shape.

I’ll use the training version of the Stack Overflow database, but any size will work here. Start with a few indexes:
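A minimal sketch of indexes that would fit the plans discussed in this post, assuming the standard Stack Overflow schema (dbo.Users with Id and Location, dbo.Comments with UserId and Score). The index names are my assumptions, picked to match how the plans are described:

```sql
-- Illustrative sketch; index names are assumptions, not a confirmed script.
-- Lets SQL Server seek to the users in a given location.
CREATE INDEX Location ON dbo.Users (Location);

-- Lets SQL Server seek to the comments written by a given user.
CREATE INDEX UserId ON dbo.Comments (UserId);

-- Lets SQL Server read comments in score order without sorting.
CREATE INDEX Score ON dbo.Comments (Score);
GO
```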

Then create a stored procedure – yes, doing this with a temp table is silly, but I need a short, simple example to show the problem:
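A sketch of what such a proc could look like, assuming the Stack Overflow schema. The procedure name dbo.usp_TopCommentsByLocation, the temp table name #MyUsers, and the selected columns are hypothetical; the shape (load a temp table, then join to it and take the top 100 by Score) follows the description in this post:

```sql
-- Hypothetical proc; names are assumptions for illustration only.
-- CREATE OR ALTER needs SQL Server 2016 SP1 or newer.
CREATE OR ALTER PROCEDURE dbo.usp_TopCommentsByLocation
    @Location NVARCHAR(100)
AS
BEGIN
    -- First query: stash the matching users in a temp table.
    CREATE TABLE #MyUsers (UserId INT);

    INSERT INTO #MyUsers (UserId)
    SELECT Id
    FROM dbo.Users
    WHERE Location = @Location;

    -- Second query: top 100 highest-scoring comments by those users.
    SELECT TOP 100 u.DisplayName, c.Score, c.Text
    FROM #MyUsers m
    INNER JOIN dbo.Users u ON u.Id = m.UserId
    INNER JOIN dbo.Comments c ON c.UserId = u.Id
    ORDER BY c.Score DESC;
END
GO
```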

Then get the estimated execution plan for India:
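In SSMS, one way is to highlight the EXEC statement and press Ctrl+L. A script-only sketch of the same thing, assuming a proc named dbo.usp_TopCommentsByLocation (a hypothetical name) that takes a @Location parameter:

```sql
-- SET SHOWPLAN_XML must be the only statement in its batch.
SET SHOWPLAN_XML ON;
GO
-- With SHOWPLAN_XML on, this is compiled but NOT executed;
-- SQL Server returns the estimated plan instead of results.
EXEC dbo.usp_TopCommentsByLocation @Location = N'India';
GO
SET SHOWPLAN_XML OFF;
GO
```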

The estimated plans for procs are a little weird in that they look like they’re a single statement:

But let’s zoom in on the second query in the proc. I’m not concerned about the plan for the insert into the temp table – that one’s pretty straightforward since we have an index on Location. Pay particular attention to the second query’s plan, though:

Read right to left, top to bottom to see that SQL Server estimates it will use a single CPU core (no parallelism) to:

  1. Scan the temp table
  2. Do a series of clustered index seeks on Users, then
  3. Do a series of UserId index seeks on Comments, then
  4. Sort the comments by Score, descending

But when you actually run the query and include the actual plan, the shape is totally different:
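To capture the actual plan from a script rather than the SSMS toolbar button (Ctrl+M), something like this should work, again assuming the hypothetical proc name dbo.usp_TopCommentsByLocation:

```sql
-- Returns each statement's actual plan (with run-time stats)
-- alongside the normal query results.
SET STATISTICS XML ON;
GO
EXEC dbo.usp_TopCommentsByLocation @Location = N'India';
GO
SET STATISTICS XML OFF;
GO
```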

The query went parallel, and it chose a completely different query plan. Because there were so many rows in the temp table, SQL Server decided to scan the Score index on the Comments table, from highest-ranking Comments to lowest. It figured it wouldn’t have to read too many Comments before it stumbled across 100 that were written by people in the temp table. There were tons of differences, and here are just a few:

  • The estimated plan was serial, but it actually went parallel
  • The estimated plan used the Comments.UserId index, but the actual one used Comments.Score
  • The estimated plan started with the temp table, but the actual one started with Comments

The root cause on this one was that when the estimated plan was generated, SQL Server hadn’t created the temp table yet, so it didn’t have any statistics on its contents. At runtime, once the temp table was populated and statistics were created on it, SQL Server recompiled the second query while the proc ran, so the actual plan had a wildly different shape.

SQL Server 2022 is even worse.

I’ll do a classic parameter sniffing demo that I do all the time:
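A sketch of a typical setup for this kind of demo, assuming the Stack Overflow dbo.Users table. The index name, the proc name dbo.usp_UsersByReputation, and the exact query are assumptions for illustration:

```sql
-- Hypothetical demo objects; names and query shape are assumptions.
CREATE INDEX Reputation ON dbo.Users (Reputation);
GO
CREATE OR ALTER PROCEDURE dbo.usp_UsersByReputation
    @Reputation INT
AS
BEGIN
    -- Reputation values are heavily skewed, so the best plan
    -- can depend a lot on which value gets compiled first.
    SELECT TOP 10000 *
    FROM dbo.Users
    WHERE Reputation = @Reputation
    ORDER BY DisplayName;
END
GO
```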

Execute it for Reputation = 2 – no need to get the actual plan, just run it:
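Assuming a proc named dbo.usp_UsersByReputation (a hypothetical name) that takes a @Reputation parameter, that call would look something like:

```sql
-- Runs the proc for real, which compiles and caches a plan for this value.
EXEC dbo.usp_UsersByReputation @Reputation = 2;
GO
```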

And then try to get the estimated plan for Reputation = 1:
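A scripted version of that request, under the same assumption about the hypothetical proc name dbo.usp_UsersByReputation (in SSMS, highlighting the EXEC and pressing Ctrl+L is another way to ask):

```sql
-- Ask for the estimated plan only; the proc is not executed.
SET SHOWPLAN_XML ON;
GO
EXEC dbo.usp_UsersByReputation @Reputation = 1;
GO
SET SHOWPLAN_XML OFF;
GO
```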

SQL Server 2022 is all, “Estimated plan? I could tell you – but then I’d have to kill you.”

That’s why I keep using the terms estimated plan & actual plan. For another example, check out Erik Darling’s recent post about estimated plan parallelism. (I’d already written & scheduled mine when his was published, and I had to link to it here because the timing’s great.)
