PSPO: How SQL Server 2022 Tries to Fix Parameter Sniffing

Parameter sniffing is a notorious problem for Microsoft SQL Server because it tries to reuse execution plans, which doesn’t work out well for widely varying parameters. Here’s a primer for the basics about how it happens.

SQL Server 2022 introduces a new feature called Parameter Sensitive Plan optimization. I’m not really sure why Microsoft capitalized the first three letters sometimes and not others, or why they only abbreviate it as “PSP optimization” – as long as we’re abbreviating, why not call it PSPO? Optimization is the word in there with the most syllables – why the hell would we leave that one as the one that isn’t abbreviated?

Because I make things go faster for a living, we’re gonna call it PSPO. You’ll see how I pronounce it later, and why.

We’ll pick up from that primer blog post, but this time around we’ll put the database in SQL Server 2022 compatibility mode, which enables PSPO.

When I run it for @Reputation = 2, hardly any users match:

The actual execution plan has some new stuff in SQL Server 2022:

Up near the top of that screenshot, if you look closely at the query, the query itself has been modified. SQL Server has added an “option” hint. Here’s the full text of the modified query:

Let’s dig into how this works.

What option plan per value means

When SQL Server compiled the query plan for the first time, it noticed that we were doing an equality search on the Users.Reputation column, and noticed that different parameters for that value might produce dramatically different numbers of rows. That explains the “predicate_range” part.

SQL Server realized:

  • Some parameters might produce less than 100 rows
  • Some might produce 100 to 1,000,000 rows
  • Some might produce over 1,000,000 rows

So SQL Server will build a small, medium, and large query plan for this one query, using different plans depending on the parameter that gets passed in. That’s pretty spiffy because the plan isn’t hard-coding specific parameter values – instead, each time the plan gets executed, SQL Server will look up that parameter in the statistics histogram and choose the small, medium, or large plan based on the value it guesses from the histogram.

To see what goes into the plan cache, let’s clear the cache, run the query, and then check the plan cache’s contents with sp_BlitzCache:

sp_BlitzCache shows that the outer stored procedure has been executed one time, and the inner statement has been executed once:

That’s kinda slick because it means SQL Server only put the medium-sized plan into cache. It didn’t build the small or large plan because it didn’t need to yet – those haven’t been executed yet, and they might never be executed.

Let’s execute the large data version, Reputation = 1. That produces a ton of rows because everybody gets 1 point when they first open their account:

The actual plan is a parallel table scan:

And SQL Server rewrote the query text to include a different QueryVariantID:

And the plan cache shows that the outer proc has been executed twice, with both the medium and large query being executed once each:

If you only read this far, and only do a short demo like that, it seems like Microsoft made huge leaps in solving the parameter sniffing issue. PSPO enables us to cache up to 3 execution plans per query – a small, medium, and large plan – and chooses between them at runtime. That’s how it’s going to look from conference stages as Microsoft brags about the effectiveness of this feature.

But let’s dig just a little bit deeper, and there are small, medium, and large problems.

Small problem: we still have sniffing.

Try running it for, say, Reputation = 3 and review the actual query plan:

Reputation = 3 reuses the query plan we built for Reputation = 2 – scroll up to the earlier screenshots if you wanna double-check my work. Note that SQL Server only estimated that it’d find 9,149 rows – that’s because the medium plan sniffed the first value it was called with, Reputation = 2.

Reputation = 3 brings back 21x more rows than Reputation = 2 did, so it reads more logical pages than there are in the table, only goes single-threaded, and the sort spills to TempDB.

If we free the plan cache, and then run it for Reputation = 3 first:

Then the “medium” plan is built with a parallel scan plan with a big memory grant, which works beautifully for Reputation = 3. However, it doesn’t work as well for Reputation = 2:

Which leaves that giant memory grant on the floor and generates a ton of CX% waits because the estimate is now 21x off in the other direction.

SQL Server 2022’s PSPO implementation doesn’t fix parameter sniffing – it amplifies the problem, because now we have more possible execution plans in memory, each of which gets sniffed. In most scenarios, I think this is still going to be a net win, because the small and large plans will likely be way less vulnerable to extreme performance emergencies. The medium plans will still be just as vulnerable.

Medium problem: direct equality searches only.

The PSPO implementation is more of a down payment than a full payment. It only works for equality searches, not range searches. One of the most common parameter sniffing problems is the date range issue: queries with start & end date parameters, like this.

PSPO doesn’t add the option hint to the query here:

Because PSPO is coded for direct equality searches only.

By direct, I mean direct comparisons to a column with known huge variances in cardinality. For an example of an inequality search, let’s take a table that has a lookup table: the Posts and PostTypes table. Stack Overflow stores all kinds of things in the Posts table, and they’re identified by their PostTypeId:

Questions and Answers are by far the most popular PostTypes:

So given that I’ve got an index on PostTypeId, which means we also have statistics on PostTypeId, this query could get different plans for different parameters:

Sadly, it does not, because PSPO doesn’t trigger here either – note that PSPO didn’t add an option hint on the query:

Note the wiiiiildly incorrect estimates on the number of Posts that will match. SQL Server’s using the density vector there, optimizing for the average PostTypeId rather than any specific one. PSPO won’t go so far as to:

  1. Fetch the PostTypeId for ‘PrivilegeWiki’, then
  2. Look up that PostTypeId in the statistics on the Posts table, then
  3. Notice that there’s a large skew by PostTypeId, and build different plans

Nope – that’s too much work for the PSPO implementation, at least in v2022.

Large problem: monitoring software is doomed.

If you look closely at the sp_BlitzCache screenshots, there’s a huge, massive, giant, unbelievably big new problem:

In the “Query Type” column, the two PSPO-generated queries just say “Statement.” When you’re looking at queries and plans – whether you’re looking at the plan cache, or sp_WhoIsActive, or your monitoring tools, or Query Store as Erik Darling demos – SQL Server can no longer tell where the query came from. It’s like every running statement is suddenly dynamic SQL with no parent.

For example, if you look at the query plan for a stored procedure that’s been “optimized” by PSPO, you get:

If you think that’s bad, there’s no way to join to the underlying statements. That’s not just bad, it’s poor. Really, really poor.

Going into SQL Server 2022 compatibility level simply breaks query performance monitoring.

And it looks bad even in shades.
I call ’em like I see ’em.

When the first previews of 2022 dropped, I played around with this feature and thought, “There is absolutely no way they’d ship something this broken.” I just kinda shrugged and moved on. But now, today, it’s heartbreaking to think this is the way the feature’s going to ship.

That’s PSPO.

Pronounced pss-poh, as in piss-poor.

I love the idea of parameter-sensitive plan optimization. It’s a great idea. Done correctly, this would make database apps go faster and reduce the troubleshooting involved when they’re having performance issues.

But this, this is just a PSPO implementation.

Previous Post
What’s New in SQL Server 2022 Release Candidate 0: Undocumented Stuff
Next Post
Drawing Entity Relationship Diagrams with Stable Diffusion

16 Comments. Leave new

  • I can imagine the technical people and marketing folks sitting around the table figuring out what to name this “feature”, coming up with PSPO and marketing saying, “We can’t call it that. Brent Ozar is going to name it ‘piss poh'”. Let’s abbeviate it “PSP Optimization” and hope he doesn’t catch on.” Nothing gets by you!

    In your dreams, how should this type of problem really be solved? Are there any database engines that do this well?

    Reply
  • Geoff Langdon
    August 26, 2022 8:29 am

    Do you know if we will be able to simply turn the feature off, so that at least the monitoring software will allow us to manually tune as we do at the moment, and we will get the proc name instead of ‘Statement’?

    Reply
  • I wonder why they didn’t use a solution as for the adaptive joins to show the multiple plans in the execution plan (okay, maybe because they would have had to calculate all 3 plans in this case).

    What happens if you query the reputation = 3 multiple times (after “initializing” the plan with reputation = 2)? Shouldn’t the “old” 2019 memory grant feedback feature kick in and grands more memory to this plan after some executions?

    Reply
    • Yep, all you have to do is run queries with the same parameters a few times and then they’ll be fast

      And never run them with any other parameters

      I’m sure users will understand and obey that

      What could go wrong

      Reply
  • Stephen Morris
    August 26, 2022 9:08 am

    Trying to be constructive perhaps Microsoft could re-use the numbered routines feature in some way

    ie put this in SSMS & see what happens

    use tempdb

    go

    create procedure p_test;1 as

    select ‘test1’

    return(0)

    go

    create procedure p_test;2 as

    select ‘test2’

    return(0)

    go

    create procedure p_test;3 as

    select ‘test3’

    return(0)

    go

    select * from sys.objects where name like ‘p_test’

    Select * from sys.numbered_procedures

    drop procedure p_test

    go

    select * from sys.objects where name like ‘p_test’

    Select * from sys.numbered_procedures

    Reply
  • I loved your blog post about Parameter Sniffing.
    Thank you for providing detailed, actionable information!

    Reply
  • That’s how I try (and fail) to get my cat’s attention: psss-poh pssss-poh pssss-poh

    Meh, now I’m pretty disappointed with the feature overall.

    Reply

Leave a Reply

Your email address will not be published.

Fill out this field
Fill out this field
Please enter a valid email address.