ColumnStore Indexes: Rowgroup Elimination and Parameter Sniffing In Stored Procedures

Last Updated 4 years ago

Columnstore Indexes, Development, Execution Plans, Indexing, Parameter Sniffing, SQL Server, SQL Server 2017

Yazoo

Over on his blog, fellow Query Plan aficionado Joe Obbish has a Great Post, Brent® about query patterns that qualify for Rowgroup Elimination. This is really important to performance! It allows scans to skip over stuff it doesn’t need, like skipping over the dialog in, uh… movies with really good fight scenes.

Car chases?

Soundtracks?

Soundtracks.

Fad Gadget

With Joe’s permission (we’re a polite people, here) I decided to pick on one of the queries that was eligible for Rowgroup Elimination and stick it in a stored procedure to see what would happen. I’m interested in a couple things.

Is Rowgroup Elimination considered safe with variables?
Will the plan change if different numbers of Rowgroups are skipped?

With those in mind, let’s create a proc to test those out. Head on over to Joe’s post if you want the setup scripts.

DROP PROCEDURE IF EXISTS #wonky_eye;
GO

CREATE PROCEDURE #wonky_eye
(
    @idlow BIGINT,
    @idhigh BIGINT
)
AS
    BEGIN

        SELECT  MAX(ID)
        FROM    dbo.MILLIONAIRE_CCI
        WHERE   ID BETWEEN @idlow AND @idhigh;

    END;
GO

DROP PROCEDURE IF EXISTS #wonky_eye;

CREATE PROCEDURE #wonky_eye

(

@idlow BIGINT,

@idhigh BIGINT

)

BEGIN

SELECT MAX(ID)

FROM dbo.MILLIONAIRE_CCI

WHERE ID BETWEEN @idlow AND @idhigh;

END;

With literal values, the optimizer/storage engine/unicorn toenails behind the scenes are able to figure out which Rowgroups are needed to satisfy our query.

With stored procedures, though, the first execution will cache a particular plan, and the rest of the queries will reuse that plan.

Let’s test our first hypothesis! Will different passed in values result in appropriate Rowgroup Elimination?

EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 1000
EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 10000
EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 100000
EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 1000000
EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 10000000
EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 100000000

EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 1000

EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 10000

EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 100000

EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 1000000

EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 10000000

EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 100000000

All of those queries use the exact same execution plan:

But do they all use it as efficiently?

Well, in short, no.

Things start off okay, and to be fair, Rowgroup Elimination occurs as appropriate.

Table 'MILLIONAIRE_CCI'. Segment reads 1, segment skipped 99.

Table 'MILLIONAIRE_CCI'. Segment reads 1, segment skipped 99.

Table 'MILLIONAIRE_CCI'. Segment reads 1, segment skipped 99.

Table 'MILLIONAIRE_CCI'. Segment reads 1, segment skipped 99.

Table 'MILLIONAIRE_CCI'. Segment reads 10, segment skipped 90.

Table 'MILLIONAIRE_CCI'. Segment reads 1, segment skipped 99.

Table 'MILLIONAIRE_CCI'. Segment reads 10, segment skipped 90.

But once our high ID hits 10000000, things start to slow down.

SQL Server Execution Times:
  CPU time = 0 ms,  elapsed time = 0 ms.

SQL Server Execution Times:
  CPU time = 0 ms,  elapsed time = 1 ms.

SQL Server Execution Times:
  CPU time = 0 ms,  elapsed time = 5 ms.

SQL Server Execution Times:
  CPU time = 47 ms,  elapsed time = 50 ms.

SQL Server Execution Times:
  CPU time = 500 ms,  elapsed time = 492 ms.

SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 0 ms.

SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 1 ms.

SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 5 ms.

SQL Server Execution Times:

CPU time = 47 ms, elapsed time = 50 ms.

SQL Server Execution Times:

CPU time = 500 ms, elapsed time = 492 ms.

And by the time we hit ID 100000000, things have melted into a fondue almost no one would want.

Table 'MILLIONAIRE_CCI'. Segment reads 100, segment skipped 0.

 SQL Server Execution Times:
   CPU time = 5110 ms,  elapsed time = 5095 ms.

Table 'MILLIONAIRE_CCI'. Segment reads 100, segment skipped 0.

SQL Server Execution Times:

CPU time = 5110 ms, elapsed time = 5095 ms.

Five seconds! Five! Who has that kind of time on their hands?

The performance cliff can be further exposed by incrementing IDs between 10000000 and 100000000.

While Rowgroup Elimination occurs just like before, CPU keeps on going up.

EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 20000000
EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 30000000
EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 40000000
EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 50000000
EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 60000000
EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 70000000
EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 80000000
EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 90000000

EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 20000000

EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 30000000

EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 40000000

EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 50000000

EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 60000000

EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 70000000

EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 80000000

EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 90000000

It looks like we found a potential tipping point where a different plan would be more effective.

SQL Server Execution Times:
  CPU time = 1000 ms,  elapsed time = 995 ms.

SQL Server Execution Times:
  CPU time = 1469 ms,  elapsed time = 1463 ms.

SQL Server Execution Times:
  CPU time = 1953 ms,  elapsed time = 1951 ms.

SQL Server Execution Times:
  CPU time = 2437 ms,  elapsed time = 2440 ms.

SQL Server Execution Times:
  CPU time = 2922 ms,  elapsed time = 2926 ms.

SQL Server Execution Times:
  CPU time = 3422 ms,  elapsed time = 3414 ms.

SQL Server Execution Times:
  CPU time = 3906 ms,  elapsed time = 3916 ms.

SQL Server Execution Times:
  CPU time = 4391 ms,  elapsed time = 4392 ms.

SQL Server Execution Times:

CPU time = 1000 ms, elapsed time = 995 ms.

SQL Server Execution Times:

CPU time = 1469 ms, elapsed time = 1463 ms.

SQL Server Execution Times:

CPU time = 1953 ms, elapsed time = 1951 ms.

SQL Server Execution Times:

CPU time = 2437 ms, elapsed time = 2440 ms.

SQL Server Execution Times:

CPU time = 2922 ms, elapsed time = 2926 ms.

SQL Server Execution Times:

CPU time = 3422 ms, elapsed time = 3414 ms.

SQL Server Execution Times:

CPU time = 3906 ms, elapsed time = 3916 ms.

SQL Server Execution Times:

CPU time = 4391 ms, elapsed time = 4392 ms.

State of Confusion

So where are we? Well, we found that Rowgroup Elimination is possible in stored procedures with ColumnStore indexes, but that the cached plan doesn’t change based on feedback from that elimination.

Good news: elimination can occur with variables passed in.
Bad news: that cached plan sticks with you like belly fat at a desk job

Remember our plan? It used a Stream Aggregate to process the MAX. Stream Aggregates are preferred for small, and/or ordered sets.

How do we know this is the wrong plan, here? How can we test it?

If we run the stored proc once the regular way, and once with a RECOMPILE hint, well…

EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 100000000
EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 100000000 WITH RECOMPILE

1 2	EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 100000000 EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 100000000 WITH RECOMPILE

The query plans decide to be cheeky and have similar costs. You’ll notice that the Hash Match plan is 51% of the cost, and the Stream Aggregate plan is 49%. The relative difference is tiny, though. The Stream Aggregate plan costs 27.8431 query bucks, and the Hash Match plan costs 28.8442 query bucks. That’s a difference of 1.0011 query bucks, which is exactly the cost difference between the Stream Aggregate operator, and the Hash Match operator.

But in this case, cost is one of those awful, lying metrics. Let’s look at the different execution times.

SQL Server Execution Times:
  CPU time = 4969 ms,  elapsed time = 4957 ms.

SQL Server Execution Times:
  CPU time = 47 ms,  elapsed time = 49 ms.

SQL Server Execution Times:

CPU time = 4969 ms, elapsed time = 4957 ms.

SQL Server Execution Times:

CPU time = 47 ms, elapsed time = 49 ms.

The plan with the RECOMPILE hint finished in 49ms. That’s, like, 1000x faster. The difference of course is the Hash Match Aggregate operator replacing the Stream Aggregate operator.

So what happens if we run things in reverse order after freeing the cache?

EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 100000000

EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 1000

EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 100000000

EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 1000

It turns out that the Hash Match Aggregate plan is just as good for the ‘small’ query. It has the same metrics that it did when using the Stream Aggregate plan when it ran first. In this case, there are no other differences to account for, like Key Lookups, or Parallelism.

Happy Mondays

We’ve come this far, so let’s answer one last question: At which point will ol’ crazypants choose the Hash Match plan over the Stream Aggregate plan? To answer that, we’ll go back to our original test scheme, and add in recompiles so each execution gets a fresh plan.

EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 1000 WITH RECOMPILE
EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 10000 WITH RECOMPILE
EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 100000 WITH RECOMPILE
EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 1000000 WITH RECOMPILE
EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 10000000 WITH RECOMPILE
EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 100000000 WITH RECOMPILE

EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 1000 WITH RECOMPILE

EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 10000 WITH RECOMPILE

EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 100000 WITH RECOMPILE

EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 1000000 WITH RECOMPILE

EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 10000000 WITH RECOMPILE

EXEC dbo.#wonky_eye @idlow = 1, @idhigh = 100000000 WITH RECOMPILE

The answer is, unfortunately, pretty early on. The second query chooses the Hash Match plan, which means that, well, almost every single execution would have been better off with a different set of values passed in first, and the Hash Match plan chosen. Hmpf.

Human League

What did we learn, after all these words? Rowgroup Elimination is possible in stored procedures, but it doesn’t provide any feedback to plan choice. No matter how many were eliminated or not, the plan remained the same. As eliminations decreased, performance got worse using the ‘small’ plan. This is textbook parameter sniffing, but with a twist.

I also tested this stuff on 2017, and there was no Adaptive magic that I could find.

Thanks for reading!

Should You Use Always Encrypted?

5 Comments. Leave new

Joe Obbish
August 23, 2017 9:04 am

Great Post, Brent®!

Reply
- Erik Darling
  August 23, 2017 10:14 am
  
  Thanks, Brent®!
  
  Reply
  - OZ
    August 24, 2017 5:52 am
    
    I bet that Eric guy is jealous of this well-written Brent® post.
    
    Reply
Anders Uhl Pedersen
March 5, 2018 3:32 pm

Excellent post, but in all fairness 49 ms is only a hundred times faster than 4957

Reply
- Erik Darling
  March 5, 2018 4:00 pm
  
  Cool, so every time you take out your phone I want you to count to 5 before you turn on the screen, unlock it, read a message, respond to a message, or make a phone call. Sound good?
  
  Reply