Query Plans: Trivial Optimization vs Simple Parameterization

Facemaking

You know when you think you know something, and the obviousness of it makes you all the more confident that you know it?

That’s usually the first sign that there’s a giant gotcha waiting for you.

And that’s what happened to me over the weekend.

The setup

I answered a question.

And after I answered it, I got to thinking… Would partitioning help? After all, smart people agree. Partitioning is ColumnStore’s friend.

So I started looking at it a little bit more differenter.

First, I had to set up a partitioning function and scheme. I used dynamic SQL to create the function because no way in Hades am I typing numbers from 1-999.

Next, I loaded in a billion rows.

Yes, a billion.

And you know, it turns out generating a billion rows at once is terribly slow. So I wrote an awful loop to insert 1 million rows 1000 times.

Then, you know, I needed some indexes.

Lemme be straight with you here: don’t create the indexes first. After about the 4th loop, things grind to a halt.

This is why other smart people will tell you the fastest way to load data is into a HEAP.

Lemme be straight with you about something else: I couldn’t figure out how to just create the ColumnStore index on the partitioned table. I had to create a regular clustered index, and then the ColumnStore index over it with DROP_EXISTING.

So there I had it, my beautiful, billion-row table.

Partitioned, ColumnStore-d.

What could go wrong?

Plantastic

Let’s look at one query with a few variations.

The plan for it is alright. It’s fairly straightforward and the query finishes in about 170ms.

We can see from the graphical execution plan that it’s been Simple Parameterized. SQL Server does this to make plan caching more efficient.

GOLD STARS FOR EVERYONE

With parameters in place rather than literals, the plan is accessible to more queries using the same predicate logic on the Id column.

We can also see the plan is Trivial. If you’re following along, hit F4 while the SELECT operator is highlighted in the query plan , and a Properties pane should open up magically on the right side of the screen.

Pallies

Well, I can change that! I know a trick! I learned it from Paul White!

If I add 1 = 1 somewhere in my WHERE clause, I can get around Simple Parameterization and the Trivial plan.

Right?

Heh heh heh, that’ll show you.

Stringy

Oh but, no. The plan is still trivial, and it still runs in about 172ms.

D.R.A.T.

So how do you beat the trivial plan?

You use yet another trick from Paul White.

Dear Paul,

Thanks.

— The rest of us

If you stick a goofy subquery in, the optimizer will regard it with contempt and suspicion, and give it a full pat down.

Here’s the plan for this query.

Still not Simple!

And, amazingly, it gets FULL optimization.

Full of what, exactly?

Why is this better?

Astute observers may have picked up that the plan changed a little bit. Earlier plans that got Trivial optimization used a Stream Aggregate operator, where the Full optimization plan uses a Hash Match Aggregate.

What’s the deal with that?

Aggregatin’

To make matters more interesting, the Full optimization query finishes in 11ms.

That’s down from 172ms for the Trivial plan.

What else was different?

There’s a really important difference in the two plans.

The Trivial plan was run in RowStore execution mode. This is the enemy of ColumnStore indexes.

It’s essentially using them the old fashioned way.

Voices carry

The plan that gets Full optimization runs in Batch execution mode, and this makes everyone very happy.

CAUGHT

This is why sp_BlitzCache will warn you about both Trivial plans, and ColumnStore indexes being executed in RowStore mode.

Because that’s why.

Thanks for reading!

Previous Post
New #SQLPASS Summit Pre-Con: Expert Performance Tuning for SQL Server 2016 & 2017
Next Post
[Video] Office Hours 2017/06/14 (With Transcriptions)

4 Comments. Leave new

  • Emanuele Meazzo
    June 16, 2017 8:46 am

    Damn you trivial optimization for columnstore!
    In which version of SQL Server did you test this? Seems like something that needs to be fixed by M$

    Great post btw!

    Reply
    • Erik Darling
      June 16, 2017 9:30 am

      Hi there! Thanks! This would have been on 2016, though I don’t recall which CU of SP1 my server was up to at the time.

      Reply
      • Emanuele Meazzo
        June 16, 2017 10:31 am

        I just tried on the latest CU (13.0.4435.0) same results.
        This is definitely something to keep in mind if working with columnstore

        Reply
  • “If you stick a goofy subquery in, the optimizer will regard it with contempt and suspicion, and give it a full pat down.”

    This behavior appears to be quite erratic, or at least it is in SQL Server 2012:
    — constants
    SELECT COUNT_BIG(*) — 2.1 TRIVIAL, fast 10ms
    FROM Orders ct
    WHERE ct.OrderDate BETWEEN CAST(‘2017-12-01’ AS DATETIME) AND CAST(‘2017-12-31’ AS DATETIME)

    — cross apply values
    SELECT COUNT_BIG(*) — 2.2 TRIVIAL, fast 10ms
    FROM Orders ct
    CROSS APPLY (
    VALUES (CAST(‘2017-12-01’ AS DATETIME), CAST(‘2017-12-31’ AS DATETIME))
    ) x (StartDate, EndDate)
    WHERE ct.OrderDate BETWEEN x.StartDate AND x.EndDate

    — cross apply select
    SELECT COUNT_BIG(*) — 2.3 TRIVIAL, fast 10ms
    FROM Orders ct
    CROSS APPLY (
    SELECT StartDate = ISNULL(CAST(‘2017-12-01’ AS DATETIME),GETDATE()), EndDate = ISNULL(CAST(‘2017-12-31’ AS DATETIME),GETDATE())
    ) x
    WHERE ct.OrderDate BETWEEN x.StartDate AND x.EndDate

    — “goofy subquery” constants
    SELECT COUNT_BIG(*) — 2.4 FULL, slow 230ms
    FROM Orders ct
    WHERE ct.OrderDate BETWEEN (SELECT CAST(‘2017-12-01’ AS DATETIME)) AND (SELECT CAST(‘2017-12-31’ AS DATETIME))

    If you were to follow this through with a partitioned table, the effects are even more unpredictable: the “full pat down” appears to be more likely to result in a disastrous plan:

    SELECT COUNT_BIG(*) — Q9 — SUM(Amount)
    FROM MySixtyBillionRowTableWith70Partitions ct
    WHERE ct.TranDate BETWEEN (SELECT CAST(CAST(‘2017-12-01’ AS DATETIME) AS DATE)) AND (SELECT CAST(CAST(‘2017-12-31’ AS DATETIME) AS DATE))

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.