Table Valued Parameters: Unexpected Parameter Sniffing

Like Table Variables, Kinda

Jeremiah wrote about them a few years ago. I always get asked about them while poking fun at Table Variables, so I thought I’d provide some detail and a post to point people to.

There are some interesting differences between them, namely around how cardinality is estimated in different situations.

Table Valued Parameters are often used as a replacement for passing in a CSV list of “things” that then has to be parsed, usually with some God awful function. TVPs help by passing the list in table format, so you don’t have to do any additional parsing. They don’t have the exact same problems that Table Variables do, but there are some things you have to be aware of.

I’m going to go over how they’re different, and how a trace flag can “help”.

I know you’re gonna ask: “Should I just dump the contents out to a temp table?”

And I’m gonna tell you: “That’s a whole ‘nother post.”

What We’re Testing

Since I usually care about performance (I guess. Whatever.), I want to look at them with and without indexes, like so:
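Roughly like this, as a minimal sketch with made-up names: one bare type, and one with a PK/CX.

    CREATE TYPE dbo.IdList_NoIndex AS TABLE
    (
        Id INT NOT NULL
    );
    GO

    CREATE TYPE dbo.IdList_Indexed AS TABLE
    (
        Id INT NOT NULL PRIMARY KEY CLUSTERED
    );
    GO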

Mr. Swart points out in the comments:

The name of the primary key in your example turns out to be PK_#A954D2B_3214EC07D876E774 which gets in the way of plan forcing, a super-useful emergency technique.

Here’s the stored procedure we’ll be calling:
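The gist of it, as a minimal sketch (the proc name, the type, and the dbo.Users table it joins to are all placeholders):

    CREATE OR ALTER PROCEDURE dbo.TVPTest
        @Ids dbo.IdList_Indexed READONLY /* swap in the no-index type to compare */
    AS
    BEGIN
        SET NOCOUNT ON;

        /* Join the TVP to a real table so we can watch the estimates */
        SELECT u.*
        FROM dbo.Users AS u
        JOIN @Ids AS i
            ON i.Id = u.Id;
    END;
    GO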

Now, I’m going to square with you, dear reader. I cannot write C#. If it were me in that scene from Swordfish, we’d all be dead.

Or whatever the plot line was there. Hard to tell.

I had to borrow most of my code from this post by The Only Canadian Who Answers My Emails.

Mine looks a little different, but just so it will Work On My Computer®.

And also so I can test data sets of different sizes. I had to make the array and loop look like this:

By the end of the blog post, you’ll be glad I didn’t introduce much more variation in there.

Results

I have another confession to make: I used Extended Events again. It still burns a little. At least there was no PowerShell though.

First Thing: There’s a fixed estimate of 100 rows for inserts

This is regardless of how many rows actually end up in there. To be clear, this is the insert from the application into the TVP.
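If you catch the call in a trace, the TVP part shows up as something roughly like this (same made-up names as above), and it’s that insert getting the 100 row guess no matter how many rows the app sends:

    declare @p1 dbo.IdList_Indexed;
    insert into @p1 (Id) values (1);
    insert into @p1 (Id) values (2);
    /* ...one insert per row the app stuffed into its DataTable... */
    exec dbo.TVPTest @Ids = @p1;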

Jump, jump
Maybe that’s not so great…
Second Thing: They don’t get a fixed estimate like Table Variables

Table Variables (unless you recompile, or use a trace flag) will sport either a 1 or 100 row estimate, depending on which version of the Cardinality Estimator you use. The old version guesses 1 row; the new one guesses 100 rows.

Table Valued Parameters get sniffed, just like… parameters! When the first plan in the cache is a 100k row TVP, we keep that number until recompilation occurs.
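You don’t need the app to see it, either. Here’s a minimal sketch using the made-up names from above: stuff 100k rows into one TVP, call the proc, then call it again with 10 rows. The second call reuses the first plan, 100k row estimate and all.

    DECLARE @BigList dbo.IdList_Indexed;

    INSERT @BigList (Id)
    SELECT TOP (100000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM sys.all_columns AS c1
    CROSS JOIN sys.all_columns AS c2;

    EXEC dbo.TVPTest @Ids = @BigList; /* plan compiles with a 100,000 row estimate */
    GO

    DECLARE @SmallList dbo.IdList_Indexed;

    INSERT @SmallList (Id)
    SELECT TOP (10) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM sys.all_columns;

    EXEC dbo.TVPTest @Ids = @SmallList; /* same plan, same 100,000 row estimate */
    GO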

Nice guess…
… Be a shame if something happened to it
Third Thing: Non-join cardinality estimates behave like local variables (fixed estimates)

And I am not a very good graphic artist.

Hoowee.

Without the PK/CX, we get 10% for equality, 30% for inequality, and 9% for two inequalities.

This is detailed by Itzik Ben-Gan here.
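For the record, these are the kinds of predicates I mean, run against the @Ids parameter inside the proc (a sketch; the fixed guesses come from the TVP having no column statistics):

    SELECT COUNT(*) FROM @Ids WHERE Id = 100;                /* equality: 10% */
    SELECT COUNT(*) FROM @Ids WHERE Id > 100;                /* one inequality: 30% */
    SELECT COUNT(*) FROM @Ids WHERE Id > 100 AND Id < 90000; /* two inequalities: 9% */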

Fourth Thing: This didn’t get much better with a PK/CX

The only estimate that improved was the direct equality. The rest stayed the same.

Captain Obvious, et al.
Fifth Thing: The index didn’t change estimates between runs

We still got the “parameter sniffing” behavior, where whatever row count was inserted first dictated cardinality estimates until I recompiled.

Apologies, no pretty pictures here.

Sixth Thing: Trace Flag 2453 definitely did help

With or without the index, turning it on improved our cardinality estimates.

I didn’t have to manually recompile here, because there was enough of a difference between 100 and 100,000 rows being inserted to trigger a recompilation and an accurate estimate.
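If you want to kick the tires on it, here’s a minimal sketch. Test it somewhere safe first, because it changes recompilation behavior for Table Variables too:

    /* Session level, to see the effect on the calls above */
    DBCC TRACEON (2453);

    /* Instance-wide, if it earns its keep; weigh the extra recompiles first */
    DBCC TRACEON (2453, -1);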

Pictures don’t do this part justice either.

Go Away

Table Valued Parameters offer some nice improvements over Table Variables (I know, I know, there are times they’re good…).

But they’re also something you need to be careful with. Sniffed parameters here can cause plan quality issues just like regular parameters do.

Boohoo.

In this plan, the 100 row estimate led to a spill. To be fair, even accurate estimates can result in spills. It’s just hard to fix an accurate estimate 😀

  • The initial insert to the TVP is still serialized (just like table variables), so use caution if you’re inserting a lot of rows
  • There are no column level statistics, just like Table Variables, even with indexes (this leads to fixed predicate estimations)
  • Trace Flag 2453 can improve estimates at the cost of recompiles
  • The fixed estimate of 100 rows for each insert may not be ideal if you’re inserting a lot of rows

Thanks for reading!


10 Comments

  • Willie Bodger
    March 23, 2018 11:47 am

    So, I use TVPs to import data sets into a temp table in SQL Server from various sources (CSV, XLSX, PDF, TXT, HTML etc.). I have found TVPs to be much faster than looping thru a file and sending each row one at a time (the C# developer default method from what I have seen). It seems that you are saying there are some potential traps waiting in TVPs for me, or is yours really a different use case? I typically will import from 1-20 files with anywhere from 1 to ~1500 rows.

    I guess I’m wondering if there is a better way to programmatically insert a data set into SQL server?

    Thanks,
    wb

  • For passing data into a stored proc would the following approach be better?

    1. Pass in data via TVP
    2. Immediately dump that TVP into a local temp table
    3. Work with that temp table.

    • Dmitriy, I would not recommend that approach. You are wasting a lot of resources by copying all rows from one temporary table-structure (the TVP) to another (the temporary table).

      If you notice that you are getting bad performance due to incorrect cardinality estimations of the TVP (and note that just getting a bad estimation does not mean you will get bad performance – only fix stuff that is actually broken!), then my first attempt would be to just add OPTION (RECOMPILE) at the end of the affected query. This will cause the statement to be recompiled before each execution, at which time the number of rows in the TVP can be sniffed by the compiler. So you’ll always get a plan for the exact table size. (There are still no statistics so if you add a predicate you can introduce new inaccuracies).

      Yes, the extra compilations do introduce a bit of overhead. But far less than dumping everything in a temporary table (especially because that will probably trigger a recompile of ALL statements that use the temp table instead of just the ones you mark for recompile).

  • Willie Bodger
    March 23, 2018 12:24 pm

    That is exactly what I am doing. I was wondering if there was some more efficient way. Since the calls are one after the other, I am not using any traceflags or hints, just wondering if I have missed a more efficient way…

    • Erik Darling
      March 23, 2018 1:12 pm

      Willie — that sounds like the best way to do it to me. If you want to post more detailed information on dba.stackexchange.com, you may get a wider variety of opinions. I don’t have to manage any processes like that.

      Thanks!

  • Catching up after a busy week. I missed this one the first time around. (I will continue to answer your emails)

    This is a great post on TVPs… a solution with a million caveats. They can be a really useful tool if you know how to avoid stepping on the landmines. Thanks for the shout-out and good job on the C#!

    If this was in a pull request, my only advice would be to give the table type a name. Since we usually care about performance (I guess, whatever).

    The name of the primary key in your example turns out to be PK_#A954D2B_3214EC07D876E774 which gets in the way of plan forcing, a super-useful emergency technique.

    • Erik Darling
      March 28, 2018 8:30 am

      Ooh, good call on that. I’ll get this spiffy in a jiffy!

      Thanks, Michael!

    • I found out the hard way that if I give the PK on a temp table (that I created in a stored proc) a name, it is no longer unique per session.
      When doing concurrent calls to this SP I got “There is already an object named ‘PK_temptable’ in the database.”.
      Maybe this is a different case with TVPs?

      • Sven, I ran into the same problem some time ago. It took a while, but I found out that constraint names must be unique within the schema. Instead of creating a primary key in your TVP definition, create a unique index, clustered or non-clustered to fit your need.

  • And now to complete my thought… The index name is scoped to the table / TVP, so you won’t run into the name collision again.

