Are Table Variables as Good as Temporary Tables in SQL 2014?

There’s a couple of new features in SQL Server 2014 that provide options for how you work with temporary objects. Will inline index creation or memory optimized temporary tables forever change the way you code? Let’s take a look!

Inline Index Creation

SQL Server 2014 brings us a TSQL improvement called “inline specification of CLUSTERED and NONCLUSTERED indexes.” This feature works in Standard Edition and applies to many types of table. This improves the functionality of table variables. But does it make table variables as good as temp tables?

First, let’s take a look at the feature, then take it for a test drive.

Creating a table with an inline nonclustered index

Here’s a simple look at the new feature– when I create a table (nothing temporary about it), I can name and define an index on multiple columns. In this case, I’m creating a nonclustered index on the columns j and k:

I also have the option to put all my constraint and index create statements at the end of the table, like this:

What about temp tables?

The same syntax for inline index create listed above works just fine for me on temporary tables in my CTP of SQL Server 2014.

This is a good thing for some people! One of the issues with temporary tables in stored procedures is that creating an index on the temp table can prevent caching of the temporary item. For most people, this is no big deal, but for some frequently run stored procedures, it might make a difference.

Creating the nonclustered indexes on the temp table at creation time (instead of afterward) can improve temp table caching in stored procedures.

Before you rejoice, there’s some fine print. If you change your temp table syntax to use inline index creation and it enables caching, you might run into issues where statistics aren’t updated on your temporary tables when you think they would be. (The short version: Statistics are also cached for temp tables, and Gremlins keep from updating very frequently.) Just test carefully if you’ve got very frequently run stored procedures you’re modifying.

Inline index creation on table variables

The new syntax works on table variables, too! This means that with SQL Server 2014 and higher, you can create non-unique nonclustered indexes on table variables. You can even set the fillfactor option. (Note: I’m not saying you should use a lower fillfactor– I was just surprised that option was available for table variables.)

With the new SQL 2014 syntax, you will also have the option to create a non-unique clustered index on a table variable:

So, are table variables as good as temp tables now?

Well, sorry, not generally.

First, there’s some limitations to the “inline index create” feature. You can’t inline every kind of index you might want — you can’t add included columns or filters to indexes created with an inline create statement. Since you can add indexes with those to a temporary table after it’s created (and you CANNOT do that with a table variable), it has an advantage.

But temporary tables still have another advantage.

Statistics help temporary tables

When looking at the new feature, I wondered if these new indexes declared inline might secretly allow some sort of populated statistic on table variables– which hasn’t worked before.

But, unfortunately no. Even using the inline indexes, table variables do NOT get statistics to help the optimizer psychically predict how many rows will come back based on a specific predicate value. That can cause you to get less efficient execution plans. Let’s look at an example.

First up, the temp table:

For the temporary table, SQL Server uses statistics associated with the nonclustered index to estimate that it will get 25 rows back (which is right on). Based on this it decides to seek to the rows in the nonclustered index, then do a nested loop lookup to fetch the City column from the clustered index. It does 52 logical reads:

Key Lookup

Now let’s run the same code, but with a table variable with an inline index:

Oddly enough, it gets a clustered index scan. It estimates that only one row will be found — that’s because for table variables, statistics associated with the nonclustered index still can’t be populated. So it doesn’t know to estimate the 25 rows, and it just guesses one.

With a one row estimate, I thought the optimizer would surely go for the nested loop lookup. Looking up just one row is easy, right? But instead it decided to scan the clustered index. (Note: The nonclustered index is functional– it will use it if I omit the City column from the query. But it does not seem to want to do a key lookup from it in my testing here. Wacky!)

Clustered Index Scan

Finding: Inline Index creation is the most awesome for temp tables

I like the new syntax for its added flexibility. I do wish it allowed included columns, but for indexes with just key columns it can make for tidier, more compact code.

This feature doesn’t fix all the flaws of table variables– it’s interesting that I didn’t get a nested loop lookup in my testing, which makes me wonder if the optimizer has as many options with the indexes on table variables.

The ability of temporary tables to have column and index related statistics still gives them a great advantage in most situations. Using the inline index creation script on temporary tables in stored procedure to improve caching is a nice little bonus. Even with the gotcha I linked to above about statistics updates on temp tables, I think this feature makes the case for temporary tables even stronger.

What about Memory Optimized Table Variables?

First off, this feature is more expensive. SQL Server 2014 adds in new “memory optimized” tables, AKA “Project Hekaton”. This feature is only available in Enterprise Edition. I won’t come close to covering the whole feature here– I’ll just scratch the surface of one of its uses: the “Memory Optimized Table Variable”.

The first thing of note is that memory optimized tables DO support statistics— so does that mean that a Memory Optimized Table Variable might have them? Let’s take a look!

First, let’s enable memory optimized objects

To test this feature out, I need to make some changes to my test database:

Memory_Optimized = ON !

Now I can start creating my memory optimized table variable. First, I must create a table type with the table variable’s definition. I’m not going to test this in a natively compiled stored procedure just yet, so I’m leaving off the identity (it’s not supported in this scenario). I also use some special collations and don’t allow nullability on some columns to get the whole thing to work.

All right, now that our table type exists, we can put it to use! Let’s populate it and query it and check out what plan we get:

There’s a little surprise in the execution plan:

Index Seek on Memory Optimized Index

Estimated rows is still 1– statistics still aren’t working here. But magically we get an index seek on the nonclustered index instead of a scan on the clustered index. This is remarkable because back in the normal world of non-memory-optimized indexes, the City column would not be in that nonclustered index– we only asked for it to contain StateProvinceID!

Memory Optimized Indexes are Always Covering

In the brave new world of Memory Optimized tables, indexes have a motto: Always Be Covering. We’re not in Kansas any more Toto, and there’s no such thing as a Key Lookup or Bookmark Lookup anymore.

This means a couple of things: for these tables, it really doesn’t matter that you don’t have the INCLUDE syntax on inline index creation. Nonclustered indexes will always have all the columns, so why bother with an INCLUDE clause? (I’m guessing that’s why it’s not possible in the new syntax!) Also, memory optimized indexes may be larger than you’d assume because they cover everything.

Summing Up

For 99% of the use cases I see in the wild these days, temporary tables beat table variables easily because their ability to have statistics improves query optimization dramatically. That remains true with the enhancements in SQL Server 2014.

For the 1% of cases where stored procedures using temporary objects are run at high rates of frequency, there are a few new choices that are attractive. They both have gotchas and require testing, but they could both turn out pretty well:

Option 1: Inline Nonclustered Indexes on Temp Tables: This works in Standard Edition, so I think this will be the most widely adopted. Instead of creating a nonclustered index after the temp table exists, you can improve caching of your temp tables by creating nonclustered indexes with the temp table itself. Just make sure you don’t get into trouble with out of date statistics.

Option 2: Memory Optimized Table Variables: For you lucky people with Enterprise Edition, you can test these out. You won’t want to do exactly what I did above — you’ll probably want to use natively compiled stored procedures. You’ll definitely want to test at scale and make sure you can monitor performance of these new structures at load. That may be a little tricky: you don’t get actual execution plan or cost information for natively compiled procedures. Given the fact that the memory optimized table variables didn’t show any statistics in my testing, in complex implementations you could end up with some strange execution plans. (This post hasn’t covered questions like “How much memory are these structures using?” “How much memory do you have left?” “What happens if you run out?”) In short: there’s much to learn here– don’t rush in without making sure you can support it.

Previous Post
Microsoft Cloud Rebranded as Microsoft Pale Blue
Next Post
What happens to in-flight data compression in an emergency?

29 Comments. Leave new

  • Aaron Morelli
    April 2, 2014 11:41 am

    Kendra,

    Thanks for this! I’m finding that 2014 has all sorts of goodies. 🙂

    Regarding the strange index seek (when cardinality=1) for the table variable… I’ve noticed in the past that the optimizer sometimes doesn’t choose the cheapest plan when one or more tables have an estimate of 1 row. I wonder if this might be defensive programming by Microsoft since cardinality estimates of 1 are sometimes a default or a fall-back estimate. However, I’m not aware of any posts by Mr. White or any other gurus on this sort of behavior.

    The below example produces a hash join, whether table variables, temp tables, or permanent tables are used. (The forced loop join plan is not only cheaper in “cost units”, but wouldn’t use the min mem grant of 1 MB! So this seems like an intentional trade-off to me…just in case the estimate is way off):

    –SQL 2012 SP1 (11.0.3000); hash join cost=0.024341
    create table #mytab1 (i int)
    create table #mytab2 (i int)
    insert into #mytab1 (i) values (5)
    insert into #mytab2 (i) values (5)

    select *
    from #mytab1 t1
    inner join #mytab2 t2
    on t1.i = t2.i
    –option(loop join) — with hint, cost=0.0065707

    Reply
    • Aaron Morelli
      April 2, 2014 11:42 am

      Whoops! By “Strange index seek” I meant when the scan was chosen, rather than the seek. 🙁

      Reply
    • Kendra Little
      April 2, 2014 12:13 pm

      Hey Aaron,

      I think I got a little bit lost in your long reply with the different example. Do you have a question?

      Reply
      • Aaron Morelli
        April 2, 2014 12:23 pm

        No, sorry for the confusion.

        I just found the “scan instead of seek” behavior interesting, and was noting that I had seen other unintuitive query plan behavior when cardinality estimates were = 1, and that it might not necessarily be due to the use of table variables, but rather due to intentional coding by the optimizer team for low-cardinality estimates. I.e. as if they don’t trust an estimate of “1” as much as they might trust an estimate of “57”. But I’m speculating… 🙂

        Reply
        • Kendra Little
          April 2, 2014 12:30 pm

          Ah, got it.

          For what it’s worth, with your example you can get a nested loop lookup if you put a clustered index on the tables involved– doesn’t have to be unique, works for both the table variables and the temp table. Estimates are still 1. (At least on the version of SQL Server I just happened to be looking at.)

          Reply
          • Aaron Morelli
            April 2, 2014 1:41 pm

            Yeah, the behavior does seem to vary quite a bit. Since I can’t define the “why”, I can’t predict very well, so my takeaway has been to be on the lookout when estimates might = 1, no matter what structure (tabvar/table/msTVF-pre-2014, etc) is in use.

            We all need a personal Paul White to sit on our shoulder and explain this stuff to us as we hit it. 😛

  • Bill Lescher
    April 2, 2014 11:43 am

    Great post! Thanks Kendra!

    Reply
  • Hi Kendra,
    Somehow, I completely missed this enhancement. Thanks for exposing it.
    (Now it’s time to go revise my presentation…)

    Reply
  • Thanks! Enjoyed the read.

    Reply
    • Kendra Little
      April 5, 2014 2:02 pm

      Awesome! I’m so glad. It was a fun one to write, but I wasn’t sure if people would like such a long post.

      Reply
  • Great post! I’ve been playing with Memory Optimized tables this week. They’re quite fast and I can see some good usage for the non-Durable Memory Optimized tables with ETL loads. This is probably the feature we would implement if we went to SQL 2014 so that our ETL (especially transforms) could hang out in memory and not have to be written to disk at all. Have you played around with these at all? I would love to hear your thoughts on that.

    What led me to this article was I was trying to create a memory-optimized temp table (not allowed as I learned) but it’s interesting to learn that you can do that with a table variable. Great stuff!

    Reply
    • Kendra Little
      April 28, 2014 7:17 pm

      Do you mean for staging tables during an ETL?

      Honestly, I’d be tempted just to test how they do just using temporary tables in 2014. They’ve added an enhancement where it does less IO. It works in standard edition. It doesn’t require nearly the amount of fancy indexing or database configuration, and not nearly as much can go wrong. It does require having sufficient memory to see the benefit, but hey, that’s no different.

      I have a related post on this going live later this week.

      http://blogs.msdn.com/b/psssql/archive/2014/04/09/sql-server-2014-tempdb-hidden-performance-gem.aspx

      Reply
      • Yes staging tables 🙂 Please, no one use non-Durable Memory optimized tables for your production Fact Tables 🙂

        I see your point. I was thinking from the perspective of testing our ETL loads. We use stage tables because we can review the data if needed, and honestly it was done that way and we don’t have the resourcing to switch to doing the transforms with SSIS (better option!). With the mem-optimized table, it would be much faster but I just thought about the fact that if we wanted to reclaim that space in memory we would probably have to truncate (I doubt LRU affects Non-Durable tables – please confirm if you know). So basically, it gets me back to square one. Putting something more complex together when I could possibly use temp tables, create tempDB on a SSD and get things relatively faster and have it be less complex.

        One more question. Do you know if in-memory tables are supported with the new Buffer pool extension?

        Reply
        • Kendra Little
          April 28, 2014 7:33 pm

          Do you mean have the in-memory table actually be backed by SSD?

          Just create a normal filegroup on on the SSD 🙂

          Reply
          • Duh! I guess I got a lot of sun doing the lawn over the weekend. My brain isn’t working so well!

            I’ll go back to agreeing with your first response, keeping things simple 🙂

            Thanks!

  • Basic rule of thumb from this website:
    http://blogs.davelozinski.com/curiousconsultant/optimizing-sql-strategies
    When using small amounts of data (eg, a few rows), table variables are great! Otherwise, #temp tables are the way to go.

    Reply
    • Johnny – except the problem is, he’s wrong. If you join that table variable with a few rows to other tables, you can get bad execution plans.

      Reply
      • Kendra and/or Brent, below is some advice I recently provided to a developer. Could you point out where my explanation is incomplete or incorrect? I’m a fan or your website and am always looking to improve my knowledge. Thanks in advance.

        ****************************************************************************************************
        Table variables are best used to hold small result sets, because the database engine does not produce column statistics on table variables. It will produce column statistics on temporary tables. Column statistics allow the query engine to estimate how many rows will meet the conditions of the WHERE and JOIN clauses, because they represent the distribution of values in a column. If you populate a table variable and use it in a query, the query engine will always estimate that there is only one row in the table variable. In fact, there could be a billion rows. Because of that “mistaken” estimate, the resulting query plan could be bad for the amount of data involved. For example, it could end up using a nested loop join instead of a hash join. (Nested Loop, Hash and Merge joins are what the query engine uses to connect data sets; not to be confused with the logical concepts of INNER and OUTER joins).

        The workaround is to use the query hint OPTION RECOMPILE in the query that references the table variable. But this defeats the purpose of a table variable, which were introduced precisely because they would not force the code to be recompiled every time it is executed. Whereas creating a temporary table causes a recompile.

        Another thing is that you have more indexing options on temporary tables as opposed to table variables. When you create an index on a temporary table, a corresponding statistic is also created for the combination of columns defined in that index. And that statistic, in addition to the column statistics automatically created when the temporary table is queried, allow the query engine to produce a more efficient plan.

        Reply
        • Hi Bill. This is a little beyond what we can handle in a blog post comment – we actually tackle this in our in-person training class modules, and in my 90-minute video How to Think Like the SQL Server Engine: https://www.brentozar.com/training-videos-online/how-to-think-like-sql-server/

          Reply
          • Is this argument changed with trace flag 2453 in conjunction with SQL Server 2014’s Cumulative Update 3 and 2012’s SP2?

          • Hey Gabe,

            I think the argument gets more complicated, but I still don’t think table variables are as good. The trace flag works in some situations more than others (I believe it’s joins), and it doesn’t solve some problems with table variables such as the fact that modifying their contents requires a serial plan. But I don’t think I even have that little factoid in this article!

            I do think that trace flag might be useful to some people who have legacy code that they can’t modify, though.

  • Costa Zachariou
    March 1, 2016 3:32 am

    Hi Kendra.

    Beautiful explanation, thank you. Do we need to delete temporary tables or will SQL Server take care of this?

    Reply
  • how to get table variable
    can you help me?

    Reply
  • Just had a perfect example of this how table variables give you bad plans with a SP that made it all the way into production before anyone looked at the query plan.

    Reply
  • Hi Kendra,
    I could not see any benefit with inline index on table variables, I have tried inserting 500 records on indexed table and Non indexed table variable, but i could see that data retrieval is not faster , execution plan is showing that it is going for clustered index scan

    Here is my code:
    DECLARE @TBLWITHOUTNCINDEX AS TABLE (ID INT identity (1,1) PRIMARY KEY,LNAME VARCHAR(50) default ‘kiran’,commonid int default (rand()*(100-1)+1) )
    DECLARE @TBLWITHNCINDEX AS TABLE (ID INT identity (1,1) PRIMARY KEY,LNAME VARCHAR(50) default ‘latheesh’,commonid int default(rand()*(100-1)+1) INDEX IX_commonid NONCLUSTERED )
    declare @NumberOfRecordsToInsert int
    SET @NumberOfRecordsToInsert = 500;
    WHILE (@NumberOfRecordsToInsert > 0)
    BEGIN
    INSERT INTO @TBLWITHOUTNCINDEX default VALUES
    INSERT INTO @TBLWITHNCINDEX default VALUES
    SET @NumberOfRecordsToInsert = @NumberOfRecordsToInsert – 1
    END
    Set statistics XML on
    SELECT * FROM @TBLWITHOUTNCINDEX where commonid=20
    SELECT * FROM @TBLWITHNCINDEX where commonid=20

    Reply
  • It’ll be interesting to see if the temp table will cache when creating multiple inline indexes. I’ll need to experiment with this.

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.

Menu
{"cart_token":"","hash":"","cart_data":""}