SELECT INTO and non-nullable columns


SELECT…INTO is one of my favorite SQL Server features.

It’s great for creating table skeletons with false WHERE clauses (1=2), moving a real table to a staged/temp location for testing, etc.
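For instance, a skeleton copy looks something like this (table names here are just for illustration):

-- The WHERE clause can never be true, so only the column definitions are copied, no rows
SELECT *
INTO dbo.RealTable_Staging
FROM dbo.RealTable
WHERE 1 = 2;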

In SQL Server 2014

It acquired the ability to go parallel, which is pretty neat, but that’s not what we’re talking about here.

It has some limitations

Chief among them is this:

Indexes, constraints, and triggers defined in the source table are not transferred to the new table, nor can they be specified in the SELECT…INTO statement. If these objects are required, you can create them after executing the SELECT…INTO statement.

Which is… Sort of true. There’s a trick, and I’ll show you a quick example here with another of my favorite things: a Numbers table.
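The setup goes something like this (a sketch; the 10,000-row count and the spt_values source are just for illustration):

-- Build a small numbers table with SELECT...INTO
SELECT TOP (10000)
       ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N
INTO #NumbersTest
FROM master.dbo.spt_values AS a
CROSS JOIN master.dbo.spt_values AS b;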

Trying to add the PK constraint here fails, because the column is NULLable
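The attempt goes roughly like this (the PK_Numbers name matches what the comments below use):

ALTER TABLE #NumbersTest
    ADD CONSTRAINT PK_Numbers PRIMARY KEY CLUSTERED (N);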

Msg 8111, Level 16, State 1, Line 37
Cannot define PRIMARY KEY constraint on nullable column in table '#NumbersTest'.
Msg 1750, Level 16, State 0, Line 37
Could not create constraint or index. See previous errors.

We can verify this by looking at the table metadata:
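A quick query against tempdb.sys.columns does it:

SELECT c.name, c.is_nullable
FROM tempdb.sys.columns AS c
WHERE c.[object_id] = OBJECT_ID(N'tempdb..#NumbersTest');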


name     is_nullable
N           1

So how do we fix this? We could alter the table, but that won’t leave us with the lasting satisfaction of proving BOL wrong on a technicality. We’ll adjust our code a bit, and try again.
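Here’s the adjusted version (same illustrative setup as above, now with ISNULL wrapped around the numbering expression, and the PK attempt repeated):

IF OBJECT_ID('tempdb..#NumbersTest') IS NOT NULL
    DROP TABLE #NumbersTest;

SELECT TOP (10000)
       ISNULL(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), 0) AS N
INTO #NumbersTest
FROM master.dbo.spt_values AS a
CROSS JOIN master.dbo.spt_values AS b;

ALTER TABLE #NumbersTest
    ADD CONSTRAINT PK_Numbers PRIMARY KEY CLUSTERED (N);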

This time, with the addition of an ISNULL check on the column, it ‘inherits’ the not NULLable property, and the PK constraint adds successfully. We can verify that by checking the table metadata, if you don’t believe me:

name     is_nullable
N           0

Note that this same behavior does not occur if you replace ISNULL() with COALESCE().
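For instance, a COALESCE version of the same sketch (into an illustrative #NumbersTest_Coalesce table) still comes out nullable, because SQL Server evaluates COALESCE as a CASE expression and treats the result as nullable:

SELECT TOP (10000)
       COALESCE(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), 0) AS N
INTO #NumbersTest_Coalesce
FROM master.dbo.spt_values AS a
CROSS JOIN master.dbo.spt_values AS b;

-- is_nullable comes back as 1 here, so the PK constraint would fail again
SELECT c.name, c.is_nullable
FROM tempdb.sys.columns AS c
WHERE c.[object_id] = OBJECT_ID(N'tempdb..#NumbersTest_Coalesce');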

And with that, you can clean up the test tables. Or keep them around. Do some math. Whatever blows your hair back.

Got a favorite use for a numbers table outside of string splitting or doing date math? Let me know in the comments! I may write a follow up.

Brent says: Wow. That is a really slick trick.

Kendra says: I know some people have a bias against SELECT INTO because it seems sloppy and you have to go the extra mile to get the right types, but it can be a great tool. Don’t rule it out.

Jeremiah says: I really like this method – it’s a quick way to copy tables without using other cumbersome techniques.


22 Comments

  • Dave Vickery
    April 14, 2015 8:37 am

    It’s worth noting that you can tackle the opposite situation with NULLIF. A SELECT INTO will occasionally set a column as NOT NULLable (presumably via the inherited properties you’ve explained above). This causes an INSERT failure when you later try to add some dummy rows, for example.

    Wrapping the column inside a NULLIF in your SELECT INTO statement will force it to be set as NULLABLE. Obviously, you want to use a nice obscure value to make sure you’re not destroying any data, e.g.:

    SELECT NULLIF([Age], -999) AS [Age]

    Dave.

  • The real fix would be for the SQL Server team to give us parallel DML. Even if they imposed restrictions such as TABLOCKX and an empty destination table, that would still be better than SELECT INTO most of the time.

  • I do the same thing using top 0 * in my select statement if you want to skip the where.
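    Something like this, with made-up table names:

    SELECT TOP (0) *
    INTO dbo.RealTable_Skeleton
    FROM dbo.RealTable;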

  • One of the things I’ve done in the past is add a blank column on my output. So say I’m creating an archive table.

    SELECT * INTO TableArchive FROM Table

    But I want to add a new, blank column. I’ll put a NULL column in my field list. But that creates an INT column. If I want say a datetime column I can wrap a CAST or CONVERT around it to force a data type.

    SELECT *, CAST(NULL as DateTime) AS ArchiveDate INTO TableArchive FROM Table

  • raulggonzalez
    April 14, 2015 11:47 am

    Completely agree, I love it too, because:
    – It can be minimally logged (bulk-logged/simple recovery model)
    https://technet.microsoft.com/en-us/library/ms191244%28v=sql.105%29.aspx
    – It’s the only place you can use the IDENTITY function!
    https://msdn.microsoft.com/en-gb/library/ms189838.aspx

    True, it’s tricky with NULLable columns, but for creating dummy tables, as you already pointed out, it’s great!

    • The IDENTITY function clearly is useful and valid to use. But it strikes me as such a hack in the product that it is appalling. Expressions are supposed to be composable and usable everywhere. This function must appear in a very specific syntactic context for it to make sense. For example you can’t write IDENTITY(myCol) + 1.

      • raulggonzalez
        April 14, 2015 12:48 pm

        Well, the IDENTITY() function does not apply to a column; it generates a new column with the IDENTITY property (when combined with INTO).

        And why would you want to use IDENTITY(INT, 1, 1) + 1 when you can say IDENTITY(INT, 2, 1)?

        Can you elaborate, please?

        • The point is that IDENTITY does not behave like any other function. You can’t use it in places where you normally can use a function. You can’t use it in subexpressions and you can’t use it in, say, derived tables (because it doesn’t make sense there). It’s bad programming language design.

          Consider the following alternative clean design:

          select …
          into NewTable (ID INT NOT NULL IDENTITY, [other columns here])

          With that syntax there’s no hack and you can specify the target table schema inline (including indexes and whatnot). That’s how it should be.

          Don’t be led to believe that whatever is in T-SQL right now makes sense because it has always been done that way.

  • Eric Langlois
    April 14, 2015 12:25 pm

    I once had to generate a bunch (1000 +) of multicast IP addresses using a starting address, so I used a numbers table and the starting IP address stored as an integer to generate all of the following addresses.

    That way I was able, with one (1) UPDATE statement, to assign a different multicast address to every network device that was kept in a table.
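    Roughly like this (a sketch with made-up names, keeping the addresses as integers):

    DECLARE @StartingAddressInt BIGINT = 3758096384; -- 224.0.0.0 as an integer, for example

    SELECT @StartingAddressInt + n.N - 1 AS MulticastAddressInt
    FROM dbo.Numbers AS n
    WHERE n.N <= 1000;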

  • great article!

    one thing worth mentioning is that in real life you should really never put an explicitly named constraint on a temp table (e.g. PK_Numbers), as any concurrent code that attempts to create a constraint with the same name will fail with: There is already an object named ‘PK_Numbers’ in the database.

  • Ville-Pekka Vahteala
    April 15, 2015 12:25 am

    My favorite use for a numbers table is exploding geometries into pieces.

  • Good article, but I think IDENTITY(INT,1,1) would be better.
    Please try it this way:

    IF object_id('tempdb.dbo.#NumbersTest') IS NOT NULL
    DROP TABLE #NumbersTest
    GO

    SELECT TOP 100000 IDENTITY(INT,1,1) AS N
    INTO #NumbersTest
    FROM master.dbo.spt_values a
    CROSS JOIN master.dbo.spt_values b
    CROSS JOIN master.dbo.spt_values c
    CROSS JOIN master.dbo.spt_values d
    GO

    ALTER TABLE #NumbersTest ADD CONSTRAINT [PK_Numbers]
    PRIMARY KEY CLUSTERED (N) WITH (FILLFACTOR = 100)
    GO

    SELECT [columns].[name], [columns].[is_nullable]
    FROM tempdb.sys.columns
    WHERE [object_id] = OBJECT_ID(N'tempdb..#NumbersTest');
    GO

  • Thomas Franz
    April 29, 2015 3:21 am

    I like the SELECT … INTO too, but sadly a disadvantage is that you can’t specify which filegroup the new table will be created in, because there is no way to specify an ON FG1 clause (short of altering the whole DB to make FG1 the default, executing the SELECT INTO, and altering the database back, which is not practical for production purposes).

  • Alex Friedman
    April 30, 2015 6:20 am

    This is really awesome.

  • You might want to add that ADDING nullable columns might not act as expected:

    declare @tb table (id int null);
    select isNull(id, 0) +isNull(id, 0) id
    into #tempExample
    from @tb

    select COLUMN_NAME, data_type, IS_NULLABLE
    from tempdb.INFORMATION_SCHEMA.COLUMNS c
    where table_name like '%tempExample%'

    drop table #tempExample

  • Thanks for the hint!

    Note that it can be done WITHOUT knowing the type of the column!
    First assuming that the source actually never is null:

    SELECT ISNULL(fie, '') fie INTO foo FROM bar

    AND if you want to omit later data that might actually have null in the source

    SELECT ISNULL(fie, '') fie INTO foo FROM bar where fie is not null

    It never reaches the point where it tries to convert '' to the type of fie, but SQL Server knows that the expression for the field is never null.

