Say you’ve got a two-step process where you’re:
- Inserting rows into a table that has an identity column, then
- Querying that table to figure out what identities you got
There’s a a faster way that doesn’t require hitting the table twice: the OUTPUT clause.
I’ll demonstrate it with the Badges table from the Stack Overflow database, which has an Id column that’s an identity. I’m going to find all of the Users who live in Iceland, give them a badge, and then return the list of badge IDs I just granted:
1 2 3 4 5 6 7 8 9 10 11 |
INSERT INTO dbo.Badges(Name, UserId, Date) SELECT N'Sunny Disposition', Id, GETDATE() FROM dbo.Users WHERE Location = N'Iceland'; SELECT b.Id FROM dbo.Badges b INNER JOIN dbo.Users u ON b.UserId = u.Id WHERE u.Location = N'Iceland' AND b.Name = N'Sunny Disposition' AND b.Date >= DATEADD(SS, -1, GETDATE()); |
This code pattern is kinda painful because:
- We lock rows twice (when we read & insert, and then again when we read back what we just did)
- We have to be careful to only fetch the rows we really just inserted – so we end up putting in all kinds of convoluted logic to work around concurrency problems
Instead, use the OUTPUT clause.
Here’s how to do it without touching the Badges & Users tables twice:
1 2 3 4 5 6 7 8 9 10 11 |
CREATE TABLE #RowsInserted (Id INT, Name NVARCHAR(40), UserId INT, [Date] DATETIME); INSERT INTO dbo.Badges(Name, UserId, Date) OUTPUT INSERTED.Id, INSERTED.Name, INSERTED.UserId, INSERTED.[Date] INTO #RowsInserted SELECT N'Sunny Disposition', Id, GETDATE() FROM dbo.Users WHERE Location = N'Iceland'; SELECT * FROM #RowsInserted; |
The OUTPUT clause is kinda like the virtual INSERTED/DELETED tables: it lets you grab the output of what you’re doing and redirect it to another place. Presto, less locking, less T-SQL to manage, less guessing as to which rows were affected by your operation.
Isn’t that slick? In my own line of work, I sure don’t need to use it often, but when I do, it’s amazing.
Want to learn more tricks like this?
If you like tricks like this, you’ll love my Mastering Query Tuning class. I’ve got two upcoming sessions: December 11-13 (Fri/Sat/Sun, iCal) and January 12-14 (Tues/Weds/Thurs- iCal.)
If you’ve got a Live Class Season Pass, you can drop in on this class or any of my live online classes at any time. Just head to my current training page, check out the schedule at the bottom of the page, and grab the calendar files for the classes you’re interested in. You don’t have to register ahead of time – just drop in anytime I’m streaming.
22 Comments. Leave new
Have to try but triggers on table still prevent using output,right?
Vlad – go ahead and give ‘er a shot. (I can’t address every design in each post, unfortunately.)
Right, memory failed me. The issue is with the output target table, not the source table (y)
The “insert into dbo.A output inserted.id …” will only reference the columns of table A Whatever the trigger does will happen in another scope and won’t interfere.
I discovered while testing this that your screenshot should have “OUTPUT INSERTED.UserId” instead of “OUTPUT INSERTED.Id”. When it didn’t work as-is, my first thought was that my version of SQL Server didn’t support OUTPUT clauses for Insert Into queries, but fixing the typo allowed this to work.
Correction; I think this is a SQL Server version issue. I’m testing on SQL Server 2014, and “OUTPUT INSERTED.Id, INSERTED.Name, INSERTED.UserId, INSERTED.[Date]” doesn’t work, with red squiggles under “Id”. Changing it to “OUTPUT INSERTED.UserId, INSERTED.Name, INSERTED.UserId, INSERTED.[Date]” works, but that’s simply me putting UserId into two separate columns.
I think the problem with ‘Id’ is that it is not a column that is affected by the insert statement. We are inserting into Badges, but Id is a column in Users, but not in Badges.
Stephen – Id is in Badges as well.
There’s my problem; I made my own Badges table instead of opening up the stack overflow DB. My badges table only has columns Name, UserId, and Date
Stinking Badges……
Funny you should present the OUTPUT clause today, as I believe it is the solution to yesterday’s post about returning different (not random) results for each query as rows are added.
That is slick.
Another reason to use this is if you want to have a read-only asynchronous secondary. If you have an application code pattern where you use a read-write connection for all INSERTs and a read-only connection for all SELECTs, you risk the SELECT not finding any rows because the INSERT hasn’t replicated yet.
Mark – dude! I love it! I never thought of that, that’s good.
Yeah, I’m working on an application like this where the architect won’t let us use OUTPUT on INSERTs or the read-only secondary because the dev team would have to change too much code. I wish I had more details, but he said something about it’s because how Entity Framework is set up.
I am a full time dba and am considering spending my own money to get your consultant’s tool kit for the Black Friday price of $195. You mention in the video on what the output shows that there is a key that expires every 90 days. If I purchase this myself am I going to only be able to use it for 90 days and will the key expire over time (say every year I would need to repurchase the tool). As I have said in every email to you…brilliant work for a truly world class dba and teacher
Hi! For questions about the Consultant Toolkit, shoot me an email. Thanks!
Wow! I usually needed to have to the insert in the parent’s table one row at a time to get the identity values and then do the child tables. But with this, I can do all the parents at one time and output the values to get the identity values and then bulk insert the childs. That’s great!
This is great, Brent. Thanks!
Glad y’all liked it!
Great! solved my problem with the Output on the main table who have triggers.
Thanks a lot 🙂
I love this trick! I took it to another level by using table variables to store the inserted ids so that I could use them to insert into related tables.