As if there weren’t enough reasons
In my last blog post I talked about different things that cause plans, or zones of plans, to execute serially. One of the items I covered was computed columns that reference scalar functions. We know that they’ll make queries go serial, but what about other SQL stuff?
Oh no my index is fragmented
If you’re running Expensive Edition, index rebuilds can be both online and parallel. That’s pretty cool, because it keeps all your gadgets and gizmos mostly available during the whole operation, and the parallel bit usually makes things faster.
That is, unless you have a computed column in there that references a scalar function. I decided to write my test function to not perform any data access so it could be persisted. It’s dead simple, and I’m tacking it on to a column in the PostLinks table of the Stack Overflow database.
CREATE FUNCTION dbo.PIDMultiplier (@pid int) RETURNS INT WITH RETURNS NULL ON NULL INPUT, SCHEMABINDING AS BEGIN DECLARE @Out BIGINT; SELECT @Out = @pid * 2 RETURN @Out; END; GO ALTER TABLE [dbo].[PostLinks] ADD [Multiplied] AS dbo.[PIDMultiplier]([PostId]) PERSISTED
For this one, all we have to do is turn on actual execution plans and rebuild the index, then drop the column and rebuild again.
ALTER TABLE [dbo].[PostLinks] REBUILD WITH (ONLINE = ON) ALTER TABLE [dbo].[PostLinks] DROP COLUMN [Multiplied]
Here are my execution plans. The rebuild I ran when the table had my computed column in it stayed serial.
Parallel, sans computed column:
But there’s a bigger fish in the pond
Probably the most important maintenance item you should be doing, aside from backups, is running DBCC CHECKDB. Seriously, if you’re not doing them both, start today. Ola Hallengren has basically done all the work for you. Back when I had a real job, I used his scripts everywhere.
Before we were so rudely interrupted by a soap box, we were talking about parallelism. This part was a little bit more complicated, but don’t worry, you don’t have to follow along. Just look at the pretty pictures. Sleep now. Yes. Sleep.
The first couple times I tried, the DBCC check never went parallel. Since I’m on my laptop, and not a production server, I can set Cost Threshold for Parallelism to 0. You read that right, ZE-RO! Hold onto your drool dish.
With that set, I fire up Ye Olde Oaken sp_BlitzTrace so I can capture everything with Extended Events. You’ll need all three commands, but you’ll probably have to change @SessionId, and you may have to change @TargetPath. Run the first command to start your session up.
EXEC [dbo].[sp_BlitzTrace] @SessionId = 61 , @Action = 'start' , @TargetPath = 'c:\temp\' , @TraceParallelism = 1 , @TraceExecutionPlansAndKillMyPerformance = 1 EXEC [dbo].[sp_BlitzTrace] @Action = 'stop' EXEC [dbo].[sp_BlitzTrace] @Action = 'read'
With that running, toss in your DBCC command. I’m only using DBCC CHECKTABLE here to simplify. Rest assured, if you run DBCC CHECKDB, the CHECKTABLE part is included. The only checks that DBCC CHECKDB doesn’t run are CHECKIDENT and CHECKCONSTRAINT. Everything else is included.
DBCC CHECKTABLE('dbo.PostLinks') WITH NO_INFOMSGS, ALL_ERRORMSGS ALTER TABLE [dbo].[PostLinks] ADD [Multiplied] AS dbo.[PIDMultiplier]([PostId]) PERSISTED
Run DBCC CHECKTABLE, add the computed column back, and then run it again. When those finish, run the sp_BlitzTrace commands to stop and read session data. You should see execution plans for each run, and they should be way different.
So even DBCC checks are serialized. Crazy, right? I’d been hearing about performance hits to varying degrees when running DBCC checks against tables with computed columns for a while, but never knew why. There may be a separate reason for regular computed columns vs. ones that reference scalar functions. When I took the equivalent SQL out of a function, the DBCC check ran parallel.
ALTER TABLE [dbo].[PostLinks] ADD [Multiplied] AS [PostId] * 2 PERSISTED
Of course, those online index rebuilds running single threaded might be a blessing in disguise, if you haven’t patched SQL recently.
I don’t have much of a grand closing paragraph here. These things can seriously mess you up for a lot of reasons. If you’re a vendor, please get away from using scalar functions, and please please don’t use them in computed columns.
Thanks for reading!