Right now, Erik and I are presenting at the 24 Hours of PASS. We’re talking about Last Season’s Performance Tuning Techniques:
Wanna play along with us as we show how your performance skills might be a little out of date? Here’s the demos and the slides:
Fill Factor: Doing the Page Splits
You can do this one in any database, but you’ll want to do it on a server with very low load. If anybody else is doing any deletes/updates/inserts at all, it’s going to skew your numbers.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 |
CREATE TABLE TheHeart (ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED, Groove VARCHAR(100)); GO /* How many page splits has our server had? */ SELECT * FROM sys.dm_os_performance_counters WHERE counter_name LIKE '%Splits%'; GO /* Add just one row */ INSERT INTO TheHeart (Groove) VALUES ('The chills that you spill up my back keep me filled'); GO /* See any page splits? */ SELECT * FROM sys.dm_os_performance_counters WHERE counter_name LIKE '%Splits%'; GO DROP TABLE TheHeart; GO |
Questions to think about:
- What does a page split really mean?
- Is there such a thing as a good or a bad page split?
- How do you know which ones you’re having?
- Would setting fill factor have prevented that page split?
Missing Indexes
This one requires the Stack Overflow demo database. If you don’t already have a copy of that, don’t try to download it live during the session – it’s too big (~15GB torrent, then expands to a ~100GB SQL Server database.)
Get the estimated execution plan for this:
1 2 3 4 5 6 |
SELECT c.CreationDate, c.Score, c.Text, p.Title, p.PostTypeId FROM dbo.Users me INNER JOIN dbo.Comments c ON me.Id = c.UserId INNER JOIN dbo.Posts p ON c.PostId = p.ParentId WHERE me.DisplayName = 'Brent Ozar'; GO |
And then ask yourself:
- What index am I told to create?
- Does that index make sense?
- Is there anything SQL Server isn’t telling me?
Now try the estimated plan for this:
1 2 3 4 5 |
SELECT Id FROM dbo.Users WHERE DisplayName = 'Brent Ozar' ORDER BY Age; GO |
And ask yourself those same questions.
BEGIN TRAN ERIK
CTEs
Sometimes a CTE won’t change anything at all. This is the case with simple predicates.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
/*When they don't matter*/ WITH freedom AS ( SELECT p.Id FROM dbo.Posts AS p WHERE p.Id = 4 ) -- Simple predicate inside SELECT COUNT(*) AS RECORDS FROM freedom AS f; WITH freedom AS ( SELECT p.Id FROM dbo.Posts AS p ) SELECT COUNT(*) AS RECORDS FROM freedom AS f WHERE f.Id = 4; --Simple predicate outside |
CTEs don’t materialize results. What do you think this is, Oracle?
If you join a CTE to itself, you’ll run the CTE query again.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 |
/*One joins*/ WITH freedom AS ( SELECT p.Id FROM dbo.Posts AS p WHERE p.ParentId = 0 AND p.CreationDate >= '20160306' AND p.Score >= 2 ) SELECT COUNT(*) AS RECORDS FROM freedom AS f JOIN freedom AS f2 ON f2.Id = f.Id; /*Two joins*/ WITH freedom AS ( SELECT p.Id FROM dbo.Posts AS p WHERE p.ParentId = 0 AND p.CreationDate >= '20160306' AND p.Score >= 2 ) SELECT COUNT(*) AS RECORDS FROM freedom AS f JOIN freedom AS f2 ON f2.Id = f.Id JOIN freedom AS f3 ON f3.Id = f.Id; /*Ah hell, ten joins!*/ WITH freedom AS ( SELECT p.Id FROM dbo.Posts AS p WHERE p.ParentId = 0 AND p.CreationDate >= '20160306' AND p.Score >= 2 ) SELECT COUNT(*) AS RECORDS FROM freedom AS f JOIN freedom AS f2 ON f2.Id = f.Id JOIN freedom AS f3 ON f3.Id = f.Id JOIN freedom AS f4 ON f4.Id = f.Id JOIN freedom AS f5 ON f5.Id = f.Id JOIN freedom AS f6 ON f6.Id = f.Id JOIN freedom AS f7 ON f7.Id = f.Id JOIN freedom AS f8 ON f8.Id = f.Id JOIN freedom AS f9 ON f9.Id = f.Id JOIN freedom AS f10 ON f10.Id = f.Id; |
Thankfully, nested CTEs don’t exhibit the same problem.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
/*What about nested CTEs?*/ WITH freedom AS ( SELECT p.Id FROM dbo.Posts AS p WHERE p.ParentId = 0 AND p.CreationDate >= '20160306' AND p.Score >= 2 ), more_freedom AS ( SELECT * FROM freedom ) SELECT COUNT(*) AS RECORDS FROM more_freedom AS f; |
CTEs and derived tables will behave similarly as far as performance and query plans go.
One difference is that you can’t reference a derived table more than once, where you can do that with CTEs.
1 2 3 4 5 6 7 8 9 10 11 12 13 |
/*CTEs vs Derived Tables?*/ WITH freedom AS ( SELECT p.Id FROM dbo.Posts AS p WHERE p.Id = 4 ) SELECT COUNT(*) AS RECORDS FROM freedom AS f; SELECT COUNT(*) AS Records FROM (SELECT p.Id FROM dbo.Posts AS p WHERE p.Id = 4) AS x |
CTEs are cool though. You can filter on things on the outside that you can’t filter on the inside.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 |
/*This will failbot*/ SELECT p.OwnerUserId, p.Score, p.Title, DENSE_RANK() OVER ( PARTITION BY p.OwnerUserId ORDER BY p.Score DESC ) AS ScoreRank FROM dbo.Posts AS p WHERE p.PostTypeId = 1 AND p.OwnerUserId > 0 AND ScoreRank <= 3 /*This will work excellently*/ WITH t1 AS ( SELECT p.OwnerUserId, p.Score, p.Title, DENSE_RANK() OVER ( PARTITION BY p.OwnerUserId ORDER BY p.Score DESC ) AS ScoreRank FROM dbo.Posts AS p WHERE p.PostTypeId = 1 AND p.OwnerUserId > 0 ) SELECT TOP 100 u.DisplayName, t1.Title, t1.Score FROM t1 JOIN dbo.Users AS u ON u.Id = t1.OwnerUserId WHERE t1.ScoreRank <= 3 ORDER BY t1.OwnerUserId; /*Be careful where you put that TOP*/ WITH t1 AS ( SELECT TOP 100 p.OwnerUserId, p.Score, p.Title, DENSE_RANK() OVER ( PARTITION BY p.OwnerUserId ORDER BY p.Score DESC ) AS ScoreRank FROM dbo.Posts AS p WHERE p.PostTypeId = 1 AND p.OwnerUserId > 0 ORDER BY p.OwnerUserId ) SELECT u.DisplayName, t1.Title, t1.Score FROM t1 JOIN dbo.Users AS u ON u.Id = t1.OwnerUserId WHERE t1.ScoreRank <= 3; |
Functions
This query runs without a function and finishes pretty quickly.
1 2 3 4 5 6 7 8 9 10 11 |
/*Normal string aggregation*/ /*This all ends in 2017 with STRING_AGG*/ SELECT b.UserId, STUFF((SELECT N', ' + b2.Name FROM dbo.Badges AS b2 WHERE b2.UserId = b.UserId GROUP BY b2.Name FOR XML PATH(N''), TYPE ).value(N'.[1]', N'NVARCHAR(4000)'), 1, 2, N'') AS Badges FROM dbo.Badges AS b WHERE b.Date >= '20160301'; |
If we turn that string aggregation expression into a scalar valued function…
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
/*Let's not repeat ourselves*/ /*Lets MAKE A FUNCTION*/ CREATE OR ALTER FUNCTION dbo.Fake_String_Agg (@UserId INT) RETURNS NVARCHAR(4000) WITH RETURNS NULL ON NULL INPUT, SCHEMABINDING AS BEGIN DECLARE @WickedBadIdeaDude NVARCHAR(4000) SELECT @WickedBadIdeaDude = STUFF((SELECT N', ' + b2.Name FROM dbo.Badges AS b2 WHERE b2.UserId = @UserId GROUP BY b2.Name FOR XML PATH(N''), TYPE ).value(N'.[1]', N'NVARCHAR(4000)'), 1, 2, N'') RETURN @WickedBadIdeaDude END GO |
Now we can crap up all our queries effortlessly.
1 2 3 4 5 6 7 8 9 10 11 |
/*Let's see how things go with our new function*/ SELECT b.UserId, dbo.Fake_String_Agg(b.UserId) AS Badges FROM dbo.Badges AS b WHERE b.Date >= '20160301'; /*THIS IS SO COOL WE CAN USE IT WITH OTHER TABLES*/ SELECT u.DisplayName, dbo.Fake_String_Agg(u.Id) AS Badges FROM dbo.Users AS u WHERE u.LastAccessDate >= '20160306'; |
Checking on query performance with sp_BlitzQueryStore…
1 2 3 4 5 6 7 8 9 10 |
DECLARE @ThisIsTheModernWorld DATETIME = GETDATE() EXEC master.dbo.sp_BlitzQueryStore @DatabaseName = 'StackOverflow', @StartDate = @ThisIsTheModernWorld GO EXEC sp_BlitzQueryStore @DatabaseName = 'StackOverflow', @PlanIdFilter = 3527 GO DECLARE @ThisIsTheModernWorld DATETIME = GETDATE() EXEC master.dbo.sp_BlitzQueryStore @DatabaseName = 'StackOverflow', @StoredProcName = 'Fake_String_Agg' , @StartDate = @ThisIsTheModernWorld GO |
Computed columns with Scalar Valued Functions in them will be similarly crappy.
1 2 3 4 5 6 7 8 9 10 11 12 13 |
/*Let's use a computed column instead*/ CREATE TABLE dbo.LittleBadges ( Id INT NOT NULL PRIMARY KEY CLUSTERED, UserId INT NOT NULL, Badge_Agg AS dbo.Fake_String_Agg(UserId) ); INSERT dbo.LittleBadges WITH (TABLOCK) (Id, UserId ) SELECT b.Id, b.UserId FROM dbo.Badges AS b WHERE b.Date >= '20160301'; |
Crappiness doesn’t depend on whether or not we select the computed column. It’s there no matter what.
1 2 3 4 5 |
SELECT TOP 10 * FROM dbo.LittleBadges AS lb SELECT TOP 10 lb.Id, lb.UserId --NOT SELECTING THE COMPUTED COLUMN FROM dbo.LittleBadges AS lb |
Let’s see what happens when we add in a check constraint based on a UDF.
What to look for in XE: executions of function after inserting rows. Executions after selecting data.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 |
/*Can we make this worse?*/ /*You betcha!*/ TRUNCATE TABLE dbo.LittleBadges ALTER TABLE dbo.LittleBadges ADD CONSTRAINT [HAHAHAHAHAHAHAHA] CHECK (dbo.Fake_String_Agg(UserId) <> '') /*Let's Extend Ourselves*/ DROP EVENT SESSION [Crud] ON SERVER EXEC xp_cmdshell 'DEL /F /Q C:\Temp\crud*.xel' DECLARE @SPID VARCHAR(3) = @@SPID DECLARE @event_sql NVARCHAR(MAX) = N' CREATE EVENT SESSION [Crud] ON SERVER ADD EVENT sqlserver.rpc_completed(SET collect_data_stream=(1) ACTION(sqlserver.database_id,sqlserver.database_name,sqlserver.server_principal_name,sqlserver.session_id,sqlserver.sql_text) WHERE ([package0].[greater_than_uint64]([sqlserver].[database_id],(4)) AND [package0].[equal_boolean]([sqlserver].[is_system],(0)) AND [sqlserver].[session_id]=('+@SPID+'))), ADD EVENT sqlserver.sp_statement_completed(SET collect_object_name=(1) ACTION(sqlserver.database_id,sqlserver.database_name,sqlserver.server_principal_name,sqlserver.session_id,sqlserver.sql_text) WHERE ([package0].[greater_than_uint64]([sqlserver].[database_id],(4)) AND [package0].[equal_boolean]([sqlserver].[is_system],(0)) AND [sqlserver].[session_id]=('+@SPID+'))), ADD EVENT sqlserver.sql_batch_completed( ACTION(sqlserver.database_id,sqlserver.database_name,sqlserver.server_principal_name,sqlserver.session_id,sqlserver.sql_text) WHERE ([package0].[greater_than_uint64]([sqlserver].[database_id],(4)) AND [package0].[equal_boolean]([sqlserver].[is_system],(0)) AND [sqlserver].[session_id]=('+@SPID+'))) ADD TARGET package0.event_file(SET filename=N''c:\temp\crud'') WITH (MAX_MEMORY=4096 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=2 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=ON,STARTUP_STATE=OFF) ALTER EVENT SESSION [Crud] ON SERVER STATE = START ' PRINT @event_sql EXEC sys.sp_executesql @event_sql /*Insert 100 rows*/ INSERT dbo.LittleBadges WITH (TABLOCK) (Id, UserId ) SELECT TOP (100) b.Id, b.UserId FROM dbo.Badges AS b WHERE b.Date >= '20160301'; /*TO THE SHREDDER*/ DROP TABLE dbo.LittleBadges |
Using an inline TVF makes things faster for the query, but we can’t use it in a computed column. Other downsides: inline TVFs aren’t tracked in DMVs (2016 has a function_stats DMV that doesn’t catch them).
This is the same in Query Store.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 |
/*Let's make that inline, y'all*/ CREATE OR ALTER FUNCTION dbo.Fake_String_Agg_Inline (@UserId INT) RETURNS TABLE WITH SCHEMABINDING AS RETURN SELECT STUFF((SELECT N', ' + b2.Name FROM dbo.Badges AS b2 WHERE b2.UserId = @UserId GROUP BY b2.Name FOR XML PATH(N''), TYPE ).value(N'.[1]', N'NVARCHAR(4000)'), 1, 2, N'') AS Badges GO /*Testing out this inline voodoo*/ SELECT b.UserId, f.Badges FROM dbo.Badges AS b CROSS APPLY dbo.Fake_String_Agg_Inline(b.UserId) AS f WHERE b.Date >= '20160301'; SELECT u.DisplayName, f.Badges FROM dbo.Users AS u CROSS APPLY dbo.Fake_String_Agg_Inline(u.Id) AS f WHERE u.LastAccessDate >= '20160306'; /*TURN OFF QUERY PLANS OR ALL WILL BURN*/ DECLARE @ThisIsTheModernWorld DATETIME = GETDATE() EXEC master.dbo.sp_BlitzQueryStore @DatabaseName = 'StackOverflow', @StartDate = @ThisIsTheModernWorld GO EXEC sp_BlitzQueryStore @DatabaseName = 'StackOverflow', @PlanIdFilter = 3526 GO DECLARE @ThisIsTheModernWorld DATETIME = GETDATE() EXEC master.dbo.sp_BlitzQueryStore @DatabaseName = 'StackOverflow', @StoredProcName = 'Fake_String_Agg_Inline' , @StartDate = @ThisIsTheModernWorld GO SELECT OBJECT_NAME(qsq.object_id), * FROM sys.query_store_query AS qsq WHERE OBJECT_NAME(qsq.object_id) IS NOT NULL |
Temp Tables or Table Variables
Table variable modifications are forced to run serially.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
/*No parallel inserts*/ DECLARE @Votes TABLE (Id INT NOT NULL, VoteTypeId INT NOT NULL) INSERT @Votes ( Id, VoteTypeId ) SELECT v.PostId, v.VoteTypeId FROM dbo.Votes AS v WHERE v.VoteTypeId = 16 GO DECLARE @Votes TABLE (Id INT NOT NULL, VoteTypeId INT NOT NULL) INSERT @Votes ( Id, VoteTypeId ) SELECT v.PostId, v.VoteTypeId FROM dbo.Votes AS v WHERE v.VoteTypeId = 16 OPTION(USE HINT('ENABLE_PARALLEL_PLAN_PREFERENCE')); GO |
They’re also not guaranteed to be in memory. Backed by temp objects which may spill to disk.
1 2 3 4 5 6 7 8 9 10 11 12 |
SET STATISTICS TIME, IO ON DECLARE @Votes TABLE (Id INT NOT NULL, VoteTypeId INT NOT NULL) INSERT @Votes ( Id, VoteTypeId ) SELECT v.PostId, v.VoteTypeId FROM dbo.Votes AS v WHERE v.VoteTypeId = 16 SELECT COUNT(*) AS [Vote Type] FROM @Votes AS v GO |
Bad estimates may prevent parallel plans from happening when they should have.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 |
/*Crappy estimates*/ DECLARE @Votes TABLE (Id INT NOT NULL, VoteTypeId INT NOT NULL) INSERT @Votes ( Id, VoteTypeId ) SELECT v.PostId, v.VoteTypeId FROM dbo.Votes AS v WHERE v.VoteTypeId = 16 SELECT COUNT(*) AS [Vote Type] FROM @Votes AS v GO DECLARE @Votes TABLE (Id INT NOT NULL, VoteTypeId INT NOT NULL) INSERT @Votes ( Id, VoteTypeId ) SELECT v.PostId, v.VoteTypeId FROM dbo.Votes AS v WHERE v.VoteTypeId = 16 SELECT COUNT(*) AS [Vote Type] FROM @Votes AS v JOIN dbo.Posts AS p ON p.Id = v.Id GO |
Does recompiling always make things better?
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
/*Does recompile really even help?*/ DECLARE @Votes TABLE (Id INT NOT NULL, VoteTypeId INT NOT NULL) INSERT @Votes ( Id, VoteTypeId ) SELECT v.PostId, v.VoteTypeId FROM dbo.Votes AS v WHERE v.VoteTypeId = 16 SELECT COUNT(*) AS [Vote Type] FROM @Votes AS v JOIN dbo.Posts AS p ON p.Id = v.Id OPTION(RECOMPILE) GO |
Temp tables generally work better!
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
/*Temp table superiority*/ DROP TABLE IF EXISTS #Votes CREATE TABLE #Votes (Id INT NOT NULL, VoteTypeId INT NOT NULL) INSERT #Votes ( Id, VoteTypeId ) SELECT v.PostId, v.VoteTypeId FROM dbo.Votes AS v WHERE v.VoteTypeId = 16 SELECT COUNT(*) AS [Vote Type] FROM #Votes AS v JOIN dbo.Posts AS p ON p.Id = v.Id GO |
Do indexes change anything?
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 |
/*BUT INDEXES*/ DECLARE @Votes TABLE (Id INT NOT NULL, VoteTypeId INT NOT NULL, INDEX ix_Id CLUSTERED (Id)) INSERT @Votes ( Id, VoteTypeId ) SELECT v.PostId, v.VoteTypeId FROM dbo.Votes AS v WHERE v.VoteTypeId = 16 SELECT COUNT(*) AS [Vote Type] FROM @Votes AS v JOIN dbo.Posts AS p ON p.Id = v.Id GO DROP TABLE IF EXISTS #Votes CREATE TABLE #Votes (Id INT NOT NULL, VoteTypeId INT NOT NULL, INDEX ix_Id CLUSTERED (Id)) INSERT #Votes WITH (TABLOCK) ( Id, VoteTypeId ) SELECT v.PostId, v.VoteTypeId FROM dbo.Votes AS v WHERE v.VoteTypeId = 16 SELECT COUNT(*) AS [Vote Type] FROM #Votes AS v JOIN dbo.Posts AS p ON p.Id = v.Id GO |
ROLLBACK ERIK
Are your performance skills out of fashion?
If you learned things during the webcast, and you’re starting to question your taste, have no fear: we’re here to help. We’re doing an all-day pre-con class before the PASS Summit called Expert Performance Tuning for SQL Server 2016 & 2017. We specifically designed it to update your performance skills for today – and a lot of the techniques are even useful on currently patched versions of 2012 & 2014, too. Learn more and register for the pre-con.
6 Comments. Leave new
Great Session, really enjoyed the flashbacks!
From what I’ve seen with a recent Inline TVF that I did (to replace three scalar UDFs), I wonder if part of the reason that TVFs aren’t tracked in DMVs is because the optimizer treats them like a regular view. I definitely was seeing the I/O and TIME statistics in my “outer query” once I switched to the I-TVF, whereas when it was doing UDFs, the statistics for the UDF weren’t visible from the outer query and you had to check the DMVs to see how ugly they were.
Yep, that’s exactly why.
I just wanted to say that putting Groove VARCHAR(100) into table TheHeart made me laugh out loud. I appreciate the subtle humor! Keep it up!
-dougl
Hahaha, thanks sir. Wondered how many people would catch that.
Wow – i really enjoyed this session. I wish every learning video session would have this much laughing and funny pictures in it – it would help me not falling asleep and actually stay focused on the session and not check social media during watching.