I recently ran into a performance problem that had me scratching my head. How could a stored procedure’s plan have changed without the change being reflected in sys.dm_exec_procedure_stats?
So here’s what happened and what I knew:
- 7:45am: Users were complaining that a page was timing out
- 8:00am: Developer emailed the offending stored procedure and input parameter values
- 8:30am: Developer reports that the page is fast again
- 8:35am: I start troubleshooting even though it’s already fixed, because we need to figure out what happened
- UPDATE STATISTICS job runs every 30 minutes
- WhoIsActive data saved every 30 seconds
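The actual UPDATE STATISTICS job isn’t shown in the post, but a minimal sketch of what a scheduled step like that might run looks like this. The table name and sample rate are hypothetical; a real job would typically loop over tables or use a maintenance solution:

```sql
-- Hypothetical sketch of a stats-maintenance job step.
-- A real job would iterate over tables/statistics rather than hard-code one.
UPDATE STATISTICS dbo.Posts WITH SAMPLE 25 PERCENT;
```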
I just knew that the UPDATE STATISTICS job caused the stored procedure to be recompiled and get a better execution plan. But I needed to confirm it.
I first ran the stored procedure with the provided input parameter values and checked the execution plan.
I ran DBCC SHOW_STATISTICS against each of the indexes in the plan and checked the Updated column in the first result set. The statistics on one of the indexes had indeed been updated that morning.
Example DBCC SHOW_STATISTICS using StackOverflow database:
DBCC SHOW_STATISTICS('dbo.Posts', 'PK_Posts__Id');
I next checked when the plan was cached and which plan was in cache:
SELECT ps.cached_time, qp.query_plan, *
FROM sys.dm_exec_procedure_stats ps
CROSS APPLY sys.dm_exec_query_plan(ps.plan_handle) qp
WHERE [object_id] = OBJECT_ID('SomeStoredProc');
It returned 2 rows: one for the application executions and one for my adhoc execution. I could tell which one was mine because its execution_count was 1.
Often my plan will be different than the app’s plan, but this time they were the same. If they are different, you can mimic the app in SSMS by using its SET options. To get those options, grab the options_text value of the app’s username for the Login event in an XE session (use the System Monitoring\Connection Tracking event session template). Run those SET options in an SSMS query window and then run the stored procedure in that same window. You should get the same execution plan as the application.
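To illustrate, here’s what replaying captured SET options before running the procedure might look like. The specific option values and the parameter name are examples only; the real values come from the options_text you captured in the XE session:

```sql
-- Example only: replay the application's SET options, then run the proc
-- in the same window. ARITHABORT is the usual culprit for SSMS-vs-app
-- plan differences (SSMS defaults it ON; many client libraries leave it OFF).
SET ANSI_NULLS ON;
SET ANSI_PADDING ON;
SET ANSI_WARNINGS ON;
SET CONCAT_NULL_YIELDS_NULL ON;
SET QUOTED_IDENTIFIER ON;
SET ARITHABORT OFF;

EXEC dbo.SomeStoredProc @SomeParam = 123;  -- hypothetical parameter
```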
The app’s plan was cached the day before. But wait a second! My assumption was that it had recompiled this morning due to the updated stats.
I dug into the WhoIsActive data:
SELECT TOP 100 query_plan, *
FROM DBA.dbo.WhoIsActive
WHERE CAST(sql_command AS VARCHAR(MAX)) LIKE '%SomeStoredProc%'
AND collection_time BETWEEN '2016-01-04 07:30:00.000' AND '2016-01-04 08:30:00.000';
It did have a different execution plan that morning!
But why would the cached_time value show the day before?
I scratched my head for a bit and then took to Twitter (#sqlhelp). Kendra replied and said to check the SQL statement’s compile time, not the proc’s compile time.
SQL Statement Cached Time:
SELECT *
FROM
(
    SELECT
        total_worker_time/execution_count AS [Avg CPU Time]
        ,creation_time AS cached_time
        ,SUBSTRING(st.text, (qs.statement_start_offset/2)+1,
            ((CASE qs.statement_end_offset
                WHEN -1 THEN DATALENGTH(st.text)
                ELSE qs.statement_end_offset
            END - qs.statement_start_offset)/2) + 1) AS statement_text
        ,*
    FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
) t
WHERE CAST(statement_text AS VARCHAR(MAX)) LIKE '%SELECT blah%' -- <-- Something to help you find the query
ORDER BY [Avg CPU Time] DESC;
And there it was. The cached time was from that morning.
I’m still unclear why cached_time in sys.dm_exec_procedure_stats didn’t change when the stored procedure’s plan changed as a result of the SQL statement’s plan being recompiled after the stats update.
Brent says: even if you’ve got a monitoring tool, it can be helpful to log sp_WhoIsActive results into a table so that you can go back for troubleshooting later. Just make sure to set up a job to clean out that table periodically.
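A minimal sketch of that logging setup, using the DBA.dbo.WhoIsActive table name from the post (the 7-day retention window is just an example; pick what fits your environment):

```sql
-- Collection step (e.g., a SQL Agent job running every 30 seconds).
-- @get_plans = 1 captures the query_plan column queried earlier;
-- the destination table can be generated via sp_WhoIsActive's @return_schema option.
EXEC sp_WhoIsActive
    @get_plans = 1,
    @destination_table = 'DBA.dbo.WhoIsActive';

-- Cleanup step (e.g., a daily job), keeping a week of history.
DELETE FROM DBA.dbo.WhoIsActive
WHERE collection_time < DATEADD(DAY, -7, GETDATE());
```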
3 Comments
I believe that it’s the difference between procedure-level recompilation and statement-level recompilation. When the stats changed it would have caused the statement to recompile, so the date was reflected in sys.dm_exec_query_stats for that statement. If you’d run sp_recompile on the proc, then the date in sys.dm_exec_procedure_stats would have changed.
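For example, forcing a procedure-level recompile looks like this; the plan is rebuilt (and cached_time in sys.dm_exec_procedure_stats updated) on the next execution:

```sql
-- Marks the procedure for recompilation; the new plan is compiled
-- the next time the procedure runs, not immediately.
EXEC sp_recompile N'dbo.SomeStoredProc';
```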
I notice recently that the plan_handle value for a proc or batch stays the same even when one of the statements recompiles. Based on this insanely-detailed post (http://blogs.msdn.com/b/sqlprogrammability/archive/2007/01/09/1-0-structure-of-the-plan-cache-and-types-of-cached-objects.aspx) it looks like a query plan is a collection of “MEMOBJ” structures that are linked together. So I think that means that with a statement-level recompile, the plan’s overall structure/identifier/location stays pretty much the same, including those internal links, except the “MEMOBJ” structure for the specific statement that was recompiled.
You’re saying that the proc suddenly became slow, then the UPDATE STATISTICS job ran, causing a statement recompile, and a new execution plan was generated that solved the problem? Is that it?
But why did the proc become slow from one day to the next? The only way I can see this happening is a large amount of data being modified in a short period of time. Wouldn’t the correct approach to this problem have been to check the wait stats for that proc?