sp_Blitz® Result: Poison Wait Detected
Whenever SQL Server needs to wait while it’s executing queries, it tracks that time as a wait statistic. We can query the waits tables to get overall bottleneck data over time.
Some waits are what we call poison: any occurrence of them means your SQL Server may feel unusable while this wait type is happening. You might only have a few minutes of this wait type, out of hours or days of uptime, but a few minutes of inaccessibility sure seems like a total outage to your end users.
This part of our SQL Server sp_Blitz® script checks sys.dm_os_wait_stats looking for significant cumulative wait times for the following wait types:
CMEMTHREAD – for this particular warning, see Microsoft’s blog post on running SQL Servers with >8 cores per NUMA node. If you suspect that CMEMTHREAD is your server’s largest problem, and you meet the symptoms described in that KB article, open up a support ticket with Microsoft to be safe rather than just enabling this trace flag. It’s only $500, and they can give you a double check confirmation that this trace flag makes sense for you. (Or call us. I’m just sayin’.)
IO_QUEUE_LIMIT – in Azure SQL DB, this means your database is getting throttled.
IO_RETRY – a read or write failed due to insufficient resources, and we’re waiting for a retry.
LOG_RATE_GOVERNOR – in Azure SQL DB, this means your database is getting throttled.
PREEMPTIVE_DEBUG – someone probably accidentally hit the DEBUG button in SSMS rather than Execute.
RESMGR_THROTTLED – in Azure SQL DB, this means a new request has come in, but it’s throttled based on the GROUP_MAX_REQUESTS setting.
RESOURCE_SEMAPHORE means SQL Server ran out of available memory to run queries, and queries had to wait on available memory before they could even start. You can learn more about this in Query Plans: Memory Grants and High Row Estimates.
RESOURCE_SEMAPHORE_QUERY_COMPILE is a lot like RESOURCE_SEMAPHORE, but it means SQL Server didn’t even have enough memory to compile a query plan. This is usually only seen on really, really crappy servers: think 8-16GB RAM for heavy production workloads.
SE_REPL* – in Azure SQL DB, we’re waiting for the secondary replicas to catch up.
THREADPOOL means SQL Server ran out of worker threads, and new queries thought the SQL Server was frozen solid. During any occurrence of THREADPOOL, your SQL Server will feel like it’s locked up solid – although trickily, your CPU usage might be near zero.
To FIX THE PROBLEM
Danger, Will Robinson: your performance troubleshooting is about to become more adventurous. Start digging into the server’s overall health.