Blog

It’s easy to find problem queries on your SQL Server after they’ve happened. You can open your favorite monitoring tool or look in the plan cache. Those options are great for finding things after they’ve happened, but what if you want to find problems as they happen?

Server Side Trace

You could set up a SQL Server Server Side Trace. This is an acceptable solution if you’re using SQL Server 2008 or earlier. There are a few problems with this approach.

Problem the first – the SQL Server Profiler UI is less than beautiful. The GUI interface is somewhat cumbersome and the scripting interface is hilariously bad. To make matters worse – to only capture a single query requires string based filtering. There’s a plan handle property available in profiler, but it’s not available for TSQL or stored procedures.

Something appears to be missing

Something appears to be missing

As much as I like string matching in T-SQL (hint: I don’t), I think it’s fair to say that Profiler/Server Side Trace are best avoided unless you’re using SQL Server 2008 or earlier.

Extended Events

Extended events are nowhere near as hard as you’d think. Trust me on this. How do I know? I set up an Extended Events session to grab a particular query.

Here’s what you do:

  1. Create a crappy stored procedure. Use your imagination.
  2. Run the stored procedure a lot.
  3. Grab the query_hash for that stored procedure. The easiest way
    I’ve found to do this is to use our plan cache query to identify it. My query_hash is 0x2B42702322C10821. What’s yours?
  4. Create an extended events session to capture that specific query hash.

Step 4 isn’t as easy as it sounds. Astute readers will note that I never said this was easy, just that it’s not as hard as you would think.

Even though the query_hash is stored as a hexadecimal in SQL Server, Extended Events would prefer to see an unsigned 64-bit integer. To get this magic number, take your query_hash and cast it as either a BIGINT or a DECIMAL(20,0):

SELECT CAST(0x2B42702322C10821 AS DECIMAL(20,0));
/* my magic number is 3117177188303046689. */

Once you convert the query_hash to a magic_number, create your Extended Events session. I chose to capture the sp_statement_completed and sql_statement_completed events and a number of server level events like time, app name, and client host name. Here’s what it would look like capturing this to a file:

CREATE EVENT SESSION [query hash] ON SERVER 
ADD EVENT sqlserver.sp_statement_completed(
    ACTION(package0.collect_system_time,
           sqlserver.client_app_name,
           sqlserver.client_hostname,
           sqlserver.database_name)
    WHERE ([sqlserver].[query_hash]=(3117177188303046689.))),
ADD EVENT sqlserver.sql_statement_completed(
    ACTION(package0.collect_system_time,
           sqlserver.client_app_name,
           sqlserver.client_hostname,
           sqlserver.database_name)
    WHERE ([sqlserver].[query_hash]=(3117177188303046689.)))
ADD TARGET package0.asynchronous_file_target
(SET filename = 'C:\temp\XEventSessions\query_hash.xel',
     metadatafile = 'C:\temp\XEventSessions\query_hash.xem',
     max_file_size=5,
     max_rollover_files=5)
WITH (MAX_DISPATCH_LATENCY = 5SECONDS);
GO

Once you have that up and running, you should be able to start the session, if it isn’t already started, by running:

ALTER EVENT SESSION [query hash] ON SERVER
STATE = START ;

Just like that, you’ll be capturing your terrible queries to disk where you can mine the extended events files for gold and glory at your own convenience. If you want to query it, it’d look something like this:

WITH events_cte AS (
    SELECT 
        DATEADD(mi, DATEDIFF(mi, GETUTCDATE(), CURRENT_TIMESTAMP),
            xevents.event_data.value('(event/@timestamp)[1]',
            'datetime2')) AS [event time] ,
        xevents.event_data.value('(event/action[@name="client_app_name"]/value)[1]', 'nvarchar(128)')
          AS [client app name],
        xevents.event_data.value('(event/action[@name="client_hostname"]/value)[1]', 'nvarchar(max)')
          AS [client host name],
        xevents.event_data.value('(event/action[@name="database_name"]/value)[1]', 'nvarchar(max)')
          AS [database name],
        xevents.event_data.value('(event/data[@name="duration"]/value)[1]', 'bigint') 
          AS [duration (ms)],
        xevents.event_data.value('(event/data[@name="cpu_time"]/value)[1]', 'bigint') 
          AS [cpu time (ms)],
        xevents.event_data.value('(event/data[@name="logical_reads"]/value)[1]', 'bigint') AS [logical reads],
        xevents.event_data.value('(event/data[@name="row_count"]/value)[1]', 'bigint')  AS [row count]
    FROM sys.fn_xe_file_target_read_file
         ('C:\temp\XEventSessions\query_hash*.xel',
          'C:\temp\XEventSessions\query_hash*.xem',
          null, null) 
    CROSS APPLY (select CAST(event_data as XML) as event_data) as xevents
)
SELECT *
FROM events_cte
ORDER BY [event time] DESC;

Cleaning Up After Yourself

Once you’re done watching a specific query or queries, make sure you clean up after yourself. There’s no reason to add extra load to SQL Server when you aren’t watching. Make sure to stop and remove your Extended Events session:

/* Stop the Extended Events session */
ALTER EVENT SESSION [query hash] ON SERVER
STATE = STOP;
/* Remove the session from the server.
   This step is optional - I clear them out on my dev SQL Server
   because I'm constantly doing stupid things to my dev SQL Server. */
DROP EVENT SESSION [query hash] ON SERVER;

Augment Your Techniques

This won’t replace any other techniques that you’re using to find poorly performing queries. However, you can use this technique to find problem queries as they’re happening.

Different monitoring techniques provide you with a different view of the data. Our plan cache query shows you a historical record of what’s been going on, but it can only show so much. If a query has been run 14,400 times over the last 24 hours, does that mean it’s been run 10 times a minute for that entire period or is there a period of activity where that query gets run 1,440 times a minute for 10 minutes straight?

This isn’t going to replace the need for other tools you have at your disposal. It’s just another way to get the job done.

↑ Back to top
  1. Maybe I’m misreading your post, but it doesn’t seem you have described a method of finding a problem query but rather getting information about a running query you already know is a problem. Despite your comment in the first paragraph, at least one monitoring tool, SQL Sentry does in fact show you long running queries while they are running, and you don’t have to know what it is, or it’s plan handle beforehand. You don’t have to run a long query to find it, save it to a file then manipulate it, etc. etc.

    Despite consultants infatuation with them, IMHO Extended Events are still not a practical solution for the everyday DBA. I’m not convinced they are a better solution than a server side trace in most cases, except that they are lighter weight, which in some cases may be an important consideration. They are certainly not a better solution than a quality commercial monitoring tool. I have yet to see a usable, scalable monitoring solution using only Extended Events.

    • Great comment, Justin. Tell me – how would you go about tracking a single problem query using nothing but a SQL Server server side trace?

      • Hi Jeremiah – I don’t think I would track a single problem query using a server side trace, if for no other reason than the one you cited regarding string matching. I would use my monitoring tool.

        I realize that as a consultant for hire you often don’t have that luxury, so EE makes sense for this particular scenario. I just believe there are better options (at least one) for most DBAs working in their own environment where they hopefully choose/control the tools at their disposal.

        The entire 4, 5, 6 step process, including writing, querying, and cleaning up files on disk seems like something out of the 70s.

        • Justin – wow, that’s interesting. Can you describe your monitoring tool usage scenario? I’m curious to see how installing, configuring, and using a monitoring tool can be done in less than 4 steps, but I can’t wait to see it. I’m excited to hear how much better they’ve gotten recently, and you sound like just the person to explain it to us.

          • Brent – Wow this is getting completely out of hand. I never suggested, nor would I suggest one would install, set up, and configure any monitoring tool for a single query tuning session. BTW, your sarcasm is unbecoming.

          • Ah, sorry. When you were calling Extended Events’ 4-5-6 step process as “something out of the 70s” I didn’t realize you were being serious, not sarcastic. I thought that with a comment like that, you would welcome more sarcasm. Clearly I was mistaken, and you really did believe that a 6-step process belongs with AMC Pacers and bell bottoms. I had no idea, and I apologize deeply.

          • I think Brent and Jeremiah misunderstood Justin. He didn’t say that what Jeremiah was doing altogeter was too many steps, he was saying steps 4,5,&6 are just archaic in what they are doing.. (just paraphrasing not my opinion). He also didn’t say to use SQL Server server side trace to track a problem, he said SQL Sentry, which I will go Google when I finish my comment to see what they offer.

            I do agree with Justin that using the extended events in this scenario, you have to already know the query_hash of the sproc you are monitoring. I guess If you use Activity Monitor or Ask_Brent sproc, you can figure out your top running queries and then use that to feed the query_hash for monitoring.

            [edit: Jeremiah’s name is in fact “Jeremiah” not “Miah”]

          • To be honest, I recommend that you clean things up at the end because if I don’t say it, someone is going to leave a server side trace running, fill up their C: drive, and then SQL Server is going to crash and someone will say it’s my fault.

            Likewise – why not use the ring buffer? Someone is going to change a script, set it to 20GB of ring buffer, and then get upset when 20GB of memory went away because they filled up the ring buffer.

            Modern file systems and networked computer systems are both amazing pieces of technology that came out of the 1970s… along with the main memory database. We should probably stop using SQL Server because it’s based on 1970s technology, too… right?

          • What about just using ClearTrace and/or ReadTrace?

          • Ryan – with either option you need to use string matching to match the query text that you’re looking for. If you only want to collect metrics on a single query, it’s much lighter weight to only collect that information on a single query from the beginning than it is to collect all queries and filter it later.

  2. It’s been unfortunate that Justin’s comments have been misunderstood and the subsequent wraths. Anyway, we should get out of it and look for Brent and Jeremiah’s brilliant articles and guidelines.

    [edit: Jeremiah’s name is in fact “Jeremiah” not “Miah”]

  3. Having taken time to re-read my comments I realize they were off point and Ill considered. I even broke my own rule about not criticizing others in public. Jeremiah’s post was well written and informative. My comments took the discussion in the wrong direction. I hope you’ll accept my apology.

  4. I like the article, and it (almost) makes me want to like Extended Events. However, I have an issue with the title “Finding One Problem Query With Extended Events”. If the query_hash has to be embedded in the EVENT SESSION, how is this helping to find the problem query? Sure, it will capture all sorts of goodies about a known problem query, but that isn’t what the title says.

    • Because the article describes a method to find a single query that causes problems?

      • I have to agree with Andy (and Justin’s original point), a much better title would have been “Examining a problematic query with Extended Events.”

        • Thanks for your contribution to the world of SQL Server Extended Events.

        • I just caught up with this chain of comments.

          Jeremiah’s title makes sense to me with the word “find”. One problem I’ve frequently had clients come up against is needing to “find” and document how often specific queries run that they know have problems.

          Their goal is to show the impact that the query is making in a simple way and justify a software development investment to make it better.

          This post is the foundation for doing some VERY cool things that I don’t know of a way to do without Extended Events – “find” the waits associated with the problem query. Not just the overall wait on the server, but the waits by query. That’s certainly what I’d call and advanced/Senior DBA skill, but I don’t think it’s a “consultant only” skill or anything like that. (In that case I would say, “Let’s find the waits for the query in production.”)

          This is probably just a difference of language– but the word “find” does seem natural to me in the context of how I’ve seen software development work.

          But then, I also learned to speak English in Texas. :)

          –Kendra

          • Thanks Kendra, I appreciate that one of three B.O.U. responders chose NOT to make a snarky comment. I’d like to think that despite y’all having elite minds you could sometimes see things from the perspective of us normal people.

  5. Hi,

    Why not use the DMVs to quickly and easily find problematic* queries as they happen (i.e. as they’re running)?

    The following query reports on queries that have been running for more than 20 seconds (this can be changed), ordered by duration….

    SELECT DATEDIFF(s, r.start_time, GETDATE()) AS [Duration(s)]
    , c.session_id
    ,SUBSTRING (st.text,
    r.statement_start_offset/2,
    (CASE WHEN r.statement_end_offset = -1
    THEN LEN(CONVERT(NVARCHAR(MAX), st.text)) * 2
    ELSE r.statement_end_offset END –
    r.statement_start_offset)/2) AS [Individual Query]
    , st.text AS [Parent Query]
    –, s.host_name, s.login_name, s.status, s.login_time, s.program_name, *
    FROM sys.dm_exec_connections c
    INNER JOIN sys.dm_exec_sessions s ON c.session_id = s.session_id
    INNER JOIN sys.dm_exec_requests r ON c.session_id = r.session_id
    CROSS APPLY sys.dm_exec_sql_text(c.most_recent_sql_handle) AS st
    WHERE DATEDIFF(s, r.start_time, GETDATE()) > 20 — Query running over 20 seconds
    ORDER BY [Duration(s)] DESC

    You can filter by part of the query name too (if you’re after a known query)…

    Thanks
    Ian
    Author: DMVs In Action (www.amazon.com/SQL-Server-DMVs-Action-Management/dp/1935182730/)

    * problem may relate to CPU time, elapsed time, excessive reads/writes etc you can filter on this too.

    • Howdy Ian,

      The DMV queries are great as long as a) the query you want to see is running and b) someone is there to run the query. Sure, you could sample, but then you have to sift through results.

      Once you know the query you want to track, the code samples in this blog post will let you track all executions of a single query and determine the waits and I/O associated with each query. It breaks the waits down and computes the average, standard deviation, and several percentials (75, 90, 95, and 99).

      So in answer to your question – your query does something totally different.

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

css.php