Ever heard someone say this? “I tuned a query, and I made it 50% faster!”


Want to watch a demo? Kendra shows an example in this 3.5 minute video.

Whenever I hear that I wonder how the person measured performance, and what their test runs were like.

Measuring query performance can be tricky. Say you run a query and it takes sixty seconds to return data in SQL Server Management Studio. You make one change and you run it again, and it takes thirty seconds. Did you just improve the runtime by 50%? Or did the second query run faster because the data it needed was already in memory, and the first query had the disadvantage of having to read all the data from disk?

You won’t be able to show that you truly improved query performance unless you have a good tool to measure with. Luckily, the same tool helps you make sure you’re doing a fair comparison between the queries at the same time.


To start measuring your queries, run the following statements in SQL Server Management Studio:


Running these statements will cause SQL Server to print some performance information to the ‘Messages’ tab for your session window. This impacts only the session you run the statements in — this does not change a server level setting.

You can turn off these options at any time by running:



Turning on the STATISTICS IO option for your session causes SQL Server to track and print information about how SQL Server is interacting with your storage when you run queries.

The most useful numbers you will get are:

  • Logical Reads: How many 8K pages were read to satisfy the query. This count includes pages that were read memory AND pages that were read from disk. (They may have been read from disk by either physical reads, or read-ahead reads).
  • Physical Reads: How many 8K pages were read from disk, because they were not already in memory. These pages are left in memory after the query is done for as long as SQL Server can keep them there.
  • Read-Ahead Reads: How many 8K pages were pre-fetched from disk for the query. These are physical reads, but they are optimized for performance. Learn more about read-ahead here.


Use the STATISTICS IO setting for measuring significant changes. If you have thousands of reads and cut them in half, this helps you measure that.

Avoid using STATISTICS IO to measure fine tuned improvements— you typically don’t care about differences in a couple of pages read, unless those pages are read so frequently the represent a significant amount of CPU burn. (In that case, you probably want to think about whether there are other ways to solve the problem in your application environment outside of SQL Server, anyway.)

Don’t worry too much if physical reads + read-ahead reads is not always exactly equal to logical reads. This is sometimes the case, but SQL Server has optimizations that make the numbers not always add up.

Statistics IO also reports on “scan count” for your IO. I don’t find this number to be very useful for big-picture comparisons, so if you’re just starting out, don’t worry about it!


Good news, STATISTICS TIME is even easier than STATISTICS IO.

When you turn on the STATISTICS TIME setting, SQL Server will print messages explaining how much CPU time was used when executing the query. This is a great way to compare query performance because it’s a server-side metric. When you just look at the elapsed time in SQL Server Management Studio there are all sorts of other things that come into play, like how long it took to return the data to your client, and how long it took to stream the results into the UI.

You will notice that there are some slight variations in CPU time when you run a query multiple times. Again, we want to look at bigger numbers, and when you’re really getting into testing you probably want to average the number you get over multiple query runs.

How are Parallel Queries Measured by STATISTICS TIME?

When you look at CPU time executed for a query, you’re literally seeing the amount used across all CPUs involved. This means that if you’re looking at a parallel query, total CPU time may well be greater than the total elapsed time for the query.

This isn’t a problem— and it’s actually a good thing when it comes to comparing performance!

Measuring Performance Improvements: An Example

Let’s take a look at a quick example. I am running this example on SQL 2012 RTM, Developer Edition.

First, I use the ContosoRetailDW sample database and turn on my STATISTICS IO and STATISTICS TIME set options.

USE ContosoRetailDW;


Next, I create a temporary covering index to use for the query I want to tune.

CREATE INDEX test_ix_FactOnlineSales_DateKey_Incl_SalesAmount
	ON dbo.FactOnlineSales (DateKey)
	INCLUDE (SalesAmount);

Here’s my problem query. I have a user who has been running this, and they’ve asked me if there’s a way to make it better.

SELECT  MONTH(DateKey) AS [Month], SUM(SalesAmount) AS [Total Sales]
FROM    dbo.FactOnlineSales
WHERE   YEAR(DateKey) = 2007

When you first run your query, you see this as the output:

SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 6 ms.

(12 row(s) affected)
Table ‘Worktable’. Scan count 0, logical reads 0, physical reads 0,
read-ahead reads 0, lob logical reads 0, lob physical reads 0,
lob read-ahead reads 0.

Table ‘FactOnlineSales’. Scan count 3, logical reads 40866,
physical reads 0, read-ahead reads 0, lob logical reads 0,
lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 3219 ms, elapsed time = 1670 ms.

Here are the key pieces of information we want to know:

First, I did 40,866 logical reads and burned 3,219 milliseconds of CPU. I did not do any physical or read-ahead reads— my data was already in cache. (Creating the covering index did this for me!) Since my elapsed time is roughly half of my CPU time, I can tell that this query went parallel.

Knowing that all the data was in cache is important. If I am just doing logical reads, then comparing CPU time between multiple runs is fair.

I look at the query and focus in on this part as the easiest thing to change:

WHERE YEAR(DateKey) = 2007

By wrapping a function around the date and then comparing it, I’m forcing SQL Server to scan every row of my covering index, apply the function, and then compare it to the literal value I provided.

I decided to rewrite this one line in the query and try a new version. My revised query is:

SELECT  MONTH(DateKey) AS [Month], SUM(SalesAmount) AS [Total Sales]
FROM    dbo.FactOnlineSales
WHERE    DateKey >=  ‘2007-01-01’ AND DateKey < ‘2008-01-01’;

Giving this a whirl, I get the results:

SQL Server parse and compile time:
CPU time = 7 ms, elapsed time = 7 ms.

(12 row(s) affected)
Table ‘Worktable’. Scan count 0, logical reads 0, physical reads 0,
read-ahead reads 0, lob logical reads 0, lob physical reads 0,
lob read-ahead reads 0.

Table ‘FactOnlineSales’. Scan count 3, logical reads 13032,
physical reads 0, read-ahead reads 0, lob logical reads 0,
lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 1734 ms, elapsed time = 916 ms.

Once again, I did all logical reads— so I’m not comparing one query that had to do physical IO and/or readaheads with another query which did not have to do them.

Let’s compare these numbers:

  • Run 1: 40,866 logical reads and 3,219 milliseconds of CPU.
  • Run 2: 13,032 logical reads and 1,734 milliseconds of CPU

Voila! I can now show that with a one-line change, I reduced logical reads by more than 65% and cut CPU burn by 45% in my test run— and I have the numbers to prove it.

Extra Geekery

If you want to dig into scan count, learn more about scan count in Books Online and in this SQL CAT blog by Lubor Kollar.

If you’d like the deep info on how CPU Time is measured, you’ll enjoy Bob Dorr’s posts on timers in SQL Server.

Kendra Little
My goal is for you to understand your SQL Server’s behavior– and learn how to change it. When I’m not figuring out the solutions to your database problems, you’ll find me at user group meetings in Portland, Oregon. I also love to draw.
Kendra Little on sabtwitterKendra Little on sablinkedin
↑ Back to top
  1. Hi Kendra,

    Your session is very interesing, last few weeks every day i had learning some new points from your session.

    what is read a head read ? what is the advantage read a head read..?

    Could you please explaine bit clear its really helpful to me

    • Hi Siva,

      I’m glad the posts and videos are helpful! Read-ahead is a special kind of asynchronous read from storage. People usually use the words “pre-fetching” to describe it– the idea is that the query starts running and SQL Server has to do lots of different things. Sometimes it figures out that it can do a special kind of advance asynchronous read and fetch the pages from storage and put them in cache while the query is already doing other things with the CPU. This is a read-ahead read.

      Enterprise Edition in SQL Server has the ability to read larger chunks from disk when it’s doing read-ahead, thereby making it even more efficient.

      More info is in Books Online here:

      If you want to get super-deep into pre-fetching, I recommend a blog post by Craig Freedman as well:

      Hope this helps!

  2. Hi Kendra,

    Thanks for your reply..
    As per my understanding befor query executing Read a head reader is read from the storage and put them into cache.

    when i am executing the query 1st time, read a head read from storage and put them into cache.. but

    2nd time executing the same query process is same or it will featch from cashe only ?

    • Great question.

      You will see different things depending on how much data you read, how much memory you have, and what other things need to use the memory.

      Let’s say I have 5GB of memory, I’m reading 128MB of data, and I’m the only one doing anything on the SQL Server. In that case, when I rerun the query, it should just need to do logical reads (no physical, no read-aheads) because there was plenty of room in cache and no other users of the system were doing things that hogged up the cache after I ran my query.

      If I wait a long time to run the query a second time and other things are happening, then at that point data may not be in cache anymore.

  3. Thanks for your reply …mam Have greate day. im going to bed from INDIA.

    we wil meet tomorrows session.


  4. Kendra,

    Very helpful post — thank you for explaining the process so clearly. Your approach really caters to my learning process, so thanks again!

    Jesse Schleicher

  5. This is a great way to determine actual performance improvements. One way I’ve used to get an estimated performance improvement is to display the estimated execution plan for the original and re-written query and compare the estimated subtree cost value reported at the final icon. This is obviously not as precise as the above method, but I find it helps give me a general sense of any improvement (or degradation) my rewritten query has. Am I thinking correctly here?

    • Looking at execution plans is great. Never a *bad* thing.

      Hi Shaun,

      Thanks for your comment.

      If you start looking at statistics IO output, you should see that there are some cases where a change in TSQL gives you an execution plan that says it is more expensive, but actually uses less IO and less CPU. That’s because some changes in TSQL can help get you a more accurate plan. So sometimes a more “expensive” plan is actually a cheaper query, because it was able to more accurately predict the cost.

      One thing to note– and I learned this from Grant Fritchey, I’ll never forget how he said it– is that costs in execution plans are always estimated. Even on actual plans, the costs you see reported are estimates, and not a record of work it actually did.

      Execution plans are still great tools (both estimated and actual) and work well hand-in-hand with statistics time and statistics IO.

  6. Thanks for the quick tutorial. I can see this come in handy doing performance testing and used alongside execution plans.

  7. I need a way of getting these sort of performance metric for a particular database during application testing. With oracle I take a Snapshot between tests and gather logical/physical reads cpu time/elapsed which gives excellent daily reports to measure performance during the development process. NB multiple sessions .
    This then allows us to spot and focus in particular issues using the above method cheers

    • Hi Andrew,

      A somewhat similar way to snapshot and do a diff in SQL Server would be to capture information from two DMVs before and after, and then compare: sys.dm_os_wait_stats and sys.dm_io_virtual_file_stats.

      sys.dm_os_wait_stats will not give you CPU seconds, instead it tells you what SQL Server was waiting on at different periods. This isn’t super easy to interpret at first, but with practice it can be very helpful.

      sys.dm_io_virtual_file_stats is a bit more straightforward and can help you see how many reads/writes were done from storage. (Not the same as logical reads.)

      To really get similar numbers for CPU time, logical reads, etc, you would need to use SQL Trace or Extended Events in SQL Server. SQL Trace is an older technology and tends to be more performance impacting than Extended Events, but it works on more versions of SQL Server an can be much simpler to use (prior to SQL 2012).

      If you’d like to check out a tool that may help make this easier, check out Brent’s video from this week —

      Brent demos Cleartrace, which can help you easily run a lightweight trace and then slice and dice the stats from a workload. Cleartrace is a free tool written by Bill Graziano. (Thanks, Bill!)

  8. Thanks for feedback I’ll investigate

  9. What’s the “WorkTable” mention about? I don’t see that in my statistics results.

  10. I wonder why my query CPU time is greater then elapsed time.

    SQL Server Execution Times:
    CPU time = 125 ms, elapsed time = 112 ms.

    I thought elapsed time = CPU time + time return the data to the client + time to render UI.

    • CPU time can be greater than elapsed time when the query is using parallelism. The amount of time on the processors is calculated as a total– if multiple processes are in use at the same time, their sums are just added up (so it can be greater than elapsed time).

      I actually think this is a very good thing because it lets you see the total work done on the CPUs, and it doesn’t “hide” the work.

  11. Hi,
    isn’t it possible that a difference in CPU time vs elapsed time is attributed to signal wait (e.g. the plan didn’t go parallel but there were some I/O issues)

    • Hi Thierry,

      In the example above, CPU time is > elapsed time, so I wouldn’t think signal wait (waiting on getting on a CPU) would factor in to this case.

      Waiting on getting on a CPU would count as time waiting, and could certainly pump up the elapsed time, but shouldn’t be counted as running on a CPU. If CPU waits were a really big deal I would expect to see a different proportion of lower CPU time and higher elapsed time.

      Hope this helps!

      • Hi Kendra

        well, I got that mixed up good :)
        Thanks again for the quick reply and very informative post(s)!

        Best regards

  12. For measuring IO I typically do three before runs and three after runs, in both cases discarding the initial run. The logic behind this is physical reads aren’t consistent so you get everything in cache and only focus on the logical reads. This also gives me more accurate durations since tuning can change parallelism, and, although duration is my least favorite metric, it’s the only one my users see.

    Are there any problems with this approach?


    • Hi, Steve. Let me ask a followup clarifying question: you say, “get everything in cache.” Are you saying that everything you tune fits in cache?

      • I don’t think I could say yes to that, although I am spoiled with some good dev servers. Slowing down and putting thought into my wording I would have to say “everything on a level playing field” instead of “everything in cache”.

        I should also mention that I recompile before each set of three test runs to ensure I’m truly comparing apples to apples with plans that are cached with the same parameters.

        Not to get off topic, but an old post of yours explaining how you schedule blog posts a week or more out to give the ideas time to settle in before the public sees it is really starting to sink in.

  13. This is insignificant to the authors intended point of the article, but it stands out like a sore thumb. The substituted WHERE clause will inadvertantly exclude the day of December 31st, due to BETWEEN only accounting for midnight.

    WHERE DateKey BETWEEN ‘2007-01-01’ AND ‘2007-12-31’

    For new comers I would recomment the GTE/LT approach. This will include all timeframes the YEAR() function captures.

    WHERE DateKey >= ‘2007-01-01’ AND DateKey < ‘2008-01-01’


    • LOL– good point! When designing the demo I probably just quickly wrote out whatever TSQL returned some rows from AdventureWorks.

      Note: I updated the code sample. Thanks for pointing this out.

  14. It is probably worth to stress that STASTISTICS IO does not count IO that is within a scalar or table-valued functions at all. Details about it with an example is here:

  15. Thank you very much for this post and all your others. This one helped clarify a few things and goes well with the “Watch Brent Tune Queries” video that came along a couple of years later.

    Very minor typo in one spot, just trying to help…

    “CPU time = 3219 ms, elapsed time = 1670 ms.” is followed by “First, I did 40,866 logical reads and burned 3,3219 milliseconds of CPU.” Note the milliseconds in the second quote! There is an extra “3” in there.


  16. Hi Kendra,

    Very nicely and well written and with good explanation.

    Do I need to consider both CPU & time Elapsed time while comparing 2 different queries performance, or just the CPU time?


    • I typically focus mostly on CPU time and logical reads, but I wouldn’t ignore elapsed time altogether.

      If elapsed time is notably higher than CPU time, there may be some resource waits slowing the query down that you need to know about.

  17. I have what I would consider a simple query —
    SELECT a.ID ,
    RIGHT(‘000’+cast(a.TestNumber as varchar),4)+’-‘+a.LocationID TestNum ,
    a.BRANDID Brand ,
    FROM BatchProcess a
    INNER JOIN tblLookUpLocation b
    ON a.LocationID=b.LocationID
    WHERE b.LocatoinID=1
    –ORDER BY a.TestDate DESC, a.TestNumber DESC

    BatchProcess has a clustered primary key on ID, with 80K rows.
    tblLookUpLocation has 5 rows.

    With no ‘ORDER BY’ clause the query uses an index scan of the Process table and a merge join on the Lookup table. The CPU time is 47ms and elapsed time is 881 ms with 248 logical reads.
    If I uncomment the ‘ORDER BY’, logical reads remain 248, but CPU time increases to 188 ms and elapsed time falls to 452.
    3x increase in CPU and 1/2 total time. These were run immediately after each other and results were delivered to ssms on my desktop.

    I wondered if the increase in elapsed time is the result of CXPACKET waits, or if those waits are part of cpu time. In other words, is cpu time the time the cpu is actually working only, and does not include waiting on results? Thanks

  18. I neglected to mention that the query that has the ORDER BY clause runs parallel. Sorry.

    • It sounds like you’re saying without the ‘order by’ that it’s a serial plan. That would explain the lower CPU time. The higher elapsed time for the serial plan could be from a variety of causes / waits on that run, including network latency — kinda hard to say from here.

      I would expect CPU time to go up with a parallel plan as you observe.

  19. I’m having trouble using STATISTICS IO when a query has a huge result set I’m trying to tune, so big in fact; that I will run out of memory.

    I set discard results after execution(query -> query options -> grid -> discard results after execution) but that throws out the output of STATISTICS IO.

    Any Ideas?

    • Brandon – yeah, rather than returning the result set to SSMS, just insert it into a temp table.

      • The explain plan is showing that it will have to pay IO to do that which would skew my IO results for a query that is just data retrieval.

        Is that IO consistent enough that you just measure the difference between the before query and after query with the insert on both?

        Thanks for the replys and your blog btw I’m learning a lot.

Leave a Reply

Your email address will not be published. Required fields are marked *