How Do I Know My Query Will Be Fast In Production?


We’ve all had this problem before – the production server is significantly more powerful than the development server. You tune a query and you’re pretty sure you’ve made it faster, but there’s that doubt in the back of your mind. There are so many things that can change performance – memory, CPU count, disk performance – how can you tell that you’ve made a query better?

Measuring Slowness

When you know you have a slow query that you need to tune, run it in production one last time. Before you do, turn on STATISTICS IO and STATISTICS TIME so the run captures CPU and IO metrics:
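A minimal version of the setup looks like this (the SELECT is just a stand-in for whatever query you're tuning, and `dbo.YourTable` is a made-up name):

```sql
-- Report IO and CPU for every statement in this session
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Stand-in for the slow query you're about to tune
SELECT COUNT(*)
FROM dbo.YourTable;

-- Turn the reporting back off when you're done
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```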

You’ll get a bunch of jibber jabber in the “Messages” tab that documents how much CPU time was spent compiling and running your query as well as information about logical IO. Save this off to a text file for now.

Measuring Fastness

Go to your dev server. Tune your query. I’ll wait.

As you're tuning your query, pay attention to the CPU time and logical IO numbers coming back from STATISTICS IO and STATISTICS TIME. They're a relative indication that things are getting better. If the dev server's CPUs differ from production's, you'll see different CPU numbers on the dev machine, but that's OK. We're comparing total logical IO and total CPU time in milliseconds, not wall-clock speed.

If you want an easy route to compare your results, you can paste the output from STATISTICS IO and STATISTICS TIME into statisticsparser.com. This will go through the results and push the data into a nice table for your perusal.
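If you'd rather script the comparison yourself, the messages output is plain text and easy to total up. Here's a minimal sketch; the regexes assume the default English message format (`logical reads 1234`, `CPU time = 15 ms`), and the sample text is made up:

```python
import re

def total_stats(messages: str) -> dict:
    """Sum logical reads and CPU time from STATISTICS IO / STATISTICS TIME output."""
    logical_reads = sum(int(m) for m in re.findall(r"logical reads (\d+)", messages))
    cpu_ms = sum(int(m) for m in re.findall(r"CPU time = (\d+) ms", messages))
    return {"logical_reads": logical_reads, "cpu_ms": cpu_ms}

sample = """
Table 'Posts'. Scan count 1, logical reads 8441, physical reads 0.
Table 'Users'. Scan count 1, logical reads 912, physical reads 3.
 SQL Server Execution Times:
   CPU time = 547 ms,  elapsed time = 1210 ms.
"""
print(total_stats(sample))  # {'logical_reads': 9353, 'cpu_ms': 547}
```

Save the totals for the before and after runs and you have your comparison without leaving the command line.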

How Much Faster Will my Query Be?

Using STATISTICS IO and STATISTICS TIME to tune helps me figure out how well I'm doing compared to where I started. Some queries move around a lot of data, and in those cases it helps to know how much faster production is than dev.

If you really want a reasonable comparison between dev and production, you can find some guidance in the Fast Track Data Warehouse Reference Guide. There’s a section of this document about measuring the MCR (Maximum Consumption Rate) and BCR (Base Consumption Rate) of your SQL Server. These numbers reflect 100% cache reads and 100% disk reads, respectively.

Once you have MCR and BCR, you can make effective predictions about how query performance will differ between the two environments. This gives you the power to predict, with some margin of error, how fast newly developed queries will run.
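The arithmetic behind that prediction is simple: divide the data volume a query reads by the environment's consumption rate to get a rough runtime estimate. The rates and data volume below are invented for illustration, not measured figures:

```python
def estimate_seconds(mb_read: float, consumption_rate_mb_s: float) -> float:
    """Rough runtime estimate: data volume divided by the server's consumption rate."""
    return mb_read / consumption_rate_mb_s

# Hypothetical cache-read rates (MCR) for each environment, in MB/s, illustrative only
query_mb = 4096.0          # the query scans ~4 GB
dev_mcr, prod_mcr = 600.0, 2400.0

dev_est = estimate_seconds(query_mb, dev_mcr)    # ~6.8 s
prod_est = estimate_seconds(query_mb, prod_mcr)  # ~1.7 s
print(f"dev ~{dev_est:.1f}s, prod ~{prod_est:.1f}s, speedup {dev_est / prod_est:.1f}x")
```

Swap in BCR instead of MCR when the query will be reading from disk rather than cache.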

But is it Faster?

Between STATISTICS IO and STATISTICS TIME you can determine if a query is going to use fewer resources. The MCR and BCR give you relative numbers for how fast a server can process data. Between these measurements, you can predict how a query will perform between environments and what kind of impact your changes will have.

Brent says: as you get used to doing these measurements, you’ll also be able to tell if a new dev-only query is going to be fast enough in production, too. You’ll know how many logical reads counts as slow (or fast) in your environment.

Erik says: The hardest thing for me to ever get was commensurate data to test on in dev. If you don’t have it, get it. If you have it, don’t waste it.


9 Comments

  • James Lupolt
    April 8, 2015 11:29 am

    I’ve also come across the opposite situation frequently, where dev is faster than prod because there’s no load on the dev SAN.

    I think the same principles still apply: focus more on how much work the query is doing (via DMVs or STATISTICS IO/TIME) than on how long it takes, and you can be more confident that you’re improving performance.
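[Ed.: the DMV route mentioned here can be a one-off query against the plan cache. A sketch, assuming the plans are still cached; note that `total_worker_time` is reported in microseconds:]

```sql
-- Top queries by work done, from the plan cache
SELECT TOP (10)
    qs.total_logical_reads,
    qs.total_worker_time / 1000 AS total_cpu_ms,   -- worker time is in microseconds
    qs.execution_count,
    SUBSTRING(st.text, 1, 100) AS query_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_logical_reads DESC;
```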

    Reply
  • Using a ratio of MCR and BCR sounds good, but I sometimes find that the execution plan is dramatically different between environments due to differences in the number of cores and the amount of RAM. That would seem to make those ratios useless, because we are no longer comparing the same work done the same way in both environments.

    In a related rant, it would be nice if the optimizer took I/O speed into account somehow. I’ve got slow high-latency platters in DEV and TEST, but all SSD in PROD. That should influence the plans as well, but doesn’t. But that would only make the idea of using MCR and BCR ratios even less useful because even more plans would be wildly different between environments.

    Reply
    • When we work with clients pricing out production systems, we remind them of the importance of an equally sized development system for precisely that reason – you can get dramatically different plans between those two environments. Developers need to be able to accurately write and tune queries. The cost difference for these things is a fraction of the licensing cost on a production SQL Server and even a fraction of the support costs for the SAN.

      On the second point, I disagree. The optimizer should know about the number of pages involved, but it can’t make a guess about storage performance. An idle mid-level SAN can outperform an overworked high end SAN. No costing model can account for the inconsistencies of concurrent disk access or the trickery that SAN vendors use to get performance improvements out of their storage. Some other database vendors let you adjust the costing model used, but that’s just adjusting how I/O costs are calculated and it doesn’t reflect storage performance itself.

      Reply
      • James Lupolt
        April 8, 2015 6:37 pm

        A while ago I asked someone on the optimizer team if he thought SQL Server should adjust the costing model for different storage scenarios, for example by introducing an option to assign random and sequential I/O a more similar cost when SQL Server is on flash storage. His answer was that even if this meant the model was built on more accurate assumptions, it was unlikely to result in better execution plans.

        Reply
  • Alex Friedman
    April 30, 2015 5:22 am

    Yeah, the real trouble is that the data in dev never really represents the scope and distribution of the production data… Do these unicorns really exist?

    BTW, statisticsparser.com is awesome! Thanks for that. A good substitute if SQLSentry’s Plan Explorer can’t be reached quickly.

    Reply
    • Alex – actually, we find that in most cases, dev is a frequent refresh from production. I need my developers to work with the same types of data that they have in production – after all, they need to catch edge cases that may not exist in the dev data.

      Reply
      • Alex Friedman
        April 30, 2015 9:36 am

        I’m really glad to hear that. My experience is mostly in financial systems, where refreshing from production has regulatory complexities and requires complex data masking — not to mention the lack of the budget for a production-scale dev env.
        It’s great that you recommend this to your clients.

        Reply
        • In general, if you need a refresh of data, even with data under strict compliance rules, there are solutions for that. There are a number of products that can perform data cleansing and a variety of ways you can make that work.

          Nothing says development has to be the same gear as production. You can pick up a 1TB SSD for less than $500.

          Reply
