T-SQL & Development

How to Load Test a Database Application

By Brent Ozar · January 15, 2019 · 10 comments

It sounds really simple:

Capture a bunch of production queries
Replay those same queries over in a load test environment

So what’s the big deal? Why is it so hard to find good guidance on how to pull this off? How hard can it be?

Problem 1: many queries aren’t repeatable.

When you try to replay this query:

DELETE ProcessingQueue
  WHERE JobID = 12345;

1 2	DELETE ProcessingQueue WHERE JobID = 12345;

The rows have to actually be there in the table, or else your delete isn’t doing any work. That means you can’t replay that same query several times in a row. You’re going to have to restore the database after each load test, starting it back at the same point.

Deletes aren’t the only commands with this problem, either: for any query with a WHERE clause, we need the data to exist in the state we want:

SELECT *
  FROM ProcessingQueue
  WHERE IsProcessed = 0 AND JobID = 12345;

1

2

3

SELECT *

FROM ProcessingQueue

WHERE IsProcessed = 0 AND JobID = 12345;

In both cases, the rows for JobID 12345 have to already exist, so for a successful load test, our setup steps at the beginning of the post need to look more like:

Take a backup of the production database(s)
Capture a bunch of production queries
Restore that same backup to the load test environment
Replay those same queries over in a load test environment

(I can already hear the security crowd groaning, saying, “No no no, you’re never allowed to restore production data somewhere else.” Put a sock in it: we’re already talking about capturing queries in production, and as you know, those have personally identifiable data already. Read through til the end of the post before you get the flamethrower out.)

Problem 2: amplification introduces false blocking.

Managers think that to simulate more load, they can just take the production queries and replay them multiple times, simultaneously, from the replay tool. We’ve already talked about how you can’t reliably replay deletes, but even inserts and updates cause a problem.

Say we’re load testing Stack Overflow queries, and our app does this:

UPDATE dbo.Users
  SET Reputation = Reputation + 1
  WHERE Id = 22656;

1

2

3

UPDATE dbo.Users

SET Reputation = Reputation + 1

WHERE Id = 22656;

If try to simulate more load by running that exact same query from 100 different sessions simultaneously, we’re just going to end up with lock contention on that particular user’s row. We’ll be troubleshooting a blocking problem, not the problem we really have when 100 different users run that same query.

We really need to simulate activity across lots of DIFFERENT users, so you end up writing code to randomize parts of the load test. Now our load test routine looks like:

Take a backup of the production database(s)
Capture a bunch of production queries
Analyze the production queries to see which ones need to be randomized, and write that code (or require it as part of the development process)
Restore that same backup to the load test environment
Replay those same queries over in a load test environment using a tool that can randomize the parameter calls

It’s hard work, but it’ll pay off because we can reuse our load test over time, right? Well, wait..

Problem 3: production is running yesterday’s code.

Load testing is (theoretically) useful to know that our new code changes are going to perform well when we deploy ’em. However, if we’re capturing production queries…they’re not running the new code yet, nor does the production database have our new columns and tables.

So now our testing routine looks like:

Take a backup of the production database(s)
Capture a bunch of production queries
Analyze the production queries to see which ones need to be randomized, and write that code (or require it as part of the development process)
Restore that same backup to the load test environment
Apply the deployment scripts for our new code
Replay ~~those same queries~~ a mix of old code and our new production code over in a load test environment using a tool that can randomize the parameter calls

That’s why it’s so hard to load test databases.

Everybody starts down the analysis path, gets about an hour into the planning, and realizes it’s still a big challenge in 2019. When you take all of these problems together, plus the risks of using production data in development, the capture-and-replay technique just flat out doesn’t make financial sense for application load tests.

But I was careful there to say application load tests.

Capture-and-replay load tests can still make sense when you want to test new hardware, different compatibility levels, or new indexing strategies. Just make sure to restore the database to the right point in time, and don’t try to amplify your workloads lest you introduce blocking problems.

Free, 3× a week

Get my new posts by email

Three posts a week, plus a Monday roundup of the best database news from around the web.

10 comments

Don

January 15, 2019 at 10:13 am

I thought SQL had a tool to do this, Distributed Replay. Does Microsoft make any more tools for this scenario? Essentially, we are trying to replicate a blocking scenario on dev environment.

Reply
1. Brent Ozar
  
  January 15, 2019 at 10:23 am
  
  Don – unfortunately, the limitations described in the post exactly match Distributed Replay.
  
  Reply
Fernando Jacinto VIP Student since 2019

January 15, 2019 at 10:20 am

Or maybe using DEA https://docs.microsoft.com/en-us/sql/dea/database-experimentation-assistant-overview?view=sql-server-2017

Reply
1. Brent Ozar
  
  January 15, 2019 at 10:24 am
  
  Fernando – yep, same limitations as Distributed Replay, and the same problem described in the post. Not a solution either, sadly.
  
  Reply
Greg Sohl

January 15, 2019 at 11:21 am

That’s why we automate the application and capture the database activity for performance analysis. Randomization is built in.

Reply
1. Brent Ozar
  
  January 15, 2019 at 11:25 am
  
  Exactly – if you can guess what users are going to do, in what quantities, then you can reliably produce load and repeat it.
  
  Although if you can precisely guess what users are going to do, well, that’s another blog post altogether!
  
  Reply
  1. Willie
    
    January 15, 2019 at 11:58 am
    
    That’s like Marty McFly and Biff with the Sports Almanac going back in time! 😉
    
    Reply
The Difficulties Of Database Load Testing – Curated SQL

January 16, 2019 at 5:00 am

[…] Brent Ozar shares some of the trouble you might run into when database load testing: […]

Reply
Robert Eder VIP Student since 2017

January 21, 2019 at 2:09 pm

There is one more difficulty not mentioned and in my opinion, it is the most difficult to reproduce: concurrency and contention. Consider a production environment that has four application servers all hitting the same database. You can capture the queries that these servers send to SQL, but can you replay these queries to simulate the same concurrency and contention that exists between the four application servers in production? Unless you can find a way to do this, you will hear “It worked in the load test.”

Reply
Office Hours: Read This, Watch This Edition - Brent Ozar Unlimited®

May 14, 2026 at 1:27 am

[…] Read this. […]

Reply