So you just built a brand new SQL Server, and before you go live, you wanna check to see if it’s going to perform as well as your existing server.
The easy way: test your maintenance jobs
It’s as easy as 1, 2, 3:
- Restore your production backups onto the new server
- Do a full backup, and time it
- Run DBCC CHECKDB, and time it
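The steps above boil down to a few lines of T-SQL. Here's a rough sketch that times both operations – the database name and backup path are placeholders, so swap in your own:

```sql
-- Hypothetical database name and backup path - replace with yours.
DECLARE @start datetime2 = SYSDATETIME();

BACKUP DATABASE YourDatabase
    TO DISK = N'R:\Backups\YourDatabase_test.bak'
    WITH COMPRESSION, CHECKSUM, STATS = 10;

SELECT DATEDIFF(SECOND, @start, SYSDATETIME()) AS backup_seconds;

SET @start = SYSDATETIME();

DBCC CHECKDB (YourDatabase) WITH NO_INFOMSGS, ALL_ERRORMSGS;

SELECT DATEDIFF(SECOND, @start, SYSDATETIME()) AS checkdb_seconds;
```

Use the same WITH options (compression, checksum) your production jobs use – otherwise you're not comparing apples to apples.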
Oh sure – this is nowhere near as good as testing your application code, and you should do that too, but this is your very first test. It’s extremely easy, and often it surfaces trouble very quickly. If your backup and CHECKDB runtimes are slower than your current production server, ruh roh – you’re in trouble.
This isn’t perfect because:
- Your job start times may be different – for example, if your production backup jobs run in the middle of the night at the same time as your ETL jobs, then your production backups could be artificially slower.
- Your concurrent workload may be different – maybe all your production backup jobs point at the same shared file target at midnight, making it dead slow. When you test your new server at 1PM when nothing's happening on the shared file target, it may be artificially fast.
- Your backup target may be different – the production servers might be writing their backups to local storage, which is of course one hell of a bad idea.
- Your backup target may be different – the production servers might be writing their backups to local storage, which is of course one hell of a bad idea, but a fast one – so the old server's numbers can look better than they deserve to.
- Your new version of SQL Server may be different – if you’re migrating from 2014 to 2016, and you find that CHECKDB runs faster, it might be the CHECKDB improvements in 2016.
But again – all of that said – if you find that your new production server’s maintenance jobs run slower than your current production servers, time to set off the alarms.
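To get the production-side numbers without guessing, you don't even have to watch the jobs run – SQL Server already logs backup history in msdb. Something like this, run on each server, gives you durations to compare:

```sql
-- Pull full backup durations from msdb's backup history.
-- Run this on both the old and new server, then compare.
SELECT  database_name,
        backup_start_date,
        DATEDIFF(SECOND, backup_start_date, backup_finish_date) AS duration_seconds,
        backup_size / 1048576.0 AS size_mb
FROM    msdb.dbo.backupset
WHERE   type = 'D'  -- D = full database backup
ORDER BY backup_start_date DESC;
```

Keep the caveats above in mind when you read the results – start times and concurrent workloads can skew these numbers in either direction.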
The easy but wrong way: test a synthetic workload
You could download HammerDB, an open source load testing tool, and run the kinda-sorta TPC-C workload against your SQL Server. It’s not an official TPC benchmark, but it’s a repeatable workload that you can run against multiple servers to see whether one is faster than the other.
At that made-up workload.
Which is probably nothing like your real apps.
Using HammerDB to compare two of your production servers is like comparing a Porsche 911 and a Chevy Corvette by measuring how many live babies they can carry in the trunk. It doesn’t really matter who wins – the comparison is meaningless.
The harder way: test individual queries
Use sp_BlitzCache to build a list of your worst-performing queries. Then run those same queries against the new production server to compare:
- Their logical reads
- Their duration
- Their CPU time
- Their execution plans – to understand why the above numbers are different
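Getting those first three numbers is straightforward: turn on the statistics output, run the query on both servers, and compare the Messages tab. A minimal sketch – the query itself is a hypothetical stand-in for one you pulled from sp_BlitzCache:

```sql
-- Capture logical reads, CPU time, and duration for a single query.
SET STATISTICS IO, TIME ON;

-- Hypothetical stand-in: paste one of your real
-- worst-performing queries from sp_BlitzCache here.
SELECT TOP 100 *
FROM dbo.YourBigTable
ORDER BY SomeDateColumn DESC;

SET STATISTICS IO, TIME OFF;
```

For the fourth bullet, turn on "Include Actual Execution Plan" in SSMS before running the query on each server, and compare the plans side by side.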
Just running the same queries back to back isn’t all that hard, but the reason I call this method “harder” is that you have to have enough SQL Server performance tuning knowledge to understand why the numbers are different, and what control you have over them. Are they the result of version/edition changes, SQL Server setting differences, or hardware differences? This stuff takes time to analyze and correct.
The really hard way: test your real workload
“Just run the same queries we run in production – how hard can it be?” Well, at scale, very hard – because it involves:
- The same starting point – after all, you can’t run the same delete statement twice in a row to achieve the same effect, and every insert statement affects subsequent test runs
- The same data load distribution – it’s easy to run an identical query across 1,000 sessions, but if they’re all trying to lock the same row, then you’re just reproducing blocking issues that you don’t have in real life
- The same workloads – and since your app and your database are constantly changing, the query traces you gathered 3 months ago are meaningless today, so you have to keep rebuilding this wheel every time you want to tackle a load testing project
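For the "same starting point" problem, one technique is a database snapshot: take a snapshot before each test run, then revert to it afterward so every run starts from identical data. A sketch, with hypothetical names and paths – note that you'd need to drop any other snapshots of the database before reverting:

```sql
-- Create a snapshot before the test run. The logical file name
-- must match the source database's data file.
CREATE DATABASE YourDatabase_TestSnap
    ON ( NAME = YourDatabase_Data,
         FILENAME = N'S:\Snaps\YourDatabase_TestSnap.ss' )
    AS SNAPSHOT OF YourDatabase;

-- ...run the load test...

-- Revert: kick everyone out, then roll the database
-- back to the snapshot's point in time.
USE master;
ALTER DATABASE YourDatabase SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
RESTORE DATABASE YourDatabase
    FROM DATABASE_SNAPSHOT = 'YourDatabase_TestSnap';
ALTER DATABASE YourDatabase SET MULTI_USER;
```

That solves the data-reset piece – the load distribution and trace-freshness problems are still on you.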
Is it possible? Absolutely – but start with the easy stuff first to get the most valuable data in the least amount of time.
Because remember – even when you find differences between the servers, that’s still only the start of your journey. You have to figure out why things aren’t performing as well as you’d expected – and now you’re right back at the prior step, comparing query plans and doing root cause analysis. Before you try to do that across 100 queries, start with your worst-performing query by itself.