For the last twenty years, SQL Server performance tuning has relied on a few facts:
- SQL Server provided a lot of performance metrics with thorough documentation from both Microsoft and the community.
- You could fetch those metrics in a variety of ways, like Perfmon counters and dynamic management views.
- Many of the metrics were cumulative since the instance started up (index usage DMVs, wait stats) or since “the thing” happened (like the query plan went into cache.)
- Instance restarts rarely happened, so we had a lot of metric history available at any given time.
- Your SQL Server’s performance capacity was fixed: the core count and memory wasn’t changing.
- Your SQL Server’s behavior was fixed for long periods of time: no one was patching the engine without you being aware, and before the patch was applied, you could test it to understand what the patch affected.
I’m not saying the cloud is bad – I freakin’ love it – but it’s different, and that’s why we have to keep adapting as performance tuners.
The first major shift was the original version of Azure SQL DB. It was effectively SQL Server, but only for one database at a time, and hosted & managed by someone else (Microsoft.) It was mostly SQL Server, but had a few minor key differences – one of which was that Microsoft could restart and/or patch your SQL Server instance at any time, without warning.
Microsoft put a lot of work into hot patching Azure SQL DB, and for them, the results are fantastic: they can keep your instance up to date and on stable hardware without having to communicate outages to you.
But Microsoft’s also been following your example and simply not documenting their changes. There’s no changelog for Azure SQL DB, no notification that your server was just patched, and no warning that behaviors have changed. In theory, there are no bugs – but as we can clearly see from the repeated problems with Cumulative Updates, there are one hell of a lot of bugs left in the database engine.
Azure SQL DB Elastic Pools made it easier to tune with your wallet: group a bunch of databases together, buy hardware capacity for the entire pool, and then if any one database needed a burst of power temporarily, that was fine. That was completely revolutionary. (By that I mean, it’s exactly how SQL Servers have worked for the last two decades. Whatever.)
Azure SQL DB Serverless automatically scales capacity up and down based on the running workload. That means even if you’re logging every performance metric, and you understand what they all mean, that’s still not enough: you need to be able to interpret them at different points in time based on the amount of compute capacity that was available at that time.
For example, say the running workload 30 minutes ago demanded a lot of capacity, so Azure automatically spun up more horsepower, but now there isn’t much of a workload, and so a query that ran slowly 1 hour ago (at low capacity) is running super fast now (at high capacity.) We can’t just look at query runtimes as an indicator of query performance because they vary too much, and they don’t indicate query plan quality – just wallet depth.
Azure SQL DB Hyperscale reinvented the way SQL Server’s storage works. If you’re still thinking of monitoring in terms of disk queue length and buffer cache hit ratio, you can throw that knowledge out the window when you move to Hyperscale.
I got into databases because I hated learning languages. I have so much respect for developers who can jump from one language to the next, constantly learning new syntax. I burned out quickly when I tried that, and I wanted something where the language stayed the same for decades. That’s databases.
But I do love learning, and it’s a hell of an awesome time to be a database performance tuner. There’s so much to learn all the time with every new technology that comes out.
The hardest part is figuring out where to place your learning bets. There are hundreds of talking heads in the database community that say, “You should totally learn Technology X or else you’re gonna be a dinosaur.” Well, given our limited hours of free time per week, we can’t learn everything – not to the depth that will really pay off in our careers – so what do we learn? Which technologies are gonna take off, and which technologies are gonna be dinosaurs themselves? It’s even more complex when you try to learn emerging technologies because the product you learn today can evolve dramatically within the next 1-2 years, rendering a lot of your learnings obsolete.
I don’t have answers, but I’m excited to see how the next few years shake out.