Database Performance Tuning is Getting Much Harder.

Cloud Computing
25 Comments

For the last twenty years, SQL Server performance tuning has relied on a few facts:

  1. SQL Server provided a lot of performance metrics with thorough documentation from both Microsoft and the community.
  2. You could fetch those metrics in a variety of ways, like Perfmon counters and dynamic management views.
  3. Many of the metrics were cumulative since the instance started up (index usage DMVs, wait stats) or since “the thing” happened (like the query plan went into cache.)
  4. Instance restarts rarely happened, so we had a lot of metric history available at any given time.
  5. Your SQL Server’s performance capacity was fixed: the core count and memory wasn’t changing.
  6. Your SQL Server’s behavior was fixed for long periods of time: no one was patching the engine without you being aware, and before the patch was applied, you could test it to understand what the patch affected.

Eyes up here, kidIn the cloud,
those aren’t facts anymore.

I’m not saying the cloud is bad – I freakin’ love it – but it’s different, and that’s why we have to keep adapting as performance tuners.

The first major shift was the original version of Azure SQL DB. It was effectively SQL Server, but only for one database at a time, and hosted & managed by someone else (Microsoft.) It was mostly SQL Server, but had a few minor key differences – one of which was that Microsoft could restart and/or patch your SQL Server instance at any time, without warning.

Microsoft put a lot of work into hot patching Azure SQL DB, and for them, the results are fantastic: they can keep your instance up to date and on stable hardware without having to communicate outages to you.

But Microsoft’s also been following your example and simply not documenting their changes. There’s no changelog for Azure SQL DB, no notification that your server was just patched, and no warning that behaviors have changed. In theory, there are no bugs – but as we can clearly see from the repeated problems with Cumulative Updates, there are one hell of a lot of bugs left in the database engine.

Each new cloud service
changes facts you relied on.

Azure SQL DB Elastic Pools made it easier to tune with your wallet: group a bunch of databases together, buy hardware capacity for the entire pool, and then if any one database needed a burst of power temporarily, that was fine. That was completely revolutionary. (By that I mean, it’s exactly how SQL Servers have worked for the last two decades. Whatever.)

Azure SQL DB Serverless automatically scales capacity up and down based on the running workload. That means even if you’re logging every performance metric, and you understand what they all mean, that’s still not enough: you need to be able to interpret them at different points in time based on the amount of compute capacity that was available at that time.

For example, say the running workload 30 minutes ago demanded a lot of capacity, so Azure automatically spun up more horsepower, but now there isn’t much of a workload, and so a query that ran slowly 1 hour ago (at low capacity) is running super fast now (at high capacity.) We can’t just look at query runtimes as an indicator of query performance because they vary too much, and they don’t indicate query plan quality – just wallet depth.

Azure SQL DB Hyperscale reinvented the way SQL Server’s storage works. If you’re still thinking of monitoring in terms of disk queue length and buffer cache hit ratio, you can throw that knowledge out the window when you move to Hyperscale.

This is a great time to be
a performance tuner.

I got into databases because I hated learning languages. I have so much respect for developers who can jump from one language to the next, constantly learning new syntax. I burned out quickly when I tried that, and I wanted something where the language stayed the same for decades. That’s databases.

But I do love learning, and it’s a hell of an awesome time to be a database performance tuner. There’s so much to learn all the time with every new technology that comes out.

The hardest part is figuring out where to place your learning bets. There are hundreds of talking heads in the database community that say, “You should totally learn Technology X or else you’re gonna be a dinosaur.” Well, given our limited hours of free time per week, we can’t learn everything – not to the depth that will really pay off in our careers – so what do we learn? Which technologies are gonna take off, and which technologies are gonna be dinosaurs themselves? It’s even more complex when you try to learn emerging technologies because the product you learn today can evolve dramatically within the next 1-2 years, rendering a lot of your learnings obsolete.

I don’t have answers, but I’m excited to see how the next few years shake out.

Previous Post
Why Are Linked Server Queries So Bad?
Next Post
Tweets You Missed

25 Comments. Leave new

  • Douglas Z Coats
    July 20, 2021 1:34 pm

    So what technologies do you think are going to be hot later and which ones do you think will be dinosaurs? I know you said you dont have answers but I am generally curious to see what you think. Even if its just db related.

    Reply
    • “I know you said you didn’t have answers, but what are your answers?”

      I’m chuckling over here. I have thought about blogging that, but I know I’m going to piss people off and get into a bunch of arguments. People bet on technologies, and then they deeply, personally identify with the bets they made. Saying they made the wrong bet can be like insulting them personally, and they don’t take it well. Because of that, I gotta be a little careful how I write posts like that. I’ll hold off on that for now and eat my popcorn from over here.

      Reply
      • douglas coats
        July 20, 2021 2:37 pm

        I guess I shouldve expected that response lol Though I would find it amusing to see your satire/troll blog post on the topic.

        Reply
  • Robert McClean
    July 20, 2021 3:10 pm

    This post speaks to me. I currently suffer from Synapse nightmares.

    Reply
    • Steve Powell
      July 23, 2021 9:02 pm

      You and me both. Though Synapse is a pussycat when compared to the sabre toothed tiger that is ADF

      Reply
  • Lawrence Dugger
    July 20, 2021 4:17 pm

    It’s all good until it hiccups and we are left in the dark. Thanks for leading the charge.

    Reply
  • James Fogel
    July 20, 2021 4:44 pm

    I have no advise on what technologies someone should become an expert in, but I think the DBA is going to start wearing a Solutions Architect hat more often. I’ll consider technologies and product offerings as one in the same here. Knowing which offering such as SQL DB, managed instance or VM with SQL Server installed, etc. are the right choice for the job. What are the benefits and what are the drawbacks. Some are obvious and some are not.

    Still, someone can know the DB offerings from all the major clouds inside and out as if they were a sales rep, but if the person making the recommendation doesn’t have a grip of how the database is used then bad recommendations will happen. That lift and shift from on-prem to SQL DB sounded great until you learned that there is a huge dependence on things like CMD Shell, etc. You still have to know what you are doing…..

    Reply
    • Fully agree…
      Working with Azure (or probably any cloud) is like being given a 5 gallon bucket of legos to build something with. None of the legos match colors. They are from different editions, some “space”, a smattering of “city”, random “bionicle” components, barely enough “technics” and even an NXT 1st gen. You may or may not have the USB cable for the NXT, hopefully the Bluetooth works. Oh, and about a 3rd are duplo, good luck adapting those. And, half are megablocks in exactly the color you needed and if you start your build with these, forget about integrating anything else. Don’t worry though, it’s extensible, scalable, and super flexible. Just read the documentation, it’s been photocopied from some of the editions in your bucket and there may or may not be a page or five missing. Just know that if a few of the pieces don’t fit or work together, just keep trying that piece is still being worked on and improved. Also, don’t forget, if you get your thing to eventually work, keep an eye on it because that super handy piece you hinged the entire design on might not work next month because of revisions. Don’t worry, the photo copies of the updated documentation will be available eventually. No worries, you can do this!

      Reply
  • ken ambrose
    July 20, 2021 5:20 pm

    In my very first course at Cal in Information Systems, in the very first class (systems analysis), the instructor introduced himself and then announced: “if you are not prepared for lifeliong learning, this is not the career for you”.

    Being mentally forewarned – and therefore mentally prepared -for the lifetime learning curve helped me greatly to accept it, and sometimes even embrace it. Thank you Cal!

    Reply
  • I share your feel Brent. I was programmer many years, and even when I enjoyed lot at first, then I realized I always had to learn new languajes. Fortunatelly, dabatases are not so changing. Regards.

    Reply
  • “We don’t actually fix potholes in the road here in Florida – but we do move them around so you can’t memorize where they are”

    That’s it. That’s my comment.

    Reply
  • In 2008 I started my adventure with relational databases (Oracle) and in postgraduate course my professor told that we do not have to worry about relational databases for 30 years (I am not sure if it was 30 but it could be 20 or 40). There are so many databases and data warehouses. And I am not sure if every company wants to have it in the cloud. I think for the biggest companies it would be rather private than public cloud. It is not always but I can be wrong.
    A couple of days ago Brent had a post about SQL Server versions. After this post I am sure we are safe for may be 10+ years.

    Reply
    • Part of me wishes that our jobs weren’t QUITE so safe, and were just a little bit easier, hahaha.

      Reply
      • I am a developer and very often design indexes and tune queries so it is a little bit different for me.
        From DBA perspective it is different. But still I am not so convinced that migration to cloud will be as fast as people think.

        Reply
  • Serverless is a trap, and the early picture in the documentation is misleading:
    You get billed based on the maximum of cpu usage and memory usage.
    from the documentation:
    “Resource billed: CPU and memory. Amount billed: vCore unit price * max (vCores used, memory GB used * 1/3,)”
    “When the database is busy, the cache is allowed to grow unconstrained up to the max memory limit.” “Also, there can be additional delays after user activity stops before memory reclamation occurs ”

    meaning: if you don’t let your server have 0 user queries, for the server to shut down, it will remain billing at Maximum vcores, because of ram cached.
    Unless your database is substantially lower than ram size, and cpu usage is typically less than 30%, hard provisioning vmax cores is cheaper than serverless, and more performant.

    Reply
    • Johannes Vink
      July 21, 2021 3:50 am

      Depends on the usage pattern. If there is a intermittent usage pattern, unpredictable and needs to burst when needed: serverless. Anything else: fixed capacity please. Or scale up/down via ETL or an application.

      That said… I have only 1 db on serverless, which is accessed maybe twice a month and then needs to run hard. Other databases are fixed capacity.

      Reply
    • Agreed. So far in my experiences, I have yet to run across anyone in a company who fully understands what their cloud landscape will cost them. The best they can ever do is guess and the number is a wide range. The usual response is, “Well, let’s try XYZ, start small and go from there.” The cloud companies don’t let themselves get pinned down to a fixed price model. Don’t get me wrong, I understand why (I think) – the distributed and highly scalable nature of cloud resources makes for a highly variable cost structure for the cloud company. However, this also means we have to trust that they are not misbehaving and exploiting how difficult and obscure it is for their customers to understand what they are actually paying for.

      Reply
    • We were doing serverless – only needed working hours – I told the Dev’s they would have to code for wake up time from them being paused.

      Move forward a bit when I get called into a costs meeting to explain why the DB’s were costing so much -and on a look I some nice DEV had come along and disabled Pause on them all!! By the way – I had no rights to look at the costs!!

      Don’t have a go about rights and allowing devs being able to do that – not my area so I am told 🙁

      Any jobs going – remote based?? 🙂

      Reply
  • Johannes Vink
    July 21, 2021 3:52 am

    I recognize this Brent. Zero updates from MS and then you read in a blog about something totally different a line that ‘recent updates to clustered column indexes resulted in better performance’. It makes you wonder what more changed under the hood…

    Reply
  • George Walkey
    July 22, 2021 11:30 am

    This is a good thing actually
    I pine for days I can run my SQL on a Pentium and tune it
    Now we can spin up low-level SKU and tune away

    There is nothing like being able to spend 10X the $ and get 10X worse performance from a cloud SKU than a Dell Server sitting in the data center

    Reply
  • These days I try not to predict database fashion. What my employer likes – I like.

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.

Menu