How to Patch SQL Server

Microsoft releases SQL Server Cumulative Updates about every 60 days. This is a part of your job that you’re going to be doing a lot, so let’s get good at it! Here’s the strategy I like to use.

Pick what patch you’re going to apply. Generally speaking, you should be on the most recent Cumulative Update available for your version. (Years ago, folks only applied Service Packs, but starting with SQL Server 2017, Service Packs are gone. Microsoft only ships Cumulative Updates now.)
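As a quick illustration of "where am I versus where I should be," you can compare the build number your server reports (via `SELECT SERVERPROPERTY('ProductVersion')`) against the build of the CU you're targeting. This is just a sketch, not an official tool, and the build numbers below are examples, not an authoritative list:

```python
# Sketch: decide whether a server needs the target Cumulative Update
# by comparing SQL Server build numbers (e.g. "15.0.4153.1").
# The build numbers used below are illustrative examples only.

def parse_build(version: str) -> tuple:
    """Turn '15.0.4153.1' into (15, 0, 4153, 1) for easy comparison."""
    return tuple(int(part) for part in version.split("."))

def needs_patch(current: str, target_cu: str) -> bool:
    """True if the server's build is older than the target CU's build."""
    return parse_build(current) < parse_build(target_cu)

# Example: a SQL Server 2019 instance on an older build vs. a newer CU build
print(needs_patch("15.0.4123.1", "15.0.4153.1"))  # True: patch needed
print(needs_patch("15.0.4153.1", "15.0.4153.1"))  # False: already there
```

Comparing the dotted build as a tuple of integers (rather than as a string) avoids the classic trap where "15.0.999.1" sorts after "15.0.4153.1" alphabetically.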

Decide how you’re going to detect problems. Every now and then, an update breaks something. For example, SQL Server 2019 CU7 broke snapshots, SQL Server 2019 CU2 broke Agent, and there are many more, but my personal favorite was when SQL Server 2014 SP1 CU6 broke NOLOCK. Sure, sometimes the update installer will just outright fail – but sometimes the installer succeeds, and your SQL Server installation is broken anyway, and it may take hours or days to detect the problem. You need to monitor for new and unusual failures or performance problems.
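One low-tech way to spot "new and unusual" failures is to compare the kinds of errors you saw in a baseline window before the patch against what shows up afterward. Here's a minimal sketch of the idea; the log entries are invented for illustration, and a real version would read your actual error log or monitoring data:

```python
# Sketch: flag error types that appear after patching but never appeared
# in the baseline window before it. The entries here are invented examples.

def error_signatures(log_lines):
    """Collect the distinct error codes seen in a set of log lines."""
    return {line.split(":", 1)[0] for line in log_lines if ":" in line}

def new_failures(baseline_lines, post_patch_lines):
    """Error codes seen after the patch that the baseline never produced."""
    return error_signatures(post_patch_lines) - error_signatures(baseline_lines)

baseline = [
    "Error 1205: deadlock victim",   # pre-existing, expected noise
    "Error 823: I/O error on read",
]
after_patch = [
    "Error 1205: deadlock victim",
    "Error 601: could not continue scan with NOLOCK",  # new since the patch
]
print(new_failures(baseline, after_patch))  # {'Error 601'}
```

The point is the diff, not the parsing: known noise (deadlocks, the occasional timeout) shouldn't page you on patch night, but an error class you've never seen before should.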

Design your ideal rollout strategy. Here’s the order I like to use, but I understand that not everyone has all of these environments. More on that in a second. Roll out the patch in this order:

  1. Development servers – you want your developers seeing any failures or behavior changes first so they can feel confident that the eventual production patch will produce behavior they’re used to.
  2. QA/test servers
  3. Disaster recovery servers – often, these are the least-critical servers in the list that are actually being monitored with monitoring software, and leverage SQL Server’s high availability and disaster recovery features like clustering, and also have to keep up with the level of writes happening in production. New problems will show up here, and hopefully monitoring will detect them before you apply patches to subsequent levels.
  4. Read-only replicas – servers where some end user activity happens, but it’s less critical than your primary servers. This advice applies whether we’re talking replication, log shipping, or Always On Availability Groups.
  5. Failover servers – now we’re getting really close. The idea here is to patch these without taking a production outage – but that’s not always possible depending on the HA/DR features you’re using, and the way you’re using them.
  6. Production primary servers – and the way you patch these is to actually not patch them at all. On “patch day”, simply fail over to your failover server, which has already been patched. This way, if you experience any surprise issues with the patch within the first few days, you can fail back over to your unpatched production server. (This also means you need to hold off patching that server for a few days to give yourself a safety net.)
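
The rollout order above can be encoded as data, which makes the "scratch out the tiers you don't have" step from the next section trivial. A sketch, with made-up server names:

```python
# Sketch: the rollout order from the list above, encoded as data so you
# can drop the tiers you don't have. Server names are made up examples.

ROLLOUT_ORDER = [
    "development",
    "qa_test",
    "disaster_recovery",
    "read_only_replicas",
    "failover",
    "production_primary",  # "patched" last - by failing over, not installing
]

def patch_plan(environment):
    """Return servers in patch order, skipping tiers this shop doesn't have."""
    plan = []
    for tier in ROLLOUT_ORDER:
        plan.extend(environment.get(tier, []))
    return plan

shop = {
    "development": ["DEV01"],
    "failover": ["SQLB"],
    "production_primary": ["SQLA"],  # no QA, DR, or replica tiers here
}
print(patch_plan(shop))  # ['DEV01', 'SQLB', 'SQLA']
```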

Design your actual rollout strategy. Having read that above rosy-world scenario, now you’re looking at your own environment going, “Brent, I don’t have a bunch of those servers.” Okay, no problem: scratch out the lines you don’t have, but understand that you’re also scratching out possible safety nets. This is something to think about when you’re designing your next SQL Server architecture.

Design your rollback strategy. In the event that you do detect problems – and it’ll happen sooner or later – you want to have a rough idea of what you’re going to do. In dev/QA/test, you might just choose to uninstall the update and wait it out, giving other SQL Server customers time to troubleshoot the problem with Microsoft on their mission-critical servers, then apply the next fixed update instead. If the update made it all the way to your DR or failover tier without you catching the problem, you might not have the luxury of cleanly uninstalling the update, and your rollback strategy may be to open a support case with Microsoft to troubleshoot the problem – hopefully before applying the failed patch to your production primary servers.

What we’ve done so far seems like a lot of designing, but remember, you only have to do this once, and you can reuse it for every update you apply to this environment.

When applying the actual patch, here’s what I like to do, in order:

  1. Verify that you have backups. Ideally, do a test restore, too: backup success messages don’t mean you have working backup files.
  2. Stop or shut down client apps. You don’t want folks starting a transaction as your update begins.
  3. Make sure there’s no activity happening on the server, especially long-running jobs like backups.
  4. Apply the update – if you’re using PowerShell, check out how to automate patching with dbatools.
  5. Apply Windows updates since you’re down anyway. (Sometimes I find folks have been applying SQL updates, but not Windows updates – they’re both important.)
  6. Confirm the SQL Server service is started, and check your monitoring tools for any unexpected failures.
  7. Confirm the SQL Server Agent service is started again, and kick off your next log backup job.
  8. Start client apps back up and make sure they function.
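
The pre-flight portion of that checklist works as a gated sequence: if any check fails, the patch doesn't run. A minimal sketch of that shape, where the checks are stubs and real ones would query your server, your backup tool, and your monitoring system:

```python
# Sketch: patch day as a gated checklist - each pre-check must pass
# before the update runs. The check functions here are stubs; real ones
# would query the server, the backup tool, and the monitoring system.

def run_patch_day(checks, apply_update):
    """Run pre-checks in order; only apply the update if all of them pass."""
    for name, check in checks:
        if not check():
            return f"aborted: {name} failed"
    apply_update()
    return "patched"

checks = [
    ("backups verified", lambda: True),      # stub: test restore succeeded
    ("client apps stopped", lambda: True),   # stub: app services are down
    ("no long-running jobs", lambda: False), # stub: a backup is still running
]
result = run_patch_day(checks, apply_update=lambda: None)
print(result)  # aborted: no long-running jobs failed
```

The design choice worth copying is that the checks run in order and the first failure aborts everything: you never get partway into an update with a backup job still writing to disk.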

Over the coming days, keep a much closer eye than normal on monitoring tools looking for unexpected failures. Then, it’s time to hop back on the hamster wheel again, and start planning your next round of updates.


27 Comments

  • Thomas Franz
    June 8, 2021 4:22 pm

    Very nice to-do-list.

    It would be nice, if you could place a link on https://sqlserverupdates.com/ to this article – it makes it easier to find when you really need it 🙂

    Reply
  • Are there any ramifications of applying updates on a DR server when all the databases are in Restoring mode?

    Reply
    • Yep. The specifics are beyond the scope of this post though – that’s where your design steps come in as I discussed in the post. The good news is you still have job security. 😉 The bad news is that I’m not doing ALL of your job, heh.

      Reply
  • James Fogel
    June 8, 2021 6:00 pm

    I got a little sick when I saw the mention of opening a ticket with MS. I haven’t had to open one for a SQL Server incident since last year, but I’ve opened several in the last few months related to AD issues. We have either figured it out on our own or given up. Weeks go by with nothing to show for it. It seems that only tickets submitted via the Azure Portal are handled by people with a clue.

    Reply
  • As always nice post @BrentO.
    Totally agree with the list.

    Number 2. and number 8. are the toughest ones! Stop/Start manually or automated!

    And-And-And your App-Admins won’t [always] know what services they need to stop 😉 you’ve got to tell them.

    For one server, one instance, that’s all OK – but for a thousand servers, or several th… ha ha

    Reply
  • Dba locoCrazy
    June 8, 2021 6:13 pm

    What “monitoring tools” do you guys suggest? I’m used to creating my own, but for sure you guys know better ones.

    Reply
    • That’s outside of the scope of this blog post. Let’s not turn this into a shopping comparison – I just don’t want to go there here. Thanks for understanding.

      Reply
  • Chad Miller
    June 8, 2021 6:25 pm

    If you work in a large enterprise, your company will have already invested in patch automation software which is fully capable of also patching SQL Server. Two which I’ve used: SCCM and Tanium. The patching team which manages all the Microsoft patches can include MSSQL CUs and service packs as part of monthly patch updates, rolled out on the same schedule as MS monthly patching, staggering dev, QA, test, and prod environments. This requires working across silos, but that’s always a good thing.

    Reply
    • Chad – can you take a quick look again through the post and verify that the patching software can do those things? May want to double check those.

      Reply
      • Chad Miller
        June 10, 2021 2:25 pm

        On the applying the patch, yes it does. Things like shutting down apps, verifying backups and verifying SQL Server after patching are additional processes/procedures as you pointed out.

        Reply
  • Should we stop SQL services as well before the patching? I think that step 2 should include it as well, right?

    Reply
  • Thierry Van Durme
    June 9, 2021 6:27 am

    Thanks for your insights Brent! Adding a couple to our list 🙂
    I like to separate Windows and SQL updates – if something fails it makes it a little easier to narrow down the possible causes.
    Also, when D/R (log shipping for example) is in standby mode, that may not always work if your production server is not patched yet because your databases might need to be upgraded before SQL can bring them online.

    Reply
  • Juanita Drobish
    June 10, 2021 3:22 pm

    Question on patching 2 servers involved in transactional replication between each other. Both servers are on SQL 2014 SP2 CU17. To apply the most current updates, I need to apply 2014 SP3, then CU4, then the GDR. Do both servers have to be patched at the same time, or can one server be patched and then we wait a few days to patch the other server?

    Reply
    • Howdy ma’am! You can usually patch them out of order.

      Reply
    • Just a quickie note: the GDR releases are separate to the CU releases; you cannot apply both.
      Once you have applied a CU, you must remain on the CU releases; same for GDR, once you start on GDR, you stick on GDR. The only time you can swap between them, is when applying a new SP (you could be waiting a long time these days!) or reverting back to the original RTM.
      For many of the Instances I look after, we stick on the GDR release schedule; far fewer updates to apply, most for security patching; sure, we don’t get the latest-and-greatest features and improvements; but for these systems, stability and reliability are more important.

      Reply
  • Hello Brent,

    Nice article. Having worked with SQL Server for more than 17 years, I learned the hard way: never try Microsoft’s first patch release. Wait a while and let someone else try it first, then apply it in your environment in the same order you provided in the article.

    As MS is now releasing a patch every 2 months, I patch my servers once per year – that’s more than enough for my current stable environment.

    Reply
  • Dev Bethanasamy
    June 12, 2021 2:35 pm

    We upgrade the passive node on the FCI and fail over to that node after the upgrade, keeping the other node on the lower CU so we can roll back to that node if something goes wrong. We do this for a week and then upgrade the other node.
    If we fail over, SQL Server nicely rolls back any changes it made.

    Reply
    • Valerie Mortensen
      October 13, 2021 10:51 pm

      How many SQL Servers are you doing manually like this? I’m trying to find out if people are widely using SCCM to automate, especially where SQL clusters are concerned.

      Reply
  • Rick Harderwijk
    June 29, 2021 6:38 am

    Thierry Van Durme already mentioned it… there might be an issue with holding off updating the previous-version active node after moving the database to the formerly-standby node of a cluster. The internal database version might change, and I am not 100% sure you can always re-activate the old node(s) with the updated databases. I do recall having issues with that, although that was almost a lifetime ago, so that issue might no longer be a thing. If anyone could confirm that either way, that would be nice.

    Reply
    • We can never really confirm that permanently – only Microsoft could. I’m gonna be honest here: they’re not putting that level of testing into Cumulative Updates, and it’s not fair to ask them to do it, either. They would have to test upgrading from every possible CU – and that’s just not realistic.

      The fact that they’re not putting this level of testing in is exactly why we have to not upgrade every production server at the same time. You simply wouldn’t be able to handle the aftermath when these bugs strike, and they’ve certainly been striking with a vengeance in the last couple of years.

      Reply
      • Hi Brent,
        I just stepped into this trap while updating my clustered environments of SQL Server 2017 (CU19 to CU24) and 2019 (CU11 to CU13).
        In our tests all worked well, but in prod 50% failed to fail over. Now we are investigating what the problem is.
        This is only to share the info (and my frustration).
        Never expect the patches to magically work 🙂

        Reply
  • Steve Pearson
    June 30, 2021 6:10 pm

    Are security updates cumulative? For example, SQL 2014 SP3 has 2 consecutive security updates listed:
    12.0.6433.1 and 12.0.6372.1.

    Does 12.0.6433.1 contain what was included in 12.0.6372.1? Or do I need to apply both?

    Reply
