Availability Groups: More Planned Downtime for Less Unplanned Downtime

I often hear companies say, “We can never ever go down, so we’d like to implement Always On Availability Groups.”

Let’s say on January 1, 2016, you rolled out a new Availability Group on SQL Server 2014. It’s the most current version available at the time, and you deploy Service Pack 1, Cumulative Update 4 (released 2015/12/22). You’re fully current, and it’s a stable engine from 2014 – how many more bugs can they find, right?

Here’s what your patching schedule would look like:

2016/02/22 – Cumulative Update 5 – corrupted columnstore indexes when AG fails over, stack dumps on AG secondaries.

2016/04/19 – Cumulative Update 6 – non-yielding schedulers during AG version cleanup, FileTables unavailable after AG failover, canceling a backup causes the server to crash (not related, but cringeworthy) – whew! This one has a lot of big fixes. We should definitely apply this.

2016/05/31 – OH SNAP! CU6 broke NOLOCK. Sure hope you didn’t apply that. Time to take another outage to apply the revised version.

2016/06/21 – Cumulative Update 7 – SQLDiag fails in AGs. You could probably skip this one if you don’t use SQLDiag, and most shops don’t.

2016/07/11 – Service Pack 2 – improved lease timeout to prevent outages, filestream directory not visible after a replica is restarted (wait I thought we fixed that in CU6? no wait that was FileTables), missing error numbers in XE.

2016/08/26 – Cumulative Update 1 – memory leak on AGs with change tracking, error 1478 when you add a database back into an AlwaysOn availability group (sic).

2016/10/18 – Cumulative Update 2 – no AG fixes, woohoo!

What do you mean there’s only one engine?

That’s 5-7 patch outages in 11 months (and I’m not even listing all of the fixes in these, which include things like incorrect results bugs, plus awesome new DMV diagnostic features that you definitely want.)
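If you want to sanity-check that math, here's a rough Python sketch that tallies the releases listed above and the gaps between them. The dates come straight from the schedule; the go-live date is the January 1 rollout from the example. The labels and the `patch_gaps` helper are just my own shorthand for illustration, not anything official.

```python
from datetime import date

# Go-live date from the example scenario above.
go_live = date(2016, 1, 1)

# Release dates copied from the schedule above; labels are informal shorthand.
releases = [
    (date(2016, 2, 22), "CU5"),
    (date(2016, 4, 19), "CU6"),
    (date(2016, 5, 31), "CU6 re-release"),
    (date(2016, 6, 21), "CU7"),
    (date(2016, 7, 11), "SP2"),
    (date(2016, 8, 26), "CU1"),
    (date(2016, 10, 18), "CU2"),
]


def patch_gaps(start, items):
    """Return the days between consecutive releases, starting from go-live."""
    gaps = []
    previous = start
    for released, _label in items:
        gaps.append((released - previous).days)
        previous = released
    return gaps


gaps = patch_gaps(go_live, releases)
total_days = (releases[-1][0] - go_live).days

print(f"{len(releases)} releases in {total_days} days")
print(f"average gap: {total_days / len(releases):.0f} days")
print(f"shortest gap: {min(gaps)} days")
```

This counts every release as a potential outage, which is the worst case. Drop the ones you'd skip (like CU7 if you don't use SQLDiag) and you land somewhere in the 5-7 range.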

Here’s the way I like to explain it to companies: if you have an airplane, it’s absolutely imperative that its engines not fail mid-flight. In order to accomplish that, you have to have regular downtime for mechanics to examine and replace parts – and that doesn’t happen up in the air. With Availability Groups, we’re lucky enough to be able to transfer our passengers (er, databases) from one airplane to another quickly – but we still have to have those other airplanes getting constant examinations and patches from mechanics.


10 Comments

  • Hi Brent,

    Great post as always. I can barely handle being on an airplane when everything goes as planned. I can’t imagine hearing “uh… hi guys… this is your captain speaking… we uh… we’re having some problems up here and we… uh… well … do you see that other plane over there flying dangerously close to us? Well… we uh… we’re gonna have to get you all over there like.. like right away… I uh… Women and children first, I guess. Smoke if you got ’em.”

    You sure have been posting a lot of pics of you in that Oracle jacket lately. How has Microsoft not revoked your MVP card?

  • I seem to be spending more and more time coming up with analogies to try and explain basic process concepts to people recently, but your airplane one there’s a thing of pure beauty. Have a drink.

  • Liked the article. Good analogy.

  • Brent,
    Is it advisable to hold off on setting up AGs, since there seem to be so many issues?
    Cluster environments can help us achieve always on, right?

    By the way, always love to read your post!

    • Amanda – I would just generally advise folks to find the simplest solution that meets their RPO and RTO goals. Always On Availability Groups is a fantastic feature – you just have to be armed with the right people and processes to tackle it.

  • …and the more engines you have, the more likely you’ll have some kind of engine failure at some point.

  • Hello Brent,

    Reminds me about my flight simulator: runs on Windows 7, and is *never ever* patched/updated. I don’t want to introduce *any* side effect through an update.


  • Lance Gabreil Zamora Villacrusis
    June 24, 2021 2:37 pm

    I like your analogy. Thank you so much for simplifying it.

