SQL Server 2012 brings huge improvements for scaling out and high availability. To put these changes into perspective, let’s take a trip down memory lane first and look at the history of database mirroring.
SQL Server 2005 first introduced mirroring, although it wasn’t fully supported until Service Pack 1. In many ways, mirroring beat the pants off SQL Server’s traditional high availability and disaster recovery methods. Log shipping, clustering, and replication were known for their difficulties in implementation and management. With a few mouse clicks, database administrators could set up a secondary server (aka mirror) to constantly apply the same transactions that were applied to the production server. In synchronous mode, a transaction had to be committed on both servers before it counted as committed, giving a whole new level of confidence that no transactions would be lost if the primary server suddenly died. In asynchronous mode, servers separated by hundreds or thousands of miles could be kept in sync, with the secondary server a matter of seconds or minutes behind – better than no standby server at all.
SQL Server 2008 improved mirroring by compressing the data stream, thereby lowering the bandwidth requirements between the mirroring partners.
In one of the most underrated features of all time, Microsoft even used mirroring to recover from storage corruption. When the primary server detected a corrupt page on disk, it asked the mirror for its copy of the page, and automatically repaired the damage without any DBA intervention whatsoever. Automatic page repair doesn’t get nearly the press it deserves, just silently working away in the background saving the DBA’s bacon.
Database Mirroring’s Drawbacks
While SQL Server was able to read the mirror’s copy of the data to accomplish page repairs, the rest of us weren’t given the ability to do anything helpful with the data. We couldn’t directly access the database. The best we could do was take a snapshot of that database and query the snapshot, but that snapshot was frozen in time – not terribly useful if we wanted to shed load from the production server. I wanted the ability to run read-only queries against the mirror for reporting purposes or for queries that could live with data a few minutes old. Some companies implemented a series of snapshots for end user access, but this was cumbersome to manage.
Unlike log shipping and replication, mirroring only allowed for two SQL Servers to be involved. We could either use mirroring for high availability inside the same datacenter, OR use it for disaster recovery with two servers in different datacenters, but not both. Due to this limitation, a common HA/DR scenario involved using a cluster for the production server (giving local high availability in the event of a server failure) combined with asynchronous mirroring to a remote site. This worked fairly well.
The next problem: mirroring failovers are database-level events. DBAs can fail over one database from the principal to the mirror, but can’t coordinate the failover of multiple databases simultaneously. In applications that required more than one database, this made automatic failover a non-option. We couldn’t risk letting SQL Server fail over just one database on its own while the rest of the application’s databases stayed behind on the old server. Even if we tried to manage this manually, database mirroring sometimes still ran into problems when more than ten databases on the same server were mirrored.
Database mirroring didn’t protect objects outside of the database, such as SQL logins and agent jobs. SQL Server 2008 R2 introduced data-tier applications (DACs), a packaged set of objects that included everything necessary to support a given database application. I abhor DACs for a multitude of reasons, but if you were able to live with their drawbacks, you could more reliably fail over your entire application from datacenter to datacenter.
Enter AlwaysOn: New High Availability & Disaster Recovery
It’s like mirroring, but we get multiple mirrors for many more databases that we can fail over in groups, and we can shed load by querying the mirrors.
That might just be my favorite sentence that I’ve ever typed about a SQL Server feature.
I am the last guy to ever play Microsoft cheerleader – I routinely bash the bejeezus out of things like the DAC Packs, Access, and Windows Phone 7, so believe me when I say I’m genuinely excited about what’s going on here. I’m going to solve a lot of customer problems with mirroring 2.0, and it might be the one killer feature that drives Denali adoption. This is the part where I raise a big, big glass to the SQL Server product team. While I drink, check out the Denali HADR BooksOnline pages and read my thoughts about the specifics.
First off, we get up to four replicas – the artist formerly known as mirrors.
Denali also brings support for mirroring many more databases. We don’t have an exact number yet – we never really got one for 2005 either – but suffice it to say you can mirror more databases with confidence.
DBAs set up availability groups, each of which can have a number of databases. At failover time, we can fail over the entire availability group, thereby ensuring that multi-database applications are failed over correctly.
Denali’s HADRON improvements change my stance on virtualization replication. For the last year, I preferred virtualization replication over database mirroring because it was easier to implement, manage, and fail over. Virtualization still wins if you want to manage all your application failovers on a single pane of glass – it’s easy to manage failovers for SQL Server, Oracle, application servers, file servers, and so on. However, the secondary servers don’t help to shed any load – they’re only activated in the event of a disaster.
AlwaysOn Isn’t Perfect
I need to be honest here and tell you that Denali threw out the baby with the bathwater. There’s going to be a lot of outcry because some of our favorite things about database mirroring, like extremely easy setup, are gone. Take a deep breath and read through this calmly, because I think if you see the big picture, you’ll think we’ve got a much smarter toddler.
AlwaysOn relies on Windows clustering. I know, I know – clustering has a bad reputation because for nearly a decade, it was a cringe-inducing installation followed by validation headaches. Some of my least favorite DBA memories involve misbehaving cluster support calls with finger-pointing between the hardware vendor, SAN vendor, OS vendor, and application vendor. This is different, though, because clusters no longer require shared storage or identical hardware; we can build a cluster with a Dell server in Miami, an HP server in Houston, and a virtual server in New York City, then mirror between them. Now is the right time for AlwaysOn to depend on clustering, because the teething problems are over and clustering is ready for its close-up. (One caveat: clustering requires Windows Server Enterprise Edition, but Microsoft hasn’t officially announced how licensing will work when Denali comes out.)
When you’ve got a clustering/mirroring combo with multiple partners involved, you want to know who’s keeping up and who’s falling behind. You’ll also want to audit the configurations. There’s an improved Availability Group dashboard in SQL Server Management Studio, but I’d argue that GUIs aren’t the answer here. Brace yourself – for once, I would actually recommend PowerShell. I’ve given PowerShell the thumbs-down for years, but now I’m going to learn it. It’ll make HADRON management and auditing easier.
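To make the "who's keeping up and who's falling behind" check concrete, here is a rough sketch of the kind of report worth scripting. It's written in Python rather than PowerShell purely for illustration, and everything in it is an assumption: the ReplicaStatus shape, the server names, the lag numbers, and the 60-second threshold are all made up, and nothing here talks to a real SQL Server.

```python
from dataclasses import dataclass


@dataclass
class ReplicaStatus:
    """One replica's state for one database (hypothetical shape, not a real API)."""
    replica: str
    database: str
    synchronous: bool
    seconds_behind: float


def lagging(statuses, max_seconds_behind=60.0):
    """Return the statuses whose lag exceeds the threshold, worst first."""
    behind = [s for s in statuses if s.seconds_behind > max_seconds_behind]
    return sorted(behind, key=lambda s: s.seconds_behind, reverse=True)


# Made-up sample data for illustration only.
sample = [
    ReplicaStatus("MIAMI-01", "Sales", True, 0.0),
    ReplicaStatus("HOUSTON-01", "Sales", False, 45.0),
    ReplicaStatus("NYC-VM-01", "Sales", False, 310.0),
    ReplicaStatus("NYC-VM-01", "Billing", False, 95.0),
]

for s in lagging(sample):
    print(f"{s.replica}/{s.database}: {s.seconds_behind:.0f}s behind")
```

In a real environment the sample list would be replaced by whatever your scripts pull from the servers, and the threshold would depend on how much data loss each application can tolerate.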
Summing Up Denali AlwaysOn
There are a lot of challenges here, but as a consultant, I love this feature. It’s built into the product, and it gives me new ways to handle scalability, high availability, and disaster recovery. There’s a lot of potential in the box, but the clustering requirements are going to scare off many less-experienced users. Folks like us (and you, dear reader, are in the “us” group) are going to be able to parachute in, implement this without spending much money, and have amazing results.
Over the next few months, I’ll be taking you along with me as I dig more into this feature. I plan to implement it in labs at several of my customers right away, and I’ll keep you posted on what we find. If it’s anywhere near as good as it looks, I’m going to be raising a lot of glasses to Microsoft.
If not, I’ll be pointing Diet Coke bottles at Building 35 until they fix the bugs, because this feature could rock.