Common SQL Server Clustering, AlwaysOn, and High Availability Answers

Our live webcast topic this week was Q&A – you could bring any HA/DR questions, and we could avoid your answers. Just kidding. Here’s the webcast – we apologize for the audio, WebEx is still getting their act together. It cleans up after the first couple of minutes.

Watch this video on YouTube

And here’s the questions we couldn’t get to during the webcast:

SQL Server Licensing Questions

Q: If using virtual machines and clustering / failing over at that level (not sql server) is there any reason that SQL Server Standard Edition won’t work? Someone once told us in a sql class that Enterprise Edition was necessary for this.

Answer from Brent: don’t you just love those “someone once told us” things? You’ll want to get them to tell you why. Standard Edition works fine in virtual machines. It may not be cost-effective once you start stacking multiple virtual machines on the same host, though, because you have to pay for Standard Edition for every guest.

Q: Hi, with mirroring being deprecated and AlwaysOn AG only available with Enterprise Edition – what are our HA options going to be with Standard Edition in the future? Any ideas if AlwaysOn synchronous will make it into Standard?

Answer from Jeremiah: You have a few HA choices with SQL Server 2012 Standard Edition and beyond. Even though mirroring is deprecated, you could feasibly use mirroring in the hope that something new will come out. Obviously, this isn’t a viable option. The other HA option is to use clustering. SQL Server Standard Edition supports 2 node clusters, so you can always use it for HA.

How to Manage AlwaysOn Availability Groups

Q: Did you experience or know “split brain scenario” in AlwaysOn Availability Groups that when secondary node is up to take over primary role, the transaction becomes inconsistent? And how to avoid it?

Answer from Brent: Ooo, there’s several questions in here. First, there’s the concept of split brained clusters – when two different database servers both believe they’re the master. Windows Server Failover Clustering (WSFC) has a lot of plumbing built in to avoid that scenario. When you design a cluster, you set up quorum voting so that the nodes work together to elect a leader. In theory, you can’t run into a split brain scenario automatically – but, you can most definitely run into it manually if you go behind the scenes and change cluster settings. The simple answer here: education. Learn about how the quorum process works, learn the right quorum settings for the number of servers you have, and prepare for disaster ahead of time. Know how you’ll need to react when a server (or an entire data center) goes down. Plan and script those tasks, and then you can better avoid split brain scenarios.

Q: Can you recommend any custom policies for monitoring AlwaysOn? Or do the system policies provide thorough coverage? Thank you!

Answer from Brent: I was a pretty hard-core early adopter of AlwaysOn Availability Groups because I had some clients who needed it right away. In that situation, you have to go to production with the monitoring you have, not the monitoring you want. The built-in stuff just wasn’t anywhere near enough, so most of my early adopters ended up rolling their own. StackOverflow’s about to share some really fun stuff there, so I’d keep an eye on Blog.ServerFault.com. You should also evaluate SQL Sentry 7.5’s new AlwaysOn monitoring – it’s the only production monitoring I’m aware of, although I know all the other developers are coming out with updates to their tools for monitoring too.

Q: Is it wise to have primary availability groups in one server of the nodes and have primary groups on another of the servers that form the cluster. Or is it better to have all primary groups on server 1 and secondary on server 2?

Answer from Brent: If you split the primaries onto two different nodes, then you can do some load balancing.

Q: Would you consider Always-ON AG Read-Only replicas as a method to offload or load balance reporting? Looks like the Read Intent option acts like a load balancer for reading off of those DBs, right?

Answer from Brent: Offload yes, load balance no. The read intent options give you the ability to push read-only queries to a different replica, but there’s no load balancing. Your clients just hit the first server in the list. If you need true load balancing, you’ll want to put all of the read-only replicas behind a real load balancing appliance.

Windows Clustering Setup and Management

Q: Where can I find a good list of cluster hotfixes SQL 2008 R2 and perhaps the OS as well?

Jes here. You can go to the Update Center for Microsoft SQL Server to find the latest CU and hotfixes. Check the Support pages for Windows Server 2008R2. Updates aren’t released as cluster-specific. This is why it’s really important to have a test or QA environment that is also set up as a cluster, so you know if the cluster services are affected at all.

Q: What is the recommended order/procedure when you have to do Windows updates to servers in a cluster?

Answer from Kendra: Microsoft knew you were gonna ask this! Check out their “SQL Server failover cluster rolling patch and service pack process” KB here. But do yourself a favor and always deploy patches to a non-production test cluster first and let them burn in a bit.

Q: From your previous answers, it sounded like you don’t recommend use Windows 2008 R2 for AlwaysOn. Can you elaborate bit more on why Windows 2012 is better suited for this? I need more persuasive power to talk the rest of folks of my company to use it.

Answer from Brent: Sure, check out the AlwaysOn Availability Groups Lessons Learned video at the bottom of that page.

Q: Would you have a single DTC group or multiple groups configured for a 4 instance cluster?

Answer from Kendra: There’s no shortcut here: you have to decide on an instance by instance basis. For each instance you gotta determine how much it uses distributed transactions, and how impacted it might be if DTC were to temporarily be offline. Review Cindy Gross’ information on DTC to find out pros and cons of different approaches to configuring DTC.

SQL Server Clustering with VMware and Hyper-V

Q: Is VMWare HA a good alternative to use instead of a Microsoft Cluster?

Answer from Jeremiah: The HA choice comes down to where you want your HA to be managed. VMware HA pushes the high availability question out of the SQL Server realm and into the VMware infrastructure. More than anything else, this is a business decision – just be sure you’re happy with the decision of which team is managing your uptime.

Q: When using a virtualized active/passive 2008R2 cluster with underlying iSCSI storage can the nodes by on different hosts or is FoE needed to have nodes on different hosts?

Answer from Brent: Check out VMware’s knowledge base article on Microsoft cluster support. It lays out your options for iSCSI, FC, FCoE, and more, and separates them by shared-disk clustering versus non-shared-disk (AlwaysOn Availability Groups).

Q: Any thoughts on implementing AlwaysOn in conjunction with a virtual SQL environment using VMWare HA/ Site Recovery Manager (SRM)?

Answer from Kendra: With this level of complexity, when things get tricky it’s incredi-hard to sort out. You gotta have a rockstar team with great processes and communication skills to handle problems as they arise– and you are going to hit problems.

Even if you have the rockstar team, you want to first ask if there’s a simpler way to meet your requirements with a less risky cocktail of technologies. If you rush into what you describe, you’ll find that your high availability solution becomes your primary cause of downtime.

Shared Storage for Clusters

Q: Was reading a great article from Brent on SQLIO. How does this work on a SQL Cluster?

Answer from Kendra: You run SQLIO against the storage (not the SQL Server instance) so it works the exact same way.

Q: After Setting up The Cluster and adding the various CLUSTER DATA Drives how can I add additional Drives after gaining new internal storage?

Answer from Kendra: Before you touch production, make sure you’ve got a lab environment. If you don’t, check out the link above on how to build one. The exact steps to do this are going to vary depending on your version of Windows, your version of SQL Server, and exactly what storage you’re talking about.

For new shared storage on Sever 2008 or later, the basic process is presenting the storage to all of the nodes, bringing the drive online on one node, creating a volume, adding the disk in the failover cluster, and then adjusting dependencies in the cluster as needed. (Dependencies can be adjusted online in SQL Server 2008 and later).

If you have new non-shared storage that you want to use under tempdb (such as SSDs), you’ve got to make sure that every node in the cluster has the drives for tempdb online / volumed/ formatted/ and configured identically, and then you can move tempdb files over to it. You will need to restart SQL Sever to make modified tempdb files recognize the new paths.

Sharding and Mirroring Questions

Q: I have a peer to peer replication with 3 nodes (all bidirectional). Very beneficial but a big pain to maintain. Is that what the industry feels?

Answer from Jeremiah: SQL Server peer-to-peer replication solves a very specific need – the ability to have multiple active SQL Servers where writes can occur and where you can have near real-time updates to the other servers. While peer-to-peer replication meets that need, it has a relatively heavy price tag in terms of DBA expertise, support, and licensing costs. Even experienced teams want to have multiple DBAs on staff to deal with on call rotations and, let’s face it, while peer-to-peer replication hasn’t been deprecated, it’s a difficult feature to work with.

Q: I’ve implemented db sharding on Oracle in several environments. Is there an applicable tech in SQL Server?

Answer from Jeremiah: Sharding is just a buzzword for horizontal partitioning. In a sharded database, either the application or a load balancing router/reverse proxy is aware of the sharding scheme and sends reads and writes to the appropriate server. This can be accomplished with SQL Server, Oracle, MySQL, or even Access. There are no technologies from Microsoft and I’d be wary of anyone attempting to sell something that Just Works® – database sharding is time consuming, requires deep domain knowledge, and adds additional database overhead.

Q: Currently using SQL 2008 Mirroring. Planning a move to 2012. Your thoughts about jumping 2012 and going straight to 2014 Always On technologies?

Jes here. There were no major changes to Database Mirroring in SQL Server 2012, and I don’t foresee any coming in 2014. Eventually – we don’t have a specific version yet – Mirroring will be deprecated.

More Free AlwaysOn and Clustering Training Resources

And if your business needs help deciding which one is right for you, check out our SQL Critical Care® methodology. We quickly get to the root cause of your availability and performance pains, and we train you how to fix it. Learn more about our SQL Critical Care® now.