Production DBA

SQL Server Always On Availability Groups in AWS

By Jeremiah Peschka · April 24, 2012 · 6 comments

Many companies are investigating a move into cloud hosting. With the recent release of SQL Server 2012, these very same companies are also looking at a migration to SQL Server 2012 to take advantage of readable secondaries. Combining Amazon Web Services and SQL Server 2012 seems like a perfect match. Unfortunately, there are some gotchas, like limited support for clustering, to keep in mind as you consider deploying SQL Server 2012 AlwaysOn Availability Groups in AWS.

What’s Wrong With Clustering?

AlwaysOn Availability Groups don’t require shared storage, but they do require Windows Server Failover Clustering. Let’s be clear: there’s nothing wrong with clustering.

There’s a white paper from Amazon Web Services that suggests clustering won’t work in AWS because clustering requires support for multicast networking. This changed with Windows Server 2008 – multicast networking is no longer part of Windows Server clustering. For those who don’t know, multicast networking is delivering one packet from the source computer to multiple destination computers. Multicast relies on the intermediate hardware making the decision when to make copies of the packets to send to the destinations and, in general, relies on the hardware making that decision at the latest possible moment. Because so much of the AWS infrastructure is virtual, a lot of these decisions become trickier for the intervening networking gear (which might also be completely virtual). Ultimately, this is why you’d want to avoid multicast networking in a purely virtual environment. Since Windows Server 2008 and newer don’t use multicast networking, it’s not an issue.

It’s Not You, It’s… You

If multicast networking isn’t what stops clustering from working in AWS, what does?

Every cluster has a cluster access point – a name and IP address used to administer the cluster. The cluster access point is a virtual IP address, when you set up the cluster access point, Windows creates a virtual IP address that is used to provide a way to access the active node in the cluster. This isn’t strictly necessary for SQL Server clustering, but it does make life easier when managing the underlying Windows Server Failover Cluster.

In addition to each server’s individual IP address, the cluster access point requires its own IP address. This is where things start to get sticky. When you assign static IP addresses in AWS, you don’t assign the IP address at the server level – you assign the static IP address to a virtual ethernet adapter and attach the ethernet adapter to your EC2 instance. Each of these Elastic Network Interfaces (ENI) can only be assigned one IP address, for now.

Creating an Availability Group, Is It Possible?

Since it isn’t possible to create virtual IP addresses it would be easy to conclude that it isn’t possible to create an availability group – during availability group set up SQL Server has to get in touch with the cluster control point. There’s a way around this limitation – along as you create the availability group on the current host server of the cluster, then you can create the availability group.

It is possible to create an availability group, however it isn’t possible to create a TCP listener for the same reason that it isn’t possible to create the cluster control point – it isn’t possible to have more than one IP address per ENI in AWS. This doesn’t mean that AlwaysOn Availability Groups are a non-starter in AWS, but it does mean that you won’t have access to the full AlwaysOn Availability Group functionality until later this year when Amazon introduces multiple IP addresses per ENI.

Faking Out the TCP Listener

Here’s what we know so far:

It is possible to set up a Windows Server Failover Cluster in AWS.
It is possible to set up AlwaysOn Availability Groups on our Failover Cluster in AWS.
It is not possible to configure a Cluster Control Point or TCP Listener.

It is possible, though, to get much of the same benefit of the TCP Listener without using the SQL Server’s built-in functionality. A load balancer like HAProxy can be used to create groups of readable secondaries for different applications. True, your applications won’t be able to take advantage of the other TCP Listener features like specifying read only intent in the connection strings, but it will be possible to create multiple connections in HAProxy that will work the same as connections specifying read only intent. HAProxy has the ability to use sticky sessions, monitor connections, and detect when servers are offline and route traffic away from the dead server. People have been doing this with MySQL for years.

Where Do We Go From Here?

Teams using AWS who need SQL Server 2012 Always On Availability Groups can start making use of these features right now. All of the benefits of SQL Server 2012 Always On Availability Groups won’t be available, but it will be possible to use multiple readable secondaries for scale out reads. Many of the features of Always On Availability Groups can be configured using a load balancer like HAProxy for now and configured using native functionality once multiple IP addresses can be attached to a single ENI. With a little work and some patience, you’ll be able to use this feature right now and be ready for when AWS supports virtual IP addresses.

More SQL Server AlwaysOn Resources

If you want to learn more about how to design, test, and manage this slick feature that offers high availability, disaster recovery, and scale-out reads, visit our AlwaysOn Availability Groups resources page.

Free, 3× a week

Get my new posts by email

Three posts a week, plus a Monday roundup of the best database news from around the web.

6 comments

Andy V

August 20, 2012 at 1:11 pm

Hey Jeremeiah,

Great post, now that Amazon has multiple ENIs does this change anything?

Reply
1. Jeremiah Peschka
  
  August 20, 2012 at 1:55 pm
  
  ENIs change things completely. You can check out AlwaysOn Availability Groups in AWS Revisited to get the full scoop. The short story is that ENIs make this possible.
  
  Reply
  1. Andy V
    
    August 20, 2012 at 4:44 pm
    
    Great, I just read that article, good stuff. Have you succesfully set one up in AWS? I have a VPC with several subnets in AWS and a fully functional AD Server. I’ve added two EC2 instances on the same subnet, added two EINS to each instance. I run through the setup, all EINS are DHCP enabled, so I don’t get many options besides choosing the nodes. The instance name will fail to start as the IP address resource won’t come online as it says it is bound to another computer on network (itself). Do I need to tweak anything on the Network Interfaces or do I ignore this and just move forward with Always On?
    
    Reply
    1. Jeremiah Peschka
      
      August 20, 2012 at 5:33 pm
      
      You don’t want to use DHCP for a the Windows Failover Clustering piece. Assign static IPs to each of the ENIs and use the static IP. The ENI changes also let you have many IPs on a single ENI, so you can add a static IP for each node and have static IPs for the cluster control point and AG listener. For the static IPs that you use for the cluster control point and AG listener, make sure you check the “Allow Reassociation” box so you can dynamically move them at will.
      
      At every step of the way with AlwaysOn Availability Groups you’ll want to make sure that the cluster is passing validation. Check out the checklist at https://www.brentozar.com/go/alwayson.
      
      Reply
Matt

February 9, 2017 at 6:04 am

Hi, I know this is an old article but I hope you don’t mind me asking a quick question…

I’ve set up an Availability Group with a 3 nodes in AWS, which seems to work well. However, each node only has a single network adaptor with 3 IP addresses assigned (1 primary, 1 cluster resource, 1 listener). Is this sufficient or do I need additional networks for redundancy? Obviously in a standard physical cluster it’s best practice to have multiple networks to ensure there is no single point of failure, however I don’t know if this still applies in AWS where presumably they have redundancy built in at their end?

Thanks

Reply
1. Brent Ozar
  
  February 10, 2017 at 1:46 pm
  
  Matt – I answered this over in your Stack question.
  
  Reply