Blog

You’d love to know how to make your SQL Servers more highly available, but you’re just not sure what “clustering” is all about. We can help! In this 30 minute video Kendra Little shows you how failover clusters make your SQL Servers easier to manage. She uses diagrams and drawings to teach you the strengths and weaknesses of failover clusters and shows why clustering is becoming even more critical for high availability and disaster recovery.

This webcast was sponsored by Idera Software– and they gave away a free trip to the SQL PASS 2013 Summit to one lucky attendee. How cool is that???

Got Questions? Scroll on down and check out Q&A from the live webcast below the video, or read more about clustering SQL Server.

Webcast SQL Server Clustering Q&A

For failover cluster instances, does Windows Clustering needs to be enabled before you configure SQL Server Clustering? Or is it enabled by default? You need to enable the Failover Clustering feature on all nodes and create and validate the Windows Failover Cluster. You then install SQL Server onto the cluster (you tell the SQL Server installation GUI you’re installing it on the cluster and it walks you through a wizard).

So there is only one active node at any given time with one or more possible passive nodes. Right? Yes, exactly– any given clustered SQL Server Instance can only be active on one node (physical server) at any given time. (You can have more than one instance on each cluster, so you could potentially have multiple instances active on a single node, or multiple instances active across multiple nodes.)

Can we have two nodes on the same physical hardware? A “node” IS a physical server, so no.

Must applications be coded as cluster aware? No, apps have no idea that they’re connecting to a database that lives on a cluster. They just need to be able to handle the brief downtime during a failover so that if their connection is interrupted, they retry/keep on going.

Is there any test or lab microsoft provides to try to test clustering? No, you’ll want to build your own lab for that.

What about a SQL Server cluster with one physical node and one node that’s a virtual machine– Is this recommended? We don’t recommend it. Read why it gives us the heebie jeebies here.

Would a virtualized failover cluster avoid the downtime during the failover? No.

How do you configure and Manage MSDTC? Several people asked this one. Cindy Gross answers everything you need to know about configuring DTC on a failover cluster.

How many IP addresses/names do you need? We had a couple of versions of this question– and I don’t mention every name and IP address involved in clustering in the video. Here’s a list:

  • Windows Failover Cluster Name and IP: The windows cluster itself gets a name, but you won’t use that to access the SQL Server. (It’s just for cluster management purposes.)
  • SQL Server Network Name and IP: Each SQL Server instance gets its own name and IP that can move around the cluster. This is a good thing. This *is* what you use to talk to the SQL Server.
  • DTC Name(s) and IP(s): The number you need depends on how your cluster and DTC are configured. See the link on the MSDTC question above.
  • Physical Server Names and IPs: Just like any other server, each physical node gets a name and IP so you can manage it.

Do we need to configure a heartbeat network? Great question. Check out Brent’s article, “Everything You Know About Clustering is Wrong“. (Great title, eh?)

What manages which node is up or down? The Windows Failover Clustering feature does this. It’s a set of services built into Windows that work for all kinds of clustering, like Exchange or file/print servers.

Are there advantages to a cluster with shared storage vs a cluster running availability groups without shared storage? Yes – say your database is 1TB. With AlwaysOn AGs, you need at least 2TB of storage. There are also scenarios with very high amounts of data modifications or high numbers of individual databases that (could) work (relatively) easily on a failover cluster, but push the limits of Availability Groups.

Doesn’t Microsoft discourage using the terms “active/passive” and “active/active”? They sure do! But the phrases “single instance failover cluster” and “multi instance failover cluster” just don’t roll off the tongue, so they’re rarely used in the real world. When we say “multi instance failover cluster”, inevitably people ask, “oh, you mean active-active?”  So you can either use the “approved” terms, or the terms most people understand— whichever you prefer.

In a two node cluster with one SQL Server Instance (“Active/Passive”), what exactly is the status of the passive node? Is it something like the SQL Service needs to start when the active node fails?  Yes! You got it. And if you look at the SQL Server services on the “idle” node, you’ll see they’re in a stopped state and set to manual start. That’s by design– it’s the failover cluster that will start the services when it needs to happen.

Would it be possible for each node to host its own SQL instance which fails over to the other one respectively? – Our goal would be to host one 2008 R2 and one 2012 instance of SQL server. Technically speaking, this is called a “side by side deployment”, but I don’t know anyone who recommends it. This question reminds me of the Meatloaf song, “I’d do anything for love, but I won’t do that.” Let’s break down the problems:

  • Installation: You’ve got to install everything in a very specific order (lowest to highest) on every node. Get anything out of order and you’ve gotta start over.
  • Maintenance: More updates, more patches, more failovers
  • Troubleshooting performance or availability problems: Oh, that gets ugly. You’d be running in a scenario that very few people use. Let’s imagine that you start getting unexplained failovers. It’s now much more hairy to troubleshoot.

So I urge you to consider other options. If you do go this route, work extra hard to make sure you have a very similar setup (ideally identical) in a non-production environment to mitigate risk and heartache.

To avoid confusion: For service packs and cumulative updates, you can use a rolling upgrade process on the cluster. That’s still totally cool.

For instant file initialization, you need to set that up on each node or is it “cluster” aware? And do you recommend it for clusters? You have to set it up on each node, and yes, we’d still recommend it.

Other than checking from SQL Server logs, Windows EventViewer logs, can we find the last failover duration? Nope, not outside of those places.

Is clustering available on the express edition? Technically, no, but you can cluster it manually yourself. If you’re considering this one, step back and take a good look at the problem you’re trying to solve– we’re a little worried.

Can I set up fake instance in my DNS that points to myServer\ThisInstance to start porting over to what will be in place after the cluster is in place?  Setting up before hardware order is placed we get to tweak existing sprocs?  Or is this a BAD IDEA?  Yes, you can create a pointer in DNS to do this. I call it a “DNS friendly name”, some people call it a “DNS pointer”, some people just say “CNAME”. This can be very useful.

More Microsoft SQL Server Clustering Resources

Whether you want help choosing between an active/passive and an active/active cluster, or if you’re the kind of DBA who knows that’s not even the right name for failover clustered instances anymore, check out our SQL Server clustering training page.

↑ Back to top
  1. I never went through a better explanation of SQL Server clusters every, in books there was this boring description, here it became so vivid with examples and illustrations.

  2. Excellent explanation, easy to understand, thank you for this.

    I still have a question about this: when the failover mechanism detects that it needs to failover the sql services to the other node, does it have to reboot the node that fails, during the failover process?

    regards,

  3. Hello,

    Let say that I have SQL Failover cluster with two nodes and the third node which is not in SQL Failover cluster but it is a member of WFC. I know that it is possible to have AlwaysON feature on this configuration, but it is also possible to have automatic failover.
    I am curious how to minimize downtime when failover happening. Can be done this with AlwaysON or the downtime will always be?

    Best regards,

    • Hi there!

      I’m afraid I don’t completely understand what your question is. If you are asking if there is always a downtime during failover, then the answer is yes– when SQL Server fails over from one node to another, there is always a downtime. (The duration varies).

      If you have always on with a Windows Failover Cluster plus an Availability Group Secondary, automatic failover is only within the Windows Failover Cluster Instances. (FCI).

      Hope this helps– and if I misunderstood your question, please clarify.

      Kendra

  4. Hello,

    Is there any compelling reason to have the exact same hardware on the failover node? For instance, do you ever hear of a failover node being slightly less robust than the main server? Or do they really ideally need to be as identical as possible?

    Thanks!

    Josh

    • Usually when you take in all the factors, the hardware is the least expensive part of the whole thing. You’re talking about shared storage and Microsoft licensing, both of which are more expensive than saving a few bucks on the node sizes.

      Plus, if you do patching, you usually want to patch the standby node to prep it ahead of time, then fail over during an outage window, and patch the other node. If both nodes are equally sized, you only have to take one outage – you can take your time patching the other node, and you don’t have to worry about failing back.

  5. First, thank you all for your responses (on this and other posts). You are helping immensely.

    If the cluster is holding MDF/LDF on the SAN and backups need to be elsewhere, what is recommended? Local wouldn’t make sense as the failover node would not have access to the failed JBOD, etc, so it would need to be a shared resource. What are some good options that still allow good throughput for all of those backups and validations?

    Thanks!

  6. Cool Stuff, Well Explained.

  7. Hello , I am in process of creating a cluster environment .
    can I create 2 two node cluster with help of 3 physical machine . some thing like A ,B & C .
    B Machine will be part of both A & C cluster .
    the cluster is like A–B (Active & Passive ) B — C (Active and Passive) .

  8. Thank you !! One more scenario ,if I have two node cluster (A–B) with Active Passive . Can I add a third node (C) with out disturbing the existing service (Sql Instalation ). That mean making this 2 node cluster to 3 node cluster .

    That mean can I have a liberation to add or remove nodes to the existing cluster with out disturbing existing Sql installation .

    some thing like I can create/remove nodes dynamically whenever I needed .

    • Deb,

      You can add nodes and even remove (evict) nodes easily (easily being a relative term of course) all without affecting the uptime of your cluster. Just one more reason I love working with clusters.

  9. Brent, I work on application that is measured in milseconds for DB peformance. Can you share your insight on if Always on is slower in terms of insert/update/deletes to the DB versus Windows failover clustering. Since were are talking a shared SAN versus some form or replication stream my mind tends to think the shared SAN has less overhead and will be faster. we will setup a test to prove this out this theory but i wanted to get your thoughts on the matter.

    • Hi Laurence,

      I’m not Brent, but I did write this post, so I’m guessing it’s fair game to answer.

      With a Windows failover cluster instance there’s one copy of the data. If you’re using an AlwaysOn Availability Group and streaming data across to secondaries, that’s obviously more overhead. Even if you’re using an asynchronous mechanism to stream the data to the AG secondary, it’s more work and of course there could be a performance impact depending on your implementation.

      Kendra

  10. Hi Brent,
    Great read, thank you. Question for you.
    Our current setup uses a very high number of small databases versus one big database.
    We are looking into setting up a cluster, but we are getting mixes opinions on whether to convert all our individual databases (about 5000) to one big individual database, or keeping our current setup.
    Some say that it would be easier to manage one large database, and ensure high availability, while some say that managing a huge number of small databases would not be an issue.
    What is your take on this?
    Thanks
    A.

  11. Hi Kendra,

    Great explanation of HA and SQL Clustering. Now for my question. I am confused on how to best setup SQL HA in a Virtual environment. Recently I built out a 2 node test SQL cluster on a VMware Host. But this would not afford me true HA for SQL Server the same way you have explained here. If there is a failure of the VM host, the SQL cluster is useless. HA would have to be done leveraging vSphere and v Motion. Then there would have to be a cold migration of SQL to the new host. And the SQL services would not be monitored in the same fashion they are in a SQL cluster. So I am looking for advice and direction on what the best way is to getting the same results you show here but in a virtual SQL environment.

    Thanks for your help.

    Your Old Friend, Miles (I remember you when you were just a baby DBA)

    • Hi Miles! Hope you’re doing well.

      Yeah, having more than one VM in a cluster on the same host kind of defeats the point. Most hypervisors have ways for you to set up rules that make sure that the VMs will always be on different points to limit your exposure to single points of failure like that.

      However, things get tricky with virtualized clusters. Network communication is incredibly important to keeping the failover cluster online — failures or even planned migrations of vms from host to host can cause interruptions in network communication and you could potentially lose quorum and have an outage.

      Vmware and hyper-v both have some good documentation for best practices if this is something that you absolutely need to do. If you have the option I would definitely start with a non-virtualized cluster and since it’s less complex. Brent writes more about that here: http://www.brentozar.com/archive/2012/09/why-your-sql-server-cluster-shouldnt-be-virtualized/

      Great to hear from you!
      Kendra

  12. Hi Kendra,

    Excellent tutorial video for a high level clustering.

    We are currently on virtual environment where we have 10-15 number of virtual servers running in our organization and we are looking to move to clustering.

    From your video I understand it’s not much of a wise choice going for clustering on top of a visualized environment due to potential network connectivity etc.

    Could you advise the best option in case management doesn’t work to spend much money?

    Best regards,
    Mohan

  13. Pingback: What is a SQL Server failover clustering - SQL Shack - articles about database auditing, server performance, data recovery, and more

  14. Hi,
    Thank you for the video and Q&A. Totally great and helpful.

    I get a little confused regarding this question’s answer:”Would it be possible for each node to host its own SQL instance which fails over to the other one respectively? – Our goal would be to host one 2008 R2 and one 2012 instance of SQL server.”… (Not because of the song reference; that song I know).

    Confusion arises because I’ve read that you would rather use a multi instance failover cluster (i.e to benefit from upgrade testing). So the difference here is that with your proposed multi instance failover cluster, only the critical/most important database fails to the other node while the “lesser-important databases” will not fail over the other one?

    Thanks in advance!

    • Hi there,

      If I have a choice and performance and availability are important, I’d always rather have just one instance of SQL Server on its own cluster. That’s because there’s no risk of multiple instances having to run on a single node and impacting each other’s performance. To mitigate the risk for testing, you want to have a non-production cluster that you can burn in patches / firmware updates / configuration changes on.

      Does this cover your question? I’m not 100% sure I understood everything you’re asking, so just let me know if I answered the wrong question.

      Kendra

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

css.php