1.5 Triaging Failure in Availability Groups (32m)

You’ve built an AlwaysOn Availability Group, and it hasn’t failed yet – so what’s the big deal? Watch as Brent deals with a broken Availability Group, stepping through his thought process during a failure, and get a glimpse into just how complex clustering can become – even with a simple 2-node AG.

11 Comments

drljsl
August 17, 2018 8:08 am

At 21:19 and 25:36 the video shows dbo.ChChChChanges, but not much is said about it.
Does dbo.ChChChChanges show that DDL and DML can be run on the primary and show up on the secondary, or is it some type of canary command running all the time?
If the command is not running all the time, is it still referred to as a canary?
Thank You!
Darrell

- Brent Ozar
  August 17, 2018 10:42 am
  
  You can ignore that. It was from another demo.
  
jitesh.khilosia87
November 14, 2018 6:33 am

Hello Brent

If we have a 2-node cluster AG, is it required to install the cluster service on both nodes where the SQL Server service is installed, or can we install it on a separate machine?

- Brent Ozar
  November 14, 2018 6:34 am
  
  Yes, the cluster service is a dependency for the nodes.
  
Kapil Bhasin
February 20, 2019 8:41 am

It’s been great learning so far in all the sessions that I’ve been going through. Just curious, is there a complete recorded or online module available only for Always On? I saw an earlier session from Edwin as guest instructor, but it seems to have been taken out and is no longer available. Also, I’ve gone through your blogs for setting up AGs, but is there a plan to cover all the basics of implementing Always On and learning the in-depth concepts? Thanks

- Brent Ozar
  February 20, 2019 8:43 am
  
  No, for Always On Availability Groups training, your best bet is to contact Edwin: https://learnsqlserverhadr.com/contact/
  
jlochbaum
February 27, 2019 7:24 am

Starting to see why you don’t default to recommending AG for every environment. It’s kind of scary that they are so dependent and that so many things can go wrong, even with the dynamic voting/witness improvements. We don’t mess with these at all in our environment but I love learning about what all is out there.

Great class and clear teaching; also the step-by-step troubleshooting with slides was awesome.

- Brent Ozar
  February 27, 2019 7:28 am
  
  Thanks, glad you’re enjoying it!
  
emmanuelemore
April 17, 2021 3:49 am

I know this is an old thread, and I am going through the training again to see if I can find answers to two questions that have been bothering me regarding Always On replicas.

Does the frequency of transaction log backups affect the synchronization of data in Always On replicas? I have a three-node Always On replica setup. The first (primary) and second (secondary) nodes are sync commit, while the third (secondary) is async commit, with a transaction log backup taking place every 30 minutes.

With the same setup, I occasionally run into situations where the replication between the primary and the third replica lags behind for a considerable amount of time (judging by last_commit_time on the third node). All I do is monitor the redo_queue_size and the redo_rate until the last_commit_time on the third replica is close enough to the primary. Is there anything that can be done after the pages are hardened on the third replica to speed up the redo process? Is there something causing the lag?

- Brent Ozar
  April 17, 2021 6:59 am
  
  For troubleshooting specific problems in your own environment that aren’t really related to the training, your best bets are either to post it on a Q&A site like https://dba.stackexchange.com or https://SQLServerCentral.com, or if you need my help specifically, click Consulting at the top of the site. Thanks!
  
  - emmanuelemore
    April 17, 2021 11:29 am
    
    Thanks for the information. I reached out to Edwin at https://learnsqlserverhadr.com/contact/ from the link you posted above to see if I can find a training session for my questions. I will check out https://dba.stackexchange.com or https://SQLServerCentral.com as well.
    
