This week, Brent, Richie, and Erik discuss best account practices for SQL Server, change tracking, disabled indexes, measuring queries to gauge their speed, overcoming issues between Management Studio and integration services, deadlocks and long-running processes, query tuning, and lock escalation.
Enjoy the Podcast?
Office Hours Webcast – 2018-7-11
Brent Ozar: Alright, Donna asks, “Should SQL Server run under the local system account, an active directory account, or NT authority network service? What’s the best practice or easiest to support?”
Erik Darling: I always went with the service account. I liked being able to control permissions to places outside of the SQL Server. I liked to very specifically have my service accounts inherit certain permissions; so either lock pages in memory or perform volume maintenance tasks. But really, aside from that, you know, if you’re not doing anything crazy and you find yourself spending more time creating different service accounts and trying to figure out what to do with them then just use the local ones. But if you have a need for, you know, going places outside of the SQL Server or assigning specific permissions, I really like the service account better.
Brent Ozar: Mark says, “Afternoon, y’all. I’m investigating blocking caused by the change tracking cleanup process running on some heavily updated tables during production hours. My thinking is to turn off auto-cleanup and manually clean by looping through the tables using the documented stored procedure sp_flush…”
Erik Darling: I really wish you had read this before you started reading it out loud.
Brent Ozar: Even our transcriptionist is like, what the heck did you just say? Note to transcriptionist, I apologize for that. You don’t have to get it right. It doesn’t matter. We wish we wouldn’t have read the question. We’re all full of regret at this point.
Erik Darling: Our transcriptionist just became an anti-Putin blogger to get a more safe line of work. What just happened?
Brent Ozar: Mark, I don’t think any of us have every used change tracking. I don’t even think – Richie, have you ever used it? You’ve never used it either, right? No. his video either froze or he’s shocked with horror. He might be shocked with horror. Now, at this point, we should make Miami jokes [crosstalk 0:02:05.5]… Our poor transcriptionist… Richie, I love Mexico. What could I say that would shock him and make him move? Oh, he did disappear. It was his internet…
Erik Darling: Here in crime-free Brooklyn the sirens go on forever.
Brent Ozar: Oh my gosh, mark says, “In testing it takes upwards of six hours to complete.” I would actually open a support case. I would open a support case with Microsoft because I think it’s probably going to be the cheapest way to get that answer. Or I would post it on dba.stackexchange.com, just because that way, I think there’s other people out there using change tracking. I don’t have an anti-change tracking field; it’s totally fine, I just never run into people using it, so you got me there.
Erik Darling: I think, just from having read some change tracking questions on Stack Exchange, I’m at least sort of sure that you can index tables differently to help these processes along. So I would ask there, open u a thing with Microsoft if you have support with them and start from there. We just don’t use it enough to be terribly helpful with that.
Brent Ozar: That totally make sense. It looks like the slide deck froze too on Drew Furgiuele too. Let me move that a little, there it goes…
Erik Darling: Everything is frozen today. Richie had a blue screen.
Brent Ozar: I wish I had a way to easily show you guys the screenshot of Richie’s blue screen. Richie also has more computers probably than any of us, more laptops, more of this gear but none of them work for – if you ever wonder why it takes a long time for us to get software out, it’s not because Richie’s not a good developer. He’s a great developer. His machines just keep blue screening.
Erik Darling: It’s his damn Windows developers ruining everything for him. They can’t get those audio drivers synced up.
Brent Ozar: Joe asks, “What happens when an index is disabled?” Joe’s probably our favorite name too. We know a lot of good Joes. “What happens when an index is disabled? Is there any overhead or is it taking up space on disk?”
Erik Darling: Nah man, you disable that index – you do have to be careful because there is some locking involved when you do the disable, but it will de-allocate all the pages were associated with it and you just keep the definition of the index in object explorer or in SQL Server stiff it will show up as a disabled index. It makes life pretty easy if you disable something that you wanted to keep. You can just kind of right-click and hit rebuild and have it come back to life. You do have to be careful though if you’re using maintenance plans or do any sort of index-y stuff on your servers. They will re-enable disabled indexes for you without asking, so fun stuff there. But aside from that, no, not really an overhead.
Brent Ozar: I could not believe it when I first ran into that. somebody told us if you run maintenance plans, it enables disabled indexes. And I thought there is no way maintenance plans can suck that bad. I don’t like maintenance plans, but wow, they do suck that bad.
Erik Darling: The stuff you run into like that that just makes you lose your damn mind, like now I have to know this. I have to remember this, I just forgot someone important’s birthday because now I have to remember this thing about maintenance plans. Like that part of my brain just went…
Brent Ozar: Erika asked me yesterday, she’s like, how old are your parents? I’m like, I have no idea. No idea. What birth year are they? No clue. Do you know their birthdays? Mom’s in February, dad’s in June, just no clue.
Erik Darling: No, but I can tell you all about the 8K page structure, like everything else…
Brent Ozar: Pablo asks, “My prod and QA servers have different storage and my dev team doesn’t care. They say, when they develop stuff, don’t worry it will be faster on production anyway. How do I measure…” I’m going to change what he asks to make it make more sense, “How should my development team measure their queries to know whether or not they’re going to be fast in production?”
Erik Darling: I would disregard disk almost entirely and I would focus on making sure that I have enough RAM so that whatever the disk subsystem is, I’m not spending a lot of time touching it when my queries run.
Brent Ozar: Next up, Steve asks, “Do you have any tips…” Oh, do we ever…
Erik Darling: Frosted sometimes…
Brent Ozar: “For overcoming the version mismatch issues between management studio and integration services?” Oh, that’s so terrible.
Erik Darling: No, I still don’t know how to open SSIS, so no.
Brent Ozar: What I had to do as a database administrator, and this is going back years ago, but I had to maintain different VMs with different versions of management studio so that I could go open up different integration services packages.
Erik Darling: Oh yeah, because maintenance plans from one to the other would just be like, we can’t figure this out.
Brent Ozar: Yeah, it sucked so bad. And these days, with the fast and furious release of SSMS, I’m kind of even more interested in that because they’ve brought out stuff that breaks functionality and they’re like, we’ll ship a fix soon. And I’m like, yeah but I need to create a table today.
Erik Darling: 17.8.1 is now available…
Brent Ozar: Which tells you something in the agile – and it’s not that I want it to go back to only updating every service pack. I like where they’re going, but there’s a mix in between those two.
Erik Darling: You know, and I understand the pain of frequent release cycles. I’ve had less than perfect First Responder Kit releases. There’s been some weird stuff that’s gone on in there. So I don’t pretend to be totally guilt-free and innocent with the whole thing, but I’m one person testing against a limited amount of stuff. I would hope that with Microsoft’s nearly unlimited financial resources, they’d have some case-sensitive instances to back stuff up against. I don’t know, little things.
Brent Ozar: And you would also hope that the people who are managing Azure in-house would [dog food 0:08:14.0] first. And Microsoft keeps talking, “Everything we give you, we’ve already been using for months up in the cloud.” And then they ship an SSMS that doesn’t work with Azure SQL DB. I’m like, what, what are they using up there?
Brent Ozar: Joe asks, “We seem to experience deadlocks during long-running CPU intensive processes, like stuff that takes 15 minutes or more. Is there a relation between deadlocks and long-running processes?”
Erik Darling: I mean, when you have a long-running process, you kind of tend to increase the likelihood of hitting or locking or deadlocking scenario, especially if those long-running processes are crammed inside a BEGIN TRAN or whatever. You know, usually when we talk to people who have these problems, like the three most common solutions are either to make sure you have the right indexes in place to make these things fast, maybe batch your modifications so that you’re not spending a long time getting a ton of locks, trying to escalate locks, all that other stuff – batch sizes of 1000 or less for your modifications. If you can, look at an optimistic isolation level like RCSI or snapshot isolation if you just want to, like, have certain queries knock against each other for readers and writers.
There’s a lot of options to kind of make your life a little bit easier when you have long-running processes. But you know, obviously, the first goal is to have good indexes, to make sure that we have processes that run as quickly as possible before we go jumping into other things that are a little bit more rocket science-y.
Brent Ozar: Yeah, a link we always point people to, Michael J Swart’s Take Care When Scripting Batches, talking about how you do work in smaller chunks. I also want to throw a plug out for Erik’s excellent sp_BlitzLock. This thing is awesome. So I’m going to fire open SQL Server management studio. I don’t have a deadlock already in here, so we’ll see if – I don’t think I’m going to be able to – I’ve got a script real quick where we’ll go and create one, just because we’ve got time on here and I love you people. Deadlocks are fun.
Erik Darling: Yeah, why not? Let’s spend time with each other.
Brent Ozar: So I’m going to say – yes, even us on camera time together. I love you, man. So sp_BlitzLock will go through and look at the system health session in order to see if any deadlocks have happened recently, what queries were involved. So right now, I just ran it and it doesn’t show any results. So let’s fix that. So what I’m going to do is I’m going to go get out a query that I’m going to be running later this week on a training class, actually.
Erik Darling: Now, if you have an extended events session that is capturing deadlock XML, you can also point BlitzLock at that. you don’t have to just use the system health one. If you have specific monitoring for it, you can go and look at that too.
Brent Ozar: So over on the left-hand side, and I’ll switch into the Stack Overflow database just because I don’t like doing things in master – so over on the left-hand side, I’m going to start by creating a couple of tables. I’m going to create lefty and righty. They both have just a few rows in them. They both have a primary clustered key, so lefty and righty both exist right now. Over in the left-hand side window, I’m going to start a transaction and I’m going to lock lefty. So lefty is now locked in the left-hand side because I didn’t commit yet. All I did was begin the transaction…
Erik Darling: But it feels like someone else did.
Brent Ozar: But it feels like someone else… I don’t know how many of our audience is going to get that, but I was pretty proud that I got it.
Erik Darling: You’ll know by the amount of people that drop off the call.
Brent Ozar: These people have no – what are they talking about? Adolf Garlic – what? So over on the left-hand side, lefty is locked. Over on the right-hand side, righty is locked and everything works fine here so far. They can both hold locks on two separate tables. Now, over on the left-hand side, I’m going to go through and try and update righty. And it won’t work because right now, over on the right-hand side, righty’s already locked.
So on the left-hand side, this guy is blocked. There’s no timing for block outs, blocking by default on SQL Server. By default, blocking can go on forever, but I’m going to do something on the right-hand side and things are about to happen real fast. So I’m going to explain what happens before it happens. On the right-hand side, I’m going to try to update lefty. Now, at this point, the left-hand window is waiting for stuff on the right, the right-hand window is waiting for stuff on the left. It’s exactly like the Reservoir Dogs scene in the garage where they’re all pointing guns at each other and the mafia and all that other good stuff from Stack Overflow.
So when I hit execute on this right-hand side, within five seconds, SQL Server is going to all of a sudden wake-up and decide to kill one of these transactions. It decides to classically kill the one that’s the easiest to rollback. In this case, they’re both roughly the same. The one on the right was chosen to die because he hadn’t done any work yet, whereas the one on the left had updated both tables. So we had a deadlock. The transaction was deadlocked, try to rerun your query – screw that. What we’re going to do is we’re going to go look at…
Erik Darling: You might want to keep that up. Sometimes the first lock doesn’t end up…
Brent Ozar: Ye of little faith. You’re probably right. Look at that, oh god, look at you. So let’s try it again, right side, let’s go update, BEGIN TRAN…
Erik Darling: I kept running into that when I was writing BlitzLock. I’d be like, okay, clear things out, start fresh, deadlock. Where’s the deadlock? Damn, where is it?
Brent Ozar: Please work this time… No, still. Alright, so BEGIN TRAN, BEGIN TRAN, update, update – oh, he’s still executing. Hold on, rollback. This is going to be all kinds of fun. Rollback… People are like, I just wanted to get my question answered and I had to sit through and watch them herpa-derp through one deadlock after another.
Update, update… sp_BlitzLock, there we go, look at that.
Erik Darling: That felt good.
Brent Ozar: Do you wish to commit these transactions? Hold on, cancel rollback. It’s Richie Rump, ladies and gentlemen. So forget everything we’ve just said in the last five minutes.
Erik Darling: Yeah, we lied.
Brent Ozar: So now, what sp_BlitzLock shows you is here are the queries, here’s deadlock number one, here’s deadlock number two, shows you who the winner was, who the victim was. You can see from DBCC INPUTBUFFER type stuff. Here’s the exact queries that are involved. Here are the object names that are involved in the block. Here’s the query that was involved, or here’s the application that’s involved. One of my favorite things is that you can see, because people are like, “I swear I’m only running everything in NOLOCK with read committed snapshot isolation or read uncommitted.” You can see right here what’s going on. Then, he’s got a really nice summary grid down here were you can get more information about the queries that were involved, run sp_BlitzIndex to see which indexes are missing. All kinds of fun stuff.
Erik Darling: If you scroll a little bit to the right, one of our new contributors added the entire XML deadlock graph; Joshua Darnell added that in a recent thing. So if you need to go send that off to someone else, you can go just copy and paste it form right over there.
Brent Ozar: Or, if you want to show it in SQL Sentry Plan Explorer, visualization of deadlocks, copy and paste it into there. The world is your oyster; your terrible smelling oyster.
Brent Ozar: So let’s see – next question, Jim says, “Hi, guys. We’ve been using 2017 read scale availability groups without a windows cluster. We thought we’d be able to do backups on our secondaries but our secondaries are never detected as a preferred backup replica.” Yeah, no you don’t want to do that. “Is that a limitation of clusterless read scale availability groups?” I don’t know if it’s a limitation of it documented or not, I’m just a huge fan of only running backups on a primary. In our senior DBA class, we talk about why. So, unfortunately, no dice.
Brent Ozar: Paul asks, “I’m creating log shipping across two separate domains. What would be the best way to set up the service account that will run the log shipping process?”
Erik Darling: I’ve never done a cross-domain – I’m not even sure where to begin looking for issues with that.
Brent Ozar: I just wouldn’t. I mean, you can do the log shipping, but they don’t need to be identical service accounts. Like earlier, Erik goes, “Just make sure that they’re service accounts so that you can designate those service account permissions on the file share where the backups are going to be.” But that’s all you should have to deal with. They shouldn’t be directly talking to each other at all. We should put in a plug for sp_AllNightLog too, but then I’m not going to demo one more thing inside this session. That’s about where I draw…
Erik Darling: And now we have this server over here… No, we don’t.
Brent Ozar: Marci asks, “I loved y’all’s post, Why Multiple Plans For One Query Are Bad. Have you seen any success with dynamic shops using either optimized for ad hoc or forced parameterization? We have a query with over 1000 plans. It’s not too terrible, but which option should we try?”
Erik Darling: Whichever one you can support, I guess. So like, there are some times where I’m working with a client, I’m like, look you have these three options to fix parameter sniffing. Which one are you mist likely to go with and which one puts your server in the least amount of danger? And any single one of those can be a reasonable solution in a certain set of circumstances. So like, if you have a query that doesn’t run very often and it’s just always going to be weird and different and you don’t have anyone sitting there checking out the plan cache and monitoring stuff, throw a recompile hint on it.
I’m not going to argue with you, I’m not going to yell at you. It’s probably not going to do anything to awful to your server. It really just kind of depends on your situation and the amount of time and effort you have to fix things. If you really want to spend a lot of time and dig deep and fix queries and indexes and try rewrites and temp tables and CTEs and whatever else, go ahead. But generally, it falls upon the person who has to support it long term to decide which thing works best for them.
Brent Ozar: Josh says, “If you stop and restart the system extended events session, you’ll get results in BlitzLock.” That’s not a bad idea too, but I’m of the band my head against the wall repeatedly until the stuff comes out. I’m all about that.
Brent Ozar: Oh my god, Paul… Paul asks, “The developers at my job…” They need to spend more time hanging out with Richie. “The developers at my job want to disable table lock escalation to alleviate locks during large data processing sessions. Is this a good practice? If not, why?”
Erik Darling: No. So lock escalation is a cheeky monkey and it’s going to – SQL Server does it for a good reason. Like, SQL Server only has so much memory that it’s going to give out for locks and when a query is – when SQL Server is like, oh this query is going to need five, 10, 15,000 page or row locks, I’m going to try to get this table level lock to just lock the one object. You don’t want that much memory being overcommitted to locks. You don’t want SQL Server waiting to get memory to give out to locks. You don’t want any of that stuff happening. That’s generally a pretty bad idea.
If you are having that much trouble with locking in this situation, this is what partitioning is for. It’s not the magic performance feature that a lot of people think it is; it is exactly the stuff like this. It’s a data management feature so you can easily work on a single partition, swap partitions in and out without killing concurrency too badly.
Brent Ozar: I also worry about – you know, we just demoed deadlocks – what happens if you have say five, 10, 100,000 locks pending against two different sessions and then they want to start stepping on each other and you run into a deadlock? Holy cow, the rollbacks are going to suck. The rollbacks will just be terrible; absolutely awful.
Erik Darling: And you know, even with partitioning, Microsoft, when that came around, they brought out a new lock escalation level which was auto, which would escalate locks from a row or page to a partition. Even in there, there’s a big bold note in the release, like this may cause deadlocks in certain scenarios if you’re doing it, like, you know, if these partitioning queries start stepping on each other’s partition. So it’s not – there’s no such thing as a free lock, no matter how you take it.
Brent Ozar: And then finally, Sreejith asks, “Upgrade question – what year will Richie’s t-shirt’s stars get upgraded from four stars to five stars?”
Richie Rump: Not this year; we’ll just put it that way. So for those who don’t know, that’s for each World Cup that a team has won. Every team that has won the World Cup puts a star on there, except Uruguay. Uruguay has won two World Cups, they have four stars, so…
Erik Darling: How do they do that?
Richie Rump: Well because they count their Olympic wins in there, which nobody else does.
Erik Darling: Oh, that’s cheating.
Richie Rump: It’s cheating.
Brent Ozar: Thanks, everybody, for hanging out with us this year. Now we’re all going to go off and watch – I almost said Americas Cup, but World Cup. There ain’t no America this year. Americas Cup is sailing and there ain’t none of that. So thanks, everybody, and see ya’ll at the next week’s Office Hours. Adios.
Erik Darling: Bye.