This week, Brent, Tara, and Richie discuss relaying SQL Server performance updates to upper management, differential backups in AlwaysOn Availability environment, replication, Troubleshooting THREADPOOL Waits, an acceptable amount of preemptive waits, Postgres, and the technologies Richie is working with these days.
Enjoy the Podcast?
Office Hours – 5-23-18
What status report should I show management?
Brent Ozar: Marci asks, “I’m looking for general advice on showing my boss a weekly basis of what’s alright with the SQL Server state. I hate SQL Server’s canned reports. I thought about the PowerBI dashboard but it takes much too much technical explaining.” That’s a really good question. So Tara, when you guys wanted to, as database administrators, how did you bubble up to management that, like, everything’s going okay?
Tara Kizer: I mean, honestly I didn’t because I can’t talk the manager speak, so my manager would do that and pass it up…
Richie Rump: Low and slow.
Tara Kizer: I had a more recent client who had really high-up people in the company join the call and they just wanted the executive summary portion and so I would talk and then somebody in that company, a sysadmin type, would have to translate. So no, I’ve never done that actually.
Brent Ozar: Marci laughs. What about you, Richie, when you want to convey to – well even to me, Where do you go to – alright, how do you everything’s okay in the environment?
Tara Kizer: He has monitoring tools…
Richie Rump: Yeah, I mean, I usually go on level of risk; maybe that’s the PM training in me. So if there’s an issue, what’s the risk level – and try to explain, hey, if we have a high priority risk, that means we need to fix something right this second or we’re in danger of losing data and then kind of breaking all the way down. So I usually try to convey it that way. I’ve been kind of lucky with having some CIOs that were previous technical people, so they could kind of understand the technical lingo. And typically, the technical CIO typically would want to go into the deeper levels because they’ve been out of it so long and they’re like, any little thing, and they kind of grasp to it. But, you know, if you’re talking to other executive-level folks, it’s usually best that you just say, okay here’s the risk, here’s the issue, we’re not going to go any deeper than that, but this is the consequence if we don’t fix X, Y or Z.
Brent Ozar: I like the risk thing. Marci says, “My boss does the Charlie Brown wah wah wah wah sound at me whenever I start talking.”
Tara Kizer: One thing that we did provide the management was the system availability. So monthly, was there any unplanned downtime – that type of thing because we did measure that. I guess we could have shown our ticketing system, you know, number of issues per system, that type of thing, but there wasn’t really any metrics that I would have gathered in SQL Server. Availability, obviously, but that was through our monitoring tools.
Brent Ozar: I always sucked at that too and I would just – I ended up going with a list of helpdesk tickets. I’m like, here’s a list of stuff I’m working on now, he’s a list of stuff that I don’t want to put out, [inaudible] because I’m going to start getting danged for having a whole bunch of issues, but those are just the only two things. I always had a tough time with that. Ron says, “Managers need pictures.” I would agree with that now.
Should I do differential backups in an AG?
Brent Ozar: Rob says, “What’s your take on doing differential backups in an Always On Availability Group environment? I realize that they’re not supported.”
Tara Kizer: They’re supported on the primary. I don’t recommend doing backups on the secondaries anyway because they’re out of date. Even a synchronous replica, they can be out of date because SQL Server can silently downgrade them to async if there’s – I forget what it is – some sort of like timeout issue. So I want my backups to be up to date so I have always ran all of my backups on the primary replica. And if you’re having to offload backups to a secondary, I wonder what else is going on in your box, you know. Why are you having to offload them in the first place? So the differentials are supported just fine; all three types of backups are fine on the primary.
Brent Ozar: Yeah, the thing that’s so tough is figuring out when the differential is going to take longer than the full, or like as long as the full, and it makes your restores take longer. Usually, when I’m dealing with an Availability Group, it’s because people want the thing up all the time and as quickly as possible and differentials make restores take longer, not shorter. So it’s kind of tricky. I always kind of aim towards doing backups, fulls daily if I could, but as soon as I get to the point where I can’t do a full backup daily, then I start to get nervous.
What replication technology should we use?
Brent Ozar: Lee says, “We have a wireless connection to SQL Server at a remote site. There’s a desire to set up some kind of replication to our primary site. What would you suggest for the least network intensive replication strategy; log shipping or something else?”
Tara Kizer: Wireless connection? I don’t know about this…
Brent Ozar: A guy with a Pringles can aimed at the…
Tara Kizer: You’re going to have to test what kind of latency you have with transactional replication, log shipping, you know. You could have some significant latency depending upon how busy your system is.
Brent Ozar: I’m a fan of, if you want to get from between two places, like, usually the shortest route is the cloud. Like, how good bandwidth can you get up to the cloud instead of trying to get to your primary site too? Because some systems will have better internet connectivity up than they do to your office, like to a flaky VPN.
Richie Rump: Interns with a USB drive.
Brent Ozar: Good point, yes. Especially redundant, you get several of them, and I don’t mean the USB drives, I mean the interns.
Richie Rump: They’re cheap, sometimes free…
Brent Ozar: Lee says, “It’s up the canyon to our water treatment plant.” I don’t know how real-time it needs to be, but now suddenly I’m really liking the sneakernet idea; that if you only need it for reporting purposes, I bet you have people going up the canyon to the water treatment plant all the time. Get your backups offsite. Start rotating backups offsite, put them in somebody’s USB drives…
Richie Rump: There’s someone on this call that would do it for free.
Brent Ozar: Yeah, maybe the listeners.
Richie Rump: There’s someone here that would do it for free.
Brent Ozar: Not you, not you? Richie doesn’t do anything for free. He doesn’t get out of bed for less than 10,000 query-bucks.
Richie Rump: Well especially hike up a canyon…
Brent Ozar: Oh, I see where you were going with that. there has to be an elevation change and all of a sudden [croostalk]. And the water treatment plant has to smell good. It can’t be just a sewage kind of thing. No one’s going to want to go hiking to a sewage place.
How should I send an error message to Microsoft?
Brent Ozar: Scott says, “We have a new 2017 Enterprise CU6.” And every hour, he’s getting an error. A user request has generated a fatal exception, SQL Server is terminating this session, contact product support services with the dump produced in the log directory. Ooh, “What is the process for passing this on to Microsoft?”
Tara Kizer: Open up a support case with them. I don’t remember what the 1800 number is, but you can go on their website and find that on there. $500 and they’ll figure it out.
Brent Ozar: And assuming it’s a bug in SQL Server, they end up refunding it to you anyway; they refund you the $500, so that helps too. If you have an enterprise agreement, like with premier support, it’s even easier. I would say don’t screw around. Go actually do it. If you’re memory dumping every hour, that can be, what we call in the industry, bad.
How can I simulate THREADPOOL waits?
Brent Ozar: Pablo says, “Ola amigos, what is the root cause of THREADPOOL waits and how can I simulate them? I tried like 500 threads but I just got a little bit of THREADPOOL waits.”
Tara Kizer: You know, it’s funny, my blog post that just went live today, when I was testing it last week, my blog post had nothing to do with THREADPOOL and I kept encountering it. It had nothing to do with blocking, it was just the amount of threads I was pushing through OStress. But if you wanted to, OStress can do it because it can hammer your box. SQL Query Stress just did not help out with that specific blog post.
Brent Ozar: There’s two magic parts there; the queries need to go parallel is the big one. You want to have like a low-cost threshold for parallelism and a wide MAXDOP. So if you’re on like an eight-core box, leave MAXDOP at unlimited – hello THREADPOOL, as soon as you hit a couple of hundred queries that should do it.
Tara Kizer: And then also, just put in – do an update statement on whatever query it is, or whatever table it is, that the selects are all going to be hitting and put it in a transaction and just don’t commit it. So now, you’re going to hit THREADPOOL fairly quickly.
Richie Rump: The wait for Deadpool is over, so go see Deadpool 2 in theatres now.
Brent Ozar: And what was your verdict on that, on Deadpool 2?
Richie Rump: I liked it, but I got all the indie comic X-Men references because I’m like a complete X-Men geek. So I was like all up in it. I was crying when he said his daughter’s name and the whole thing, but again, it was crass, it was bloody, it was violent. It’s kind of what you expect from Deadpool. They tried to put a little more story into it.
Brent Ozar: I’m happy, it really looks like it’s got legs and they’re going to continue doing this for a while, and Domino is utterly amazing. She’s fantastic.
Richie Rump: Zazee Beetz was amazing. I had my questions, like how are you going to do Domino? But no, she was great.
Should I pay attention to preemptive waits?
Brent Ozar: Daryl says, “I’m seeing more preemptive wait types lately. I’m wondering if I need to pay attention to these. What are acceptable amounts of preemptive waits?”
Tara Kizer: Which preemptive though? Is it PREEMPTIVE_OS_PIPEOPS, because that’s XP command shell, so what are you doing with XO command shell at that point? So you need to figure out what type of external stuff you’re calling. And is it SQL Server built-in – is SQL Server doing that or is your code doing that? But PIPEOPS is really the only one that I look for when looking at the list. XP command shell is almost always the case.
Brent Ozar: Let’s see, Lee says, “Thanks for the ideas.” Daryl says, “It’s COM_GETDATA on SQL Server 2014. I’ve never seen that one.
Tara Kizer: No idea.
Brent Ozar: If I had to guess – so normally when we talk about wait types and how much are too much, poison wait types are one thing and we have those listed separately on sp_BlitzFirst. But generally when we talk about wait types, I don’t really worry unless there’s at least, say, half an hour of waits for every hour on the clock. So in the span of an hour of time on the clock, if there’s not at least a half an hour of a kind of wait, I’m not usually too concerned about it. Again, poison waits are different, RESOURCE SEMAPHORE, THREADPOOL, that kind of thing, it’s totally different. But if you’re just like, CX PACKET, PREEMPTIVE whatever, page I/O latch, half an hour in the span of an hour isn’t really that big of a deal.
Tara Kizer: I just looked it up and the COM_GETDATA, if it becomes prevalent in your box, like Brent was saying, but you need to look for the query that’s causing it. And SQL Skills, you know, Paul’s information on that guide, says you might be doing a hash join or a sort. So find the query.
Brent Ozar: And if you were going to go tack down which query it was, how would you go about doing that?
Tara Kizer: Well I would be using sp_whoisactive.
Brent Ozar: That’s true.
Tara Kizer: You can use BlitzWho also. I like to do this via current activity, you’re logging it to a table. And if you do it frequently enough, you should be able to catch it if this guy is a fast query because maybe it’s a fast query. It’s just running so often, the waits are going really high for it. But BlitzCache is good for looking up queries, but you’re not getting the wait information. I need to be tracking queries so it would be sp_whoisactive to a table. It’s a lead info column. You might not catch it at the time though, that’s the problem. [crosstalk]
Brent Ozar: Yeah, people also say out there too, there are things you can do with Query Store or with Extended Events and you totally can, we’re not poo-pooing that, they’re just much harder. They involve setting things up ahead of time, running a lot of tracking. And if it’s not really a big deal then it’s kind of a pain in the rear to catch, whereas sp_whoisactive to a table is a piece of cake.
How should I troubleshoot IIS errors?
Brent Ozar: Pablo asks, “Are there some things that I should check on in IIS when my app is getting error 503 due to high concurrency?” Well, you know, we don’t really do much with IIS around here, especially if it’s 500 level errors. Those are usually application level stuff. You could look in SQL Server at the error log around the same time just to see if you’re seeing something. Richie, have you got anything to add on that one? I’m guessing no.
Richie Rump: No, I ditched IIS a long time ago.
Brent Ozar: Yeah, it’s sad but true. Richie, what kind of technologies are you working with these days?
Richie Rump: Well, I’m trying to do some updates to PasteThePlan but I keep getting some errors popping up in ConstantCare that require my attention, so it’s kind of annoying. But today, I was doing some optimization with views in Postgres, so there we go.
Brent Ozar: What kind of stuff are we looking at adding to PasteThePlan?
Richie Rump: Oh, I – I don’t even know Brent because I just started on the low-level stuff. So adding the serverless functions – not functions, but framework – to the application, and that allows us to more easily deploy the functions to Lambda and kind of hold everything in one spot as opposed to crazy gulp functions that I wrote a long time ago. Doing an upgrade to … 8.1 – in serverless, it’s just change of a text. So that’s another reason we’re doing all of that.
We’re getting rid of some of the GDPR stuff, so like the recent compliance and things like that. Just so you people know, we’re not – we want to go back to Europe at some point. We’re getting ready for it, but it’s going to take us some time. There’s only one of me and my wife doesn’t want me cloned. There’s a new version of the HTML Query Plan, so there’s a whole bunch of new stuff in there we’ll be adding as well as fixing some of the valid XML stuff that I ignore.
How hard is it for a SQL Server DBA to pick up Postgres?
Brent Ozar: Tammy asks, “Speaking of Postgres, how hard is it for a SQL Server DBA to manage? Are there any gotchas?” Well, when we switched to it, we didn’t switch directly to Postgres. Richie, do you want to talk about what we switched to and why?
Richie Rump: We just started Postgres, so there was really no switch to be done there…
Brent Ozar: Well, instead of SQL Server…
Richie Rump: Right, the idea is that it was the – first of all, there’s no SQL in AWS. There is, but it’s a whole lot of different cost level. So when they came out with Aurora about a year and a half ago, we got in the beta really, really early and we were kind of impressed at the cost per performance level that we had there. So we just kind of grew as that service grew and went live when they went live. So it was more of an organic thing for us.
It kind of made a lot of sense. If you’re a SQL Server person, just pick up a book and skim some of the topics. Everything’s going to be so very familiar to you. The indexing is the same. You’re doing the same thing just using a different engine. And some of the syntaxes is different. I know, Brent, you trip on that just a little bit because you haven’t been using it as much as I have. But everything is familiar, but just in a different language a bit.
Brent Ozar: Yeah, I love how easy it is to jump in and write queries, but there are – I’ve got a blog post started with a couple of differences. For example, you can’t just go and declare variables and put in an if statement and go, if this then go do that, if this then go do something else. There’s just a lot of subtle differences in terms of syntax and then a lot of subtle differences in data types, but nothing too big. Like, casting and converting is a little different.
Richie Rump: It’s easier, frankly, from my – and the functions are really, really powerful. So they’re kind of a step ahead of some of the SQL Server stuff as far as internally. Like, I don’t have to do an LTRIM or RTRIM; I can just trim.
Brent Ozar: The date stuff is really powerful. It’s just different, you know, it just takes a while to get used to, but I’ve been really happy with it. The one thing I adore is in the Postgres client, I use Postico. You can just hit the command and period buttons on Apple and it just runs the statement that you’re on. You don’t have to highlight the statement, it just goes, oh your cursor is here, this is the query you must want to run.
Tara Kizer: Then you move back to Management Studio and you’re expecting it to do that type of stuff.
Brent Ozar: It’s terrible. I’m like, what do you mean I have to highlight this like some kind of caveman; come on. And I was in a class and somebody was like, just hit these two keys in SSMS, and I’m like, no, that executes the whole thing. That’s just like control E or, you know – how could SQL Server possibly know what line you want to execute? My cursor is there. My cursor is on it, it’s right there. It’s not hard.
Richie Rump: One of the funny things, I think, about running ConstantCare is that, you know, we have what, somewhere around 50, 60 years combined DBA experience and I have, like, zero, and supposedly I’m administrating these things. So we may have to pass off some of the buck here to someone a little more experienced, other than the DBA noob over here.
Tara Kizer: I have the most amount of years but I have the least amount of knowledge. It doesn’t matter how many years you have.
Brent Ozar: No, man, I don’t know, it’s weird. You talk about people and they say, I have ten years of experience, but they’ve done the same thing for ten years straight. Because we deal with that a lot as consultants, you know, you walk in, you talk to somebody and they’re like, I’ve used SQL Server since four-point whatever, how come you have priority boost on, seriously? It’s not that hard, turn it off.
Or the other question we’ll ask is, when was the last time you went to training or a user group or went to a SQL Saturday, PASS, anything like that, because you can learn a lot at those kinds of events. Alright, well thanks, everyone, for hanging out with us today. That’s all the questions y’all have submitted in. we will go out, grab lunch, and we will see y’all later.
Wanna attend the next Office Hours podcast taping? Register free.