This week, Brent, Erik, Tara, and Richie discuss using SQL Server 2016 vs 2017 for production environment, reporting, reindexing jobs, Azure vs Amazon, parameter sniffing, creating custom alerts for agent jobs, activity monitor, troubleshooting SQL Server 2014, replication, turning off auto-updating stats on tempdb, and why Brent can’t get a tan.
Enjoy the Podcast?
Office Hours Webcast – 2018-6-27
Brent Ozar: Tom starts out by asking, “Does the team still recommend SQL Server 2016 over 2017 for production environments?”
Erik Darling: I don’t know that we ever recommended one over the other. I usually recommend being on whatever is the newest version, like, for new environments. Like if I was spinning up a brand-new SQL Server box for production today, I wouldn’t want to be on 2016 SP2, I want to be on the newest one. I want to be on the most recent version because that’s the most likely to get back-ported fixes and improvements and all that other stuff and then when you bug Microsoft for support, they’re more likely to respond because they’re like, oh crap, this is our newest one. If there’s a problem here, it’s going to be a problem later when people actually – when real people start using it. So I would always recommend being on the most recent, even if it’s a little scary.
Brent Ozar: I’m there too because I don’t want to reinstall this thing that often. I’m going to put the SQL Server in and it’s going to be there for like ten years because my managers never ever let me upgrade or install anything. So I want to get on – man, if they could let me get an early access build to 2018, I’d probably be in. like, alright, let’s do this.
Erik Darling: I can get you one of those for $5.
Brent Ozar: $5, they’re at Chinatown, right? They have all those DVDs…
Erik Darling: SQL Server.
Richie Rump: Is that one of those places where it’s kind of like a storage unit and you say, I want that one, and they say come back in 30 minutes and then they’ll have your DVD ready?
Brent Ozar: Or you go through an ally and go past all the fake Rolex watches; the Folex watches…
Erik Darling: Yep, your Casios, your [Boonie and Dereks], whatever you want.
Brent Ozar: I do have to say, it does bother me that you can no longer get – I can no longer get – fake watches in New York, because I used to love going and getting a $20 Breitling or something because I thought they were funny as hell. They would die, and then of course, when they die, you’re not going to take them in to get the battery replaced because you don’t want to be like, hi can I get the battery replaced in my $20 Rolex?
Brent Ozar: Brian asks, “Hi team, my boss wants some readable databases on another server, but we can’t use Availability Groups because this SQL Server is stuck on 2008 R2 for a while. Could I do some kind of database mirroring or automated nightly snapshots, or is there a better way to do this?
Tara Kizer: Replication.
Brent Ozar: That’s fair. So what kind of replication… [crosstalk]
Tara Kizer: Only because of the old version though. My client right now, they’re on 2008 R2 and they would like a reporting environment. Well, you get log shipping or you get replication. They can’t handle the latency of delaying the restore, so if they get replication – you can do it with a database snapshot, but you know, you have to do it regularly, put it on a schedule and the data is immediately latent. It’s not being kept up to date so you have to rebuild it all the time. I really don’t think that a snapshot on top of a mirror is a good solution for me for reporting.
Brent Ozar: It’s Enterprise only, you’ve got to pay [crosstalk] bucks for it.
Tara Kizer: yeah, transactional replication…
Erik Darling: My question would be, like, how old does that data have to be? Because like Tara’s saying, if you want it to be up to date, then replication is probably a good bet. If you can withstand some latency in the data then log shipping might be a good bet. There was a really good answer on Stack Exchange recently about taking snapshots with mirroring where someone was talking about setting up a database with – it’s all synonyms that point to the snapshot and then dynamically rebuilding them with a new snapshot name whenever you generate one. I thought that was a really cool idea because that’s a lot better than – what I always learned was the very manual process of taking a new snapshot, draining users out of the old one, directing them to the new one. Like, I thought the synonym idea was – I think it was Aaron Bertrand or someone from Microsoft. But I was like, wow that’s cool; I never would have thought of that.
Brent Ozar: I have done that and the pain in the rear is, if you can’t get single user basically on the database, your script goes halfway through altering the synonyms and then breaks because somebody has the object open and it’s blocked. So then some of your queries are pointing to one snapshot and some are pointing to another. Yeah, that’s…
Erik Darling: Just use NOLOCK.
Tara Kizer: The fact that you said automated nightly snapshot is a possible solution means that latency is tolerated here, so I mean, for me, I might even just do a backup and a restore. You could either do log shipping with a delay – you know, that way you skip like eight hours out of the day then it will catch up on its own – or a backup and a restore, but you know, that’s probably one of the least desirable as far as fast. Snapshot replication can do it too, you know, it doesn’t have to be transactional.
Erik Darling: Yeah, you know, if you were going to go the backup and restore route, I would probably want to grab the PowerShell scripts from dbatools because they have a good automatable restore thing. Or grab like Quest LiteSpeed, which is pretty cheap per server. And they can automate a lot of that stuff too.
Brent Ozar: Let’s see, next up, Jeremy asks, “We had a re-index job running on a 4TB database. Our Availability Group…
Tara Kizer: See what you’ve done with this re-index job…
Erik Darling: You can just see Tara’s face turning…
Brent Ozar: It’s going straight to her. “On the secondary server, the database was in a recovery state. How do I see what it’s doing or any progress?”
Tara Kizer: Good luck with that. Progress check the error log… Check the error log. It should tell you the phases it’s in, in percentage, because if it’s in recovery then it’s going to crash recovery over there. I mean, at least this only happened to the secondary. If you have a re-index job during job during a failover and it happened to be working on a large index, oh boy, crash recovery is going to take a while. So this sounds like it might have been async, since I would have thought that if it was sync then the primary would have had issues too. I don’t know. A lot of people can’t run re-index when you use Availability groups, especially on high-volume systems and 4TB. So I’ve had to completely disable maintenance on Availability Groups, and guess what, no performance issues. I had an update stats job once or twice a day and I was fine.
Richie Rump: Yeah, I had a 60TB database that went into recovery and it took weeks; weeks. And the database, it was completely the database administrator’s fault. He was doing something he shouldn’t have done and he was not fired.
Tara Kizer: And you guys just left it that way? You didn’t just start the restore process? I mean, 60TB – I can’t imagine how long that’s going to take.
Richie Rump: the server was pretty beefy, so I quickly essentially failed it over manually with just data and just started using that for processing. So I saved the day.
Tara Kizer: Richie pulled out his DBA hat.
Erik Darling: Quite the DBA hat he has. SQL Server DBA, Postgres DBA, cloud DBA…
Richie Rump: One server failed – look the other failed over – that one failed, oh look it goes back the other way, that failed. Oh look, it goes back to the other one and it fails.
Brent Ozar: Some fun times with our own hosting.
Brent Ozar: Ron asks a relevant question. He says, “If y’all are going to pick hosting, do you pick Azure or Amazon?” Richie, what did we pick and why?
Richie Rump: We picked Amazon because Brent was using Amazon. I mean, really, that’s the only – at this point – I guess the other thing was that we were looking into serverless and the serverless story was better at the time at Amazon than it was in Azure. But really, they’re so comparable these days. I mean, if you’re on SQL Server and you don’t want to use RDS and you just want to go ahead and use Azure or whatever, go do that. I mean, if your company is already using Azure, go use Azure because the resources are probably there and you have knowledge in the company and stuff like that. But they’re both pretty comparable. I find Amazon a little bit easier in some things and I find Azure a little bit easier in other things. So it is what it is.
Brent Ozar: I would say too, just like we picked one based on one particular service that wasn’t available anywhere else at the time. You may have the same kind of thing. There may be some business make or break that – man, I have got to have exactly what Azure CosmosDB offers. This is make or break for me for some reason, but it’s fairly rare.
Richie Rump: Yeah, I would love to use CosmosDB in some of our scenarios. That would be really, really interesting to use some of that stuff, but it’s fine. We get along with what we got.
Erik Darling: I think I would avoid Azure because DTUs terrify me…
Brent Ozar: Now you can get cores and RAM. They let you buy Azure SQL DB in cores and RAM. Because you and everyone else in the world – it sure terrified me.
Erik Darling: Like what the hell – it’s like getting a weird tax at the end of the month, like, what is…
Richie Rump: Yeah, I will say this; I do find Azure security much more difficult to grok than AWS security I think because they added that Enterprise layer to it, it’s just a little more difficult to put your head around some things.
Brent Ozar: Look, dude, there have been like three GitHub issues in a row where we’ve added new tables and we can’t get permissions to access them. If it’s this bad in AWS, how bad would it be in Azure. Holy smokes…
Richie Rump: Oh no, that’s just me being a jerk. What are you talking about?
Brent Ozar: Richie’s like, I got this new table over here. We go to query it; access denied.
Richie Rump: It’s fine in dev, if you ever worked in dev, you know…
Erik Darling: We don’t have access.
Brent Ozar: Sadly, I do have access, I just always forget to go over there.
Brent Ozar: Pablo asks, “Hey, I’m trying to work on this parameter sniffing problem in QA and I’m wondering, if I rename a table and name it back, will that recompile my query?”
Erik Darling: Yeah, I think that’s enough of a DML change for it to happen.
Brent Ozar: I wouldn’t do that if you want to recompile. I would do something else.
Tara Kizer: Well it sounds like the dev team did it and he’s wondering if that could have fixed the performance issue.
Brent Ozar: yeah, that would do it. Don’t let them do that very often.
Erik Darling: don’t reinforce that bad behavior.
Tara Kizer: Show them how to really recompile. You can recompile just the query. You don’t have to rename things.
Brent Ozar: Daryl says, “I’m not getting any positive feedback at work, but SQL ConstantCare is my only happiness when y’all say congratulations. I worked my hiney off last week and I cleared tons of errors but I didn’t get any feedback. How do you decide what to congratulate and what not to?” Richie, when do the congratulations rules on SQL ConstantCare – how do those work?
Richie Rump: Well, see I have this bag of dice here and what I do is I roll them. Let’s see if I can get them up here. See I roll all of these dice and I roll them and then whatever rule number it comes on is what we congratulate on.
Erik Darling: Are there any dice with negative numbers on them?
Richie Rump: If I draw a negative in front of them…
Erik Darling: Can you buy any with negative numbers?
Richie Rump: You can buy any dice you want, man…
Erik Darling: yeah, because I feel that would be an interesting thing. If you could possibly get a negative number it would go like the opposite favor…
Richie Rump: Stats can have negative numbers, right. [crosstalk] Your charisma stats or down the tube, man. It’s awful.
Erik Darling: I think I’m an Ork. I think I’m an Ork character; that’s basically what happened.
Brent Ozar: Troll maybe. So the way that they work is they run every Monday afternoon – well, on your Monday collections; whenever your Monday collections run. And they don’t trigger or everything. So for example, if something bad happened during the week, we may not believe that you’ve truly fixed it yet. It may still be triggering, so we get a little paranoid on those.
Brent Ozar: Let’s see, John asks, “Is there a way to raise an alert for a long-running agent job? For example, if an agent job normally takes ten minutes but it’s now taking an hour, I want to see an alert for that.”
Tara Kizer: yeah, just need some custom scripting and you can do a raise error. Have the raise error post a message that can be sent out automatically via alerts. But yes, custom scripting would be needed here for sure.
Brent Ozar: And it’s not like you’d want to hardcode a time; you’d want some kind of standard deviation or rolling average of the last ten days on that task.
Erik Darling: And then is it like the whole job or just a particular step in the job or, like – there’s a lot of stuff at stake there.
Brent Ozar: Next up, Mike asks, “Activity monitor, does this require any special permissions?”
Erik Darling: Who knows…
Tara Kizer: Activity monitor, we don’t use it. A client earlier this week had it up, I was like, can I close this? I need more screen space. I don’t want to see this. Can we download whoisactive?
Brent Ozar: Yeah, I would say talk about what you’d use instead of activity monitor.
Tara Kizer: Okay, sp_whoisactive. Got to whoisactive.com, download the latest version and off you go. I haven’t been using – I mean, I sue BlitzWho occasionally if I don’t have whoisactive on a client machine and I have to be able to – but whoisactive has more features, more columns, more capabilities. And I log that to a table and I can query the table instead of running the stored procedure. There’s a little bit of a learning curve with it, you know, figuring out which input parameters you want to use. I always use get plans – get plans equals true – so I can see the execution plans, just in case I need to look at them.
Erik Darling: I mean, Adam has a lot of great documentation on the site about how to use it. Like Tara said, go to whoisactive.com. There’s a download link, there’s documentation links. He walks through everything. And aside from like, I think, general Azure compatibility improvements, I don’t think any of the parameters or anything have changed in like a decade.
Tara Kizer: … familiar with running sp_who or sp_who2, just download whoisactive and then sp_whoisactive and that will be your replacement. It’s getting the data from other places, whereas who and who2 get it from sys.processes.
Richie Rump: Is it weird that I have more than one set of polyhedral dice?
Brent Ozar: No, I would be surprised if you didn’t have four or five different colors knowing you. I don’t think I have any. I don’t think I’ve had a single die in here for years. I gamble on the SQL Server. I use RAM.
Richie Rump: Do I shut it down? Snake-eyes, yeah.
Brent Ozar: Steve asks, “What’s with the word on the street, holdup on SQL 2017 cumulative update eight?” I think Microsoft’s services team had a real rough week last week. They took a while to get CU8 out the door, they took a while to get the release notes out, the release notes had problems. Then SSMS 17.8 had a problem and couldn’t create a database or something, so just a bad week. We all have bad weeks sometimes.
Brent Ozar: Paul says, “I have the transaction log in tempdb grow out of control. After I reviewed the log, I found the query that was causing the issue, but it wasn’t directly causing the issue. There was this query that was saying select statman on the table that was causing the long-running query. Why would – what’s this query and what’s it doing?”
Erik Darling: That is update stats, baby.
Tara Kizer: Maybe if you don’t have a job, maybe the auto update stats kicked in.
Brent Ozar: Teeshal says, “Are indexes that are not used for reading, could they still be useful for updating in large tables?”
Erik Darling: No, they would be the opposite. They’re not helping any read queries and you have to update them. You are generating negative I/O. Like, the negative dice numbers. Like, generally the kind you don’t want if you have to update those indexes but they’re not helping any read queries because they’re not helping the update or delete.
Brent Ozar: Well, they could be. If you say update on a really big table where hair color equals grey, and that’s the only time you ever use the hair color field and you want to avoid a table scan, yeah, that can be useful for quickly narrowing down the rows that you want to update. But…
Erik Darling: But if they’re not being read then no.
Brent Ozar: Well, if you say – so let’s say I want to avoid the table scan and I don’t want to…
Erik Darling: I get what you’re saying. If you have a where clause on that column, but if like, it’s just not being read at all then it’s not helping.
Tara Kizer: Yeah, I just wonder [crosstalk], you know, reads equals zero or by not being used for reading, I wonder if she’s referring to selects.
Erik Darling: We’ll wait for a follow-up.
Richie Rump: Yeah, well, it actually could be used for foreign keys as well and updates on other tables. Things like that.
Erik Darling: Who has foreign keys though?
Brent Ozar: Aaron says, “Got a web app,” and he’s got a big long thing, Aaron, because of the length of that, you probably want to post that on Stack Overflow. You’ll see the pages go by with info on how to post that.
Brent Ozar: Joe says, “Currently I have SQL Server 2014 on a SAN. I have a couple of large MDF files that are on a sinology box, like a cheap toaster [NAZ] in a different location. The databases can’t find the drive or the drive is inaccessible, however, I can access the drive when I remote in. Where should I start troubleshooting?”
Tara Kizer: Good lord, so you can’t connect to the drive where your files are supposed to be? What on earth? It’s at another location? Oh my goodness. Hire us. It sounds like you need to hire us. [crosstalk0]
Brent Ozar: My guess, and this is just a guess is that when you remote desktop in, you have drive mappings that are like automatically mapping a letter to that sinology. SQL Server doesn’t log in remotely so it doesn’t have drive mappings.
Erik Darling: You could use XP Command Shell to run net use and set up those drive mappings if you wanted. But then you would have to make sure that the account that the SQL Server is running under has permissions to that drive. So it could be a permissions thing too.
Brent Ozar: And you’d have to make sure that net use runs every time so that the drives are there when you restart.
Tara Kizer: This is a terrible architecture, I’m sorry.
Brent Ozar: Everything about this is setting off alarm bells. We usually try to keep a poker face with y’all, like, we don’t ever want to say, “Hey, put some pants on,” but in this case, this is no pant alarm.
Erik Darling: I wouldn’t gamble with that with my large MDF files.
Tara Kizer: I mean, what kind of a budget were they given to set this up? Like, 20 bucks?
Erik Darling: You can buy anything from the bargain bin at Staples you want to build this thing. There was some adhesive spray, there were thumb drives.
Richie Rump: Yeah, we have some zip drives back here, you know.
Brent Ozar: Sinology, great for a home NAZ, for your…
Tara Kizer: Yeah, I was going to say.
Erik Darling: Like, it’s like, do you have your plex server on there too? Like, what else…
Brent Ozar: Terry says, “Brent, isn’t it summer in Chicago yet? Don’t you ever go outside? Your blue wall makes you look very pasty white.” No, I am pasty white. I just got back from California. That’s – and I tan like that. I mean, I just tan immediately, and I still, no, nothing.
Erik Darling: Also, what a lot of people don’t realize is that Brent is perfect Pantone white. It’s like, not regular, like, perfect, perfectly calibrated Pantone white. Not by accident.
Richie Rump: But you know, it makes complete sense because he lives up in Chicago in the winter and then it’s instant camouflage. Take off the shirt, take off the pants, and nobody sees him.
Erik Darling: Brent has strangely never been mugged naked. I don’t know how…
Brent Ozar: Sweet Jesus.
Brent Ozar: Steven says, “While we’re on the topic of replication,” – no Steven…
Erik Darling: We are not.
Brent Ozar: I saw you add that question, Steven. You added it way after we were on replication. He says, “Any recommendations on giving our subscribers back in sync after we restore them from a backup other than reinitializing? I have several publications that transfer data across the LAN, reinitializing these is a very lengthy process.”
Tara Kizer: Yes, so restore your transaction log chain and get it to the same point in time, and then set up replication and say I’m already initialized, just start sending me data.
Brent Ozar: Oh, Paul has a weird one. He said, “Would it be a good practice to set auto update statistics to false on tempdb?” Should he turn off auto updating stats on tempdb?
Erik Darling: I wonder even how he came up with that question. Like, what did you see that made you worry about auto update stats on tempdb? I would be curious about that.
Brent Ozar: Me too.
Richie Rump: Did the developer suggest it?
Brent Ozar: Then if that’s true, you should just take them at their word because they’re very sharp people. Also, they control my permissions.
Brent Ozar: And then Jeremy asks – so Jeremy had the follow up too, he had the AG failover when in the middle of rebuilding an index. He says, “Here’s the error. Recovery of database is 0% complete. Phase one of three, this is an informational message only.”
Tara Kizer: This is like my worst nightmare on Friday when my client’s machine went down and we were watching crash recovery, watching grass grow, basically. Eventually, it came online, but you know, I replied to Jeremy I believe earlier, just saying, yeah, if it stays in this state, 0% after like a day, then I would just punt and remove the secondary out of the AG and then re-add it.
Erik Darling: So what you need to do is grab a chair, go down the hall to the vending machine and smash it and grab every Snickers bar in there because…
Brent Ozar: You’re not going anywhere for a while.
Richie Rump: Hungry?
Brent Ozar: I’d also say too, if you’re in an AG, that means you got multiple nodes. Take one of the nodes that isn’t doing this and go start restoring a database onto there, just get yourself a plan B. Because I know it’s four terabytes and you may not have the space, but if the SAN admin can get you the space pretty quickly, that’s what I would do.
Erik Darling: I had a five terabyte database take like, 21, 24 hours to go through crash recovery, and that wasn’t even like, an AG thing. That was just a VLF thing. Like, so those honkers can take a while to budge. And it’s not like you’re sitting there hitting at five and like, getting constant feedback. It’s just sitting at a number for a real long time. It’s rough.
Brent Ozar: No visibility into what’s going on.
Brent Ozar: Paul’s asking about tempdb, the turn off auto update stats, and he said that the reason why his drive filled up was that that statman query was running, it caused his transaction log to grow to 100GB and that’s why he thinks that the thing…
Erik Darling: Oh no, so that’s going to happen – user databases are going to do that work there too. So it’s not going to help to turn off auto update stats in tempdb. User databases would also do their stat sorting there so sorry about that. Get a bigger tempdb drive. 100GB is not that much.
Tara Kizer: No.
Brent Ozar: We usually – just as a ballpark, at least 25% of the size of your databases, so like, the total database size on the instance, if you’ve got a one terabytes worth of user databases on there, it’s not unrealistic to think that you need at least 250GB for tempdb.
Erik Darling: Just to put things in a little bit of perspective, I have a 256GB micro SD card in my phone right now. So like, you know, that’s in my phone and I don’t even think I’m smart enough to put apps on it.
Brent Ozar: Did you expense that?
Erik Darling: No, I got it for free. I bought a Galaxy S8 and they sent me this like, whole VR kit for free that I didn’t want and I sold, but the one thing I kept was the 256GB memory card that was I guess supposed to hold all my sweet VR games. Okay, thanks. I’ll keep this.
Brent Ozar: Alright, well that’s all the questions that we got for today. Thanks, everybody for hanging out with us this week at Office Hours and we will see you all next week. Adios.