This week, Erik, Tara, and Richie discuss disaster recovery, Python vs C# for SQL Server, whether you should perform log backups on primary or secondary on AGs, why Postgres is so popular, using mount points for tempdb, attention events, database corruption, MERGE statements, restoring a table from a database, features they look forward to in SQL Server 2019, and how Santa is able to visit all those homes in one night…
If you prefer to listen to the audio:
Enjoy the Podcast?
Office Hours Webcast – 2018-12-19
Can I reset the SA password?
Erik Darling: Josh asks, “I’ve come across an older database and I need to perform some maintenance tasks on, but the documentation we have mentions the only account is SA in this database. I did not know the SA password. Is there any way I can reset the SA password or generate another account with sysadmin role? My senior DBA created this database a while back, but he has since retired, leaving me looking for advice on where to even start correcting the issue.
Tara Kizer: How are you even connecting in the first place? You must be using your Windows account and probably have sysadmin, so you’re good to go. But I would definitely add another SQL sysadmin account, just as a backup in case something happens with your Windows account. And then maybe plan for an SA password change. I would see if anybody’s logging in with SA, because maybe the application isn’t changing that password, obviously. Not a good idea if the app is using it.
Erik Darling: You know, what I’d do is go look under the old DBA’s keyboard for the sticky note that he has all his passwords on. That’s what I’d do because we know they’re under there. We know they’re there somewhere.
Richie Rump: So where do you guys stand on the SA account? Do you keep it or do you replace it?
Tara Kizer: One of my jobs that was really into security renamed it, but that doesn’t really change anything because it’s still the same SID, but it was renamed. I’ve never gotten to a point where I’ve just disabled the account, but it’s always something like a 20-character password that we just throw into LastPass or whatever and then nobody is supposed to use it. It’s just our backdoor entry as a just in case.
Erik Darling: Yeah, there’s been some weird cases where applying CUs and Service Packs have failed if SA account either is renamed or disabled. Even if you want to follow really good security practices and then you want to patch your software so it’s more secure, you have to do something unsecure to be more secure. It’s a weird tail chase on that sometimes.
Will adding filegroups help with disaster recovery?
Erik Darling: Alright, let’s go to the next one, which I haven’t read yet, so who knows what surprises will be in here. Brenting this one hard, “We have a database that is approaching 1TB in size. Twice we have had to perform DR to recover something that was deleted and the process to restore the database takes about four hours. The developer in charge of this department is requesting that we split out this database into file groups in hopes that this will help with DR. Is this possible? Do you have better ideas?” My number one idea would be to stop giving people access to this database where they can lose data. That would be the first thing I did because if you have – let’s see, what’s a nice word – jack somethings. Let’s just call them donkeys. If you’ve got some donkeys on your team who go around running queries without where clauses, running faulty updates and stuff like that, then they shouldn’t have access to this data anyway, especially if they’ve caused this many problems this quickly. With this kind of business continuity issue, where it takes four hours to restore a database to get some stuff back, I would ban them permanently from the database.
Tara Kizer: I was just wondering what they mean by DR. I mean, it stands for disaster recovery, but did they failover to the disaster recovery site, ro do they mean this is a disaster, let’s restore right here…
Erik Darling: Well it sounds like they had to restore the database to somewhere, maybe another server or something, and then recover the data back to the original server, perhaps.
Tara Kizer: So I am a donkey and I have done a delete without a where clause on accident. I was able to do a side by side restore because I had the storage available to do it, and then I was able to move the data over. So making sure you have transaction log backups and taking a final tail log backup, that way you can potentially do this without data loss. However, as data is coming in while you’re doing this process, a side by side process at least, you could be mucking with things. You might have some foreign key issues even though there’s no telling. I was luckily in a table that I did this oopsy on that it didn’t have foreign keys anywhere and there wasn’t much risk on this table.
Richie Rump: I did that once where I didn’t have a where clause in the user table. The table name was users. So yeah, that was not fun. And since then, I’ve been paranoid with deletes, just paranoid. Which is good, I think you should be paranoid when deleting stuff.
Erik Darling: Changing data in general should be…
Tara Kizer: Exactly, and I’m sick of that as a production DBA because, usually, we’re changing data because of application bugs. And developers, oftentimes, will know what the bug is, or maybe they don’t, but if they do, the business hasn’t given them time to fix it. They’re all, let’s move forward with all these feature requests and these other critical things, but not this. So I’m stuck having to manipulate production data, and there’s room for human error. File groups certainly helps out with this situation because you can do restores of a file group at a time, but I don’t know that I would implement file groups because of this.
Erik Darling: Yeah, that seems like a weird step to take, especially given some of the pain that you might find in creating all these files in file groups and moving data over to…
Tara Kizer: Moving data, especially if it’s 1TB in size and you’re moving things around. Wow…
Erik Darling: That’s not a trivial task, even if you’re lucky to have the online rebuilds. You’re looking at a pretty tremendous amount of work. If your developer has a really quick and easy plan for that – oh yeah, just add file groups, it will be fine – go ahead, that’s your weekend then, pal. That’s not my weekend.
Richie Rump: So what would be some reasons to add file group type things?
Tara Kizer: Well, I mean, this is one of them.
Erik Darling: You listed all exactly one of them. I mean, aside from if you have a single file which is approaching the limit, which is what, 16TB. So if you have one file approaching that limit, then it might be a good idea to start adding in some more, but then you’re still looking at moving crap around.
Tara Kizer: Or, you know, if you have some disks that you want to leverage for this table, faster disks or slower disks or read only file group type things. But usually, we start thinking about those before we hit 1TB. Once you gte into the VLDB sizes, oh boy, it’s tough.
Erik Darling: Yeah, I mean, there’s a lot of choices that need to get made early on when you’re worried about a database scaling. You know, stuff like use bigints, get used to developing with an optimistic isolation level, get into compression, just general things that are really good ideas if you’re worried about a database getting really big and scaling up in size really quickly. So file groups are sort of part of that picture. Mostly, I would just whip my developers until they stop losing data.
Should I use Python or C#?
Erik Darling: Here’s a good Richie question. It’s not a good anyone here question, “Does Python have any benefits over c# in Microsoft SQL Server in particular…” come on, let’s just keep it to SQL Server. So, Richie, what do you think for interacting with SQL Server, c# or Python?
Erik Darling: Does c# offer anything for data science people, or is it completely just offloaded to Python and R and MATLAB and whatever else?
Richie Rump: Well, I mean, all that stuff is kind of not really focused on that mathematical stuff. You could kind of do all that stuff in c#, it’s just probably going to take you more lines of code to do that sort of thing. So once you – I have this theory on languages; once you learn one and the basics of how to program a language, jumping to another one isn’t as hard as learning it from scratch without having any programming knowledge.
Erik Darling: Yeah, I know, I’ve seen it…
Richie Rump: How can you see it? I ought to block your account from any of the source code.
Erik darling: I can’t do anything with it. I’m not even smart enough to break it.
Richie Rump: You know, each language has its own kind of flavor, what it kind of does best. Python is kind of the data type thing; c#, I think, is more of a Windows type environment great general-purpose language where you can do a lot of different things pretty quickly and pretty easily. The really hard part about c# is learning the .NET framework. The .NET framework is vast, it’s long in the tooth. There’s a lot of things going on in there and even the new one that they have, the .NET core, there’s a lot that has changed in three versions going on there. So I could talk long over this, but this is not a programming podcast.
Erik Darling: Notice that Richie didn’t say PowerShell once. Just throwing it out there…
Richie Rump: It’s not a real language.
Should I do my backups on the primary or the secondary?
Erik Darling: So here’s a good Tara question. So we had a good Richie question, here’s a good Tara question, “In an Always On Availability Group, do you recommend performing your log backups on the primary or secondary and why?”
Tara Kizer: I do my backups on the primary because there’s no latency there on my backups. I don’t want any backups to have even a few seconds of latency in case there needs to be some kind of recovery there. Even on a synchronous commit secondary replica, there can be latency. The secondary can switch to async without even telling you to do some kind of issue. It’s documented in Books Online MSDN. So I don’t offload that task because backups aren’t causing me any problems on the primary.
Erik Darling: I hear that.
Why is Postgres so popular?
Erik Darling: Let’s see here, Graham asks kind of an interesting question, perhaps a bit religious, “Why is Postgres so popular, aside from the fact that it’s free?”
Richie Rump: There you go. That’s the reason. You just answered your own question. Thank you very much. I mean, I guess more of the question is why is MySQL so popular, because MySQL has more stuff that will remove data and all this other crazy stuff and settings you have to make sure are right. So why is MySQL so popular? Postgres is more along the lines of an Oracle or SQL Server with more enterprisey features, but MySQL has a lot of stuff that makes us cringe and stuff that’s kind of why would you want that in a database. And the answer to both is, free.
Erik Darling: Yeah, if I had the power, I would classify MySQL as a super fun site. That would be the first thing I would do, just get it walled off, dig a giant pit, throw it in, get some toxic waste in there. Postgres is far more feature-rich, far more dependable, reliable, as a database. And for the price, you kind of can’t beat what you get in return for it. I also think that it’s fairly nifty that you can go and read through all the source code. So if you run into an issue, you can be like, oh that’s where it is. You can contribute changes to it, and in a way, I feel like that’s a lot more interesting to a certain group of people who don’t just want to impotently raise the user voice item and wait for someone from Microsoft to maybe get around to fixing it.
Richie Rump: [crosstalk]
Erik Darling: Yeah, exactly.
Richie Rump: I mean, going back to MariaDB, or MySQL, MariaDB exists because of MySQL’s failings in a lot of ways, so I don’t know why MySQL came up, but I have just so many problems with MySQL, core problems.
Erik Darling: It was like barely a notch above SQLite, but missing basic stuff that linguistically, as far as the optimizer goes, like certain join types were not supported. It was just like, holy cow, what are you doing?
Richie Rump: Yeah, and not to say Postgres doesn’t have its failings, because there’s a fair amount in there.
Erik Darling: Like vacuuming…
Richie Rump: Or single threading and some of the query processing and stuff like that.
Should I use mount points for TempDB?
Erik Darling: Alright, Josh asks, “For tempdb, we are using mount points instead of drive letters or local storage due to decisions made by others previously. Are there any downsides to doing this or any concerns?”
Tara Kizer: There is a downside and I can’t remember what it was, but we were suing mount points directly off the root drive. And so we’d use F-backups, F-data, F-log, F-tempdb. And we had no issues with it, but Microsoft, when they did an assessment of our servers through whatever you get with the PFE stuff or whatever it is, we were notified that you’re supposed to create a folder first, and then have the mount point underneath that folder. But we never ran into an issue, but there was some sort of possible issue that could be encountered that way. But as far as using mount points in general, totally fine. I mean, I’ve been using SQL Server and Windows a long time, and back in the day, we didn’t get to have mount points and we had a lot of instances on failover clusters and the possibility of running out of letters in the alphabet. So mount points helped out with that.
Erik Darling: I’m glad I’ve never run into that problem.
Tara Kizer: Well, I mean, we were only notified of the problem; we didn’t run into the problem… Oh, the drive letter thing?
Erik Darling: Yeah, like, thankfully that’s a lot easier of a problem to detect than running out of ints or – it’s like, oh, we’re at Q, shoot…
Richie Rump: Is that where some Star Trek character comes up and starts randomly annoying you?
Erik Darling: I hope so, because it’s not a weird night without that.
Tara Kizer: I supported a four-node cluster with 11 instances on it back in the day.
Erik Darling: I don’t like the way that maps out.
Tara Kizer: I know, I think six of them were production and five of them were staging and they wanted to have the staging on the same cluster, but we had gotten up to the letter P, I think it was, in the alphabet… I know, it wasn’t what I wanted to do…
Richie Rump: P for poor decisions.
Erik Darling: Alright, I got some other fun questions around, “What would lots of attention events and a profiler trace during a bulk insert indicate?” I don’t know. Got me on that one.
Tara Kizer: from what I remember, attention events can relate to a timeout from the application.
Erik Darling: Well, I hope it’s not too closely related.
Why isn’t my restore working?
Erik Darling: Alright, Joseph asks, “My database backup does not restore. When I try to restore the header, it shows incomplete. Does this mean the backup is corrupt? What could be the possible reasons for it to be corrupt?”
Tara Kizer: Oh yeah, it‘s corrupt.
Erik Darling: Throw that one out. Time to take a new one, format…
Tara Kizer: Fix your disk, whatever’s going on there. You’ve got some kind of, probably, disk corruption. I would be running CHECKDB if you have any databases running on that same drive.
Erik Darling: Backup to something more reliable, like wet paper, you know, lots of choices there.
Tara Kizer: Also, run restore header only to see what kind of error it has.
Erik Darling: Or restore file list only. Maybe there’s multiple files in there and you’re getting it the wrong one or something like that.
My MERGE statement takes 28 minutes…
Erik Darling: Steve says he, “Started a new job and there are a bunch of merge statements that take 28 minutes. What would you do?”
Tara Kizer: I’d stop using merge, I mean, period, end of story.
Erik Darling: Yeah, Aaron Bertrand has many good articles about merge and the problems with it, but I have to say that some of my favorite articles are written by a guy named Michael Swart. Let me see if I can get to the internet. I started typing in Michael, instead of Internet Explorer where I was going to type Michael. I’m on my VM where I don’t have Chrome installed.
Richie Rump: What?
Erik Darling: What am I supposed to do?
Richie Rump: That is weird, man.
Erik Darling: Why is that weird?
Richie Rump: Even Microsoft said, no, we’re going to go to Chrome.
Erik Darling: I read about that. There we go, so Michael J Swart has a great post about UPSERT patterns and antipatterns, and this is because he’s written a bunch of other posts about how kind of god-awful crappy merge can be and ways he’s gone around that where, you know, merge is an antipattern. And when you finally get through all of the antipatterns that he’s discovered over the years, there are some better ones where you go to the patterns and does some begin and try and some other things. So I would really just run screening for merge in general. I personally much prefer to do either insert or update, you know, figure out of the date is there. If it’s there, update, if it’s not there, insert. Merge is just syntactical sugar for a bunch of garbage that Microsoft stopped working on years ago because there were just so many issues with it that I think they just kind of gave up too. Merge, what garbage.
How can I restore just one table?
Erik Darling: Brandon asks, “What is your process if you have to restore one table from a database? Do you restore the whole database and pull out what you need, or is there secret tips?” There are third-party tools like Quest LiteSpeed that can do object level restores. SQL Server used to have it but it got yoinked. I think it’s probably one of the most requested features, to be able to do object level restore. But right now, you need to use third-party tools in order to do it, otherwise there’s no other good way to do that.
Tara Kizer: We used to have that functionality way back in SQL Server 6.5.
Erik Darling: It’s hard to believe that they were able to achieve such a technological feat then, and then they ceded all work on that to people who were, I don’t know, developing other tools and other backup methods. I don’t know, weird all around.
What are you looking forward to in SQL Server 2019?
Erik Darling: Michael says, “Merry Christmas…” Merry Christmas to you too, Michael, I think, maybe, if you’ve been a good boy this year. “Is there anything in SQL Server 2019 that you are really looking forward to?” You guys got anything?
Tara Kizer: I haven’t really paid much attention to it.
Richie Rump: I haven’t paid much attention to it either because I live in a different world now; a strange and unique world where I have to tidy up my databases using a vacuum process. But I would love to see, and we haven’t seen it since SQL Server 2012, is more T-SQL improvements. The more I use Postgres, the syntactical sugar and some of these other functions, it’s just a little nicer than what we have in SQL Server. So if I had my druthers, they’d be going back to the T-SQL land and improving the language a bit.
Erik Darling: So like specifically, what from Postgres would you like to see over in SQL Server?
Richie Rump: Oh, I couldn’t tell you offhand right now because I’m just wondering if my vacuum process just failed or if it’s fixed. There’s some inherent functions that makes things a little easier as well as casting. All I need is two dots to cast, and things like that. Like, that’s a lot better than doing a cast or a convert function on something like that.
Erik Darling: Sure. I think, for me, for SQL Server 2019, I’m most looking forward to batch mode for row store indexes. I think that’s going to be the absolute biggest improvement. It’s going to be interesting though, because it’s really only good for larger data sets. And so for people who are doing OLTP type stuff, they might not see any love from that just yet. But I think that’s what I’m most looking forward to. Froid, so the stuff with inlining of scalar valued functions, that’s going to be huge for a lot of people. That’s like an absolute reason to upgrade, like not having scalar valued functions crap all over your workload all day long once per row, just like leaving little turdlets along the way. I think that’s going to be pretty awesome. I don’t know, lots of cool stuff. I think what’s really telling to me are the problems that are still out there in the wild not getting solved. Like parameter sniffing is still, you know, a huge, huge problem. People supplying like optional parameters is still a huge, huge problem. So I think what’s interesting is the stuff that’s still on the table for later versions that hopefully will get addressed at some point so that I can retire.
Tara Kizer: Has it been confirmed that the inlining of scalar functions is in 2019?
Erik Darling: It’s in CTP 2.1 and 2.2, so…
Richie Rump: And hopefully won’t get pulled, right? Right?
Tara Kizer: Have you tested it?
Erik Darling: Yeah, it’s worked okay in the stuff I found. I mean, there’s limitations to it, but like everything else.
Is there Office Hours next week?
Erik Darling: Julie asks, “Is there Office Hours next week?” I don’t think so.
Tara Kizer: Probably, because I think we only have Christmas Day off. I won’t be there, I have a client.
Richie Rump: I won’t be there because I’ll be off.
Tara Kizer: Oh, finally taking some vacation?
Richie Rump: It was requested by the big boss, the biggest boss.
Erik Darling: Yeah, I don’t know, maybe, we’ll see. If it’s just going to be me, the answer is no. I don’t know who else is going to be around though, we’ll see. It’ll be a surprise. It’ll be a Christmas miracle if we have Office Hours next week.
Where can I learn more about Postgres?
Erik Darling: Let’s see here, a loyal dedicated attendee asks, “What are some good blogs you suggest to learn about Postgres?” Do you know any, Richie? I don’t…
Richie Rump: No, I don’t. In fact, it’s really hit and miss with some of the Postgres stuff. So actually, I don’t have any learning materials to kind of learn it other than to just dive into it, which is my preferred method of learning something new. It’s not like the SQL Server community, where there’s a lot of great bloggers out there and they’re consistently blogging and they have been blogging for 10-plus years. It’s different.
How does SQL Server pick stats sampling rates?
Erik Darling: Let’s see, Joseph asks, “When you run a stats update and you don’t give it a sample percent, what does SQL Server use to figure out what percentage to sample at?”
Tara Kizer: I can’t remember, does update statistics have a default that if you do sp_updatestats, the built-in stored procedure, it will use whatever…
Erik Darling: The default is a weird calculation. Paul White wrote about it and he has an answer on Stack Exchange that I can’t find easily, but he talks about the equation that gets used to figure out – it’s like…
Tara Kizer: It’s not just the 20%?
Erik Darling: No, it does something internally where it does a calculation based on, like, how many rows are in the table or something like that and it does some weird percentage of them. If you ever look at the stat man thing and you see the table sample thing in there, it’s like that percentage gets figured out in a real weird calculation.
Richie Rump: The answer is 42.
Erik Darling: I wish. I wish there were magic numbers in there and not bizarre calculations.
Richie Rump: We’re not going to tell you 42 what, it’s just 42.
Erik Darling: Yeah, units, query bucks maybe.
How can Santa visit all the homes in one night?
Erik Darling: And to finish things off, Tammy asks, “How is Santa able to visit all those homes in one night?”
Tara Kizer: That’s a good question. That’s why my son started questioning the whole Santa thing in kindergarten.
Erik Darling: The bad news is that Santa has done what just about every other industry has done and he has outsourced all of the work to other people. So it’s mostly just parents who deliver the presents to their own homes and hopefully don’t steal any from other homes. [crosstalk]
Richie Rump: The answer is teleportation. So it’s Nightcrawler. Nightcrawler is Santa.
Erik Darling: If you wanted to give a real gift to the world, Santa would share his teleportation technology.
Richie Rump: Or, if you’re more in a Harry Potter universe, he “disaporates.”
Erik Darling: No thank you, skip that. Skip that whole thing. No good. Alright, that’s all the questions we have. Thank you for joining us. Maybe see you next week, maybe not. It will be a surprise, just like what’s under the tree, and hopefully it will smell better. Alright folks, thank you, goodbye.