This week, Erik and Richie discuss creating/updating statistics, tools for viewing large flat file tables, the worst code they have ever seen, high disk queue length, dealing with databases that store blobs, reindexing jobs, transparent data encryption, and more.
If you prefer to listen to the audio:
Enjoy the Podcast?
Office Hours Webcast – 2017-03-08
How should I configure max server memory with 512GB RAM?
Erik Darling: Clark asks—I’m not going to try to pronounce that last name, I’m just going to call you Clark Ginger. “I’m configuring a SQL Server 2016 64 Enterprise instance.” His previous DBA, the guy who presumably he took over for, left this value set to 256 gigabytes on a server that has 512 gigabytes installed. What would I set it to? It would depend on a couple few things that you take into account when you’re setting max memory on a server. If you have goofy other services installed on there, like SSIS, AS, or RS, you might need to bump SQL Server’s memory down a little bit to accommodate those. If you have multiple instances on there, you may need to bump it down to accommodate those. If you have a failover cluster, where it is active active and you might need to have another SQL Server instance failover to that node, I would probably set it a little bit lower. But if it’s a standalone or if it’s never going to have any company on there, I would probably set that to about 475 or 480 gigs to start and just kind of see how things go. Usually when you have a server with that much memory on it, you’re trying to account for a fair amount of data. So when you run CHECKDB and other tasks that get pretty big memory grants and require a lot of memory to run, you can end up in some trouble if you set max memory too high. I had a failover cluster that would fail every weekend when CHECKDB ran on a four and a half terabyte database, really just setting max memory down a little bit lower to 480 solved it. It was set to 500 before that.
Richie Rump: We could have fun with this, right? You could say, “Roll 25 d20 and add it up and then that could be your max number.” Because that sounds about what this guy did. It’s totally random. 256 because…?
Erik Darling: Well, it’s half. It’s exactly half.
Richie Rump: But why?
Erik Darling: I don’t know. I wasn’t there. I didn’t ask him. If I was there, I would have had questions.
Richie Rump: I still like my d20 idea.
Erik Darling: Your d20 idea is not wrong. It’s just he had a pretty good roll I would say.
Should I update statistics if I’m doing index rebuilds?
Erik Darling: Here’s a funny question. Alex asks, “If we have enabled auto create statistics for a database and are doing index rebuilds, do we need an additional SQL agent job to create statistics for a database or is it okay to allow SQL Server to take full responsibility?” Okay, so here’s the thing. When you auto create statistics that just means that if you have a column that’s not the left-most lead column in the key of an index, whether it’s clustered or non-clustered or whatever else, that means if you run a query where you want to look at that column but it’s not in an index like that, SQL will generate a separate statistics object that’s just a histogram on that column by itself. So if your where clause is like “select blah blah from table where unindexed column equals something or greater than/less than something,” SQL will usually go ahead and create some statistics on that column so it can make a better guess about what that where clause is going to produce. So you don’t have to go and automatically create statistics based on that. You are probably going to want to have an update statistics job and you’re probably going to want to scale that rebuild stuff back a little bit because this—I don’t know if you know this but you’ve wondered into sort of a wasp nest of anti-index rebuild sentiment with Brent Ozar Unlimited. We prefer update statistics jobs to index rebuilds because index rebuilds, as the great Michael J. Swart says are a tremendously expensive way to update statistics. I would just mention that offhand.
Richie Rump: Yeah, and I don’t think there’s anything to worry about having SQL Server control the statistics creation or anything like that.
Erik Darling: Nope.
Richie Rump: I think that only a handful of times in my career I’ve actually had to create statistics on a column or multiple columns or something.
Erik Darling: Yeah, and it’s always some weird kind of statistics, like multi-column or filtered. It’s never been like, “Oh, I just need a statistic.” Unless it was like something weird with an index view where it was not doing its job.
Richie Rump: Yeah, that’s exactly right.
Should I turn on remote admin connections?
Erik Darling: “SQL 2012 sp_configure remote admin connections 1. Do you recommend having this on by default?” Yes, absolutely. That is the remote DAC. That is your VIP entrance, your lifeline to SQL Server. If you ever can’t connect via more traditional methods like Management Studio or RDP or the application. Go in there, run sp_whoisactive a whole bunch of times and see what’s going on. So, yeah, turn that thing on. Unless you have some crazy security concern.
Richie Rump: We should have like different names for all these different features. Like, that’s the “velvet rope.”
Erik Darling: Yeah, right? It is the “bouncer.”
What should I do about MAXDOP 1 problems?
Erik Darling: David asks, “I have taken over for a server.” Well, there’s your problem. You’re not a server. You’re a human being, you silly goose. “I see that MAXDOP has been set to one. I have tried anything over one and the server just locks up. What would be your first move to troubleshoot this? Cost threshold is set to 100.” Geez Louise. Sounds like you have some CPU issues over there. See, that’s one of those things where I would need to know a lot more, like how many CPUs and cores you actually have in there and what kind of application you have in there. Is that SharePoint?
Richie Rump: That’s it. What’s your app? That’s the big question.
Erik Darling: If it’s SharePoint or one of those other dingdong applications, you have to leave MAXDOP set to one or else they just go all … on you. So I’d just be careful with that. Setting cost threshold to 100 doesn’t matter if you have MAXDOP set to one. A query could cost a billion query bucks and it wouldn’t go parallel.
Richie Rump: I do have a special name for SharePoint at that’s SharePoint-less because that’s what I feel like when I deal with it.
Erik Darling: My daughter would call it SharePoop. So that’s that.
Can you blog about Kerberos double hops?
Erik Darling: Brian asks, “How often have you had to deal with Kerberos authentication issues?” Double hop stuff. No? Okay. Yes, please blog about it and shoot us or me an email when you have completed that blog because if I find it to be of value, I will send people to it constantly if they ask me other Kerberos authentication questions. I’ll send them there blindly too. Like all I have to do is here the word “Kerberos” and I’ll go, “Oh, this blog post…” Same way I do with most other things. “Oh, this is blog post…”
Richie Rump: So here’s a nice comment. “I should not have done shrooms before this webinar.”
Erik Darling: Me either.
Richie Rump: I’m just wearing one.
Erik Darling: Richie looks like stained glass. It’s weird. I keep thinking it’s my glasses.
What do people think about your tattoos?
Erik Darling: Clark asks a non-SQL Server question. “When you tell people what you do for a living and you say it’s a nerd-related activity, do they do a triple take with my tattoos?” No. Because usually the only people who ask me what I do for a living are doctors and I’m already in their office because I have nerd sinuses so they are not surprised at all. When they find out I work with computers they’re like, “Oh, geek. Good job.” It’s like, “Yes, you’re here because you have a double XL nose that you somehow can’t breathe out of.”[Richie Laughs]
Erik Darling: A vacuum cleaner attached to your face and you can’t breathe for some reason. So, there we go.
Richie Rump: What does he look like he does? I mean seriously.
Erik Darling: I dress like a bouncer so I get it.
Richie Rump: “I drive a subway train, man, you know?” It’s like the worst Uber driver ever. It’s like you don’t want to get into Erik’s car.
Erik Darling: No. I wouldn’t because I don’t have a license. Whatever car I’m in is stolen. It’s not a good sign for you.
Richie Rump: [Laughs] I’m so sorry. I’ve gone totally off the rails here.
Erik Darling: That’s all right. There’s not that many questions anyway. Just kidding.
How should I open large flat files?
Erik Darling: J.H. asks if we know of any free tools to download to view large flat file tables, like greater than 8 gigs since Notepad and other common ones are unable to open. What do you use, Richie?
Richie Rump: It’s been such a long time since I had to do that. Umm.
Erik Darling: I’m going to throw out something weird.
Richie Rump: I can’t think of anything off the top of my head. There was a couple special tools that I needed because I was doing large data loads and things like that but I can’t think of them off the top of my head because I haven’t had to do things like that in such a long time because, well, we live in 2017 and these large files, they don’t exist. We just have, you know, a billion of them now as opposed to having one large file.
Erik Darling: Sometimes. This guy apparently, J.H., doesn’t. So, J.H., what I would do, and I only know this because I have opened the Stack Overflow database with it, is a hex editor called HxD can open up large flat files. You’ll have to deal with like the hex stuff in there as well but you’ll be able to look at other stuff. It’s free to download as long as you’re on Windows I guess. I never tried to download it for a Mac.
Richie Rump: Yeah, that’s my hex editor of choice actually.
Erik Darling: Yeah, it’s a nice one. I like that one. I just like the logo for it.
Will hardware ever be able to beat bad code?
Erik Darling: Let’s go on down here. Guillermo Garcia asks, “Will hardware technology ever get fast enough that SQL performance issues are no longer a thing even with bad code, bad indexing, and everything else?” No. The thing about hardware is that it just responds. So you could write a terrible query and the hardware will respond to that terrible query. That terrible query will use so much of that hardware that other things will just start going wonky. The other thing is that data just keeps getting bigger too. So you know, at some point, sure, if you have like a tiny little, dingy little tables, a little 8-gig database or something, you could probably stick that on a S7 or iPhone and get pretty okay performance out of it. But, you know, if you have genuinely large data workloads, hardware is never going to make it so that you don’t have to do any work. But, interestingly, Microsoft is working on some stuff that was, they started talking about it at PASS. One of those things was adaptive query plans where the optimize will start responding to feedback from query execution and change things in flight. Oracle has been doing it for years, which is awesome. Oracle has got the adaptive query thing down flat. So if you can imagine like where SQL, some parameter sniffing issue, SQL selects a crappy little nested loops join plan and then halfway through they realize that there’s about five million more rows in play than they estimated, and it will switch that over to a merge or a hash join or something else more appropriate.
Richie Rump: So it’s no longer going to say, “Hey, you have a trillion rows in this query” when you only have like 100?
Erik Darling: No, it’s still going to say that. It’s just in flight it’s going to say, “Oh, we made a mistake. We’ll do something else instead. We don’t need a 250 gig memory grant. We can just take a couple gigs and do that.”
Richie Rump: Give it all the memory. Give it it all.
Erik Darling: Yeah. I mean you can certainly cover up a lot of developer sins and indexing laziness and bad code crap with hardware but it’s never a replacement. It’s just a band-aid or bandage. What’s the worst code you’ve ever seen written?
Richie Rump: Who, me?
Erik Darling: Yeah, what’s like the crappiest code you’ve ever seen?
Richie Rump: Usually they’re written by DBAs. I can’t think of anything off the top of my head but I’ve seen some pretty crappy stuff.
Erik Darling: Like queries or like tasks?
Richie Rump: Queries.
Erik Darling: Okay.
Richie Rump: Tasks are pretty simple, right? It’s usually because DBAs aren’t used to diving into data.
Erik Darling: True.
Richie Rump: At least a lot of the ones that I’ve dealt with because they’re DBAs, right?
Erik Darling: Right.
Richie Rump: Hardware and backups and all this other stuff. But, when you’re digging in really deep into this stuff, especially large datasets, things just kind of blow up.
Erik Darling: I’m with you. I’ve met several DBAs who can barely write a select query. Like they’ll get the order of like from and where and group by wrong regularly. Like they’ll just mess that up. But, you know, they can do all sorts of cool stuff with their high availability and backups and all that but when it comes to a simple query they’re like, “Where do you join?” I’m like…
How should I learn PowerShell?
Richie Rump: All right. You got the next question because for some reason I have to let my wife in the house.
Erik Darling: Oh, okay. Well, I’m not going to wait for you then. “You may have answered this question before, how do you learn PowerShell on self-pace, like books, videos you recommend?” I don’t know. I never really learned PowerShell all that well. I messed with it a little bit to get familiar enough to do a few things but I didn’t like learn PowerShell. Mike Fal, F-A-L, is a great blogger. He writes a lot about PowerShell. If you want to look at some cool PowerShell code, PowerShell DBA Tools has some good—well I guess good PowerShell code there. I don’t know if you’d call it good regular code. Then, I want to say that a while back there was a learn PowerShell in a month of lunches, or something like that, that had a pretty good explanation of stuff. Also, there’s like some PowerShell newsletter that I used to get that would send out like pdfs with all different PowerShell topics on it. Anyway, Richie is back. So I can stop with my fluff answers.
Did you have clients affected by the Microsoft account outage?
Erik Darling: Michael asks, “Did either of you have any clients affected by the Microsoft account outage?” No. I didn’t.
Richie Rump: Oh, yeah, that’s…
Erik Darling: It happened. I don’t think we had anyone affected by it, at least no one that bothered us.
Richie Rump: But, but, we were affected with the S3 outage.
Erik Darling: That’s true.
Richie Rump: That wasn’t fun. Oh, and by the way, as soon as it went down, I mean as soon as it went down, I saw right away because things started failing. Then I had all these other people that I know started messaging me, saying, “Is S3 down? Is S3 down?” I’m like, am I Amazon?
Erik Darling: You’re the website. You’re “Is S3 down Richie.”
Richie Rump: What? But Paste the Plan never went down. Never went down. Amazing.
Is my high disk queue length a problem?
Erik Darling: Eric asks—not me, some guy who spells Erik wrong—“My SAN admin said he is seeing high disk queue length and he is blaming SQL Servers. Can I trust my perfmon counters, etcetera etcetera.” I would ask him what problem high disk queue length is causing for him other than it being on a graph somewhere. I’d want to know what about disk queue length he thinks was an issue for him and what’s going on with that.
How should I store BLOBs in the database?
Erik Darling: Guillermo asks another good question. Here we go. “What’s your strategy dealing with databases that store BLOBs? Do you move BLOB data files to a storage different than your regular data files are? I feel like they just waste disk space which is optimized for regular data pages.” I feel like you’re right. Richie, do you have any input on the BLOB question?
Richie Rump: Yeah, somewhere in the mid-90s I think I put BLOBs in databases and learned that was not very good idea. They’re there. They take up space. You’re absolutely right.
Erik Darling: But we need to edit pdfs in SQL. [Laughs]
Richie Rump: I know. And then getting them in and out becomes a problem as well. So definitely store them on disk and then access them that way. Actually, I did a test not so long ago about putting BLOBs in DynamoDB, which ended up working. It worked fine.
Erik Darling: Really.
Richie Rump: But then you pay for the storage costs associated with DynamoDB. I’m like, “Do I want to pay more for that storage or should I just pay less and put it in S3?”
Erik Darling: That’s a very good question.
Richie Rump: So we just put a link to S3 there. We read that, go get it, and Paste the Plan gets that stuff from S3.
Erik Darling: That’s a good idea. There’s a post that Jeremiah wrote when he was still hanging out with us about getting BLOBs out of the database that I’ve always found pretty good about that. My take on it is that you should just store a path to the file in the database and then get that file some other way.
Richie Rump: Yep.
Erik Darling: Don’t make SQL Server store your GIFs.
Richie Rump: If you’re really paranoid about, “What about my file? What if someone deletes it?” Then right a process every night that reads it all and just figures it out if it’s there or not. If you’re paranoid. Some of us aren’t that paranoid. Who cares if it’s there or not.
Erik Darling: Especially about BLOBs. Like unless you’re dealing with like really sensitive financial or government or like you know other documents that you need to have complete consistency on.
How much space do index rebuilds need?
Erik Darling: J.H. asks another banger of a question. “When preparing to allocate enough space for a reindexing job is it correct that it will need the same size space as its current index used size?” Are you talking about the data file or the log file? The data file I just store it in tempdb anyway. Then for the log file, I usually say about one and a half times the size of your largest index. That’s only because you need to accommodate that largest index and you might have other stuff going on at the same time, so I’d just give myself a little bit of slack on that.
Does CHECKDB slow down query activity?
Erik Darling: “Does CHECKDB affect performance on a high-volume database?” It affects performance on just about any database. It’s an intense task, even if you do physical only. It takes up CPU and memory and it reads everything from disk and it does a whole bunch of crap. Yes, so the answer is, yes, absolutely. This is why a lot of people choose to offload CHECKDB so they don’t have to deal with it affecting their transactional data.
How much memory does my SQL Server need?
Richie Rump: I see a good one somewhere there.
Erik Darling: Yeah? Which one? You go ahead and read one.
Richie Rump: I will read one. But that’s it, Erik. I’ve had enough.
Erik Darling: All right.
Richie Rump: “VM architect won’t give me more than 16 gigabytes of RAM. Our db is over 150 gigabytes, four core, 2014 Standard. What do you say of that?” You know what has more RAM? My phone! You want to put SQL Server on this?
Erik Darling: I mean my laptop has 32 gigs of RAM in it. The desktop that sits at my feet like an obedient dog has 128 gigs of RAM in it. Your VM architect—I don’t know where he got that title from. He won’t give you more than 16 gigs.
Richie Rump: My tablet has 16 gigs of RAM. My tablet.
Erik Darling: Yeah, so I would just avoid him at all costs. He doesn’t sound like a terrifically bright individual.
Richie Rump: And it’s really not a question of going physical. It’s a question of this guy needs to get a clue about what SQL Server needs and how it performs. It sounds like you’re going to need to do a little bit of education on him, or his superior, to get the RAM that you need. Some basic education needs to happen to understand that everything SQL Server needs it needs to put in memory first. And every time you have too little, then it’s doing a lot of this swapping all over the place and that slows everything down.
Erik Darling: Brent has a great post called “The Sysadmin’s Guide to SQL Server Memory.” I’ll grab the link on that but that’s where I would point him to. So even though he is a VM architect, even though he I guess went to art school or something, and learned how to be an architect, he might be part sysadmin underneath so I would send him to that link and say, “Hey, before I have to start punching you, please give me more memory.” That’s fun.
Are page splits causing my performance issues?
Erik Darling: All right. We have a few minutes left, let’s see here. “Are page splits causing performance issues?” No. No, they’re not.
Can I claim my conference expenses on a tax return?
Erik Darling: “I know this is not a financial advising session. My company is not paying for training or conferences. I am willing to pay out of pocket. If I pay myself, can I claim my expenses on a tax return?” I don’t know. I would just get a new job. I think it would be less painful to get a new job than to deal with the IRS on that.
Richie Rump: You could. I think you can claim that. I’m not sure if you need to create your own organization or not because I always kind of had my side thing going down, so everything was kind of funneled through that if I paid for anything. So talk to an accountant. Unfortunately, we don’t do accounting talk here because we’re not accountants. But your tax professional will have an answer for you.
Does an explicit transaction lock differently?
Erik Darling: Michael asks—Michael tried to sneak this one in via email and since he’s a diligent attendee I’ll let him get in the last question on this one. He wants to know, “We’re trying to debug a deadlock that happened on a couple delete queries,” then got off on a tangent about if there’s a difference about the way locks escalate if you declare an explicit transaction within your stored proc or whether the default auto commit works the same way. “Does an explicit transaction have a higher lock mode? Does it do anything other than give you a mechanism to handle the role back?” Not in my experience. How about you, Richie?
Richie Rump: Not that I know of.
Erik Darling: Yeah.
Richie Rump: I really haven’t messed with it that much. I’ll plead ignorance here.
Erik Darling: Yeah, so, I guess what I’ve observed, I wouldn’t take this as biblical truth. This would be a pretty good question for Stack Exchange but I’ll just give you a little bit of what I’ve observed, is that, no, the only thing that declaring a transaction buys you, especially if it’s a named transaction is the ability to a) see in the transaction log where stuff happened if you name it and b) you can do the commit or rollback or do some error handling within it. Other than that though, I’ve never seen a lock escalation or lock duration difference unless someone forgets to commit or rollback a transaction. In which case, all heck could break lose. As long as you put the begin tran and the rollback are committed in the same place that it would happen automatically. I don’t think that would have a different outcome in it.
Do you like Transparent Data Encryption?
Richie Rump: I think we could go one more because we started late.
Erik Darling: Yeah? Okay.
Richie Rump: I have one last one.
Erik Darling: Okay.
Richie Rump: It will probably fit right within the one minute we have left. “We have several clients asking about encrypted data at rest. Any experience/comments on transparent data encryption or any other approaches you have seen success with?”
Erik Darling: Are you asking me? Or did you have a good answer for that?
Richie Rump: No, I’m asking you, dude. You’re the database expert.
Erik Darling: I am. Oh my god, I always forget. So TDE, it’s interesting, but it only encrypts data at rest. It does not encrypt data at flight. So it takes a whole lot of software engineering work to have end-to-end encrypted data. It also, prior to 2016, breaks a few cool things like instant file initialization and whatnot. So, that’s what you get out of TDE.
Richie Rump: TDE does do the encryption. I’ve used TDE, not from an admin perspective. It works but there are gotchas around the certificate and the encryption keys and all this other stuff. When you do restores, before you implement it, you probably want to do some test restores because that becomes more complex as well.
Erik Darling: Any high availability, mirroring, log shipping, availability groups, those all get weird with TDE. You better make all sorts of concessions. Well not concessions, you just have to plan for it.
Richie Rump: That’s exactly right. I’ve seen it work. It adds additional overhead to your CPU usage as well.
Erik Darling: Takes a while to happen depending on the size of the database.
Richie Rump: Yeah, and I’ve seen large databases have it, multi-terabyte databases have TDE there, but you need to do your own testing on it. Your situation is unique. Your experiences are unique. I would say go in to a test environment. Start running some tests, see the gotchas. Do your restores. Run some load tests. Do some perf on it. See if it will work in your environment.
Erik Darling: Yeah. Sounds good to me. Also, you know, just be aware if you’re on a version of SQL prior to 2016 and you’re compressing your backups, that’s going to stop working too. So all sorts of fun stuff in there. You can still like do row or page compression if you’re still into doing that.
Richie Rump: Yeah, but it is important because when you do your backups, that data is not encrypted.
Erik Darling: I mean, they do have encrypted backups.
Richie Rump: Yes.
Erik Darling: Which is …
Richie Rump: How many of us are actually doing that, really?
Erik Darling: Zero percent. I did a demo of it once for a client because they wanted and you can do it with Ola Hallengren scripts. I’m sure you could do it with MinionWare too, I’ve just never tried. But just to prove it worked I was like oh, I’ll do this. I hated every minute of it.
Richie Rump: Yeah, there’s a lot of hoops that you have to jump through through TDE, but if it is a requirement at your location then it’s in the box. Might as well use it. So there’s some benefits that are there but try it on your own. Start to get comfortable with it because there are some things there that are a little bit wonky, some things you thought worked kind of don’t work the same way when you turn on TDE.
Erik Darling: All right. I think we did a good job of making up for the fact we started a few minutes late.
Richie Rump: I think so too. Although we don’t look as pretty as Brent I think the two of us together equal like a quarter Brent.
Erik Darling: Maybe. Maybe if you like got on my shoulders.
Richie Rump: Yes, yes. Maybe you could have the coffee and I’ll have the Kind bar and we’ll be good.
Erik Darling: You know what, who gets the wine?
Richie Rump: We both get the wine.
Erik Darling: Okay. Oh wait, the coffee and wine combo.
Richie Rump: Coffee and wine combo but the question is are we splitting the wine case in the middle or is someone going to get a different bottle or whatever?
Erik Darling: You get the part of the bottle that has the cork in it and I’ll just take the part that has the wine.
Richie Rump: Well I would give you a couple extra bottles because I want to keep your wife happy because I know she’ll probably drink most of it.
Erik Darling: Yeah, usually. I mean, she has to put up with me, so all alcohol is welcome. All right, anyway. Thank you all for attending and listening to me and Richie babble incessantly. I’m going to go refill my coffee and try to get the rest of this optimum nutrition off my shirt. Thanks, guys and gals. Catch you next week.