This week, Brent, Tara, and Richie discuss splitting large database files across multiple files, security auditing solutions, what Brent is currently teaching at the client location, SQL Server in containers, heap index maintenance, contained databases, extended events, the difference between a junior and Senior DBA, page file sizing, and more.
Office Hours Webcast – 2016-10-10
Should I split up a 200GB database?
Brent Ozar: Paul says, “One of the databases that I monitor is 200GB in size, but it only has one file. Should I go ahead and split that up across multiple files?” Tara, you’re the DBA on the call.
Tara Kizer: I mean, it’s fairly unnecessary unless you have special disks, you know, a special layout where you can add a file and some of your objects will be on that file. So if you have some faster disks and you want to move some of your heavily read tables, then move them over to another file by adding another filegroup. But the size of the database doesn’t really mean you have to have more than one data file.
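For reference, the filegroup approach Tara describes looks something like this in T-SQL – a minimal sketch, with the database name, file path, table, and index names all made up for illustration:

```sql
-- Hypothetical example: add a filegroup on faster storage, then move a
-- heavily read table onto it. All object names and paths are placeholders.
ALTER DATABASE SalesDB ADD FILEGROUP FastReads;

ALTER DATABASE SalesDB ADD FILE
(
    NAME = N'SalesDB_FastReads_1',
    FILENAME = N'F:\FastDisk\SalesDB_FastReads_1.ndf',
    SIZE = 10GB, FILEGROWTH = 1GB
) TO FILEGROUP FastReads;

-- Moving a table = rebuilding its clustered index on the new filegroup.
CREATE UNIQUE CLUSTERED INDEX PK_OrderDetails
    ON dbo.OrderDetails (OrderID, LineNumber)
    WITH (DROP_EXISTING = ON)
    ON FastReads;
```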
How should I audit SQL Server?
Brent Ozar: Jason says, “Does anyone have any suggestions on a solution for security auditing for SQL Server? Should I use in-house or third-party options?”
Tara Kizer: I have no idea. I feel like the clients that have auditing tend to have performance issues.
Brent Ozar: Yeah, what I tell people is, do you have to defend yourself in front of a lawyer and a judge? Is this going to be something where you have to legally defend yourself in court? If you do, then that guides the solution you’re going towards, because you shouldn’t have something that a DBA can turn off, or something that can lose events really easily. That rules out a lot of things that you would roll yourself, because you would be a suspect – you have to be able to stand in court and say there wasn’t a problem. So if you have to be legally defensible, I’m a huge fan of third-party appliances like Imperva or Guardium. These are big, expensive appliances that sit in between SQL Server and the rest of the network. They log every network packet and track who sees what, because it’s not just a matter of who does inserts, updates, and deletes – it’s a matter of who sees which patient’s records, for example. There is an overhead with that. It’s not usually huge, but those things are six figures apiece. Whereas when you roll your own, there’s usually all kinds of overhead, even if you’re using SQL Server’s auditing, and it’s not going to be legally defensible in those cases. So that’s the first place I would start if you have to defend yourself in front of a court. If you don’t – if you don’t care about being legally defensible – go buy IDERA’s SQL Compliance Manager. I’m not saying it’s not legally defensible, but it’s about a thousand, two thousand bucks; you install it and it just logs to a file. Sure, you could get around it, but at least that way you’ve checked the box and you can go on with your life.
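If you do end up using SQL Server’s built-in auditing that Brent mentions, the setup is a few DDL statements – a minimal sketch, with the audit names and file path as placeholders. Note the caveat above: a sysadmin can turn this off, so it may not be legally defensible on its own.

```sql
-- Create a server-level audit writing to a file target (path is a placeholder).
CREATE SERVER AUDIT ComplianceAudit
TO FILE (FILEPATH = N'D:\Audits\', MAXSIZE = 1GB, MAX_ROLLOVER_FILES = 50)
WITH (ON_FAILURE = CONTINUE);

-- Pick which action groups to capture, e.g. login successes and failures.
CREATE SERVER AUDIT SPECIFICATION LoginAuditSpec
FOR SERVER AUDIT ComplianceAudit
    ADD (FAILED_LOGIN_GROUP),
    ADD (SUCCESSFUL_LOGIN_GROUP);

ALTER SERVER AUDIT ComplianceAudit WITH (STATE = ON);
ALTER SERVER AUDIT SPECIFICATION LoginAuditSpec WITH (STATE = ON);

-- Read the captured events back out of the audit files:
SELECT event_time, action_id, server_principal_name, statement
FROM sys.fn_get_audit_file(N'D:\Audits\*.sqlaudit', DEFAULT, DEFAULT);
```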
What is Brent teaching today?
Brent Ozar: Let’s see, Michael asks what I’m teaching and doing today at the client location. We just finished a parameter sniffing class and I don’t actually know what we’re talking about next; I have to go look at the – it’s all database developers in the room today, so I’m doing a lot of T-SQL type stuff. So between Richie and me, Richie’s dancing and laughing, like, yay for database developers. But Richie is also inheriting a hot mess of SQL that I have in our current production environment…
What did Brent break this weekend?
Richie Rump: Please tell them what happened this weekend, Brent. Please, share with the class what went on this weekend.
Brent Ozar: So I wrote the queries behind PowerBI. I used PowerBI to build a little dashboard so that for SQL ConstantCare I can tell people, here’s the problem you’re having, here are the issues behind it…
Richie Rump: And it’s great. And it’s great…
Brent Ozar: Well, no. Richie’s been hearing me complain about PowerBI. It’s the only thing we have – it’s pretty cool. It’s gorgeous. It helps me get to answers quickly. But it all depends on the quality of my queries. My queries are not very good, and I was pointing at a specific replica rather than pointing at a generic reader name, like an AG listener. So when we were running both production and my PowerBI queries on the same replica, I may have caused our production servers to fail over multiple times. I’m trying to troubleshoot why the server’s having problems and then I’m like, oh, it turns out…
Richie Rump: Yeah, you know, because of the way we’ve architected it, it was really easy for us to recover those files and push them back out – nobody lost any data, nobody missed their messages. That’s the great part about working in the cloud and in this type of environment: hey, something fails, we just push it right back through and, hey, it goes and it finishes. That has been a lot of fun. It’s like, oh, it failed? Oh, no problem, just push it right back through.
Have you worked with SQL Server in containers?
Brent Ozar: Andy Leonard asks a serious question. He says, “Have y’all played with SQL Server in containers?” And we laugh, but oh man. The serious answer is no, we – no, that’s not true. I did play with them, I just stopped.
Richie Rump: I mean, is Aurora running on containers? I would assume that it is, so technically yes.
Brent Ozar: Well, it’s not SQL Server though, but…
Richie Rump: It’s not SQL Server.
Brent Ozar: It doesn’t solve a problem for us. I get that somewhere, somebody has a problem that containers solve; I just haven’t seen it yet. Well, that’s not true – continuous integration, continuous deployment, that totally makes sense, I get it. But for everybody else, for production SQL Servers, I haven’t seen the problem it solves yet – nobody’s gone, oh, we could put this in a container and it’s cool now, let’s do it. I haven’t seen that. And we work mostly on production servers, so…
Richie Rump: Okay … and let’s qualify that. For continuous integration, yes. Continuous deployment, I’m not sure. I don’t think so, because for continuous deployment, what are we going to do, deploy the database again? And what happens to the data that’s currently in production? So yes, for continuous integration, where I need to restore to a known state and all I have to do is wipe out what’s currently there and rebuild it, yeah, that’s fine. But scripts work too, just saying.
Do I need to do index maintenance on heaps?
Brent Ozar: Pablo asks a question and I’m going to reword Pablo’s question just a little. He has an application where a heap is the main table and the application constantly deletes rows in the heap. Does he need to do any kind of index maintenance on that heap? And if so, how should he do it?
Tara Kizer: Why is it a heap? That’s my question.
Brent Ozar: Because he’s not allowed to change it, but I’m with you. And why would you want to get away from a heap?
Tara Kizer: Well, I mean, heaps are great for writes, but not for reads. Clustered indexes are better for OLTP environments where read performance matters. “Not allowed to change the structure” – that’s an odd mandate, because switching from a heap to a clustered index, I wouldn’t really consider that changing it. I mean, it does change the structure behind the scenes, obviously, but who is giving that kind of mandate? But ALTER TABLE REBUILD, you know, you might need that.
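The ALTER TABLE REBUILD approach Tara mentions looks like this – a sketch with a made-up table name, including a DMV check to see whether the heap actually needs it first:

```sql
-- A heap with heavy deletes accumulates empty pages and forwarded records.
-- Check the damage first (table name is a placeholder; index_id 0 = the heap):
SELECT forwarded_record_count, avg_page_space_used_in_percent
FROM sys.dm_db_index_physical_stats
     (DB_ID(), OBJECT_ID(N'dbo.QueueTable'), 0, NULL, 'DETAILED');

-- Rebuild the heap to reclaim empty space and clean up forwarded records.
-- Careful: this rewrites the whole table and rebuilds its nonclustered
-- indexes too, so schedule it like any other heavy maintenance.
ALTER TABLE dbo.QueueTable REBUILD;
```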
Brent Ozar: That sucks.
Who should I hire for Analysis Services help?
Brent Ozar: Wilfred says, “We’re having Analysis Services problems and we know that’s not something you do, but could you recommend a consultant, preferably Europe-based, who we could contact?” Yeah, Bob Duffy of Prodata – Prodata.ie I think is the company’s name. If you search for Bob Duffy Prodata, he’s based out of Ireland and does a lot of Analysis Services stuff.
Brent Ozar: Jason follows up on the auditing question. He says, “Well, we don’t need to defend ourselves in court, but we need to get audited by a company. We’re really just checking the accounts that have access, not all of the traffic.” So it comes down to whether or not – and I would really have this discussion with your security or legal team – you would have to defend yourselves in court. Because Enron got audited by an external company, Arthur Andersen, and they passed all those audits with flying colors, but of course, you know how that goes.
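If the audit really is just “which accounts have access,” the metadata views already have the answer – a sketch of the kind of queries an auditor might ask for (these use only system views, no assumed object names):

```sql
-- Server-level logins and any server roles they belong to:
SELECT sp.name, sp.type_desc, r.name AS server_role
FROM sys.server_principals sp
LEFT JOIN sys.server_role_members rm
       ON rm.member_principal_id = sp.principal_id
LEFT JOIN sys.server_principals r
       ON r.principal_id = rm.role_principal_id
WHERE sp.type IN ('S', 'U', 'G');  -- SQL logins, Windows logins, Windows groups

-- Explicit permissions granted inside the current database:
SELECT dp.name, dp.type_desc, perm.permission_name, perm.state_desc
FROM sys.database_principals dp
JOIN sys.database_permissions perm
  ON perm.grantee_principal_id = dp.principal_id;
```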
Brent Ozar: Chris says, “I’m finally switching to Extended Events. I have a few Extended Events sessions logging to a file. Is there an easy way to view the results other than XQuery?” I don’t think any of us knows of an easier way. Maybe there’s a third-party tool – I think IDERA had an Extended Events tool, but I haven’t used it.
How do you feel about contained databases?
Brent Ozar: Oh, Ron asks, “Contained databases, good, bad, or ugly?”
Richie Rump: Yes.
Brent Ozar: Did you ever use contained databases?
Richie Rump: Nope.
Brent Ozar: Why?
Richie Rump: In fact, I do remember the specific moment when I was introduced to them. I was at a code camp and my initial thought was, that’s stupid, why would you do that? Now, if you were moving stuff up to the cloud, then it makes perfect sense. But if you’re not doing that on a regular basis, then what’s the benefit? That’s the question I usually ask about any specific technology: what am I getting out of this? And if you’re not doing any of that – if you don’t need that portability – it’s more hassle than anything else.
Brent Ozar: It’s not like databases are usually portable either. There are usually agent jobs and other stuff outside of the database – SSIS packages – that you end up having to move around. So it seemed like a really neat idea, but the execution just wasn’t finished. And like a lot of features, they just launch it, and then, when nobody picks it up, they just dump it. Ron says, “I’m looking to take an on-premises database to the cloud.” If you’re just changing an existing database to be contained, I don’t think that’s going to be enough. Usually, you have things like linked servers to worry about, and all kinds of other stuff.
Brent Ozar: Stephanie says, “For the Extended Events thing, export the Extended Events to a table and query them normally.” Ooh, I like that. You could set up an agent job – because he did say that he had a few sessions that he ran on a regular basis – to regularly import those. I kind of like that.
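Stephanie’s export-to-table approach can be done in T-SQL with the file target reader function – a sketch, with the session file path and target table name made up. You pay the XML-shredding cost once at import time, and from then on you query a plain table.

```sql
-- Read a file-target Extended Events session into a table (path and table
-- names are placeholders). Each event comes back as XML in event_data, so
-- shred out the columns you care about during the import.
SELECT
    CAST(event_data AS xml).value('(event/@timestamp)[1]',
                                  'datetime2')       AS event_time,
    CAST(event_data AS xml).value('(event/action[@name="sql_text"]/value)[1]',
                                  'nvarchar(max)')   AS sql_text
INTO dbo.MyXEResults
FROM sys.fn_xe_file_target_read_file(N'D:\XEvents\MySession*.xel',
                                     NULL, NULL, NULL);
```

Wrap that in an agent job with an INSERT instead of SELECT INTO for the recurring imports Brent describes.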
Richie Rump: You could transform them into JSON and you could just run JSON queries up against them.
Tara Kizer: That sounds fun…
Brent Ozar: Just when you thought Xquery couldn’t get any worse…
We have a valid use for containers…
Brent Ozar: Andy says, “We’re experimenting with using containers for SSIS testing.” See, that makes perfect sense. You spin up a sample ETL source database and a sample ETL target database and do ETL between the two.
Richie Rump: Yeah, I mean, that makes total sense. And even if, hey, I just want to have an SSIS database just to run things off and my source and my destinations aren’t living in that container, even that might make sense, depending on where I want to spin it up, how frequently. Is it for different clients? Things like that, I get that. I buy that. But Andy’s a smart guy, so there you go.
Brent Ozar: You’re just saying that because he lives in West Virginia, well, Virginia, but he has a gun.
Richie Rump: Virginia – if it was West Virginia, I’d be like, well maybe… No, Andy’s a smart guy. I mean, I remember walking into one of his SSIS courses and I’m like, whoa, this is some good stuff. How did you figure all this stuff out? It’s not in a manual, that’s for sure.
Brent Ozar: Because the SSIS packages I build look like a kid made them with a crayon or a marker. God, I have no idea what I’m doing.
Network issue or threadpool waits?
Brent Ozar: Thomas posted a big, long question with big old error messages. Go ahead and post that on Stack Exchange – it’s usually not a good fit to dig into in a Q&A format.
Tara Kizer: I replied to this earlier – he had asked the same question above. Most of it’s an error message about SQL Server not being available, just that generic error, and he’s saying 99% of the time it works, but other times it gets that error. I posted my blog post about THREADPOOL waits – is it a network issue or THREADPOOL? So take a look at that link. If you didn’t receive the link, search on our site for THREADPOOL with my name on it. There’s a blog post, I think from last year, on the topic.
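As a quick sanity check before blaming the network, you can ask SQL Server whether THREADPOOL waits have been happening – these queries run as-is against the wait stats DMVs:

```sql
-- THREADPOOL waits mean the instance ran out of worker threads, which
-- shows up to applications as generic connection/timeout errors.
-- Cumulative since the last restart (or wait stats clear):
SELECT wait_type, waiting_tasks_count, wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = N'THREADPOOL';

-- Happening right now? These tasks are stuck waiting for a worker thread:
SELECT session_id, wait_type, wait_duration_ms
FROM sys.dm_os_waiting_tasks
WHERE wait_type = N'THREADPOOL';
```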
What should I learn to move from junior to senior DBA?
Brent Ozar: Josh says, “I have a noob question. I’m a junior DBA. Actually, I’m the only DBA because our senior DBA retired.” He might have literally meant senior DBA given that he retired. “Wondering what kind of things I should be looking at to start moving towards being a senior DBA.” What a good question. Tara…
Richie Rump: I know one; you could get into development. That’s what you could start getting into.
Brent Ozar: Tara, what do you think is the difference between a junior DBA and a senior DBA if you were going to tell somebody to study or what to work on?
Tara Kizer: I think a junior DBA is going to be using the GUI a lot, relying on the GUI for everything. I’ll reference junior DBAs and intermediate DBAs when I’m talking about parameter sniffing with my clients – some clients are at the step where they’re either rebooting the box or restarting the service to solve performance issues that they just don’t realize are parameter sniffing issues. A senior DBA is just not going to do that kind of thing. But going from junior DBA to senior DBA, for me, it was just making sure I was at a company that had lots of technologies in place. I always had somebody with more knowledge than me helping to guide me, so it’s going to be challenging if you’re the only person there.
I would hope that if someone retired, they would go ahead and replace them – hire another senior DBA. That way, you have some guidance along the way. One of the things that helped me the most back then, going from intermediate to senior – because before that, it was just experience and time, really – was answering questions online. Looking at questions online, seeing the ones that were interesting and the ones that I didn’t understand, and trying those out myself. If there was a T-SQL type question, I’d try it out in Management Studio. And I’d look at answers from highly respected people who had really good answers, and try those solutions out on my own.
But once I started answering questions, that’s when I really started getting better, because I was testing things on my own, trying to get things to work, and learning as I went. So I would highly recommend that you get involved in the SQL Server community and either attempt to answer questions or try them out separately without posting an answer.
Richie Rump: Yeah, I mean, jump on that. I’m not a senior DBA and I’ve never pretended to be one. I’ve never been a DBA, nor will I ever be one, but I understand a lot of what a DBA is supposed to do, because I needed to understand whether they were lying a lot. And frankly, a lot of times, they were, because some of the people I used to work with were terrible. Anyway, I learned a lot of my information from the SQL Server community. So go to SQL Saturdays, and even attend the Summit, which is coming up next month, right? Am I wrong about that? A couple weeks, I think. Going to those types of events, meeting the people who are not only speaking but actually attending, hearing some of the issues that they have and some of the solutions that they’ve come up with – and even the vendors – you learn a ton of stuff.
You get your own private sessions just talking to vendors – walk up to Microsoft and say, hey, what have you got for me? And they’re like, oh, somebody wants to talk to me, and they’ll spill out everything they’ve got, and you just start asking questions and learning about the stuff they’re doing and some of the problems they’re trying to fix. I think those are the types of things that will help you build not only an understanding of the issues a DBA has, but also of the solutions you’ll reach for when you start seeing the same things.
Brent Ozar: Buck Woody has this great saying – I’m sure he didn’t coin it, but I keep hearing it from him: “Do you have one year of experience that you’ve repeated ten times in a row, or do you truly have ten years of unique experience?” And when you’re the only DBA in a shop, it’s really easy to keep repeating the same year of experience over and over again. You don’t have anybody to mentor you. That’s why I adore Tara’s approach of asking, how can I go out and see new problems that I haven’t seen before, and how do I reproduce them in my own environment? That’s probably the best junior-to-senior advice I could give somebody, because no matter how well you think you’ve mastered your own environment, it’s terrifying when you go look at other people’s.
What size should my page file be?
Brent Ozar: Shaun says, “What is the best practice on page file sizing? If I have 8GB of memory or 512GB of memory, how big should my page file be?” None of us can remember this off the top of our heads because it’s in our setup checklist. If you go get our First Responder Kit, there’s a setup checklist with a link to a page by Microsoft. Basically, you shouldn’t be hitting the page file. If you’re scratching the disk, you’re screwed for performance anyway, so you just need a page file big enough to do a mini-dump. And we link you to the Microsoft KB article that gives you the different file sizes for what it takes to do a mini-dump.
Tara Kizer: And then, if you actually call Microsoft for a problem, they’ll say you need to do a full dump, unfortunately. So it ends up being, I think, the size of RAM that’s required. We’ve had to do that – our page file might have been on the C drive and we had to allocate more space in order to get the full dump. The mini-dump, for whatever reason, didn’t have enough information. But this could have been several years ago, so maybe things have changed. We needed the size of RAM.
Brent Ozar: Yeah, I think they’re still doing it kind of as a delay tactic. Hey, why don’t you take a full dump on your server with 1TB of RAM? Okay, sure, how do you want me to get that to you? Well, erm, FedEx… Yes, Steve follows up with, “In the Windows NT days, it was 1.5 times RAM.” Yeah, those days are past, though, thank god.
How is fragmentation measured?
Brent Ozar: Mark says, “How is fragmentation actually measured? For example, if an index is 50% fragmented, does that mean that for 100 page reads, it actually had to jump around elsewhere to read rows 50 times rather than reading continuously?” No, but there are two kinds of fragmentation: internal fragmentation and external fragmentation. External fragmentation basically means that the pages are out of order on disk. That only matters if you’re reading the data from disk, because if the data is up in RAM – pop quiz, what does the R stand for in RAM? – your pages don’t have to be laid out in order. It’s random: random access memory. So there’s no such thing as defragmenting pages externally when they’re up in RAM. Internal fragmentation means you have empty space on each 8K page. That can affect the number of 8K pages you have to read, and that is a really big deal. They’re measured in different ways; to see how they’re measured and which ones matter, search for Brent Ozar Why Defragmenting Your Indexes Isn’t Helping. And I have a session over at GroupBy where I talk about why people are defragging their brains out and yet performance still sucks. Alright, well, that’s all the questions we have this week for Office Hours. Thanks, everybody, for hanging out with us, and we will see y’all next week. Stay safe out there, everybody.