This week, Tara and Richie discuss deletes on heaps, Docker, using containers in production environment, advice for new DBA, downgrading server hardware, agent jobs, setting SQL Memory Min and Max to the same value, how to prove CPU pressure to management using the Power BI report using sp_Blitz, corruption, patching, backups, sargability, and more.
Enjoy the Podcast?
Office Hours Webcast – 2018-12-5
Tara Kizer: Alright, let’s get started – and I don’t have the answer to Carlos’s question right off the bat and I can’t remember – I’m going to ask anyway, “If doing a delete on a table versus truncate table for a heap, will they behave the same for [free net] pages? Will index physical stats DMV look the same afterwards?” It’s a vendor database. He doesn’t own the schema. I can’t remember for heaps. I’m a clustered index table type person and I know a lot of the ins and outs of heaps, but that’s one I don’t have memorized. Do you remember, Richie?
Richie Rump: I don’t think there’s a huge difference between the two. If this was on a test, I’d be like, yeah I think they look the same.
Tara Kizer: I know that there’s some issue with deletes on heaps because we have an alert in BlitzIndex for it and I didn’t quite understand what it meant. And Erik has moved it as a lower-priority item further downwards. Forwarded fetches on heaps are bad. I think it has something to do with no other tables can use that space, but the heap table can reuse that space. I’m sure we have blog posts out there on it. If you run sp_BlitzIndex mode four – maybe mode zero but probably mode four – it will let you know which tables have the deletes and see that there should be a URL on that row. Click on that and you’ll learn more about deletes on heaps.
Richie Rump: Wow, I did not know that. See, even I can learn something.
Tara Kizer: I look up Blitz Index stuff, you know, day to day as part of the client engagements. It’s a row that I kind of ignore though.
Richie Rump: I’ve been looking into BlitzCache code the last couple of days.
Tara Kizer: Yeah, I’ve seen your questions in Slack. I’m like, uh-oh…
Richie Rump: Like, what is going on here, people?
Tara Kizer: That’s one I will not open. I’ve contributed a couple of changes to Blitz and I don’t think any other ones I’ve opened. But BlitzCache, that’s a hard no. There’s no way I’m looking at that.
Richie Rump: Yeah, I found one of the queries for non-SARGable stuff and I’m like, I don’t think this will ever catch anything. I’m pretty sure that this is wrong and I’m going to go the route that I would know will capture something.
Tara Kizer: Alright, Marco asks, “Some quick-start documentation for SQL DBA that they want to start on it…” oh, for Kubernetes and Dockers.
Richie Rump: You’re talking to the wrong clan here. I don’t think any of us have actually played with Kubernetes or Dockers. If I wanted to learn about Kubernetes and Dockers, not necessarily for a SQL DBA, but just as a technologist, I’m probably first go to Pluralsight because a lot of those presenters, they distill the information pretty well. And they’ve been in the technology, so if I need to learn something really quick and I didn’t want to read a whole bunch of documentation whitepaper-y type stuff, I would go to Pluralsight first. And then if there was something that the speaker was talking about that I wanted to dive into a little bit more, then I would go off to the different places on the internet and look for other people to talk about stuff. That’s how I roll.
Tara Kizer: Jeremy asks, “What are your thoughts on using containers in a production environment?” I don’t really have any thoughts. I don’t really care.
Richie Rump: It’s the same question I have for any technology in a production environment; why? What is your reasoning for using containers in a production environment? Is it for ease of setup? We have multiple servers all over the place where you just want to deploy really quickly and have that? If that’s the case then yeah, but there’s got to be a reason, not because it’s the cool new hotness and everyone’s doing it. That’s not a good reason. There needs to be a firm technical reason for why you want to have some sort of container in a production environment. So yeah, the answer to that question…
Tara Kizer: Joshua asks, “Brent Ozar Unlimited offers training on multiple paths. Other vendors do the same. Is there a path you would most recommend to a new DBA with admin focus and not development focus? Which skills would you focus on and why?
Richie Rump: Well, I would suggest you go development and not DBA because I’m a developer and that’s where I would want to go.
Tara Kizer: I mean, we do have the fundamentals of DBA course. I think those are all Erik’s modules, I’m pretty sure. Start there as far as… I don’t know…
Richie Rump: How to Think Like the Engine. Start there…
Tara Kizer: Well yeah, that’s true.
Richie Rump: How to Think Like the SQL Server Engine, start there and then wherever else Tara is going to lead you.
Tara Kizer: I don’t learn very well in the method of taking classes. When I really jumped my skill set it’s because I was being very active online in forums. And for questions I didn’t know the answer to, I would look them up and see if I could figure out the answer, and then I would play around in Management Studio and see if I could figure it out. So I learned a lot by doing and so a lot of my experience came from test environments and later production environments. Obviously, if you’re a junior DBA and you don’t have the access to production, possibly that might be harder to do. But I really get my skills from doing rather than listening to people talk about it.
Tara Kizer: Teschal asks, “We have a few tables that have foreign keys that are not trusted and I run this command, alter table with check constraint, but still see that the key is not trusted after the alter completed successfully.” That’s an interesting question. Check the query. I imagine it’s our script from our foreign key trusted post, but not sure. Not sure if that’s a bug in the script or something else went wrong. Ask that question on Stack Exchange also after checking the script.
Tara Kizer: Chuck asks, “Current server has 160GB of RAM allocated for SQL instance. We feel that the server was purchased to spend money out of the budget…” I love servers like that, “Now the server has to be replaced with limited budget. How does one go about determining how much RAM the server actually needs?” I wouldn’t be wanting to give that up. As a production DBA there’s just no way. I mean, I might be okay with having lesser cores, especially if it’s Enterprise Edition, because that’s where it really costs a lot of money for a server, the licensing aspect. But RAM, I don’t know that I’d want – it just depends on what your system needs. It’s not something that we can really answer. I’ve worked for companies that have performance load testings, and that’s something we could test in our load test environment where replaying a production load or a synthetic load there. And we could see we could reduce the RAM there – I don’t think I’ve ever done that test though because I’ve never gone down, only up. But 160GB isn’t that big of a server these days. I would look at your processors instead if it were me to reduce costs.
Richie Rump: I think that’s going to be a more valid question as more and more servers start inching up to the cloud, because once we start lowering our memory and our CPU, we start saving money. So I think it’s a valid question and really it’s up to, hey, what is your app doing? How frequently is it doing it? How do your queries look, because if you have ill-performing queries, they’re going to have more resources, more memory, more CPU, more of everything? So you know your system and you could take a look and see what it’s doing when and just monitor it and see, hey, what are the times that all this stuff is going on and do we need more or can we actually back it down a little bit?
Tara Kizer: Yeah, especially in the cloud where you can’t really just add RAM. You have to go up in size and that gives you more processors and more RAM. That’s a really valid point that you made.
Richie Rump: Yep, I make one once a day. Thank you, guys.
Tara Kizer: That’s it, you’re done for the day.
Tara Kizer: Alright, I’m going to murder this person’s name. [Shayan] asks, “A SQL agent job executes two stored procedures. Is there a default timeout setting, how to make sure it can run for 60 minutes without timeout?” There is no default setting in SQL Server. That is imposed by applications. I’m sure – I don’t know if there’s even a setting you could change in the agent jobs. Well, you can. If you wanted to, you can have it stop after a certain amount of time. But you’re asking the opposite; how to make sure it can run to completion, basically. And you don’t have to make any changes to do that. I’ve had some jobs run over 24 hours and they just keep on running. Now, those are going to be like CHECKDBs on very large databases, but you’re good to go. Just set up a job and it should not have any problems, continuing to execute at least.
Tara Kizer: Alright, Julie asks, “Setting SQL memory min and max to the same value, why would you, or not, want to do this? Corporate policy…” So I think that in some version of SQL Server, and I’ve never encountered this, but I know about two and a half years ago, I think it was Doug that mentioned, we had a client that had min and max set to the same thing. and so I learned that day that he was remembering some kind of bug where it would wipe out the plan cache. Some odd thing was happening and it was just a bug where min and max were set to the same. So I would imagine whatever version that is, that it was a bug and it has been fixed, but I don’t know that there’s any benefit to having min and max set to the same thing.
On the systems that I’ve supported where we’ve changed the min value – normally I don’t change the min value. But we have changed the min value, it’s because we were in a clustered environment and on failover we wanted to make sure that the – I’m talking about a multi-instance failover. Say we have two nodes and two instances, one on each node, active, active, and a failover occurs for whatever reason, we wanted to make sure that that instance that had to do the failover because the server crashed or whatever, it could get some amount of memory. So by setting the min, that would allow it to steal some. I don’t know that there’s any other benefit to it. SQL Server’s not going to automatically allocate that min right off the bat anyway on restart. It takes some time to build up to that. I don’t know that there’s any reason not to do it except with in-mind that there may be some bug in some version, and I could be completely murdering this anyway.
Tara Kizer: Sheila asks, “How can I prove CPU pressure to management using the PowerBI report using the Blitz procs? Performance keeps getting worse and the only difference is a query using five execution plans; third-party code.” I mean, CPU pressure is easy; you just pull up perfmon or a task manager and you can see it. The little graph there also in SQL Server for wait stats, you’re looking for SOS schedule yield waits. That’s a good indication. So using BlitzFirst. And Blitz will let you know towards the bottom of the output if you do have high waiting time on SOS schedule yield waits, but that’s the wait stat to keep in mind. As far as PowerBI, I’m not the right person to answer that. I think Brent’s really the only one that could answer that part of it. The fact that a query is using five execution plans isn’t really a problem.
I mean, certainly adhoc queries that have thousands of execution plans, that can be problematic. But one with five execution plans is not. I would be looking at other things. I would be looking at the wait stats, CPU utilization, I/O, that type of stuff. But start with the wait stats and that’s what tells you where to go next. Look at your indexes. BlitzIndex mode zero, looking for high-priority items. And in BlitzCache, I would be sorting by, so BlitzCache sort order equals, and then I would start with CPU; average CPU. Maybe look at memory grant, depending on what version of SQL Server that you’re on. I don’t know that the PowerBI stuff is sufficient to actively troubleshoot the issue.
Tara Kizer: Alright, Pam asks, “We’ve recently experienced corruption in two databases on indexes. Our developers are very fond of using the NOLOCK hint, and I’ve read in past versions there is a bug that can cause corruption. Does this still hold true for SQL Server 2016?” I would imagine it does not. I am not a fan of NOLOCK hints. Where data accuracy matters, you cannot use it, period, end of story. I’m not aware of it causing corruption in the past, but I’m sure that you’re right. Brent and Erik would probably have that type of stuff memorized. I would seriously doubt that it would still exist in 2016 or even any recent version of SQL Server because that’s pretty serious, because a lot of people are using NOLOCKs out there. I mean, I don’t know what the percentage is in the world, but I’d say it’s a pretty big percentage.
Richie Rump: Too damn high. And as a DBA going to a developer, you need to ask them, why are you using a NOLOCK? And they probably don’t understand actually what’s going on in the backend with the NOLOCK stuff. And then you could let them know, you can get wrong data and you can get duplicate data, and there’s all this other fun stuff that they probably don’t know about. And then you could start thinking, if we really need this kind of stuff, maybe we need to go to RCSI on a database and get rid of all this NOLOCK stuff.
Tara Kizer: Exactly, fix the root cause of the problem. NOLOCK should not be the turbo button in SQL Server. It still takes out locks anyway.
Richie Rump: It’s really just a misunderstanding on the developer’s side of what NOLOCK does. I’ve done it intentionally, hey this is a reason why we need to use NOLOCK, boom, boom, boom, boom. And in this particular instance, we had a heap and there was no inserts going on at the time and there was a reason why we did it and we decided to do it. But I wouldn’t do it without a specific reason. And going through that thought process of what NOLOCK does and what it buys you while the system is going on in a certain time set or whatnot.
Tara Kizer: John asks, “I need to set up regular patching of my SQL Servers. To be safe, is it just getting the good backup, or is there a best practice, tips, tricks that you would recommend?” I always know that my backups are okay. Yeah, I’ve worked in larger organizations with tons of SQL Servers, lots of monitoring, all this type of stuff, and we’re restoring our backups, we have this stuff automated for our important systems. So I never specifically run a backup before I start my patching. I mean, certainly you could do that, but when you’re talking about 16TB databases, you’re probably not going to run a backup before you start patching, unless you’ve got some kind of snapshot utility going.
But making sure your backups are valid, and then most of my systems would have log backups running every five to 15 minutes anyway. So that is a backup occurring before patching starts. But start in your test environment. You need to make sure that your system is okay with these service packs, cumulative update packages, because some service packs and CUs have caused problems with certain things. So do your testing.
Tara Kizer: Donald says, “Explain SARGability once more please.” So SARGability is a made up word in the database world. It stands for search argumentable , so SARGable. There’s even a Wikipedia page, it’s not just a SQL Server thing. This is a relational database thing. So we want our where clauses and join conditions, our search predicates to be SARGable to allow SQL Server to be able to pick an index seek, for instance, instead of a scan. When our search predicates are not SARGable, meaning that they’re non-SARGable, SQL Server may have to scan the index. If an index even exists for the column where the SARGability issue is happening, it may have to do a scan there. Now sometimes we won’t see a scan because in the search predicates – or it could be other filtering that can occur, maybe you have those things indexed – and so by the time it gets to this search predicate that is not SARGable, it has less work to do. So it’s not scanning anything. It’s just doing that work after it has already filtered down to a certain amount of rows
So an example of a non-SARGable search predicate is using RTRIM, LTRIM, cast convert. Some date things don’t have SARGability issues, but converting out to get the year, that would have a problem at times. Functions, implicit conversions is an example. Anytime you wrap a column with a function, the built-in functions, that is going to be a problem. And we’re talking about where clauses, join conditions, where your filtering is going on. I don’t care what you do in the select statements, although if it’s in a view and it’s in the select statement, that becomes a problem because your outer query may be using that column in the select as a where clause. So it gets a little tricky when you start nesting things down. But there’s lots of information out there on SARGability. You can even start on the Wikipedia page to learn some more about it. it’s a very common problem out there.
Richie Rump: And in fact, this is what I was adding to ConstantCare this week. So we’re going through all the query plans and then we’re seeing, okay, do you have any non-SARGable predicates that are in there? And I had to work backwards. I had to work form the XML and create the queries to generate that XML so we could test all this stuff out. So it’s a problem and soon ConstantCare will tell you that you have a problem and you should seek help.
Tara Kizer: Mark asks, “Can you recommend any free cloud SQL Servers for people to try their hand in cloud DB?” I don’t know if there’s anything out there, because I know even Microsoft Azure, the Microsoft MVPs get very, very limited free stuff.
Richie Rump: I think they get $150 credit a month, something like that.
Tara Kizer: Something like that, very low. And I guess a lot of the presenters use that up in, like, the first day. So they have to be very careful. Yeah, you have to pay to use the cloud and there isn’t really a test version of the cloud or production version of the cloud. You’re paying. Of course, you can get a smaller server. There are some, you can get a pretty small server.
Richie Rump: Yeah, so I mean, I think what most people do is they have their own Azure account and it’s tied to their own credit card. They’ll go ahead and spin up a database. They’ll load their data, or maybe their database is already there. They load it form a backup then go ahead and run that stuff. And then when they’re done, they’ll turn it back off. Then that way, they’re not incurring the monthly cost of Azure DB or RDS or whatever craziness that’s going on. And they also had a very low tier. And so actually, I think a lot of our costs when we do training are we put them up at a higher tier, so in the cloud they have their own SQL Server databases. I mean, that’s one of the advantages of training with us, because we give you all the environments in the labs and stuff like that and you’ve got hands-on stuff which is pretty darn cool if you ask me.
Tara Kizer: Theo asks, “We have a SQL Server that gives a memory error at random times, like failed to release continuous memory, blah, blah, blah.” Oh boy, if I was getting that error, I would be doing the hardware checks. I would probably be opening a case with Microsoft, checking with the vendor of the hardware. You might have corruption issues. Memory errors like that, as far as I know, that’s a bad one. You might have a bad module. I would definitely look into it more and hopefully you’ve got some kind of high availability setup for the system so that you can failover to another server while you work on whatever problem. It sounds like a hardware problem to me, and you might have some database corruption. Maybe you don’t. Hopefully you have DBCC CHECKDB running at times to know whether or not you have corruption.
Tara Kizer: Mark says, “If we disable constraints of a foreign key, we could insert faster, right? But if we enabled again, but not with check, then what is the process or workflow of the insert? Is scan all table to when it’s a position or match the constraint?” I mean, certainly if you enable constraints, it’s going to help things out. But the constraints are there for a reason. And having trusted constraints can help you out with your queries, your select statements, because a join elimination can occur. I had a client this week that is having issues with this because they don’t have foreign keys in place or some of them aren’t trusted. So Brent’s got a module in his performance tuning class, mastering index tuning class, where he talks about this issue. And he shows you a query in Stack Overflow database. It’s between post and post types.
And if you have the foreign key trusted between the two tables and you do a query with a join to post types, it doesn’t even have to check the post types table in the execution plan because it can eliminate that. It knows that the data is fine in that table, so it doesn’t even have to bother hitting that table. So I would do some research on the join elimination topic.
Richie Rump: Right, you know, but if we’re doing loads into a table like that to make them faster, you do your cleaning of the data first in a staging environment, and then you don’t have to worry about when I put the index back on, is all the data right?