This week, Brent and Richie discuss sp_BlitzBackups, change history tracking, exams and certifications, using Azure for DR-only, which version of SQL Server to use for a new DB, adding additional data files, load balancing, trace flag 2371, dealing with Reporting Services, and much more.
Enjoy the Podcast?
Office Hours Webcast – 2017-05-31
Brent Ozar: Wesley says, “Does Brent Ozar Unlimited sell Brent Ozar Unlimited coffee mugs? I need people to know who my DBAs are.” We do, but the thing is, they cost $4,000 and they come with a free training course. So you just have to come to one of our in-person classes and that’s when we give them away.
Richie Rump: Yeah, I tell people it cost me my soul, so, there you go…
Brent Ozar: But you got like half a dozen of them, so…
Richie Rump: I got four, I got four for my soul. And I’m good with that, right, because I got one for each of us here, you know, I don’t need any more, there’s always one available, we’re good.
Brent Ozar: That’s my entire – I have an office bathroom and the entire bathroom is completely full of swag, you know, just stuff in boxes that we… Unbelievable.
Richie Rump: I have to share – my office bathroom is right that way, I’m looking at it right now. It hasn’t been updated since 1969, and I told the girls, I am not fixing this bathroom until you learn how to take care of the bathroom because they just throw stuff everywhere. They’re girls, you know, it’s already messed up, why not. I’m like, well you’re going to have to deal with it, not me, so hey, have fun.
Brent Ozar: Don says, “you should make the coffee cups available to those of us who purchased the online training.” The problem is taxes. As soon as we deal with taxes and shipping, things get really, really ugly.
Richie Rump: So here’s the thing, don’t call it a coffee cup, this is tea. This is Earl Grey, it’s not coffee.
Brent Ozar: Earl Grey hot, I would assume?
Richie Rump: Tea, Earl Grey hot, this is loose leaf, I think it’s Teavana that I’m drinking today.
Brent Ozar: Ooh very nice, yes.
Richie Rump: Yeah, I bought a big old bag of that sucker.
Brent Ozar: Let’s see here, we’ve got all kinds of fun questions. Clark says, “in SQL Server vNext, which is officially called 2017 now, have you guys used Python and can it be used for SSIS type things?” Well, Richie, I would ask you. So if you were going to do like data sciencey work, Python, anything, SSIS to get data in and out, would you run it inside of SQL Server? Or what would you do?
Richie Rump: I’d try it. I mean, if my data’s already there, I mean, it’s already stood up, why not, but typically we wouldn’t do that in SQL Server because we haven’t had the tools to do it. I guess, now that Python and R are in it, maybe we could do some more stuff. And frankly, I doubt that SSIS will have that. Think of SSIS as a completely different product. It’s under the SQL Server banner but it’s something completely different than the engine stuff. So I’m assuming you can’t, but hey, you’ve got C# in there, so you know, have at it, buddy.
Brent Ozar: James says, “sp_BlitzBackups is awesome, I love it because I can show my boss the risks the business is taking.” That’s exactly why we wrote it. We have a project where the client’s building availability groups up in the cloud and we just want to be able to monitor how long it’s going to take to restore those databases. It’s tons of fun if you guys haven’t seen sp_BlitzBackups; Erik put a ton of work into that. And he has another brand new one coming out next week, sp_BlitzQueryStore, where he analyzes your SQL Server 2016’s throughput, like looks for your worst-performing period of time in the last seven days and then finds the queries that were causing that workload problem, all by just running a stored procedure; it’s amazing.
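For reference, sp_BlitzBackups is a single call; the @HoursBack parameter below controls how far back it analyzes (168 hours, a week, is the kind of window Brent describes), assuming you have the First Responder Kit procedures installed:

```sql
-- Estimate RPO/RTO risk from backup history over the last 7 days.
-- Requires the First Responder Kit's sp_BlitzBackups to be installed.
EXEC dbo.sp_BlitzBackups @HoursBack = 168;
```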
Brent Ozar: Wes Palmer asks, “I’m trying to log DML actions and throw them into an audit table using triggers. Is there a better way to log those instead of using triggers?” Have you had to do any of that work, like history tracking work of whenever people make changes?
Richie Rump: Yes, and when I had to go and redo it I created history tables for it. So as we loaded the data, the schema automatically handled all that. So we had from and through dates pretty much through everything, so I didn’t have to sit there and throw rows out or anything like that. I guess, if you can’t redesign your whole entire application, I guess you can go ahead and use some of that new-fandangled stuff that they use for odd history stuff. I haven’t used it yet but…
Brent Ozar: [crosstalk] tables…
Richie Rump: Yeah, that’s an option now.
Brent Ozar: Damn, see I forgot all about that. Yeah, before temporal tables, either you had to change the app or else you used triggers, but temporal tables, that’s a great point; which are in every edition of SQL Server now. I mean, everything’s included in Standard Edition since 2016 Service Pack 1. So that’s a really cool idea.
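For reference, here’s a minimal temporal table sketch; the table and column names are hypothetical, and it needs SQL Server 2016 or later:

```sql
-- A system-versioned (temporal) table: SQL Server maintains the
-- history table automatically on every UPDATE and DELETE, so no
-- triggers are needed. Names here are placeholders.
CREATE TABLE dbo.Customers
(
    CustomerId   INT NOT NULL PRIMARY KEY CLUSTERED,
    CustomerName NVARCHAR(100) NOT NULL,
    ValidFrom DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL,
    ValidTo   DATETIME2 GENERATED ALWAYS AS ROW END NOT NULL,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.CustomersHistory));

-- Point-in-time query against the live table plus history:
SELECT CustomerId, CustomerName
FROM dbo.Customers
FOR SYSTEM_TIME AS OF '2017-05-01T00:00:00';
```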
Richie Rump: Yeah, it’s funny me talking about SQL Server knowing that I haven’t really done much with it, you know, in the past year or so. And it’s like hey, let’s talk about Postgres, alright, let’s do that.
Brent Ozar: As you’ve worked with Postgres, what has been your most interesting difference between that and SQL Server?
Richie Rump: You know, I haven’t gotten into it too deeply yet, and it’s really the programming model of it, the PL/pgSQL is different, as well as, you know – I’m doing a lot of, hey this is SQL Server and then how we would move that data over into Postgres and then how we would transform the data types and do all that other stuff. There’s a lot going on in Postgres and sometimes there’s not a one-to-one mapping, you’ve kind of got to figure it out, but there’s a lot going on from a programmatic side in Postgres that you just kind of have to rethink yourself. And from a SQL Server perspective, it was like, oh, okay no that’s good, I like that. I’d love to start doing some more perf testing on the Postgres side to really start seeing how it would work, but I’m not there yet, you know, we’re in our infancy of taking a really hard look at Postgres and Postgres in the cloud and seeing what perf is and all that fun stuff.
Brent Ozar: Getting rows in and taking rows out.
Richie Rump: Yeah, we’re just trying to get rows in, we’ll figure that out.
Brent Ozar: Expert level.
Brent Ozar: Let’s see here, John says, “we have high tempdb average write stalls of about 1.4 seconds. I recently changed tempdb from one file to multiple files; when it was one file it was lower.” Well, really think of this as random versus sequential activity. The more files that you add, the more random your drive access can become. A couple of things I would think about: first, make sure you’re on a recent patch level for 2012 or 2014. There were improvements in there where SQL Server’s less eager to write to disk. The next thing I would look at is getting as much RAM as you can on the box; the storage, whether it’s 600 milliseconds or 1.4 seconds, sucks, I mean that’s bad either way. So if you can keep it up in RAM you’re going to be able to avoid that. It doesn’t fix everything with tempdb latency, but it’s just going to help, and I say that – so like if you’re at 64GB of RAM, now’s the time to start looking higher. And then also local solid state; local solid state makes this problem much less of a big deal. With Intel’s PCI Express cards these days, 2TB is 1,000 bucks, so it’s much more cost-effective than it has been in the past.
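If you want to measure those write stalls yourself, here’s a sketch of querying the virtual file stats DMV for tempdb; the numbers are cumulative since the last restart, so sample twice and diff them for a current picture:

```sql
-- Average write stall per tempdb data file, in milliseconds.
SELECT f.name,
       f.physical_name,
       s.num_of_writes,
       CASE WHEN s.num_of_writes = 0 THEN 0
            ELSE s.io_stall_write_ms / s.num_of_writes
       END AS avg_write_stall_ms
FROM sys.dm_io_virtual_file_stats(DB_ID('tempdb'), NULL) AS s
JOIN tempdb.sys.database_files AS f
  ON s.file_id = f.file_id;
```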
Brent Ozar: Wesley Crocket says that, “the MCSA in SQL Server 2016 is an Azure marketing certificate.” I felt that way about Microsoft certs for a long time, since they discontinued the master. We have this discussion every now and then; we actually lost our Microsoft partner certification because not enough of us here have SQL Server certs, or any kind of certs, because they’re just not worth anything these days. I mean, you nailed it when you said it’s a marketing certification for Azure, that’s really it: hey, we brought out this shiny hammer, show us all the ways you would use this shiny hammer. I’m like, first off, that’s not a hammer, that’s a martini glass and you need to stop banging nails with that, it’s the wrong thing.
Richie Rump: Yeah, my favorite one is when I took the 2008 exam, SQL Server 2008 exam, it was like XML all the time, the answer was always XML. And it’s like what is going on here, because, in reality, XML is kind of crappy, what is this? But it’s all about hey, this is our new feature and we want you to know it.
Brent Ozar: It’s not related to what data people do, it’s related to what people want you to know. If I designed a certification, which we’re never going to do because it’s a money-losing mess, but if we designed a certification, it would be around things like can you back up a database, can you secure the database, can you make a query go fast, can you troubleshoot an outage when it happens. But none of that is what sells certifications or moves SQL Server as a product, so they just don’t put questions in like that.
Richie Rump: Yeah, speaking of stuff like that, that’s our – my favorite part of our classes is where we say, hey here’s a query, now what would you do to make it fast, and everyone kind of looks at each other like, he’s serious, right? Okay, we’ll figure it out. And so everyone has their answers and they, well this is what I came up with, what did you – and we had this great conversation about why this index would work, why this one’s better and all that, that’s a conversation you just don’t get very much.
Brent Ozar: And then – so I’m taking it to the next level for – I just sent out a survey to like 10,000 of our email subscribers to go, alright so for a query and index tuning class, how about a class where each person gets their own VM and it’s got a workload already running on it and I’m going to give you say one hour to figure out what the problem is and how to fix it, and then you gauge how many batch requests a second you’re getting afterwards. Then we’re going to get back together and I’m going to show you on my VM what I did in the span of that same amount of time. Then you’re going to get a fresh VM and you go do it again, see if you can mimic the same stuff that I did or change what you did again. So the cloud lends all these things so much more easily now because you can do them remotely via your own desktop.
Brent Ozar: Don says, “hey team, I have to build on my question from last week…” oh come on, man, like I remember that. “I have about 30 columns and 500 to 700 rows, I’ve turned on client stats and for 300 rows it’s about 600KB of data returned. Does that support our developer’s claim that the SQL query is returning too much data?” 600KB – what I would do, especially if they have a web app, is use any tool like Chrome that will show how much their webpage render size is, and I bet their webpage is going to be, say, 2MB or 3MB. So I’d be like, I’ll tell you what, if you think 600KB of data is a lot, how about you try to get your web page down to 600KB and tell me what happens.
Richie Rump: Yeah, and I would say it’s probably a lot more than that, your images alone are probably a lot more than 2MB or 3MB, I bet you. Again, how frequently are you running this thing? Are you running this five times a second, 100 times a second, are you running this once in a while, are you running it every time that the page loads? How frequently does the data change if there’s only 500 or 700 rows? It doesn’t sound like a lot. You probably should be caching this data somewhere, and if you’re using ASP.NET they have built-in caching right there. Hey, even if you’re only caching it for ten seconds, that’s all these calls that you’re not making, which would save a ton on that 600KB going across for every call, however frequently you’re calling it. Don’t fall into the trap, cache your data, this is not a DBA thing, this is an app dev thing. So mister app developer who’s not watching right now, cache it, please. For the sanity of the rest of us, please just do it.
Brent Ozar: M.M. says, “my application is a data mart and reporting app, I’ve got several terabyte-size databases, and we’re supposed to have DR but the budget was denied for physical servers. What do you think about Azure for DR only?” The problem is typically restore time. Cloud storage’s strength is not restore speeds. So if you go to our site and click Tools up on the top, in our First Responder Kit we actually have a whitepaper on using the cloud just for disaster recovery, where we did an exercise with Google. Google Compute Engine sponsored us to do all kinds of cool tricks with log shipping, and we give you the exact instructions on how to do it. You can take these same techniques and use them in Azure; it’s just that the PowerShell commands and whatnot that we give you are directly tied into GCE’s toolkit. You can do the same thing with Azure. The problem that you’re going to run into is, watch how long it takes those databases to restore the first time you do it and then communicate that to the business. Hey, is it cool that we’re down for, say, eight hours, ten hours, or in the case of these multi-terabyte databases and Azure storage, sometimes tens of hours?
Also check out on the blog, I’ve been doing this faux PaaS, like fake platform as a service, series where we talk about building your own Azure SQL DB; one of the posts in there is about benchmarking your restore speed on Azure, so you can see, kind of, how fast that goes.
Richie Rump: Yeah, that’s my biggest problem with the cloud, your disk speeds, they’re not there…
Brent Ozar: They’re challenging.
Richie Rump: Yeah, and you’ve got to code around that as a programmer.
Brent Ozar: As a database administrator, you may have to put every database on its own server so you can get the restore speed that you want.
Richie Rump: Yes, that’s a good – I’m going to write that down.
Brent Ozar: Tammy says, “our developers need a new database…” Well, you’ve come to the right place. “We already have instances of 2014, we could stand up a new 2016 instance, which instance would you choose, or would you wait for 2017?” Man, I’d give them 2014 right now and just let them get started. Granted, there’s stuff inside 2016 that’s available in Standard Edition, but I’d be like, look, if you need a database, here you go, here’s one on 2014, or whatever you have production or development stuff in, and just code for whatever works in 2014. You’ll be fine in 2014 or 2016. If they run into performance or scalability problems, yeah, you could talk about a new version, but if you don’t need a new server, don’t stand up a new server, they’re so expensive.
Richie Rump: Yeah, and I think the last big thing we got, from a developer perspective, was 2012, with all the T-SQL enhancements that we got inside there, the additional windowing functions and all that other fun stuff. So I say, go for it, go for 2014, it’s fine. If you want to mess with 2016 in a development environment, I’d say play with it, but understand what features are in there and whether you need them, right. I mean, that’s the question, do you need the features that are going to be in there? 2017 is interesting because you’ve got some Python-y stuff and you’ve got some R, so maybe that’s something that developers can play with, but I’d kind of wait for a release to, kind of, go with that. But maybe if you want to do it in a development environment, that’s cool, but I kind of would wait until at least Microsoft has a Go Live license or something.
Brent Ozar: Which is a good point, because often we’ve seen stuff where they’ve brought out a feature and they’ve either canned it shortly thereafter or it doesn’t scale or doesn’t perform the way you want it to. So maybe don’t code directly to 2017 yet. It does bring up another point: if you’re going to stand one up on 2014, make sure to use the 2014 cardinality estimator, just so that they get the new cardinality estimator right from the get-go and they can plan their queries around that level of performance. There was something else I was going to talk about – oh, consider read committed snapshot isolation. It’s called optimistic concurrency or MVCC, and it’s a switch that’s on by default in things like Oracle and Postgres. It’s not on by default in SQL Server, and it just makes a lot of your locking problems go away. If you search for RCSI on our site, read committed snapshot isolation, we’ve got a checklist on how you go about enabling it.
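For the curious, here’s a sketch of flipping that switch; the database name is a placeholder, and the ALTER needs exclusive access, so do it in a maintenance window and read the RCSI checklist first:

```sql
-- ROLLBACK IMMEDIATE kicks out open transactions so the ALTER
-- can take exclusive access. Watch tempdb: RCSI stores row
-- versions there.
ALTER DATABASE YourDatabase
    SET READ_COMMITTED_SNAPSHOT ON
    WITH ROLLBACK IMMEDIATE;

-- Verify it took effect:
SELECT name, is_read_committed_snapshot_on
FROM sys.databases
WHERE name = N'YourDatabase';
```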
Richie Rump: Yeah, I think I even went to change my model database to turn that on [crosstalk].
Brent Ozar: Rex says, “for reporting server build out, what’s the indicator that you need to set up a build out with a load balancer versus a standalone? Is it better to overbuild or convert?” The big thing there is licensing. The more servers that you stand up, you have to pay for. So that’s the thing that I would use to guide my decisions. Ask the business, are you willing to pay for multiple SSRS boxes so that I can take one out, patch it and throw it back into the load balancing? Also, do you have a team who’s successfully used a load balancer already? Because that has a high startup cost if you don’t.
Brent Ozar: Robert says, “back to sp_BlitzBackup, can it be used for non-availability group environments?” Absolutely, you can use it for standalone servers, mirroring, replication, all kinds of stuff.
Brent Ozar: Mark says, “I’m running out of disk space where my current data file resides. What’s the best way to add another data file and how will it load balance?” I’ll tell you a secret, I’m kind of lazy, I use the GUI. I would right-click on the database in SSMS, go into properties and just add a file from there, because for stuff like this that I never do, I’m never going to remember the T-SQL. I’m still amazed every day that I remember how to build an index and rebuild it from T-SQL without, you know, looking anything up. But for something odd and unusual like adding a data file, you can just add it through the GUI. If you’re adding it to the same filegroup, like the primary filegroup, it will not load balance. SQL Server uses proportional fill, so it will focus more attention on whatever file is the emptiest, so it will become a little bit of a hotspot. If you wanted to mitigate that, after you add the file you can do a round of index rebuilds to kind of even out the space between those. It’s not perfect, but if it’s just purely a space issue, I would just go and add the other file for now and call it a day.
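If you’d rather script it than click through the GUI, the T-SQL equivalent looks like this; the database name, path, and sizes are hypothetical placeholders to adjust for your environment:

```sql
-- Add a second data file to the primary filegroup on another drive.
ALTER DATABASE YourDatabase
ADD FILE
(
    NAME = N'YourDatabase_Data2',
    FILENAME = N'E:\Data\YourDatabase_Data2.ndf',
    SIZE = 10GB,
    FILEGROWTH = 1GB
)
TO FILEGROUP [PRIMARY];
```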
Brent Ozar: Let’s see here – this is a good time to speak a little bit about how you give advice. So when someone asks me a question, I try to read as much as I can into what their question is. I know, as sure as I say this out loud, there are going to be armchair architects who are like, well what you should do is add two precisely evenly sized files and do your index rebuilds until it reaches a point of perfect equilibrium. Ladies and gentlemen, this was somebody asking, how do you even add in extra files? So this is how do I get him across the finish line as quickly as possible? His drive space problem will be gone at that point, so…
Richie Rump: I have a question from a developer perspective. At what point would you, say I have a table and I want to throw that into a particular file group as opposed to putting everything in primary?
Brent Ozar: That’s a great question. It depends on whether or not the application can be up without that table. If the app can be up without that table, like history tables are a great example, or reporting tables, I can put that in a different filegroup so that then I can just restore primary in the event of an emergency, and the application is online as soon as primary is online. The only time I start micromanaging that is when I know the database is headed toward the 1TB mark, and when you get to a terabyte or above, restores are a non-trivial operation. So I may want to – it’s going to take me two hours to restore this thing, what are the core subset of tables that I need for the app to come online, everything else can then go into a different filegroup.
Brent Ozar: Let’s see, Michael says, “we have a table with 500 million rows and we add 200,000 rows a day. We find that we have to update stats every day in order to sustain query performance. Is that the right answer, or is there something else that we can do?” SQL Server’s notoriously vulnerable to something called the ascending key problem. Think about a data warehouse where every day you load last night’s sales and then your users immediately go query last night’s sales, and it doesn’t have any good statistics about what happened yesterday. It is normal in data warehouses to have to update stats after things like data loads complete, even in high-volume OLTP environments. For a while at Stack Overflow, we were doing stats updates every single night. I’m fine with it if your maintenance window allows for it. I’ve even handed those commands to my data warehousing team and said, whenever you get done with your nightly builds, if you can, go update statistics, just so that their report queries will be as nice as possible the next morning.
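A sketch of the kind of post-load stats update being described; the table name is a placeholder, and the sampling level is a trade-off against your maintenance window:

```sql
-- Refresh statistics on the freshly loaded table. FULLSCAN is
-- thorough but can be slow on 500M rows; a sampled update may
-- fit the window better.
UPDATE STATISTICS dbo.FactSales WITH FULLSCAN;
-- or: UPDATE STATISTICS dbo.FactSales WITH SAMPLE 10 PERCENT;

-- Or sweep the whole database with default sampling:
EXEC sp_updatestats;
```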
Brent Ozar: M.M. says, “is there a trick to getting sp_Blitz’s results to copy-paste properly into Excel? Some of the longer fields like [inaudible] scripts don’t work…
Richie Rump: That’s it, bye, thank you…
Brent Ozar: Next question… Yeah, so it comes down to Excel chokes when you paste XML into it. So some of our scripts like sp_BlitzCache have an export to Excel flag. Turn export to Excel equals one, and then it just omits all of the big fields that Excel usually chokes on when you copy paste stuff in.
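For reference, that call looks like this, with the parameter name as it appears in the First Responder Kit’s sp_BlitzCache:

```sql
-- Omits the big XML columns that Excel chokes on when pasting.
EXEC dbo.sp_BlitzCache @ExportToExcel = 1;
```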
Brent Ozar: Dan Clemens wants to add trace flag 2371. See, so here’s the deal, 2371 only changes when automatic stats updates trigger, and they can still trigger at unpredictable times. When I’m dealing with stuff like a data warehouse, at the end of my loads, I want the stats updates to happen exactly then. Even if they didn’t hit the thresholds, I want them to go ahead and trigger. I don’t want them to trigger in the middle of my loads or ten minutes into my loads; I want to go kick them off after the loads finish so that I can predict the times. I’m not against 2371, if people want to use it, that’s totally okay, but when I’m managing a data warehouse, that’s just why I wouldn’t go that way. And that behavior is on by default now, starting in 2016, which is kind of cool.
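If you do decide to go the trace flag route, a quick sketch of turning it on:

```sql
-- Enable trace flag 2371 instance-wide until the next restart:
DBCC TRACEON (2371, -1);

-- To persist across restarts, add -T2371 as a SQL Server startup
-- parameter instead. On 2016 and later the behavior is on by
-- default, so the flag isn't needed there.
```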
Brent Ozar: Chuck asks, “in an availability group setting, how should I run SQL jobs on a primary after a failover and disable them on the new secondary?” I’m a fan of having a separate server for agent jobs. So if I have something that’s got business logic, running reports, processing files, I like that being in an agent server or an SSIS server that my developers control. They can have full sysadmin rights on it, they can change the packages whenever they want, and they just point to the AG listener. So if this thing fails over in the middle of a job run, my developers know when they can retry that and when they can’t; like when there’s some kind of manual intervention they need to do. I, as a DBA, I don’t know that. I don’t know which jobs are easily restartable and which ones are not. So I tend to offload that stuff into its own server that my devs have control over.
For my jobs as a database administrator – things like backups, CHECKDB, updating stats – I know that those things are idempotent, so I know when I can safely retry and when I don’t need to retry them. So I just have them all on every replica, and at the start of every job, or at the start of processing each database, they check and see, should I process this or not? Like, am I the primary for this database? For example, with backups, if you use the current maintenance plans code, Ola Hallengren’s backup scripts or Jen and Sean McCown’s Minion Backup, these things will all produce jobs that fire off and succeed all the time; they just only back up the databases that are appropriate to back up based on wherever things have failed over. Huge fan of that. So then I can use exactly the same code everywhere.
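A sketch of that “am I the primary?” check at the top of a job step; the database name is a placeholder, and sys.fn_hadr_is_primary_replica requires SQL Server 2014 or later:

```sql
-- Bail out quietly on secondaries so the job still "succeeds"
-- on every replica.
IF sys.fn_hadr_is_primary_replica(N'YourDatabase') <> 1
BEGIN
    PRINT N'Not the primary replica for YourDatabase, skipping.';
    RETURN;
END;

-- ...backup / CHECKDB / stats work for YourDatabase goes here...
```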
Brent Ozar: Doug says, “we are thinking of maybe setting up a SQL Server reporting services site outside of our DMZ or intranet to build reports for client consumption. Would that be a good idea?” I don’t do reporting services, but man, the phrase DMZ kind of makes me nervous.
Richie Rump: Yes, it does.
Brent Ozar: You work with clients that have to deliver, or folks who have to deliver reports to the outside world, and if so, how do they go about doing that?
Richie Rump: We never threw a server outside the DMZ, at least a database server; that never happened. Now, I haven’t had to do reporting outside to the internet for about ten years now, it’s been a while. So we did a lot of our own reporting on our own; back in the early days we used Crystal Reports on the web, which was a pain in the ass. We then moved to Reporting Services, and when it first came out, I mean, I think it was on 2000 [crosstalk] and we were super excited for that, and so we didn’t run Reporting Services blank. We used a web service to go and get the HTML and then we scrubbed the HTML to fit inside of our website. Now I think there’s a viewer control thing for all that kind of stuff, if you want to do that.
In general, having data out there outside your DMZ, especially reporting data where it’s all nicely rolled up and all that, it just seems like a very bad idea, especially with all the security issues that we’ve had over the past… ever, and it just keeps on getting worse. So I mean Chipotle was just nailed and anyone who used their credit card in Chipotle for the past two months or something, their credit card – and I got hit. I got an email from my bank saying we’re going to send you a new credit card. Thanks Chipotle for your malware, I appreciate that.
Brent Ozar: The burritos are great though, you have to hand it to them.
Richie Rump: They’re not Mexican enough for me, man, I’m sorry. I’ll go in there and I’ll eat it – my kids love it, but I’m just… I need some [crosstalk] sauce in there somewhere. Just somewhere, it needs to be on. But yes, I would think that’s a fireable offense, especially if that got compromised in any way, if you decided to put that outside the DMZ. So I wouldn’t recommend it, but it’s not my job, so…
Brent Ozar: Yeah, that’s my thought too, I would rather not do that, I would rather build the reports internally somewhere and then go deliver them externally. Alright, well thanks everybody for hanging out with us during this week’s Office Hours. I will see you guys in GroupBy in two days, on Friday we have the next GroupBy conference, and then see you at the next Office Hours. Adios everybody.