This week, Brent, Erik, Tara, and Richie discuss SQL Server performance monitoring, ASYNC_IO_COMPLETION, AWS SQL Server licensing and fees, rebuilding indexes, licensing model for Blitz scripts, stored procedures, using (or not using) primary key on a table, choosing ORMs and other tools, and more.
Enjoy the Podcast?
Office Hours Webcast – 2017-07-27
What should I monitor on 2,500 SQL Servers?
Brent Ozar: Christopher says, “2500 SQL Servers and I’m growing at about ten per week…” That’s kind of like my savings account right there, “And I’m managing [inaudible] – if I have to monitor 2500 SQL Servers, what performance statistics would you look at in order to find the really big problems that I would have to fix right now?” He says, “The schema across all these instances [inaudible], it’s just that the amount of data varies.” Wow, that’s an interesting question.
Tara Kizer: That’s a lot of servers, how big is the team?
Brent Ozar: I was going to ask the same thing, yeah.
Tara Kizer: I’ve worked at a company where there were, I think, five local DBAs and we had three offshore DBAs, you know, eight of us, and there were about 700 servers, I think. I imagine that these 2500 servers are small, maybe they’re virtual machines. Sounds like it’s the same database across each of these systems, but that’s still a lot of servers, there’s a lot of patching you have to do; maybe they’re not patching though. But I’d want to know about just the [crosstalk]. You know, CPU utilization, job failures, I mean just the normal set of monitoring stuff and just making sure that all that stuff is going to the team.
Brent Ozar: Oh man, he has a follow-up. He says, “The dev team is 30 people, there’s only one DBA.” But he says, “I’m more of a perf…” At least he says, “I’m more of a [inaudible] developer.
Tara Kizer: One DBA… [crosstalk] Don’t even have time to work on any of this stuff, you know. Yeah, you can start monitoring it, but who’s going to work on the alerts?
Brent Ozar: How do you even deploy performance monitoring across 2500 servers when you only have one full-time DBA? [crosstalk]… That’s true, oh, I don’t even know that I would want alerts of 2500 servers if I’ve only got 30 developers. That’s pretty tough. Man, what would I collect? I would collect wait time per hour, but I don’t know that I would go any deeper than that because, in terms of urgency, the amount of data you’re going to get is going to be staggeringly tough to slice and dice. Anything else that I would gather – yeah, really I would just go to wait time per hour. Then you set – whichever servers are having the worst wait time per hour, then go tackle those servers. But the amount of data that you would have to collect, even just building a database to house that across 2500 servers, is going to be a problem.
Erik Darling: Yeah, I’d probably want to get really clear priorities for which servers I need to focus on, from management. And that’s based just – and that’s probably going to just be based on, you know, how much money is coming in from certain people every month. So I’d probably just want to get, you know, three or five servers from my boss that they want me to focus on, focus on those and then see if any of the performance things can trickle down to other servers in the environment. For example, if you find some really slam-dunk missing indexes on your top three to five servers, it’s totally possible that when you go and you start looking at your other servers, those missing indexes, if they’re not usable now, will be in the future. So I’d probably want to, you know, get some priorities and then – tune my priorities and then see what would apply to the rest of the environment; because it’s a lot easier to collect data from, then analyze and then make meaningful changes on a handful of servers than it is to do it across all of them. So you want to do it and test it where it matters and then deploy across the rest where it would count.
Should I do DML with literals or variables?
Brent Ozar: Let’s see, J.H. says, “Which DML statement would be faster? If I do something with parameter values, or do I do it hard-coded as batches like strings?”
Tara Kizer: I don’t understand the question. What is meant by hard-coded versus parameter values? I don’t…
Brent Ozar: I bet you he’s doing stings of, like, insert into or update into and he’s just doing a list of values or if he’s doing parameters. You can run into parameter sniffing. I’m thinking specifically of update statements. You could run into parameter sniffing hitting update statements, but if you want to paste in a couple of examples, then we may be able to tell you more details.
Are multi-database joins slower than single-database joins?
Brent Ozar: Let’s see here, [Anupinder] asks, “For performance of queries that do joins across multiple databases on the same server, are they any slower or is there anything extra that I have to look at?”
Tara Kizer: I mean, if it’s on the same server, it’s just the normal stuff. Just look at the execution plan and see what’s going on there. What is the query waiting on also?
Brent Ozar: Is it any worse than – if I say four tables across four databases, is it any worse than four tables in the same database?
Tara Kizer: I haven’t seen any performance difference between that, but maybe I could be wrong, I don’t know. [crosstalk]
Erik Darling: Usual stuff like volume of data, if my index is lined up for the query that I’m running, all that stuff. You know, look for blocking across databases. That might be a little bit more challenging to not go, because imagine if a join is blocked in one database but not in all the other ones. What’s SQL going to get up to?
Is it bad to run linked server queries to localhost?
Brent Ozar: Steve comes back with a clarification, and I’m so sorry to report this answer. He says he’s “Doing a linked server from self to self.”
Tara Kizer: Sucks to be you.
Brent Ozar: Oh dude. Alright, so the problems, and I’m probably going to miss some of the problems, but I’m going to say one and [inaudible] guys know any others. Here’s the big ugly part, whenever you’re doing linked server queries out, those results are not going to be cached; not even just the results, the log data pages aren’t going to be cached. Every time, SQL Server is going to be like, whoa, I got to go do linked server query, who knows what happens over on the other side?
SQL Server can even make dumb decisions about predicate pushdown. It may bring back table results across the linked server and then do the joins to the tables. So if at any possibility, I would go try to run them without the double linked server and see what happens, see if you get a performance difference there. I’ve seen a couple of cases where it was staggering. Like SQL Server was dumping a bunch of stuff into tempdb because it didn’t have enough RAM to cache all these tables that it was bringing back from the linked server. Any other things that I would worry about there? Distributed transactions would suck. Anything else that I would worry about?
Erik Darling: I’m just sill so puzzled by the concept of it and how this is the vendor’s big idea for high availability. Like, performance hasn’t even hit my mind yet, I’m just…
Tara Kizer: I know, what article did they read that they thought that it was a good idea and it would solve [crosstalk]
Erik Darling: It’s like, no, no, no we’ll query the server from itself to itself using a link. That’s like brushing your teeth with your weak hand, like what?
Brent Ozar: The one that I saw once that – a vendor was telling me about this, he was like, “Look, we just have one linked server for production, we have another linked server for dev and another for QA, but sometimes they’re on the same server, sometimes they’re on different database servers…” I’m like, no, no, oh god no, no.
What does ASYNC_IO_COMPLETION mean?
Brent Ozar: Let’s see, Graham says, “We have an instance that’s [inaudible] thousand batch requests per second.” He says, “My expensive queries are always updates and inserts but my biggest wait type is async I/O completion. What does Async I/O completion mean?”
Erik Darling: That’s tied up into backups a lot.
Tara Kizer: Backups and [crosstalk]. It keeps coming up on client’s systems, and I’ve been saying, you know, this is the full backups or possibly differential backups. It doesn’t apply to the transaction log backups. But I suspect – I mean it’s also at checkpoints and I think that the checkpoints are notable on these client systems. So in the past, I’ve said it’s probably your backups, your backup throughput is not the greatest, but when their backup is fine, I think they are experiencing an I/O issue, even though the other I/O wait stats aren’t as significant. But I think it’s the checkpoints, you know, writing the dirty pages to disk.
Erik Darling: I’ve never seen those tied in together. That’d be interesting if it did. Just for async I/O completion, it’s always just like from the start of a backup to the end of a backup, you’re accruing that wait.
Tara Kizer: Right, so the number of average waiting time on, it doesn’t reflect the backup job time. That’s why I’ve been confused on it, so I looked up Paul Randall’s async I/O completion article and he mentions checkpoints in there. So yeah, I wish that these wait stats wouldn’t have multiple meanings, you know. Separate them out, separate checkpoints out from backups.
Does AWS have extra licensing fees for SQL Server?
Brent Ozar: J.H. says, “Does AWS SQL Server have any extra fees, like the amount of data that gets queried, high availability clustering, or are all these fees upfront with licensing?” So licensing is one part of it, you can either bring your own licensing or you can pay Amazon by the hour, or longer term if you do reserved instances. There are the normal Amazon, and all cloud providers, have egress fees for networking, like how much data you pull out of the server. Generally, in terms of queries, that’s not a big deal. But if you do backups every day and you pull those backups out of SQL Server and put them somewhere else like on-premises or in another cloud, you can run into egress charges if you do enough of that.
Do stats updates help my update queries?
Brent Ozar: Tim says, “I’ve got an ERP update that times out updating a table. Whenever I rebuild my indexes, all of a sudden these updates no longer time out. Could this be an issue with statistics?”
Tara Kizer: Statistics or just, you know, putting those indexes back, you might get a different execution plan. So it could just be a bad plan cache too. Do people still – when loading data, is it common to still drop all the indexes, load the data in and then create the indexes again?
Brent Ozar: You know, I’ve seen it happen in a couple…
Tara Kizer: Back in the day, that’s what we did, but I don’t know that that’s common anymore.
Erik Darling: Whenever I’ve had to do it, rather than go through all that, I would always just have a staging table so that I could get my fancy minimal logging and all that other garbage, dump data in as fast as I could, and then put it out to my prod tables in batches so that I didn’t have to drop all my indexes and all that other stuff.
Richie Rump: Yeah, and I guess it depends on how much data you’re loading, right? I mean, if we’re loading a gigabyte of data, I may want to drop the indexes, I don’t know.
Erik Darling: So one thing that I would be curious about is prior to SQL Server 2014, SQL would have a whole lot of trouble with execution plans if there were ascending keys involved. So if your update is only, say, hitting the most recent chunk of data and you haven’t hit a statistics update threshold yet and it’s an ascended key, well you can run into some performance trouble in that regard. And then the index rebuild would update stats at the full scan, you would have that ascended key information in the histogram now, and that might do it.
What licensing model to the Blitz scripts use?
Brent Ozar: Tim asks, “What licensing model do the Blitz scripts use?” We use the MIT license, which kind of means that you can do anything with it that you want as long as you retain the copyright notice inside the script. Like if you go hand it out to other people or you change it, you just need to include the copyright notice, but you can charge for it, you can give it to strangers on the street, whatever. And then, “Do you retain any rights to pull back data, even aggregated, for the servers that it’s on?” Because it’s the MIT license, that doesn’t say anything about what we could do with the data, however, all the sp_Blitz scripts are just plain T-SQL that send results back to you. So they don’t do inserts over the internet or push stuff back to us in any way shape or form. We did used to, a long time ago in a galaxy far, far away; we had a Windows app that could have been able to do that. It didn’t, but just in terms of any time you run a Windows app, find out where the data’s going.
Erik Darling: You know, what’s really funny is when I first started working here, there was a conversation about the Windows app. And I Googled something and someone had, like, uploaded a PDF of the findings from the Windows app to the website and it was searchable. And so I click on it and I’m looking at it and I’m like, someone put this miserable sp_Blitz output online and it was searchable. Like you could find it by searching some of the Blitz text. I was like holy cow, someone did that. Like, I thought I was the only one who ever downloaded the app.
Brent Ozar: Oh, what I’ve learned over time is, if you give people an easy way to share stuff, they will do it. One way or another – here’s my social security number, can someone tell me how to fix my credit?
Would you rather use the server name or a DNS CNAME?
Brent Ozar: Brandon says, “When you guys let apps or clients connect to a SQL Server, would you rather have them connect to just the server’s name, or would you rather connect a DNS CNAME or a SQL alias and why?”
Tara Kizer: So having done some desktop support in the past and any time the server name changes, you know, it’s painful to update all those connection strings out there in the world. So I’m a big fan of using DNS aliases, and if you’re opt using some kind of fancy feature like either database mirroring or availability groups, which has the ability to point to – you know, like availability groups point to a listener name. but if you’re not using stuff like that – I like the DNS CNAMEs. I have used SQL aliases, but that has to get deployed to every single machine, it’s not just on a SQL Server. That’s every single place that has to connect. You can deploy that easily, it’s just a registry file, you can deploy it easily via some kind of login script, and I’ve done that in the past. But DNS alias, I think, is the best solution if not using something like an availability group listener.
Brent Ozar: As someone who’s had to play around with group policy before to push out SQL alias…
Should my SSRS reports use a stored procedure?
Brent Ozar: Dan says, “When you have an SSRS report…” And this really could be for any report, “Would you rather have the data be fetched from a stored procedure or a query built into the SSRS report?”
Tara Kizer: I prefer stored procedures because if you have to make a change, you could just modify the stored procedure. You don’t have to go and fix the report, the application or whatever. You know, I’ve had situations where there was a little tiny bug in the code and it required a full release to get this thing to ploy to production, whereas if it’s in a stored procedure, I could just go fix that real quick. And yeah, I’m going to do change control and all that stuff, but it’s so much simpler. I mean, there’s benefits to using stored procedures besides that, but I don’t like just throwing the queries into the report.
Why would you NOT create a primary key on a table?
Brent Ozar: Let’s see here, Mark says, “Why would you not create a primary key on a table?” He says, “I’ve got tables with 100 million or more rows and they’ve got a unique clustered index, but they didn’t create a primary key. Why wouldn’t you create one?”
Tara Kizer: So I had a client, this is a few months ago – it’s a vendor application and there were absolutely no primary keys anywhere, and they needed to move their data to other systems, and so they could not use transactional replication because transactional replication required primary keys. So they were using – I think this client was using peer to peer replication or some other, you know, some other SQL Server replication that did not require primary keys. Now, they did have the unique indexes and unique clustered indexes, but I mean, I like constraints because they’re meaningful and they tell you about the data, whereas a unique index, yeah this portion of the data is unique, but that doesn’t mean that that is the actual constraining uniqueness of that row.
Richie Rump: I mean, I think the only time I wouldn’t put a P key on there is like if it’s a loading table or something like that. Even if it’s a sorted key, I would most likely put it on there. 99 times out of 100, 999 times out of 1000 the primary key is going to go on that production table, unless it’s some sort of loading table, which is probably in a different database, as it were.
Brent Ozar: I’ve heard people say, like, I just don’t want to enforce referential integrity in the database, I believe it’s going to slow me down. So I’ve heard people say that; I don’t agree with it, but I’ve just heard people say it.
Tara Kizer: And it’s always developers that say it.
Richie Rump: There’s this thing called “indexing,” people, that you could put onto those keys to kind of speed things up a little bit. I’ve heard – it’s a thing now, I guess, I don’t know.
Is there a best ORM out there?
Brent Ozar: So speaking of things now, Nick asks, “Is there a best ORM out there or are they all pretty much the same?”
Tara Kizer: I don’t care what tool people use. I’m just – I’m coming from the perspective fighting fires in production. I just want things to perform well and not wake me up at three in the morning so that I can do proactive administration and not have to constantly be working on production problems. A lot of the time, production problems are performance based and I’ve had a lot of bad experiences with performance problems that were due to ORM. So I do have a blog post out there that’s got a bunch of activity on it. I don’t remember what the title is but it basically is why I don’t like ORMs. It’s not really about the ORM, it’s about the performance problems that they cause. I don’t care what tools you use, just don’t cause performance problems.
Richie Rump: And I’m the opposite side of that, I kind of like ORMs, I kind of like them a lot. So I don’t think there’s really a best one out there, it’s just a tool that you and your team feels most comfortable with. I’ve been on teams that have just gone straight to stored procedures and it’s thousands of lines of code written by developers, and you can see how fast that’s probably going to run. And we go to an ORM like Entity Framework and things get a lot faster. Now, you’ll run into a lot of issues because the team or the individual doesn’t understand what the ORM does on the backend and how it creates the queries and doesn’t even look at the queries on the backend.
So when you get to more complex stuff, there’s not a lot of thought there. It’s like, oh it works in dev, then you throw it into production with all the production data and it just starts slowing everything down. The best thing you can do for any tool that you have is understand that tool. And I know – understand that it’s hard for us developers because we have lots and lots of tools that do lots of different things, but if you don’t have that understanding, an ORM is going to be bad for you because you’re going to put it in situations that are not the best situations for the tool, for the ORM.
Erik Darling: When it comes to stuff like that, where I don’t have a vast or deep knowledge of what each and every tool is best or worst at, I would just want to ask two questions; what are most people doing and what are most people that are doing what you’re doing using? So that’s how I would at least narrow product choice down. I mean, Richie, can you think of an ORM outside of Entity Framework that’s really competitive with Entity Framework?
Richie Rump: Yeah, Hibernate. And Hibernate has been around forever…
Erik Darling: Okay, so what else? Like if you had to name a top three or five…
Richie Rump: So now you go into micro ORMs like Dapper and… There’s a ton of them. So what happens is that people, you know, like their developers say Entity Framework sucks, so I’m going to go create my own flavor of Entity Framework which is just going to do a small little thing of – it doesn’t do everything entity framework does, but it just does this little thing and it does it really super well. And that’s how you get all these little ORMs going on all over the place; everyone has their own little take on it and now all of this stuff is blowing up and everybody had their own simple version. And yet, mine’s the best, mine’s the best, mine’s the best, and therefore you get these developer wars going on, which I particularly hate; there’s no need.
Brent Ozar: It’s good Richie doesn’t like conflict.
Richie Rump: Yeah, me – I’m here with my hand on the detonate button right now, you know…
Erik Darling: He doesn’t like conflict or developers, so it’s just a weird pacifist line he’s drawn in the sand…
Richie Rump: I want to kill myself, essentially, at any given time, yes.
Can I prevent developers from using app logins?
Brent Ozar: J.H. Asks a really interesting question. “Do you have any suggestions on how to prevent developers from connecting in with the app’s logins accounts into production servers?”
Tara Kizer: I’ve struggled with that and for me the answer is why do the developers have that user ID and password? Isn’t there another team that has the production access – developers, at least in my environments, developers did not get to see the connection strings in production. They could see it in other environments. At my last job, or a couple jobs ago, they were talking about having some kind of encrypted application that we would control that we would be able to put whatever password we had selected when we created the account and then it would be encrypted. So even if developers had access to be able to look at the web config or whatever file it was, they couldn’t read the value and use it.
Richie Rump: Yeah, and there’s a setting, if you’re using c# in Windows config or app.config or whatever, if you’re using .NET, you can encrypt those encryption strings so nobody can see them. And it’s actually really easy to do inside the application to encrypt it.
What’s the biggest problem you faced when becoming consultants?
Brent Ozar: Now, last question that we’ll take, and this one’s kind of tricky. This is from Graham, and Graham, we may not be the right people to answer it, but we’ll give it a shot. “What’s the biggest issue or problem that you all faced when becoming consultants? Do you have any tips on how to get plugged into the SQL Server consulting community? I know blogging helps, but what else?” Richie, you went off on your own and did the consulting thing for a while; what kind of things did you see people going off and doing to get into the consulting business?”
Richie Rump: I saw a lot of people struggling, like myself. It’s a grind, right. I mean it’s a hard thing to do. So much so that I went and got a full-time job because there’s a lot of it that I just didn’t like. So the marketing and – it’s a constant getting yourself out there, going to different conferences and meeting different people, you know, write people, getting people to say yes. There’s a lot of soft skills involved within that. The stuff that Brent does really, really well, I don’t do well at all. So hey, now you know why I’m working for him; that’s just kind of the way that works.
Brent Ozar: Yeah, the hardest part by far is getting the business, getting enough business to keep yourself busy. And it’s this weird cycle where if you don’t do enough of the marketing then you don’t bring enough business in. so people will go off and they’ll get a 40 hour a week contract, it’ll last for three months, but when the contract’s over, there’s no business coming in if you weren’t building up that funnel leading up to it. So you nailed it, Graham, with blogging. Anything, blogging, presenting, writing, doing guest posts for vendors, webcasts for vendors, you’ve got to be out continuously, beating the drum going here’s what I’m good at, when you need help with this, call me. If you don’t love doing that and if you can’t do it at least eight to 16 hours a week, go work for a consulting company. It’s so easy to go to join a consulting company. Say like ours, you’ve got me, who does some of the marketing stuff to bring business in and some of the sales stuff in, but then you don’t have the pressure of how do I bring business in. It is really, really terrifying to try to [inaudible]
Richie Rump: yeah, I would even say it’s more than just the 16 hours, right? I mean I was pouring – it was almost like a second job to me, you know. Coming home, it’s three o’clock on the morning I’m still cranking out podcasts and practicing presentations and trying to avoid writing. That was kind of the norm for three and a half years. Unlike Brent, my 40 hour a week client didn’t last three months, it actually lasted three years. So that kind of helped prolong it a bit, but I still had real trouble getting people into that funnel.
Erik Darling: I was going to say, beyond marketing, just establishing a reputation has to be tough, like a trustworthy brand and reputation has to be tough. You could really establish any kind of reputation you want, but having it be a good one to bring in business is completely different.
Richie Rump: Even if you have a bad reputation [crosstalk]
Brent Ozar: Alright, well thanks everybody for hanging out with us this week. Adios, everybody.