Blog

[Video] Office Hours 2017/02/08 (With Transcriptions)

This week, Brent, Erik, Tara, and Richie discuss compatibility levels, SQL server SPN problems, Always On Availability Groups, failover clustering, capturing DML stuff, disc defragmentation, replication snapshots and reindexing, load testing, PowerShell, and whether you should pursue the path of a DBA.

Here’s the video on YouTube:

Office Hours Webcast - 2017/02/08

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Enjoy the Podcast?

Don’t miss an episode, subscribe via iTunes, Stitcher or RSS.
Leave us a review in iTunes

Office Hours Webcast – 2017-02-08

 

How do I know it’s safe to delete an index?

Brent Ozar: We’ve got a bunch of questions in here already, we might as well go ahead and get started. We’re a little early, but why not? You people have good questions in here. This is an interesting one. Roland asks, “How can I be sure of no negative effects when I’m deleting an unused index? Is there any way that the optimizer is using those stats for a better plan but not using the index itself?”

Erik Darling: The only way to do that is to test, man. There’s not a better way. This is something I think I talked about in a blogpost but I can’t remember. I don’t know if it’s out yet. When you’re adding or removing indexes, or whatever you’re doing with indexes, people always focus on the one query that they’re working with. So if they’re adding a missing index they’re like, “Oh, it makes this query better.” But they never like sort of regression test other queries or other things that go on on the server as well. So really the only way to do is to know your workload, try to know your queries. I’m not asking you to know every single index they use, but try to know which queries run on average. When you go to add or remove indexes, make sure that you can replicate your normal workload to see if there’s any regressions within it. It’s hard but that’s why you get paid a lot of money, pal.

 

Can I get different plans in different database contexts?

Brent Ozar: Kyle says, “Can you stand just one more compatibility level question?” He says, “I’m asking about compatibility levels between databases on the same server. I had a query that went totally wonky when it queried across databases. One of my databases is 2012 compat, the other was 2016.” So could you get different query plans for having databases in two different compat levels?

Erik Darling: Yeah, cardinality estimator is different.

Brent Ozar: You can have the same query run in three different databases, as long as you fully prefix your objects with like SELECT * FROM StackOverflow.dbo.Posts, you can run that in several different databases and get several different query plans. Kyle mentioned that you’re on 2016, it gets even worse. 2016 has database-level settings like MAXDOP too, and that can give you different query plans per database that you’re in.

Erik Darling: There’s forced parameterization at the database level too.

Brent Ozar: Yeah, that’s good.

 

Where should I start looking for SPN problems?

Brent Ozar: Paul says, “We often lose our SPN for SQL Servers and then they need to be rebuilt.” Oh dear god. No, you don’t have to do that. “Where should I start looking for SPN problems and is this a common problem?”

Erik Darling: It is a common problem. It’s actually one that I fixed—haha.

Brent Ozar: I’m not so sure about that.

Erik Darling: Well, you know, it feels like I fixed it. For your service account user, the thing that did the trick for me, not having to rebuild them was to give it permission to delegate Kerberos authentication, it’s one of those. That’s what always worked for me. When I restarted SQL, it would re-get the SPN without a problem.

 

Are active/active Availability Groups a good idea?

Brent Ozar: Erik says—not that Erik—but he does spell his Erik the same way. He says he had a colleague mention that he wanted to have an active active AG—love this question already—so that we could make use of unused resources. “Basically, one node handling a database and another node handling the other database. Then synchronizes them across. I have an inclination that this is something I want to avoid. Any help would be appreciated.” I don’t know if he wants help with… what’s the question in here?

Erik Darling: Should I? Will I? Tara, take it away.

Tara Kizer: There’s nothing wrong with it. I mean, you’re going to need to license both servers. You need two availability groups since they’re going to be writing on one node and then writing in a different database another node. You may want to have a third node just in case if one node can’t handle the load of both databases if you lose a node. A third node would be recommended for a scenario like this where two nodes are active. You want to have a passive node in this scenario.

Brent Ozar: What would make you choose Always On Availability Groups over failover clustering if you had the choice between those? Is there anything that would make you choose one way or the other?

Tara Kizer: For me, I like failover clustering but we also had disaster recovery requirements so we had to pair that with database mirroring. So availability groups, we were able to get rid of FCIs and database mirroring and just use one technology. So just using one technology was my goal. Plus, we got to get rid of replication for read-only reports. So got to get rid of three things for just one feature.

Brent Ozar: Plus, we fired that guy we never really liked.

 

What’s the easiest way to capture DML on one table?

Brent Ozar: J.H. says, “What’s the easiest way to capture DML stuff—inserts, updates, and deletes—for a specific table?” He heard that change tracking and change data capture doesn’t track things like username, T-SQL statements, date and time, and host and IP address. Wow, you want to capture a lot of stuff for a specific table. Have you guys ever tried to audit username, host name, IP address and all of that for inserts, updates, and deletes?

Erik Darling: Not I.

Tara Kizer: The performance overhead on this kind of stuff is just not worth it usually. Why are you trying to audit this stuff?

Brent Ozar: Oh, he has a follow up question. This is, boy, this really makes it interesting. He says, “It’s in dev and there aren’t too many DMLs, so I’m not really concerned about performance. Would triggers be good? Or something better or easier?” It’s in dev, I have an easy way to do that. You restore the database every day. Every day you restore from production. Then anybody who is complaining about their inserts, updates, and deletes, whoever starts complaining that their data is lost, that’s how you find out who did it. Dev is not where you keep your data.

Erik Darling: If it were production, for that granularity I would go with a third-party tool. I would want someone who already has it figured out. Rolling your own for that is a pain in the butt.

Tara Kizer: And for a dev environment, I might even just use a server-side trace, not much activity is occurring there. Just go ahead and trace it.

Brent Ozar: Yeah, it’s easy to do. Every now and then I meet DBAs who are like, “I want to control everything with an iron fist and not let my developers do anything.” I’m like let them go put data in. They’re trying to do test cases and I know Richie puts garbage data in the database all the time. I’m not sure he puts in any other kind of data in.

Richie Rump: Yeah, “Brent sucks. Brent sucks a lot. We don’t like Brent. Go away, Brent.” You don’t actually read the dev database so I can put whatever I want.

Brent Ozar: Yeah, then it becomes production. Next thing you know, wait a minute, our website…

 

Have you ever had disk fragmentation affect performance?

Brent Ozar: Nestor says, “Did you guys ever experience performance problems that were due to disk fragmentation aka the hard drive?”

Tara Kizer: I would never know because I don’t ever worry about disk fragmentation. So how would I even know if I’m experiencing performance problems because of it? I don’t worry about it. Is it a problem? I don’t know. Who can defragment their disks? Who’s got the downtime to do that? I’ve always supported 24 by 7 systems that have really high SLAs.

Erik Darling: You can run into some bad problems too. If you’re on a SAN and you’re thin provisioned and you run disk defragmenter, all of a sudden you are very thickly provisioned in some situations. The last time I defragmented a disk and it did something, it was like Windows 95 and it made Riven like that much faster.

Richie Rump: Yes.

Erik Darling: Like the fog looked a little bit better in Riven after I defragmented my disk, I think. But that might have been the restart. I don’t know.

Richie Rump: Images came up just a little bit faster, right?

Brent Ozar: Don’t you think that back in those days the display of the disk defragmenter thing in Windows, that was the big reason to run it. It was really cool.

Erik Darling: … changing colors, it was nice. And like all the computer fixing suites had like their own defragmenter. Norton had one and McAfee had one and they’d all do something different.

Brent Ozar: The good ole days.

Erik Darling: Fragmentation has been a Windows lie for decades.

Tara Kizer: I wonder if the only time to worry about disk fragmentation is when you’re encountering an issue with sparse file. When you have the sparse file issue with CHECKDB isn’t that compounded by the disk fragmentation?

Brent Ozar: Yeah.

Tara Kizer: There’s a workaround for that but that’s the only thing I could think of.

Brent Ozar: Yeah, and when you talk about fixing it, if you’re really fragmented that bad that CHECKDB won’t run, then you’re really talking about restoring over to another SQL Server if you want to minimize downtime. Do log shipping or database mirroring over somewhere else. Who wants to do that? That’s a hell of a way to fix fragmentation.

Erik Darling: Yeah, plus you’re probably on some ancient, awful server with like misaligned disks and whatever, you probably have Windows 2003 or something.

 

More about the active/active Availability Group concept

Brent Ozar: Erik says, “I have a follow up to the active active availability group question.” He said he would rather just leave both of these databases on the primary and not have to split them up across nodes. “My real question is are there any benefits across splitting these databases onto separate nodes or would the performance be the same if they were all on the same node?”

Tara Kizer: No, the performance is different. If an instance runs on one node, another an instance runs on the other node, you’re using separate hardware so all your DML, if it’s synchronous, it’s all two-phase commit. But it’s got different plan cache, different buffer pool. You’re utilizing the hardware of two servers.

Brent Ozar: Buffer pool is a great point, yeah.

 

Would RCSI cause a problem in simple recovery model?

Brent Ozar: William says, “What kinds of problems would you encounter if you enabled both RCSI and simple recovery model?” I would like to point out that he used the word model which is Tara’s favorite hot button.

Tara Kizer: Yay!

Brent Ozar: Wow, that’s an interesting question.

Tara Kizer: I don’t see how they’re necessarily related. RCSI doesn’t really care. Your snapshots are being stored in tempdb’s version store. So I don’t see how—full recovery, simple recovery model, who cares?

Brent Ozar: If you have a long running transaction, a BEGIN TRAN, it doesn’t matter whether you’re in simple or full recovery or whether you have RSCI turned on. That log file is going to grow.

 

How do I know if SQL Server is using a lot of CPU?

Brent Ozar: Victor says, “We had high CPU usage on a SQL Server. I don’t think it was actually the SQL Server service but it’s postmortem. How can I rule out the SQL Server was the problem to my manager?”

Tara Kizer: What kind of monitoring did you have in place to have collected the data? Did you have WhoIsActive collecting data every minute or something like that? What kind of monitoring do you have in place? If you don’t have anything in place, you can’t answer this.

Brent Ozar: I feel like an idiot for not thinking about that. I was just like immediately like, “Well, you can’t.” The other thing, if you caught it within four hours, CPU usage is in the ring buffer—but I’m sure, you didn’t wait for Office Hours. It didn’t just happen two hours ago.

 

Is Resource Governor a good way to fix slow queries?

Brent Ozar: Doug says, “I’ve been using Resource Governor to prevent poorly written queries from specific reporting servers from taking down an entire SQL instance. Is that the proper use for Resource Governor or is there a better way to fix this?”

Tara Kizer: I get worried about slowing down problematic queries because now they’re going to be running longer and taking out locks for longer. Possibly wrecking more havoc because they’re running longer.

Erik Darling: I guess if you’re only concern is that they don’t take over the server, then that’s probably a pretty valid use for it, but you know, I would expect repercussions at some point.

Brent Ozar: Often the thing people complain about is hammering the disks too and it didn’t control I/O until 2016. So if it was written really poorly, it was still just hammering the disks.

 

Does a replication snapshot try to rebuild indexes online?

Brent Ozar: Philip says, “Does a replication snapshot try to rebuild indexes online?” Philip, I don’t even recognize the words in your question. It’s a long question. Now you have me intrigued. Normally, Phillip, I would have stopped here. He says, “I ask because we get application disconnections during a replication snapshot with 2012 Standard Edition. I would want Enterprise but not my choice and it wouldn’t matter because I could reindex online. I think it might be reindexing because of a message about reindexing I saw pop up on the replication monitor while it was processing the snapshot.”

Tara Kizer: It has to create the indexes if you told it to on the subscriber. So it’s not reindexing the publisher database. The only effect that takes place on the publisher as far as something bad is when it has to do the schema locks towards the end so you could experience some downtime, depending upon how sensitive your application is while at the very end of that snapshot process when it does that. Or maybe it’s during the entire snapshot process but the snapshot usually is pretty fast, even on large databases. It’s applying the snapshot on the subscriber that takes a while.

Erik Darling: I would probably just want to stop doing index maintenance and see if you still get the same messages.

Brent Ozar: Oh. That’s true too.

Erik Darling: I would just stop doing it. If your snapshots work then…

Brent Ozar: I thought you were going to say stop doing replication.

Tara Kizer: That too.

Erik Darling: I would stop doing both in a perfect world but if one’s breaking the other then you stop doing one. If the other one works, then cool.

 

How can I tell if linked servers are slowing me down?

Brent Ozar: I like Nick’s question. Nick says, “How do I go about getting metrics on linked server usage and performance? Someone set one up without my consent and now I need some data to back up my reason to launch them into the sun.”

Tara Kizer: I would look at the wait stats and look for the OLEDB wait stat. If that’s really high up on the list and there’s a lot of waiting time on it, linked servers are usually the culprit of OLEDB but you need to figure out what queries are running.

Erik Darling: BlitzCache will actually warn you if you have expensive remote queries in it. I wrote that one, that’s the only reason I know. If Jeremiah had written it, I would have no idea. It will check your plan cache, I forget what all the rules are but it’s like if you have an expensive remote query operator in your execution plan it will throw up a warning about that.

Richie Rump: See, I question his decision to launch him into the sun. That seems expensive. You got to get that person all the way out of the atmosphere and then breaking the gravity of the earth and into the sun. Besides, nobody can see them at that point. They’re burned up. So I think maybe stringing him up somewhere where everybody can see what happens when you do something stupid, I think that may be a better use of your time and your money.

Erik Darling: When really all you need is Aqua Net and a lighter. Can save yourself a whole lot of rocket fuel.

Richie Rump: I was just thinking duct tape and a flag pole but okay, man.

Brent Ozar: Here, ladies and gentlemen, you can see the difference between New York justice and Miami justice.

 

Is Always On recommended in virtual machines?

Brent Ozar: Anna asks, “Is Always On recommended in virtual machines?” Sure.

Tara Kizer: I’d support it in both. And what do you mean by Always On? Are we talking about availability groups or a failover cluster instance? Yes, virtual or physical.

Brent Ozar: Here comes Allan Hirt.

Tara Kizer: The Always On part drives me crazy. I don’t care about the space, the capitalization, things like that. “Always On” is just a marketing term.

Brent Ozar: And what does it refer to? Because there’s a couple of differences there.

Tara Kizer: Usually people are trying to refer to availability groups but Always On also has failover cluster instances underneath it.

Brent Ozar: Just like the word snapshot, there’s like half a dozen uses for it.

 

Can you mark a table read-only?

Brent Ozar: John asks, “Is it possible to mark a table as read only as opposed to marking an entire database as read only?”

Erik Darling: No, but you could make it so that no users can actually do anything to it.

Brent Ozar: No, you can do better than that.

Erik Darling: Really? What would you do?

Tara Kizer: Put it in its own file group and mark it read only? Is that what it is?

Brent Ozar: Yes.

Tara Kizer: Can you mark a file group read only in SQL Server or are you doing it at the file level?

Brent Ozar: You’d mark the whole file group read only. It’s elegant, users have no idea that anything changed, their just inserts, updates, and deletes fail.

 

Does replication split updates into deletes and inserts?

Brent Ozar: Roland says, “Are there some resources on how multiple independent updates on a table are aggregated into one transaction by an indexed view being replicated to another server?” The hell are you asking, Roland? Oh, he says in a follow up, “I think that my updates are being split into deletes and inserts.”

Tara Kizer: That’s true of replication but it doesn’t have to be, that’s just by default. Updates do get switched to delete and insert but there is a setting where you can change it to be updates.

Brent Ozar: Are you kidding me?

Tara Kizer: No.

Erik Darling: Is that a setting or a trace flag?

Tara Kizer: I don’t remember.

Erik Darling: I think it’s a trace flag that does it. Yeah, change data capture actually does the same thing when you update it, it shows an insert and delete. I’m not sure if that’s for every update because I know that behind the scenes the optimizer will make a choice to either do a simple update or an insert and delete combo. There’s like the whole split, sort, collapse chain of operators in an execution plan. So I don’t know if it’s every single one but for the ones that do do that, it’s kind of annoying.

Brent Ozar: I think between all of us working in our company chatroom I learn something sad about SQL Server every day. Many days I’m also happy I learn cool things, but most days I learn something sad too. That’s my sad thing for today.

 

My Windows team wants to put data and logs on separate drives…

Brent Ozar: Justin says, “My Windows team, they do our SQL backups.” This sounds like the beginning of a poem. “They want to put all of our high I/O databases onto a single drive and their logs on a separate drive. Would this be bad for performance? Right now we have multiple drives and our databases are spread out by I/O.”

Tara Kizer: What are they trying to solve though?

Brent Ozar: I like that question. I wonder if they’re trying to do like snapshot backups or a specific SAN vendor snapshot backups that makes the data and logs be on different drives. Justin says, “Our backup software has issues.” Yes.

Tara Kizer: Nix the backup software.

Brent Ozar: Yeah, or switch to native SQL Server backups. It’s just so much easier, write to a file share then backup that file share. Do like Bill Clinton, [imitating Bill Clinton] “I feel your pain. Let’s get together and fix your pain.”

 

How do you load test a new SQL Server?

Brent Ozar: Guillermo says, “After setting up a new instance of SQL Server, do you do any kind of load testing? What kinds of tools or methods do you use for this?” Erik is in the midst of—just finished writing a white paper for Google about this. What’s your white paper? What’s the methods in there?

Erik Darling: Goodness. The stuff that I really like to do is before you even install SQL Server, you want to get something like CrystalDiskMark or Diskspd fired up and make sure that you’re getting adequate disk throughput because if you’re not, before SQL Server is on there, it’s not going to get better after you install SQL. SQL doesn’t do anything magically to your I/O. After that, I want to throw something like CPU-Z on there and make sure that my CPU speeds are as advertised. I want to make sure that I don’t have something like balanced power mode or something else yanking them down, something ugly like that. Once I validate my hardware works as is, then I’m going to move on, I’m going to install SQL. I’m going to start running backups, DBCC CHECKDB, maybe some index rebuilds just to really test server throughput. From there, you can expand on to testing like actual query workload on there. What you really want to make sure is the basic maintenance tasks that you run on a new server run the same as they do on your old server. If you have a match on those or you do better on the new server, then you’re in good shape for going forward.

 

Should I learn PowerShell for my career?

(Editorial note: to the PowerShell folks who are inevitably reading this and about to convulse into spasms, ready to pound their keyboards with the force of a thousand soldiers, make sure you read the question and really digest it. The last few words of the question are really important.)

Brent Ozar: Brian asks, “My sysadmins tout their knowledge of PowerShell. I poked around with it…” [Erik makes blowing a raspberry sound] There’s your answer. I can stop reading there. “I poked around with it but as a DBA I’ve never had a substantial need for it. In your opinions, how important is it for me to learn PowerShell from a looking forward career perspective?” Well, we’ll go in a row. So, Richie, do you need to learn PowerShell for your career?

Richie Rump: F PowerShell.

Brent Ozar: Erik, do you need to learn PowerShell for your career?

Erik Darling: What he said.

Brent Ozar: Tara, do you need to learn PowerShell for your career?

Tara Kizer: No. However, I have attended a PowerShell class at one of my companies. My last job did use PowerShell quite a bit. The DBAs wrote scripts to do various things to help us do some tasks that we had to do, let’s say quarterly. The PowerShell script helped us do those tasks much more efficiently.

Erik Darling: Yeah, PowerShell has some neat stuff where you can spread out to multiple servers and do things. It has some all right integration with active directory and failover clustering, there are some cool commands that you can run and do things with. But generally for the things that I see most people applying PowerShell to, it’s like there is already a good enough hammer for that.

Tara Kizer: I think that maybe PowerShell would be useful for larger corporations that have a lot of SQL Servers and you need to loop through those servers to do the same repetitive tasks across all of them.

Erik Darling: But you’ll run into problems in there too.

Richie Rump: There’s an opening for Oracle PL/SQL report writer, maybe I’d rather be doing that.

Erik Darling: When I had to use PowerShell, like when I tried to use PowerShell across an enterprise, I got screwed so many times by just like there’s different versions of PowerShell on some machines. PowerShell isn’t set up to be remotely accessed on some machines. You always had to do this annoying check to make sure you could actually execute a remote script or something. Just like the whole thing was like…

Richie Rump: Yeah, I got a few things about PowerShell. I actually had to write some—what was that, a week ago? A couple weeks ago? I don’t know. The problem is is that the documentation isn’t that great compared to things like C# and some other stuff. It’s really hard to get in there, okay, how do I do X? Or how does this thing work? It’s not that intuitive. The other thing is if I wanted to do X, which way to do that? The blogging community, there’s like 15 different ways to do that. There’s not one generically, hey, this is the right way to do it. It’s always the next version is the right way to do it, but you don’t know, like Erik said, which version is on which machine. So things get all out of whack. It gets all out of hand. The other thing is the PowerShell community, I’d lump them in with the Apple and the Crossfitters.

Brent Ozar: And vegetarians.

Richie Rump: Vegans, really. Not vegetarians. At least I can get along with them. Hey, let’s have some fish. The problem is is that everything is PowerShell. It’s never, “Hey, can we try another technology, maybe do it better?” No. The answer is always PowerShell. I’m Thor PowerShell wielding a PowerShell hammer and I’m going to beat you into submission. I’m a multi-toolset guy. I like to use a lot of different technologies depending on what I’m using. Just the other day I was using DynamoDB and Postgres. I work for a SQL Server company, people. Seriously. I just don’t get the one tool fits all thing.

Erik Darling: “I wrote some PowerShell that runs a SQL query.” Like, what? All right.

Brent Ozar: I have this thought and I’ll keep beating the dead horse. I’m really motivated by retirement, like I want to cross the finish line to retirement so I can go drink and do whatever it is retired people do, just vegetate on a beach. If something will make my salary go up, I’m interested. I’m vaguely aware. If something won’t propel my salary upward, I’m kind of like, yeah, it’s cool, but there’s so many things I’d rather learn. I want to learn Spanish and French. There’s all kinds of things, or how to bake bread.

Erik Darling: Oracle.

Brent Ozar: Oracle, yeah. Postgres. DynamoDB. All kinds of stuff. So I’m kind of like is PowerShell going to make me more money? It’s a good scripting language I guess but if I’m going to learn a programming language at this time, it’s going to be C#. It’s going to be something that I can reuse across a wide variety of stuff. I didn’t say JavaScript.

Richie Rump: JavaScript.

Brent Ozar: That ranks up there with SAP. I get that you make a boatload of money doing it, but it is hard.

Richie Rump: So he’s motivated by retirement and yet he still employs me, ladies and gentlemen. I’m not quite sure I understand.

Brent Ozar: You’re inspiring. Being around you is so calming and soothing because nothing gets worked on. It’s almost like I’m retired already.

Richie Rump: Hi, I’m Richie Rump. Have you met me at all?

 

Is every DBA supposed to know ETL and data warehousing?

Brent Ozar: Graham says, “I’ve been looking at job postings for senior DBAs and I see a lot of data warehousing and ETL experience required. I’ve also seen a job that asks for 8+ years of experience with SQL Server 2012. Are these job postings from people who want the flavor of the month or are ETL and data warehouse work really becoming the purview of DBAs?”

Tara Kizer: I just see those and go, “Next.” I just look right over them. I don’t give them any attention whatsoever.

Brent Ozar: Wait a minute. Why are you looking at job ads at all?

Tara Kizer: They just come through LinkedIn all the time. I find it entertaining for the most part. I’ve always looked at the recruiter emails over the years and I was very happy at the time with where I was. But as far as the 8 years of experience with SQL Server 2012, they just mean SQL Server and they want you to have experience with 2012. But the ETL/ data warehouse thing, the company is just looking for a jack of all trades. They want to pay for one person and not two. This is just a money thing. You don’t have that experience and you don’t have any interest, just don’t apply for it. I wouldn’t be applying for that job, that’s for sure.

Erik Darling: The other thing to keep in mind is that the people who write job postings are either HR people who have no idea what they’re asking for. Or, they’re tech people who are supremely pants-less when it comes to what they need for SQL Server. So they will just throw whatever crap out there they think might be necessary. They’ll just like open up the installer and be like, “Yeah, we need database engine. We need R. We need SSIS, SSRS. We need data quality. Yeah, data quality sounds good.” They’ll just throw anything in there, years of experience, you need AG, performance tuning, backups.

Richie Rump: I think there’s a mentality too to cast the net wide so I could theoretically get more candidates and then hopefully I’ll get the one I want in that wide net. Typically, that usually doesn’t work well for, especially guys like me. I’m like, “No.” Because I see the same thing that Tara does. You want a jack of all trades and that’s not me. So have a nice day.

Brent Ozar: They’re writing that position to replace somebody who was stressed out and left. It was the one person, the one DBA who had to put together 15 things with duct tape and he’s like, “I’m so tired of this, working with this. I’ve got to get out of here.” Then they’re like, “Okay, we’ve got hire somebody to replace him,” not thinking that there’s a reason that guy left.

 

Can CHECKDB be slower on an AG secondary?

Brent Ozar: Tim asks, “Can CHECKDB be blocked? I was running it on a secondary in my availability group and it took 9+ hours but when I restored it somewhere else it only took an hour.”

Tara Kizer: I would look at the load on the original server. CHECKDB is a very I/O intensive resource process. If you’re running it when other loads are occurring, it’s going to slow it down. Even if it’s off hours, a lot of people have a lot of maintenance type stuff that has to occur at night and your test server probably doesn’t have any load. I’m surprised that a test server would complete in an hour when it’s nine hours in production because usually a test server has much lesser hardware and could take a while.

Brent Ozar: Slower storage.

Erik Darling: One thing that Brent wisely has brought up in the past is when DBAs set up their jobs, they set them all up on every server to run at the same time. So it’s always like all your backups start at midnight, CHECKDB always starts at 2:00 in the morning. Your index rebuilds all start at 4:00 in the morning, so that SAN at every like two hours just gets a new set of maintenance tasks.

Tara Kizer: Our SAN admins always knew when CHECKDB was running just based upon the I/O load.

Brent Ozar: Because you’re like, well, it’s all different servers. How bad could it be? Not knowing that they all have something in common.

 

Should DBAs be in the 90th percentile of the SAT?

Brent Ozar: Tim might have seen the strangest thing that I’ve seen in a while. He says, “I’ve seen a job listing that mentions that they want a candidate to be in the 90th percentile of ACT and SAT testing for their DBA.”

Richie Rump: I.e. we want somebody young.

Tara Kizer: I never even took them. I had no plans of going to a four-year college right out of high school. I was going to go to junior college. That was always the plan to save money and it’s not required to go to junior college so they would not be able to hire me with that.

Erik Darling: I never took the SATs.

Brent Ozar: I was top 1 percent of top 1 percent. I was a National Merit finalist, got full rides anywhere I wanted to go and I just wanted to go as far south as I could. I went to Houston just so I could get away from the snow. I’m like, “This place is amazing. There’s Mexican food and all kinds of stuff.”

Richie Rump: Houston is far from amazing, Brent.

Brent Ozar: Oh, it’s fantastic. To live there.

Richie Rump: It’s terrible. I spent 24 hours with you and I’m like, “Get me out of here.”

Brent Ozar: It’s probably easier—yeah—and at the time I was a Cure fan. I had long, dyed black hair. It was an unusual time of my life.

Erik Darling: So Texas was the place for you.

Brent Ozar: It was. Well, I continuously got pulled over. I had a Camaro, like an IROC Z kind of Camaro with T-tops and like black louvers and all that over my windows. I was continuously getting pulled over because they thought I was trafficking weed back and forth between Michigan and Texas when I would go up to see my family or whatever. I fit some kind of stereotype, I just don’t know what it was at the time.

Richie Rump: It wasn’t a stereotype here in Miami, I’m sorry, Brent. You would be pulled over too just because you’re weird.

Brent Ozar: Although I—well, yeah. See, that immediately makes me want to go down to, “Well I wasn’t that European guy coming over to Miami wearing a thong.”

Richie Rump: We welcome them with open arms because they have money. Please come.

Brent Ozar: They do have a lot of money, yes. You opened your arms kind of outstretched though, like you’re looking for a hug. I don’t know that you want that with the big European guys in speedos.

Richie Rump: How much will I get for this hug?

Erik Darling: The good thing is they’re covered in so much oil they’re easy to squirm away from.

Brent Ozar: That’s gross.

Richie Rump: You were speaking of retirement, Brent. That is my retirement goal. That’s to hit 300 pounds so I fit in my speedo. That is my retirement.

Erik Darling: Tara looks so disgusted right now.

Brent Ozar: Our next company retreat is going to be somewhere that requires parkas and fur coats.

Erik Darling: Now Tara is really looking at those job postings.

Tara Kizer: I’m in a down jacket today and have been for the past two weeks probably. I live in San Diego and I still have to wear a down jacket so I’m just going to freeze in August for the retreat.

Richie Rump: Yes, and I, like some of our clients, am not wearing pants.

Brent Ozar: Many, many of our clients.

 

What does a typical DBA’s day look like?

Brent Ozar: Last question we’ll take. Chris says, and this is so interesting, “I’m a database developer looking to move into a more traditional DBA role but I’m not quite sure what I would do on a daily basis. What does a typical day for a DBA look like?” We’ll work across and go from—in say your last job, Tara, as a DBA period, whatever DBA means to folks. What did your typical day look like?

Tara Kizer: I was usually working on project work. I had things assigned to me to go do. Sometimes I would be doing performance tuning, production stuff that wasn’t running very well, trying to see why it suddenly went from good performance to bad performance. You know, bad execution plans and things like that, looking over the WhoIsActive data. Usually, I was just assigned tasks. As an on-call DBA, when I was on call I’d have to be checking all the alerts and resolving whatever issues those were. Last job was a bit different. I’d say at other jobs though performance tuning was a great deal of my time and unfortunately, troubleshooting replication was a sizable percentage at times. That was unfortunate. But working on emergencies constantly. Just production outage type problems where maybe it wasn’t a server down but a significant blocking issue that the application seemed like it was down. Fighting fires a lot of times.

Brent Ozar: Erik, how about you?

Erik Darling: I think my last job was of course, I was a DBA at a Relativity shop, which changes things I think a little bit from traditional DBA roles where you’re just full time supporting a third-party application. So you’re maybe not doing query tuning as much. You are doing server and hardware tuning and coming up with plans and maybe separate heavy use cases out to other servers and stuff like that. You will do like index tuning because kCura is awesome about you adding indexes. They just say go crazy, whatever fits your workload, if it fits your macros approach to indexing, which is nice. So you can have control of that. On top of that, Tara’s different from me because she was always part of big DBA teams. When I was a DBA, I was by myself. So whenever there was an outage, whenever there was a restore that needed to get done, whenever something broke, it was just me. Me constantly. So that’s why it kind of depends on what sort of environment you end up in, the breadth and girth of the things that you’ll be exposed to and end up doing in a day.

Richie Rump: Chris, I’m going to talk to you. Just you. I like my coworkers but they lie. What you do when you walk in, you see, did my backups run? Then you figure out I need to schedule my backups. Then you run your backups. Then you leave, then you go and do it all again. That is a DBA. You don’t want to do that, man. You want to continue where you’re at. If you’re a database developer now, learn more about SQL. Learn more about the internals and how it works, how to performance. Do that and then get more into the development side. Get more into JavaScript, C#, those types of things. Get more project work because when you’re on a project, you don’t have a pager. You do not want a pager. Don’t do that. All right, man, don’t throw your life away man. You are too good. I love you, man. I’m talking to you. You’re the best. You’re the fairest. You see what I did there? You’re awesome. Don’t do this to yourself, man. You have too much to live for.

Erik Darling: I agree with Richie.

Brent Ozar: If you like database development, if it’s something that you’re enjoying and having a good time with, because it is a fun career. There’s a lot of cool stuff around being a data developer but I was intoxicated by the—I wanted to be in the middle of the room. We would have a bunch of sysadmins and network people and developers and executives and there’s an addictive thing to being the guy who troubleshoots the problems. But the flipside of that is, you are the guy who troubleshoots the problems. It’s the on call, when I finally switched to going into consulting and I didn’t go on call was probably the best month of my life when I realized that I’m like, “I can leave the phone over there.”

Tara Kizer: Being the guy that troubleshoots everything, that’s who I was two jobs ago and even when I wasn’t on call, I really was on call still. I always had to be the one that they called even when I wasn’t on call.

Richie Rump: You don’t want to be in the room where it happens. It’s all a façade. You don’t want to do that. Don’t be in the room where it happens. Just let them do it.

Brent Ozar: It’s so different at different times of your career. The first three or four years of your career suck to be the one who’s in the center of the room. You have to gain enough experience where you’re comfortable and go, “No, I can step in front of that train and I know exactly how to get this thing to work.” But, yeah, the first few years are rough. Just when you think you know what you’re doing, then all of a sudden some consultant comes in and finds out you don’t have backups and autoshrink is on, and you’re not wearing any pants. Thanks, everybody, for hanging out with us this week. We’ll see you guys next week on Office Hours. Adios.

Erik Darling: Later.


The Blog Posts You Loved to Talk About: Top-Commented Posts

SQL Server
4 Comments

In WordPress, I happened to sort our posts on the comment count column and got a huge laugh. I figured you’d like to see this list too, dear reader:

20. SQL Server 2014 Standard Edition Sucks, and It’s All Your Fault – 137 comments

19. How to Set Up Standard Edition Always On Availability Groups in SQL Server 2016 – 139

18. Microsoft SQL Server Licensing Simplified into 7 Rules – 140

17. SSD RAID Load Testing Results from a Dell PowerEdge R720 – 141

16. 7 Things Developers Should Know About SQL Server – 144

15. Jeremiah & Kendra are Heading Out – 149

14. SQL Server 2014 Licensing Changes – 150

13. You Can’t Kill Transactional Replication – 172

12. Contest: I ___ With Brent – 188

11. Meet PASS Board Candidate Matt Morollo – 200

10. How to Get a Junior DBA Job – Part 1 – 214

9. How to Set Up SQL Server 2012 Always On Availability Groups – 226

8. Top 10 Reasons Why Access Still Doesn’t Rock – 234

7. SQL Server 2012 CTP3 is Here! Five Things to Know – 237
(included a comment contest to guess the release date)

6. Stop Shrinking Your Database Files. Seriously. Now. – 244

5. Contest: SQL Server Theme Songs – 244

4. An Introduction to SQL Server Clusters – 312

2. (tie) Log Shipping FAQ – 338

2. (tie) Contest: We’re Renaming sp_BlitzFirst. What Should the New Name Be? – 338

1. Contest: Write the Scariest DBA Sentence – 1,268 comments!

Hmm, come to think of it, it’s been a while since we ran a contest….


A Tourist’s Guide to the sp_Blitz Source Code, Part 1: The Big Picture

First Responder Kit, sp_Blitz
10 Comments

sp_Blitz is our open source free SQL Server health check script.

From a really high level, here’s what it does:

  • Create #BlitzResult, a temp table to hold its output
  • Check something on your SQL Server, and if it’s found, insert a row into #BlitzResult
  • Check something else, and insert a row if it’s found
  • Do that a whole bunch of times
  • Finally, SELECT * FROM #BlitzResult

In this blog post, I’m going to give you a guided tour, and then over a few more posts, dive into some more interesting features.

TOURSTOP01: Creating #BlitzResult to Hold Data

If you’d like to follow along in the current sp_Blitz source code, do a search for TOURSTOP01, and you’ll come to this portion of the code:

TOURSTOP01

I’m using TOURSTOPXX rather than line numbers because line numbers will change over time.

The best way to explain #BlitzResults is to show you what comes out of sp_Blitz at the end:

sp_Blitz default output

That should look pretty familiar. Everything about sp_Blitz centers around populating #BlitzResults, and then showing it to the users.

So how do we populate #BlitzResult? Let’s go on to…

TOURSTOP02: Populating a Row in #BlitzResult

Search for TOURSTOP02, and bathe in the glory of CheckID #7:

TOURSTOP02: CheckID #7

Eagle-eyed readers will notice that there’s additional code above and beyond what’s shown here in the screenshot – more on that later. Let’s focus on just this screenshot for now.

If there are any stored procedures that run on startup, insert them into #BlitzResults along with an explanation.

This check inserts a row for each problem that it finds because startup-level stored procs are a Pretty Big Deal, and you probably want to know about all of them. However, that’s not the only way sp_Blitz checks work:

  • Some checks only insert a single row if they find a problem – say, a server-level configuration issue
  • Some checks insert a summary row with the number of problems they found – like heaps, because you don’t want to know about every single active heap, but just a total

TOURSTOP03: Running Checks in Every Database

If @CheckUserDatabaseObjects = 1 (and it is by default), then sp_Blitz checks inside databases too. Here’s an example:

TOURSTOP03 – sp_MSforeachdb checks

We’re using the known-crappy sp_MSforeachdb to loop through all databases and run a check. When we first started, it was the only way we could guarantee it’d work across everybody’s server. I’m not happy about that – sp_MSforeachdb hits every database, with no options to skip them, except that also sometimes it just skips databases for no reason.

Aaron Bertrand wrote a more flexible and reliable sp_MSforeachdb, and recently he was kind enough to add that to the open source First Responder Kit. We have an open Github issue to change the parameters to make it a drop-in replacement for sp_MSforeachdb – making the parameters and output compatible – but I haven’t had the chance to work on that yet. So for now, sp_MSforeachdb it is, but eventually we’d like to switch sp_Blitz over to Aaron’s new proc.

sp_MSforeachdb changes into each user database (sorta, but you still usually need to USE the database), then executes your query.

To make doggone sure we’re in the right database, we start with USE [?]; – and sp_MSforeachdb drops in your database’s name, like USE [StackOverflow];.

Then, we run the query we want to run, which is an insert into #BlitzResults if we find any problems that match our sp_Blitz check. In TOURSTOP03, we’re looking for databases that don’t have Query Store turned on yet. (I’m a big fan of that feature.)

All database-level checks are done in the section of code that starts with IF @CheckUserDatabaseObjects = 1.

TOURSTOP04: Dealing with Version Differences

SQL Server has gotten more powerful over time, adding more diagnostic tables we can check out, but sometimes those queries will fail if run on an older version of SQL Server.

TOURSTOP04 is an example:

TOURSTOP04 – skipping checks on 2000 and 2005

If you try to query the is_encrypted field on sys.databases in SQL 2000 and 2005, your query will fail because that field didn’t exist at the time. Therefore, we frame the whole thing in dynamic SQL.

TOURSTOP04 is a good example of several other things:

  • SQL 2000 and 2005 aren’t supported anyway, but this code is still in there. The code used to work okay on 2000/2005, but we’ve given up on that now that those are no longer supported, and we’re using things that only exist on 2008.
  • There are better ways to check versions, and not all the code is consistent in checking for it. In a perfect world, our code would be perfect. In practice…
  • Our indenting sucks.

TOURSTOP05: Outputting the Results

After populating #BlitzResults for thousands of lines of checks, time to dump it out:

TOURSTOP05 – last call for alcohol

And that’s it. You get a nicely formatted list of issues with your server.

This means that writing a new check is as easy as writing an INSERT statement. Well, almost – it gets just a little bit trickier because people are allowed to declare checks they want to skip.

TOURSTOP06 and TOURSTOP07: Users Can Skip Checks

Every check in sp_Blitz has a unique ID – here’s the current list of sp_Blitz CheckIDs (you have to scroll right to see the CheckIDs) and it’s also available as a Markdown file inside the First Responder Kit zip file.

The unique checks are used to build tools atop the sp_Blitz output, and they’re also used to let people skip specific checks. To see it in action, TOURSTOP06 is just a little above TOURSTOP02:

TOURSTOP06 – skipping checks if desired

There’s an IF NOT EXISTS statement that looks for a record in #SkipChecks, which is created and populated in TOURSTOP07:

TOURSTOP07 – populating #SkipChecks

When we first introduced that, people thought @SkipChecksDatabase by itself would skip all checks for that database, but that’s not how it works. @SkipChecksDatabase, @SkipChecksSchema, and @SkipChecksTable let you pass in a database, schema, and table where your own SkipChecks table lives.

Inside that table:

  • If ServerName is populated, and DatabaseName & CheckID are null, then all checks are skipped for that server
  • If DatabaseName is populated, and CheckID is null, then all checks are skipped for that database
  • If CheckID is populated, but ServerName & DatabaseName is null, then that check is skipped everywhere

That’s the theory, anyway. We don’t get a lot of bug reports on that capability, so I’m not sure how many people are using it, but sp_Blitz uses it too.

TOURSTOP08: Skipping Checks for Amazon RDS

Amazon RDS for SQL Server is their hosted & managed flavor of SQL Server. It’s real SQL Server, but they’ve disabled some administrative capabilities, so there are some queries you can’t run.

To work around that, we simply skip the checks we can’t run in RDS:

TOURSTOP08 – populating #SkipChecks for RDS

This works really well because RDS can execute everything else in sp_Blitz. This approach doesn’t work for Azure SQL DB because DMV access is so incredibly hobbled – for example, sp_MSforeachdb is out the window.

So there you have it – your first guided tour to the sp_Blitz source code. I’ll do another post to lay out a few more intriguing areas, like how we output to tables & Markdown.

Related stuff:


Inline Table Valued Functions: Parameter Snorting

You’ve probably heard about parameter sniffing

But there’s an even more insidious menace out there: Parameter Snorting.

It goes beyond ordinary parameter sniffing, where SQL at least tried to come up with a good plan for something once upon a compile. In these cases, it just plain gives up and throws a garbage number at you. You’ve seen it happen countless times with Table Variables, Local Variables, non-SARGable queries, catch-all queries, and many more poorly thunked query patterns.

While Scalar and Multi-Statement Table Valued Functions get lot of abuse around here (and rightly so), Inline Table Valued Functions aren’t perfect either. In fact, they can snort your parameters just as hard as all the rest.

Heck, they may even huff them.

First, let’s get Jon Skeet and his impersonators

In the Stack Overflow database export, there are four people in the Users table that have a DisplayName like Jon Skeet. Note that this query is most definitely not SARGable, but it gets the job done:

The results:

Gross

If we run a query like that, it turns out pretty alright. No problems here; at least none that couldn’t be solved if I could be bothered to create a covering index that starts with DisplayName. The real Jon Skeet is obvious enough. He’s the one that has a Reputation that looks like a PowerBall jackpot.

Put that query in an inline function

Let’s look at a function I use in a few demos. Awkwardly, I use it to demonstrate how much better Inline Table Valued Functions are. I never said perfect! Call my lawyer. Whatever.

Simple enough, right? Return a count from the Badges table based on UserId. There’s only one statement here, so this function goes inline – the best kind of function.

Let’s go on a date, just me and Jon Skeet. Let’s feed the function literal values because parameters are lovingly tended to, and they get their own special fancy plan.

Chicken taco. Steak taco. Missing index.

The first plan (the real Jon Skeet) has a plan that includes a stream aggregate because he has a boatload of badges.

Jon Skeet’s Mentor has, uh, two. Which is still probably more than you have, so stop snickering. He gets a slightly different plan that doesn’t include a stream aggregate.

Uh oh – that sounds like parameter sniffing

One query, two plans depending on parameters – ah, it’s our old friend, parameter sniffing. When you see that, you should also try running the query with local variables to see another potential problem:

Then our plans look like this:

Same little plan

We’ve been snorted. Snorted real hard. Both Skeets – the big one and the little one – are getting the local variable treatment. SQL Server’s optimizing for a relatively small number of badges, and neither plan includes the stream aggregate.

That means we can use a RECOMPILE hint to go back to the original plans with literals. We can also use unsafe dynamic SQL.

If we use parameterized SQL, we used the a cached plan for whichever value goes in first. This is a lot like what happens with dynamic SQL and filtered indexes.

Same big plan this time

Icky

While I’d much rather see you using Inline Table Valued Functions, because they are better than the alternatives somewhere in the neighborhood of 99% of the time, you should be aware of this potential performance hit.

Thanks for reading!


Using Trace Flag 2453 to Improve Table Variable Performance

I recently saw a server with trace flag 2453 configured. I hadn’t come across this trace flag before, so I did a little research. Microsoft says it allows “a table variable to trigger recompile when enough number of rows are changed”. This can lead to a more efficient execution plan. Trace flag 2453 is available in SP2 or greater for SQL Server 2012, CU3 or greater for SQL Server 2014 and RTM or greater for SQL Server 2016.

I was curious how a query using a table variable performed as compared to the “same” query using:

  • trace flag 2453
  • OPTION (RECOMPILE)
  • a temporary table

DEMO SETUP

BASELINE

Table variable details:

TRACE FLAG 2453

Table variable details:

OPTION(RECOMPILE)

Table variable details:

TEMPORARY TABLE

Temporary table details:

THE RESULTS

The baseline query has a clustered index scan for the table variable since almost 3.5 million rows are returned from it. The other 3 queries have the same execution plan and stats IO output, with a clustered index seek on the table variable or temporary table since 963 rows are returned from it.

WHAT WOULD TARA DO?

Should you always use trace flag 2453 or OPTION (RECOMPILE) if you are using table variables? Maybe. But I’m leaning towards “it depends”. Microsoft says the “trace flag must be used with caution because it can increase number of query recompiles which could cost more than savings from better query optimization”.

Some people say to use table variables if the table will contain a small data set. I’ve been burned by that before. I’ve experienced poor performance when the table variable always only had one row in it. Switching it to a temporary table improved the performance dramatically. Your mileage may vary.

I’ll just stick with temporary tables.


[Video] Office Hours 2017/02/1 (With Transcriptions)

This week, Brent, Erik, and Richie discuss AlwaysOn Availability Groups, execution plans, speeding up log shipping, applying wrong service packs, why servers “page wildly” when copying mid to large-size files from one SQL server to another, their favorite high availability solutions, and much more!

Here’s the video on YouTube:

Office Hours Webcast - 2017/02/01

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Enjoy the Podcast?

Don’t miss an episode, subscribe via iTunes, Stitcher or RSS.
Leave us a review in iTunes

Office Hours Webcast – 2017-02-01

 

Is it worth learning Always On?

Brent Ozar: James says, “I’m a mid-level DBA.” I think that’s kind of like a mid-level manager, he’s kind of sort of a DBA. “We do not currently use Always On, however, is it worth learning the technology and how many of your clients use Always On?” Erik, what do you think? Is it worth learning the technology?

Erik Darling: It’s always worth learning, whether it’s worth implementing, whether it’s worth emotional pain that you can go through where every time you think you know what’s going on with your availability group, you don’t actually know what’s going on with your availability group. We’ve said it a million times before, or at least I have during Office Hours and to clients, is I’m an idiot. I can set up an availability group. It is dead simple to set up. What sucks and what’s hard is when they break. As soon as an availability group breaks or as soon as you have to make hard decisions about bringing an async replica online, if the primary [inaudible] goes down, what do you do? That’s where availability groups gets hard. So learning how to set one up is easy. Learning how to keep one up and learning what to do when something goes wrong with one is all the hard stuff. So, worth learning? Yeah. Worth implementing? It’s your coronary.

Brent Ozar: You have to think about what the real cost is. Tara writes about this in her white paper on the Google Compute Engine stuff. She says you really have to have a staging environment that’s separate. You have to rehearse everything in the staging environment, be comfortable with how things work. The second part of your question was how many of our clients use Always On Availability Groups. I don’t know ballpark on a percentage, I think on a percentage it’s probably fairly low but you have to remember that we’re kind of an emergency room for SQL Server. People either come in when the performance is bad or when it’s unreliable. When people come in for unreliable SQL Servers, it is often Always On Availability Groups, that they can’t understand why failovers happen for example.

Erik Darling: Just to add a little bit to that. I’ve never recommended availability groups to anyone. Part of our Critical Care, we can do some HADR architecture spec talk. I have never once recommended availability groups to a client. Beyond that, I think the percentage of clients using them is a much different percentage than clients who are using them happily. I’d say like maybe 3 to 4 percent of clients are using them. There’s an even smaller percentage who are like, “We’re psyched. This is going great. Got my happy face on.”

Brent Ozar: It’s often the reason why they’re calling, that there was a problem with it. I don’t believe it’s that the technology is broken, it’s just hard. There’s a wizard and it looks like you can just step through the wizard and you’re done. People are like, “That wasn’t so hard. What’s the big deal with that?” Man, it’s what happens afterward that’s harder.

 

Why are query plans different between two servers?

Brent Ozar: Josh asks, “I have SQL Server 2012 Expensive Edition” as we like to call it. He says, “There’s a difference in execution plans between two different servers, between dev and QA. I’ve tried making everything identical. What else can I look for that would cause execution plans to be different?” He’s ruled out clearing the plan cache, my set connection options, the hardware is the same, partition tables and indexes are identical. What would you guys look at?

Erik Darling: Different execution plans…

Richie Rump: Look at stats.

Erik Darling: Yeah.

Brent Ozar: I like it, yeah. Because what if the databases aren’t from the same point in time? What if there has been changes to them? What if one of them updated stats and the other one didn’t? That can hugely make a difference.

Erik Darling: Yeah, I’d double check other settings too. I would just make sure MAXDOP and cost threshold are in the same place. I would make sure that all of that stuff is floating around the same area. Just to dot the i’s and cross the t’s.

Brent Ozar: I would also say too that there’s a really cool tool built into Management Studio 2016 that lets you compare execution plans. If you Google for – and I’m not just saying that because today is like Google day – if you Google for compare execution plans with SQL Server 2016, it will show you what the differences are. It will show you what’s unequal. It can be things like the cardinality estimator, not on 2012, but it will show you all the differences between the plan, like what influenced it to do different things.

Erik Darling: One thing that might be even fun to try is see if there’s any hinting or whatever you can use to get the plan to be the same. That way, if it’s like a difference in a join type, try forcing the join type. If it’s a difference in joint order, try forcing the joint order to see maybe why that other server isn’t picking it. You may find that for some reason, somehow, that plan on that other server is getting a different cost estimate or something.

Brent Ozar: That’s true.
Brent Ozar: Roland says, “I’m running an Always On Availability Group with two nodes in synchronous mode. Could running a CHECKDB on the secondary replica somehow affect the primary replica?”

Erik Darling: Yes.

Brent Ozar: Yes. It’s going to slow things down, yes. If you’re doing inserts, updates, and deletes against the primary and then the secondary isn’t responding to writes as quickly, yeah, unfortunately that is going to slow down the primary.

Erik Darling: Another thing that could be happening is if you were one of those weird people who has an AG and still has a SAN behind it, you could be sucking all that DBCC CHECKDB data through the SAN and the other AG is sort of sharing that same pipe or whatever boobidi-babbidi and things are just kind of beating each other up there. Don’t make that face, Richie. You know boobidi-babbidi is a perfectly technical term for IO.

Richie Rump: Someone has a young daughter, that’s all I’m saying.

Brent Ozar: Is this like a Sponge Bob thing?

Richie Rump: It’s like “Bibbidi-Bobbidi-Boo,” Cinderella thing.

Erik Darling: “The thing-a-ma-bob that does the job.”

Brent Ozar: A VM host too, if they’re on the same VM host that would be horrible for other reasons.

 

Will a CONVERT function be SARGable?

Brent Ozar: Dimitri asks, “If I do a convert function in a where or a join, is that still going to be SARGable?”

Erik Darling: Only for some things, not for all things. The one that he explicitly asked about was date/time. There is a special optimizer rule built in for date to date time conversions where SQL can still use an index and still do a seek because the left-most part of the date is still part of the histogram and all that. So SQL can still do all sorts of nice stuff with just the date part to date time. It doesn’t work as well for all converts, tasks and whatnot.

Brent Ozar: The thing that’s tricky is it may still do a seek but the estimates can be wrong. In one of our classes I show that using that convert date and it gets a seek but it only estimates one row instead of like 4,000.

 

How can I reduce log shipping latency?

Brent Ozar: Tim says, “What are some ways to speed up log shipping to an off-site read-only standby with an app that can create hundreds of transactions even on a simple order lookup?” I guess what he’s really asking is – so if you can’t write less to the log file is there a way to speed up pushing that log data to another location?

Erik Darling: It depends on which part is slow I guess. You could take more frequent log backups.

Brent Ozar: Yeah, oh, yeah.

Erik Darling: …smaller logs across. I would rather send a 10 or 30 second log of data across than five or ten minutes. I think that’s going to help.

Brent Ozar: And if you’re worried about latency, that makes me think that maybe your querying the secondary in which case taking more faster log backups isn’t going to help a whole lot unless you’re kicking people out every time you do a restore.

 

Should I learn the cloud?

Brent Ozar: Interesting. Clark asks, “As a DBA who only became comfortable with virtual servers running Microsoft SQL, do I now have to swallow the red or blue pill and learn cloud-based servers? If yes, where do I start? My current employer would never agree to cloud-based anything.” No, if you’re not doing it, don’t go learn it. If the company won’t accept it, don’t bother learning. You’ve got enough things to learn. Focus on the things that will make you more valuable at your employer.
Richie Rump: Unless you want a new employer and you’re really passionate about cloud computing and cloud databases, then by all means, start spinning stuff up, start learning it. But if you’re comfortable where you’re at, you like where you’re at, and you like what you’re doing, there’s really no need.

Erik Darling: One thing I will say is that the perfect cure for interest in the cloud is learning the cloud.

Brent Ozar: Oh god, suddenly I’m scared. Hold me. Yeah, poor Richie has been spending the last two, three days, really since Friday I think struggling with a problem.

Richie Rump: Friday, yeah.

Brent Ozar: Yeah, struggling with an ugly problem.

Richie Rump: It’s been multiple build systems and multiple projects so it hasn’t just been the one. It’s been multiple and then trying to get all that automated because there’s so much documentation everywhere on how to do this stuff, you just have to figure it out.

Brent Ozar: We have a company Slack chatroom where everybody goes in and chats. One of the things that goes into one of the rooms is Richie’s failed builds so we all get to see whenever Richie’s builds are failing.

 

Should user databases have one file per core?

Brent Ozar: Renee says, “I have a server with four physical CPUs, each of which has ten cores. I’ve got a data warehouse, should I line up the number of data files with the number of CPU cores that I have? Should I have 40 data files for my data warehouse?”

Erik Darling: I don’t think that’s going to do much for you. I wouldn’t even say that for tempdb. I only mentioned tempdb because I can imagine that sort of thinking coming from having one tempdb data file per core so that’s why I say that.

Brent Ozar: You’ll sometimes read Microsoft Fast Track Data Warehouse reference architectures and they’ll do crazy stuff like that because they’re trying to drive 100 percent CPU use across lots of storage fabrics. That’s really a different kind of crazy town but that’s fairly rare to see. I don’t think I’ve ever seen anybody with more than ten files per file group and they didn’t like it even when they had ten.

 

Should I use Windows 2016 or 2012 R2?

Brent Ozar: Justin says, “Back in the SQL 2008 and 2012 days…” It’s still the SQL 2008 and 2012 days. We’re still here.

Erik Darling: For many of our clients it’s still 2008.

Brent Ozar: Tara I think answered a SQL Server 2000 question today on Stack Overflow.

Richie Rump: Noooooooo.

Brent Ozar: He says, “Have you guys recommended running Windows 2012 over Win 2008? Do you have a SQL or similar recommendation regarding on what OS you should run SQL Server 2016 on?” It doesn’t make a difference to me. Win 2012 versus SQL in 2016.

Erik Darling: Yeah, unless you’re chasing failover clusters or something else there are a few advantages to 2016, like you can add rolling cluster upgrades and some other cool stuff. I believe that if you’re running availability groups Windows 2016 is the only one that supports the distributed transactions.

Brent Ozar: Oh, yeah.

Erik Darling: So there are some advantages to Windows 2016 but generally I don’t think I care all that much as long as it’s not 2008 or 2003.

 

Should rename/lock/hide the SA account?

Brent Ozar: Wes says, “I have a bad question.” We’ll be the judge of that. “Do you guys lock the SA account and replace it with a new SA in order to hide the 0X01 user? Is there anything that I should think about there?” No. I don’t ever do that.

Erik Darling: I’ve disabled SA a couple times just to check off an audit box but never made a difference.

Brent Ozar: I don’t feel massively secure. My bigger problem is usually that everybody in the shop has the SA password and it’s like on every post-it note everywhere and I’m like, “Can I please get that culturally changed?” and they say no.

Erik Darling: Two things I always notice is at my last job when I walked in I asked my boss the SA password and he said, “It’s on the whiteboard by the developer section.” The other thing is, this is something that has always irked me a little bit. I say a little bit because I go back and forth on it. Everyone in Management Studio saves their password, right? Because no one wants to go do something if they’re using SQL logins. If you’re using Windows auth it obviously doesn’t matter but logging in with SA most people will save the password. I have a little bit of a gripe with that because then anyone, if you leave your workstation unlocked, they just open up Management Studio and they go right in from your workstation. So just as a best practice for you, unless you have those crazy SA passwords that you can’t remember because it’s all like hyphen, parentheses blah blah blah, then try not to save them.

Brent Ozar: Yeah.

Richie Rump: I just put it on a sticky note and post it on my monitor so I don’t forget.

Brent Ozar: Nice. Or 1… 2 … 3… 4… 5…

 

Should I deploy SQL Server 2014 now or 2016?

Brent Ozar: Ryan says, “We’re running SQL 2012 SP2 CU14.” Come on, now. If you have to name that I think I’m already in trouble.

Erik Darling: If you have to name that, why aren’t you on SP3?

Brent Ozar: Good point, CU14. “Our software vendor certifies what SQL Server version we can run on. They just certified 2014 and we expect them to finish certifying 2016 sometime this year. Performance is fine but some execution plans suffer. Would you wait until 2016 is certified or are the improvements in the cardinality estimator in 2014 worth making the change forward now?”

Erik Darling: I’d wait.

Brent Ozar: Yeah, me too. What are your reasons behind waiting?

Erik Darling: There are just enough bells and whistles and improvements to the cardinality estimator between 2014 and 2016 that I would just wait it out. There is no sense at this point upgrading to a version behind current especially Microsoft is getting into such an aggressive lifecycle for SQL Server. It sounds like vNext is going to be an every two-year thing now. We’re not going to see long waits between major versions. So I would be very careful with that and not intentionally put myself behind a version.

Brent Ozar: And if I do it, I only want to do it once. I don’t want to go through it in 2014 and then go through it again in 2016.

 

We really messed up our patching…

Brent Ozar: Rosemary says, “I have a question about applying the wrong service pack.” Well, don’t do that. She says, “On a two-node cluster someone has applied the wrong service pack for the passive node. Now the cluster won’t failover. Is the only solution to uninstall the wrong instance from the passive node and then reinstall it with the right binaries?” Based on the – and I’m skipping like five lines out of your question – call support. Call Microsoft Support. It’s $500. They’ll work the problem with you until it’s done but frankly if someone did something that incredibly stupid, like they applied the wrong version of SQL Server across, I would take this advantage to go build a new node in the cluster that I was going to then go failover to. Just something brand new from scratch that I know is going to be reliable. If someone that incompetent screwed up your service pack, god only knows what else they installed on the SQL Server. Like in our recent engagement at Google where we accidentally infected our domain controller with a virus. That would be Erik Darling.

Erik Darling: Excuse me.

Brent Ozar: Get something off of BitTorrent we said. How hard can it be we said.

 

How big of a drive should I use for system databases?

Brent Ozar: Tim asks, “When you guys build a new server, how big of a drive do you create for the system databases, master, model, and msdb?”

Erik Darling: I don’t ever separate those from other stuff.

Brent Ozar: Where do you put them?

Erik Darling: I put them in with whatever data files and log files I’m going to have for my user database. I might have a system folder on that drive but I don’t ever separate them out for anything. Tempdb is the only one that gets its own good boy drive.

 

What MAXDOP settings take precedent?

Brent Ozar: Jonelle asks, “If you set MAXDOP at the instance level, the max degree of parallelism, if you set it at one, does that disable parallelism unless you provide a query hint?”

Erik Darling: Yes.

Brent Ozar: Yes.

Erik Darling: Or, unless you’re on 2016 and you set it at the database level. There are database scope configurations for such things, which is weird because they do it for MAXDOP but not for cost threshold, right?

Brent Ozar: Right. Yeah, I wrote a blogpost recently on all the ways you can set MAXDOP and I think it was like 13 or 16 different ways that you can influence parallelism just by different settings. It’s crazy how many things you can tweak it with.

 

Can BEGIN TRAN fill up TempDB?

Brent Ozar: Eric says, “Is it possible…?”

Erik Darling: I did not.

Brent Ozar: What? Different Eric. He doesn’t know how to spell Erik correctly. He says, “Is it possible that a statement with a BEGIN TRANSACTION that uses tempdb that is left hanging can hold the tempdb log file hostage and eventually cause it to fill up and bring its SQL Server to its knees? If so, I find this disturbing.”

Erik Darling: Welcome to SQL.

Brent Ozar: Yep, bad news. Yes, that is possible. It’s kind of sad. I say kind of sad, it’s not like I’ve ever run into that. The problem for me hasn’t been the log files, it’s been the data files. Some yoyo dumping a huge thing into tempdb trying to join 50 tables together. It’s not always Richie, it’s just sometimes Richie. Less often now that he’s doing work in Postgres.

 

Why does the page file get used when I copy files around?

Brent Ozar: Guillermo says, “Every time I start copying a mid to large size file from one of my SQL Servers to another, I see the server starts paging wildly. Why is this?”

Erik Darling: I did this to a server once. If you look back shamefully, I had a Stack Exchange question, but I did it with PowerShell. I was using the move item command. I was moving a bunch of data files around and one of them happened to be like 500 gigs. As soon as I did that, the server just stopped. It was like no one could get it in. Nothing. It just stopped dead. It stopped until the file finished moving. That was because Windows tasks like that, like file system stuff and PowerShell and even like Xcopy or Robocopy, they don’t give a single care about your SQL memory. They see the memory available on the server and they’re like, “I’m going to buffer this whole file, move it over. I don’t care.” That’s why it’s very important that when you move files on SQL you do it very carefully and do not hit Control C and Control V to do it. Even just like dragging and dropping a file from one place to another can cause trouble if it’s big enough.

Richie Rump: Honey badger Windows don’t care.

Brent Ozar: Does not care. Takes all of your memory. If you’re playing around with command line stuff like Xcopy, look into unbuffered. There’s an unbuffered switch you can use for Xcopy.

Erik Darling: I think Robocopy has a similar one that I started using as well.

Richie Rump: Don’t mentioned that Power thing again please, Erik.

Erik Darling: I mention it as little as possible.

Richie Rump: We’re not on speaking terms right now.

Brent Ozar: Especially, it’s so illustrative, you can just trash servers so much more quickly. That’s not true. That’s kind of true.

Erik Darling: It is true.

 

Should I separate data and indexes to reduce gaps?

Brent Ozar: Heather says, “A colleague recently recommended that we should move indexes to a different file group because reorgs and rebuilds cause gaps. Is there any merit to this suggestion?” That’s incredible.

Erik Darling: Gaps? In what?

Brent Ozar: I guarantee I know what they’re thinking. They’re thinking I have like say 100 gigs that’s all completely full and I want to rebuild a 50 gig index. That it’s going to pick up that index and move it to the empty space at the end and there’s going to be a 50 gig hole at the beginning. So theoretically this could be true if you have one dominant object in the database, one object that takes say half of the database or more. But if you have, to keep the numbers simple, ten things in your database that are all equally sized, you’re not rebuilding them all simultaneously. You’re rebuilding them one at a time and you’re leaving free space behind. Is there going to be a gap? Yes, even if you put it in a different file group there will still be a gap, unless you get fancy and go back and forth between file groups or something but tell that colleague of yours to stick with the other department that their primary job description is in, management for example, but keep out of the database.

Erik Darling: Yeah, that’s one of those bananas things where it’s like how is that even going to change anything? Because SQL is going to read that from disk into memory anyway, right?

Brent Ozar: It kills me that people forget what RAM stands for. RAM doesn’t stand for sequential access memory. It stands for random access memory.

Richie Rump: How old is that server? Is it like SQL Server 6.5 or something?

Brent Ozar: Or how old is the colleague too?

Richie Rump: Goodness.

Brent Ozar: “When I was your age…”

Richie Rump: I said that this week.

Brent Ozar: What made you say that?

Richie Rump: Cloud.

Brent Ozar: Oh. “When I was your age, I would have gone into the data center and beat it with a hammer. A stone hammer, not PowerShell, the real, original stone hammer.”

Richie Rump: “When I was your age I did builds on my own machine and I liked it.”

Brent Ozar: “I really liked it because I could go do something else while my code was compiling.”

Richie Rump: “I went and got coffee.”

 

Should I enable xp_cmdshell?

Brent Ozar: Guillermo asks a controversial question. There’s no controversy with me. Guillermo says, “What’s your stance on enabling xp_cmdshell on SQL Server?”

Erik Darling: I don’t care. If you’ve got to use it, you’ve got to use it. You’re not any better off enabling and using xp_cmdshell than you are if you were to replace that with say using PowerShell in an [inaudible] or something. It’s just not. Plus, as a guy who has had to do a lot of dumb things with SQL Server, I have used xp_cmdshell to do many of them. My proudest moment was using xp_cmdshell to build a cookie and then use the cookie in a call to curl to download a file. Then lay that file onto a table. I was so psyched on that. It was like I don’t know why everyone hates xp_cmdshell.

Brent Ozar: That’s pretty cool.

Richie Rump: Learn a programming language, Erik.

Erik Darling: I tried to learn PowerShell because everyone…

Richie Rump: Again, learn a programming language, not script kiddie stuff, okay?

Erik Darling: … just write a quick PowerShell script to do it. I’m like, yes, quick PowerShell script. Then four days later I’m like sitting there in a pile of broken keyboards with like punches through my monitor and like teddy bear heads ripped off. Quick PowerShell script my foot.

Richie Rump: My thing is with xp_cmdshell is like should you be doing it in a database at all? Maybe this should be somewhere else other than the database.

Erik Darling: I agree but sometimes these tasks have to interact with the database, right? Like sometimes you do have to move a file. Then you can’t do anything in the database until the file is done moving.

Richie Rump: That’s why we have STKs to talk to databases with, these programming languages. You see how that works?

Erik Darling: You do. DBAs who don’t know C# and don’t have the time to learn C#.

Richie Rump: …it’s been around for – I don’t know – over 15 years. I mean, I think it’s pretty stable.

Brent Ozar: Here it comes: When I was your age…

Richie Rump: When I was your age…

Brent Ozar: I would ask – this does bring up a good point – what do you need xp_cmdshell for? That is the thing that I usually ask. Like are they trying to implement some kind of automation that is really best done in another language or in an app server. I was talking to a DBA recently who even does file unzipping through xp_cmdshell, I’m like this may not be the best use of CPU cycles on SQL Server’s licensing.

Erik Darling: 7-Zip CLI.

Brent Ozar: PKZIP… When I was your age…

Richie Rump: I could do that in Azure Functions. Just saying. Or Lambda.

Brent Ozar: When I was your age we didn’t have Lambda.

Richie Rump: That’s right.

 

On a SAN, does it matter if I separate data and indexes?

Brent Ozar: Dimitri says, “Speaking for file groups. If all my storage is on a SAN, does it make any difference for performance to have data and indexes on different file groups?”

Erik Darling: Not unless they’re on totally different spindles or I guess maybe if you have like a real fancy SAN and they’re routed through different storage paths, then maybe – through different network storage paths rather, not the other thing.

Brent Ozar: That was the reason we always talked about it in the old days – when I was your age – we would put the data on one set of drives and the indexes on another. The theory was you could read from the data and rebuild the indexes faster, like if you’re reading from one set of spindles and writing to the other. These days, kids seem to have this really fancy concept called memory. They put a lot of their data up in memory so then they don’t have to read from disks in order to write to disks but I don’t know, it’s just to give you the live broadcast – 64 gigs of RAM there in that desktop. Mine is a chump compared to Erik’s who has…

Erik Darling: 128.

Brent Ozar: 128 gigs of RAM in his desktop, ladies and gentlemen.

Erik Darling: Wouldn’t it be cool if SQL accepted weird slang for date expressions? Like “when I was your age.” Like “back in the day.”

Brent Ozar: We could write that.

Erik Darling: It would just interpret it, like based on your age and mindset.

Brent Ozar: Yeah.

Erik Darling: My barber is an old guy and he always says stuff like, “There was a time” and “back in the day.” Every sentence starts with it. Or it’s like “Nowadays.” Nowadays would be like the date minus seven or something.

Richie Rump: Did the html tag flash actually work? Did the marquee tag work and it scrolled across. Blink, how’s that?

Brent Ozar: Nice.

Erik Darling: Because it is a structured English query language so I think it should accept a wider variety of terms to express dates.

Brent Ozar: Yesteryear.

Erik Darling: Yeah.

Richie Rump: You want to make things more complex for DBAs? Is that what I’m hearing?

Erik Darling: Well, for developers.

Brent Ozar: No, just developers.

Richie Rump: Well then, sure, fine, why not? We’re not updating things every two weeks now, right? What could possibly go wrong?

Erik Darling: No one depends on that stuff.

 

Should I increase max worker threads?

Brent Ozar: Neil says, “Have you guys had any experience with high concurrent connections having max worker threads run out of idle threads? Is there a reason why I should increase max worker threads?”

Erik Darling: Noooooo.

Brent Ozar: Why no?

Erik Darling: I think we have a joint blogpost coming up about this soon about how like…

Brent Ozar: We do.

Erik Darling: So max worker threads, it sounds like one those – like, oh, if I just had some more worker threads everything would be fine because I’m running out of them. The problem is when you add worker threads, you don’t add CPUs, you don’t add CPU resources, those threads are all now competing on the same CPUs you had before. So now where your workload may not fall off into the abyss, like your server may not stop responding, but it’s going to get really close because all those threads are going to be running on CPUs and slowing down and sucking at life.

Brent Ozar: I had one client he set max worker threads at like 3,000. They actually had that many queries running simultaneously and they didn’t get – I said, all right, divide out the number of cores you have and that number of worker threads. If they’re all active and they could make progress, think about how little CPU time they get because they just go on for four milliseconds of CPU time and then step off the line for someone else to execute. They take longer, not shorter, so it’s pretty amazing.

Erik Darling: What you really want to do is if you are consistently running out of worker threads and you can’t add CPUs is start looking at parallelism because parallelism is one of the biggest consumers of threads on most servers that I see. So either cutting MAXDOP in half, instead of 8, cut it to 4. If it’s 4, cut it to 2. Maybe boosting cost threshold up, but those are short term, artificial ways to do it until you have time to do the query and index tuning you have to do so that queries don’t have to go parallel anyway.

Brent Ozar: Bingo.

Erik Darling: Again, that’s a performance hit too because now these queries, you know, either were going parallel before and aren’t going now or we’re using more cores and finishing faster now or we’re using less and taking longer. So any situation with max worker threads and all this other stuff, you are going to slow stuff down until you make it better. There’s no great solution.

 

How many VLFs are too many?

Brent Ozar: Renee asks, “Do you guys have a rule of thumb on how many VLFs are too many and how do you minimize them?” Yeah, run sp_Blitz. Sp_Blitz warns you and we warn you when it’s over 1,000, not because that’s bad but we want you to fix it before you get up to 15,000, 20,000. There’s a link in there on how you go about fixing those, really quick and easy fixes.

 

Why do my missing index suggestions disappear?

Brent Ozar: Ben says, “I use sp_BlitzIndex against a somewhat high transaction db.” Okay, quick, podium moment here. Everything you work with will always seem like high transactions or high size. Never use those terms. Instead, use real numbers. He says, “sp_BlitzIndex will display five high value missing indexes for a specific table but then I do an index rebuild and reorg on Sundays and those five suggestions disappear. Why do they come and go?”

Erik Darling: Because they’re karma chameleons.

Brent Ozar: Karma karma karma chameleon…

Erik Darling: Just kidding. I look nothing like Boy George today. So in SQL Server 2012 is the only one that still has it for limited versions of 2012 because it got fixed in service pack 3 I think, but there was a bug that happened between 2008 R2 and in 2012 where every time you rebuild an index it clears out the DMVs for that index. So like all the usage and the missing index stuff. I want to say, Kendra has a blog post on littlekendra.com about which versions and which commands still reset certain counters. I want to say that maybe the rebuild thing still might clear out missing index stuff and it [inaudible] other DMV counters. There’s something wonky about it that I’m glad you wrote down for me. I would check there and get the full list but that’s the reason why. So stop rebuilding indexes.

Brent Ozar: There you go.

 

What’s your favorite high availability feature?

Brent Ozar: Next up Guillermo says, “For high availability for SQL Server what’s your preferred method between failover clustered instances or availability groups?” I’m going to take away his restriction and I’m going to say any high availability mechanism and we’ll each give answers. Erik, what’s your favorite and why?

Erik Darling: I just talked.

Brent Ozar: Richie, what’s your favorite and why?

Richie Rump: Favorite…?

Brent Ozar: I’ll answer. My favorite high availability solution is usually either VMs, just a single VM, because it’s easy. It’s not perfect. There are all kinds of things it doesn’t really protect you from. If someone drops a table, if someone hoses up a patch. But it just is kind of free high availability, especially for your older stuff that you can’t afford to reinstall or buy a fancy pants edition for. Barring that, I like failover clustered instances a lot because they’re really similar to what they’ve been around for years, so it’s easy to get books and training on those.

Richie Rump: Cloud databases.

Brent Ozar: Cloud databases. That’s such a good answer.

Richie Rump: Azure SQL database you have multiple copies when you spin it up so if one fails it jumps to the next one and then starts rebuilding another one. So that’s free. It’s out of the box. You spin it up and it’s there. It’s the cheapest way to get HA.

Brent Ozar: How about you, Erik?

Erik Darling: Actually I’m going to expand a little bit on what you said. My favorite setup of all time is the failover cluster with log shipping because that gives you HA and DR outside and log shipping, it doesn’t get any easier than log shipping. If you screw up log shipping, god help you with anything else.

Brent Ozar: It’s pretty bulletproof. Doesn’t rely on Windows domains, anything like that. Speaking of which, the guys over at Madeira Data, Guy Glantser, just wrote a blogpost on building Always On Availability Groups without a domain controller. He does a three-node Always On Availability Group with no domain controller just in work groups. The checklist to do it is probably ten pages long if you go print it out. So, Guy Glantser. We’ll put that link in the show notes. It’s actually in this week’s Monday links too as well. I was like everybody needs to see this. It’s crazy. Thanks, everybody, for hanging out with us at this week’s Office Hours. We’ll see you guys next Wednesday. Adios, everybody.


Memory Grants and Data Size

What does memory do?

In SQL Server, just about everything. We cache our data and execution plans in it (along with some other system stuff), it gets used as scratch space for temporary objects, and of course queries use it for certain operations, most notably sorting and hashing. And of course, now Hekaton comes along to eat up more of our RAM.

In general, not having enough of it means reading pages from disk all the time, but it can have RAMifications down the line (GET IT?!), like queries not having enough memory to compile or run, and your plan cache constantly being wiped out.

If you’re struggling with the limits of Standard Edition, older hardware, bad hardware choices, or budget issues, you may not be able to adequately throw hardware at the problem. So you’re left to have someone spend way more money on your time to try to mitigate issues. This of course means query and index tuning, perhaps Resource Governor if you’ve made some EXTRA BAD choices, and last but not least: cleaning up data types.

How can this help?

Leaving aside the chance to maybe make your storage admins happy, you can also cut down on large memory grants for some queries. Here’s a quick example.

We’ll create a simple table. In order to make Joe Celko happy, it has a PK/CX. We have an integer column that we’ll use to ORDER BY. The reason for this is that if you order by a column that doesn’t have a supporting index, SQL will need a memory grant. The VARCHAR columns are just to show you how memory grants increase to account for larger chunks of data.

Some test queries

When we run the queries above, we can see in the query plans, and thanks to fairly recent updates (2014 SP2, 2016 SP1), a warning in actual plans about memory grant issues.

To make this a little easier to visualize, we’ll use an Extended Events session using a new event called query_memory_grant_usage. If you want to use this on one of your servers, you’ll want to change or get rid of the filter on session ID — 55 just happens to be the session ID I have.

Here’s what we get from our XE session.

Does anyone have a calculator?

Our query memory grants range from around 8 MB to around 560 MB. This isn’t even ordering BY the larger columns, this is just doing the work to sort results by them. Even if you’re a smarty pants, and you don’t use unnecessary ORDER BY clauses in your queries, SQL may inject them into your query plans to support operations that require sorted data. Things like stream aggregates, merge joins, and occasionally key lookups may still be considered a ‘cheaper’ option by the optimizer, even with a sort in the plan.

Of course, in our query plans, we have warnings on the last two queries, which had to order the VARCHAR(8000) column.

Barf

Sort it out

You may legitimately need large N/VARCHAR columns for certain things, but we frequently see pretty big columns being used to hold things that will just never approach the column size. I’m not going to sit here and chastise you for choosing datetime over date or bigint over int or whatever. Those are trivial in comparison. But especially when troubleshooting memory grant issues (or performance issues in general), foolishly wide columns can sometimes be an easy tuning win.

Thanks for reading!

Brent says: whenever anybody asked me, “Why can’t I just use VARCHAR(1000) for all my string fields?” I didn’t really have a good answer. Now I do.


SQL Server DBA’s Guide to the Gitlab Outage

SQL Server
17 Comments

This week, developer tools company GitLab had a serious database outage.

The short story:

  • An admin was trying to set up replication
  • The site had an unusual load spike, causing replication to get behind
  • While struggling with troubleshooting, the admin made a lot of changes
  • After hours of work, the admin accidentally deleted the production database directory

You can read more about the details in GitLab’s outage timeline doc, which they heroically shared while they worked on the outage. Oh, and they streamed the whole thing live on YouTube with over 5,000 viewers.

There are so many amazing lessons to learn from this outage: transparency, accountability, processes, checklists, you name it. I’m not sure that you, dear reader, can actually put a lot of those lessons to use, though. After all, your company probably isn’t going to let you live stream your outages. (I do pledge to you that I’m gonna do my damnedest to do that ourselves with our own services, though.)

I want you to zoom in on one particular part: the backups.

After the above stuff happened, it was time to recover from backups. In the outage timeline doc, scroll down to the Problems Encountered section, and you’ll see 7 bullet points. GitLab used PostgreSQL, Linux, and Azure VMs, but I’m going to translate these into a language that you, the SQL Server user, can understand.

My new transparency heroes, up there with Stack Overflow

Their 7 layers of protection were:

  1. LVM snapshots taken every 24 hours
  2. Regular backups every 24 hours
  3. Disk snapshots in Azure
  4. Synchronization to staging
  5. Replication
  6. Backups to Amazon S3
  7. Backup failure alerts

Let’s turn this into SQL Server on Windows language.

1. OS volume snapshots

In Windows, these are VSS (shadow copy) snaps. They freeze SQL Server’s writes for a matter of seconds to get a consistent picture of the MDF/NDF/LDF files of all of your databases on that volume. (These are not database snapshots, which are also useful in some cases, but unrelated.)

VSS is a building block, and you don’t hear Windows admins just using the term VSS by itself without also referring to a third party backup product. These are usually the products you despise, like NetBackup, which use VSS to substitute for full backups. Depending on your vendor, you may or may not be able to apply additional point-in-time transaction log backups to them. If the product doesn’t have that capability, it usually resorts to doing VSS snaps every X minutes, so it looks like you have a full backup every X minutes that you can restore to – but no other point in time.

Because of that, they’re usually a last-resort for SQL Server users where point-in-time recovery is required. (However, they’re better than nothing.)

2. Regular database backups

You’re familiar with native BACKUP DATABASE commands in SQL Server, and you probably know the difference between:

These are usually a DBA’s first choice for recovery. However, you’re only as good as your last restore. (In GitLab’s case, their backups were failing silently.)

I adore transaction log shipping because it’s essentially testing my log backups all the time. Log shipping is easy to set up, nearly bulletproof, and works with all versions/editions of SQL Server. Don’t think of it as just your disaster recovery: it’s also verifying that you’ve got good backup files.

3. Disk snapshots in Azure

On premises, this is the equivalent of a SAN snapshot or a VMware snapshot. The exact implementation details can either mean that the entire VM is snapshotted, or just the data/log drives.

This is a great insurance policy, and I hear some advanced SQL Server DBAs saying they do this before they undertake something dangerous like a SQL Server version upgrade or a massive schema change. However, rollback is all-or-nothing: if you revert the snapshot, you’re going to lose everything since the snapshot. (That’s why it makes sense for scheduled outages involving big changes with no simultaneous end user access.)

The usual problem with relying on volume snapshots as part of your normal recovery routine (not manual snaps) is that they’re done outside of the SQL Server DBA’s territory. The SAN admin usually controls when they happen, and who has access to them. If you’re going to rely on volume snapshots as part of your backup plan, you have to test those snaps.

In a perfect world, you build automation so that your snapshots are immediately made available to a second SQL Server, which then performs a CHECKDB on that snap. However, that costs licensing money plus personnel time, so I rarely see it done. Folks just assume their snaps are okay – but the SAN error emails aren’t sent to the DBAs.

4. Synchronization to staging

In GitLab’s case, they were pulling parts of the data to another environment. (GitLab users – I’m going to take some liberties here with the description.)

In your case, think about a periodic ETL process that takes data from production and pushes it into staging tables in a data warehouse. If the poop really hit the fan hard, you might be able to recover some of your most critical data that way.

The DBAs in the audience might immediately give that method the finger, but keep in mind that we’re in a new age of DevOps here. If everybody’s a developer, then you can do more creative code-based approaches to recovery.

5. Replication

Outside of SQL Server, it’s common to see replication used as a high availability and disaster recovery technology. Other platforms just echo their delete/update/insert operations to other servers.

GitLab wrote that their “replication procedure is super fragile, prone to error, relies on a handful of random shell scripts, and is badly documented” – and I can certainly relate to that. I’ve been in replication environments like that. (I built one like that when I took the Microsoft Certified Master exam, ha ha ho ho.)

In the SQL Server world, I see a lot of replication setups like that, so the thought of using replication for HA/DR usually prompts reactions of horror. It’s just a cultural thing: we’re more accustomed to using either direct copies of transaction log data, or direct copies of the data pages.

So when you read “replication” in GitLab’s post-mortem, think database mirroring or Always On Availability Groups. It’s not the same – it’s just what we would culturally use.

Just like your backups can fail, your AGs can fail. Replication breaks, service packs have surprises, all kinds of nastiness. Data replication in any form doesn’t make your job easier – it becomes harder, and you have to staff up for it.

6. Backups to Amazon S3

I’m a huge fan of cross-cloud backups because stuff happens. It shouldn’t be your primary failover mechanism – especially if you rely on proprietary services that only one cloud provider offers – but it should be a line in your insurance policy.

It just so happens that Tara’s working on a white paper on doing this with Google Compute Engine’s newfound SQL Server capabilities. More on that soon.

7. Backup failure alerts

I know way, way too many admins who set up an Outlook rule to dump all of their monitoring emails into a folder.

These admins say they go in and look at the contents of the folder every now and then to see if there’s anything important. They go in, do a quick scan, get overwhelmed by the thousands of emails, and move on to other stuff.

I know because I used to be one of ’em myself.

Even worse, when the backup monitoring stops working, these admins think that no emails = good news. They never stop to think that there’s a much bigger problem at play.

I went into one giant corporation, had a room full of DBAs in suits, and the first sp_Blitz we ran pointed out that they had corrupt databases. The head DBA said condescendingly, “Your script must be wrong. We have CHECKDB jobs set up on all of our servers, and we never get failure emails.” Upon drilling down, we found out that the jobs had been failing for years, but the notification emails were going to a defunct distribution list. They permanently lost data, and had a very awkward discussion with the end users.

More layers of backups only help if they’re working.

No pointing fingers at GitLab today, folks.

Go run sp_Blitz on your own SQL Servers, look for priority 1 alerts, and get ’em fixed.

I’d rather not see you on YouTube.


Always On Availability Groups Now Supported in Google Compute Engine

I’m excited to finally be able to talk about something Erik, Tara, and I have been working on for the last few months.

Here in the SQL Server community, when I mention cloud, you probably think of two companies: Microsoft and Amazon. We’ve been blogging about SQL in AWS for years, and Microsoft throws a ton of marketing money at the SQL Server community, talking about Azure at every possible conference and user group.

Turns out there’s another cloud company.

They’re kinda big. You might have heard of them. And they’re turning their attention to Microsoft SQL Server shops.

Google Compute Engine

Google Compute Engine is infrastructure-as-a-service (IaaS), selling virtual machines by the hour like Azure VMs and AWS EC2. You can run whatever you like in these VMs, and Google has long supported running SQL Server in GCE. You could build your own SQL Servers, or use pre-built (and licensed) instances of SQL Server 2012, 2014, or 2016 – but only Standard or Web Editions.

Today, GCE supports Enterprise Edition AND Always On Availability Groups.

We’ve got a white paper coming soon on how to build and test it, plus more cool stuff in the pipeline that DBAs will love.

Why Run SQL Server in Google Compute Engine?

Lemme get one thing straight first: the single biggest decision factor when it comes to cloud providers is the list of cloud services you’re already using. If your developers are building things like crazy on top of a service that’s only available in one provider, you probably want to stick with that cloud provider. If you try to span cloud providers, you’re going to run into latency issues and data egress charges.

I’m no cloud analyst, but I’d guess this is why cloud vendors are racing to build things nobody else has. (For Microsoft, that means things like Azure SQL Data Warehouse.) Cloud vendors want you to use these proprietary services so that you get locked in.

But as long as you don’t get tied into proprietary services – and most on-premises shops running SQL Server haven’t yet – then you can look at other decision factors like VM support, storage speed, networking, etc. It’s kind of a religious war right now – to give you an idea of one guy’s take, read The HFT Guy’s post. (I don’t agree with some of the stuff in that post- for example, AWS’s dedicated instances make sense if you need to lift-and-shift SQL Server from on-premises VMware licensed by the host.)

Religious flame wars aside, one of GCE’s biggest differentiators is their billing:

  • With Azure VMs, you pay a set price per hour. You can kill your VM whenever you want, and the billing stops. If you want a discount, you either gotta shut the VM off (ha ha ho ho), or you gotta contact a salesperson and sign an Enterprise Agreement. (Gotta protect those highly-paid partners.)
  • With AWS EC2, you can get discounts. If you decide you want to keep a VM instance size around for a while, you can reserve that instance, pay an up-front fee, and then pay less by the hour for that VM. It’s not very flexible, though.
  • With Google Compute Engine, your price just goes down. The longer you keep a VM running, the VM price just keeps dropping, and I’m not talking about years – I’m talking about weeks. You automatically get up to 30% off workloads that run for a significant portion of the billing month. Here’s more details.

The small business guy in me really loves the simple GCE approach. Don’t force me to make financial gambles on my 3-year infrastructure plan – that’s why I’m in the cloud in the first place.

Speaking of financial gambles, let’s talk licensing.

How to License SQL Server in Google Compute Engine

Just like the other providers, there’s two ways you can do this.

With Bring Your Own Licensing (BYOL), if you own licenses already, and your licensing agreement allows it, you can use those.

If you don’t have cloud-friendly licensing, or if you’re starting a new project, then you’ll want to avoid forking over a large up-front fee for licensing. Google will rent you the licensing along with your VM:

  • Enterprise Edition – $0.399 per core/hour (roughly $300/core/month)
  • Standard Edition – $0.1645 per core/hour (roughly $118/core/month)

Because you, dear reader, are a data geek, you’re going to run numbers and realize that renting Enterprise Edition licensing from Google for a year almost exactly matches up with what it would cost you to buy that same licensing outright ($3500/core/year). (AWS is the same story.)

If you know that your licensing needs are predictable, and that you’re going to need SQL Server Enterprise Edition for a year or more, and you can afford the up-front costs, then yes, you should buy that licensing and bring it to the cloud. But again – the cloud is about flexibility, being able to adapt to changing demands.

That’s a big focus in our upcoming white paper, too: how DBAs can help the company adapt to changing performance and availability demands by just spinning up a new VM, adding it into your AG, failing over to it, and shooting the old one.

Like admins say these days, treat your servers like cattle, not like pets – and we’re going to show you how to do it with SQL Server. I’m excited to show you that soon, and I’m honored that Google asked us to be a part of the project.

Let’s talk about it at Google Cloud Next.

I’ll be at Google Cloud Next, GCP’s conference, on March 8-10 in San Francisco. We’ll be talking more about the specifics in the Running Microsoft SQL Server on GCE session. If you’re already registered for Next, you can reserve a seat in the session now.

See you there, and I can’t wait to share more details soon.


Updating the Stack Overflow Demo Database

SQL Server
1 Comment

StackOverflow.com shares your questions, answers, comments, votes, users, badges, etc by doing a public data dump that you can download via BitTorrent.

I take that data, and I turn it into a SQL Server database that you can query. It’s so much better than the Microsoft sample databases because:
so-logo

  • It’s just a handful of easily-understood tables
  • It has real-world data distribution
  • Sample queries to tune are available at data.stackexchange.com
  • It’s big enough to see real performance issues (the 2017-01 version is up to 110GB)
  • It’s actually fun to read the data while you’re working with it

If you like playing around with this kind of thing, you can get the latest version now.


[Video] Office Hours 2017/01/25 (With Transcriptions)

This week, Brent, Erik, Tara, and Richie discuss whether an antivirus can corrupt a database or remove data files, the biggest node clusters they’ve worked with, whether cursors are a good idea for reporting queries, authogrowth setting for new servers, memory-optimized tables, index rebuilds and more.

Here’s the video on YouTube:

Office Hours Webcast - 2017/01/25

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Enjoy the Podcast?

Don’t miss an episode, subscribe via iTunes, Stitcher or RSS.
Leave us a review in iTunes

Office Hours Webcast – 2017-01-25

 

What’s the big deal with antivirus software on SQL Servers?

Brent Ozar: Graham asks, “I’m reading Denny Cherry’s book Securing SQL Server. He says for antivirus configuration he recommends not scanning the data files because if a data pattern matches that of a known virus the antivirus could corrupt the database or remove the database file. Has anyone seen this happen?”

Tara Kizer: I have not seen anyone say that that’s why to exclude it from the antivirus software. Usually, people are saying it for performance reasons you don’t want it to scan. There’s like 60 exclusions that you’ll want to add to antivirus but I haven’t seen the issue that Denny is referring to. But of course, I always have exclusions in place.

Brent Ozar: Antivirus apps don’t change the file, they just quarantine the file. They just block it so you can’t access it, so that shouldn’t cause corruption.

Erik Darling: Like MDF and LDF are open in SQL Server, I can’t do anything to them. Would an antivirus app be able to grab them and do something?

Brent Ozar: No. The time where it catches you is on startup. If you stop the instance and start it then you can have it where the antivirus has grabbed a lock on one of the files to scan it and SQL Server skips that database on startup.

Erik Darling: Oops.

 

What’s a good alternative to the T-SQL debugger?

Brent Ozar: Thomas asks—this is such a good question—“Do you know of a good working alternative to the T-SQL debugger built into SSMS?”

Richie Rump: Wait, T-SQL has a debugger? I should try that. That would be amazing.

Tara Kizer: I can’t ever get it to work.

Erik Darling: It accidentally comes up every time I accidentally hit F6.

Tara Kizer: Yeah, exactly.

Brent Ozar: For those of you who haven’t had the pleasure of using the debugger before, you will learn once that it stops the SQL Server while you’re stepping through debugging. So if you ever want a way to bring production down quickly, that’s a really surefire way to do it, especially if your code has locks in any of the code that you’re stepping through. Hello, ugliness. So to answer your question, I haven’t seen one either.

Richie Rump: Yeah, I’ve never used it. Typically, if I ever feel the need to use a debugger that means I probably need to break up this script into something smaller.

 

Why does patching suck on a 10-node cluster?

Brent Ozar: Stephanie asks, “We have a ten-node geo-cluster. SQL patching takes hours upon hours.” I’m not laughing with you, I’m laughing at you. “Do you see any issues with pausing the node then stopping the cluster service locally before starting a SQL Server service pack upgrade?” So what was the biggest cluster you guys have ever had in terms of nodes? I’m thinking like five.

Tara Kizer: At my last job we had a 14-node cluster, but this was on SQL Server 2012 where you could have five total replicas, four secondaries and one primary. So this 14-node cluster had several availability groups spanning different nodes. We had tier one and tier two servers at the primary site and then we had tier three servers and lower hardware at the DR site. We needed all the databases to be able to be queried on one instance for the business users. So they all had that as a replica too.

Brent Ozar: How did you do patching on that?

Tara Kizer: We did the tier two and tier three servers, those are the synchronous and asynchronous replicas, we did those ahead of time. Production wasn’t running on them. Then tier one we did at night at like 8:00 I think it was. It did not take us hours upon hours though. But again, we split up the work. I think that we hit all of the primary replicas at the same time.

Brent Ozar: One big outage. Yeah, just planning takes hours to get everything staged and ready to go, get everybody’s approvals ready to go.

Tara Kizer: This was Microsoft security patching, as far as service packs, same type of thing though.

 

Why do memory-optimized tables use so much memory?

Brent Ozar: Oh, Kyle, bless your heart for asking this question. Kyle says, “Memory-optimized tables seem to occupy a ton of memory.”

Erik Darling: Go on, sir.

Brent Ozar: Let’s think through that for a second. He says, “I thought they would be compressed but I guess I read the specs wrong. Is compression anywhere near the horizon for Hekaton? It would be nice to send Hana packing. Thanks.” Nope. So what Microsoft recommends is that you need 2x the size of the tables. So like if you have 100 gigs worth of Hekaton data, you’re supposed to have 200 gigs worth of RAM because Hekaton is append only. All new rows are coming in as new versions but also deleted rows and updated rows come in as new versions. So that every now and then SQL Server does the equivalent of garbage collection, getting rid of the old versions of rows. It needs a new table in order to go do that. Like a whole lot of other things in SQL Server these days, these features use memory because memory is cheap. Microsoft just figured out that memory is cheap and it’s time for us to recognize the same thing.

 

Does anyone use database snapshots with Always On AGs?

Brent Ozar: Oh my goodness. Roland says, “I have a database with 1.3 terabytes and lots of indexes on my AGs. Over on my secondary replica I have a snapshot of the database. When I start the index optimization on the primary replica my redo queue builds up very high. Last time it was up to 150 gigs in the queue. The only way I was able to speed it up was to drop the snapshot. Is there a best practice for how to do huge updates in snapshots along the same time?” Have you guys ever used snapshots in combination with an AG? I have not.

Erik Darling: Why would you?

Brent Ozar: No, everybody is shaking their heads. The only—I’m like stretching—if you want to get a point in time consistent view of the data maybe, like if you wanted to say, “I want to see my reports as of exactly 8:00 a.m. this morning.” The one thing you might be interested in is 2016 has parallel redo. If you go to groupby.org we have a session from Bob Ward on there. He talks about the differences that parallel redo makes in SQL Server 2016. It’s his SQL Server 2016 “It Just Runs Faster.” I don’t know of anybody using snapshots with availability groups though so that kind of stumps me there.

 

Should I stop SQL Server before I shut down Windows?

Brent Ozar: Boy, you guys have got great questions today. You guys are like the A+ audience. I’m looking forward to this one. We might be here for an hour.

Erik Darling: D- answers.

Richie Rump: Come back next week.

Brent Ozar: Yeah, can you just trickle these out? You don’t ask all your good questions at once, leave some of them. I tell you, that audience last week, man, they were red cards. They were not good. Just kidding because I know a lot of you were in there. Thomas says, “I’m moving a storage array and I’m shutting down all my SQL Servers. Should I stop the SQL Server services first or just shut down Windows?”

Tara Kizer: I’d just shut it down. Shut down Windows.

Brent Ozar: Yeah.

Tara Kizer: T-SQL Server takes care of us. It knows what to do when it comes back up and it has to go through crash recovery.
Brent Ozar: If you were worried about like a failover of clusters or anything, you could shut down your passive nodes first and then finally shut down the active if you wanted to just walk it all the way down.

 

Brent Ozar: William says, “The debugger, the first time I tried it out it was on my QA system. I soon afterward had several confused developers at my cubicle.” That’s probably true. If you want to take a break from work and get the team to go out for beers or something, you could just fire up the debugger and no one will be able to get any work done.

 

Are cursors a good idea in reporting queries?

Brent Ozar: Adam says, “I never use cursors in my report queries.” Well, Adam, our relationship is off to a good start. “But I’ve been playing around on Stack Overflow and I see some incredibly convoluted answers to problems. They strike me as they would be much easier solved by a cursor. Granted, these problems are situations where the schema is faulty or part of the solution would be better handled in the presentation layer.” He says, “Are cursors ever a good idea for reporting queries?”

Richie Rump: I haven’t run across one.

Erik Darling: No. I mean, like, you’ll find some like way out there crazy edge cases. I think Aaron Bertrand has blogged about using cursors in certain scenarios for like grouped medians where they’re somehow faster than a set based solution but really, for the most part, you’re not going to find that in everyday code.

 

What happened to all my drive’s free space?

Brent Ozar: Next up, Rosemary—Rosemary had emailed us earlier and I said, “Hey, you should ask that question during Office Hours.” Rosemary asks, “I’m using backup compression and I’ve got multiple backup files on one dedicated drive. I’ve got six 50-gig files, 300 gigs all together, but the disks that I’m dealing with only show 6 gigs free and available. I don’t understand why 800 gigs of drive space minus 300 gigs worth of backups means that I only have 5 gigs available. Where did my other 500 gigs go?”

Richie Rump: Erik took it.

Tara Kizer: What’s the size of the backup file on disk? I can think of two things here. Trace flag 3042. But that would only apply while the backup is running, so the algorithm that they use in the beginning of the compression process, it may estimate a much larger file size needed than the compressed size. We had free space issues on some mount points at a job that I was at, so we had to use trace flag 3042 so that it wouldn’t overestimate the size of the needed file. The other thing that I’m thinking of that it might be is the volume shadow copy file, I think. I don’t think you can see those if you have Windows Explorer set to show hidden files. I think they’re still hidden from the hidden, if I recall correctly. We ended up having to do something to see them in Explorer. Then I think we had to stop the service to prevent this large file from consuming space.

Brent Ozar: I love it. You may not be doing the snapshot, that’s someone else doing the snapshot copy. I was going for like a page file. If you have a hidden file that Windows is managing RAM, that will show as like the page file will burn up a ton of space. In Windows Explorer, just set your settings so you can view hidden files and that should hit that one.

Tara Kizer: I think in her original email, because I did see the email come through, it said that they have Windows Explorer set up to show hidden files and the page file should show up in the hidden files. I seem to remember some kind of file had to do with the shadow copy service, you couldn’t see those.

Brent Ozar: You’re absolutely right, yeah, it won’t show up there.

Erik Darling: Would a utility like WinDirStat or something see that?

Brent Ozar: I don’t know. I’ve never done it with VSS snapshots but I wouldn’t be surprised.

Erik Darling: Yeah, whenever I’ve had like a weird drive space consumption issue, WinDirStat, it’s a free-ish or mostly free application I’ve been able to use to track down weird files that are consuming space. There’s always a Stack Overflow on [inaudible] see what it is.

 

What’s the worst that could happen if I add clustered indexes?

Brent Ozar: Danny says, “We have a database full of tables using unique identifiers as primary keys. There are non-clustered indexes on all these unique identifier columns but there are no clustered indexes anywhere on any of these tables. My knee jerk reaction is to replace all these non-clustered indexes with clustered indexes ASAP. Is this a no brainer or could I be setting myself up for disaster doing this?”

Erik Darling: Well you only get one non-clustered index, you know. I mean one clustered index.

Brent Ozar: I was like, wait, what?

Erik Darling: You don’t get multiple clustered indexes. There’s a lot of sort of religious debate around whether you should have a clustered index on a GUID or not. My feeling is generally no unless you have some like really—so like one place where GUIDs shine for clustered indexes is if you have like a really high insert workload, you can run into something called last page contention where you’re always trying to burn up, like if you have an identity column or another kind of sequential clustered index key. So you’re always trying to insert data to the very end of the index, then that’s your hotspot. You can switch over to using a GUID. It’s so easy, just switch over to using GUID for your clustered index. But generally, you could just add some surrogate key, some identity column to the table or something to have a clustered index on instead and keep your non-clustered indexes in place. But that’s going to be a heck of an IO intensive operation. I’d probably opt for creating a new table with the clustered index that I want on it, insert my data over to it, and then create my non-clustered indexes on it.

Brent Ozar: I’d just make sure you’re fixing a problem. Like you can make things worse when you stick on clustered indexes. It’s not normal, but just make sure that you understand that you’re fixing a problem. Like look at what’s the top wait stat on that server before you go [inaudible] with things.

Erik Darling: Another thing you might want to do is if you run sp_BlitzIndex in mode four for that database, you’ll get a lot of information about your heaps back. You’ll see if there’s heaps with a lot of forwarded records in them or heaps that are occupying a lot of space. So like if you have a workload that deletes from heap tables, the pages may not get released from the heap back to the database. So you may just have a bunch of empty pages which SQL is still going to unfortunately scan and look at it when you read from the table. So I would run BlitzIndex in mode four to that database to get some more information about the heaps to see if they are actually problematic.

Brent Ozar: Problematic—like automatic, but different.

 

What should I do about this 30GB log file?

Brent Ozar: Fred says, “I have a database with a 30-gig log file but the data file is 25 gigs. I’ve avoided clearing or shrinking the log. What do you think is going on and what should I do?”

Tara Kizer: Do you really need the disk space? It’s so close in size I’d probably just keep it that way. Is your recovery model in bulked log or full? If so, make sure you have transaction log backups occurring frequent enough for your workload.

Brent Ozar: Full backups are not going to clear that out. This is a 1 terabyte solid state drive. It’s smaller than a phone. It costs $300 at Amazon so just be careful what you’re doing there with a 30 gig log file. Oh my god, it’s the end of the world. Not necessarily.

 

Does rebuilding an index take the database offline?

Brent Ozar: Raul says, “Can someone confirm that an index rebuild requires taking the database offline in Standard Edition?”

Tara Kizer: We’ll confirm that that is not required.

Erik Darling: It will take something offline.

Tara Kizer: Your index is going to come offline for Standard Edition, so it won’t be able to be used to tell the performance of queries. The database remains online, you’ll just have some blocking for the index that’s currently being rebuilt.

 

Should I rebuild all my indexes after I upgrade SQL Server?

Brent Ozar: Roland says, “I’ve read you should do a rebuild on all your indexes after you upgrade SQL Server. Why is this recommended?” Wherever you read that, go hunt them down and bookmark that site because you never want to go back there again. You want to be able to recognize bad advice when you come to it. You did exactly the right thing by asking why is it true, because there’s so much bad stuff out there on the internet many of which comes from our own site. But in this case, not this time.

Tara Kizer: Did you guys ever do update statistics after an upgrade?

Brent Ozar: Yes.

Tara Kizer: I didn’t always, it just depended on the system whether I wanted to spend the time doing it or not, how critical was the server or database.

Erik Darling: What kind of stinks is a lot of the times when you read people saying that you should update statistics after you move, they’re like you have to do it with a full scan, but when you have a multi-terabyte database, you can’t just like willy-nilly, “I’m going to full scan all these statistics, no problem.”

 

A long question walks into a bar

Brent Ozar: Eric asks…

Erik Darling: No, I didn’t.

Brent Ozar: Eric has a big, long one.

Erik Darling: Hey, thanks.

Brent Ozar: Yeah, the other Eric. Only one other Eric. Holy cow, Eric. That’s actually a good example of a question to post over on Stack Exchange. As you see these slides scroll by if you’re on YouTube or on a live broadcast, there’s instructions on how to go to dba.stackexchange. Anytime you find yourself typing multiple sentences, and especially multiple lines, multiple questions, probably want to go tackle that over at Stack Exchange.

Erik Darling: Sounds like there’s a vendor product involved so you might want to check with the vendor on that as well.

 

Know of any good job management applications?

Brent Ozar: Paul says, “Do you have any suggestions for job management applications? We want to manage all our agent jobs from a central server but all the apps I’ve found don’t really do logical mapping, meaning if the job fails on this stored procedure, start at this particular step again.”

Erik Darling: Yeah. Unfortunately, it’s going to cost you money. SQLSentry Performance Advisor does have a whole job chaining and job thinging thing in their product. I haven’t used it extensively because I never had a need for it. I was always pretty cool with having different profiles for that stuff but it’s pretty cool. If you go check out the—I said SQLSentry again—SentryOne Performance Advisor. If you go to the SentryOne website they have the job managing part of the performance advisor product with all sorts of pretty screen caps and an explanation of it in there if you want to look at it. Other than that, I don’t know.

Tara Kizer: I’m trying to think of the tool that we used at my last job because we weren’t just managing SQL Server. The dev ops team wasn’t managing just SQL Server so we had another—some kind of Windows third-party package that we purchased. It wasn’t cheap. I know other people have used it. It came up in the MVP mailing list recently and I think it was Adam Machanic that mentioned that they use it as well. So we had to kick off different things and then kick off SQL Server jobs. I just can’t remember the name of it. It’s an expensive product too.

Brent Ozar: JAMS has one, Unicenter is another one.

Erik Darling: I had a client once that built their own.

Richie Rump: Yeah, I mean, why not just build your own? How hard could it be, right?

Erik Darling: Yeah, it just needs a front-end and some stuff behind it that does things. Use the cloud.

Brent Ozar: Error handling we’ll leave for version two.

Richie Rump: We don’t need that. That’s optional.

Tara Kizer: I’ve got the answer to the tool I was using. It’s called Tidal.

Brent Ozar: Is that a streaming music service by Jay Z?

Tara Kizer: It might be a Cisco product. I didn’t look it up. Just the name.

 

Someone’s asking me for a UTF16 collation…

Brent Ozar: J.H. has an interesting one. He says someone asked him to create a database. He says, “According to our documentation, the databases should have the following settings. Character encoding of UTF 8 or UTF 16 and the collation sequence is case insensitive.” You should just have him give you the exact collation that they want, because this still leaves all kinds of things up in the air. Just say exactly which collation do you want. If you look in Books Online, there’s a list of collations, it’s stunning. That should keep them busy for a while.

Erik Darling: Can you set UTF settings like that on a database? 8 and 16? I’ve never seen that part before. The only thing I’ve ever read is that the bulk bcp had some problems with like UTF 8 or 16, stuff that got solved recently but I don’t know. I’ve never seen that setting for a database.

Brent Ozar: No clue.

 

What should I set autogrowth to?

Brent Ozar: Sam says, “For new SQL Servers, what do you typically set your autogrowth settings to, out of interest? Do you consider VLFs in your decision?”

Tara Kizer: I absolutely consider VLFs in my decision just because I’ve run into excessive downtime as a result of having high VLFs.

Brent Ozar: Go on about that, what’s that like? The excessive downtime? That’s awesome. I know where you’re going with this.

Tara Kizer: This company applied the Microsoft security patches monthly. We were doing this with just about five SQL Servers on this one night. The most critical database of those servers, it was down. We had rebooted the server and SQL Server was up but the database was in crash recovery and I kept looking at the, I think it was the error log, it was frustrating. This application had really high availability SLAs. We were in a maintenance window but we weren’t allowed to have that kind of downtime. It came up after about 45 minutes. I was almost in the process of opening up a case with Microsoft to figure out what was going on. The next Monday I contacted Microsoft via email and that’s when I learned about VLFs. This was a little over ten years ago I’d say, but 45 minutes, and this was like a 7 terabyte database. The transaction log wasn’t even that big for that system. It probably was like, I don’t know, 200, 300 gigabytes. I monitor VLFs ever since that outage because it was such a big deal for this application being down that long. As far as what I set for autogrowth, it just depends on if I’m doing a new database and I know that this database is going to be small, medium, or large, then I’ll set the increments to adjust that. If I think that this database is going to be pretty sizeable, I’ll usually use 1 to 5 gigabytes for the transaction log file. But that depends upon how fast your storage is and how quickly it can grow it out since you can’t use instant file initialization for the log files. Data files I’ll set even higher since we do use IFI on those.

Erik Darling: I think the most important thing though is just to get away from percent autogrow.

Tara Kizer: Yeah.

Erik Darling: Because as your database grows, that’s going to be a larger and larger chunk of file to grow out. So as long as you set it to a fixed increment, I think you’re just always going to be in much better shape than percent autogrow.

Richie Rump: 1 meg.

Tara Kizer: 1 percent or 1 megabyte, I don’t know which one is worse.

Erik Darling: Yes.

 

Why are my VSS backups failing?

Brent Ozar: Graham asks a question we happened to be talking about the other day in the company chat room. “We have a 600 gig database that’s being backed up with a third-party app.” I’m not going to name which one it is because I don’t want these poor people to feel bad. “Our backups have been failing because the VSS writer has been freezing IO for 60+ seconds and the vendor support has not been helpful. What’s the best approach to address this?” I’ll say treat your servers like cattle, not like pets. When I’m dealing with a VM, I’m not even going to do VM-level backups. I’m going to back up my SQL Servers. I’m going to do regular SQL Server full and log backups and use a different technique like log shipping, database mirroring, anything like that to be ready to go to failover to. But I’m the kind of person who never wants to restore the OS. If something goes so horribly awry that I can’t trust the OS and I have to step back in time, I don’t trust that OS period. Who knows if last night’s snapshot is going to be any good or not. So I don’t like doing snaps on SQL Server VMs.

Erik Darling: The only time I’ve ever done anything close to that was when I had to [inaudible] machines and even then I was nervous as hell about—I was like, is this going to…?

Brent Ozar: Yeah. Graham is saying, “This is for this vendor’s SQL Server backup agent.” Yeah, I’m saying I don’t use those. I’m not a big fan of those.

 

Why is rowmodctr so far off?

Brent Ozar: Kyle says, “After letting stats maintenance lapse…” Well, we all have our downfalls. “I noticed that sys.indexes.rowmodctr of some stats is recording far more changes than the number of rows in the table. Like say the table has 100,000 rows and it shows 300,000 changes. Does that mean that nothing is querying the table because SQL Server hasn’t automatically updated the stats?” What a cute question.

Tara Kizer: [Laughs] Cute.

(Note from Brent – I don’t know why I used the word cute here, but that question was awesome.)

Erik Darling: You could also have it hit a threshold where—actually, never mind on that.

Tara Kizer: He doesn’t specify the values and the threshold on 2014 and older is 20 percent. So maybe you haven’t hit the 20 percent threshold. On 2016, it’s lower than 20 percent and Erik has a blog article on it. Can’t really tell you a fixed number for 2016, they changed how the thresholds.

Brent Ozar: Well, there’s a trick. He says that the rowmodctr is higher than the number of rows in the table which means it should have tripped the 20 percent. I can think of two things that would cause this. One is that the stats are only updated whenever you do a select, which could be that you’re right, that it hasn’t tripped the 20 percent threshold. The other thing is if these numbers are tiny. The metric was technically last time I looked 20 percent plus 500 rows. So if we were talking about a 100-row table, then that would also come into play there. Otherwise, I like that nothing has queried the table.

Erik Darling: Yeah, that would be my first instinct.

 

How do I reclaim space from a heap table?

Brent Ozar: Paul Olsen says, “How do I reclaim the data file space used by a heap table?” Did you delete it?

Erik Darling: If you mean you deleted data from there and you have a big empty heap, your options suck. You could create a clustered index on the table, which will fix things up. You can rebuild the table in, I think it’s 2008+, but you have to be real careful with that because when you rebuild a heap, you rebuild all the non-clustered indexes on it as well. That can be a really bad word situation.

Brent Ozar: You can truncate the table too but of course then you lose all the data in it. That deallocates all the space immediately, but if you have any rows, you’re screwed.

 

How should I capture DML events on user tables?

Brent Ozar: Steven says, “If you wanted to capture DML events on user tables how would you go about it?” There’s probably two parts to this. For readers who don’t understand what a DML event is and then also how would you go about capturing them.

Tara Kizer: I wouldn’t do it. The systems that I’ve supported have too much data coming in being changed to ever audit like that. Use could use triggers. Doesn’t CDC help you with that too? I’m not positive on that though. But triggers is how I’ve seen a lot of people do it but I can’t afford the performance of that kind of micromanagement.

Richie Rump: But triggers on DML?

Brent Ozar: Every insert, update, delete.

Tara Kizer: You send it to another, a logging table.

Richie Rump: Oh.

Erik Darling: If you do an after trigger…

Tara Kizer: I’ve implemented it…

Richie Rump: Yeah, just log it in the app, be done with it. Don’t worry about ad hoc stuff.

Tara Kizer: Then you’re missing DBAs coming through and wanting to find the data or whatever.

Richie Rump: Yeah.

Erik Darling: My first instinct is to go with an after trigger if it has to be done on the database.

Tara Kizer: Yeah, definitely.

 

Why did my logical reads change from 2008R2 to 2014?

Brent Ozar: Roland says, “In a simple join of two tables…” What in the holy…? I’m going to say this out loud, “In a simple join of two tables, the second table supplies always the same number of rows as the predicate is always the same as the same pk.”

Erik Darling: Who put the bomp in the bomp…

Tara Kizer: I need to see the data at this point and expected output.

Richie Rump: Stack Exchange would be great place for this question.

Brent Ozar: The only thing I’m going to say is we’re going to nibble out one part of this. He says, “Switching from 2008 R2 to 2014 the logical reads changed. How should I approach this problem?” What changed in SQL Server 2014 to change execution plans?

Tara Kizer: Cardinality estimator.

Brent Ozar: What is that and what does it do?

Tara Kizer: I’ll let you guys take that one. I just know the answer.

Erik Darling: It’s like one of those guys in Vegas who can count cards basically.

Brent Ozar: Hmm.

Erik Darling: So the cardinality estimator figures out how many—it tries to figure out how many rows are going to be returned by an operation based on statistics and histograms and cardinality and joins and all sorts of other fun things like that. In 2014, the cardinality estimator was completely redone for the first time since SQL Server 2000 or 7 or some old, ancient version. A lot of stuff changed. There’s a great white paper by Mr. Joe Sack about the cardinality estimator. I will grab a link for that as soon as I’m done talking unless one of my lovely cohosts would like to do that now. That is where I would go and start reading about it because that’s I think where you’re going to find the key to this. I would look at the execution plan as well as to see if something changed. It sounds like perhaps something got replaced with a nested loops join since he’s talking about one row is being read over and over again. That would be my first instinct but I would check out the execution plans and go read Joe Sack’s white paper about the cardinality estimator.

Brent Ozar: If you want to test that it is the cardinality estimator, on your database change the compatibility level. Change it back to say 2012 or 2008 R2. That will flip you back into the old cardinality estimator and you may get immediately a different execution plan.

Erik Darling: There’s trace flags on the blog too.

Brent Ozar: If the thing is already in production you shouldn’t be flipping the compat level back and forth. The reason why is because it’s going to clear your plan cache. So Erik’s trace flag idea is much better when you’re troubleshooting one query at a time.

Brent Ozar: All right, that sums up our webcast for this week. Thanks everybody for hanging out with us. I hope you had an excellent time. We will see you in the next Office Hours.


First Responder Kit Updates: So Many Updates, Brent Quit

I got 128 GB of RAM just to develop sp_BlitzCache with! Also, thanks to @RichBenner @BrentOzar and @douglane for code, and @LitKnd and @HenrikStaunPoulsen for great ideas!

Grab the newly updated First Responder Kit, and here’s what’s in it:

sp_Blitz Improvements

  • #677 Finally! A bug that isn’t ours! Microsoft forget to let SE clean up query store. We warn you about that now.
  • #650 2016 SP1 gave us exciting new (old) Enterprise features. We give you some boring new wording about them. (suggested by @HenrikStaunPoulsen )
  • #602 More checks to see if you did something ridiculous to your CPUs

sp_BlitzCache Improvements

  • #688 Curious about RLS in your query plan? So is @LitKnd! Thanks, Kendra!
  • #678 In 2016 and up, we’ll tell you something if estimated rows and estimated rows read are way different. Seriously, upgrade to 2016. What’s wrong with you?
  • #665 So uh, you can now sort by ‘all’ and ‘all avg’. This may perform terribly on some servers. Please report any bugs.
  • #663 Sometimes stored procedure costs would show up as double. That’s no good.
  • #661 Function join detection was weird in some circumstances.
  • #660 Sometimes we don’t find any warnings. Now we tell you when we don’t find any warnings. Relief.
  • #659 Operator checks sometimes take a long time. Breaking them into smaller parts helps.
  • #653 Ever see something in BlitzCache that wasn’t yours? Funny thing about SPIDs, huh? We filter on those now, too.
  • #652 Filters! Referencing scalar valued functions! What a bad idea. We’ll let you know about that.
  • #633 Are you modifying a lot of indexes? I don’t know. But BlitzCache does. This looks a little inflated because it takes into account IF branches, but you get the idea.
  • #459 If a stored procedure gets executed once, does anyone care? @douglane doesn’t! Thanks, Doug!

sp_BlitzFirst Improvements

  • #668 Context is everything. Especially for databases. When you’re checking for stored procs.
  • #658 Long running queries being blocked. Great post, Brent! (coded by @BrentOzar )
  • #647 Is SQL running? Yes? So is something other than SQL, and it’s taking >= 25% of your processor time. (coded by @RichBenner )
  • #646 Because you should know if some goofball is restoring a database (coded by @RichBenner )
  • #645 Night of the sleeping SPIDs (coded by @RichBenner )

sp_BlitzIndex Improvements

  • #680 Statistics can have more than one column in them. Go figure.
  • #671 Not all ReportServer databases are just called ReportServer (coded by @BrentOzar )

sp_BlitzWho Improvements

  • #649 Oh, that pesky database_id error.
  • #645 Night of the sleeping SPIDs (coded by @RichBenner )

Start downloading!

Happy Birthday mom!


High Compiles and Multi-Statement Table Valued Functions

SQL Server, T-SQL, TempDB
2 Comments

Way back in 2016

I wrote about what to do if sp_BlitzFirst warns about high compiles. During GroupBy, Adam Machanic gave a great talk on new features in SQL Server 2016. It reminded me of a blog post I wanted to write about one common culprit of high compiles: Multi-Statement Table Valued Functions.

Leaving aside all the other performance problems with them, this is another way they can mess with your server. Let’s look at a typical setup where they get used, and how SQL Server 2016 can help you fix the problem.

You made a bad choice

Rather than use Table Valued Parameters, you decided to pass a CSV list to a stored procedure, and you wrote a crappy function with a WHILE loop in it to split the CSV list. Or you copied one from the first result you found on the internet, even though it warned you that it was a really just God-awful die in a fire idea.

To simulate workload, we’ll need a stored proc, and a loop to feed it a random list of IDs.

Then we’ll grab a CSV list of IDs and run our proc like this.

If I were on vNext, I’d be using STRING_AGG, but I’m trying to keep things clean. No big deal, though. I’ve been using XML this long, right?

We’re going to be using the Extended Events session and query from before. We’ll kick that off, run our loop, and then be on our merry way.

Shredding the data will give us a bunch of repeating patterns that look like this.

10… Top 10… Yeah.

So yeah, that function seems to get up to something once for every ID you pass in. Remember that in our STUFF… query, we grabbed the TOP 10 each time. In the XE session, each time we call the proc, the string splitting function compiles and executes code 10 times. Bummerino. That’s the life of a loop.

On SQL Server 2016 (and really, with any non-looping code), we can get around the constant compilations with a simple rewrite. In this case, I’m calling 2016’s STRING_SPLIT function instead of the MSTVF function.

When when our loop runs, we don’t see the constant compilations of function code.

You’re a joy.

Finisher

We didn’t cover a lot of the performance hits that MSTVFs, and table variables in general can cause. Much has been written about those topics, but I haven’t seen this come up. It’s one of the many reasons why you might be seeing high compilation rates in sp_BlitzFirst, or in your monitoring tools. Or because you’re really good at guessing compilation rates.

This was, oddly, the first demo I wrote on my new home cloud/server/whatever. Not exactly pushing the bounds of performance, here, but Extended Events data shredded pretty fast 🙂

Thanks for reading!


What Do You Wanna Know About Storage?

Storage
35 Comments
The Great Hardware Robbery of 2016

During Dell DBA Days 2016, we got to hang out with Doug Bernhardt of Dell’s storage team. He’s the guy who publishes Dell’s SQL Server documentation, builds their Fast Track reference architectures, and gets to play with all kinds of cool toys.

Doug emailed us and asked:

Is there any solutions type of info (or any info in general) on either Dell or EMC SAN and SQL Server that you think would be helpful to you, your customers, or the SQL community in general?  Anything that would help make better decisions about architecture, design, purchasing, or optimization of any of the Dell EMC gear?

And I figured there’s only one person I should ask – and that’s you, dear reader. So what documentation on storage would you like to see?


When Always On Isn’t: Handling Outages in Your Application

Today’s brief Stack Overflow outage reminded me of something I’ve always wanted to blog about:

Stack Overflow is in limp-home mode

There’s a gray bar across the top that says, “This site is currently in read-only mode; we’ll return with full functionality soon.”

That’s not a hidden feature of Always On Availability Groups. Rather, it’s a hidden feature of really dedicated developers whose application:

This is no small feat. For example, if you’re calling stored procedures to render a page, those stored procedures cannot do any writes. You can’t log page views inside a stored proc, or log execution times. You have to assume that the database may not be writeable. (That’s great practices anyway, since you can then use replicas to scale out all page rendering.)

For bonus points, your application can still do logging and writes – but just not to SQL Server. You might cache writes temporarily to a NoSQL store.

This is all left as an exercise for the reader. Always On Availability Groups give you a limp-home capability for your web site, but it’s up to you to make your web site as functional as possible when the database is degraded.


Analyzing Temporary Stored Procedures with sp_BlitzCache

All the cool kids

Know that when you’re trying to figure out why a stored procedure is slow sometimes, you should use a temporary stored procedure instead of pulling the code out and running it ad hoc.

We frequently point people to Erland Sommarskog’s “Slow in the Application, Fast in SSMS“, because it’s an excellent treatise on many of the things that can go wrong with a query.

But you’re here, so you’re cool

You use temporary stored procs, and you use sp_BlitzCache, but you’re having a hard time tracking down and analyzing plans for temporary stored procs.

The good news is: We got you covered.

Pretender

We have this stored procedure, which looks like just about any stored procedure. It accepts some variameterables, there’s a begin and an end; it’s spectacular. It doesn’t even throw errors.

One day the boss-person comes in and says “we also need to filter on user reputation”, and since you’re a well-paid and loyal employee, you get cracking on that. You create a temporary stored procedure to make sure your code works, and examine any potential performance issues. What a little go-getter you are.

You could use an ad hoc script to look at the plan cache, Extended Events, or Profiler, to capture the plan. But you like us, so you want to use Ye Olde Blytzen Casshe.

One thing we do by default is ignore system databases. They’re often noisy, and lousy with system queries that have nothing to do with real life. We give you the ability to override that, to choose a database to single out for examination, and as of recently, a way to filter by stored procedure. The code to track this one down looks about like so!

I get this back, with some warnings, and the ability to open up the plan and gaze upon it’s awfulness.

No alibi

Sunglasses At Night

There are a few tips in this post, and some really great links to follow and read, if you haven’t already. Hopefully all of them help you become better at performance tuning SQL Server.

Thanks for reading!


[Video] Office Hours 2017/01/18 (With Transcriptions)

This week, Brent, Erik, Tara, and Richie discuss sp_Blitz capabilities, what’s best to use for resolving SQL performance issues, CDC, tools for capturing wait stats, their favorite monitoring tools, temp tables vs table variables, partitioning with constraints, and whether you should get any certifications.

Here’s the video on YouTube:

Office Hours Webcast - 2017/01/18

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Enjoy the Podcast?

Don’t miss an episode, subscribe via iTunes, Stitcher or RSS.
Leave us a review in iTunes

Office Hours Webcast – 2017-01-18

 

How should I prepare for the MCSE exam?

Brent Ozar: Wes Crockett says, “Hi all. It’s my first time here. Really pumped that you guys do these events. I’m recently back in a DBA role…” Back in the saddle again. “…after a few years out of the DBA world.” Now I’m more curious about what you went and did and what you came back for because it’s just interesting.

Richie Rump: [Inaudible] thing didn’t work out for you buddy.

Brent Ozar: Yeah, because he went to go to a developer and realized that the frameworks change every like six weeks. He’s like, man, give me some ANSI SQL. I’m not too bright.

Erik Darling: Real programming sucks.

Richie Rump: Stop moving my cheese dammit.

Brent Ozar: He says his new employer wants him to obtain an MCSE. “Do you guys have a recommended approach to getting there? What books do you think would be valuable, self-study, whatever?” We should all tell our last time that we got a cert and then how we studied for it. Going from left to right, Richie Rump, when was the last time you got a cert and how did you study for it?

Richie Rump: I think I took the 2008 test, the SQL developer stuff. I think I took that. I think that was the last time I took that. Before that, I think it was the PMP that I took. But the SQL stuff, I just actually just took it. I didn’t study, didn’t do anything. I think I may have read one thing on XML and that was it and just went in and took it.

Brent Ozar: How about you, Erik?

Erik Darling: The last certification I got was an A+ cert in 1998 or 1999. I got it by putting together some computers—that was my studying. I basically plugged stuff in and wrestled with interrupt requests. Then I had an A+ cert. I could put CD drives in, and CD burners were super popular.

Brent Ozar: They were. Is this the reason why you just built your new desktop? Are you trying to renew that, are you trying to get you’re A+ renewed?

Erik Darling: Yes, I’m trying to get back on the A+ fast track. I need to get back in their good graces because my career is at a little bit of a standstill right now.

Richie Rump: You know, Erik, I hear the cloud has computers now. Maybe you want to look into that.

Erik Darling: It does, but they’re super slow. They are just the slowest darn things.

Brent Ozar: Tara, how about you?

Tara Kizer: I don’t have any certifications.

Erik Darling: Oh, yeah.

Tara Kizer: I once or twice took a practice test for the MCSE and I failed miserably. I mean, not failed miserably, but I failed. I think I got like a 60 percent on it. I’m just a terrible test taker. It would be hard work for me to pass these things and I already had an established career when I was doing this so I didn’t see any need to do this. So unless your company is requiring it or maybe the next job is going to require it, I really don’t see the need to get certifications.

Erik Darling: Yeah, I’d much rather just get books that cover topics that I’m interested in, or books that cover topics that actually apply to my job. For instance, this thing is constantly on my desk.

Brent Ozar: Calvin and Hobbes?

Erik Darling: Yes. I have every single Calvin and Hobbes book, every single one. One of them is signed.

Brent Ozar: Whoa—by Calvin or Hobbes?

Erik Darling: Both.

Brent Ozar: Oh, wow, dang. Yeah, I tried, I can’t remember if it was 2012 or 2014, which one it was that I was like, “Let me go see how they are now.” Fail. I’m like, wow. Because I’m reading the questions and I’m like I know some of these are wrong—are just completely and utterly wrong. So, yeah, not big fans of certs. Lee Townsend says he ran into way too many paper tigers with certs who couldn’t actually do the job. Kelly says, “Good answer, Tara. I agree. I don’t hire folks based on certifications. I prefer experience.”

Erik Darling: What sucks is that like because it’s a Microsoft exam, they really try to cram new features on you. It’s like your sitting there learning about Hekaton and change tracking and CDC and like all this other stuff that you know practically you are not going to use and you are going to have to apply in your job. That’s my other kind of gripe with them.

Brent Ozar: Every time I sit through one of them I’m like I don’t know what percentage of DBAs in the world have to do the things that are shown on this exam. I don’t understand what their job is like. That’s not a real—that doesn’t measure what I do. Wes says that when he did his intermission in his career he went over and did systems engineering, QA automation, and data analytics. I would also be curious as to what brought you back to database administration.

Erik Darling: Money probably.

 

Will there be a Brent Ozar Unlimited certification?

Brent Ozar: Money, that’s probably true. Fiag says, “How about a project of a Brent certified DBA?” Not no, but hell no. We don’t like taking certifications. Writing those questions is even worse. God, it’s hard. I like doing interviews and just letting people talk about what they’re good at. If I ever did a certificate it would be something involved with doing an interview and going, yeah, she hardly drooled at all. Yeah, she’s good.

Erik Darling: We did kind of have fun, I think around about when I first started me and Kendra did those quizzes, it was like one on indexes and one on T-SQL. Those were fun because they were just like here’s ten questions, what do you think? No one got a badge or a smiley face or a gold star for taking it but it was just kind of, I don’t know, that was kind of fun because it was just a cutesy little thing. If it was like an actual cert I would run.

Brent Ozar: Wes says he came back for money for one reason but he also found a job in an industry he loves. I know, that always makes me feel like I’m less like working when I’m around another company of—we did the GroupBy conference and one of the presenters is a database administrator for, the way that he said it is the Formula 1 team with the red cars and the yellow prancing horses. I was like you lucky, diabolical…

 

Should I put SSRS on a separate server?

Brent Ozar: Tammy asks, “When we do SSRS installs, should we put that on a separate server? Right now some of our servers have everything on one. I don’t think that there are security concerns starting with 2014 and I kind of want them on the same server for licensing purposes but I don’t want to make a bad choice. We have VMs so I could easily add more servers if I want. Should I put SSRS on a separate box?”

Tara Kizer: I think of SSRS as a like a third-party product. Just because it says SQL Server in the name and it’s written by Microsoft, it’s totally different than SQL Server. Think of it as a third-party product. Would you install a third-party product that isn’t supporting the SQL Server, isn’t doing net backup, or supporting software for SQL Server? Would you put a third-party product on your database server? I wouldn’t. I want my SQL Server dedicated to the databases and nothing else except supporting software. I always put the reporting services service or any other services that I need on a separate box. Yeah, you’ve got additional licenses here but the best practice is to have a dedicated SQL Server. If you can support it on the VM and you can add resources, you’re not having a problem, sure, do it. But I’ve worked in corporate environments where we’re just very careful with how things are, how we’re designing things, and making sure that we have a dedicated SQL Server.

Erik Darling: If you’re licensing Enterprise, you have to license at the host and you can spin up as many guests as you want, but Standard Edition, then, yeah, you have to pay per guest. So I don’t know if that changes how you think about things.

 

Can sleeping queries cause blocking?

Brent Ozar: James says he ran sp_Blitz and he noticed that there was—and I’m guessing this is sp_BlitzFirst or BlitzWho. He noticed that there was a sleeping query with open transactions.

Erik Darling: That was me.

Tara Kizer: I logged in.

Brent Ozar: “Does that cause blocking and if so, how do I resolve it?”

Erik Darling: It could. If you do like BEGIN TRAN and you insert, update, delete and then you walk away, you’ll have a sleeping transaction that could cause blocking because you’re still holding onto those things, those little pages and locks and whatnot.

Tara Kizer: Run sp_lock and then the SPID number and see what resources are locked and if anyone else wants to get access to those same resources they’re going to be blocked.

Erik Darling: Another option is if you’re feeling Machanic-y, you can run sp_WhoIsActive and you can use the show sleeping SPIDs parameter of [inaudible]. Then that will show you anything that is asleep. It will show you what’s going on there. Also, I think we recently added that to BlitzWho. I don’t know if it’s gotten pushed out from the dev branch yet though.

Brent Ozar: Oh, that’s a good point.

Erik Darling: You can run BlitzWho if you’re feeling particularly attached to us. It shows sleeping SPIDs there as well.

Brent Ozar: I love that sp_WhoIsActive also has the get locks parameter. It will show you exactly what people have locks on, it’s fantastic.

 

Do service packs always cause reboots?

Brent Ozar: J.H. says, “When I apply a SQL Server service pack or hotfix on some machines it reboots, on some machines, it doesn’t. Is there a reason or way I can tell when it’s going to reboot and when it’s not?”

Tara Kizer: It’s just whether or not the files were locked. If the files were locked, it’s going to require a reboot in order for the installer to make changes to those files.

Brent Ozar: Yeah.

Brent Ozar: Robert has a big long list with a checklist, a question with a checklist. Your best bet there is definitely to post that on Stack Exchange. As you see the slides rolling past, this one right here, it says if you have a multi-paragraph question. The reason why I say that, because this isn’t really multiple paragraphs, you’ve laid it together pretty nicely, but I bet you the smart answerers over there are going to do a beautiful job of injecting more steps. I like where you’re going. It’s very good but just throw it out in dba.stackexchange. You’ll get even better details.

 

What’s better: Profiler or Extended Events?

Brent Ozar: Raul asks a question. “What is the best to use for resolving SQL performance issues?” He gets a little tricky there because he ties us down to only two things and I’m not sure I would use either of them. It’s either Profiler or Extended Events.

Erik Darling: You’ve got to be careful with those, buddy.

Brent Ozar: Yeah, why do you have to be careful?

Erik Darling: Observer overhead, sir.

Brent Ozar: Yes. Schrödinger’s Profiler. Yeah, so the more instrumentation, yeah, the slower your SQL Server is going to go, especially with—Erik has a great demo with functions, how scalar UDFs and multi-statement table valued functions as they’re running, when you turn on profiler or even set statistics IO—boom—just unbelievable overhead. So what would you recommend using instead? Anything we would recommend starting instead of profiler or extended events? Because it sounds like he is after queries.

Erik Darling: Sure, I would start with our tools because that’s the cheapest way. You can check out sp_BlitzCache and you check out sp_BlitzIndex and you can check out sp_Blitz. Those will give you a pretty good general idea of the state that your server is in. Then from there, if I really needed eyes on my server constantly, I would just by a mature monitoring product. I wouldn’t sit there trying to roll my own or sit there trying to get an extended events session right or hope that I can get profiler kicking and working. That’s just me though, I’m lazy.

Brent Ozar: So many jokes there about an immature monitoring product but I’m not going to go there. It’s like Beavis and Butt-Head watching your servers.

 

How do I protect the distribution database?

Brent Ozar: Tim Smith says he has a four-node SQL 2012 availability group. Two of the nodes are in DR, two of the nodes are in his primary data center. “What’s my best way to create a plan for the DR for our distribution database?” He’s put the distribution database in here too, which is that even supported in an AG?”

Tara Kizer: He does say two-node failover cluster instance so the distribution database is not on the four-node AG. He’s got the distribution database on the two node regular failover cluster instance. Because yes, you’re right, distribution database is not supported in availability groups so it has to go into a failover cluster instance. I’ve had environments which have the availability groups across, onto a disaster recovery site and we had replication involved. You’ll have the distribution database at the primary site and the distribution database at your DR site. When you do the failover, you’re going to want to run scripts to recreate replication. Once you do that, you’re going to point it to the distribution database on the other instance. So you’re going to run replication scripts to redirect it. I don’t have those scripts memorized but once you’re ready to do that failover, I think we dropped replication on the first, did the failover, and then added it back pointing it to the new location of the distribution database.

Brent Ozar: I made a face like it sounded horrible but as I’m thinking about it, I’m like, it’s disaster recovery. It’s not like you do this every day.

Tara Kizer: It’s the same process you would do if you moved the distribution database to another server at the primary site. Think about what those steps are.

Brent Ozar: Yeah.

 

Why does CDC hurt when I do this?

Brent Ozar: Adam says, “We have CDC set up using a secondary file group. On some databases, the file size eventually grows to match the size of the primary data file. What could be causing this?” Well, Adam, I think the answer is in your question: CDC. I don’t have any experience with CDC. Do any of you guys? No.

Tara Kizer: I’ve used change tracking but not CDC.

 

Where do you post BrentOzar.com jobs?

Brent Ozar: Michella asks, “Where do you post brentozar.com jobs?” They’re in the SQL Server agent. No, we’re not hiring now but when we do we post a blog post. Normally, we like snipe Tara out of nowhere.

Tara Kizer: Just out of the blue.

Brent Ozar: Then generally we post it out on the blog. We let people apply in public. We’re not just trying to get you to read the blog, it’s just the easiest way. There’s like 100,000 of you that read that thing. Most of you aren’t qualified. That’s okay. We’ll find one or two good ones and we’ll go from there.

 

What are good tools to capture SQL Server wait stats?

Brent Ozar: Cody says he recently attended a seminar talking about performance tuning with wait statistics. Something touched on but they didn’t go into depth with it was what good tools are out there to capture wait stats. When you guys are looking for something to capture wait stats, what do you think about using?

Erik Darling: Extended events. Just kidding.

Brent Ozar: Profiler?

Erik Darling: Again, I stick with our stuff. I’ll go with BlitzFirst, you run it, you can either do since startup and it will show you your wait stats since startup or you can do it for a 30 second or some other second sample and you can get wait stats over a sample of time. So it’s a really nice way to look at wait stats. We make it really easy to read them and figure out what’s going on with them so I think it’s a pretty good way. You can probably find a bajillion scripts out there on the internet that all look at waits stats though, if you don’t want to use our scripts, which I don’t know why you wouldn’t.

Brent Ozar: If you went back to work as a production database administrator and your boss said, because as part of the deal for having you go back to work as a DBA, he said you can have two things, whatever monitoring tool you want and unlimited soups, salads, and breadsticks. Which monitoring tool do you pick and why?
Erik Darling: Good question.

Brent Ozar: I know, right? I’m putting you on the spot.

Erik Darling: Me? I say the same thing every week. Every week I say I would go with SentryOne now Performance Advisor. That is my all-time favorite monitoring tool. I like the way it looks. I like that it has that dark theme so it doesn’t blind me when I’m staring at it at 4:00 in the morning. I just really like the way the charts and graphs don’t give you a bunch of goofy stuff that you can’t do anything about and you can highlight sections of the graphs and it will bring you to the SQL that was running at the time or bring it to the SQL that is currently running, you can view execution plans. It’s got plan explorer built into it so you can view execution plans right from your monitoring tool. You don’t have to do a bunch of other goofy stuff to get your plans somewhere else, which is a shortfall of many, especially web-based monitoring tools, is that you have to download the execution plan to something else.

Brent Ozar: Tara, how about you?

Tara Kizer: Which tool?

Brent Ozar: Yeah.

Tara Kizer: Yeah, I like SentryOne’s Performance Advisor but I’ve also used the Quest tools, Foglight and Spotlight. I think they’re all great tools, the ones that I’ve used.

Brent Ozar: I’d be like, any, just give me a tool, I don’t care which one.

Tara Kizer: Yeah, a tool.

 

What’s better: temp tables or table variables?

Brent Ozar: Dan says he’s having a great debate of temp tables versus table variables. He hates to beat a dead horse—good—then stop. No, seriously, this will be still a debate topic ten years from now. He says, “I believe it’s best to use temp tables instead of table variables but my team members disagree. Recompiles and stored procs always come up.” You can fix that, put your temp tables at the top. Put your temp tables at the top and they won’t necessarily recompile. For more details on that, go read Paul White’s temp table recompile. Google for temp table recompile…

Erik Darling: Oh, temp tables and stored procedures.

Brent Ozar: Yeah, oh my god, he’s got a great blog post, like four of them about plan reuse with those that are shocking. I really dig them.

 

How should I handle NoSQL databases?

Brent Ozar: Richie, Alexander wants to ask you, he says, “What do you think about companies that adopt different sorts of NOSQL? How do you handle it all? Like sometimes they want Cassandra, Mongo, whatever. So why are they picking different kinds of databases and what should you do about it as a DBA?”

Richie Rump: Well those different databases do usually one specific thing really well and they don’t do other things at all. If they do it at all, it’s done pretty poorly, so that’s why you’ll have—if I need something for really fast reads, Mongo is really good for that. But if I need some relational data and then you join between different tables and stuff like that, it kind of falls over a little bit on the reporting side. That’s why you’ll see a lot of, “Hey, I need this tool for this, this and this tool for that.” I know we talked about Elasticsearch, that’s a great tool if want to go and search large amounts of text and things like that. I’m not going to put my relational data in there because it just doesn’t make any sense to do that. So how do you keep up? Well, what do you need to know is really the question. If you need to know about a document database, go look for document databases and start playing with some tools. If you need to know about graph databases, go look at a graph database and go look at some tools. I don’t like to get into the details of it until I’m told this is what we’re going with or if I’m doing some sort of POC. Until then, I just want to keep a high level of understanding what’s going on with each tool and then, hey, we’re going to go with this, then I start going pretty deep in the product.

Brent Ozar: The other thing to keep in mind as you start to spelunk around with lots of multiple tools, some of these tools fold up. If you want to read an interesting thing, search for rethinkdb, why rethinkdb failed. It was a company that was building an open source product and they just folded up recently. It’s a really interesting story behind it, around how hard it is to build an open source tool and make it work, scale, make it make money.

Tara Kizer: What’s the name again?

Brent Ozar: Rethinkdb, all one word, all lowercase. Are you trying to think about something other than SQL Server?

Tara Kizer: You said it was an interesting story, I want to look at it.

Brent Ozar: It really is.

Richie Rump: They had great founders. They were really super smart. They had a great team behind them. They had all this open source stuff coming in all over the place and just this vibrant community and they just couldn’t make it work financially. They even had NASA. NASA was putting it in their stuff. So it’s not like people weren’t using it, just sometimes the financials just don’t work out that way.

Brent Ozar: It’s amazing to read the postmortem because so many people chime in in comments and like, “Yeah, I use rethinkdb all the time but I don’t pay them a dollar because everything is out there and it’s free and it works great.”

Richie Rump: Yeah, we interviewed the founders on Away from the Keyboard last year, and the community manager. We posted the community manager after they had folded and we were like, “Do you want to re-listen to this before we post this, just to make sure you’re good with it?”

Erik Darling: Were things looking sunnier for them then or were they kind of like, “No one is giving us money for this free thing.”

Richie Rump: No, everything was like the sky is the limit. I mean, you’re not going to talk about the bad stuff to a bunch of podcasters. I mean, why would you want to do that? “No, we’re hurting pretty good. Maybe we won’t be around very much longer.”

Erik Darling: If I were working for me, man, I would be out of here.

Brent Ozar: Oh, god. Don’t use our product.

 

Do I have to license SQL Server for TFS?

Brent Ozar: Paul asks, “I just got out of a meeting.” Well, thank god for that. “Where our uptime engineer told—” Up time engineer, what a great job title. I love it. That’s cool. Richie is our down time engineer. “He told everyone that you can use Developer Edition without paying for licensing because we use TFS for development. TFS is a business application and requires licensing.” Some of Microsoft’s licenses, like SharePoint is an example, will include one SQL Server as part of it. So what you want to do is read more about that product’s license. They’re usually production licenses. They’re usually not Developer either.

Richie Rump: I think TFS was Standard but I haven’t been in that game for years, so. But it’s usually just like you said, the one and that’s it.

Brent Ozar: And it’s not like you really want a big, bulky SQL Server for that either. You can live with a small one.

 

How do I protect Analysis Services?

Brent Ozar: M.M. says, “I’m going to setup log shipping as a disaster recovery for our production server.” That’s amazing, we’re doing sort of a weird version of that this week. “We have a separate cube server with SQL Server Analysis Services, do you know what’s required for DR for Analysis Services? We’re not using VMs and my databases are large.”

Erik Darling: I have no idea.

Tara Kizer: No idea. I’ve never even used SSAS.

Brent Ozar: Everything I know from Analysis Services is just from listening to the guys at Prodata talk. It’s like www.prodata.ie, they’re out of Ireland. Just awesome, wonderful, friendly people—Carmen and Bob. The basic idea would be you just stand up another analysis services box over there and when you failover you rebuild your cubes. No sense in keeping the cubes in sync across.

 

Can I do partition switching with foreign keys?

Brent Ozar: Richard says, “I want to use…” Richie likes this. Is it just that you like his name or you like his question?

Richie Rump: I like his name, that’s it. You’ve got a great name, buddy. It means powerful ruler and I’m pretty sure you are powerful in whatever you do.

Brent Ozar: His last name is Seltzer so his middle name must be Alka. “I want to use table switches in SQL Server partitioning with sliding windows.” I’m pretty sure I know Richard too, come to think of it. He wants to do partitions with sliding windows but apparently, he can’t do that if the table is apparent in a foreign key constraint. “Is there any way to do switches or fast clearing out of partitions without dropping and re-adding constraints?” Have you guys done partitioning with constraints?

Erik Darling: Not that kind.

Richie Rump: Yes, but we didn’t do sliding windows. We didn’t slide in the data.

Brent Ozar: Every time I had to do one of those projects I didn’t use foreign key integrity and it didn’t occur to me until just now. Like I never used parent child foreign key integrity. We trusted the ETL process to manage that and make sure it was clean and pure, which of course is a bad idea but you know, I had really smart, nice, trustworthy ETL guys who lived in a van down by the river.

 

Where are best practices for SSRS 2016?

Brent Ozar: Tammy says, “I have another SSRS question. I was looking for best practices for the more recent versions of reporting services and I couldn’t find anything recent. Can you recommend a resource?” No.

Erik Darling: Doug Lane.

Brent Ozar: Even that, not recent.

Erik Darling: Totally, yeah.

Brent Ozar: Do we even know anybody who is using…? We just had the GroupBy conference. Bert Wagner—if you go to groupby.org—Bert Wagner did a session on high performance SSRS and right at the beginning he’s like, “Look, I haven’t used SSRS in the last year or two. This is just general performance advice” because people were asking how is it different in 2016. He’s like, “I don’t know.” I don’t know anybody who is doing 2016 SSRS. Not that it’s bad I just don’t know anybody doing it.

Erik Darling: Tableau.

Richie Rump: In a tableau world…

 

When should we apply service packs and cumulative updates?

Brent Ozar: Interesting. Lee says, “Microsoft wants us to apply service packs and cumulative updates and security releases right away. I’m curious to your thoughts on that.” What did you guys do when you were database administrators or you were managing databases?

Erik Darling: You know, you’ve got to patch those dev and QA servers first, man. You can’t just stick that on prod. That’s my advice.

Brent Ozar: You let it bake for some period of time.

Erik Darling: Yeah. If you’re super cool and you have clusters or AGs then you can patch some secondaries first or some dev servers first and let that burn in and then flip around and do your other stuff. The right-away thing, cool for dev and QA and whatever UAT environment you have, not cool for prod because Microsoft updates will still break stuff. Something seems to go awry with just about every one of those SPs and CUs.

Brent Ozar: The best thing was breaking NOLOCKs. One of the CUs broke NOLOCK. If you had it in development, you would never know because it’s not like you’re worried about concurrency in a dev environment. People are just running queries and you expect locking to happen so even when you bake it in for a while it still becomes an ugly surprise.

 

I want to upgrade and change my cluster…

Brent Ozar: Niraj asks a question that I feel like I want to type with a latex glove. “I need to perform an in-place upgrade from SQL Server 2008 R2 and it’s a cluster. Want to upgrade the cluster to SQL 2012. By the way, I’d like to convert them to Always On.” I’m assuming he means Always On Availability Groups. “Any suggestions?”

Erik Darling: Don’t.

Tara Kizer: I am not a big fan of in-place upgrades in production. I’ve certainly done it in test environments to save us time but never, ever in production. Usually when I’m upgrading from one version to the next I’m also switching hardware, or at least to another server. But if you’re not going to be switching servers, just install a new instance, then do the configuration you want—although if you’re going to be moving from a cluster two node to form a failover cluster instance to availability groups, you’re obviously going to need to do a bunch of work on that server if you’re doing in place. You’re going to have remove the cluster, add it back, and then setup availability groups so I would not recommend doing that all on the production servers. You have no fallback plan.

Erik Darling: The other thing is that since you’re on that version of SQL Server, you’re probably on Windows 2008 R2. That’s just like, you know, a bucket of crap for a failover cluster. Windows Server 2012 and 2016 are much better and god help you if you try to do an in-place OS upgrade.

Tara Kizer: We did those in the test environment just to save time and we’re just like two thumbs up if it works.

Erik Darling: Terrifying.

 

How can I tell which tables are getting used?

Brent Ozar: Robert has a question I love because I used to struggle with this myself. Robert says, “I inherited a legacy system with a lot of dead code and tables. Sys.dm_exec_query_stats fails me when tracking down which objects are being used. What’s the best way to determine what is being used?” Oh, there’s so many ways to do this. I would probably push back and say, why? What are you going to do? Because best case scenario you drop some tables. Worst case scenario, you break somebody’s application.

Erik Darling: Yeah, especially the code. It’s not taking up space so it’s not hurting anything just sitting there. I wouldn’t go ahead and mess with it. If you really want to get into it then set up some sort of auditing or some other monitoring just to see what runs over a 30-day or 90-day period and then get rid of everything that doesn’t. But even then, it’s not really guaranteed that anything is going to change for the better. You get rid of stored procedures that never run, they seem pretty harmless to me, like fingernails.

Richie Rump: Yeah, there’s a development philosophy where you want to delete as much code as possible. So early and often you want to delete code, so you’re always looking to delete code, delete code, delete code so that when it does go out, you’re not having all this commented-out code and/or code just doesn’t run anymore. Typically, I do a lot of that. I’ll go ahead and I’ll write a bunch of stuff—oh, that didn’t work at all. So you go in and you make sure it works and then you do a check in and deleting all the code. Typically, that’s how that would work, but a lot of people, they don’t do that. They just kind of just say, “We moved something and then the old stuff remains.”

Erik Darling: For tables though, one thing that I used to do whenever I inherited a server, I would look for table names that have like BK or back or a date on them, because that’s usually someone like saying, okay, I’m going to do something stupid over here. Let me backup the table as is and then if I need it again, you know. Because like especially when I was doing the ediscovery thing, you would sometimes have to make changes to the relativity’s configuration table, which is just a big long table with like a name and a value and then like a SQL variant column to put in your configuration. There were times when you had to make changes and you didn’t want to mess anything up so I would always select everything over into a table with like table name back with a date so I would know which table is which. So that’s one thing that you can look for sometimes, there’s a pattern that pops up where people make backup tables that are safe to ditch after a while.

Brent Ozar: He follows up and he says his biggest problem isn’t space but Visual Studio DACPAC deploy times. Yeah, it’s not—DACPAC isn’t known for speed and efficiency of deployment. That’s your bigger problem there. If you want to performance tune that process, I’d actually look at switching processes. Because even if you have a bare minimum of tables, as your data file grows, you can continue to be screwed because sometimes the DACPACs will simply in order to affect one alter table, they’ll create a new table and shove all the data into it then drop the old table. That can break all kinds of things, replication, security setups that you’ve done, so I’d want to be careful about that. Thanks, everybody for hanging out with us this week. We will see you guys next week in Office Hours. Adios.


The 2017 Data Professional Salary Survey Results

We asked to see your papers, and 2,898 people from 66 countries answered.

Download the raw data in Excel, and you can slice and dice by country, years of experience, whether you manage staff or not, education, and more.

Community bloggers have already started to analyze the results:

Keep in mind that the data’s only as good as the people who entered it. This was free for anyone to enter, and we didn’t validate their experience, their actual pay stub, or whether they have naked pictures of the boss that they’re using for blackmail. You have to take the data with a grain of salt, and use medians (the middle numbers) rather than averages. (We’ve also taken the liberty of hand-editing and removing specific rows – for example, somebody filled it in $10,000,000,000,000 per year, and that row got removed.)

We’ll definitely do this again next year, and there’s one thing we’ll need your help to figure out: how do we get more granular location data around the world? The problem with open text entry location fields is that people can’t be trusted to put in consistent data. I’d love to have a dropdown box for country, which then populates a dropdown box for state/province, which then lets people pick metro areas. The challenge there is finding a free or open source provider for that data that integrates well into a survey. If you’ve got ideas, I’d love to hear ’em – but remember, they need to be global (not just US), and they need to be free or open source.


Indexes For Kids

Indexing, SQL Server
18 Comments

Kid Tested

When you work with something a lot, you tend to extrapolate concepts from it out to everything you see. That one person collecting tickets is a serial zone. That food truck line is a HEAP. The empty subway car that smells like a toilet is Replication.

Stuff like that.

If you have one or more kids, you might even see database concepts pop up with them, too. Their messy room is index fragmentation. Their tiny inefficient hands are like cursors. Their dirty diapers are like Replication.

Dadsplainin’

I don’t sit around lecturing my kid about database stuff. It’s not that I don’t love what I do, it’s just that I honestly do hope there are job openings for princess astronaut doctors by the time she’s old enough to start looking.

But then we got this puzzle! And it’s like, the perfect way to explain why indexes can make things faster.

I mean, just look at it.

The wires are in this picture to annoy Richie.
The wires are in this picture to annoy Richie.

 

It even joins together. Lord help me. Here it comes.

If you’re already cool with indexes

This post isn’t for you, unless you’re super bored.

See, each of those stacks is like a column. Or a field. And each of those letters and pictures is like a row. Or a record.

Now that your database terminology OCD requires medication

The end goal is to match each one up. They even have matching patterns. It’s really something.

Here are your DBA choices for matching them:

  • If they’re both shuffled, you pick a stack and look for the corresponding piece in the other stack. This is horribly slow and procedural, even if you start to cheat and make matches when you remember you’ve seen a piece recently.
  • You can organize one alphabetically, much like adding an index, but you still have to wander through the unsorted stack. That’s not much better, is it? You’ve turned into a human Nested Loops Join. Half-indexed data doesn’t help you much here.
  • Last, you can organize both stacks alphabetically. This is obviously the most efficient. It’s almost like join elimination. You can mash both stacks together without looking.

This is a lot like what Brent teaches in How To Think Like The Engine. Any WHERE clause or JOIN that isn’t indexed means you’re scanning something. This doesn’t mean I’m totally against scans, it just means I’d rather SQL scan indexed data than unindexed data.

Towards entropy

If this puzzle were like your data, it probably wouldn’t be a 1:1 match. I’m not insulting your data, mind you. Some customers signed up, but they didn’t order anything.

Similarly, if we were missing any puzzle pieces, we’d know because we just sorted them all. If you don’t, trust me, you end up looking through the stack like four times thinking you lost your mind not being able to find the Xylophone. Ordering the data first means you know xylophone is missing, so when X comes up, you discard it.

Inevitably, one of these pieces will end up broken or in the toilet.

And this, my dear friends, is why you always make backups.

Thanks for reading!


Your Favorite Bugs & Enhancement Requests: #TSQL2sday 86 Roundup

SQL Server, T-SQL
18 Comments

I dunno about you, but I got a big stocking full of coal. Next year, I’m gonna be better, and I plan on asking Santa for a whole bunch of Connect requests. For T-SQL Tuesday, I asked you to name your favorite SQL Server bugs & enhancement requests, and here’s what you want in your stocking next year.

If you agree with a feature, click on it, and upvote its Connect request. These bloggers took the time to make their case – now it’s time for you to vote.

Export/import SQL Server settings – Rich Benner blew me away with this idea. Wouldn’t it be cool to simply export all your settings when you need to build a new server and make sure it’s consistent with the old one?

A query hint to force parallelism – James Anderson wants a supported hint that would do the same thing as the unsupported trace flag 8649, and I am intrigued by his idea, and would like to subscribe to his newsletter.

Track last-used-dates for objects – Wayne Sheffield wants it. You want it. I want it. This would be awesome.

Turn off Enterprise features in Developer Edition – Mark Southall echoes something I’ve heard so many times from DBAs who want their developers to avoid features they can’t deploy in production. Yes, this is less necessary in 2016 SP1 – but there’s still a few reasons, and a lot of people who aren’t on 2016 SP1.

Add a numbers tableAdam Machanic Aaron Bertrand asked for something that would make advanced T-SQL a whole lot easier to write (and especially more consistent across different servers.) He also included a few honorable mentions, plus his favorite resolved Connect items.

String or binary data would be truncated – Denis Gobo is as sick of this stupid error message as I am, and I’ve upvoted this Connect suggestion before. So necessary. And for bonus points, Lori Edwards wrote about the exact same thing!

Indexed view insertion bugs – Wow, until I read Shane O’Neill’s post, I had no idea these bugs existed, and it scares me a little. OK, maybe a lot. Hold me.

Play a sound while waiting for SSMS – Tom Roush is an optimist who brings joy to others, as do his Connect requests.

Faster query execution with batch mode – Koen Verbeeck talks about Itzik’s slick trick to use columnstore-style execution to make rowstore queries go faster, and asks to get a non-hacky workaround.

Retaining history of dropped columns in temporal tables – I think temporal tables stand a chance of being one of the coolest features in 2016 over time, and Randolph West points to a great idea to make ’em even better.

Unsigned integers – One of my favorite new bloggers, Ewald Cress, managed to work in a Zoolander reference. Bonus points. Enough bonus points, in fact, that he needs numbers larger than integers to track them.

Extended Events live data view gets behind – Dave Mason warns you about the first issue you’re likely to run into when you’re playing with XE for the first time.

Fixing a bug with columnstore indexes and filtered indexes – Any blog post that has a stack dump is fun to me, even if I haven’t been using these two features often.

Database-level wait stats – Arun wants to bring this feature down from the cloud. I totally understand that it’d be hard to code, but…we went to the moon, right? Also, I want a pony.

Sorting a stacked Power BI chart – Looking at Melissa Connors’ example, I would probably bet one crispy taco that this will be the first item in this list that actually gets fixed.

Always On recovery pending bug – Muthukkumaran Kaliyamoorthy got bit by what sounds like a really funny AG bug. (He might beat Melissa in speed-to-fix, come to think of it.)

Native multi-tenant support – One database, multiple clients with the same tables, but kept separated by the engine. I love Kennie Nybo Pontoppidan’s optimism, but I think Tom Roush’s request is likely to get fulfilled first.

MSBuild support in SSIS – Automated deployment is getting more mainstream these days, and Jeroen Janssen wants to make it easier.

SSRS “while-loading” page – Ryan McCauley wants to customize the page users see before their report loads, and show them useful information about the report. I love it.

Subfolders in SSRS – I’m not sure if Bert Wagner is trolling me, because it seems impossible that this hasn’t happened after 7 years of requests.

Estimated Number of Rows to be Read – Rob Farley wants to see this property added to the execution plan tooltips.

Named function parameters – Instead of just passing in a list of values, Riley Major makes a great case for why you should pass the names in too. He won me over.

Padded strings – No, not just with spaces, but for example a string of leading zeroes for zip codes. Christian Gräfe won me over with that Oracle idea.

Add included columns to sp_helpindex – Kenneth Fisher, buddy, pal. One word: sp_BlitzIndex. At least I think that’s one word.

SQL Server on Raspberry Pi systems – Winning post in the category of “Highest Licensing Exposure for Smallest Performance Gain.”

RowCreateTime, RowModifiedTime, RowModifiedUserName fields – Louis Davidson opined that SQL Server should automatically update these columns. It’d be really useful for change detection. This one’s my favorite – in fact, I’m such a huge fan of this that my own T-SQL Tuesday post was about a different Connect request with the same goal.

Updates After the Publication:

IntelliSense for MDX – Jens Vestergaard wrote a post, but forgot to link back here, so we didn’t know about it. I like the idea though!

Storing & accessing Query Store data – Erin Sstellato wants to put it somewhere other than PRIMARY, and export it from the database.

Lots of stuff – Adam Machanic put together a whole wish list for developer ease-of-use.

Whew! That was a ton of fun. Wanna host your own T-SQL Tuesday? Check out the rules and email amachanic at gmail dot com – he’d love to have you, and it’s a great way to meet new bloggers, spark inspiration, and get other people reading your blog.