
Updating the Stack Overflow Demo Database


StackOverflow.com shares your questions, answers, comments, votes, users, badges, and more via a public data dump that you can download over BitTorrent.

I take that data, and I turn it into a SQL Server database that you can query. It’s so much better than the Microsoft sample databases because:

  • It’s just a handful of easily-understood tables
  • It has real-world data distribution
  • Sample queries to tune are available at data.stackexchange.com
  • It’s big enough to see real performance issues (the 2017-01 version is up to 110GB)
  • It’s actually fun to read the data while you’re working with it

If you like playing around with this kind of thing, you can get the latest version now.


[Video] Office Hours 2017/01/25 (With Transcriptions)

This week, Brent, Erik, Tara, and Richie discuss whether an antivirus can corrupt a database or remove data files, the biggest node clusters they’ve worked with, whether cursors are a good idea for reporting queries, autogrowth settings for new servers, memory-optimized tables, index rebuilds, and more.

Here’s the video on YouTube:

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Enjoy the Podcast?

Don’t miss an episode, subscribe via iTunes, Stitcher or RSS.
Leave us a review in iTunes

Office Hours Webcast – 2017-01-25

 

What’s the big deal with antivirus software on SQL Servers?

Brent Ozar: Graham asks, “I’m reading Denny Cherry’s book Securing SQL Server. He says for antivirus configuration he recommends not scanning the data files because if a data pattern matches that of a known virus the antivirus could corrupt the database or remove the database file. Has anyone seen this happen?”

Tara Kizer: I have not seen anyone say that that’s why to exclude it from the antivirus software. Usually, people are saying it for performance reasons you don’t want it to scan. There’s like 60 exclusions that you’ll want to add to antivirus but I haven’t seen the issue that Denny is referring to. But of course, I always have exclusions in place.

Brent Ozar: Antivirus apps don’t change the file, they just quarantine the file. They just block it so you can’t access it, so that shouldn’t cause corruption.

Erik Darling: Like MDF and LDF are open in SQL Server, I can’t do anything to them. Would an antivirus app be able to grab them and do something?

Brent Ozar: No. The time where it catches you is on startup. If you stop the instance and start it then you can have it where the antivirus has grabbed a lock on one of the files to scan it and SQL Server skips that database on startup.

Erik Darling: Oops.

 

What’s a good alternative to the T-SQL debugger?

Brent Ozar: Thomas asks—this is such a good question—“Do you know of a good working alternative to the T-SQL debugger built into SSMS?”

Richie Rump: Wait, T-SQL has a debugger? I should try that. That would be amazing.

Tara Kizer: I can’t ever get it to work.

Erik Darling: It accidentally comes up every time I accidentally hit F6.

Tara Kizer: Yeah, exactly.

Brent Ozar: For those of you who haven’t had the pleasure of using the debugger before, you will learn once that it stops the SQL Server while you’re stepping through debugging. So if you ever want a way to bring production down quickly, that’s a really surefire way to do it, especially if your code has locks in any of the code that you’re stepping through. Hello, ugliness. So to answer your question, I haven’t seen one either.

Richie Rump: Yeah, I’ve never used it. Typically, if I ever feel the need to use a debugger that means I probably need to break up this script into something smaller.

 

Why does patching suck on a 10-node cluster?

Brent Ozar: Stephanie asks, “We have a ten-node geo-cluster. SQL patching takes hours upon hours.” I’m not laughing with you, I’m laughing at you. “Do you see any issues with pausing the node then stopping the cluster service locally before starting a SQL Server service pack upgrade?” So what was the biggest cluster you guys have ever had in terms of nodes? I’m thinking like five.

Tara Kizer: At my last job we had a 14-node cluster, but this was on SQL Server 2012 where you could have five total replicas, four secondaries and one primary. So this 14-node cluster had several availability groups spanning different nodes. We had tier one and tier two servers at the primary site and then we had tier three servers and lower hardware at the DR site. We needed all the databases to be able to be queried on one instance for the business users. So they all had that as a replica too.

Brent Ozar: How did you do patching on that?

Tara Kizer: We did the tier two and tier three servers, those are the synchronous and asynchronous replicas, we did those ahead of time. Production wasn’t running on them. Then tier one we did at night at like 8:00 I think it was. It did not take us hours upon hours though. But again, we split up the work. I think that we hit all of the primary replicas at the same time.

Brent Ozar: One big outage. Yeah, just planning takes hours to get everything staged and ready to go, get everybody’s approvals ready to go.

Tara Kizer: This was Microsoft security patching, as far as service packs, same type of thing though.

 

Why do memory-optimized tables use so much memory?

Brent Ozar: Oh, Kyle, bless your heart for asking this question. Kyle says, “Memory-optimized tables seem to occupy a ton of memory.”

Erik Darling: Go on, sir.

Brent Ozar: Let’s think through that for a second. He says, “I thought they would be compressed but I guess I read the specs wrong. Is compression anywhere near the horizon for Hekaton? It would be nice to send Hana packing. Thanks.” Nope. So what Microsoft recommends is that you need 2x the size of the tables. So like if you have 100 gigs worth of Hekaton data, you’re supposed to have 200 gigs worth of RAM because Hekaton is append only. All new rows are coming in as new versions but also deleted rows and updated rows come in as new versions. So that every now and then SQL Server does the equivalent of garbage collection, getting rid of the old versions of rows. It needs a new table in order to go do that. Like a whole lot of other things in SQL Server these days, these features use memory because memory is cheap. Microsoft just figured out that memory is cheap and it’s time for us to recognize the same thing.
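If you want to see how much memory your memory-optimized tables are actually occupying, the In-Memory OLTP DMVs will tell you. A rough sketch (everything here beyond the DMV name is a placeholder):

```sql
-- Run in the database that holds the memory-optimized tables.
-- sys.dm_db_xtp_table_memory_stats ships with SQL Server 2014 and up.
SELECT OBJECT_NAME(tms.object_id) AS table_name,
       tms.memory_allocated_for_table_kb / 1024 AS allocated_mb,
       tms.memory_used_by_table_kb / 1024      AS used_mb
FROM sys.dm_db_xtp_table_memory_stats AS tms
ORDER BY tms.memory_allocated_for_table_kb DESC;
```

The gap between allocated and used is largely those old row versions waiting on garbage collection.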

 

Does anyone use database snapshots with Always On AGs?

Brent Ozar: Oh my goodness. Roland says, “I have a database with 1.3 terabytes and lots of indexes on my AGs. Over on my secondary replica I have a snapshot of the database. When I start the index optimization on the primary replica my redo queue builds up very high. Last time it was up to 150 gigs in the queue. The only way I was able to speed it up was to drop the snapshot. Is there a best practice for how to do huge updates in snapshots along the same time?” Have you guys ever used snapshots in combination with an AG? I have not.

Erik Darling: Why would you?

Brent Ozar: No, everybody is shaking their heads. The only—I’m like stretching—if you want to get a point in time consistent view of the data maybe, like if you wanted to say, “I want to see my reports as of exactly 8:00 a.m. this morning.” The one thing you might be interested in is 2016 has parallel redo. If you go to groupby.org we have a session from Bob Ward on there. He talks about the differences that parallel redo makes in SQL Server 2016. It’s his SQL Server 2016 “It Just Runs Faster.” I don’t know of anybody using snapshots with availability groups though so that kind of stumps me there.
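For anyone wanting the point-in-time reporting trick Brent describes, a database snapshot is a one-liner. Database, logical file, and path names below are all made up; NAME must match the source database’s logical data file name:

```sql
-- A snapshot frozen as of its creation time, e.g. 8:00 a.m.
CREATE DATABASE SalesDB_Snap_0800
ON (NAME = SalesDB_Data, FILENAME = 'D:\Snapshots\SalesDB_0800.ss')
AS SNAPSHOT OF SalesDB;

-- Reports query the snapshot; drop it when you're done:
-- DROP DATABASE SalesDB_Snap_0800;
```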

 

Should I stop SQL Server before I shut down Windows?

Brent Ozar: Boy, you guys have got great questions today. You guys are like the A+ audience. I’m looking forward to this one. We might be here for an hour.

Erik Darling: D- answers.

Richie Rump: Come back next week.

Brent Ozar: Yeah, can you just trickle these out? You don’t ask all your good questions at once, leave some of them. I tell you, that audience last week, man, they were red cards. They were not good. Just kidding because I know a lot of you were in there. Thomas says, “I’m moving a storage array and I’m shutting down all my SQL Servers. Should I stop the SQL Server services first or just shut down Windows?”

Tara Kizer: I’d just shut it down. Shut down Windows.

Brent Ozar: Yeah.

Tara Kizer: SQL Server takes care of us. It knows what to do when it comes back up and it has to go through crash recovery.

Brent Ozar: If you were worried about like a failover of clusters or anything, you could shut down your passive nodes first and then finally shut down the active if you wanted to just walk it all the way down.

 

Brent Ozar: William says, “The debugger, the first time I tried it out it was on my QA system. I soon afterward had several confused developers at my cubicle.” That’s probably true. If you want to take a break from work and get the team to go out for beers or something, you could just fire up the debugger and no one will be able to get any work done.

 

Are cursors a good idea in reporting queries?

Brent Ozar: Adam says, “I never use cursors in my report queries.” Well, Adam, our relationship is off to a good start. “But I’ve been playing around on Stack Overflow and I see some incredibly convoluted answers to problems. They strike me as they would be much easier solved by a cursor. Granted, these problems are situations where the schema is faulty or part of the solution would be better handled in the presentation layer.” He says, “Are cursors ever a good idea for reporting queries?”

Richie Rump: I haven’t run across one.

Erik Darling: No. I mean, like, you’ll find some like way out there crazy edge cases. I think Aaron Bertrand has blogged about using cursors in certain scenarios for like grouped medians where they’re somehow faster than a set based solution but really, for the most part, you’re not going to find that in everyday code.

 

What happened to all my drive’s free space?

Brent Ozar: Next up, Rosemary—Rosemary had emailed us earlier and I said, “Hey, you should ask that question during Office Hours.” Rosemary asks, “I’m using backup compression and I’ve got multiple backup files on one dedicated drive. I’ve got six 50-gig files, 300 gigs all together, but the disks that I’m dealing with only show 6 gigs free and available. I don’t understand why 800 gigs of drive space minus 300 gigs worth of backups means that I only have 5 gigs available. Where did my other 500 gigs go?”

Richie Rump: Erik took it.

Tara Kizer: What’s the size of the backup file on disk? I can think of two things here. Trace flag 3042. But that would only apply while the backup is running, so the algorithm that they use in the beginning of the compression process, it may estimate a much larger file size needed than the compressed size. We had free space issues on some mount points at a job that I was at, so we had to use trace flag 3042 so that it wouldn’t overestimate the size of the needed file. The other thing that I’m thinking of that it might be is the volume shadow copy file, I think. I don’t think you can see those even if you have Windows Explorer set to show hidden files. I think they’re still hidden from the hidden, if I recall correctly. We ended up having to do something to see them in Explorer. Then I think we had to stop the service to prevent this large file from consuming space.

Brent Ozar: I love it. You may not be doing the snapshot, that’s someone else doing the snapshot copy. I was going for like a page file. If you have a hidden file that Windows is managing RAM, that will show as like the page file will burn up a ton of space. In Windows Explorer, just set your settings so you can view hidden files and that should hit that one.

Tara Kizer: I think in her original email, because I did see the email come through, it said that they have Windows Explorer set up to show hidden files and the page file should show up in the hidden files. I seem to remember some kind of file had to do with the shadow copy service, you couldn’t see those.

Brent Ozar: You’re absolutely right, yeah, it won’t show up there.

Erik Darling: Would a utility like WinDirStat or something see that?

Brent Ozar: I don’t know. I’ve never done it with VSS snapshots but I wouldn’t be surprised.

Erik Darling: Yeah, whenever I’ve had like a weird drive space consumption issue, WinDirStat, it’s a free-ish or mostly free application I’ve been able to use to track down weird files that are consuming space. There’s always a Stack Overflow on [inaudible] see what it is.
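Tara’s trace flag 3042 tip earlier looks like this in practice; the database and backup path are hypothetical, and the flag only matters while a compressed backup is running:

```sql
-- Without 3042, compressed backups pre-allocate the backup file based on an
-- estimate that can overshoot; with it, the file grows as needed instead.
DBCC TRACEON (3042, -1);   -- -1 = enable server-wide

BACKUP DATABASE SalesDB
TO DISK = 'E:\Backups\SalesDB.bak'
WITH COMPRESSION;
```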

 

What’s the worst that could happen if I add clustered indexes?

Brent Ozar: Danny says, “We have a database full of tables using unique identifiers as primary keys. There are non-clustered indexes on all these unique identifier columns but there are no clustered indexes anywhere on any of these tables. My knee jerk reaction is to replace all these non-clustered indexes with clustered indexes ASAP. Is this a no brainer or could I be setting myself up for disaster doing this?”

Erik Darling: Well you only get one non-clustered index, you know. I mean one clustered index.

Brent Ozar: I was like, wait, what?

Erik Darling: You don’t get multiple clustered indexes. There’s a lot of sort of religious debate around whether you should have a clustered index on a GUID or not. My feeling is generally no unless you have some like really—so like one place where GUIDs shine for clustered indexes is if you have like a really high insert workload, you can run into something called last page contention where you’re always trying to burn up, like if you have an identity column or another kind of sequential clustered index key. So you’re always trying to insert data to the very end of the index, then that’s your hotspot. You can switch over to using a GUID. It’s so easy, just switch over to using GUID for your clustered index. But generally, you could just add some surrogate key, some identity column to the table or something to have a clustered index on instead and keep your non-clustered indexes in place. But that’s going to be a heck of an IO intensive operation. I’d probably opt for creating a new table with the clustered index that I want on it, insert my data over to it, and then create my non-clustered indexes on it.
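The surrogate-key option Erik mentions might look something like this; the table and column names are invented:

```sql
-- Add an identity column and cluster on it, leaving the GUID primary key
-- in place as a non-clustered index. This rewrites the whole table, so
-- expect a heavy IO hit while it runs.
ALTER TABLE dbo.Orders
    ADD OrderSeq BIGINT IDENTITY(1, 1) NOT NULL;

CREATE CLUSTERED INDEX CX_Orders_OrderSeq
    ON dbo.Orders (OrderSeq);
```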

Brent Ozar: I’d just make sure you’re fixing a problem. Like you can make things worse when you stick on clustered indexes. It’s not normal, but just make sure that you understand that you’re fixing a problem. Like look at what’s the top wait stat on that server before you go [inaudible] with things.

Erik Darling: Another thing you might want to do is if you run sp_BlitzIndex in mode four for that database, you’ll get a lot of information about your heaps back. You’ll see if there’s heaps with a lot of forwarded records in them or heaps that are occupying a lot of space. So like if you have a workload that deletes from heap tables, the pages may not get released from the heap back to the database. So you may just have a bunch of empty pages which SQL is still going to unfortunately scan and look at it when you read from the table. So I would run BlitzIndex in mode four to that database to get some more information about the heaps to see if they are actually problematic.

Brent Ozar: Problematic—like automatic, but different.
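If you want to run the heap check Erik describes, the call is just this (the database name is yours to fill in):

```sql
-- Mode 4 returns detailed diagnostics, including heaps with lots of
-- forwarded records or lots of reserved-but-empty space.
EXEC dbo.sp_BlitzIndex @DatabaseName = 'SalesDB', @Mode = 4;
```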

 

What should I do about this 30GB log file?

Brent Ozar: Fred says, “I have a database with a 30-gig log file but the data file is 25 gigs. I’ve avoided clearing or shrinking the log. What do you think is going on and what should I do?”

Tara Kizer: Do you really need the disk space? It’s so close in size I’d probably just keep it that way. Is your recovery model bulk-logged or full? If so, make sure you have transaction log backups occurring frequently enough for your workload.

Brent Ozar: Full backups are not going to clear that out. This is a 1 terabyte solid state drive. It’s smaller than a phone. It costs $300 at Amazon so just be careful what you’re doing there with a 30 gig log file. Oh my god, it’s the end of the world. Not necessarily.
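To act on Tara’s advice, first confirm the recovery model, then make sure log backups actually run; the names here are placeholders:

```sql
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = 'SalesDB';

-- In FULL or BULK_LOGGED recovery, only log backups let log space be reused:
BACKUP LOG SalesDB TO DISK = 'E:\Backups\SalesDB_log.trn';
```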

 

Does rebuilding an index take the database offline?

Brent Ozar: Raul says, “Can someone confirm that an index rebuild requires taking the database offline in Standard Edition?”

Tara Kizer: We’ll confirm that that is not required.

Erik Darling: It will take something offline.

Tara Kizer: Your index is going to come offline in Standard Edition, so queries won’t be able to use it while the rebuild runs. The database remains online; you’ll just have some blocking on the index that’s currently being rebuilt.
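To be concrete, the statement itself is below; the index and table names are hypothetical, and WITH (ONLINE = ON) is an Enterprise Edition feature, which is why Standard takes the index offline for the duration:

```sql
ALTER INDEX IX_Orders_CustomerId
    ON dbo.Orders
    REBUILD;   -- offline in Standard Edition; the database stays online
```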

 

Should I rebuild all my indexes after I upgrade SQL Server?

Brent Ozar: Roland says, “I’ve read you should do a rebuild on all your indexes after you upgrade SQL Server. Why is this recommended?” Wherever you read that, go hunt them down and bookmark that site because you never want to go back there again. You want to be able to recognize bad advice when you come to it. You did exactly the right thing by asking why it’s true, because there’s so much bad stuff out there on the internet, some of which comes from our own site. But in this case, not this time.

Tara Kizer: Did you guys ever do update statistics after an upgrade?

Brent Ozar: Yes.

Tara Kizer: I didn’t always, it just depended on the system whether I wanted to spend the time doing it or not, how critical was the server or database.

Erik Darling: What kind of stinks is a lot of the times when you read people saying that you should update statistics after you move, they’re like you have to do it with a full scan, but when you have a multi-terabyte database, you can’t just like willy-nilly, “I’m going to full scan all these statistics, no problem.”

 

A long question walks into a bar

Brent Ozar: Eric asks…

Erik Darling: No, I didn’t.

Brent Ozar: Eric has a big, long one.

Erik Darling: Hey, thanks.

Brent Ozar: Yeah, the other Eric. Only one other Eric. Holy cow, Eric. That’s actually a good example of a question to post over on Stack Exchange. As you see these slides scroll by if you’re on YouTube or on a live broadcast, there’s instructions on how to go to dba.stackexchange. Anytime you find yourself typing multiple sentences, and especially multiple lines, multiple questions, probably want to go tackle that over at Stack Exchange.

Erik Darling: Sounds like there’s a vendor product involved so you might want to check with the vendor on that as well.

 

Know of any good job management applications?

Brent Ozar: Paul says, “Do you have any suggestions for job management applications? We want to manage all our agent jobs from a central server but all the apps I’ve found don’t really do logical mapping, meaning if the job fails on this stored procedure, start at this particular step again.”

Erik Darling: Yeah. Unfortunately, it’s going to cost you money. SQLSentry Performance Advisor does have a whole job chaining and job thinging thing in their product. I haven’t used it extensively because I never had a need for it. I was always pretty cool with having different profiles for that stuff but it’s pretty cool. If you go check out the—I said SQLSentry again—SentryOne Performance Advisor. If you go to the SentryOne website they have the job managing part of the performance advisor product with all sorts of pretty screen caps and an explanation of it in there if you want to look at it. Other than that, I don’t know.

Tara Kizer: I’m trying to think of the tool that we used at my last job because we weren’t just managing SQL Server. The dev ops team wasn’t managing just SQL Server so we had another—some kind of Windows third-party package that we purchased. It wasn’t cheap. I know other people have used it. It came up in the MVP mailing list recently and I think it was Adam Machanic that mentioned that they use it as well. So we had to kick off different things and then kick off SQL Server jobs. I just can’t remember the name of it. It’s an expensive product too.

Brent Ozar: JAMS has one, Unicenter is another one.

Erik Darling: I had a client once that built their own.

Richie Rump: Yeah, I mean, why not just build your own? How hard could it be, right?

Erik Darling: Yeah, it just needs a front-end and some stuff behind it that does things. Use the cloud.

Brent Ozar: Error handling we’ll leave for version two.

Richie Rump: We don’t need that. That’s optional.

Tara Kizer: I’ve got the answer to the tool I was using. It’s called Tidal.

Brent Ozar: Is that a streaming music service by Jay Z?

Tara Kizer: It might be a Cisco product. I didn’t look it up. Just the name.

 

Someone’s asking me for a UTF16 collation…

Brent Ozar: J.H. has an interesting one. He says someone asked him to create a database. He says, “According to our documentation, the databases should have the following settings. Character encoding of UTF 8 or UTF 16 and the collation sequence is case insensitive.” You should just have him give you the exact collation that they want, because this still leaves all kinds of things up in the air. Just say exactly which collation do you want. If you look in Books Online, there’s a list of collations, it’s stunning. That should keep them busy for a while.

Erik Darling: Can you set UTF settings like that on a database? 8 and 16? I’ve never seen that part before. The only thing I’ve ever read is that the bulk bcp had some problems with like UTF 8 or 16, stuff that got solved recently but I don’t know. I’ve never seen that setting for a database.

Brent Ozar: No clue.
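One way to hand the requester a concrete list to pick from, per Brent’s suggestion:

```sql
-- List case-insensitive collations; underscores are escaped because
-- "_" is a LIKE wildcard.
SELECT name, description
FROM sys.fn_helpcollations()
WHERE name LIKE '%\_CI\_%' ESCAPE '\';
```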

 

What should I set autogrowth to?

Brent Ozar: Sam says, “For new SQL Servers, what do you typically set your autogrowth settings to, out of interest? Do you consider VLFs in your decision?”

Tara Kizer: I absolutely consider VLFs in my decision just because I’ve run into excessive downtime as a result of having high VLFs.

Brent Ozar: Go on about that, what’s that like? The excessive downtime? That’s awesome. I know where you’re going with this.

Tara Kizer: This company applied the Microsoft security patches monthly. We were doing this with just about five SQL Servers on this one night. The most critical database of those servers, it was down. We had rebooted the server and SQL Server was up but the database was in crash recovery and I kept looking at the, I think it was the error log, it was frustrating. This application had really high availability SLAs. We were in a maintenance window but we weren’t allowed to have that kind of downtime. It came up after about 45 minutes. I was almost in the process of opening up a case with Microsoft to figure out what was going on. The next Monday I contacted Microsoft via email and that’s when I learned about VLFs. This was a little over ten years ago I’d say, but 45 minutes, and this was like a 7 terabyte database. The transaction log wasn’t even that big for that system. It probably was like, I don’t know, 200, 300 gigabytes. I monitor VLFs ever since that outage because it was such a big deal for this application being down that long. As far as what I set for autogrowth, it just depends on if I’m doing a new database and I know that this database is going to be small, medium, or large, then I’ll set the increments to adjust that. If I think that this database is going to be pretty sizeable, I’ll usually use 1 to 5 gigabytes for the transaction log file. But that depends upon how fast your storage is and how quickly it can grow it out since you can’t use instant file initialization for the log files. Data files I’ll set even higher since we do use IFI on those.
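Counting VLFs is easy; every row DBCC LOGINFO returns is one virtual log file (the database name is a placeholder):

```sql
-- One row per VLF; thousands of rows can mean painfully slow crash recovery.
DBCC LOGINFO ('SalesDB');

-- On SQL Server 2016 SP2 and later there's a DMV instead:
-- SELECT COUNT(*) AS vlf_count FROM sys.dm_db_log_info (DB_ID('SalesDB'));
```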

Erik Darling: I think the most important thing though is just to get away from percent autogrow.

Tara Kizer: Yeah.

Erik Darling: Because as your database grows, that’s going to be a larger and larger chunk of file to grow out. So as long as you set it to a fixed increment, I think you’re just always going to be in much better shape than percent autogrow.

Richie Rump: 1 meg.

Tara Kizer: 1 percent or 1 megabyte, I don’t know which one is worse.

Erik Darling: Yes.
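Moving off percent growth is a one-time change per file; the database and logical file names below are made up:

```sql
-- Fixed-increment growth: the cost of each growth event stays predictable
-- as the file gets bigger.
ALTER DATABASE SalesDB
MODIFY FILE (NAME = SalesDB_log, FILEGROWTH = 1024MB);
```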

 

Why are my VSS backups failing?

Brent Ozar: Graham asks a question we happened to be talking about the other day in the company chat room. “We have a 600 gig database that’s being backed up with a third-party app.” I’m not going to name which one it is because I don’t want these poor people to feel bad. “Our backups have been failing because the VSS writer has been freezing IO for 60+ seconds and the vendor support has not been helpful. What’s the best approach to address this?” I’ll say treat your servers like cattle, not like pets. When I’m dealing with a VM, I’m not even going to do VM-level backups. I’m going to back up my SQL Servers. I’m going to do regular SQL Server full and log backups and use a different technique like log shipping, database mirroring, anything like that to be ready to go to failover to. But I’m the kind of person who never wants to restore the OS. If something goes so horribly awry that I can’t trust the OS and I have to step back in time, I don’t trust that OS period. Who knows if last night’s snapshot is going to be any good or not. So I don’t like doing snaps on SQL Server VMs.

Erik Darling: The only time I’ve ever done anything close to that was when I had to [inaudible] machines and even then I was nervous as hell about—I was like, is this going to…?

Brent Ozar: Yeah. Graham is saying, “This is for this vendor’s SQL Server backup agent.” Yeah, I’m saying I don’t use those. I’m not a big fan of those.

 

Why is rowmodctr so far off?

Brent Ozar: Kyle says, “After letting stats maintenance lapse…” Well, we all have our downfalls. “I noticed that sys.indexes.rowmodctr of some stats is recording far more changes than the number of rows in the table. Like say the table has 100,000 rows and it shows 300,000 changes. Does that mean that nothing is querying the table because SQL Server hasn’t automatically updated the stats?” What a cute question.

Tara Kizer: [Laughs] Cute.

(Note from Brent – I don’t know why I used the word cute here, but that question was awesome.)

Erik Darling: You could also have it hit a threshold where—actually, never mind on that.

Tara Kizer: He doesn’t specify the values, and the threshold on 2014 and older is 20 percent. So maybe you haven’t hit the 20 percent threshold. On 2016, it’s lower than 20 percent and Erik has a blog article on it. Can’t really tell you a fixed number for 2016; they changed how the thresholds work.

Brent Ozar: Well, there’s a trick. He says that the rowmodctr is higher than the number of rows in the table which means it should have tripped the 20 percent. I can think of two things that would cause this. One is that the stats are only updated whenever you do a select, which could be that you’re right, that it hasn’t tripped the 20 percent threshold. The other thing is if these numbers are tiny. The metric was technically last time I looked 20 percent plus 500 rows. So if we were talking about a 100-row table, then that would also come into play there. Otherwise, I like that nothing has queried the table.

Erik Darling: Yeah, that would be my first instinct.
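On 2008 R2 SP2 / 2012 SP1 and later, you can compare the modification counter against the row count directly instead of eyeballing rowmodctr; the table name is a placeholder:

```sql
SELECT s.name, sp.rows, sp.modification_counter, sp.last_updated
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID('dbo.Orders');
```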

 

How do I reclaim space from a heap table?

Brent Ozar: Paul Olsen says, “How do I reclaim the data file space used by a heap table?” Did you delete it?

Erik Darling: If you mean you deleted data from there and you have a big empty heap, your options suck. You could create a clustered index on the table, which will fix things up. You can rebuild the table in, I think it’s 2008+, but you have to be real careful with that because when you rebuild a heap, you rebuild all the non-clustered indexes on it as well. That can be a really bad word situation.

Brent Ozar: You can truncate the table too but of course then you lose all the data in it. That deallocates all the space immediately, but if you have any rows, you’re screwed.
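The 2008+ rebuild Erik mentions is a single statement; remember it rebuilds all the table’s non-clustered indexes too (the table name is hypothetical):

```sql
-- Releases the heap's empty pages back to the database.
ALTER TABLE dbo.OrderStaging REBUILD;
```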

 

How should I capture DML events on user tables?

Brent Ozar: Steven says, “If you wanted to capture DML events on user tables how would you go about it?” There’s probably two parts to this. For readers who don’t understand what a DML event is and then also how would you go about capturing them.

Tara Kizer: I wouldn’t do it. The systems that I’ve supported have too much data coming in being changed to ever audit like that. You could use triggers. Doesn’t CDC help you with that too? I’m not positive on that though. But triggers is how I’ve seen a lot of people do it, but I can’t afford the performance hit of that kind of micromanagement.

Richie Rump: But triggers on DML?

Brent Ozar: Every insert, update, delete.

Tara Kizer: You send it to another, a logging table.

Richie Rump: Oh.

Erik Darling: If you do an after trigger…

Tara Kizer: I’ve implemented it…

Richie Rump: Yeah, just log it in the app, be done with it. Don’t worry about ad hoc stuff.

Tara Kizer: Then you’re missing DBAs coming through and wanting to find the data or whatever.

Richie Rump: Yeah.

Erik Darling: My first instinct is to go with an after trigger if it has to be done on the database.

Tara Kizer: Yeah, definitely.
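A minimal after-trigger audit sketch along the lines Tara and Erik describe; every name here is invented, and the performance caveats above still apply:

```sql
CREATE TABLE dbo.OrdersAudit (
    AuditId   BIGINT IDENTITY(1, 1) PRIMARY KEY,
    OrderId   INT,
    Action    CHAR(1),                              -- 'I', 'U', or 'D'
    AuditedAt DATETIME2 NOT NULL DEFAULT SYSUTCDATETIME()
);
GO
CREATE TRIGGER dbo.trg_Orders_Audit
ON dbo.Orders
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- Inserts and updates land in "inserted" (updates also appear in "deleted").
    INSERT dbo.OrdersAudit (OrderId, Action)
    SELECT i.OrderId,
           CASE WHEN EXISTS (SELECT 1 FROM deleted) THEN 'U' ELSE 'I' END
    FROM inserted AS i;

    -- Pure deletes only have rows in "deleted".
    INSERT dbo.OrdersAudit (OrderId, Action)
    SELECT d.OrderId, 'D'
    FROM deleted AS d
    WHERE NOT EXISTS (SELECT 1 FROM inserted);
END;
```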

 

Why did my logical reads change from 2008R2 to 2014?

Brent Ozar: Roland says, “In a simple join of two tables…” What in the holy…? I’m going to say this out loud, “In a simple join of two tables, the second table supplies always the same number of rows as the predicate is always the same as the same pk.”

Erik Darling: Who put the bomp in the bomp…

Tara Kizer: I need to see the data at this point and expected output.

Richie Rump: Stack Exchange would be a great place for this question.

Brent Ozar: The only thing I’m going to say is we’re going to nibble out one part of this. He says, “Switching from 2008 R2 to 2014 the logical reads changed. How should I approach this problem?” What changed in SQL Server 2014 to change execution plans?

Tara Kizer: Cardinality estimator.

Brent Ozar: What is that and what does it do?

Tara Kizer: I’ll let you guys take that one. I just know the answer.

Erik Darling: It’s like one of those guys in Vegas who can count cards basically.

Brent Ozar: Hmm.

Erik Darling: So the cardinality estimator figures out how many—it tries to figure out how many rows are going to be returned by an operation based on statistics and histograms and cardinality and joins and all sorts of other fun things like that. In 2014, the cardinality estimator was completely redone for the first time since SQL Server 2000 or 7 or some old, ancient version. A lot of stuff changed. There’s a great white paper by Mr. Joe Sack about the cardinality estimator. I will grab a link for that as soon as I’m done talking unless one of my lovely cohosts would like to do that now. That is where I would go and start reading about it because that’s I think where you’re going to find the key to this. I would look at the execution plan as well as to see if something changed. It sounds like perhaps something got replaced with a nested loops join since he’s talking about one row is being read over and over again. That would be my first instinct but I would check out the execution plans and go read Joe Sack’s white paper about the cardinality estimator.

Brent Ozar: If you want to test that it is the cardinality estimator, on your database change the compatibility level. Change it back to say 2012 or 2008 R2. That will flip you back into the old cardinality estimator and you may get immediately a different execution plan.

Erik Darling: There’s trace flags on the blog too.

Brent Ozar: If the thing is already in production you shouldn’t be flipping the compat level back and forth. The reason why is because it’s going to clear your plan cache. So Erik’s trace flag idea is much better when you’re troubleshooting one query at a time.
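Erik’s per-query trace flag approach: trace flag 9481 forces the legacy, pre-2014 cardinality estimator for just one statement without touching the compat level. The query and tables below are placeholders:

```sql
SELECT o.OrderId, c.CustomerName
FROM dbo.Orders AS o
JOIN dbo.Customers AS c ON c.CustomerId = o.CustomerId
OPTION (QUERYTRACEON 9481);   -- legacy cardinality estimator, this query only
```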

Brent Ozar: All right, that sums up our webcast for this week. Thanks everybody for hanging out with us. I hope you had an excellent time. We will see you in the next Office Hours.


First Responder Kit Updates: So Many Updates, Brent Quit

I got 128 GB of RAM just to develop sp_BlitzCache with! Also, thanks to @RichBenner @BrentOzar and @douglane for code, and @LitKnd and @HenrikStaunPoulsen for great ideas!

Grab the newly updated First Responder Kit, and here’s what’s in it:

sp_Blitz Improvements

  • #677 Finally! A bug that isn’t ours! Microsoft forgot to let Standard Edition clean up Query Store. We warn you about that now.
  • #650 2016 SP1 gave us exciting new (old) Enterprise features. We give you some boring new wording about them. (suggested by @HenrikStaunPoulsen )
  • #602 More checks to see if you did something ridiculous to your CPUs

sp_BlitzCache Improvements

  • #688 Curious about RLS in your query plan? So is @LitKnd! Thanks, Kendra!
  • #678 In 2016 and up, we’ll tell you something if estimated rows and estimated rows read are way different. Seriously, upgrade to 2016. What’s wrong with you?
  • #665 So uh, you can now sort by ‘all’ and ‘all avg’. This may perform terribly on some servers. Please report any bugs.
  • #663 Sometimes stored procedure costs would show up as double. That’s no good.
  • #661 Function join detection was weird in some circumstances.
  • #660 Sometimes we don’t find any warnings. Now we tell you when we don’t find any warnings. Relief.
  • #659 Operator checks sometimes take a long time. Breaking them into smaller parts helps.
  • #653 Ever see something in BlitzCache that wasn’t yours? Funny thing about SPIDs, huh? We filter on those now, too.
  • #652 Filters! Referencing scalar valued functions! What a bad idea. We’ll let you know about that.
  • #633 Are you modifying a lot of indexes? I don’t know. But BlitzCache does. This looks a little inflated because it takes into account IF branches, but you get the idea.
  • #459 If a stored procedure gets executed once, does anyone care? @douglane doesn’t! Thanks, Doug!

sp_BlitzFirst Improvements

  • #668 Context is everything. Especially for databases. When you’re checking for stored procs.
  • #658 Long running queries being blocked. Great post, Brent! (coded by @BrentOzar )
  • #647 Is SQL running? Yes? So is something other than SQL, and it’s taking >= 25% of your processor time. (coded by @RichBenner )
  • #646 Because you should know if some goofball is restoring a database (coded by @RichBenner )
  • #645 Night of the sleeping SPIDs (coded by @RichBenner )

sp_BlitzIndex Improvements

  • #680 Statistics can have more than one column in them. Go figure.
  • #671 Not all ReportServer databases are just called ReportServer (coded by @BrentOzar )

sp_BlitzWho Improvements

  • #649 Oh, that pesky database_id error.
  • #645 Night of the sleeping SPIDs (coded by @RichBenner )

Start downloading!

Happy Birthday, Mom!


High Compiles and Multi-Statement Table Valued Functions

SQL Server, T-SQL, TempDB
2 Comments

Way back in 2016

I wrote about what to do if sp_BlitzFirst warns about high compiles. During GroupBy, Adam Machanic gave a great talk on new features in SQL Server 2016. It reminded me of a blog post I wanted to write about one common culprit of high compiles: Multi-Statement Table Valued Functions.

Leaving aside all the other performance problems with them, this is another way they can mess with your server. Let’s look at a typical setup where they get used, and how SQL Server 2016 can help you fix the problem.

You made a bad choice

Rather than use Table Valued Parameters, you decided to pass a CSV list to a stored procedure, and you wrote a crappy function with a WHILE loop in it to split the CSV list. Or you copied one from the first result you found on the internet, even though it warned you that it was just a God-awful, die-in-a-fire idea.
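
For reference, the kind of WHILE-loop splitter being mocked here looks roughly like this (a sketch, not the exact function from the post; the names are made up):

```sql
CREATE FUNCTION dbo.SplitCsv (@List NVARCHAR(MAX))
RETURNS @Out TABLE (Id INT)
AS
BEGIN
    DECLARE @Pos INT = CHARINDEX(N',', @List);
    WHILE @Pos > 0
    BEGIN
        -- Peel off everything before the next comma
        INSERT @Out (Id) VALUES (CAST(LEFT(@List, @Pos - 1) AS INT));
        SET @List = SUBSTRING(@List, @Pos + 1, LEN(@List));
        SET @Pos  = CHARINDEX(N',', @List);
    END;
    -- Whatever's left after the last comma
    IF LEN(@List) > 0
        INSERT @Out (Id) VALUES (CAST(@List AS INT));
    RETURN;
END;
```

Because it’s a multi-statement table valued function, that loop body is what keeps showing up in the compile counts later in the post.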

To simulate workload, we’ll need a stored proc, and a loop to feed it a random list of IDs.

Then we’ll grab a CSV list of IDs and run our proc like this.

If I were on vNext, I’d be using STRING_AGG, but I’m trying to keep things clean. No big deal, though. I’ve been using XML this long, right?

We’re going to be using the Extended Events session and query from before. We’ll kick that off, run our loop, and then be on our merry way.

Shredding the data will give us a bunch of repeating patterns that look like this.

10… Top 10… Yeah.

So yeah, that function seems to get up to something once for every ID you pass in. Remember that in our STUFF… query, we grabbed the TOP 10 each time. In the XE session, each time we call the proc, the string splitting function compiles and executes code 10 times. Bummerino. That’s the life of a loop.

On SQL Server 2016 (and really, with any non-looping code), we can get around the constant compilations with a simple rewrite. In this case, I’m calling 2016’s STRING_SPLIT function instead of the MSTVF function.
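
A minimal version of that rewrite (table and variable names are illustrative):

```sql
SELECT u.Id, u.DisplayName, u.Reputation
FROM dbo.Users AS u
JOIN STRING_SPLIT(@Ids, N',') AS s
    ON u.Id = CAST(s.value AS INT);
```

STRING_SPLIT is a built-in, inline function, so there’s no MSTVF body to compile on every call.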

When our loop runs, we don’t see the constant compilations of function code.

You’re a joy.

Finisher

We didn’t cover a lot of the performance hits that MSTVFs, and table variables in general, can cause. Much has been written about those topics, but I haven’t seen this come up. It’s one of the many reasons why you might be seeing high compilation rates in sp_BlitzFirst, or in your monitoring tools. Or because you’re really good at guessing compilation rates.

This was, oddly, the first demo I wrote on my new home cloud/server/whatever. Not exactly pushing the bounds of performance, here, but Extended Events data shredded pretty fast 🙂

Thanks for reading!


What Do You Wanna Know About Storage?

Storage
35 Comments
The Great Hardware Robbery of 2016

During Dell DBA Days 2016, we got to hang out with Doug Bernhardt of Dell’s storage team. He’s the guy who publishes Dell’s SQL Server documentation, builds their Fast Track reference architectures, and gets to play with all kinds of cool toys.

Doug emailed us and asked:

Is there any solutions type of info (or any info in general) on either Dell or EMC SAN and SQL Server that you think would be helpful to you, your customers, or the SQL community in general?  Anything that would help make better decisions about architecture, design, purchasing, or optimization of any of the Dell EMC gear?

And I figured there’s only one person I should ask – and that’s you, dear reader. So what documentation on storage would you like to see?


When Always On Isn’t: Handling Outages in Your Application

Today’s brief Stack Overflow outage reminded me of something I’ve always wanted to blog about:

Stack Overflow is in limp-home mode

There’s a gray bar across the top that says, “This site is currently in read-only mode; we’ll return with full functionality soon.”

That’s not a hidden feature of Always On Availability Groups. Rather, it’s a hidden feature of really dedicated developers whose application detects when the database isn’t writeable and degrades gracefully into read-only mode.

This is no small feat. For example, if you’re calling stored procedures to render a page, those stored procedures cannot do any writes. You can’t log page views inside a stored proc, or log execution times. You have to assume that the database may not be writeable. (That’s great practice anyway, since you can then use replicas to scale out all page rendering.)

For bonus points, your application can still do logging and writes – but just not to SQL Server. You might cache writes temporarily to a NoSQL store.

This is all left as an exercise for the reader. Always On Availability Groups give you a limp-home capability for your web site, but it’s up to you to make your web site as functional as possible when the database is degraded.


Analyzing Temporary Stored Procedures with sp_BlitzCache

All the cool kids

Know that when you’re trying to figure out why a stored procedure is slow sometimes, you should use a temporary stored procedure instead of pulling the code out and running it ad hoc.

We frequently point people to Erland Sommarskog’s “Slow in the Application, Fast in SSMS”, because it’s an excellent treatise on many of the things that can go wrong with a query.

But you’re here, so you’re cool

You use temporary stored procs, and you use sp_BlitzCache, but you’re having a hard time tracking down and analyzing plans for temporary stored procs.

The good news is: We got you covered.

Pretender

We have this stored procedure, which looks like just about any stored procedure. It accepts some variameterables, there’s a begin and an end; it’s spectacular. It doesn’t even throw errors.

One day the boss-person comes in and says “we also need to filter on user reputation”, and since you’re a well-paid and loyal employee, you get cracking on that. You create a temporary stored procedure to make sure your code works, and examine any potential performance issues. What a little go-getter you are.

You could use an ad hoc script to look at the plan cache, Extended Events, or Profiler, to capture the plan. But you like us, so you want to use Ye Olde Blytzen Casshe.

One thing we do by default is ignore system databases. They’re often noisy, and lousy with system queries that have nothing to do with real life. We give you the ability to override that, to choose a database to single out for examination, and as of recently, a way to filter by stored procedure. The code to track this one down looks about like so!
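
The call looks roughly like this (the temp proc name is hypothetical; temporary procedures live in tempdb, which is why the database filter points there):

```sql
EXEC dbo.sp_BlitzCache
    @DatabaseName   = N'tempdb',
    @StoredProcName = N'#GetUsersByReputation';
```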

I get this back, with some warnings, and the ability to open up the plan and gaze upon its awfulness.

No alibi

Sunglasses At Night

There are a few tips in this post, and some really great links to follow and read, if you haven’t already. Hopefully all of them help you become better at performance tuning SQL Server.

Thanks for reading!


[Video] Office Hours 2017/01/18 (With Transcriptions)

This week, Brent, Erik, Tara, and Richie discuss sp_Blitz capabilities, what’s best to use for resolving SQL performance issues, CDC, tools for capturing wait stats, their favorite monitoring tools, temp tables vs table variables, partitioning with constraints, and whether you should get any certifications.

Here’s the video on YouTube:

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Enjoy the Podcast?

Don’t miss an episode, subscribe via iTunes, Stitcher or RSS.
Leave us a review in iTunes

Office Hours Webcast – 2017-01-18

 

How should I prepare for the MCSE exam?

Brent Ozar: Wes Crockett says, “Hi all. It’s my first time here. Really pumped that you guys do these events. I’m recently back in a DBA role…” Back in the saddle again. “…after a few years out of the DBA world.” Now I’m more curious about what you went and did and what you came back for because it’s just interesting.

Richie Rump: [Inaudible] thing didn’t work out for you buddy.

Brent Ozar: Yeah, because he went to go be a developer and realized that the frameworks change every like six weeks. He’s like, man, give me some ANSI SQL. I’m not too bright.

Erik Darling: Real programming sucks.

Richie Rump: Stop moving my cheese dammit.

Brent Ozar: He says his new employer wants him to obtain an MCSE. “Do you guys have a recommended approach to getting there? What books do you think would be valuable, self-study, whatever?” We should all tell our last time that we got a cert and then how we studied for it. Going from left to right, Richie Rump, when was the last time you got a cert and how did you study for it?

Richie Rump: I think I took the 2008 test, the SQL developer stuff. I think I took that. I think that was the last time I took that. Before that, I think it was the PMP that I took. But the SQL stuff, I just actually just took it. I didn’t study, didn’t do anything. I think I may have read one thing on XML and that was it and just went in and took it.

Brent Ozar: How about you, Erik?

Erik Darling: The last certification I got was an A+ cert in 1998 or 1999. I got it by putting together some computers—that was my studying. I basically plugged stuff in and wrestled with interrupt requests. Then I had an A+ cert. I could put CD drives in, and CD burners were super popular.

Brent Ozar: They were. Is this the reason why you just built your new desktop? Are you trying to renew that, are you trying to get your A+ renewed?

Erik Darling: Yes, I’m trying to get back on the A+ fast track. I need to get back in their good graces because my career is at a little bit of a standstill right now.

Richie Rump: You know, Erik, I hear the cloud has computers now. Maybe you want to look into that.

Erik Darling: It does, but they’re super slow. They are just the slowest darn things.

Brent Ozar: Tara, how about you?

Tara Kizer: I don’t have any certifications.

Erik Darling: Oh, yeah.

Tara Kizer: I once or twice took a practice test for the MCSE and I failed miserably. I mean, not failed miserably, but I failed. I think I got like a 60 percent on it. I’m just a terrible test taker. It would be hard work for me to pass these things and I already had an established career when I was doing this so I didn’t see any need to do this. So unless your company is requiring it or maybe the next job is going to require it, I really don’t see the need to get certifications.

Erik Darling: Yeah, I’d much rather just get books that cover topics that I’m interested in, or books that cover topics that actually apply to my job. For instance, this thing is constantly on my desk.

Brent Ozar: Calvin and Hobbes?

Erik Darling: Yes. I have every single Calvin and Hobbes book, every single one. One of them is signed.

Brent Ozar: Whoa—by Calvin or Hobbes?

Erik Darling: Both.

Brent Ozar: Oh, wow, dang. Yeah, I tried, I can’t remember if it was 2012 or 2014, which one it was that I was like, “Let me go see how they are now.” Fail. I’m like, wow. Because I’m reading the questions and I’m like I know some of these are wrong—are just completely and utterly wrong. So, yeah, not big fans of certs. Lee Townsend says he ran into way too many paper tigers with certs who couldn’t actually do the job. Kelly says, “Good answer, Tara. I agree. I don’t hire folks based on certifications. I prefer experience.”

Erik Darling: What sucks is that like because it’s a Microsoft exam, they really try to cram new features on you. It’s like you’re sitting there learning about Hekaton and change tracking and CDC and like all this other stuff that you know you’re practically never going to use or apply in your job. That’s my other kind of gripe with them.

Brent Ozar: Every time I sit through one of them I’m like I don’t know what percentage of DBAs in the world have to do the things that are shown on this exam. I don’t understand what their job is like. That’s not a real—that doesn’t measure what I do. Wes says that when he did his intermission in his career he went over and did systems engineering, QA automation, and data analytics. I would also be curious as to what brought you back to database administration.

Erik Darling: Money probably.

 

Will there be a Brent Ozar Unlimited certification?

Brent Ozar: Money, that’s probably true. Fiag says, “How about a project of a Brent certified DBA?” Not no, but hell no. We don’t like taking certifications. Writing those questions is even worse. God, it’s hard. I like doing interviews and just letting people talk about what they’re good at. If I ever did a certificate it would be something involved with doing an interview and going, yeah, she hardly drooled at all. Yeah, she’s good.

Erik Darling: We did kind of have fun, I think around about when I first started me and Kendra did those quizzes, it was like one on indexes and one on T-SQL. Those were fun because they were just like here’s ten questions, what do you think? No one got a badge or a smiley face or a gold star for taking it but it was just kind of, I don’t know, that was kind of fun because it was just a cutesy little thing. If it was like an actual cert I would run.

Brent Ozar: Wes says he came back for money for one reason but he also found a job in an industry he loves. I know, that always makes me feel like I’m less like working when I’m around another company of—we did the GroupBy conference and one of the presenters is a database administrator for, the way that he said it is the Formula 1 team with the red cars and the yellow prancing horses. I was like you lucky, diabolical…

 

Should I put SSRS on a separate server?

Brent Ozar: Tammy asks, “When we do SSRS installs, should we put that on a separate server? Right now some of our servers have everything on one. I don’t think that there are security concerns starting with 2014 and I kind of want them on the same server for licensing purposes but I don’t want to make a bad choice. We have VMs so I could easily add more servers if I want. Should I put SSRS on a separate box?”

Tara Kizer: I think of SSRS as like a third-party product. Just because it says SQL Server in the name and it’s written by Microsoft, it’s totally different than SQL Server. Think of it as a third-party product. Would you install a third-party product that isn’t supporting software for SQL Server, like NetBackup? Would you put a third-party product on your database server? I wouldn’t. I want my SQL Server dedicated to the databases and nothing else except supporting software. I always put the reporting services service or any other services that I need on a separate box. Yeah, you’ve got additional licenses here but the best practice is to have a dedicated SQL Server. If you can support it on the VM and you can add resources, you’re not having a problem, sure, do it. But I’ve worked in corporate environments where we’re just very careful with how things are, how we’re designing things, and making sure that we have a dedicated SQL Server.

Erik Darling: If you’re licensing Enterprise, you have to license at the host and you can spin up as many guests as you want, but Standard Edition, then, yeah, you have to pay per guest. So I don’t know if that changes how you think about things.

 

Can sleeping queries cause blocking?

Brent Ozar: James says he ran sp_Blitz and he noticed that there was—and I’m guessing this is sp_BlitzFirst or BlitzWho. He noticed that there was a sleeping query with open transactions.

Erik Darling: That was me.

Tara Kizer: I logged in.

Brent Ozar: “Does that cause blocking and if so, how do I resolve it?”

Erik Darling: It could. If you do like BEGIN TRAN and you insert, update, delete and then you walk away, you’ll have a sleeping transaction that could cause blocking because you’re still holding onto those things, those little pages and locks and whatnot.

Tara Kizer: Run sp_lock and then the SPID number and see what resources are locked and if anyone else wants to get access to those same resources they’re going to be blocked.

Erik Darling: Another option is if you’re feeling Machanic-y, you can run sp_WhoIsActive and you can use the show sleeping SPIDs parameter of [inaudible]. Then that will show you anything that is asleep. It will show you what’s going on there. Also, I think we recently added that to BlitzWho. I don’t know if it’s gotten pushed out from the dev branch yet though.

Brent Ozar: Oh, that’s a good point.

Erik Darling: You can run BlitzWho if you’re feeling particularly attached to us. It shows sleeping SPIDs there as well.

Brent Ozar: I love that sp_WhoIsActive also has the get locks parameter. It will show you exactly what people have locks on, it’s fantastic.
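
A quick way to reproduce and inspect what they’re describing (the table name is hypothetical):

```sql
-- Session 1: start a transaction and walk away without committing
BEGIN TRAN;
UPDATE dbo.Users SET Reputation += 1 WHERE Id = 1;
-- no COMMIT: the SPID goes to sleep but keeps its locks

-- Session 2: see what the sleeping SPID is holding
EXEC sp_lock 56;  -- replace 56 with the sleeping SPID number
EXEC sp_WhoIsActive @show_sleeping_spids = 2, @get_locks = 1;
```

Anything else that touches that row in dbo.Users will block until session 1 commits or rolls back.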

 

Do service packs always cause reboots?

Brent Ozar: J.H. says, “When I apply a SQL Server service pack or hotfix on some machines it reboots, on some machines, it doesn’t. Is there a reason or way I can tell when it’s going to reboot and when it’s not?”

Tara Kizer: It’s just whether or not the files were locked. If the files were locked, it’s going to require a reboot in order for the installer to make changes to those files.

Brent Ozar: Yeah.

Brent Ozar: Robert has a big long list with a checklist, a question with a checklist. Your best bet there is definitely to post that on Stack Exchange. As you see the slides rolling past, this one right here, it says if you have a multi-paragraph question. The reason why I say that, because this isn’t really multiple paragraphs, you’ve laid it together pretty nicely, but I bet you the smart answerers over there are going to do a beautiful job of injecting more steps. I like where you’re going. It’s very good but just throw it out in dba.stackexchange. You’ll get even better details.

 

What’s better: Profiler or Extended Events?

Brent Ozar: Raul asks a question. “What is the best to use for resolving SQL performance issues?” He gets a little tricky there because he ties us down to only two things and I’m not sure I would use either of them. It’s either Profiler or Extended Events.

Erik Darling: You’ve got to be careful with those, buddy.

Brent Ozar: Yeah, why do you have to be careful?

Erik Darling: Observer overhead, sir.

Brent Ozar: Yes. Schrödinger’s Profiler. Yeah, so the more instrumentation, yeah, the slower your SQL Server is going to go, especially with—Erik has a great demo with functions, how scalar UDFs and multi-statement table valued functions as they’re running, when you turn on profiler or even set statistics IO—boom—just unbelievable overhead. So what would you recommend using instead? Anything we would recommend starting instead of profiler or extended events? Because it sounds like he is after queries.

Erik Darling: Sure, I would start with our tools because that’s the cheapest way. You can check out sp_BlitzCache and you can check out sp_BlitzIndex and you can check out sp_Blitz. Those will give you a pretty good general idea of the state that your server is in. Then from there, if I really needed eyes on my server constantly, I would just buy a mature monitoring product. I wouldn’t sit there trying to roll my own or sit there trying to get an extended events session right or hope that I can get profiler kicking and working. That’s just me though, I’m lazy.

Brent Ozar: So many jokes there about an immature monitoring product but I’m not going to go there. It’s like Beavis and Butt-Head watching your servers.

 

How do I protect the distribution database?

Brent Ozar: Tim Smith says he has a four-node SQL 2012 availability group. Two of the nodes are in DR, two of the nodes are in his primary data center. “What’s my best way to create a plan for the DR for our distribution database?” He’s put the distribution database in here too. Is that even supported in an AG?

Tara Kizer: He does say two-node failover cluster instance so the distribution database is not on the four-node AG. He’s got the distribution database on the two node regular failover cluster instance. Because yes, you’re right, distribution database is not supported in availability groups so it has to go into a failover cluster instance. I’ve had environments which have the availability groups across, onto a disaster recovery site and we had replication involved. You’ll have the distribution database at the primary site and the distribution database at your DR site. When you do the failover, you’re going to want to run scripts to recreate replication. Once you do that, you’re going to point it to the distribution database on the other instance. So you’re going to run replication scripts to redirect it. I don’t have those scripts memorized but once you’re ready to do that failover, I think we dropped replication on the first, did the failover, and then added it back pointing it to the new location of the distribution database.

Brent Ozar: I made a face like it sounded horrible but as I’m thinking about it, I’m like, it’s disaster recovery. It’s not like you do this every day.

Tara Kizer: It’s the same process you would do if you moved the distribution database to another server at the primary site. Think about what those steps are.

Brent Ozar: Yeah.

 

Why does CDC hurt when I do this?

Brent Ozar: Adam says, “We have CDC set up using a secondary file group. On some databases, the file size eventually grows to match the size of the primary data file. What could be causing this?” Well, Adam, I think the answer is in your question: CDC. I don’t have any experience with CDC. Do any of you guys? No.

Tara Kizer: I’ve used change tracking but not CDC.

 

Where do you post BrentOzar.com jobs?

Brent Ozar: Michella asks, “Where do you post brentozar.com jobs?” They’re in the SQL Server agent. No, we’re not hiring now but when we do we post a blog post. Normally, we like snipe Tara out of nowhere.

Tara Kizer: Just out of the blue.

Brent Ozar: Then generally we post it out on the blog. We let people apply in public. We’re not just trying to get you to read the blog, it’s just the easiest way. There’s like 100,000 of you that read that thing. Most of you aren’t qualified. That’s okay. We’ll find one or two good ones and we’ll go from there.

 

What are good tools to capture SQL Server wait stats?

Brent Ozar: Cody says he recently attended a seminar talking about performance tuning with wait statistics. Something they touched on but didn’t go into depth on was what good tools are out there to capture wait stats. When you guys are looking for something to capture wait stats, what do you think about using?

Erik Darling: Extended events. Just kidding.

Brent Ozar: Profiler?

Erik Darling: Again, I stick with our stuff. I’ll go with BlitzFirst, you run it, you can either do since startup and it will show you your wait stats since startup or you can do it for a 30 second or some other second sample and you can get wait stats over a sample of time. So it’s a really nice way to look at wait stats. We make it really easy to read them and figure out what’s going on with them so I think it’s a pretty good way. You can probably find a bajillion scripts out there on the internet that all look at wait stats though, if you don’t want to use our scripts, which I don’t know why you wouldn’t.
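
The two sp_BlitzFirst calls Erik describes look like this (@ExpertMode isn’t mentioned above, but it’s the switch that surfaces the wait stats detail):

```sql
-- Cumulative wait stats since the last restart:
EXEC dbo.sp_BlitzFirst @SinceStartup = 1;

-- Or sample what's happening right now over a 30-second window:
EXEC dbo.sp_BlitzFirst @Seconds = 30, @ExpertMode = 1;
```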

Brent Ozar: If you went back to work as a production database administrator and your boss said, as part of the deal for having you go back to work as a DBA, you can have two things: whatever monitoring tool you want and unlimited soups, salads, and breadsticks. Which monitoring tool do you pick and why?

Erik Darling: Good question.

Brent Ozar: I know, right? I’m putting you on the spot.

Erik Darling: Me? I say the same thing every week. Every week I say I would go with SentryOne’s Performance Advisor. That is my all-time favorite monitoring tool. I like the way it looks. I like that it has that dark theme so it doesn’t blind me when I’m staring at it at 4:00 in the morning. I just really like the way the charts and graphs don’t give you a bunch of goofy stuff that you can’t do anything about and you can highlight sections of the graphs and it will bring you to the SQL that was running at the time or bring it to the SQL that is currently running, you can view execution plans. It’s got Plan Explorer built into it so you can view execution plans right from your monitoring tool. You don’t have to do a bunch of other goofy stuff to get your plans somewhere else, which is a shortfall of many monitoring tools, especially web-based ones: you have to download the execution plan and open it somewhere else.

Brent Ozar: Tara, how about you?

Tara Kizer: Which tool?

Brent Ozar: Yeah.

Tara Kizer: Yeah, I like SentryOne’s Performance Advisor but I’ve also used the Quest tools, Foglight and Spotlight. I think they’re all great tools, the ones that I’ve used.

Brent Ozar: I’d be like, any, just give me a tool, I don’t care which one.

Tara Kizer: Yeah, a tool.

 

What’s better: temp tables or table variables?

Brent Ozar: Dan says he’s having a great debate of temp tables versus table variables. He hates to beat a dead horse—good—then stop. No, seriously, this will be still a debate topic ten years from now. He says, “I believe it’s best to use temp tables instead of table variables but my team members disagree. Recompiles and stored procs always come up.” You can fix that, put your temp tables at the top. Put your temp tables at the top and they won’t necessarily recompile. For more details on that, go read Paul White’s temp table recompile. Google for temp table recompile…

Erik Darling: Oh, temp tables and stored procedures.

Brent Ozar: Yeah, oh my god, he’s got a great blog post, like four of them about plan reuse with those that are shocking. I really dig them.
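
Brent’s “put your temp tables at the top” tip, sketched (the proc and table names are made up):

```sql
CREATE PROCEDURE dbo.ReputationReport
AS
BEGIN
    -- Create every temp table up front, before any statements that use them,
    -- so the optimizer doesn't hit deferred compiles mid-procedure.
    CREATE TABLE #TopUsers (Id INT PRIMARY KEY, Reputation INT);

    INSERT #TopUsers (Id, Reputation)
    SELECT TOP (10) Id, Reputation
    FROM dbo.Users
    ORDER BY Reputation DESC;

    SELECT * FROM #TopUsers;
END;
```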

 

How should I handle NoSQL databases?

Brent Ozar: Richie, Alexander wants to ask you, he says, “What do you think about companies that adopt different sorts of NOSQL? How do you handle it all? Like sometimes they want Cassandra, Mongo, whatever. So why are they picking different kinds of databases and what should you do about it as a DBA?”

Richie Rump: Well those different databases usually do one specific thing really well and they don’t do other things at all. If they do it at all, it’s done pretty poorly, so that’s why you’ll have—if I need something for really fast reads, Mongo is really good for that. But if I need some relational data and then you join between different tables and stuff like that, it kind of falls over a little bit on the reporting side. That’s why you’ll see a lot of, “Hey, I need this tool for this, this and this tool for that.” I know we talked about Elasticsearch, that’s a great tool if you want to go and search large amounts of text and things like that. I’m not going to put my relational data in there because it just doesn’t make any sense to do that. So how do you keep up? Well, what do you need to know is really the question. If you need to know about a document database, go look for document databases and start playing with some tools. If you need to know about graph databases, go look at a graph database and go look at some tools. I don’t like to get into the details of it until I’m told this is what we’re going with or if I’m doing some sort of POC. Until then, I just want to keep a high level of understanding what’s going on with each tool and then, hey, we’re going to go with this, then I start going pretty deep in the product.

Brent Ozar: The other thing to keep in mind as you start to spelunk around with lots of multiple tools, some of these tools fold up. If you want to read an interesting thing, search for rethinkdb, why rethinkdb failed. It was a company that was building an open source product and they just folded up recently. It’s a really interesting story behind it, around how hard it is to build an open source tool and make it work, scale, make it make money.

Tara Kizer: What’s the name again?

Brent Ozar: Rethinkdb, all one word, all lowercase. Are you trying to think about something other than SQL Server?

Tara Kizer: You said it was an interesting story, I want to look at it.

Brent Ozar: It really is.

Richie Rump: They had great founders. They were really super smart. They had a great team behind them. They had all this open source stuff coming in all over the place and just this vibrant community and they just couldn’t make it work financially. They even had NASA. NASA was putting it in their stuff. So it’s not like people weren’t using it, just sometimes the financials just don’t work out that way.

Brent Ozar: It’s amazing to read the postmortem because so many people chime in in comments and like, “Yeah, I use rethinkdb all the time but I don’t pay them a dollar because everything is out there and it’s free and it works great.”

Richie Rump: Yeah, we interviewed the founders on Away from the Keyboard last year, and the community manager. We posted the community manager after they had folded and we were like, “Do you want to re-listen to this before we post this, just to make sure you’re good with it?”

Erik Darling: Were things looking sunnier for them then or were they kind of like, “No one is giving us money for this free thing.”

Richie Rump: No, everything was like the sky is the limit. I mean, you’re not going to talk about the bad stuff to a bunch of podcasters. I mean, why would you want to do that? “No, we’re hurting pretty good. Maybe we won’t be around very much longer.”

Erik Darling: If I were working for me, man, I would be out of here.

Brent Ozar: Oh, god. Don’t use our product.

 

Do I have to license SQL Server for TFS?

Brent Ozar: Paul asks, “I just got out of a meeting.” Well, thank god for that. “Where our uptime engineer told—” Uptime engineer, what a great job title. I love it. That’s cool. Richie is our downtime engineer. “He told everyone that you can’t use Developer Edition without paying for licensing because we use TFS for development. TFS is a business application and requires licensing.” Some of Microsoft’s licenses—SharePoint is an example—will include one SQL Server as part of it. So what you want to do is read more about that product’s license. They’re usually production licenses. They’re usually not Developer either.

Richie Rump: I think TFS was Standard but I haven’t been in that game for years, so. But it’s usually just like you said, the one and that’s it.

Brent Ozar: And it’s not like you really want a big, bulky SQL Server for that either. You can live with a small one.

 

How do I protect Analysis Services?

Brent Ozar: M.M. says, “I’m going to set up log shipping as disaster recovery for our production server.” That’s amazing, we’re doing sort of a weird version of that this week. “We have a separate cube server with SQL Server Analysis Services. Do you know what’s required for DR for Analysis Services? We’re not using VMs and my databases are large.”

Erik Darling: I have no idea.

Tara Kizer: No idea. I’ve never even used SSAS.

Brent Ozar: Everything I know from Analysis Services is just from listening to the guys at Prodata talk. It’s like www.prodata.ie, they’re out of Ireland. Just awesome, wonderful, friendly people—Carmen and Bob. The basic idea would be you just stand up another analysis services box over there and when you failover you rebuild your cubes. No sense in keeping the cubes in sync across.

 

Can I do partition switching with foreign keys?

Brent Ozar: Richard says, “I want to use…” Richie likes this. Is it just that you like his name or you like his question?

Richie Rump: I like his name, that’s it. You’ve got a great name, buddy. It means powerful ruler and I’m pretty sure you are powerful in whatever you do.

Brent Ozar: His last name is Seltzer so his middle name must be Alka. “I want to use table switches in SQL Server partitioning with sliding windows.” I’m pretty sure I know Richard too, come to think of it. He wants to do partitions with sliding windows but apparently, he can’t do that if the table is a parent in a foreign key constraint. “Is there any way to do switches or fast clearing out of partitions without dropping and re-adding constraints?” Have you guys done partitioning with constraints?

Erik Darling: Not that kind.

Richie Rump: Yes, but we didn’t do sliding windows. We didn’t slide in the data.

Brent Ozar: Every time I had to do one of those projects I didn’t use foreign key integrity and it didn’t occur to me until just now. Like I never used parent child foreign key integrity. We trusted the ETL process to manage that and make sure it was clean and pure, which of course is a bad idea but you know, I had really smart, nice, trustworthy ETL guys who lived in a van down by the river.

 

Where are best practices for SSRS 2016?

Brent Ozar: Tammy says, “I have another SSRS question. I was looking for best practices for the more recent versions of reporting services and I couldn’t find anything recent. Can you recommend a resource?” No.

Erik Darling: Doug Lane.

Brent Ozar: Even that, not recent.

Erik Darling: Totally, yeah.

Brent Ozar: Do we even know anybody who is using…? We just had the GroupBy conference. Bert Wagner—if you go to groupby.org—Bert Wagner did a session on high performance SSRS and right at the beginning he’s like, “Look, I haven’t used SSRS in the last year or two. This is just general performance advice” because people were asking how is it different in 2016. He’s like, “I don’t know.” I don’t know anybody who is doing 2016 SSRS. Not that it’s bad I just don’t know anybody doing it.

Erik Darling: Tableau.

Richie Rump: In a tableau world…

 

When should we apply service packs and cumulative updates?

Brent Ozar: Interesting. Lee says, “Microsoft wants us to apply service packs and cumulative updates and security releases right away. I’m curious as to your thoughts on that.” What did you guys do when you were database administrators or you were managing databases?

Erik Darling: You know, you’ve got to patch those dev and QA servers first, man. You can’t just stick that on prod. That’s my advice.

Brent Ozar: You let it bake for some period of time.

Erik Darling: Yeah. If you’re super cool and you have clusters or AGs then you can patch some secondaries first or some dev servers first and let that burn in and then flip around and do your other stuff. The right-away thing, cool for dev and QA and whatever UAT environment you have, not cool for prod because Microsoft updates will still break stuff. Something seems to go awry with just about every one of those SPs and CUs.

Brent Ozar: The best thing was breaking NOLOCKs. One of the CUs broke NOLOCK. If you had it in development, you would never know because it’s not like you’re worried about concurrency in a dev environment. People are just running queries and you expect locking to happen so even when you bake it in for a while it still becomes an ugly surprise.

 

I want to upgrade and change my cluster…

Brent Ozar: Niraj asks a question that I feel like I want to type with a latex glove. “I need to perform an in-place upgrade from SQL Server 2008 R2 and it’s a cluster. Want to upgrade the cluster to SQL 2012. By the way, I’d like to convert them to Always On.” I’m assuming he means Always On Availability Groups. “Any suggestions?”

Erik Darling: Don’t.

Tara Kizer: I am not a big fan of in-place upgrades in production. I’ve certainly done it in test environments to save us time but never, ever in production. Usually when I’m upgrading from one version to the next I’m also switching hardware, or at least to another server. But if you’re not going to be switching servers, just install a new instance, then do the configuration you want—although if you’re going to be moving from a two-node failover cluster instance to availability groups, you’re obviously going to need to do a bunch of work on that server if you’re doing it in place. You’re going to have to remove the cluster, add it back, and then set up availability groups, so I would not recommend doing all that on the production servers. You have no fallback plan.

Erik Darling: The other thing is that since you’re on that version of SQL Server, you’re probably on Windows 2008 R2. That’s just like, you know, a bucket of crap for a failover cluster. Windows Server 2012 and 2016 are much better and god help you if you try to do an in-place OS upgrade.

Tara Kizer: We did those in the test environment just to save time and we’re just like two thumbs up if it works.

Erik Darling: Terrifying.

 

How can I tell which tables are getting used?

Brent Ozar: Robert has a question I love because I used to struggle with this myself. Robert says, “I inherited a legacy system with a lot of dead code and tables. Sys.dm_exec_query_stats fails me when tracking down which objects are being used. What’s the best way to determine what is being used?” Oh, there’s so many ways to do this. I would probably push back and say, why? What are you going to do? Because best case scenario you drop some tables. Worst case scenario, you break somebody’s application.

Erik Darling: Yeah, especially the code. It’s not taking up space so it’s not hurting anything just sitting there. I wouldn’t go ahead and mess with it. If you really want to get into it then set up some sort of auditing or some other monitoring just to see what runs over a 30-day or 90-day period and then get rid of everything that doesn’t. But even then, it’s not really guaranteed that anything is going to change for the better. You get rid of stored procedures that never run, they seem pretty harmless to me, like fingernails.

Richie Rump: Yeah, there’s a development philosophy where you want to delete as much code as possible. So early and often you want to delete code, you’re always looking to delete code, delete code, delete code, so that when it does go out, you’re not left with all this commented-out code and/or code that just doesn’t run anymore. Typically, I do a lot of that. I’ll go ahead and write a bunch of stuff—oh, that didn’t work at all. So you go in and make sure it works, and then you do a check-in, deleting all the dead code. Typically, that’s how that would work, but a lot of people don’t do that. They just kind of say, “We moved something,” and the old stuff remains.

Erik Darling: For tables though, one thing that I used to do whenever I inherited a server, I would look for table names that have like BK or back or a date on them, because that’s usually someone saying, okay, I’m going to do something stupid over here, let me back up the table as is, and then if I need it again, you know. Especially when I was doing the eDiscovery thing, you would sometimes have to make changes to Relativity’s configuration table, which is just a big long table with a name and a value and then a SQL variant column to put in your configuration. There were times when you had to make changes and you didn’t want to mess anything up, so I would always select everything over into a table named like the table name plus “back” with a date, so I would know which table was which. So that’s one thing that you can look for: sometimes there’s a pattern that pops up where people make backup tables that are safe to ditch after a while.
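Erik’s naming-pattern hunt is easy to script. Here’s a rough sketch — the regex and sample table names are my own illustration, not a vetted rule, so expect false positives (a table legitimately named something ending in “back,” say):

```python
import re

# Patterns people use when squirreling away "just in case" copies:
# a BK/bak/backup suffix, or a date stamped onto the end of the name.
BACKUP_HINT = re.compile(
    r"(_?(bk|bak|back(up)?)$)"          # Users_bk, Orders_backup
    r"|(_?(19|20)\d{6}$)"               # Configuration_20160301
    r"|(_?(19|20)\d{2}[_-]\d{2}[_-]\d{2}$)",  # Invoices_2015-06-01
    re.IGNORECASE,
)

def likely_backup_tables(table_names):
    """Return the names that smell like ad-hoc backup copies."""
    return [t for t in table_names if BACKUP_HINT.search(t)]

tables = [
    "Users",
    "Users_bk",
    "Configuration_20160301",
    "Orders_backup",
    "Posts",
    "Invoices_2015-06-01",
]
suspects = likely_backup_tables(tables)
```

Feed it the output of a query against your system catalog and eyeball the suspects before dropping anything — the whole point of the thread above is that name patterns are a hint, not proof.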

Brent Ozar: He follows up and he says his biggest problem isn’t space but Visual Studio DACPAC deploy times. Yeah, DACPAC isn’t known for speed and efficiency of deployment. That’s your bigger problem there. If you want to performance tune that process, I’d actually look at switching processes. Because even if you have a bare minimum of tables, as your data file grows, you can continue to be screwed, because sometimes, in order to effect one ALTER TABLE, DACPACs will create a new table, shove all the data into it, then drop the old table. That can break all kinds of things—replication, security setups that you’ve done—so I’d want to be careful about that. Thanks, everybody, for hanging out with us this week. We will see you guys next week in Office Hours. Adios.


The 2017 Data Professional Salary Survey Results

We asked to see your papers, and 2,898 people from 66 countries answered.

Download the raw data in Excel, and you can slice and dice by country, years of experience, whether you manage staff or not, education, and more.

Community bloggers have already started to analyze the results.

Keep in mind that the data’s only as good as the people who entered it. This was free for anyone to enter, and we didn’t validate their experience, their actual pay stub, or whether they have naked pictures of the boss that they’re using for blackmail. You have to take the data with a grain of salt, and use medians (the middle numbers) rather than averages. (We’ve also taken the liberty of hand-editing and removing specific rows – for example, somebody filled in $10,000,000,000,000 per year, and that row got removed.)
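Here’s a quick illustration of the medians-over-averages point, with made-up salary numbers (not rows from the actual survey):

```python
from statistics import mean, median

# Five plausible salaries, plus the kind of row we hand-removed.
salaries = [62_000, 75_000, 88_000, 95_000, 110_000]
with_troll = salaries + [10_000_000_000_000]

# The average gets dragged into the stratosphere by a single bogus row...
skewed_avg = mean(with_troll)

# ...while the median barely moves, because it only looks at the middle.
honest_median = median(salaries)       # 88,000
troll_median = median(with_troll)      # 91,500
```

One trillion-dollar prankster turns the average into nonsense, but shifts the median by a few thousand — which is exactly why you should slice this data with medians.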

We’ll definitely do this again next year, and there’s one thing we’ll need your help to figure out: how do we get more granular location data around the world? The problem with open text entry location fields is that people can’t be trusted to put in consistent data. I’d love to have a dropdown box for country, which then populates a dropdown box for state/province, which then lets people pick metro areas. The challenge there is finding a free or open source provider for that data that integrates well into a survey. If you’ve got ideas, I’d love to hear ’em – but remember, they need to be global (not just US), and they need to be free or open source.


Indexes For Kids

Indexing, SQL Server
18 Comments

Kid Tested

When you work with something a lot, you tend to extrapolate concepts from it out to everything you see. That one person collecting tickets is a serial zone. That food truck line is a HEAP. The empty subway car that smells like a toilet is Replication.

Stuff like that.

If you have one or more kids, you might even see database concepts pop up with them, too. Their messy room is index fragmentation. Their tiny inefficient hands are like cursors. Their dirty diapers are like Replication.

Dadsplainin’

I don’t sit around lecturing my kid about database stuff. It’s not that I don’t love what I do, it’s just that I honestly do hope there are job openings for princess astronaut doctors by the time she’s old enough to start looking.

But then we got this puzzle! And it’s like, the perfect way to explain why indexes can make things faster.

I mean, just look at it.

The wires are in this picture to annoy Richie.

 

It even joins together. Lord help me. Here it comes.

If you’re already cool with indexes

This post isn’t for you, unless you’re super bored.

See, each of those stacks is like a column. Or a field. And each of those letters and pictures is like a row. Or a record.

Now that your database terminology OCD requires medication

The end goal is to match each one up. They even have matching patterns. It’s really something.

Here are your DBA choices for matching them:

  • If they’re both shuffled, you pick a stack and look for the corresponding piece in the other stack. This is horribly slow and procedural, even if you start to cheat and make matches when you remember you’ve seen a piece recently.
  • You can organize one alphabetically, much like adding an index, but you still have to wander through the unsorted stack. That’s not much better, is it? You’ve turned into a human Nested Loops Join. Half-indexed data doesn’t help you much here.
  • Last, you can organize both stacks alphabetically. This is obviously the most efficient. It’s almost like join elimination. You can mash both stacks together without looking.
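The three strategies above map neatly onto join algorithms. Here’s a rough sketch in Python (a conceptual toy, not T-SQL — the letters stand in for join keys) comparing the shuffled-stacks approach to sorting both stacks first:

```python
import random

letters = list("ABCDEFGHIJ")
stack_a = random.sample(letters, len(letters))  # shuffled pieces
stack_b = random.sample(letters, len(letters))

# Both stacks shuffled: a nested loops join over unsorted data.
# For each piece in one stack, scan the other until the match turns up.
def nested_loops(a, b):
    comparisons, pairs = 0, []
    for x in a:
        for y in b:
            comparisons += 1
            if x == y:
                pairs.append((x, y))
                break
    return pairs, comparisons

# Both stacks sorted: mash them together in one pass, roughly what a
# merge join does with two pre-sorted (indexed) inputs.
def merge_match(a, b):
    comparisons, pairs = 0, []
    for x, y in zip(sorted(a), sorted(b)):
        comparisons += 1
        pairs.append((x, y))
    return pairs, comparisons

slow_pairs, slow_cost = nested_loops(stack_a, stack_b)
fast_pairs, fast_cost = merge_match(stack_a, stack_b)
# Same matches either way, but the pre-sorted pass needs far fewer
# comparisons - that's the index paying for itself.
```

With ten pieces the difference is 10 comparisons versus 55; with ten million rows, it’s the difference between a quick query and a long lunch.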

This is a lot like what Brent teaches in How To Think Like The Engine. Any WHERE clause or JOIN that isn’t indexed means you’re scanning something. This doesn’t mean I’m totally against scans, it just means I’d rather SQL scan indexed data than unindexed data.

Towards entropy

If this puzzle were like your data, it probably wouldn’t be a 1:1 match. I’m not insulting your data, mind you. Some customers signed up, but they didn’t order anything.

Similarly, if we were missing any puzzle pieces, we’d know, because we just sorted them all. If you don’t sort first, trust me, you end up looking through the stack like four times, thinking you lost your mind because you can’t find the xylophone. Ordering the data first means you know the xylophone is missing, so when X comes up, you discard it.

Inevitably, one of these pieces will end up broken or in the toilet.

And this, my dear friends, is why you always make backups.

Thanks for reading!


Your Favorite Bugs & Enhancement Requests: #TSQL2sday 86 Roundup

SQL Server, T-SQL
18 Comments

I dunno about you, but I got a big stocking full of coal. Next year, I’m gonna be better, and I plan on asking Santa for a whole bunch of Connect requests. For T-SQL Tuesday, I asked you to name your favorite SQL Server bugs & enhancement requests, and here’s what you want in your stocking next year.

If you agree with a feature, click on it, and upvote its Connect request. These bloggers took the time to make their case – now it’s time for you to vote.

Export/import SQL Server settings – Rich Benner blew me away with this idea. Wouldn’t it be cool to simply export all your settings when you need to build a new server and make sure it’s consistent with the old one?

A query hint to force parallelism – James Anderson wants a supported hint that would do the same thing as the unsupported trace flag 8649, and I am intrigued by his idea, and would like to subscribe to his newsletter.

Track last-used-dates for objects – Wayne Sheffield wants it. You want it. I want it. This would be awesome.

Turn off Enterprise features in Developer Edition – Mark Southall echoes something I’ve heard so many times from DBAs who want their developers to avoid features they can’t deploy in production. Yes, this is less necessary in 2016 SP1 – but there’s still a few reasons, and a lot of people who aren’t on 2016 SP1.

Add a numbers table – Aaron Bertrand asked for something that would make advanced T-SQL a whole lot easier to write (and especially more consistent across different servers). He also included a few honorable mentions, plus his favorite resolved Connect items.

String or binary data would be truncated – Denis Gobo is as sick of this stupid error message as I am, and I’ve upvoted this Connect suggestion before. So necessary. And for bonus points, Lori Edwards wrote about the exact same thing!

Indexed view insertion bugs – Wow, until I read Shane O’Neill’s post, I had no idea these bugs existed, and it scares me a little. OK, maybe a lot. Hold me.

Play a sound while waiting for SSMS – Tom Roush is an optimist who brings joy to others, as do his Connect requests.

Faster query execution with batch mode – Koen Verbeeck talks about Itzik’s slick trick to use columnstore-style execution to make rowstore queries go faster, and asks to get a non-hacky workaround.

Retaining history of dropped columns in temporal tables – I think temporal tables stand a chance of being one of the coolest features in 2016 over time, and Randolph West points to a great idea to make ’em even better.

Unsigned integers – One of my favorite new bloggers, Ewald Cress, managed to work in a Zoolander reference. Bonus points. Enough bonus points, in fact, that he needs numbers larger than integers to track them.

Extended Events live data view gets behind – Dave Mason warns you about the first issue you’re likely to run into when you’re playing with XE for the first time.

Fixing a bug with columnstore indexes and filtered indexes – Any blog post that has a stack dump is fun to me, even if I haven’t been using these two features often.

Database-level wait stats – Arun wants to bring this feature down from the cloud. I totally understand that it’d be hard to code, but…we went to the moon, right? Also, I want a pony.

Sorting a stacked Power BI chart – Looking at Melissa Connors’ example, I would probably bet one crispy taco that this will be the first item in this list that actually gets fixed.

Always On recovery pending bug – Muthukkumaran Kaliyamoorthy got bit by what sounds like a really funny AG bug. (He might beat Melissa in speed-to-fix, come to think of it.)

Native multi-tenant support – One database, multiple clients with the same tables, but kept separated by the engine. I love Kennie Nybo Pontoppidan’s optimism, but I think Tom Roush’s request is likely to get fulfilled first.

MSBuild support in SSIS – Automated deployment is getting more mainstream these days, and Jeroen Janssen wants to make it easier.

SSRS “while-loading” page – Ryan McCauley wants to customize the page users see before their report loads, and show them useful information about the report. I love it.

Subfolders in SSRS – I’m not sure if Bert Wagner is trolling me, because it seems impossible that this hasn’t happened after 7 years of requests.

Estimated Number of Rows to be Read – Rob Farley wants to see this property added to the execution plan tooltips.

Named function parameters – Instead of just passing in a list of values, Riley Major makes a great case for why you should pass the names in too. He won me over.

Padded strings – No, not just with spaces, but for example a string of leading zeroes for zip codes. Christian Gräfe won me over with that Oracle idea.

Add included columns to sp_helpindex – Kenneth Fisher, buddy, pal. One word: sp_BlitzIndex. At least I think that’s one word.

SQL Server on Raspberry Pi systems – Winning post in the category of “Highest Licensing Exposure for Smallest Performance Gain.”

RowCreateTime, RowModifiedTime, RowModifiedUserName fields – Louis Davidson opined that SQL Server should automatically update these columns. It’d be really useful for change detection. This one’s my favorite – in fact, I’m such a huge fan of this that my own T-SQL Tuesday post was about a different Connect request with the same goal.

Updates After the Publication:

IntelliSense for MDX – Jens Vestergaard wrote a post, but forgot to link back here, so we didn’t know about it. I like the idea though!

Storing & accessing Query Store data – Erin Stellato wants to put it somewhere other than PRIMARY, and export it from the database.

Lots of stuff – Adam Machanic put together a whole wish list for developer ease-of-use.

Whew! That was a ton of fun. Wanna host your own T-SQL Tuesday? Check out the rules and email amachanic at gmail dot com – he’d love to have you, and it’s a great way to meet new bloggers, spark inspiration, and get other people reading your blog.


[Video] Office Hours 2017/01/11 (With Transcriptions)

This week, Brent, Erik, Tara, and Richie discuss restoring databases from Enterprise Edition to Standard Edition servers, their favorite way of collecting SQL server metrics, extended events, career advice, how to tell what processes SQL uses memory for, and much more!

Here’s the video on YouTube:

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Enjoy the Podcast?

Don’t miss an episode, subscribe via iTunes, Stitcher or RSS.
Leave us a review in iTunes

Office Hours Webcast – 2017-01-11

 

Can you restore from Enterprise to Standard Edition?

Brent Ozar: We might as well go ahead and get started because now we have two pretty good questions in there that don’t have anything to do with the Nelson Mandela Effect. Mandy asks, “Can there be any issues restoring a database from an Enterprise Edition server over to a Standard Edition server? I’m on SQL Server 2014.”

Erik Darling: Yes.

Tara Kizer: I’m pretty sure that when we did that to move a production database down to a test environment for like customer service at my last job I believe it was, the restore failed because it was using an Enterprise Edition feature. I think normally though you can do the restores down but this was a specific issue. I forget what the Enterprise feature was though.

Erik Darling: Yeah, it really could be any of them, it doesn’t matter if it’s partitioning or compression or just any one of those things, or TDE, or something that’s Enterprise only. If you’re using any one of them, the restore will fail. There is—I forget if it’s its own DMV or if it’s a column in a DMV—it’s like sku persisted features or something like that.

Brent Ozar: Yeah, persisted sku features.

Erik Darling: Yeah, that’s the one. That will tell you if you’re using anything Enterprising. You should know that. You’re a good DBA. You should be aware of all the features that your developers are implementing.

Brent Ozar: Yeah, but she is such a good DBA she never had to do a restore, so.

Erik Darling: Terrifying.

Tara Kizer: I’ve never really even used Standard Edition, so a lot of times I have no idea what those lesser editions have. I’ve always worked in large corporations that just throw money at SQL Server.

Richie Rump: What are those poor plebs using?

Erik Darling: It’s like, no, it’s all Enterprise.

Tara Kizer: Yeah, what we had to do at my last job, because this was going to be used as a test environment, but it was going to be used like production, so we had to license it and we didn’t want to waste an Enterprise license on that one. So we had to restore to an Enterprise Edition test server, remove the feature, backup, and then restore it to the Standard Edition server.

Brent Ozar: My favorite way to tell when Enterprise features are being used by a database is to use sp_Blitz. If you run sp_Blitz, it will actually check to see if any databases are using Enterprise Edition features and it will tell you right in there. That will help a lot.

 

Have you ever seen hypothetical indexes suppress normal ones?

Brent Ozar: Justin asks—here’s a really bad question—“Have you ever seen hypothetical indexes suppress normal ones?”

Richie Rump: Not the worst question we’ve seen this week.

Brent Ozar: No.

Tara Kizer: It’s a good question. Hypothetical indexes don’t actually get used by anything. They’re just used by the Database Engine Tuning Advisor, right? They were just left behind from that—it didn’t complete successfully, so it didn’t clean up all the stuff that it left behind.

Erik Darling: I wonder if that would ever suppress a missing index request though.

Brent Ozar: Oh, that’s a great question. We should totally try that.

Erik Darling: Yeah, that would be weird, right?

Tara Kizer: That means we have to use DTA.

Erik Darling: Can you create a hypothetical index? I forget.

Brent Ozar: I don’t think so.

Erik Darling: Never mind then, out the window.

Tara Kizer: Too much work.

Erik Darling: Scratch that blog post before it even begins.

Brent Ozar: That’s a great question too though, could you create your own hypothetical index? So then we have to start tracing the DTA.

Erik Darling: I did that once and it was stupid looking.

Brent Ozar: There’s trace flags.

Erik Darling: Yeah, I remember specifically there was one like showplan feature that the only other reference I found to it was on a website that was all in Russian and I didn’t click on it. I was like I don’t want to know.


 

How do you recommend collecting performance data?

Brent Ozar: Renee asks, “Do you guys collect KPIs, key performance indicators, of the SQL Server? If so, how do you do that? What’s your favorite way to collect SQL Server metrics?” We should ask each of you. Erik, how do you collect SQL Server performance indicators?

Erik Darling: I use sp_BlitzFirst.

Brent Ozar: How does that work?

Erik Darling: You can log in all the tables and you can set it up as an agent job to run every however many minutes or seconds you want. If you’re short a monitoring tool or round-the-clock DBAs who don’t mind hitting F5 or one of those dark man bird things…

Brent Ozar: People who aren’t watching the YouTube video won’t get that.

Erik Darling: You know, it’s a pretty good way to collect metrics across a variety of things like wait stats and file stats and all sorts of stuff. Then it is kind of an exercise to you to query and trend that information.
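One wrinkle when you query and trend that logged data yourself: wait stats are cumulative counters, so the useful number is the delta between consecutive samples. Here’s a hypothetical sketch of that step (made-up numbers and a simplified shape, not sp_BlitzFirst’s actual table schema):

```python
from datetime import datetime

# Cumulative wait-time snapshots (ms), as a logging job might collect them.
# Real wait stats come from sys.dm_os_wait_stats; these values are invented.
snapshots = [
    {"taken_at": datetime(2017, 1, 11, 9, 0),  "CXPACKET": 5_000, "PAGEIOLATCH_SH": 1_200},
    {"taken_at": datetime(2017, 1, 11, 9, 5),  "CXPACKET": 5_900, "PAGEIOLATCH_SH": 1_250},
    {"taken_at": datetime(2017, 1, 11, 9, 10), "CXPACKET": 7_400, "PAGEIOLATCH_SH": 1_260},
]

def deltas(snaps):
    """Turn cumulative counters into per-interval waits."""
    out = []
    for prev, curr in zip(snaps, snaps[1:]):
        out.append({
            wait: curr[wait] - prev[wait]
            for wait in curr
            if wait != "taken_at"
        })
    return out

per_interval = deltas(snapshots)
# per_interval[0] -> {"CXPACKET": 900, "PAGEIOLATCH_SH": 50}
```

The raw cumulative numbers only ever climb; it’s the per-interval deltas that show you which five-minute window your server spent waiting on parallelism versus disk.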

Brent Ozar: Richie, how about you?

Richie Rump: I don’t always collect stats, but when I do, I use sp_Blitz. Even before I started working for Mr. Brent Ozar I used sp_Blitz. So nothing new there. If it ain’t broke, don’t try to figure it.

Brent Ozar: Okay, Tara, give these guys an intelligent answer.

Tara Kizer: I’m a big fan of monitoring tools that can collect this data for me. If I don’t have a monitoring tool licensed for a certain server, I might start collecting stuff there myself—maybe it’s perfmon counter logs that I’m using. But again, I’ve worked at large corporations where we just had monitoring tools collect all the stuff and then we reported on that out to a DBA team.

Brent Ozar: Tara doesn’t get out of bed for less than Enterprise Edition.

Tara Kizer: Sorry.

Brent Ozar: That’s my favorite answer too. We write sp_BlitzFirst to make it easier to go gather stuff but if I was a DBA tomorrow I would immediately go to my manager and say, “I’ve got to have a dashboard. I’ve got to know what’s going on here.” I would never try to roll my own as a DBA. There’s a funny story around me not getting a job because I gave that as an answer in part of the interview.

Tara Kizer: I actually have rolled my own monitoring tool—this was many years ago, in a large corporation, before we started shelling out cash for monitoring tools. It was a little .NET application. I didn’t have access to the production server from my desk or at home and we had to walk to the actual data center, which if I was in the office was fine because it was two buildings over, but it just became ridiculous. So I put a tool out there to email me some information, or maybe put it in a log file and email it to me. Eventually, we did get access. Getting a call in the middle of the night was ridiculous when I had no access.

 

How many miles did you walk this weekend?

Brent Ozar: You know, you say that, you didn’t want to walk to the server but how many miles did you walk this weekend?

Tara Kizer: So I only became active the past—I’m going to say four years. I was pretty sedentary before that and a lot heavier.

Brent Ozar: I see the pictures on Facebook and I’m like dear god, that’s a mountain, how…?

Tara Kizer: Yeah, I’m a little concerned about the two hikes I signed up for for Saturday and Sunday because they’re both hard and I didn’t realize they were the same weekend. I was like, “Oops,” usually I only do one.

Brent Ozar: Wow, tell us about those.

Richie Rump: The real question is what’s in the water bladder? Is water really in the bladder?

Tara Kizer: It’s definitely water because everything else is junk. Everything else breaks the hydration. One is in miles—it’s only like 1500 elevation gain—but it’s boulder scrambling. I just completed a hike on Sunday that involved a lot of boulder scrambling to get to the peak. I’ve never been sore on my arms from my hike as I was on this hike, usually it’s my legs. So Saturday’s includes that too. Sunday’s is a cross country for the desert for a peak out there called Willow Peak. I’ve been wanting to do that for a while. You need a capable vehicle to get to the trailhead so I need to go with groups that have those vehicles.

Brent Ozar: Wow.

Richie Rump: Nope, nope.

Tara Kizer: I think that’s 12 miles. It’s 21 miles and I’ve done that in a day before but it’s just the elevation gain and the type of hike that it is that’s going to make them really hard.

Brent Ozar: Richie, you’re saying no but you started running again. I heard that on the podcast.

Richie Rump: Yes, yes, but I’m not doing 21 miles in the desert so there’s a difference here. It was 60 degrees on Monday here and I didn’t even get out of bed. I was like, “I’m not running. That’s it. I’m done.” It’s got to be hot and it’s got to be humid for me to actually get out and go do it.

Brent Ozar: For those of you that don’t know, Richie lives in south Florida which is where you don’t even have to carry water. All you have to do is open your mouth and keep running and you just immediately hydrate.

Tara Kizer: The highest peak in Florida I think is like 350 for the elevation.

Richie Rump: And that’s at Disney World, right? That’s like Big Thunder Mountain or Space Mountain, or one of those two.

 

Should we use the cloud as a tertiary server?

Brent Ozar: Thomas asks, “What do you guys think of using Azure or Amazon Web Services or Google Compute Engine for a third secondary, aka tertiary, for High Availability? Is there one that you guys would recommend over GCE, Azure, or AWS?” I’ll take that one. I would use the one that your developers want to use because they’re probably wanting to use other features or things that are inside that platform. Some developers have stuff that they like to use in Azure, some have stuff they like to use in AWS, whatever, so I would look bigger and from an enterprise perspective. We never get to pick as DBAs, we’re kind of like low man on the totem pole there. Richie gets to pick because he’s our head developer. He gets to architect all these things. We just, “Where do you want to go, Richie? All right, that’s the one we’re going to go with.”

Richie Rump: Except I don’t get to pick because you got to pick, Brent.

Brent Ozar: I got to pick.

Richie Rump: And what do I do? I’m like, “I’m going to use Azure for this” and just go off and do it.

Brent Ozar: Yep, and then swear later.

 

How do I use the system health XE session?

Brent Ozar: Terry asks, “Do you have a recommended blog or anything on how to interpret and use the system health extended events session? P.S. I have taken down all my Christmas cards except for yours which is staying up forever.” Awesome. We send out Christmas cards to our training buyers. Have any of you guys used the system health extended events session?

Erik Darling: I have only used it to look at deadlocks and I think maybe like occasionally just to—I’ve looked at like the CPU utilization thing but I just haven’t really cared all that much about it.

Brent Ozar: I don’t either.

Tara Kizer: I used it one time and I had to look up how to use it because I can never remember and also what is included in it. But Google has always led me to the answer.

Erik Darling: I would say deadlocks is probably the most common, there’s a bajillion blog posts about that.

Brent Ozar: If you want to learn more about extended events in general, there’s two things: look for Jonathan Kehayias and then also look for SQL Server Central’s Stairways to Extended Events. Both of those are good resources.

 

I have a function that’s fast on one server and slow on another.

Brent Ozar: Frederick says, “I have a function that takes 35 seconds to run on its original server then I moved it to an identical server—same CPU, same memory, same SQL Server version—and it runs there in five seconds. I can’t find any differences in MAXDOP, cost threshold, query plans, whatever. They’re on the same VM server and SAN. I’m out of things to check. What should I do next?”

Tara Kizer: Statistics IO and time. I’d compare the output of those and I would bet that there is a plan difference though.

Brent Ozar: Yeah, he didn’t mention a plan difference. So when you go to look at the execution plans, what would you do?

Tara Kizer: I think it does say there aren’t any differences in the query plans.

Brent Ozar: Oh, yeah.

Tara Kizer: But I would bet there are.

Brent Ozar: Yeah, you’re right. I bet you’re right. We’ve got a post talking about how you can use the plan comparison tool built into SQL Server Management Studio 2016 to compare two different plans, and it will find all the differences between them. That’s useful.

Tara Kizer: Then also, while both of them are running, run sp_BlitzFirst or sp_WhoIsActive and see what it’s waiting on. Is there blocking? Is that what the difference is?

Brent Ozar: Yeah, sp_BlitzFirst has the @Seconds parameter. So you can run it with @Seconds = 30 and it will take a 30-second sample and tell you what happened during that time span. That may help a lot.
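If you have the First Responder Kit installed, the sampling call looks something like this (a sketch; both parameters are real sp_BlitzFirst parameters):

```sql
/* Take a 30-second sample of waits, file stats, and Perfmon
   counters, then report what happened during that window */
EXEC dbo.sp_BlitzFirst @Seconds = 30, @ExpertMode = 1;
```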

 

Why is a backup being blocked?

Brent Ozar: Paul says, “I’m seeing blocking while my transaction log backup is running, and the blocking isn’t picked up by Idera’s monitoring. What in the SQL Server engine could be blocked during a transaction log backup? I didn’t think backups would block anything.” Has anybody else seen this?

Tara Kizer: We need more information I think.

Brent Ozar: There are two.

Tara Kizer: Do you know what it is?

Brent Ozar: One. One is msdb. At the beginning or the end of your log backup, it will hit msdb in order to read or write backup history. So I ran into this before when I had a long history of backups and I wasn’t purging it out. So then my monitoring tools would actually get blocked. You know what’s funny? A lot of monitoring tools will exclude their own queries. So they won’t show you that they’re part of the problem. If I had a crispy taco to bet, I would bet it on that one. It could theoretically also be system tables but I would bet it on the msdb tables.
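If stale backup history turns out to be the culprit, msdb ships a documented cleanup procedure. A sketch, assuming a 60-day retention window:

```sql
/* Purge msdb backup history older than 60 days so log-backup
   start/end writes to the history tables stop getting blocked */
DECLARE @oldest datetime = DATEADD(DAY, -60, GETDATE());
EXEC msdb.dbo.sp_delete_backuphistory @oldest_date = @oldest;
```

Run it during a quiet window the first time; a large backlog can take a while to delete.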

 

What should MAXDOP be if I have 80 cores?

Brent Ozar: Oh, goodness, Jonelle. Jonelle I like already because Jonelle has a lot of money. Jonelle says, “Hi. In a transactional workload, what’s the best practice for MAXDOP on a system with 80 cores?” Not 8, not 18, but 80 cores.

Tara Kizer: … one of my clients that claimed that they were throwing hardware at a performance problem and they started with like 4 processors and I think they went up to 8 or something. I was like, “You have 8.”

Brent Ozar: My laptop has 8.

Richie Rump: Mine does too, 8.

Tara Kizer: I know that we say max degree of parallelism shouldn’t go any higher than 8 unless you can prove that higher will help you. I sat in on Gail Shaw’s presentation at PASS for tempdb. I think she covered MAXDOP. I’m pretty sure it was a tempdb session but I went to two of her sessions. Her opinion is MAXDOP should be set to no more than four—or maybe that was a tempdb datafile? Maybe that’s what it was. I know that I’ve seen other people say max degree of parallelism, don’t go any higher than 4 but I know that we say 8. Try them both.

Brent Ozar: Erik, you ran tests on CHECKDB at Dell DBA Days and you ran it with different MAXDOPs. It went faster when you went higher than 8 too, right?

Erik Darling: Well, sort of. On 2016, it went faster up until I hit the number of cores in a socket. So we had, I think it was dual 18s in those?

Brent Ozar: I think that sounds right.

Erik Darling: Yeah, dual 18s or 16s. So we had that and when I got up to the number of cores in a socket, DBCC CHECKDB went not like crazy faster, but it did get faster and faster up until I hit the number of cores in a socket. But prior to 2016, I’ve always found that it sort of levels off around MAXDOP 8. I’ve never found that once you get up to like MAXDOP 16, 32, 24, that it gets any better. 2016, there is a difference. Whether that translates to workload queries is different. But you can set MAXDOP specifically for DBCC CHECKDB in 2016 so that’s where it shines a bit.
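The 2016-specific knob Erik mentions looks like this (a sketch; the database name is a placeholder):

```sql
/* SQL Server 2016 lets you cap parallelism for CHECKDB directly,
   independent of the server-level max degree of parallelism */
DBCC CHECKDB (N'StackOverflow') WITH MAXDOP = 16, NO_INFOMSGS;
```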

Richie Rump: I have done a little bit of work with an 80 core server. I had the opportunity of working on one of those guys. The workload that I was running was just doing a lot of reading on vast amounts of data. One of those tables was like six terabytes or something like that. I was playing with MAXDOP and my sweet spot for that one query was 24. I did it in increments of five and anything after that—or four, or whatever it was…

Tara Kizer: I was going to say five, why would you go in increments of five?

Brent Ozar: That’s our developer, ladies and gentleman.

Erik Darling: Increments of 13.

Richie Rump: But after that it got slower, before that it kept going faster. So I just set that one particular query to 24 and kind of off to the races. I don’t think we set MAXDOP for that particular server but for that particular query, it needed to be set.

Brent Ozar: The reason why I bring up those other kinds of experiences is that most people out there won’t invest in an 80 core server. So you, Jonelle, I’m not just saying this because you have money, you’re probably a smart and attractive person as well, whether you’re a man or a woman, it’s probably either way. Stand near an 80 core server and amazing things happen. But you may want to have different settings for things like rebuilding indexes or CHECKDB. You may want to be the kind of person who plays around with Resource Governor to set different MAXDOPs for different workload pools.

Erik Darling: I think if I’m on an 80 core server I’m going to be really worried about CMemThread contention. I want to be looking into trace flag 8048 and some of the other ones that, prior to 2016, help out when you have that many cores. Because I’m guessing if you have an 80 core box you are at more than eight cores per socket and that’s when things start to get a little tricky, when you have those big, wide CPUs.

Brent Ozar: How many cores—the new desktop you just built—how many cores does that thing have?

Erik Darling: 8?

Tara Kizer: 120.

Erik Darling: It’s an i7-6850K, so it’s 6 cores, 12 threads. So it is hyper-threaded. I didn’t go with the crazy one.

Brent Ozar: So you have more solid state drives than you have cores.

Erik Darling: Yeah, but they’re in a Windows 10 storage space so technically I only have one solid state drive that’s 3.5 terabytes.

Richie Rump: Your priorities are out of whack, sir. I hate to tell you that.

Erik Darling: If I wanted to spend an extra $400 I could have gotten the bigger chip but I didn’t want to do that, at least not this time around.

Brent Ozar: One of the solid state vendors I was just looking at today has a 4 terabyte 2.5 inch SSD out right now. You can buy it on Amazon, it’s $1,300. Insane.

 

Why does my CHECKDB fail?

Brent Ozar: M.M. says, “We have a two terabyte—” Why don’t I read these questions before I start talking? M.M. says, “We have a two terabyte database which fails the CHECKDB before the backup due to my snapshot space. However, there’s over 400 gigs free on disk.” Oh, I feel sorry for you, buddy. “Is there anything on how I should manage CHECKDB or how I get around what that error is?” Have you guys run into this error before?

Tara Kizer: It sounds really familiar but I can’t pinpoint it.

Brent Ozar: Something to do with snapshots with like fragmented NTFS volumes…

Erik Darling: Well, yeah, because snapshots are sparse files so they do something weird off the bat. I forget what it is but, oh god, what are those trace flags that help…?

Tara Kizer: Yeah, I was going to say there’s a trace flag.

Erik Darling: 20…

Brent Ozar: I wonder if Bob Pusateri’s CHECKDB thing is public.

Tara Kizer: While you are doing that, I would recommend that M.M. posts this question on Twitter with #SQLHelp and you’ll probably get Paul Randal or someone like that reply to you with the correct answer.

Brent Ozar: Is there anyone like that or is there just Paul Randal? Yes, I love Paul dearly. So, yeah, that’s a great point. It is a known thing that a lot of people have run across and none of us just remember it off the top of our heads.

Erik Darling: Aaron Bertrand also has a post called “Minimizing the Impact of CHECKDB” that talks about some of the trace flags that you can use that will change the way CHECKDB runs. You might have some luck with one of those. If not, there’s always off-loading CHECKDB. You could back up the database and restore it somewhere else. Since you’re already a diligent DBA taking backups you might want to try physical only. I don’t know if that would decrease the amount of work that goes into it, but it might be another thing to look at.
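The lighter-weight option Erik mentions, sketched with a placeholder database name:

```sql
/* Checks page-level and allocation consistency only, skipping
   the deeper logical checks, so it needs far less snapshot
   space and CPU than a full CHECKDB */
DBCC CHECKDB (N'StackOverflow') WITH PHYSICAL_ONLY, NO_INFOMSGS;
```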

 

If MAXDOP is 8, will SQL Server only use 1 or 8?

Brent Ozar: Humphrey likes his question so much that he has pasted it in three times. All right, Humphrey, you’ve got my attention. He says, “If MAXDOP is set to 8 does that mean that SQL Server will only use either 1 or 8?” As much work as you put into your question, I’m going to give you a real short answer: No. That’s just the MAXDOP, it’s not the MINDOP. If you look at your execution plans you can actually see the number of threads used.

Erik Darling: Only the actual plans though.

Brent Ozar: Yes.

Erik Darling: Not the cached plans, otherwise it would be in BlitzCache and it would be slick there.

Richie Rump: And it would be amazing.

Brent Ozar: It would be hard to query.

 

Where should I put the partitioning key?

Brent Ozar: Fiag asks, “Should the field in the partition function always be the first place in the primary key for the same table? Basically when I partition does the partition field have to go first in the indexes?”

Erik Darling: I’ve never not done that so I don’t know.

Richie Rump: Well then is it partitioned?

Brent Ozar: Yeah, you can create indexes on top of it and have those not be the first field but, yeah, sadly I think it has to be the first one. I’ve never not done it that way either.
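For what it’s worth, SQL Server’s documented rule is slightly looser: the partitioning column must appear somewhere in a unique index’s key, but it doesn’t have to be the leading column. A sketch with hypothetical names:

```sql
CREATE PARTITION FUNCTION pf_OrderYear (datetime)
    AS RANGE RIGHT FOR VALUES ('2016-01-01', '2017-01-01');

CREATE PARTITION SCHEME ps_OrderYear
    AS PARTITION pf_OrderYear ALL TO ([PRIMARY]);

/* OrderDate is the partitioning column, but it's the second
   key column in the primary key, not the first */
CREATE TABLE dbo.Orders
(
    OrderId   int      NOT NULL,
    OrderDate datetime NOT NULL,
    CONSTRAINT PK_Orders
        PRIMARY KEY CLUSTERED (OrderId, OrderDate)
) ON ps_OrderYear (OrderDate);
```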

 

What’s your favorite monitoring tool?

Brent Ozar: Clark says, “Speaking of monitoring tools. Tara mentioned Foglight/Spotlight in the past, or as we used to call it ‘QuestFogSpot.’ Is this the monitoring software she prefers?” Why don’t you care about the rest of us, Clark?

Tara Kizer: Brent used to work at Quest. I really did like Foglight and Spotlight when I used them at two of my jobs. The last job was using SQL Sentry Performance Advisor and I liked it too. I haven’t done much with Idera. Those are the three tools that I think are pretty comparable. But, you know, are there any other tools that I prefer? I just have the most experience with Foglight and Spotlight but SQL Sentry, or SentryOne’s Performance Advisor is a great tool as well.

Erik Darling: I’m having the hardest time adapting to their new name.

Tara Kizer: I know.

Erik Darling: SentryOne. Like SentryOne perf…

Brent Ozar: Sentry 21.

 

Should I learn A or B or C or D or all of the above?

Brent Ozar: Graham asks, “I’m doing some career planning and I wanted to get your opinion on whether I should focus on database development and design or should I focus on business intelligence? Should I know both? I’m already learning Amazon Web Services and Azure Infrastructure Services.”

Tara Kizer: It doesn’t say DBA anywhere in that question.

Brent Ozar: No, yeah. I think he wants to know all the things. Richie, what are your thoughts on that?

Richie Rump: I don’t know, what do you want to do is really the question for me. Those are two pretty different things. You could go into both but eventually you’re going to get a job doing one. So which job do you want? Do you want to do the BI side or do you want to do the db design/development side? I think they’re two totally different things. Focus on what you want to do and then go towards that.

Tara Kizer: Some people base their decision purely on the money aspect of things, so take a look at our salary survey. I know that there are definitely some people listing those as what they’re doing right now, so see what salaries are averaging for those roles.

Richie Rump: Yeah, I’ve done both and I’m not doing the database/data intelligence warehouse-y stuff anymore. I’m done.

Brent Ozar: Why not?

Richie Rump: It seemed pretty repetitive to me. You load data in, you clean up a crapload of data, and you take data out. It’s what you’re doing over and over and over, as opposed to database development, where there’s a lot of interaction around what the business really needs: you’re talking with the business, doing database design and all that stuff. Whereas typically when you get inserted into a BI project, all that stuff has really been answered already and you’re just going in and doing it. At least that’s been my experience, so I’m done. I want to write code so that’s kind of where I went.

Brent Ozar: I just never want to deal with reports again as long as I live because it’s always, “Can you move this number to the right? What does this number mean? Can you add this?” “No” I’m done with that.

Erik Darling: “Are you sure this is right?”

Tara Kizer: I’ve had some fun with SSIS and SSRS when it gets into the coding aspects. In SSRS you can have .NET stuff in there, you can load in DLLs. So I’ve done some fancier stuff with those utilities; it wasn’t just purely moving a column. I didn’t get into that aspect. Once I was done with it, it got moved over to a developer to take care of those requests. But what are you most interested in? What makes you happy? What makes the day go faster? For me, it’s performance tuning. So if I had to pick between these two, database development has performance tuning in it a lot of the time, so that’s where I would go if I had just these two choices. Money and what interests me.

Erik Darling: Money and happiness.

Tara Kizer: Money first, which is why I was an on-call DBA for forever.

 

Will the First Responder Kit grow to include development tools?

Brent Ozar: Fiag says, “We use the Blitz Toolkit a lot. Are there any plans to extend this library beyond the database and architect tools to development helper tools?” The cool thing is that it’s totally open source. You can add in whatever you want. I’ve worked with other folks—for example, I’m working with Aaron Bertrand on getting in his replacement for sp_MSforeachdb. If there are other tools that you’d like to build, we can totally talk about that. Start an issue over in the GitHub repository and the group can start discussing it. I’d love to see other stuff in there as well. Will we build it? Probably not. I think the tools that you see in there are the extent of the tools that we’re going to build. We’ll continue to make those better, but I just wouldn’t expect to see new tools in there.

Erik Darling: Even some of the tools that exist we just don’t add much to, like BlitzTrace and BlitzRS, we just don’t do anything with them.

How is SQL Server using all that memory?

Brent Ozar: Brent says, “Is there a way to see how SQL Server is using memory after it’s taken all that’s available, like what parts of the memory SQL Server is using for what?”

Erik Darling: Yes.

Brent Ozar: All right, next question… [Laughter] Where do you look? Is it in DMVs?

Erik Darling: It’s DMVs all the way. I’m going to totally barf on remembering the names of them but I know that there are DMVs where like memory buffers or buffer descriptors or something like that where you can figure out which tables and indexes are actually in memory and all that stuff.

Brent Ozar: Clerks is another one. Clerks describes if it’s used for log buffering or whatever.

Erik Darling: Clock hands.

Brent Ozar: Clock hands, wow, oh man, that’s bonus points for that. If you want an example query, search for Opserver—it’s like observer but with a p. Opserver is an open source thing by Stack Overflow’s dev team; it’s a monitoring tool for SQL Server, like a dashboard. If you go to their GitHub repository, search for the word “clerks.” There’s a DMV query in there that I wrote with Nick Craver and it lists all of the clerks and what they do. So it gives you what each one is used for. I have no idea how it works, I wrote that thing like three years ago.

Richie Rump: Yeah, not Amazon. Don’t go to Amazon and search for clerks, but GitHub and Opserver because the other one will get you something else.

Brent Ozar: Which is good as well, but yeah.

Richie Rump: But in a different way, Brent.

Tara Kizer: Not in a good way.

Richie Rump: A little bit different way.

Brent Ozar: Paul Olson follows up and says, “What about BlitzCache, won’t that show what’s in memory?” It only shows the one component, the cache query plans, it doesn’t show for example what transaction log records are in RAM, how much Always On Availability Groups is using, all kinds of other stuff.

Brent Ozar: Justin adds, “Clerks was indeed a great movie.” I’ve never seen it. I’m just going to leave it there.
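The clerks DMV Brent describes can also be queried directly; a minimal sketch:

```sql
/* Break SQL Server's memory usage down by clerk: buffer pool,
   plan cache, log pool, and so on */
SELECT type,
       name,
       SUM(pages_kb) / 1024.0 AS memory_mb
FROM sys.dm_os_memory_clerks
GROUP BY type, name
ORDER BY memory_mb DESC;
```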

 

How can I tell when I have out-of-date statistics?

Brent Ozar: Fiag says, “Sorry for my ignorance.” You’re forgiven. We all have our problems. All of us happen to also suffer from ignorance so we work pretty well here.

Erik Darling: Massive amounts.

Brent Ozar: “Does Blitz recommend stats for what table should be…?” I think you mean where we need to update statistics.

Erik Darling: Yes, well not Blitz, but BlitzIndex. I recently added that—it doesn’t do it out of the box automatically because there were some performance issues on servers that had a large number of statistics objects, which I think I fixed. It was weird; I’ve been experimenting with it with clients and I haven’t been able to get the statistics part to run slow since I changed it. I had to hint the joins to use hash joins because there were certain circumstances where SQL would just woefully underestimate how many rows were going to come back and be like, “I’m going to nested loop through all this garbage.” So, yeah. BlitzIndex, if you download one of the newer versions and run it with @SkipStatistics = 0, will tell you if you have wonky statistics on any of your tables. Tell me if it runs slowly so I can keep messing with things.

Tara Kizer: By slow he means like over 20 minutes before I killed it.

Erik Darling: I tested that on so many statistics objects and I could not get it to be slow.

Tara Kizer: I remember that. It was like two clients of mine in a row or something that it happened on.

Erik Darling: 30,000 some odd statistics objects and it’s not…

Brent Ozar: This is exactly where our scripts come from. All of the stuff in the First Responder Kit, we use this every day. We go and run it against clients’ servers. So please, when you check in good code, make sure it doesn’t suck. I’m not talking to Erik, I’m talking to you, the public.

Richie Rump: Yeah, you’re talking to me. I understand that, Brent. Yeah, you’re talking to me. I get that.

Brent Ozar: You.

How can I tell if a query is coming from a linked server?

Brent Ozar: Curtis asks, “Is there a way to tell if a query is coming from a linked server connection?” I don’t know that.
Tara Kizer: I don’t think so, especially if they’re using OPENQUERY, because it’s just going to look like the query is local.

Brent Ozar: It could come from—it’s going to look like it comes from another server name. I think you could see that in sys.dm_exec_connections or sys.d—whatever the connections DMV is.

Tara Kizer: Yeah, for a server rather than a desktop or a laptop. If you have a naming standard you could search in.

Erik Darling: If we’re just like brainstorming ways to tell you could look through the execution plan and see if there’s a remote query operator in there anywhere.

Tara Kizer: Wouldn’t that appear on the source server though, not where the actual object exists?

Erik Darling: Oh, yeah. Then you know what, look at the SQL text and if there are a lot of double quotes…

[Laughter]

Tara Kizer: That has good stuff in there.

Brent Ozar: It looks like a really bad query. We could probably figure that out, too. But I like the connections DMV angle is where I’d probably go.
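The connections-DMV angle might look like this: a heuristic sketch, since nothing in the session definitively flags a linked server as the source:

```sql
/* Sessions arriving from another SQL Server typically show a
   server's host_name and a program_name like 'Microsoft SQL
   Server'; treat this as a hint, not proof */
SELECT s.session_id,
       s.host_name,
       s.program_name,
       c.client_net_address
FROM sys.dm_exec_sessions AS s
JOIN sys.dm_exec_connections AS c
    ON c.session_id = s.session_id
WHERE s.is_user_process = 1;
```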

 

Was Clerks 2 better?

Brent Ozar: The last question we’ll take is I guess from Brent. Brent says, “What about Clerks 2?” No, I haven’t seen that either. I don’t know if you guys did? Nope? All right.

Erik Darling: Best part about Clerks 2? Goodbye Horses is in the soundtrack.

Richie Rump: I didn’t think Clerks 2 was bad but I didn’t think it was that great either.

Brent Ozar: What’s your favorite movie that you’ve seen lately?

Richie Rump: Rogue One.

Brent Ozar: That good? I still haven’t seen it yet. I need to get that in the theater while it’s still out. Yeah, I know. I’ve been busy. I’m trying to get a dang apartment over in freaking San Diego. Turns out to be kind of hard.

Richie Rump: Priorities, Brent, priorities.

Brent Ozar: Yeah, right? All right, thanks everybody for hanging out with us this week and we will see you guys next week on Office Hours. Adios.

Erik Darling: Adios.


Build Me A Build: Things I’d Do Different

Hardware
36 Comments

What I started with isn’t what I ended with

I started off really trying to stay close to budget. When I realized I needed to go over by a bit (storage, mostly), and that not all of my parts would even be here in the calendar year (the M.2), I started to wonder if I got the parts that I should have. This is what I started with.

It was a reasonable build that stayed close to budget, and I was pretty happy with it.

When reality kicked in, this is what I ended up with. That’s all of last year’s budget, and half of this year’s budget. And trust me, I could have figured out other stuff to buy, too.

What changed?
The CPU went from the i7-6800K to the i7-6850K
The motherboard went from the X99-A to the X99-E
Filled up the ICY DOCK with 6 SSDs (but I would have done this anyway)
Got a 1TB M.2
Threw in an 8TB SATA drive for backups
Added a bunch of case fans (seriously, this thing could terraform a small planet)
Downgraded the graphics card (who cares, right?)
Upgraded the power supply from a 550w to an 850w

Most of the changes didn’t add a significant amount to the price. The M.2 was the real killer.

But worth it!

Feeling a little weird

I started thinking about other options. If I knew how badly I was going to go over budget, I probably would have built this from the get-go. Not that I’m at all unhappy with what I ended up with, of course. But I could have made some cool tweaks.

The main difference here is the motherboard. I would have bought the X99 Pro (on purpose, this time), and nabbed a second 1 TB M.2 — though I’m not sure if I would have put together an M.2 RAID, or created TempDB drives on it for my SQL VMs. The world may never know.

Until 2018.

Other differences: the RAM would have to change to be compatible, but I’d keep the same amount. There’s an incompatibility warning on the page for 128 GB of RAM, but Asus documentation says it’s supported. I sent a note into the site owners, but they haven’t updated it yet. This is also the same board/RAM combo that Stack Overflow devs use, and I’m like 50% sure their computers work.

Underestimating

What I imagined myself doing is plugging this thing into one of my laptop monitors, and borrowing a keyboard and mouse from another setup to do the initial install and config. After that, I’d just remote in to do my stuff.

That didn’t quite go as planned. For some reason, this thing didn’t work with a laptop monitor. When I plugged it in, the monitor just ignored it, the way I ignore Surgeon General warnings. I ended up having to do the initial install and config on my living room TV.

Big Windows

Yes, 55 inches of Windows installer is a bit odd.

I also didn’t take into account little hiccups, like a BIOS update needing some config changes, adding an External switch in Hyper-V breaking networking, etc. There are a lot of little tweaks you have to do, and moving this back and forth between my office and the living room TV is just stupid, especially when my kid is trying to watch cartoons hit each other with hammers.

So yeah, I caved and bought a cheap monitor, and a keyboard/mouse combo. Yes, my desk is going to look ridiculous, but whatever.

So now what?

With everything built, it was time to get things installed. Word of warning, if for some reason you buy the X99-E, you can’t run Windows Server 2016 on it, as of this writing. I mean, you can, but it can’t figure out which driver to use for the Ethernet port, which is a deal breaker. Intel drivers wouldn’t even install; they’d say there were no Intel devices on board to use. Heh.

Using my favorite system info tool, Speccy, here’s the system summary. For some reason it isn’t showing the RAM as being overclocked, but it is. Weird. Whatever.

Speccy-tacular

And of course, CPU-Z shows things spinning at the speed of light.

I don’t know what the Z means.

Next up

Doing some wheel kicking is in order. I need to start setting up my VMs, which aren’t just for SQL. I need a domain controller with AD and all that. It’ll also be time to start parting ways with some of the stuff I work with on my laptop.

Thanks for reading!

Brent says: in the company chat room, there’s a running joke about our desktops being more powerful than a lot of production SQL Servers. Granted, the total cost on this is up over $4k, but with 128GB memory and several terabytes of insanely fast solid state storage, it’s still a screaming deal.


Do DDL Triggers in TempDB Survive Restarts?

SQL Server, TempDB
10 Comments

Of Course Not

Nothing survives in tempdb. But this kind of stinks for those of us who don’t want certain things created in there.

In my case, when I’m working on Blitz scripts, I have a terrible habit of accidentally re-creating the procs in user databases. Usually whichever I’m building test data in, like Stack Overflow. This can lead to several minutes of temporary insanity, where none of my changes are working.

I wrote a DDL trigger recently to prevent me from creating Blitz procs in user databases. It looks like this.
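(The original post showed the trigger as an image; here’s a hypothetical reconstruction along the same lines, not the exact code.)

```sql
/* Hypothetical sketch: block accidental creation of Blitz
   procs in the current database */
CREATE TRIGGER NoBlitzProcsHere
ON DATABASE
FOR CREATE_PROCEDURE, ALTER_PROCEDURE
AS
BEGIN
    DECLARE @obj sysname =
        EVENTDATA().value('(/EVENT_INSTANCE/ObjectName)[1]', 'sysname');
    IF @obj LIKE 'sp[_]Blitz%'
    BEGIN
        RAISERROR('Create Blitz procs somewhere else, please.', 16, 1);
        ROLLBACK;
    END;
END;
```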

And it works just fine, it just doesn’t survive in tempdb. Something to be aware of.

What To Do?

One workaround is to create it in model. Then it will be in tempdb on restart, but it will also be in all the databases you create.

That may or may not have the intended effect! If you don’t want people doing a certain thing in just tempdb, you have to add an additional check using the event data.
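That extra check is just a guard on the current database name at the top of the trigger body; a sketch:

```sql
/* Inside the trigger created in model: bail out unless the
   trigger is firing in tempdb, so every other new database
   inherits the trigger but is unaffected by it */
IF DB_NAME() <> N'tempdb'
    RETURN;
```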

Thanks for reading!


#TSQL2sday: System-Maintained LastUpdatedDate, LastUpdatedBy Columns

T-SQL
13 Comments

For this month’s T-SQL Tuesday, we’re talking about interesting Connect requests to fix or change SQL Server behavior.

T-SQL Tuesday

In our Senior DBA Class, one of the exercises involves figuring out changes to a table after an Always On Availability Group failover. In every class, several students always say the same thing: “I really wish I had a LastUpdatedDate column on all of my tables that just updated automatically.”

Whaddya know – Adam Machanic filed Connect #3105516 for that exact thing. He writes:

“It is extremely common for databases to contain columns like “UpdatedDate,” “UpdatedBy,” etc. These columns work based on a trust model: As a database designer, I must trust that all of my downstream consumers will follow the intended rules, remembering to update the columns every single time any transaction occurs. And of course I have to trust that they’ll put in the correct values. The only alternative to trust is to put ugly and potentially slow triggers in place to make sure everything happened the right way.

This situation needs to change, and this is especially prevalent now, thanks to SQL Server 2016 temporal tables. Many customers I’ve spoken to regarding these tables would really like to be able to audit who made the change. Some of them would like to audit which host name the change came from. And so on and so forth. The bottom line is that we need a flexible solution to allow users to declaratively model these situations.”

I love it! Head over there and give it an upvote.
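For context, the trigger-based workaround Adam calls ugly and potentially slow looks something like this (a sketch with a hypothetical Orders table; assumes RECURSIVE_TRIGGERS is off, the default):

```sql
CREATE TRIGGER dbo.Orders_SetAudit
ON dbo.Orders
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    /* Stamp the audit columns server-side so callers can't forget */
    UPDATE o
    SET UpdatedDate = SYSDATETIME(),
        UpdatedBy   = SUSER_SNAME()
    FROM dbo.Orders AS o
    JOIN inserted AS i
        ON i.OrderId = o.OrderId;
END;
```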


[Video] Office Hours 2017/1/4 (With Transcriptions)

On this first Office Hours call of 2017, Brent, Erik, Tara, and Richie discuss multi-server jobs, how to find motivated Junior DBAs, working with SAP databases, partitioning, AlwaysOn Availability Groups, tracking SQL users, SQL monitoring tools, data capture, source archiving, and much more!

Here’s the video on YouTube:

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Enjoy the Podcast?

Don’t miss an episode, subscribe via iTunes, Stitcher or RSS.
Leave us a review in iTunes

Office Hours Webcast – 2017-01-04  

Is it okay to use IF branches in stored procedures?

Brent Ozar: We’ll go ahead and get started. Dimitri asks, “In stored procedures is it okay to use IF branches to control flow or do those confuse the optimizer?”

Erik Darling: I’ve never found the optimizer to be terribly confused by IF branches. Sometimes I confuse myself with them. Somehow I get more confused by IF branches than the optimizer does. They can actually be sometimes helpful for parameter sniffing situations. So if you can branch your logic to call a different stored procedure based on if a big value is coming through or if a little value is coming through, you can call the sub-stored procedure based on that. So it could actually be helpful in some circumstances.

Brent Ozar: Just keep in mind the key thing that he said there, call a different stored procedure.

Erik Darling: Right.

Brent Ozar: IF branches that are coded inside the same stored proc, you’re still vulnerable to parameter sniffing.
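A sketch of the branching pattern Erik describes, with hypothetical table and procedure names:

```sql
/* Each sub-procedure gets its own cached plan, so a plan
   sniffed for a huge customer can't sandbag the small-customer
   path, and vice versa */
CREATE PROCEDURE dbo.GetOrders
    @CustomerId int
AS
BEGIN
    IF EXISTS (SELECT 1 FROM dbo.BigCustomers
               WHERE CustomerId = @CustomerId)
        EXEC dbo.GetOrders_Big   @CustomerId = @CustomerId;
    ELSE
        EXEC dbo.GetOrders_Small @CustomerId = @CustomerId;
END;
```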

 

Where do multi-server Agent jobs actually run?

Brent Ozar: Michella asks, or Mikella—Michella, one of these days you’ve got to tell me how to pronounce your name. “We’re using SQL Server Agent multiserver jobs.” Oh my god, there’s hardly—oh man, have any of you guys used multiserver jobs before?

Tara Kizer: I’ve always avoided it. I tested it out and it just didn’t make sense for us.

Brent Ozar: Yeah.

Tara Kizer: They’re probably the one company using the feature.

Brent Ozar: Right. She says “…on a SQL 2016 server. I have an SSIS package and a job that runs but it actually runs on another server. Where is the CPU and memory being used? How is the SSIS package processed?” The way I like to think of it is imagine for a second a package whose query runs CHECKDB. CHECKDB isn’t going to run remotely. CHECKDB is going to run wherever the database lives. Now there are packages you could design, like create stuff and move it from one table or from one server to another via linked server queries, if you want to pull data down remotely. So you can make work happen anywhere. But if you just flat out run queries, they’re going to run at the target. To learn more about multiserver jobs, the guy you want to know is Mike Hillwig. If you search for “Mike Hillwig multiserver jobs,” especially if you throw in the words SQL Saturday in there, Mike has done presentations at several SQL Saturdays and you can get the slide decks from there. Really smart guy, super helpful there. Literally, the only person I know using that feature.

 

Should I use table variables?

Brent Ozar: Keith says, “I was collaborating with a colleague who I respect.” It’s very interesting that he prefaces it with that. So, okay, all right. We don’t know anything about that here. We don’t respect anyone that we work with, but I’ll continue. “We were optimizing a large stored procedure. He said he never uses table variables when he expects more than 100 rows to be added. He uses temp tables instead. I scoffed since the only differences I was aware of between the two were transaction participation, the ability to add indexes, and the ability to alter temp tables.” So, which one of us is going to break his poor, delicate heart?

Erik Darling: There’s a lot of stuff going on there.

Tara Kizer: Just don’t use table variables. That’s just my rule. People talk about 100 rows, “I use a table variable under 100.” Or maybe it’s 1,000. Some people say 1,000, use a temp table. I have run into severe performance issues when the table variable always had one row in it. I do not use table variables. I know that one advantage to table variables is that they survive a transaction rollback, so you still have the data in there that you can query, maybe you’re using it for troubleshooting. For that instance, I would use a table variable. Everything else, temp tables.

Brent Ozar: Yeah, the good news is you respect him for a good reason. He does know what he’s talking about. If you want to learn more about that, if you go to brentozar.com/go/tunequeries, one of the Watch Brent Tune Queries sections involves a table variable and I show you some of the horrific drawbacks there.
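One way to see the difference for yourself is to run the same join against both kinds of objects with actual execution plans on. This is just an illustrative sketch against system views, not from the discussion:

```sql
-- Table variable: the optimizer typically uses a fixed low row estimate
-- (one row on older versions), which can push the plan into nested loops
-- even when thousands of rows come back.
DECLARE @Ids TABLE (Id INT PRIMARY KEY);
INSERT @Ids (Id) SELECT object_id FROM sys.all_objects;

SELECT c.name
FROM sys.all_columns AS c
JOIN @Ids AS i ON i.Id = c.object_id;

-- Temp table: real statistics get created, so the estimated row counts
-- are far closer to the actuals and the join choice is usually saner.
CREATE TABLE #Ids (Id INT PRIMARY KEY);
INSERT #Ids (Id) SELECT object_id FROM sys.all_objects;

SELECT c.name
FROM sys.all_columns AS c
JOIN #Ids AS i ON i.Id = c.object_id;

DROP TABLE #Ids;
```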

 

Where can I find a motivated junior DBA in Chicago?

Brent Ozar: Paul asks, “Where can my company find a motivated junior DBA in the Chicagoland area?”

Tara Kizer: That’s Brent, but he’s coming to San Diego soon.

[Laughter]

Tara Kizer: We already had our reviews, it was last month.

Richie Rump: Shots fired. Shots fired.

Brent Ozar: If it wasn’t so true…

[Laughter]

Brent Ozar: DBA is even a stretch. He says, “Seems all the good ones are happy and not looking for work.” I’d say that’s usually generally true too. Junior DBA—such a weird term too because by the time someone is officially a junior DBA, like they hold that in their job title, guess what their next job is going to be?

Tara Kizer: Yeah.

Brent Ozar: It’s not junior.

Tara Kizer: I’d recommend posting it to a job board maybe at the local universities, wherever the business information systems building is, or maybe even computer science. My degree is in mathematics with an emphasis in computer science, so the math building too might draw some hits there. But junior DBA, it’d be difficult, maybe look internally also. There could be a developer looking to make that move. Developer to junior DBA salary-wise might be bad, but look internally. Sometimes there are people looking to move to a different area.

Erik Darling: I’d probably just throw it on Stack. Just throw whatever job posting you have on there at least to get some eyeballs on it. You may not find the person directly from there but I’m sure someone looking at that will be like, “Oh, junior DBA job, I know someone who…”

Richie Rump: When I hear “junior DBA” I’m like you’re willing to train. If you’re willing to train, I was really tight with the folks at my placement center at the University of Miami, where I went to school. They know the students. They know who is good, they know who is not. They’ll send you resumes of the ones—because they want their students placed. They’ll send you the good ones and then you just kind of go through them and you interview them. You’ll get the best and the brightest, especially now. So January, they’re ready to hire, these kids are looking for work. They’ve been four years in school. They’ve been told that there’s no jobs out there for them. They’re hungry. I would go that route if you can wait until the April/May time period.

Tara Kizer: I actively participate in online forums and a lot of people are wondering, how can you even find a junior DBA job? There just aren’t any out there. So the fact that you guys have one is rare. I suspect that you should be able to get some good hits. You just got to get the word out.

Brent Ozar: Yeah, just don’t advertise that we need someone who is already a junior DBA. That’s where it’s not going to work. The other thing I would do is make a list of what they have to already know, that they can’t learn on the job, things that they have to already know. Tailor the job description to basically say, “You’ve been doing this for a couple years. You may be a developer. You may be a sysadmin. You may be a college person. But here’s the things that you’re comfortable doing. Here’s the things we’re going to train you on.” As soon as you’re done writing that job description, ask yourself, “What did this candidate do last year and what job do they want next?” Because over and over again I get these job descriptions where they say, “Must already know these 14 technologies, is going to become a junior DBA.” I’m like, no, no, that person is already on their way, going up that ladder.

Erik Darling: “Oracle. MySQL. Perl. Python. PHP.” I’m like who…?

Richie Rump: “SQL Server 2026.”

Erik Darling: “Must have a decade of experience with SAP HANA.” You’re like you can’t possibly.

Brent Ozar: I was literally—A call I was on earlier this week where someone said, “I don’t understand why it’s so hard for me to find a DBA who is skilled in both architecture and systems administration and development and database administration.” He goes, “I have that personal background.” I said, “What’s your title?” He said, “CTO.” I said, “Exactly. When you have that you’re done with that part of your work.”

 

What’s it like to work with very large SAP databases?

Brent Ozar: Speaking of SAP, let’s see here. Jonelle asks, “What’s your experience like working with SAP databases in terms of very large databases?”

Tara Kizer: Do you guys have experience with SAP on SQL Server? All the companies I’ve worked for have had it on Oracle.

Brent Ozar: Yeah, yeah. Horrible. Erik, how about you?

Erik Darling: No, SAP is one of the few that I haven’t done much. I mean like, okay, SAP like byproducts like Business One and stuff I’ve worked with, with like sort of sub-products of SAP. But I wouldn’t call any of those databases very large. I think they were sub-terabyte anyway. So they didn’t really…

Brent Ozar: We had a DBA at one company where we had SAP on SQL Server and BusinessObjects and the data warehouse and all that, and I kind of loved it because they were like, “You don’t touch anything in here. You don’t change the settings. You don’t touch anything.” The objects were indecipherable. They were all these alphanumeric-type things. I’m like, “Great, I don’t have to do anything.” Any index design you have to do has to come through SAP anyway, so I kind of liked that. The big question to ask there is the same question we ask on a lot of stuff: what’s your RPO and RTO? Put it in writing. Because usually the whole company runs on SAP, and whenever it’s time to do the restore, if it’s a two-terabyte database and it takes three hours to do the restore, if something goes wrong, everybody needs to be on the same page that our business is going to be down for three hours.

 

Will a scalar UDF in a WHERE clause cause the query to go single threaded?

Brent Ozar: James asks, “Will a scalar valued function on a where clause run single threaded and will my estimates versus actuals be correct?”

Erik Darling: That’s going to suck, James.

Brent Ozar: That’s one of Erik’s favorites.

Erik Darling: Yeah, James, that’s not going to go well. Stop doing that. The scalar valued function itself may run in parallel, that’s the funny thing about them is that the code inside a scalar valued function can go parallel, but it will force the entire surrounding query to run single threaded, which is generally not what you want to happen. On top of that, it’s probably going to run in a weird nested loop join thing and you’re probably going to get that crappy one-row estimate, and SQL is not going to know how many times it has to execute to get the data back. It’s not going to go well. Don’t put functions in your where clause, man. I’m begging you. Even table valued functions I’d keep out of there.

Brent Ozar: Especially with the estimates versus actuals. Those are going to be wildly incorrect.
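A minimal sketch of the pattern being warned against, and a set-based rewrite. The function and table names here are hypothetical:

```sql
-- Anti-pattern: the scalar UDF executes once per row in dbo.Users, and
-- its mere presence forces the whole query to a serial plan
-- (pre-SQL Server 2019 scalar UDF inlining).
SELECT u.Id
FROM dbo.Users AS u
WHERE dbo.GetVoteCount(u.Id) > 100;

-- Set-based rewrite: the aggregate runs once over the whole set, the
-- estimates come from real statistics, and the plan can go parallel.
SELECT u.Id
FROM dbo.Users AS u
JOIN dbo.Votes AS v ON v.UserId = u.Id
GROUP BY u.Id
HAVING COUNT(*) > 100;
```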

Brent Ozar: William says—he has a job posting—he says, “Junior DBA, must be Oracle certified, SQL Server certified, and clearance.” Yes, good one. I see that a lot.

 

Where can I find more information about table partitioning?

Brent Ozar: Dimitri asks, “Where can I find some literature…?” The library is always a good place to start. But then he gets more specific and says “…about the benefits and drawbacks of table partitioning?”

Tara Kizer: I like Kendra’s recent article. I know that we have some information on our website but she covered the topic very recently on her website. But we do have a lot of information on our website too.

Brent Ozar: And what is Kendra’s site?

Tara Kizer: Littlekendra.com.

Brent Ozar: Yes, if you search for partitioning over there, Kendra is one of our favorites on indexing design, has written a ton of partitioning in the past. It’s getting more popular these days now that SQL Server 2016 SP1 has all the developer features from Enterprise Edition in it.

Tara Kizer: Right.

Brent Ozar: I expect to see a whole lot more of those partitioning questions.

Erik Darling: There’s another pretty good article written by a guy named Dan Guzman who posts on dba.stackexchange a lot. His website is dbdelta.com. He has a post on table partitioning best practices. It’s pretty good as well if you’re just looking for sort of an opening primer. It’s not old, either. It’s from like 2014. So it’s not like ancient—ancient partitioning secrets in there.

Brent Ozar: Wow, yeah, if you google for “dbdelta partitioning” he has a category on his post just on partitioning. Man, I tell you what, that’s making the weekly links. That looks actually good.

Erik Darling: I win a prize.

Brent Ozar: You win a prize.

Erik Darling: I want another stress ball.

Richie Rump: You get to come to work next week.

Brent Ozar: What does the loser get?

 

Should I implement partitioning in OLTP?

Brent Ozar: Dimitri says, “For a background on the partitioning, my boss wants to implement it in an OLTP environment with queries and procedures that hardly ever use anything that would be considered a partitioning key. I need to convince him that it may not be for us before we spend too much time testing it.”

Tara Kizer: What are you trying to achieve? One of the reasons why I like table partitioning is to get rid of data or move data quickly out of the OLTP database. I had a seven-terabyte database and we had to make sure that we only had six months of data, so we would run deletes in a loop in batches of 10,000, but eventually this database got so large and the incoming data was a lot. I don’t remember what it was per day, but we couldn’t even keep up with the deletes. Not only that, the ghost record cleanup process was so far behind, so we ended up implementing table partitioning and we could get rid of the data in a few milliseconds as long as it could get the schema lock. Sometimes it couldn’t and the job would take several hours, but once it finally could get that lock, it just got rid of the data.
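The fast delete Tara describes is a partition switch: the old partition moves into an empty staging table as a metadata-only operation, then gets truncated. A hedged sketch with hypothetical names (the staging table must match the source table's structure and live on the same filegroup):

```sql
-- Switch the oldest partition out of the big table. This is a metadata
-- change, so it's nearly instant once the schema lock is acquired.
ALTER TABLE dbo.Events SWITCH PARTITION 1 TO dbo.Events_Staging;

-- Now the data lives in the staging table, where it can be dropped
-- without touching the production table at all.
TRUNCATE TABLE dbo.Events_Staging;

-- Optionally merge the now-empty boundary out of the partition function.
ALTER PARTITION FUNCTION pf_EventsByMonth() MERGE RANGE ('2016-07-01');
```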

Erik Darling: A sort of cheaper option, from technical and managerial points of view, would be to just use partitioned views instead of actual table partitioning. You get a lot of the same benefits without the sort of craziness of partitioning keys and begging the optimizer for partition elimination and stuff like that.
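A partitioned view, as Erik suggests, is just separate tables with CHECK constraints on the partitioning column, unioned behind a view so the optimizer can skip the tables that can't match. A minimal sketch with hypothetical names:

```sql
-- One table per year; the CHECK constraints tell the optimizer which
-- table can hold which dates.
CREATE TABLE dbo.Sales2016 (
    SaleDate DATE NOT NULL
        CHECK (SaleDate >= '2016-01-01' AND SaleDate < '2017-01-01'),
    Amount MONEY NOT NULL
);
CREATE TABLE dbo.Sales2017 (
    SaleDate DATE NOT NULL
        CHECK (SaleDate >= '2017-01-01' AND SaleDate < '2018-01-01'),
    Amount MONEY NOT NULL
);
GO
-- Queries against the view that filter on SaleDate only touch the
-- tables whose constraints allow matching rows.
CREATE VIEW dbo.Sales AS
SELECT SaleDate, Amount FROM dbo.Sales2016
UNION ALL
SELECT SaleDate, Amount FROM dbo.Sales2017;
```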

Tara Kizer: He has a follow-up comment that the goal is to improve performance. That’s exactly what Kendra talks about in her blog post from a few months ago. I would take a look at that. It has a video.

Erik Darling: Partitioning is not a performance feature.

Tara Kizer: Yeah.

Richie Rump: No.

Brent Ozar: No. And folks, feel free to get your questions in now. We go through first come first serve. We’re running out of questions here for today, which is fine too. We can just sit around and talk about our favorite things about partitioning. That was like the feature I was most excited about when SQL Server 2005 came out. I was in the middle of doing a data warehouse. I’m like, why isn’t this making anything faster?

 

Why aren’t my statistics the same on all AG replicas?

Brent Ozar: Paul says, “We’re using AOA.” Which, that’s the first time I’ve seen Always On Availability…

Tara Kizer: Call it AG or AOAG, not AOA.

Brent Ozar: “We are seeing statistics change on the secondary for our clustered indexes while the statistics on the primary replica are not changed. How does stats updates work for AOA?”

Tara Kizer: Pick me! Pick me!

Brent Ozar: Tara Kizer.

Erik Darling: Take it away.

Tara Kizer: I have a lot of AG experience; I covered this in a topic at our precon. You get temporary statistics on your secondary servers. You still get the same statistics from your tables, but you also get temporary statistics, because the databases are marked read-only and SQL Server can’t update statistics inside them. Your query patterns are different on your OLTP database and on your secondary, so it may need different statistics. Microsoft invented temporary statistics for that, and those live in tempdb. You aren’t in control of your statistics on the secondary, so you can’t update those or create them manually.
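You can see those temporary statistics for yourself on a readable secondary; they show up in `sys.stats` flagged as temporary, with a `_readonly_database_statistic` suffix on the name:

```sql
-- Run on the readable secondary: list the temporary statistics that
-- SQL Server created in tempdb for this database's query patterns.
SELECT OBJECT_NAME(s.object_id) AS table_name,
       s.name                   AS stats_name,
       s.is_temporary
FROM sys.stats AS s
WHERE s.is_temporary = 1;
```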

Brent Ozar: I love it because you can have queries that are slow on two different replicas for two different reasons.

Tara Kizer: Yeah, that’s true.

 

Can I master a whole bunch of trades at once?

Brent Ozar: M.M. says, “You may have just given me a hint by your answer about the CTO but I’m curious about your perspective. On a large corporation that has a DBA being the all-encompassing server person setting up all file transfers, SFTP, active directory, third-party software, drivers, firewalls, server security… all general admin for more than 20 servers.”

Tara Kizer: If it’s a large company, why is there just a single all-encompassing server person? Usually in large corporations everything is very divided out. You have your DBAs, your sysadmins, your SAN people, not just one person that handles everything.

Erik Darling: Yeah, that sucks. There’s no way one person is that good at everything that they could be a general admin.

Brent Ozar: Well you suck at everything. You only spend two, three hours a week per platform or application. You’re not going to grow at that. You’re not going to get better.

Erik Darling: If there’s a big problem with one of them, that’s all you’re doing.

Richie Rump: And yet yesterday I set up a database with all the trims and bells and whistles and everything and it took me ten minutes and I know nothing about all that stuff. I am the cloud!

Brent Ozar: We’re going to be dealing with the aftermath for the next ten years.

[Laughter]

Richie Rump: Yep, cloud fixes everything.

Brent Ozar: We are probably the poster child for the cloud in that we have no sysadmin, none of us want to be admins, we all want to cash out at 5pm—really 2pm is really when we want to cash out.

Erik Darling: Whenever happy hour starts.

Richie Rump: 2pm west coast time.

Brent Ozar: When the mimosas wear off but before the tequila is available.

 

How can I track changes in SQL Server?

Brent Ozar: Clinton asks, “What do you guys recommend to track users that are making changes in your SQL Server environment?”

Tara Kizer: What kind of changes?

Erik Darling: Yeah, what kind of changes? There’s like adding tables, columns, indexes, stuff like that and then there’s changing data. They kind of get different answers.

Tara Kizer: If it’s due to changing data and you don’t want them changing data, don’t give them that access. Have them change the data through stored procedures that have been tested. That way they can’t do any type of rogue queries and you could add in some logging in those stored procedures.

Brent Ozar: If they get the business on their side and the business says, “No, trust me, they need to change the data whenever they want,” then it’s not your problem and you just got to let it go. When the business comes and asks, “Who made these changes?” “I don’t know. You told me anyone could make them. I’ve got to let them make them.” I’m a big fan of Columbo.

 

Should I use autogrow on TempDB?

Brent Ozar: Michella says, “For tempdb should I set autogrow on or not?”

Erik Darling: It depends.

Tara Kizer: I think Brent has a different opinion than say maybe Erik or me, I’m not sure.

Brent Ozar: I bet we have the same.

Tara Kizer: Well, the same, I mean in the install doc, you recommend going ahead and growing out tempdb to whatever that mount point or drive has for storage. I would want it to be 20 percent less than that due to monitoring tools and alerts. So I go ahead and grow it to what it needs to be, then I monitor what the growth is and make sure the larger size persists through reboots. But I like to be notified. I want to know if tempdb is filling up, so I allow it to autogrow into that extra 20 percent, and because of my alerts, hopefully I can catch it in time to possibly add storage.
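Pre-growing tempdb while leaving autogrow on as a safety valve looks roughly like this. The sizes here are hypothetical; `tempdev` and `templog` are the default logical file names:

```sql
-- Sketch: grow the tempdb files to roughly 80% of the drive up front,
-- and keep a fixed autogrow increment as the emergency release valve
-- that monitoring alerts on.
ALTER DATABASE tempdb
    MODIFY FILE (NAME = tempdev, SIZE = 20GB, FILEGROWTH = 512MB);
ALTER DATABASE tempdb
    MODIFY FILE (NAME = templog, SIZE = 4GB, FILEGROWTH = 512MB);
```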

 

What SQL monitoring tool do you recommend?

Brent Ozar: Jonelle asks a controversial question. “Which SQL monitoring tool would you guys recommend? I’ve seen SQL Sentry and SolarWinds at the last PASS and I’m torn which one to go with.” What do you recommend we look for when we’re choosing a tool?

Erik Darling: Whatever you can read and use. Whatever you’ll come back to and actually use is what I would recommend. If you open up a tool and you are nothing but confused and it’s not giving you the information that you want and it’s a hassle for you every time you do it and it’s just not giving you the information you need to solve problems, or see if there are any problems that you can root out, then it’s just not the monitoring tool for you. Back when I had ye old real job, I used, well it was SQL Sentry back then but it’s SentryOne now. They had a perfectly good monitoring tool, they had great reports. I liked it. There was a bar graph. When you saw little spikes and hiccups and stuff you could highlight and it would jump to whatever SQL was running at the time. So there’s some really nice features in there. That’s the one that I’m personally most comfortable with and that I would probably recommend but everyone else probably has their own spread of experiences.

Brent Ozar: Tara, which one did you use the most?

Tara Kizer: My last job was using SQL Sentry’s Performance Advisor. The job before that we were going to purchase licenses for all of our 100 servers to use it there. We had been a Quest user, but we kept having issues with Quest, the performance analysis tool, just on some servers, and these are very complex environments. But I really did like Quest’s Performance Analysis, and I think it’s been renamed Foglight, and have used Spotlight. I know that you guys will say Quest’s Spotlight and then SQL Sentry Performance Advisor, just as a sample of two tools, there’s other great tools out there, but I don’t think that Performance Advisor is comparable to Spotlight. Performance Advisor to me is comparable to the Foglight tool instead, the performance analysis, whatever they’re naming it these days. But I like both tools. I have not used the Idera Diagnostic Manager. We’ve had salespeople fly in and show it to us. It looked great, it just isn’t what we went with. As far as SolarWinds, I was doing a proof of concept on that product two jobs ago and it just did not have the features that I needed. Unfortunately, at that job we had to support SQL Server 2005 all the way up to the newest versions and it did not support 2005, so that was a strike against it right away, but it also didn’t have the features that I needed out of a performance monitoring tool. This was about three years ago and so things could have changed since then. I know that they still aren’t supporting 2005 of course, but maybe the features are more comparable to these other tools.

Brent Ozar: Yeah, if you want to read more about it, search for “Erik Darling monitoring tool” and he’s got a blog post that talks about it too.

 

Thoughts on Entity Framework?

Brent Ozar: Bert asks, “Thoughts on Entity Framework?” Richie, I’ll call on you for this one.

Richie Rump: Yo no sé [I don’t know]. No, no, no. I like Entity Framework. I like ORMs. I like any tools that make development easier, but just like any tool, you have to understand how to use it and then how it uses SQL Server. That’s where most development projects get into problems: DBAs take a look at the code and say, “What the heck is going on?” It’s because the developers don’t understand how the queries are being built, so they just kind of throw whatever works into their query on the dev side, and what comes out to SQL Server is total trash and total garbage. There’s really no thought on how any of that stuff works. I suggest you go to Julie Lerman’s site, she’s got a couple books, she’s got some Pluralsight courses on how Entity Framework works. That will give you a great comprehensive idea of how you should be architecting your application with Entity Framework so everything is working correctly. But I like it. These guys probably don’t.

Brent Ozar: I love it. I’m a huge Entity Framework fan—and I know that’s not trendy and I know a couple of people have typed in, “It made me throw up in my mouth.” Here’s the deal, do you like getting paid? I like getting paid. We only get paid if the developers ship features, if they can ship things that customers want to use. Is the code perfect? No, but it’s not solved by simply saying, “You have to write everything in T-SQL.” It’s not like they’re suddenly going to write amazing T-SQL without training either. The root cause on that is usually just a training issue, that people don’t know how to use the tool that they’re using. So DBAs are like, “You should write everything in a stored procedure so I can police it.” Dude, you’re already overworked. You’re working like 60 hours a week trying to keep a plane in the air. I’m a huge fan of stored procedures too but you can’t code everything by hand. “Everything should be done in assembler.”

Richie Rump: “Back in my day, we only had vacuum tubes and we liked it.”

Erik Darling: The old punch cards.

Brent Ozar: “Vacuum tubes didn’t work, we would suck on the keyboard.”

Tara Kizer: That’s one of the classes I had to take for my degree, assembly languages, it was actually one of my favorite classes. I really enjoyed it.

Richie Rump: Assembly language is why I got out of computer science. That was the class. I’m like, “No way I’m ever using this ever again. See ya.” Hello, School of Business.

Brent Ozar: Let the record show Tara Kizer is the only classically trained employee here. She keeps the rest of us in line. The rest of us are like, “The internet is a series of tubes.” We have no idea what’s going on…

 

What are the gotchas with Change Data Capture?

Brent Ozar: Ronnie asks, “Have any of you ever used change data capture and if so, what do you think about it and does it require a lot of management?” I have not used CDC. How about you guys?

Erik Darling: No, I think I’ve talked about it before, we had one client in the past who used CDC and it broke. They wrote their own CDC on top of CDC. So my one experience with CDC was not normal CDC. It was like, oh god, what happened to CDC?

Richie Rump: I prototyped it once. For the customer that I was dealing with, it didn’t fit their business case, but just like any other tool, you have your business case for it. You have the set of features of what you want it to do, go off and test it and see if it works for you. There’s other tools out there that may work better for you, but since it’s free in the box, go ahead and write a POC for it, see if it works.

Erik Darling: What’s the worst that happens?

Tara Kizer: I’ve only used change tracking but not CDC.

Brent Ozar: I don’t think I’ve heard any horror stories. I haven’t heard anybody go—sometimes you’ll hear these horror stories—“You should never use that feature.” I haven’t heard that story, so.
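For anyone prototyping CDC as Richie suggests, the setup is two system stored procedures (SQL Server Agent must be running for the capture and cleanup jobs). The table name here is hypothetical:

```sql
-- Enable CDC at the database level, then for one table.
EXEC sys.sp_cdc_enable_db;

EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'Orders',   -- hypothetical table
    @role_name     = NULL;        -- NULL = no gating role for reads

-- Changes then become readable through the generated function, e.g.:
-- SELECT * FROM cdc.fn_cdc_get_all_changes_dbo_Orders(@from_lsn, @to_lsn, N'all');
```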

 

Can I use Developer Edition for our developers to track bugs?

Brent Ozar: Dimitri asks, “Licensing question. Can we use SQL Server Developer Edition for our own internal bug tracking software or do we need Standard Edition?” As a guy who does a whole lot of licensing talks around here, I would tell you that that is production for you. I went through a licensing audit at one point, the licensing auditor had an elegant way of answering this. He said, “If I can turn off the server and the only people who will come running are developers saying that they can’t work at the moment but they didn’t lose any work, then you can kind of get away with calling it development.” But if there’s actual work going on in here, which is what bug tracking is, then it’s production.

Erik Darling: You might be able to get away with SQL Express depending on how big the database is if you’re looking for free, but that depends a bit as well.

 

Followups on earlier questions

Brent Ozar: Paul follows up and he says, “The Always On Availability Group stats I was referring to are the clustered index stats. The secondaries are updated after hours.” You know what I would do is I would post a picture of DBCC SHOW_STATISTICS. Do that on both replicas and go ahead and post that up on a place like Stack Exchange and you’ll be able to get better info. I think there’s a piece that we’re missing here.

Tara Kizer: Also look at your latency, query both to see what your latency is.
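To gather what Brent and Tara are asking for, run the stats header on each replica and check redo latency on the secondary. Object and index names here are hypothetical:

```sql
-- Run on each replica: compare the Updated date and Rows Sampled in
-- the statistics header for the clustered index.
DBCC SHOW_STATISTICS ('dbo.YourTable', 'PK_YourTable') WITH STAT_HEADER;

-- Run on the secondary: how far behind is redo for each database?
SELECT database_id, redo_queue_size, last_redone_time
FROM sys.dm_hadr_database_replica_states;
```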

Brent Ozar: Yeah.

Brent Ozar: Steven says, “We use CDC and it works very well.” Breck says, “CDC is a royal pain.” So just like anything else in databases, you get opinions on both sides.

Richie Rump: Can’t please everybody. That’s why you need to test it yourself.

Brent Ozar: Can’t really please anybody.

 

What’s the best way to source control database objects?

Brent Ozar: We’ll take one more question. Phil says, “What’s the best way to source control database objects? My company uses SVN for source control but I find it very tedious and the free SQL Server add-in doesn’t work properly.”

Tara Kizer: I really like Team Foundation Server, we used that at my last job, very very heavily. I’ve used SVN and other products as well but I really like TFS.

Brent Ozar: Richie is making a face. Richie, what do you like?

Richie Rump: I think the rest of the development world has moved onto Git and/or Mercurial so I like that. Redgate has a tool that really integrates well with other stuff, so that’s an option as well. Everything I have now is in Git, just everything, Git and GitHub.

Tara Kizer: I find Git so hard to use though, especially if you’re only talking about SQL Server database objects, that’s where I draw the line. If you’re talking about a lot of other stuff, then maybe use something like that, but SQL Server database objects, I would be Redgate or TFS.

Richie Rump: Oh, I could give you the horror stories of TFS. Oh, so many horror stories. Personal scars. Here, let me show you my scars here.

Tara Kizer: When I say Redgate or TFS I’m talking about the Redgate schema change feature and then use something on top of that.

Richie Rump: Yep, done that.

Brent Ozar: There’s two ways you can do change comparisons, there’s either before the fact or after the fact. If you do before the fact, the developers are checking in changes before they get deployed to production. For things like that, stuff like TFS works extremely well because it’s well integrated into the developer’s workflow, Git is as well. But that’s generally not schema detection and change detection that gets applied to production. People are just building change control scripts and then deploying them over in production. The other way of doing it is where DBAs often come in, which is after the fact. We go monitor the server once a day or every four hours, whatever, check to see what objects changed and then go check those changes in. I’m a huge fan of Redgate stuff for that, like postmortem, here’s the changes that happened to the database. It doesn’t tell you who did it, doesn’t tell you when, doesn’t tell you why, but as a DBA, I don’t usually need to know that. I just need to know that they changed. It would be nice to know who did it but I don’t have that luxury. Christopher says, “We just yell ‘fire in the hole’ when releasing scripts.”

Tara Kizer: I’ve worked in environments like that.

Richie Rump: Fire in the hole. I’ve worked in environments where the DBAs are not in charge of the database, right? I mean essentially they’re in charge of setup and backups and storage and then that’s it. The development team is in charge of whatever goes into it.

Tara Kizer: I’m okay with that as long as those developers are in the on-call rotation because if I’m in the on-call rotation, I want the people that are making changes, I want everyone on the on-call rotation that can have that kind of access, because I don’t want to be called for some kind of deployment issue that you caused.

Richie Rump: Yeah, exactly. They had a special development dev ops team that kind of monitored that whole thing. I don’t think the DBAs wanted to touch that database anyway.

Brent Ozar: And there’s a different experience here from like Tara is coming from a background of working with replication and replicated databases where if you make schema changes you can be in deep trouble if you do some of those changes.

Brent Ozar: Thanks everybody for coming and hanging out with us. Bert King says, “Happy New Year.” Happy New Year to you all too. We’ll see you next week. Adios folks.


Thanks, Google

Licensing
10 Comments

I usually can count on Google to figure out what I want to search for, even if there’s typos in my search. But sometimes Google just doesn’t get it right.

I’m not Brent Ozar. I can’t afford a Rolex!

Do you have any searches that Google or Bing didn’t get right and made you laugh? Drop a comment.

Erik says: Google must have seen the price tag on Enterprise licensing and gotten confused.

Brent says: If you see me with a Rolex, don’t look too closely at it. I’m just saying there’s an entire subreddit dedicated to replica watches. And for the record, I’m a Bell & Ross guy.


Build Me A Build: Assembly Olympics

Hardware
39 Comments

Overall, this was pretty easy

I’m thankful that generations of enthusiasts have agreed (mostly) on standards, and written oodles of documentation and guides to make stuff like this easy. Seriously. To put things in some perspective, when I first started building and tinkering with desktops, I remember having to manually set IRQs when there was a conflict. It was stupid.

But that’s not to say everything is simple. You still have to screw all this stuff in. With a non-magnetic screwdriver, mind you. A magnetized one could screw stuff up.

That and… Well, let’s call it a supply chain error.

It all begins with Microcenter

When I placed my initial order for the CPU and motherboard with them, it sat around for about an hour, then got canceled. I called to ask why, and no one knew. So I placed the order again, and it was magically ready to pick up a few minutes later. They both sat on my desk while I waited for the rest of the parts to get shipped.

Around a week later, when I had most everything, and a few days off to tinker, I started assembling things. The case is a Fractal R5.

This is basically a minivan
Seriously, you could nap in here.

But back to the supply chain issue! Here’s my motherboard box, unopened.

X99-E, just what the doctor ordered.

After a while of excited assembly, a few things started to add up.

I had the wrong motherboard in my box. In fact, I had an oddly beat up motherboard in my box.

Why is it all weird?

If you look real close, someone had clearly bought an X99-E, swapped their beat-up X99 Pro into the box, and returned it.

The first clue should have been when the back panel didn’t match the motherboard ports at all:

I am the back of your motherboard
I am missing holes

But I was excited, and I had already seated the RAM, CPU, and heat sink.

After getting most everything installed…

Delicious

I got a motherboard error about the RAM. It wasn’t compatible.

Doh

The next big clue here? The X99-E doesn’t have an LED with error codes. I learned this after flipping through the manual in disbelief for a while.

Like I said, “supply chain errors”.

Back to Microcenter

They were… okay about me returning it, as long as I replaced it with another board. I sort of get it. When a guy who looks like me comes walking in with a few-hundred-dollar motherboard and a week-old receipt saying they put the wrong one in the box, it’s a bit odd.

On the plus side, they had another X99-E in stock, and this one had the right one in the box. I made them open it in the store this time.

Back home, and with the right motherboard, I put my day of work back together.

Fast forward to yesterday

And, for reference, yesterday is the 4th of January, year of our SMOD 2017. I finally got the last piece of the puzzle, the M.2 drive.

It’s a lot smaller than you’d expect.

75 cents wide

This is what the final product looks like now

My cable organization game is pretty weak

There are a few differences between the final build and what I planned on, but I’ll talk about those in another post.

Thanks for reading!

Brent says: you always read about how Apple gear is overpriced, and to some extent, I agree. However, when you factor in assembly time, it’s not all that bad of a deal. When something like the motherboard switcheroo happens, that can burn a ton of time, and your time is money. Well, not Erik’s time. He’s an employee, and as we all know, employee time is practically free. That’s why Erik is currently giving Ernie a foot massage.


Tell Us What You Make: The 2017 Data Professional Salary Survey

Not what you’re worth, dear reader: we know your worth cannot possibly be calculated in base-10 numbering systems because you’re just so awesome, but what you make.

A few things to know:

  • It’s totally anonymous (we’re not getting your email, IP address, or anything like that)
  • It’s open to all database platforms
  • Your responses will be totally open (so we’re not asking for city, for example)

Your next steps:

Thanks for your help in giving everybody in the community a better chance to talk honestly with their managers about salary.


Filtered Indexes and Variables: Less Doom and Gloom

Indexing, SQL Server
5 Comments

It Is Known

That when you use filtered indexes, they get ignored when your queries are parameterized. This is a Plan Caching Thing©, of course. The simplest example is a bit column with a filtered index. If your index is on WHERE Bit = 1, it doesn’t have data for WHERE Bit = 0. That index would only be suitable for one variation of the query, so caching a plan that uses an index which can’t be reused for every variation isn’t feasible.
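A minimal sketch of that bit-column scenario, using a hypothetical table and index name:

```sql
-- Hypothetical example: a filtered index that only has rows for IsActive = 1.
CREATE NONCLUSTERED INDEX IX_Widgets_Active
    ON dbo.Widgets (Id)
    WHERE IsActive = 1;

-- A parameterized query can't use it: the cached plan has to be valid
-- for @IsActive = 0 too, and the index has no data for that case.
DECLARE @IsActive BIT = 1;
SELECT Id FROM dbo.Widgets WHERE IsActive = @IsActive;
```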

There would be errors everywhere. Your users would hate you. You’d end up in a belltower, probably ordering a lot of pizza and waiting for one of those Kevin Costner movie marathons where they play everything from that real awkward stretch from 1992 to uh… oh wow, it never stopped. There are a lot of bad movies in there. Tin Cup was funny though.

Mix and Match

But on the bright side, plans can be cached if they’re parameterized and the parameter isn’t for the filter condition. Let’s look at a quick example.
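The index from the original post was a screenshot that didn’t survive; here’s a sketch of its shape against the Stack Overflow Users table. The key columns and filter column match the discussion below, but the specific Reputation value is an assumption.

```sql
-- Keyed on UpVotes, DownVotes, Id; filtered on Reputation.
-- The 100000 threshold is a stand-in for the value in the original post.
CREATE UNIQUE NONCLUSTERED INDEX IX_Users_Votes_Filtered
    ON dbo.Users (UpVotes, DownVotes, Id)
    WHERE Reputation > 100000;
```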

This is a pretty index, isn’t it? Unique, narrow, selective filter condition. It’s a winner. What could go wrong?

Well…

Just so you don’t think I’m lying, here’s what happens if we disobey the first rule of filtered indexes.
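The query in the lost screenshot looked something like this: parameterizing the filter column itself.

```sql
-- Parameterizing the filter column: the optimizer can't guarantee the
-- filtered index covers every possible @Reputation, so it gets ignored.
DECLARE @Reputation INT = 100000;
SELECT UpVotes, DownVotes, Id
FROM   dbo.Users
WHERE  Reputation > @Reputation;
```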

Monsters.

Clustered index scan, missing index request. The works. When offered our filtered index, the optimizer makes the same face I do when all the scotch behind the bar starts with “Glen”.

But…

Other things can be variables! Remember that our index is keyed on UpVotes, DownVotes, and Id. It would be daffy to create the index we did just to search on Reputation. So let’s expand our horizons.
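Something along these lines, where the Reputation literal matches the index filter (the specific values are assumptions):

```sql
-- Parameters on the key columns, a literal on the filter column:
-- the cached plan is valid for every parameter value, so the
-- filtered index is safe to use.
DECLARE @UpVotes INT = 100, @DownVotes INT = 10;
SELECT UpVotes, DownVotes, Id
FROM   dbo.Users
WHERE  UpVotes    > @UpVotes
  AND  DownVotes  < @DownVotes
  AND  Reputation > 100000;
```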

UpVotes and DownVotes take parameters as predicates, but Reputation is a literal value. In this case, the optimizer makes the same face I do when I find a Chateauneuf-Du-Pape I haven’t had yet. Infinite joy.

Joy, Ode To

Filtration System

While index filter conditions don’t deal with parameters well, you can still use parameters for predicates in other columns. That’s an important distinction to make when evaluating filtered index usage.

Thanks for reading!