[Video] Office Hours 2017/02/1 (With Transcriptions)

This week, Brent, Erik, and Richie discuss AlwaysOn Availability Groups, execution plans, speeding up log shipping, applying wrong service packs, why servers “page wildly” when copying mid to large-size files from one SQL server to another, their favorite high availability solutions, and much more!

Here’s the video on YouTube:

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Enjoy the Podcast?

Don’t miss an episode, subscribe via iTunes, Stitcher or RSS.
Leave us a review in iTunes

Office Hours Webcast – 2017-02-01


Is it worth learning Always On?

Brent Ozar: James says, “I’m a mid-level DBA.” I think that’s kind of like a mid-level manager, he’s kind of sort of a DBA. “We do not currently use Always On, however, is it worth learning the technology and how many of your clients use Always On?” Erik, what do you think? Is it worth learning the technology?

Erik Darling: It’s always worth learning, whether it’s worth implementing, whether it’s worth emotional pain that you can go through where every time you think you know what’s going on with your availability group, you don’t actually know what’s going on with your availability group. We’ve said it a million times before, or at least I have during Office Hours and to clients, is I’m an idiot. I can set up an availability group. It is dead simple to set up. What sucks and what’s hard is when they break. As soon as an availability group breaks or as soon as you have to make hard decisions about bringing an async replica online, if the primary [inaudible] goes down, what do you do? That’s where availability groups gets hard. So learning how to set one up is easy. Learning how to keep one up and learning what to do when something goes wrong with one is all the hard stuff. So, worth learning? Yeah. Worth implementing? It’s your coronary.

Brent Ozar: You have to think about what the real cost is. Tara writes about this in her white paper on the Google Compute Engine stuff. She says you really have to have a staging environment that’s separate. You have to rehearse everything in the staging environment, be comfortable with how things work. The second part of your question was how many of our clients use Always On Availability Groups. I don’t know ballpark on a percentage, I think on a percentage it’s probably fairly low but you have to remember that we’re kind of an emergency room for SQL Server. People either come in when the performance is bad or when it’s unreliable. When people come in for unreliable SQL Servers, it is often Always On Availability Groups, that they can’t understand why failovers happen for example.

Erik Darling: Just to add a little bit to that. I’ve never recommended availability groups to anyone. Part of our Critical Care, we can do some HADR architecture spec talk. I have never once recommended availability groups to a client. Beyond that, I think the percentage of clients using them is a much different percentage than clients who are using them happily. I’d say like maybe 3 to 4 percent of clients are using them. There’s an even smaller percentage who are like, “We’re psyched. This is going great. Got my happy face on.”

Brent Ozar: It’s often the reason why they’re calling, that there was a problem with it. I don’t believe it’s that the technology is broken, it’s just hard. There’s a wizard and it looks like you can just step through the wizard and you’re done. People are like, “That wasn’t so hard. What’s the big deal with that?” Man, it’s what happens afterward that’s harder.


Why are query plans different between two servers?

Brent Ozar: Josh asks, “I have SQL Server 2012 Expensive Edition” as we like to call it. He says, “There’s a difference in execution plans between two different servers, between dev and QA. I’ve tried making everything identical. What else can I look for that would cause execution plans to be different?” He’s ruled out clearing the plan cache, my set connection options, the hardware is the same, partition tables and indexes are identical. What would you guys look at?

Erik Darling: Different execution plans…

Richie Rump: Look at stats.

Erik Darling: Yeah.

Brent Ozar: I like it, yeah. Because what if the databases aren’t from the same point in time? What if there has been changes to them? What if one of them updated stats and the other one didn’t? That can hugely make a difference.

Erik Darling: Yeah, I’d double check other settings too. I would just make sure MAXDOP and cost threshold are in the same place. I would make sure that all of that stuff is floating around the same area. Just to dot the i’s and cross the t’s.

Brent Ozar: I would also say too that there’s a really cool tool built into Management Studio 2016 that lets you compare execution plans. If you Google for – and I’m not just saying that because today is like Google day – if you Google for compare execution plans with SQL Server 2016, it will show you what the differences are. It will show you what’s unequal. It can be things like the cardinality estimator, not on 2012, but it will show you all the differences between the plan, like what influenced it to do different things.

Erik Darling: One thing that might be even fun to try is see if there’s any hinting or whatever you can use to get the plan to be the same. That way, if it’s like a difference in a join type, try forcing the join type. If it’s a difference in joint order, try forcing the joint order to see maybe why that other server isn’t picking it. You may find that for some reason, somehow, that plan on that other server is getting a different cost estimate or something.

Brent Ozar: That’s true.
Brent Ozar: Roland says, “I’m running an Always On Availability Group with two nodes in synchronous mode. Could running a CHECKDB on the secondary replica somehow affect the primary replica?”

Erik Darling: Yes.

Brent Ozar: Yes. It’s going to slow things down, yes. If you’re doing inserts, updates, and deletes against the primary and then the secondary isn’t responding to writes as quickly, yeah, unfortunately that is going to slow down the primary.

Erik Darling: Another thing that could be happening is if you were one of those weird people who has an AG and still has a SAN behind it, you could be sucking all that DBCC CHECKDB data through the SAN and the other AG is sort of sharing that same pipe or whatever boobidi-babbidi and things are just kind of beating each other up there. Don’t make that face, Richie. You know boobidi-babbidi is a perfectly technical term for IO.

Richie Rump: Someone has a young daughter, that’s all I’m saying.

Brent Ozar: Is this like a Sponge Bob thing?

Richie Rump: It’s like “Bibbidi-Bobbidi-Boo,” Cinderella thing.

Erik Darling: “The thing-a-ma-bob that does the job.”

Brent Ozar: A VM host too, if they’re on the same VM host that would be horrible for other reasons.


Will a CONVERT function be SARGable?

Brent Ozar: Dimitri asks, “If I do a convert function in a where or a join, is that still going to be SARGable?”

Erik Darling: Only for some things, not for all things. The one that he explicitly asked about was date/time. There is a special optimizer rule built in for date to date time conversions where SQL can still use an index and still do a seek because the left-most part of the date is still part of the histogram and all that. So SQL can still do all sorts of nice stuff with just the date part to date time. It doesn’t work as well for all converts, tasks and whatnot.

Brent Ozar: The thing that’s tricky is it may still do a seek but the estimates can be wrong. In one of our classes I show that using that convert date and it gets a seek but it only estimates one row instead of like 4,000.


How can I reduce log shipping latency?

Brent Ozar: Tim says, “What are some ways to speed up log shipping to an off-site read-only standby with an app that can create hundreds of transactions even on a simple order lookup?” I guess what he’s really asking is – so if you can’t write less to the log file is there a way to speed up pushing that log data to another location?

Erik Darling: It depends on which part is slow I guess. You could take more frequent log backups.

Brent Ozar: Yeah, oh, yeah.

Erik Darling: …smaller logs across. I would rather send a 10 or 30 second log of data across than five or ten minutes. I think that’s going to help.

Brent Ozar: And if you’re worried about latency, that makes me think that maybe your querying the secondary in which case taking more faster log backups isn’t going to help a whole lot unless you’re kicking people out every time you do a restore.


Should I learn the cloud?

Brent Ozar: Interesting. Clark asks, “As a DBA who only became comfortable with virtual servers running Microsoft SQL, do I now have to swallow the red or blue pill and learn cloud-based servers? If yes, where do I start? My current employer would never agree to cloud-based anything.” No, if you’re not doing it, don’t go learn it. If the company won’t accept it, don’t bother learning. You’ve got enough things to learn. Focus on the things that will make you more valuable at your employer.
Richie Rump: Unless you want a new employer and you’re really passionate about cloud computing and cloud databases, then by all means, start spinning stuff up, start learning it. But if you’re comfortable where you’re at, you like where you’re at, and you like what you’re doing, there’s really no need.

Erik Darling: One thing I will say is that the perfect cure for interest in the cloud is learning the cloud.

Brent Ozar: Oh god, suddenly I’m scared. Hold me. Yeah, poor Richie has been spending the last two, three days, really since Friday I think struggling with a problem.

Richie Rump: Friday, yeah.

Brent Ozar: Yeah, struggling with an ugly problem.

Richie Rump: It’s been multiple build systems and multiple projects so it hasn’t just been the one. It’s been multiple and then trying to get all that automated because there’s so much documentation everywhere on how to do this stuff, you just have to figure it out.

Brent Ozar: We have a company Slack chatroom where everybody goes in and chats. One of the things that goes into one of the rooms is Richie’s failed builds so we all get to see whenever Richie’s builds are failing.


Should user databases have one file per core?

Brent Ozar: Renee says, “I have a server with four physical CPUs, each of which has ten cores. I’ve got a data warehouse, should I line up the number of data files with the number of CPU cores that I have? Should I have 40 data files for my data warehouse?”

Erik Darling: I don’t think that’s going to do much for you. I wouldn’t even say that for tempdb. I only mentioned tempdb because I can imagine that sort of thinking coming from having one tempdb data file per core so that’s why I say that.

Brent Ozar: You’ll sometimes read Microsoft Fast Track Data Warehouse reference architectures and they’ll do crazy stuff like that because they’re trying to drive 100 percent CPU use across lots of storage fabrics. That’s really a different kind of crazy town but that’s fairly rare to see. I don’t think I’ve ever seen anybody with more than ten files per file group and they didn’t like it even when they had ten.


Should I use Windows 2016 or 2012 R2?

Brent Ozar: Justin says, “Back in the SQL 2008 and 2012 days…” It’s still the SQL 2008 and 2012 days. We’re still here.

Erik Darling: For many of our clients it’s still 2008.

Brent Ozar: Tara I think answered a SQL Server 2000 question today on Stack Overflow.

Richie Rump: Noooooooo.

Brent Ozar: He says, “Have you guys recommended running Windows 2012 over Win 2008? Do you have a SQL or similar recommendation regarding on what OS you should run SQL Server 2016 on?” It doesn’t make a difference to me. Win 2012 versus SQL in 2016.

Erik Darling: Yeah, unless you’re chasing failover clusters or something else there are a few advantages to 2016, like you can add rolling cluster upgrades and some other cool stuff. I believe that if you’re running availability groups Windows 2016 is the only one that supports the distributed transactions.

Brent Ozar: Oh, yeah.

Erik Darling: So there are some advantages to Windows 2016 but generally I don’t think I care all that much as long as it’s not 2008 or 2003.


Should rename/lock/hide the SA account?

Brent Ozar: Wes says, “I have a bad question.” We’ll be the judge of that. “Do you guys lock the SA account and replace it with a new SA in order to hide the 0X01 user? Is there anything that I should think about there?” No. I don’t ever do that.

Erik Darling: I’ve disabled SA a couple times just to check off an audit box but never made a difference.

Brent Ozar: I don’t feel massively secure. My bigger problem is usually that everybody in the shop has the SA password and it’s like on every post-it note everywhere and I’m like, “Can I please get that culturally changed?” and they say no.

Erik Darling: Two things I always notice is at my last job when I walked in I asked my boss the SA password and he said, “It’s on the whiteboard by the developer section.” The other thing is, this is something that has always irked me a little bit. I say a little bit because I go back and forth on it. Everyone in Management Studio saves their password, right? Because no one wants to go do something if they’re using SQL logins. If you’re using Windows auth it obviously doesn’t matter but logging in with SA most people will save the password. I have a little bit of a gripe with that because then anyone, if you leave your workstation unlocked, they just open up Management Studio and they go right in from your workstation. So just as a best practice for you, unless you have those crazy SA passwords that you can’t remember because it’s all like hyphen, parentheses blah blah blah, then try not to save them.

Brent Ozar: Yeah.

Richie Rump: I just put it on a sticky note and post it on my monitor so I don’t forget.

Brent Ozar: Nice. Or 1… 2 … 3… 4… 5…


Should I deploy SQL Server 2014 now or 2016?

Brent Ozar: Ryan says, “We’re running SQL 2012 SP2 CU14.” Come on, now. If you have to name that I think I’m already in trouble.

Erik Darling: If you have to name that, why aren’t you on SP3?

Brent Ozar: Good point, CU14. “Our software vendor certifies what SQL Server version we can run on. They just certified 2014 and we expect them to finish certifying 2016 sometime this year. Performance is fine but some execution plans suffer. Would you wait until 2016 is certified or are the improvements in the cardinality estimator in 2014 worth making the change forward now?”

Erik Darling: I’d wait.

Brent Ozar: Yeah, me too. What are your reasons behind waiting?

Erik Darling: There are just enough bells and whistles and improvements to the cardinality estimator between 2014 and 2016 that I would just wait it out. There is no sense at this point upgrading to a version behind current especially Microsoft is getting into such an aggressive lifecycle for SQL Server. It sounds like vNext is going to be an every two-year thing now. We’re not going to see long waits between major versions. So I would be very careful with that and not intentionally put myself behind a version.

Brent Ozar: And if I do it, I only want to do it once. I don’t want to go through it in 2014 and then go through it again in 2016.


We really messed up our patching…

Brent Ozar: Rosemary says, “I have a question about applying the wrong service pack.” Well, don’t do that. She says, “On a two-node cluster someone has applied the wrong service pack for the passive node. Now the cluster won’t failover. Is the only solution to uninstall the wrong instance from the passive node and then reinstall it with the right binaries?” Based on the – and I’m skipping like five lines out of your question – call support. Call Microsoft Support. It’s $500. They’ll work the problem with you until it’s done but frankly if someone did something that incredibly stupid, like they applied the wrong version of SQL Server across, I would take this advantage to go build a new node in the cluster that I was going to then go failover to. Just something brand new from scratch that I know is going to be reliable. If someone that incompetent screwed up your service pack, god only knows what else they installed on the SQL Server. Like in our recent engagement at Google where we accidentally infected our domain controller with a virus. That would be Erik Darling.

Erik Darling: Excuse me.

Brent Ozar: Get something off of BitTorrent we said. How hard can it be we said.


How big of a drive should I use for system databases?

Brent Ozar: Tim asks, “When you guys build a new server, how big of a drive do you create for the system databases, master, model, and msdb?”

Erik Darling: I don’t ever separate those from other stuff.

Brent Ozar: Where do you put them?

Erik Darling: I put them in with whatever data files and log files I’m going to have for my user database. I might have a system folder on that drive but I don’t ever separate them out for anything. Tempdb is the only one that gets its own good boy drive.


What MAXDOP settings take precedent?

Brent Ozar: Jonelle asks, “If you set MAXDOP at the instance level, the max degree of parallelism, if you set it at one, does that disable parallelism unless you provide a query hint?”

Erik Darling: Yes.

Brent Ozar: Yes.

Erik Darling: Or, unless you’re on 2016 and you set it at the database level. There are database scope configurations for such things, which is weird because they do it for MAXDOP but not for cost threshold, right?

Brent Ozar: Right. Yeah, I wrote a blogpost recently on all the ways you can set MAXDOP and I think it was like 13 or 16 different ways that you can influence parallelism just by different settings. It’s crazy how many things you can tweak it with.


Can BEGIN TRAN fill up TempDB?

Brent Ozar: Eric says, “Is it possible…?”

Erik Darling: I did not.

Brent Ozar: What? Different Eric. He doesn’t know how to spell Erik correctly. He says, “Is it possible that a statement with a BEGIN TRANSACTION that uses tempdb that is left hanging can hold the tempdb log file hostage and eventually cause it to fill up and bring its SQL Server to its knees? If so, I find this disturbing.”

Erik Darling: Welcome to SQL.

Brent Ozar: Yep, bad news. Yes, that is possible. It’s kind of sad. I say kind of sad, it’s not like I’ve ever run into that. The problem for me hasn’t been the log files, it’s been the data files. Some yoyo dumping a huge thing into tempdb trying to join 50 tables together. It’s not always Richie, it’s just sometimes Richie. Less often now that he’s doing work in Postgres.


Why does the page file get used when I copy files around?

Brent Ozar: Guillermo says, “Every time I start copying a mid to large size file from one of my SQL Servers to another, I see the server starts paging wildly. Why is this?”

Erik Darling: I did this to a server once. If you look back shamefully, I had a Stack Exchange question, but I did it with PowerShell. I was using the move item command. I was moving a bunch of data files around and one of them happened to be like 500 gigs. As soon as I did that, the server just stopped. It was like no one could get it in. Nothing. It just stopped dead. It stopped until the file finished moving. That was because Windows tasks like that, like file system stuff and PowerShell and even like Xcopy or Robocopy, they don’t give a single care about your SQL memory. They see the memory available on the server and they’re like, “I’m going to buffer this whole file, move it over. I don’t care.” That’s why it’s very important that when you move files on SQL you do it very carefully and do not hit Control C and Control V to do it. Even just like dragging and dropping a file from one place to another can cause trouble if it’s big enough.

Richie Rump: Honey badger Windows don’t care.

Brent Ozar: Does not care. Takes all of your memory. If you’re playing around with command line stuff like Xcopy, look into unbuffered. There’s an unbuffered switch you can use for Xcopy.

Erik Darling: I think Robocopy has a similar one that I started using as well.

Richie Rump: Don’t mentioned that Power thing again please, Erik.

Erik Darling: I mention it as little as possible.

Richie Rump: We’re not on speaking terms right now.

Brent Ozar: Especially, it’s so illustrative, you can just trash servers so much more quickly. That’s not true. That’s kind of true.

Erik Darling: It is true.


Should I separate data and indexes to reduce gaps?

Brent Ozar: Heather says, “A colleague recently recommended that we should move indexes to a different file group because reorgs and rebuilds cause gaps. Is there any merit to this suggestion?” That’s incredible.

Erik Darling: Gaps? In what?

Brent Ozar: I guarantee I know what they’re thinking. They’re thinking I have like say 100 gigs that’s all completely full and I want to rebuild a 50 gig index. That it’s going to pick up that index and move it to the empty space at the end and there’s going to be a 50 gig hole at the beginning. So theoretically this could be true if you have one dominant object in the database, one object that takes say half of the database or more. But if you have, to keep the numbers simple, ten things in your database that are all equally sized, you’re not rebuilding them all simultaneously. You’re rebuilding them one at a time and you’re leaving free space behind. Is there going to be a gap? Yes, even if you put it in a different file group there will still be a gap, unless you get fancy and go back and forth between file groups or something but tell that colleague of yours to stick with the other department that their primary job description is in, management for example, but keep out of the database.

Erik Darling: Yeah, that’s one of those bananas things where it’s like how is that even going to change anything? Because SQL is going to read that from disk into memory anyway, right?

Brent Ozar: It kills me that people forget what RAM stands for. RAM doesn’t stand for sequential access memory. It stands for random access memory.

Richie Rump: How old is that server? Is it like SQL Server 6.5 or something?

Brent Ozar: Or how old is the colleague too?

Richie Rump: Goodness.

Brent Ozar: “When I was your age…”

Richie Rump: I said that this week.

Brent Ozar: What made you say that?

Richie Rump: Cloud.

Brent Ozar: Oh. “When I was your age, I would have gone into the data center and beat it with a hammer. A stone hammer, not PowerShell, the real, original stone hammer.”

Richie Rump: “When I was your age I did builds on my own machine and I liked it.”

Brent Ozar: “I really liked it because I could go do something else while my code was compiling.”

Richie Rump: “I went and got coffee.”


Should I enable xp_cmdshell?

Brent Ozar: Guillermo asks a controversial question. There’s no controversy with me. Guillermo says, “What’s your stance on enabling xp_cmdshell on SQL Server?”

Erik Darling: I don’t care. If you’ve got to use it, you’ve got to use it. You’re not any better off enabling and using xp_cmdshell than you are if you were to replace that with say using PowerShell in an [inaudible] or something. It’s just not. Plus, as a guy who has had to do a lot of dumb things with SQL Server, I have used xp_cmdshell to do many of them. My proudest moment was using xp_cmdshell to build a cookie and then use the cookie in a call to curl to download a file. Then lay that file onto a table. I was so psyched on that. It was like I don’t know why everyone hates xp_cmdshell.

Brent Ozar: That’s pretty cool.

Richie Rump: Learn a programming language, Erik.

Erik Darling: I tried to learn PowerShell because everyone…

Richie Rump: Again, learn a programming language, not script kiddie stuff, okay?

Erik Darling: … just write a quick PowerShell script to do it. I’m like, yes, quick PowerShell script. Then four days later I’m like sitting there in a pile of broken keyboards with like punches through my monitor and like teddy bear heads ripped off. Quick PowerShell script my foot.

Richie Rump: My thing is with xp_cmdshell is like should you be doing it in a database at all? Maybe this should be somewhere else other than the database.

Erik Darling: I agree but sometimes these tasks have to interact with the database, right? Like sometimes you do have to move a file. Then you can’t do anything in the database until the file is done moving.

Richie Rump: That’s why we have STKs to talk to databases with, these programming languages. You see how that works?

Erik Darling: You do. DBAs who don’t know C# and don’t have the time to learn C#.

Richie Rump: …it’s been around for – I don’t know – over 15 years. I mean, I think it’s pretty stable.

Brent Ozar: Here it comes: When I was your age…

Richie Rump: When I was your age…

Brent Ozar: I would ask – this does bring up a good point – what do you need xp_cmdshell for? That is the thing that I usually ask. Like are they trying to implement some kind of automation that is really best done in another language or in an app server. I was talking to a DBA recently who even does file unzipping through xp_cmdshell, I’m like this may not be the best use of CPU cycles on SQL Server’s licensing.

Erik Darling: 7-Zip CLI.

Brent Ozar: PKZIP… When I was your age…

Richie Rump: I could do that in Azure Functions. Just saying. Or Lambda.

Brent Ozar: When I was your age we didn’t have Lambda.

Richie Rump: That’s right.


On a SAN, does it matter if I separate data and indexes?

Brent Ozar: Dimitri says, “Speaking for file groups. If all my storage is on a SAN, does it make any difference for performance to have data and indexes on different file groups?”

Erik Darling: Not unless they’re on totally different spindles or I guess maybe if you have like a real fancy SAN and they’re routed through different storage paths, then maybe – through different network storage paths rather, not the other thing.

Brent Ozar: That was the reason we always talked about it in the old days – when I was your age – we would put the data on one set of drives and the indexes on another. The theory was you could read from the data and rebuild the indexes faster, like if you’re reading from one set of spindles and writing to the other. These days, kids seem to have this really fancy concept called memory. They put a lot of their data up in memory so then they don’t have to read from disks in order to write to disks but I don’t know, it’s just to give you the live broadcast – 64 gigs of RAM there in that desktop. Mine is a chump compared to Erik’s who has…

Erik Darling: 128.

Brent Ozar: 128 gigs of RAM in his desktop, ladies and gentlemen.

Erik Darling: Wouldn’t it be cool if SQL accepted weird slang for date expressions? Like “when I was your age.” Like “back in the day.”

Brent Ozar: We could write that.

Erik Darling: It would just interpret it, like based on your age and mindset.

Brent Ozar: Yeah.

Erik Darling: My barber is an old guy and he always says stuff like, “There was a time” and “back in the day.” Every sentence starts with it. Or it’s like “Nowadays.” Nowadays would be like the date minus seven or something.

Richie Rump: Did the html tag flash actually work? Did the marquee tag work and it scrolled across. Blink, how’s that?

Brent Ozar: Nice.

Erik Darling: Because it is a structured English query language so I think it should accept a wider variety of terms to express dates.

Brent Ozar: Yesteryear.

Erik Darling: Yeah.

Richie Rump: You want to make things more complex for DBAs? Is that what I’m hearing?

Erik Darling: Well, for developers.

Brent Ozar: No, just developers.

Richie Rump: Well then, sure, fine, why not? We’re not updating things every two weeks now, right? What could possibly go wrong?

Erik Darling: No one depends on that stuff.


Should I increase max worker threads?

Brent Ozar: Neil says, “Have you guys had any experience with high concurrent connections having max worker threads run out of idle threads? Is there a reason why I should increase max worker threads?”

Erik Darling: Noooooo.

Brent Ozar: Why no?

Erik Darling: I think we have a joint blogpost coming up about this soon about how like…

Brent Ozar: We do.

Erik Darling: So max worker threads, it sounds like one those – like, oh, if I just had some more worker threads everything would be fine because I’m running out of them. The problem is when you add worker threads, you don’t add CPUs, you don’t add CPU resources, those threads are all now competing on the same CPUs you had before. So now where your workload may not fall off into the abyss, like your server may not stop responding, but it’s going to get really close because all those threads are going to be running on CPUs and slowing down and sucking at life.

Brent Ozar: I had one client he set max worker threads at like 3,000. They actually had that many queries running simultaneously and they didn’t get – I said, all right, divide out the number of cores you have and that number of worker threads. If they’re all active and they could make progress, think about how little CPU time they get because they just go on for four milliseconds of CPU time and then step off the line for someone else to execute. They take longer, not shorter, so it’s pretty amazing.

Erik Darling: What you really want to do is if you are consistently running out of worker threads and you can’t add CPUs is start looking at parallelism because parallelism is one of the biggest consumers of threads on most servers that I see. So either cutting MAXDOP in half, instead of 8, cut it to 4. If it’s 4, cut it to 2. Maybe boosting cost threshold up, but those are short term, artificial ways to do it until you have time to do the query and index tuning you have to do so that queries don’t have to go parallel anyway.

Brent Ozar: Bingo.

Erik Darling: Again, that’s a performance hit too because now these queries, you know, either were going parallel before and aren’t going now or we’re using more cores and finishing faster now or we’re using less and taking longer. So any situation with max worker threads and all this other stuff, you are going to slow stuff down until you make it better. There’s no great solution.


How many VLFs are too many?

Brent Ozar: Renee asks, “Do you guys have a rule of thumb on how many VLFs are too many and how do you minimize them?” Yeah, run sp_Blitz. Sp_Blitz warns you and we warn you when it’s over 1,000, not because that’s bad but we want you to fix it before you get up to 15,000, 20,000. There’s a link in there on how you go about fixing those, really quick and easy fixes.


Why do my missing index suggestions disappear?

Brent Ozar: Ben says, “I use sp_BlitzIndex against a somewhat high transaction db.” Okay, quick, podium moment here. Everything you work with will always seem like high transactions or high size. Never use those terms. Instead, use real numbers. He says, “sp_BlitzIndex will display five high value missing indexes for a specific table but then I do an index rebuild and reorg on Sundays and those five suggestions disappear. Why do they come and go?”

Erik Darling: Because they’re karma chameleons.

Brent Ozar: Karma karma karma chameleon…

Erik Darling: Just kidding. I look nothing like Boy George today. So in SQL Server 2012 is the only one that still has it for limited versions of 2012 because it got fixed in service pack 3 I think, but there was a bug that happened between 2008 R2 and in 2012 where every time you rebuild an index it clears out the DMVs for that index. So like all the usage and the missing index stuff. I want to say, Kendra has a blog post on littlekendra.com about which versions and which commands still reset certain counters. I want to say that maybe the rebuild thing still might clear out missing index stuff and it [inaudible] other DMV counters. There’s something wonky about it that I’m glad you wrote down for me. I would check there and get the full list but that’s the reason why. So stop rebuilding indexes.

Brent Ozar: There you go.


What’s your favorite high availability feature?

Brent Ozar: Next up Guillermo says, “For high availability for SQL Server what’s your preferred method between failover clustered instances or availability groups?” I’m going to take away his restriction and I’m going to say any high availability mechanism and we’ll each give answers. Erik, what’s your favorite and why?

Erik Darling: I just talked.

Brent Ozar: Richie, what’s your favorite and why?

Richie Rump: Favorite…?

Brent Ozar: I’ll answer. My favorite high availability solution is usually either VMs, just a single VM, because it’s easy. It’s not perfect. There are all kinds of things it doesn’t really protect you from. If someone drops a table, if someone hoses up a patch. But it just is kind of free high availability, especially for your older stuff that you can’t afford to reinstall or buy a fancy pants edition for. Barring that, I like failover clustered instances a lot because they’re really similar to what they’ve been around for years, so it’s easy to get books and training on those.

Richie Rump: Cloud databases.

Brent Ozar: Cloud databases. That’s such a good answer.

Richie Rump: Azure SQL database you have multiple copies when you spin it up so if one fails it jumps to the next one and then starts rebuilding another one. So that’s free. It’s out of the box. You spin it up and it’s there. It’s the cheapest way to get HA.

Brent Ozar: How about you, Erik?

Erik Darling: Actually I’m going to expand a little bit on what you said. My favorite setup of all time is the failover cluster with log shipping because that gives you HA and DR outside and log shipping, it doesn’t get any easier than log shipping. If you screw up log shipping, god help you with anything else.

Brent Ozar: It’s pretty bulletproof. Doesn’t rely on Windows domains, anything like that. Speaking of which, the guys over at Madeira Data, Guy Glantser, just wrote a blogpost on building Always On Availability Groups without a domain controller. He does a three-node Always On Availability Group with no domain controller just in work groups. The checklist to do it is probably ten pages long if you go print it out. So, Guy Glantser. We’ll put that link in the show notes. It’s actually in this week’s Monday links too as well. I was like everybody needs to see this. It’s crazy. Thanks, everybody, for hanging out with us at this week’s Office Hours. We’ll see you guys next Wednesday. Adios, everybody.

Previous Post
Memory Grants and Data Size
Next Post
Using Trace Flag 2453 to Improve Table Variable Performance

1 Comment. Leave new

  • There’s one guy who somehow manages to open a transaction with “Microsoft Office 2010” which never closes and uses tempdb, so the drive fills up and I get paged for it. He manages to pull this off almost exclusively on holidays.


Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.