Blog

Windows Update Now Delivers SQL Server Cumulative Updates

SQL Server
13 Comments

When the SQL Server Release Services team said they were going to start treating cumulative updates just like service packs, you may not have expected this part:

Windows Updates
Windows Updates

This means if your Windows admins approve and install patches, they may also be patching your SQL Server a little more frequently than you’re used to. Cumulative updates tend to come out about every 60 days – you can see the cadence over at SQLServerUpdates.com‘s detail pages for each version.

Windows Update isn’t a great way to deliver patches – for example, in the screenshot above, my VM is going to download 1.3GB of redundant patches, possibly going through the patch process twice – once for RTM CU14, and then for the chronologically-later Service Pack 2. In this case, there’s no need to download or install CU14 – just applying SP2 would be the appropriate action, assuming your apps are approved for SP2.

I’m sure some folks are going to protest installing cumulative updates – CUs don’t always go well – but note that these updates aren’t flagged in the Important category, only Optional.

I love this change because it makes SQL Server servicing easier for non-DBAs. It’s hard for sysadmins and accidental DBAs to keep up with available patches, and this helps.


An Introduction to Query Memory

Microsoft has been quietly making some amazing improvements for performance tuners in SQL Server 2012, 2014, and 2016. This week, we’re going to introduce you to just how awesome they are. (They being the improvements, not Microsoft. You already knew they were awesome.)

Using the freely available StackOverflow database, let’s start with a simple query – SELECT * FROM Users:

Simple query with no ORDER BY
Simple query with no ORDER BY

When I hover my mouse over the SELECT in the execution plan, I get a little popup with things like estimated subtree cost, estimated number of rows, and more.

Now let’s add an ORDER BY:

Execution plan with a memory grant
Execution plan with a memory grant

When I hover over the SELECT, there’s a new line in town: Memory Grant. SQL Server needs RAM to sort the list of users before delivering them back to us. It didn’t need memory for the prior query because it was just dumping the data out as fast as it could read it.

You can see more details about the memory grant if you right-click on the SELECT and click Properties:

Memory grant details in the Properties pane
Memory grant details in the Properties pane

When SQL Server builds your query’s execution plan, it has to guess:

  • Desired Memory – how much this plan wants in a perfect scenario, in kilobytes. Cached as part of the cached execution plan, and will remain consistent for all executions of this plan.
  • Granted Memory – how much this execution of the query actually got. This can vary between execution to execution depending on the server’s available workspace memory at the time. Yes, dear reader, this is one of the reasons why sometimes your query is slow, and sometimes it’s fast. This value is not cached as part of the plan.
  • Grant Wait Time – how long this execution of the query had to wait in order to get its memory. This relates to the Perfmon counter Memory Grants Pending. Not cached, since it varies from execution to execution.
  • Max Used Memory – how much this execution used, and also not cached obviously. This is where things start to get interesting.

SQL Server’s query grant memory estimates are really important.

If it estimates too little memory, then as your query starts to bring back more and more data and it runs out, then you’ll be spilling to TempDB. That’s why we freak out when user-defined functions only estimate 1-100 rows will come back, or when table variables only estimate 1 row will come back.

As we join those objects to other things in our query, SQL Server may wildly underestimate the amount of work it has to do for the rest of the tables. You can see this in action in my Watch Brent Tune Queries video from SQLRally Nordic in Copenhagen.

On the other hand, when SQL Server estimates too much memory is required, we have a different problem:

  • Your query may wait a long time just to start because that estimated memory isn’t available
  • SQL Server may clear out a lot of otherwise-useful memory just to run your query
  • Other people may have to wait for memory because a ton got allocated to your query

You can see this one in action in Kendra’s AdventureWorks query that estimates bazillions of rows. It estimates 10.5 terabytes of data will be handled, when in actuality it’s only 292MB.

But we’ve been flying blind – until now.

Until now, it was really hard to troubleshoot these issues. While the plan cache does show us how much memory a query would want, it didn’t show us how much was actually used.

We could try running the query to get the actual plan, but even that didn’t always get us the right answer. If we were suffering from parameter sniffing problems, one set of parameters might work fine while the other would produce catastrophic underestimates or overestimates.

We could try using Profiler or Extended Events to catch spills or huge grants as they happened, but…let’s be honest, you’re not doing that. You’re the kind of person who barely finishes reading a blog post, let alone starts firing up a tool before problems happen.


Cardinality Estimator Trace Flags and Links

SQL Server
13 Comments

This comes up during Office Hours once a week

Which pretty much gives it a 100% hit rate. I’m writing this mostly to have something to reference when people ask.

Trace flags

If you restore a database on a SQL Server 2014 or greater instance, you have choices to make. You can either set the compatibility level to 120 (or 130, if 2016), and your queries will start using the new CE. If you leave it at a compatibility level below 120, it will continue to use the old one. You can change this at the query level by hinting with Trace Flags.

9481 will force the old CE in >= 120 database
2312 will force the new CE in a < 120 database

This gives you the ability to test queries, ALL OF YOUR QUERIES, with the new optimizer to check for regressions.

What’s the difference?

The new CE no longer assumes data independence. That means if you were searching a city and state combination, it would make no fuss about the relationship between the two. That doesn’t quite make sense, does it? The new CE aims to fix that, but it’s not always right either.

For more detailed write-ups on the inner workings, here are some links:

Optimizer guarantees

Microsoft has also made guarantees that you’ll no longer face regressions when upgrading, because you won’t be using the newest CE unless your database compatibility level is set to the latest version. If you’re on 2016, you get some extra power over this with the Query Data Store.

Thanks for reading!

Brent says: this is one of the features we’ll be demonstrating during the Performance Overhead of Various Features session at Dell DBA Days 2016. Just by flipping one switch at the database level, you can get different query plans – so it’s important to know whether things are getting better or crazy better. (Okay, or also worse. But let’s think positive.)


Dell DBA Days Prep: Country Songs About Databases

Humor, SQL Server
67 Comments

I travel a lot, and I’ve seen a lot of things, but even I was surprised this week when I sat down in a Cracker Barrel Restaurant that had the oddest soundtrack. It turns out there’s an entire sub-genre of country music that specializes in database songs.

While I was sitting there eating my Uncle Herschel’s Favorite Breakfast (with country ham and hash brown casserole, of course), I heard tunes like:

Hardware with our name on it
Hardware with our name on it
  • Your Cheatin’ Transaction
  • The Day I Lost My Tables
  • Papa Warned Me About Triggers
  • He Left Me with MongoDB
  • God Bless the Transaction Log
  • The Night of the Cold, Dead SAN
  • Save Me from Books Online
  • She Thinks That She’s an Admin
  • Distributed Transaction Blues
  • Workin’ 99.999 to 5
  • Why Won’t You Commit?
  • The SQL Server 2000 I Hide
  • The Best Little Data Warehouse in Texas

Amazing. I had no idea these were even a thing. I’m sure we’ll hear more next week in Round Rock. I might even sing a few for you – register to watch and find out. Leave your suggestions in the comments, and we’ll pick our favorite one Monday, announce it on the first Dell DBA Days webcast, and give the winner a free Everything Bundle.

Erik says: My favorites were “Mamas Don’t Let Your Babies Grow Up To Be Developers”, “Ring Buffer Of Fire”, and “Patching After Midnight”.


Using Stack Overflow Queries to Generate Workloads

One of my favorite questions is, “How can I generate workloads to run against SQL Server for testing?”

Step 1: get the StackOverflow.com database. This database has a relatively simple schema, just a few tables, real-world data distributions, and enough rows that you can generate seriously slow queries. You can use any size, small medium or large, and they’ll all work.

Step 2: get the top user queries. Over at Data.StackExchange.com, users write their own queries to learn things about Stack’s data. I’ve taken a dozen highly-voted queries and turned them into stored procedures, all of which take a single integer parameter. This way I can call ’em with a single random number and get constantly varying queries.

Step 3: call them randomly with SQLQueryStress. The last stored proc in the above script is usp_RandomQ, which uses a random number generator to pick one of the stored procs and run it. I use a recompile hint on usp_RandomQ because I don’t want his metrics sticking around in the plan cache – the real workload is the various stored procs that he calls. Just set up SQLQueryStress to repeatedly call usp_RandomQ from, say, a dozen threads, and presto, your SQL Server will fall over.

I use these 3 in combination all the time in my sessions. Enjoy!


[Video] Office Hours 2016/07/27 (With Transcriptions)

This week, Brent, Richie, Erik, and Tara discuss speeding up performance, how to make SSRS highly available, what to do with SSIS packages when making upgrades, Windows Core, and how to hide data from DBAs when they have SA access.

Here’s the video on YouTube:

Office Hours Webcast - 2016/07/27

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Office Hours Webcast – 2016-07-27

 

Should I use a covering index or an indexed view?

Brent Ozar: Shaun asks, “I have a number of tables that get loaded every night in an ETL process. Most queries only use a limited number of columns and rows from these tables. Should I use a covering index, an index view, or just put that data in a separate table?” Oh, it took me a minute. I had to read it a couple times. So most of your queries only use a few columns in a wide table. So how do you go about making that kind of stuff fast? I should start picking people, and Tara is reading it, so I’ll pick Erik.

Tara Kizer: Yeah, I’m still reading it.

Erik Darling: So an index view doesn’t really help you there. There’s a lot of rules around index views and it’s just hitting a single table. Index views really shine when you need to sort of aggregate stuff and go crazy. So I wouldn’t recommend an index view there just because of the added complexity. Also, there’s a lot—well not a lot of—but there is some overhead when you insert or modify in index view. We have a blogpost about that hanging out somewhere on the site I believe Kendra or Jeramiah wrote it. But there is that. So for you, my good man, I’m going to go with a covering index to speed up those queries. And if they really only use a limited number of columns and rows, you may even be able to get away with a filtered index to further drop down the amount of indexing you have on this table that’s getting cabobbled every night.

Tara Kizer: I was going to suggest a filtered index as well.

Erik Darling: Ha, I’m awake this week.

Tara Kizer: If it’s the same rows at least amongst most of the users.

Erik Darling: Yeah. There are some funky rules around filtered indexes too and we have many blogposts about those on the site as well so give those a read through if you’re interested in more.

 

100% CPU, ASYNC_NETWORK_IO, PREEMPTIVE_OS_PIPEOPS, Oh My

Brent Ozar: Breck says—Breck just throws a whole bunch of metrics out. It’s almost like he’s playing metrics bingo. He says, “100 percent CPU. Top wait. ASYNC_NETWORK_IO 40 percent. PREEMPTIVE_OS_PIPEOPS 31 percent. Would this indicate an application load spike?” Wow, that’s kind of tricky.

[Laughter]

Richie Rump: I don’t have my Magic 8 Ball here. I need it. Where did it go?

Tara Kizer: If SQL Server is waiting on the app, for ASYNC_NETWORK_IO, almost always means SQL Server is waiting for the application to tell it to keep sending rows. The application is doing row-by-row processing. But if SQL Server is just waiting, it’s not turning through CPU during that time, I mean it’s just sitting there waiting, holding the data—I don’t know, are you sure that it’s SQL Server?

Brent Ozar: Yeah, I like that. Could be something else burning the CPU cycles.

Erik Darling: Yeah.

Tara Kizer: And the preemptive OS is external, it’s an OS process for that preemptive wait.

Erik Darling: Yeah, so one interesting thing I’ve seen is that PREEMPTIVE_OS_PIPEOPS sometimes correlates to people using xp_cmdshell for stuff. So if you see high waits for that and someone is using xp_cmdshell, you could be waiting on xp_cmdshell to fire off some other application on some other commands to the OS. So I’d be careful there as well.

Brent Ozar: Good point.

 

Is 140MB usage in TempDB a problem?

Brent Ozar: Shaun says, “I have a SQL 2012 R2 standard cluster. It’s got about 250 gigs of databases. Tempdb has got eight files, 16 gigs of space. During business hours, tempdb only uses like 140 megabytes, is that a problem?”

Tara Kizer: It is not a problem. You could probably shrink down your files but 16 gigabytes is really small. Disks are cheap, right?

Brent Ozar: If it’s a problem, if you would like to fix it, just go create a bunch of tables or queries that dump stuff into tempdb. That problem will be solved right away.

 

Idle Chit-Chat

Brent Ozar: Next up, we have J.H., asks, “Have you had a chance to check out super soldier Captain Randy Cramer as he looks and sounds like your twin clone?” I have no idea who that is, so the answer is no. Immediately Tara goes and takes her headphones off so she’s got to have like a super soldier Randy Cramer toy or something like that. No—is that true? No, she’s not. Tara, I thought you were going to come running back with a super soldier Randy Cramer thing.

Tara Kizer: No, sorry. I was turning off my fan in case that’s why my audio is bad.

Brent Ozar: Funny. Wes says, “Tara sounds like she’s in a bottle.” That’s because she is in a bottle. She lives in one of those build a ship bottles, it’s just a really large bottle.

Erik Darling: She is a resident SQL genie.

Tara Kizer: Does it sound better now?

Brent Ozar: No, I think it’s some kind of like USB audio thing. It’s like echoey but it might be that your gain is up too high or something.

Tara Kizer: I have no idea what to do.

Brent Ozar: Yeah, I know. It’s not that bad. It sounds fine, it’s not that bad. It’s like you’re literally in a bottle.

 

How do you make SSRS highly available?

Brent Ozar: Let’s see, Tommy says—Tommy is like [inaudible 00:04:24] initials champion here. SSRS, SSIS, HA, SQL 2016. All right, “Can you recommend articles to provide a good way to accomplish HA with SSRS?” How do you guys make SSRS highly available?

Erik Darling: I use Tableau.

[Laughter]

Erik Darling: Get my own Tableau server, hang out with that.

Brent Ozar: That’s just mean, man. That’s like Power BI, everybody then has Excel at that point, it’s highly available. So SSRS is not cluster aware, so what you usually do is you stand up a bunch of little cheap VMs because SSRS also isn’t really resource intensive. A bunch of little cheap VMs and then you throw a load balancer in front of it, an F5, an IP… I’ve forgotten what the IP deal is. Then you just round robin across those. If one of them dies, who cares, it doesn’t matter. Licensing isn’t cheap on that but when you’re licensing VMs at the host level then it’s not quite so bad.

 

Is it common to see 8-10 instances on a 3-node cluster?

Brent Ozar: Ryan says—I need to take a deep breath as I ask Ryan’s question—“Our normal configuration is a large three-node cluster with about eight to ten instances setup as failover clustered instances.” I can already see us all kind of like itching away from the webcam.

Tara Kizer: I have him beat on that by the way, I am embarrassed to say.

Brent Ozar: What was your typical when you were dealing with stuff?

Tara Kizer: No, no, I have him beat as far as number of instances on a cluster. I had a four-node cluster with 11 instances. It wasn’t even the worst I saw. It’s the worst I implemented. I was forced to do this.

Brent Ozar: “I was young. I needed the money.”

[Laughter]

Brent Ozar: He says, “I’ve been told by a few different people that this configuration isn’t very common.” You have now been told by four more people. “Some vendors have scoffed at it. Is it really that rare in favor of say a two-node with a single default instance that runs as active/passive?” Think of it as like a bell curve. Normal people are out there and you’re definitely on the edge of the bell curve. It’s fairly unusual.

Tara Kizer: Yeah, I mean the reason why I hated it back then is because we were on SQL Server 2005 and in order to install anything, SQL Server service packs, whatever, all four nodes had to be in agreement that didn’t have to reboot again. It was insane to have to install a new instance or a service pack or hotfix. I had to reboot all four servers, usually like five times each just to get the installation to finally agree, okay, all four nodes are finally fine. It was horrible. But starting with 2008, you could do one node at a time. But yeah, what’s the purpose of having eight to ten on a three-node cluster? Why not just have several two-node clusters instead?

Brent Ozar: It’s a party.

Tara Kizer: Or, do you have duplicate database names, is that why you’re having to spread out across SQL instances? Why not just put them on a three-node cluster, use two SQL instances and then have all the databases from all eight to ten on those two?

Brent Ozar: Wow, you just blew my mind with the duplicate database name thing. I’m like, why would anyone ever do that—then [makes explosion sound].

Erik Darling: Oh, yeah. Yeah, because we’ve dealt with a couple of clients who use vendor products who all have their vendor name as the database name and they’re like, “No, we can’t separate things out. We can’t do these things because everything is going to this database.” It’s like, ugh, why?

Brent Ozar: That’s incredible.

 

Why does sys.dm_exec_requests show suspended?

Brent Ozar: Greg asks, “Why does sys.dm_exec_requests return a status of suspended when a query is still executing?” That’s a really good question. The first thing I can think of is like a BEGIN TRAN. Somebody does a BEGIN TRAN, executes something, and now we’re suspended. What else would do it?

Erik Darling: If it’s being blocked by something else, would it be suspended? Or is that still runnable?

Brent Ozar: Oh, I bet you’re right, I bet if it is blocked.

Erik Darling: Yeah, that would be my first guess.

Brent Ozar: I like that.

Erik Darling: Because it’s like, suspend is like when it’s waiting on a resource or something else to go through I think.

Brent Ozar: Oh, he says, “It’s in a query that runs in a restore.” Oh, I wonder if it’s waiting for file activity? Or if it’s part of like, I don’t know, or it could be extended store procedures. I wonder if it’s doing something preemptive?

Erik Darling: Yeah, I would make sure if it’s during a restore, I would make sure that I have instant file initialization turned on to limit the amount of things that I would have to wait on during that process. You do that by giving the SQL service account perform volume maintenance tasks permissions and the security policy user rights assignment, secpol.msc, for the curious.

Brent Ozar: Nice.
Brent Ozar: Scott sends in a URL for super soldier Randy whatever-it-was, Randy Cramer…

Erik Darling: I ain’t clicking that.

Brent Ozar: …who claims to have spent 17 years on Mars and spending three years serving on a secret space fleet. You know, we do look a lot alike in this webcam and his picture. I do have to give it to you. Suddenly, I feel like I need to put on a nicer shirt. J.H. says, “After—”

[Erik laughs]

Brent Ozar: You looked at the picture, didn’t you? We are disturbingly alike. That’s totally going in the webcast notes. I’m going to reply to all just so everyone can see it. “This is Randy…” So then everybody can go click on that link. So it’s now in the answered questions for us.

 

Why can’t I shrink this file down from 5GB to 2GB?

Brent Ozar: J.H. says, “After I shrink a datafile to its minimum size…” Stop doing that. “I.e., five gigs within the Management Studio GUI, how come the space used is smaller afterwards?” Oh, wow. So he did a shrink to five gigs in the GUI but the space used… man, I had to read that three times to understand what’s going on. Let me rephrase J.H.’s question. So he’s got a file and when he used the GUI it says you can only shrink it down to five gigs. “But when I look at the space used, it’s only using two gigs, why can’t I shrink it down further?” I bet it’s a log file.

Erik Darling: Yeah, so check in sys.databases and see if there is a log_reuse_wait_desc. So just like look like database name and log_reuse_wait_desc from sys.databases and see if you’re waiting on something weird. Like if you used to have replication but you didn’t remove replication right, it could be waiting on that. If you used to participate in mirroring, if you haven’t taken a log backup, there are a lot of reasons why it may not be. You could also just try running a manual checkpoint and seeing if you can shrink it down if the database is in simple recovery.

Tara Kizer: Why are you trying to go from five gigabytes to two gigabytes is my question. That’s just ridiculous. I can understand if it was 500 gigabytes and maybe you could get it down to 200 gigabytes, but from five to two, that’s just—it just seems like a waste of time to me.

Brent Ozar: He’s running SQL Server on a Raspberry Pi, he’s only got so much space left on his thumb drive. He needs to free that up.

 

What should I do with my SSIS packages when I upgrade from 2008 R2 to 2014?

Brent Ozar: Anker says, “Hi, I am Anker.” Hello, I’m Brent. Actually right now, all of us are Brent. “We are upgrading SQL 2008 R2 to 2014. We have some SSIS packages. Should I open them in SQL Server 2014? I’ve heard that if I open them in 2014 it’s going to update them to 2014. What should I do with my SSIS packages if I’m doing this big, fancy upgrade process? Should I move them to the 2014 server or leave them on 2008 R2?”

Tara Kizer: If you open them up in Visual Studio that’s when in the newer version of the tools, that’s when it would upgrade. And you don’t have to save those changes. But I mean, where are you opening up these files? If you modify your SSIS packages using Visual Studio 2008 and just change the data source to point to the SQL 2014 new SQL instance, save that, and you’re good to go. Just deploy that 2008 package to wherever your 2008 SSIS repository is.

Brent Ozar: This is one of the reasons I like having a separate server for it too.

Tara Kizer: Yeah.

Brent Ozar: Build your own SSIS server, leave it in a corner, do whatever you want with it on a separate version upgrade plan.

Erik Darling: This is one of the reasons I like having Tara, because I still haven’t opened SSIS.

Tara Kizer: I used it quite a bit at my last job for [inaudible 00:12:07] type tasks.

Richie Rump: I’m systematically forgetting all of my SSIS information.

 

How common are database snapshots?

Brent Ozar: Ryan says, “I’m on database snapshots. Have you guys used database snapshots? How common are they to see out in the wild?”

Erik Darling: Very uncommon.

Tara Kizer: I’ve used them.

[Speaking at same time]

Tara Kizer: I used them a couple of times. I think on Twitter or on the [inaudible 00:12:32] mailing list there’s people that really like to use them and revert to them if they run into problems with an upgrade. We used them on the asynchronous database mirror for some internal users to run their horrible queries against it so they wouldn’t touch production.

Brent Ozar: Yeah, I’d have to say out of every SQL Server I’ve—we have a warning in sp_Blitz that tells us when there’s database snapshots and I can’t remember the last time I saw that warning. But that probably has more to do with the fact that people call us when they’re having performance problems and they probably delete the snapshots in order to avoid performance problems.

Tara Kizer: I’ve seen it a couple times in Blitz and then I immediately, I just ignore them, because there’s really no issue with having a snapshot. I mean as long as you’re refreshing them, you’re not just leaving them out there.

Erik Darling: Yeah, I’ve thought about using snapshots on a couple occasions for things but my main kind of gripe with them was that there are point and times when you create them and then if you want to keep refreshing them, then you have to create a new one and sort of drain users over. It just seemed like way too much of a process for what I was getting out of it.

 

More About Shrinking Databases and Rare Steaks

Brent Ozar: J.H. says, follow up to his shrink question, he says, “The database was actually 80 gigs in size and he was shrinking it down to five gigs.” Five gigs is good. You can stop there. That’s cool.

Tara Kizer: Yeah.

Richie Rump: Yeah.

Erik Darling: Yeah.

Tara Kizer: Yeah, that makes more sense.

Erik Darling: I have pictures bigger than that.

Tara Kizer: It’s all the detail on the tattoos.

Erik Darling: Yeah, pretty much. I get real close up on those food pictures. I want Brent to see every grain.

Richie Rump: Every spice.

Erik Darling: I want Brent to be able to see my steak’s thoughts when I take a picture of it.

Brent Ozar: Because it’s still alive too.

Erik Darling: Yeah, it still remembers its name. That’s how rare it is.

 

How common is Windows Server Core?

Brent Ozar: Jeff says, “Have you any of you guys stood up a SQL Server instance on Windows Core?” I’ll raise you guys on that question. Have you ever used Windows Core?

Tara Kizer: I have.

Brent Ozar: Tara has, yeah.

Tara Kizer: It was a proof of concept, it was three jobs ago, and as part of our review cycle we had to have these goals. My manager set a goal to on a Windows Core server install SQL Server, figure out all the different commands you’d have to run to say set lock pages in memory, everything that we would normally do. I went and did that and then the Windows team just wasn’t able to support it yet in our very large environment. We had like 700 SQL Servers at this job. So yeah, we had this proof of concept. I suspect that that job now has Windows Core deployed because that was a big push due to having to patch Windows every month. If you get on Windows Core there’s less patching, you might even be able to skip some months. It was painful.

Erik Darling: Was that the patching next Tuesday people?

Tara Kizer: It was, but it was a different division. So the one division I referred to was the Thursday after and this other division was the following week.

Erik Darling: So from proof of concept to poof goes the concept.

Tara Kizer: Yeah, I think it was on Windows 2012 and it wasn’t as good then. Was Core available on 2008 R2 for Windows? I can’t remember.

Brent Ozar: I think so.

Tara Kizer: Maybe that’s what it was, it was just a lot harder. You didn’t have any GUI whatsoever. On the new stuff, you can bring in some of the GUI.

Richie Rump: Yep.

Erik Darling: It’s just Microsoft’s attempt to be more like Linux.

Brent Ozar: Yeah. I taught a class on it. Dandy Wey of Microsoft was like, “I need someone to proctor something at TechEd, can you teach it?” I’m like, “Yeah, sure, what is it?” “Windows Core.” “Dang it.” I’m like, okay, all right, I’ll learn it. I go and learn it and the whole class—it was all an interactive lab where all the attendees had their own SQL Servers. The whole class, all the Q and A consisted of people raising their hand going, “Can you tell me how to do this in Core? Can you tell me how to do this in Core?” “No, I don’t know, sorry, I don’t know. I have no idea.”

Erik Darling: “I can only tell you what’s in the book.”

Brent Ozar: Yes, I felt like that. You know, like you always go to New Horizons or Global Knowledge and the instructors have no idea how to do anything. I’m like, I’m that guy. I’m that guy. Ugh. I even asked, “Is there anyone more qualified to teach this?” “No, no one else wants it either.”

 

Is it safe to apply all up-to-date Windows patches?

Brent Ozar: J.H. asks, “Way back, Microsoft had issues with a couple of Windows patches which caused Java security connectivity. Do you know if they’ve fixed this and if it’s safe to apply all up-to-date Windows patches?”

Tara Kizer: Just by the question, it says “way back” and now asking can he apply the Windows updates. If there was a bad patch out there, I think the one that you’re referring to that we did encounter and it was fixed like within the next week or two. So I mean if you’re way far behind on Windows updates, yeah, you can go ahead and roll forward all those patches. I can guarantee my other job have installed them all.

 

How do you hide data from DBAs who have SA access?

Brent Ozar: Anker asks, “What’s the best way to hide data from DBAs when they’ve got SA access?” Oh, I love this question so much. So if someone has SA access, how do you hide data from them?

Richie Rump: Put it in Mongo.

Brent Ozar: Hides it from everyone.

Tara Kizer: Encrypt it. Other than that, there’s really no way that you’re going to be able to hide from us. We can get in.

Brent Ozar: And when you say encrypted, elaborate more on what you mean.

Tara Kizer: I mean either application encryption or is it transparent data encryption, scramble the data for us.

Brent Ozar: And TDE, we can still read it. TDE only encrypts it on disk.

Tara Kizer: That’s true, yeah.

Erik Darling: You could try Always Encrypted but there are so many holes in that and ways around the Always Encrypted columns that that’s not even practical…

Richie Rump: Encrypt it before it lands to the database. Salt it, and you’re good.

Erik Darling: And pepper.

Brent Ozar: A little eggnog in there, some cardamom. When you talk about doing this in the application layer, is it like easy to do for developers? Or is this something that there’s a bunch of different approaches to do and people have to figure it out?

Richie Rump: You know, I just had to do this in Node, which is JavaScript. Literally, it was installing a package and then calling one function and everything was encrypted. I was using it for a checksum but the concept is essentially the same. The encryption has gotten a lot easier, as before, I was doing it 20 years ago and there was a lot of stuff we had to do. Now, there’s packages that really help us out to do this a lot easier than we used to have to, so yeah, it’s fairly easy. All the base encryption loggers are in any major language. You should be able to do it without a problem.

 

Wait – why would you want to hide data from DBAs?

Brent Ozar: Wes asks, “Why would you want to hide it from DBAs? They should be trusted with all the data.” Here’s the deal, Wes. You didn’t spell trusted correctly. So if I can’t trust you to spell the word trust correctly, I don’t know that I can trust you. We’ve all been in really weird high security type environments so it would be interesting to hear. My approach with that was usually the developer stored stuff unencrypted in the database, it was stored unencrypted the whole way through. Management wasn’t happy about that. Management didn’t trust any of us but they didn’t have a fast way of getting everything encrypted quickly. So it wasn’t just that they didn’t trust the DBAs, they didn’t trust anybody, especially you know somebody rolls in in a really nice car, like DBAs, and they’re like, “Where did you get the money for that?”

Erik Darling: Beep, beep.

Brent Ozar: “Because I sold the data to the Chinese.” Steve says, “Throw money at the DBA not to look at the data.” Man, we’ll not look at the data all day long for free. Tell us what your data is, we promise we won’t look at it. Dollar a month, that’s cool.

Erik Darling: It is safe from us.

 

Should I use a server level trigger to alert about sysadmin role changes?

Brent Ozar: J.H. asks, he says, “Sorry, I asked this a few weeks ago, but I missed the answer.” All right, so what I’m going to do is I’m going to wait like 60 seconds to see if you stick around just as a test. He says, “Are there any things to be aware of, such as downsides, when I’m implementing a server level trigger that emails the DBA team when the sysadmin role gets modified?”

Tara Kizer: I like it. It sounds good. Because you’re not monitoring an application table here, it’s sending an email. It’s a server level trigger, how often is a sysadmin role getting modified. I’m not sure that this is the approach that we took. We did a lot of auditing at my last job making sure that Windows administrator group, someone didn’t slide into there, the sysadmin role, things like that. We just used PowerShell and queried the data. We just queried it say every hour or every five minutes. We didn’t set up a trigger on the instance.

Brent Ozar: If I had to do it for an auditor, I might get nervous because anybody who knows enough about this to go hack you knows enough to disable your trigger.

Tara Kizer: That’s true.

Brent Ozar: They can turn off database mail. They can break database mail if they want to break your alerting. But if you’re worried about your incompetent coworkers or your sysadmin buddies who could screw something up, I love this. I’m all for this. DDL triggers are another thing, I’m not quite as fond of that. I mean, I’m fond of them, they just have drawbacks and gotchas because if the trigger fails then all of a sudden somebody’s code fails and then you know, it causes a mess.

Erik Darling: I ran into a really interesting one with a trigger recently with snapshot isolation. Wherein the trigger there was a temp table getting populated and a clustered index getting made on the temp table. But under snapshot isolation that wouldn’t work because data definition language like that isn’t snapshotted, so it just failed immediately. So you had to just take the index tree out of the trigger. It was crazy.

Tara Kizer: Was the error clear and that was the issue?

Erik Darling: Yeah. As soon as I took the index tree out of the trigger language, it ran fine.

Brent Ozar: Wow, and folks, if stories like that interest you, go onto BrentOzar.com and search for “bad idea jeans.” We have several blogposts in the past where we’ve done crazy, stupid things with temp tables and triggers because there’s one for one third-party vendor, who Erik and I know very well, did some wild things with temp tables. So I knew that they needed an index so I set up a trigger, a server level trigger whenever the table was created in tempdb if it matched this definition I would go add an index to it immediately. Talk about a bad idea.

Erik Darling: Wow. That was worse than my idea.

Brent Ozar: So bad. I had to like give big, written instructions to the client, “I’m leaving here but you better make sure you know this is in place because it’s going to break your codes sooner or later.” Well that’s all the questions we’ve got this week, everybody. Thanks for hanging out with us and we will see you guys next week on Office Hours. Bye, everybody.

Erik Darling: Adios.


How To Fix Forwarded Records

Some of our clients have very high forwarded record counts and aren’t aware of it until they run sp_BlitzFirst and get an alert about high Forwarded Records per Second. Some of these clients are using Ola Hallengren‘s IndexOptimize stored procedure to maintain their indexes. This brought up a question of whether or not rebuilding a heap fixes the forwarded records or if IndexOptimize is excluding heaps.

What is a heap?

A heap is a table without a clustered index. It is not stored in any kind of order. Think of a heap like a teenager that has been asked to clean their room. The teenager grabs everything off the floor and crams it into the closet. The room looks orderly at first glance, until you examine the room. When everything was crammed into a pile in the closet, it was done randomly and without order. That pile is a heap.

What are forwarded records?

Forwarded records are rows in a heap that have been moved from the original page to a new page, leaving behind a forwarding record pointer on the original page to point at the new page. This occurs when you update a column that increases the size of the column and can no longer fit on the page. UPDATEs can cause forwarded records if the updated data does not fit on the page. Forwarding pointers are used to keep track of where the data is.

How do you fix forwarded records?

You have two options to remove the forwarded records.
1. Rebuild the heap: ALTER TABLE TableNameGoesHere REBUILD;
2. Add a clustered index to the table

Option 1 is a temporary fix. Forwarded records can still happen, so you should monitor for forwarded records and rebuild the table to remove them. Note that rebuilding heaps was added to SQL Server starting with SQL Server 2008. You can’t rebuild heaps in SQL Server 2005 or lower.

Option 2 is a permanent fix. There are some people that prefer heaps for performance reasons. I am not one of those people. Writes on heaps do perform well, but reads do not. Think of the teenager cleaning their room analogy. The teenager can “clean” his/her room quickly but can’t find things easily. Finding one item might not take too much time, but imagine having to find 10 items from that pile.

Add a clustered index to all tables with the exception of staging tables or those used for ETL.

Can you show me a demo of this?

Using the StackOverflow database, I created a heap by dropping the clustered index on the Posts table. Even though this isn’t a small table, the clustered index dropped in just over a minute. I could have used a smaller table or created a new one, but I was too lazy. I save my energy for hiking, plus I always start with Posts.

Examining the table, we see it has 0 forwarded records:

Create some forwarded records:

Examining the table again, we see it now has 36 forwarded records:

Time to rebuild it:

Check the forwarded record count again:

Yippee, no forwarded records after the heap was rebuilt!

Ola Hallengren’s IndexOptimize does not rebuild heaps.

IndexOptimize does not rebuild heaps as of this writing. It specifically excludes them with “indexes.[type] IN(1,2,3,4,5,6,7)” since type=0 is a heap. If you want Ola’s script to do that, edit the IndexOptimize proc and look for this area:

And add 0 in the list of types, as in:

Just keep in mind that you’ll need to edit Ola’s proc each time you update it. To be alerted if/when Ola changes this, watch this Github issue.

Brent says: I was shocked when I learned this. I thought for sure Ola would take care of me. Turns out there’s a few things I still have to do for myself.


HOLY COW. Amazon RDS SQL Server Just Changed Everything.

Platform-as-a-Service users (Azure SQL DB, Amazon RDS) often ask me:

OMG
OMG
  • How can I move my data into the cloud? Can I just take a backup on-premises, and restore up in the cloud?
  • How can I use the cloud as inexpensive disaster recovery?
  • Once I go to PaaS, why can’t I just get a backup file with my data?
  • How can I refresh an on-premises dev server from cloud production?
  • How can I do cross-provider disaster recovery inexpensively, like have a primary in AWS and a secondary in Azure?
  • Why am I locked into just one cloud provider once I go PaaS?

Until now, the single biggest problem has been that both Azure SQL DB and Amazon RDS SQL Server don’t give you access to backup files. If you wanted to get your data out, you were hassling with things like import/export wizards, BCP, or sync apps.

Today, Amazon RDS SQL Server announced support for native backup/restore to Amazon S3.

This is a really, really, really big deal, something Azure SQL DB doesn’t support (and I dearly wish it did). I get even more excited reading this because now Microsoft has to do it in order to remain competitive, and that’ll make Azure SQL DB a much more attractive product for traditional DBAs.

Here’s the use cases that it supports:

“I’m on-premises, and I want to use the cloud as DR.” Just keep taking your full backups as normal, but use a tool like Cloudberry Drive to automatically sync them to Amazon S3. When disaster strikes (or preferably, when you want to test and document this process long before disaster strikes), spin up an Amazon RDS SQL Server instance and restore your backups. Presto, you’re back in business. (I’m glossing over all the parts about setting up web and app servers, but that’s a developer/devops/sysadmin problem, right?)

“I have big databases, and I want to experiment with the cloud, but can’t I upload fast.” Ship your USB hard drive to Amazon with your backups, they’ll copy ’em into S3, and then you can spin up RDS instances. Got more data? Check out Amazon Snowball.

“I’m using the cloud, and I want cross-provider DR.” Run your primary SQL Server in Amazon RDS, schedule regular backups to Amazon S3, and then use a cross-provider file sync tool or roll your own service to push those backup files from Amazon S3 over to Azure or Google Drive. When disaster strikes at Amazon (or if you just want to bail out of Amazon and switch cloud providers), just restore that backup somewhere else. Same thing if you want to refresh a local dev or reporting server, too.

“I’m using the cloud, but I might outgrow Platform-as-a-Service.” PaaS makes management dramatically easier, but both Amazon and Azure set limits on how large your databases can get. Putting your database in Amazon RDS or Azure SQL DB is basically a bet that your data will grow more slowly than their database size limits. If you bet wrong – which is a great thing because your data skyrocketed, usually indicating that you’re in the money – you have an easy transition into IaaS (self-managed SQL Server in the cloud) rather than the painful hell of dealing with data exports.

This right here changes every SQL Server cloud presentation that I give. It’s really that big.


Where Do Missing Index Requests Come From?

Indexing, sp_BlitzIndex
1 Comment

Be honest, here

You don’t care about most indexing problems. Duplicates, Borderline Duplicates, Unused, Wide, Non-aligned Partitions… All you’re here for are the Missing Indexes. Wham, bam, your query finishes in less than a second, ma’am. Take this quarter, go downtown and have a rat gnaw that thing off your face.

And I get it. You’re busy! You have a million other things to do. Take a number, Non-Unique Clustered Indexes. Maybe you’re not all that Unique anyway. Maybe you’re just another dude (or dudette) with a beard (stranger things have happened) and tattoos (maybe a beard tattoo), wearing a plaid shirt (probably also a tattoo), wandering around a major metropolitan area.

You’re in a Mode 3 state of mind

When you run sp_BlitzIndex, you have some Modal options. Mode 3 gives you missing index details. Just missing index details. No Scrolling through warnings about the other horrors you’ve visited upon your poor data. But where do they come from? I’m glad you asked! No, really. I am. Brent says I’m not allowed to sleep until I hit 1000 blog posts. It’s been over a year. Shadows have voices.

Index Matching

SQL is lazy. Horribly lazy. It takes that ANSI-spaghetti query you halfheartedly wrote while waiting for lunch to show up and turns it into results magic. Before it does that, it goes through a process called Simplification, where it rewrites your query a bit by removing contradictions (where col = ‘thhbbbpppttt’ and col = ‘GILBERRRRRRRRT’). Some other stuff goes on during this process, but let’s stick to index matching. Once it has the query you meant to write ready, El Optimizer will think to itself “I’d really like to find indexes like these…” and goes hunting for them. When it doesn’t find its index soulmate, it logs that in the missing index DMV.

But there are problems

Not, like, world-ending, life-deaffirming problems. Just stuff you should be aware of. It’s very query specific, this DMV. It’s not looking at your whole workload, the way you should be doing, like a responsible DBA and/or Developer. Query runs. Optimizer wants index. Request is logged. This index may almost match an index you already have, match another request from a similar query, and sometimes, they’re just a cry for help.

Why just a cry for help? Sometimes the request will only be estimated to improve the cost of the query by something like 50%. Or less. Way less. And much like swallowing a handful of Good & Plentys, these requests are just a sign that something is really horrible. So horrible that The Optimizer can’t figure out a very helpful index. If you throw functions on columns in joins or where clauses, SQL won’t even bother asking for a index, because what would it do with it? Whole lotta nothin, that’s what.

It will also sometimes ask for some insane indexes. Ever see those requests that are like one key column, and then include every column? That’s because someone is running SELECT *, or the equivalent in EF, because they’re lazy and didn’t just pick the columns they actually need. It will also ask for MAX columns as includes, because it doesn’t check those things first. It’s fine, it just becomes your job to look at these things first.

No missing indexes, I’m a genius

Not so fast, Slowpoke. Kendra has a couple posts at her new home here and here that talk about what clears out index DMVs, and fixes for some long-outstanding bugs coming along.

Thanks for reading!


Logging Activity Using sp_WhoIsActive – Take 2

Monitoring, SQL Server
47 Comments

We’ve already covered how to log activity using sp_WhoIsActive, but I wanted to share how I’ve seen it done in the wild. The difference between the two methods is that this version logs everything to the same table and is for continuous activity logging (scheduled job), whereas the other version uses a new table each time it logs activity and is for when you are working on a problem.

WHAT IS IT?

If you don’t know about sp_WhoIsActive already, you’ve been living under a rock. It’s just the most awesome, free SQL Server tool there ever was. Adam Machanic, the author of sp_WhoIsActive, says it “is a comprehensive DMV-based monitoring script, designed to tell you at a glance what processes are active on your SQL Server and what they’re up to.”

I’m a fan of using free tools that can help me do my job more effectively. I’ve used sp_WhoIsActive even when I’ve also had paid tools monitoring production.

GET IT

You can download sp_WhoIsActive here.

SCHEDULE IT

Create a new job and plop the below code into the job step, modifying the first 3 variables as needed. The code will create the logging table if it doesn’t exist, the clustered index if it doesn’t exist, log current activity and purge older data based on the @retention variable.

How often should you collect activity? I think collecting sp_WhoIsActive data every 30-60 seconds is a good balance between logging enough activity to troubleshoot production problems and the storage needed to keep the data in a very busy environment.

You could also break the DELETE out into its own job so that it runs once a day. If you do this, loop through the DELETEs in batches rather than all at once.

ANALYZE IT

This is just a starter query. I generally am looking at a specific time period based on a complaint from a user that queries were running slow earlier in the day. I once used the WhoIsActive logging table to identify who was causing tempdb to grow absurdly big. By the time I got to it, tempdb was 500GB in size! I switched my query to include tempdb_allocations and tempdb_current plus ORDER BY tempdb_allocations DESC, and it was immediately clear a business user was querying ALL.THE.THINGS.


sp_BlitzFirst @SinceStartup = 1 Shows You Wait Stats Since, Uh, Startup

SQL Server, Wait Stats
0

It’s this easy to get a cumulative recap of wait stats, file stats, and Perfmon counters:

sp_BlitzFirst @SinceStartup = 1
sp_BlitzFirst @SinceStartup = 1

In the above example, here’s what the wait stats columns mean:

  • Sample Ended – when it ran
  • Hours Sample – the number of hours your SQL Server instance has been up
  • wait_type – the name of the wait stat
  • Wait Time (Hours) – in descending order, with the biggest waits first
  • Per Core Per Hour – this is SQL Server’s tachometer
  • Number of Waits and Avg ms Per Wait – sometimes waits strike all the time in small increments, but sometimes (like locking) they show up in small quantities, but very long times

If you want to see wait stats right now, just use it without any parameters:

sp_BlitzFirst
sp_BlitzFirst

First, you get the top result set – a snapshot of what queries are running right now. Then sp_BlitzFirst takes a 5-second sample of your current waits, and returns a headline news set of what performance issues you’re facing right now.

Go grab the latest release of our First Responder Kit and start troubleshooting performance the easy way.


[Video] Office Hours 2016/07/20 (With Transcriptions)

This week, Brent, Richie, Erik, and Tara discuss new Microsoft certs for data science, index rebuilds, replication, patching SQL server, and what to say to a manager that says he needs an active/active HA solution but has no idea what that means.

Here’s the video on YouTube:

Office Hours Webcast - 2016/07/20

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Office Hours Webcast – 2016-07-20

 

How should I manage a thousand database schemas?

Brent Ozar: Joe O’Conner asks, he says, “Any suggestions on tools or methodology for standardizing a schema to accommodate a thousand plus instances that may all be slightly different?” Joe, let me ask a follow-up question too if you can return back and ask, do you want them to be different or do you want them to be the same? That will probably influence the answer that we give you there.

 

Why is replication showing high CPU use?

Brent Ozar: Let’s see here, next up. James Kelly says, “In SQL Server 2005…” I’ve got to stop using last names because somebody is going to ask a question they’re ashamed of—like James Kelly—who has SQL Server 2005. Poor guy, god bless him. It’s not a bad database platform, it’s not horrible. It’s better than 2000 was. He says sp_replmonitorrefreshjob causes high CPU when suspended mode on the distribution database. What should I look into in order to reduce CPU? Tara, I’m even sorry I read that but you’re the only one here that would be remotely qualified.

Tara Kizer: Yeah.

Brent Ozar: You know anything about that? I don’t know anything about that.

Tara Kizer: I haven’t encountered that and I used replication for, I don’t know, ten years or so, what is meant by suspended mode on the distribution database? Is that when the jobs are not working?

Brent Ozar: I think he’s doing sp_WhoIsActive. I bet he’s doing sp_WhoIsActive.

Tara Kizer: Oh, but suspended doesn’t necessarily mean anything. “Causes high CPU…” Are you sure it’s that refresh job that’s causing the high CPU? If it is, I would start taking a look at the replication monitor and looking for what kind of latency you have in the queues. I would imagine you’ve got a large backlog and maybe that’s why it’s taking it a while to pull up the data.

Brent Ozar: That makes sense. So my webcam probably just froze. I all of a sudden got pouring down rain here, you could hear it in the building, just all of a sudden it pours down and immediately my internet dies. I’m like, oh, it’s going to be like that. It’s going to be like that.

Richie Rump: What kind of backwater [inaudible 00:01:41] are you in?

Tara Kizer: Yeah, I was going to say. Richie is like, it rains all the time.

 

Brent Ozar: Dennis says that MS certs are great at uncovering what areas you don’t know. Yeah, but also what areas you don’t need to know.

Tara Kizer: Or what areas you do know.

 

What do you think about the new Microsoft degree for data science?

Brent Ozar: Greg asks an interesting follow-up question. “What do you guys think about the new Microsoft certs for data science?” Did you guys see that new degree?

Erik Darling: Yeah.

Brent Ozar: Richie clearly saw it. Erik saw it.

Tara Kizer: I only heard about it, I haven’t seen it.

Brent Ozar: I didn’t look at it, I just saw the headlines and went, “Hmm, what?” So Richie, you made such a face. There must be a good reason you made that face, or a bad reason.

Richie Rump: Calling it a degree is weird to me. It’s just strange. I don’t understand it. I spent five years getting my four-year degree, so am I going to spend six years trying to get my seven-year MS degree? I don’t understand. I don’t understand it.

Erik Darling: Is it really that long term? I thought it was like hours.

Brent Ozar: Well, but it’s also the kind of thing you’re going to do spare time. It’s not like, I hope to god you’re not going to take out a loan and go do that kind of thing. Go spend $1,000 on exams or something.

Erik Darling: It was like $500 or whatever, or something. I don’t know, I didn’t read all the fine print.

Brent Ozar: There’s going to be like no training on it, third party for a while, because it’s all brand new. And that platform changes so fast. I’m like, I don’t get that. So I’m not keen on that.

Richie Rump: I mean basic stuff it doesn’t, but I mean is that what they’re going to teach? They’re going to teach me the basics of data science independent of their platform? Or are they going to say, “This is how we do it at Microsoft?”

Erik Darling: … golden hammer from Microsoft.

Brent Ozar: There was a statistics course in it, or like one statistics test. But then the rest was heavily Microsoft. Yeah, exactly. I was like… especially if, you want to know more than one toolset if you’re going to do the data science thing. In the database community, you can know just Microsoft and you can make a wonderful living for yourself. It’s fantastic. Data science is not—you don’t get to learn just one tool, you’ve got to learn a lot of tools.

 

Yadda Yadda Yadda

Brent Ozar: James says he’s waiting for his boss to get a credit card and register for the Senior SQL DBA class next week. Cool, very cool. We’re doing an online class there for four days.

 

Brent Ozar: Nate Johnson says, “SQL Server Management Studio 2016 says it won’t work with 2005 instances but it’s working okay for me. Have you guys used SSMS 2016 with 2005?” I will answer that one. No, we were doing La Cucaracha the day that 2005 was out of support. Erik was doing a countdown for that date. He could not wait. So we haven’t used it at all. Usually what that means with Microsoft is just that they don’t have it in their test matrix anymore. It usually works, there’s edge cases where it won’t but usually you’re okay.

 

Brent Ozar: Thomas says, “I think if you’re new, the certs are good because they show that you’ve been at least exposed to areas of SQL Servers. In my day job, I don’t do any analysis services, reporting services, or integration services so the certs will give me some exposure to them in an organized way.” You can die of exposure, you know that. That’s dangerous.

Erik Darling: Naked and afraid.

Brent Ozar: Naked and afraid, in indexes.

Richie Rump: Unsubscribe.

 

What should I do about an index rebuild that takes 90 minutes?

Brent Ozar: Michael Tilly asks a question that is really inconsistent, multiple parts, and a novella. So if you guys want to read into it, we’re going to read through it. Michael says he has a question about index rebuilds. He works at a large university, actually Brent’s alma mater—I don’t think you’re allowed to say that unless you graduate—and University of Houston for three semesters, so yeah. “I have a large students table…” 35,000 students a year, if I remember correctly. “That has many, many millions of rows of active and inactive students. I have a maintenance plan that does a nightly index rebuild on the…” Holy cow. “On the…” Yeah, already we’re shaking our heads. “It seems to take an excessive amount of time, 90 minutes. Right now we’re rebuilding it online. I’m thinking that maybe I should turn it offline. What should I do?”

Erik Darling: I think you should turn that maintenance plan off.

Tara Kizer: Off, yeah, why are you doing this?

Erik Darling: Because my whole thing with what you’re saying is like you probably get a whole bunch of students once a year.

Brent Ozar: What are you trying to say?

Erik Darling: I’m trying to say that you enroll students once a year.

Brent Ozar: Some of them graduate.

Erik Darling: Yeah, but you enroll students once a year. So like you add in a new crop of students and at that one point, your table might be fragmented. Then after that, you might have some dropouts, but you just change them to inactive. You don’t like delete them from the table. I can’t imagine the students table is getting heavily fragmented aside from enrollment and graduation. Or like even then, graduation is probably just switching to inactive. So I just can’t imagine a way for fragmentation to even accrue that much on a daily basis.

Brent Ozar: I think if they had an index on estimated completion or estimated graduation date, my record would have moved around a lot during the course of me being there. But I think for the rest of the people, not so much.

Richie Rump: So what can he do, Erik?

Brent Ozar: Oh, yeah, good point.

Erik Darling: You would have to go back. I wrote a blog post recently, no surprises there.

Brent Ozar: What?

Erik Darling: Peanut gallery, all of yous—it was about when you want statistics instead of an index but it touched a little bit on why index fragmentation isn’t really that important and up-to-date statistics are far more important. So what I would do in your shoes is I would stop rebuilding the index every night. I would stop rebuilding indexes pretty much in general and then just update the statistics daily and see how you fare. You’ll probably just end up in the same place now where performance is what it is because the statistics got updated, it didn’t reflect all of the fact that you were rebuilding the index every night and wasting hours and hours of your life and precious disk and CPU time.

Tara Kizer: And for the update statistics, because you have several million rows of data in that table, you could probably just get away with like a 20 percent, 30 percent sampling. You probably don’t need to do a full scan on that table.

 

How do I resync AGs when replicas have different drive letters?

Brent Ozar: Nathan says he’s got an availability group and they very rarely have to resync their dbs. They’ve added new databases and “we also have to start syncing those. But we’re not able to do a full sync because the drive letters don’t line up anymore. For example, I have some drive letters that exist on the primary but not on the secondaries. Is there a way to automate this process without doing backups and restores and joining them manually?”

Tara Kizer: You have to do a backup and a restore and say with move in order to get the different drive letters. So, no. There isn’t a different way.

Brent Ozar: Yeah, that blows.

Erik Darling: Rename your drives.

Tara Kizer: Yeah, rename your drives. Script this out so that it’s automated at least, you’re not having to do it manually.

Erik Darling: Yeah, there are actually a bunch of questions on dba.stackexchange.com about how to automate adding databases to availability groups. There are some, not like fully fleshed out scripts, but there were some pretty good outlines of scripts on there about how to get started but you are going to have to make certain considerations for the fact that you have different drive letters and that stuff when you’re writing your script.

Brent Ozar: I want to say there’s a trick with the SUBST command too. Like there’s this command line S-U-B-S-T that can let you make fake drive letters to point to other drive letters. Totally don’t quote me on that but just like google it to see. I’m sure it’s not supported but just as a fun trick.

 

Why isn’t Brent giving money to the University of Houston?

Brent Ozar: Michael from my fake alma mater comes back and says, “As it turns out, we do have constant fragmentation because we keep all our student history.” Yeah, I know because you keep asking me for donations which seems pretty stupid because I dropped out. Why would I give you money if I didn’t finish? He says, “As we constantly update the home address and phone number and other info, so lots of updates.” Listen, you’ve got to unsubscribe me. I’ve asked several times. That’s not true. Although you did bulldoze the dorm that I was in and I was a little pissed off about that one.

 

How do I change account owners on an endpoint?

Brent Ozar: Let’s see here, Thomas says, “We have a former DBA who granted the rights on the service accounts to an endpoint.” I think he used his own personal service accounts on an endpoint, like mirroring or availability groups. “I can’t drop his login without changing that. Is there a way to change which service accounts have permissions on an endpoint without dropping and recreating the connection? I tried ALTER AUTHORIZATION and that didn’t work.”

Tara Kizer: What’s the endpoint being used for? I mean can’t you just rebuild whatever that is? Is it mirroring? Is it an availability group? What is it?

Brent Ozar: Yeah, it should be a brief outage, right? Just recreate it.

Tara Kizer: Yeah, exactly, get a maintenance window for this.

Brent Ozar: Yeah, I like that. And you want to be patching anyway. I’ll also take the crazy thought of, “Why change it?” Leave it under his account. As long as you have the password.

Tara Kizer: Or maybe rename it so that it’s like a service account now and not someone’s—you know, because the [inaudible 00:10:26] will remain the same. So it will appear as a regular account at that point.

Brent Ozar: The service account formerly known as…

Tara Kizer: Yes.

 

More about synchronizing a thousand schema differences

Brent Ozar: Joe comes back with the thousand schema thing. Joe says, “Are there any suggestions or tools or methodology for standardizing to a single schema that will accommodate slight differences in a thousand plus instances without losing any data?” Wow.

Tara Kizer: It’s a thousand plus SQL Server instances and trying to get them all into one database with different schemas? I’m confused what the question is.

Brent Ozar: You know what I bet he’s doing, Erik. I bet he’s doing what your favorite ISV does. Yes, I bet, so we blog about this kCura Relativity, it’s a piece of software out there and customers are allowed to change tables but your deployment scripts still have to work. How do their deployment scripts work, like when they drop a new version of their app?

Erik Darling: The last time I did it, the deployment scripts were in a numbered folder and called by a C# program—by some program. Don’t quote me on the C#. But it’s tough because you have to specifically change only your indexes. The last thing didn’t go so well because there was some custom objects that they didn’t account for that I had to like go back and manually change. So I don’t know that I would recommend that process as an overall process. But, you know, they do alter tables directly from scripts and everything.

Brent Ozar: Run them in order based on their version upgrades. They also have dozens of developers to manage that kind of thing.

Erik Darling: Yep.
Brent Ozar: Lee asks, “I’m curious, why does the model database, the system database model, why does it have recovery set to full?”

Tara Kizer: Well that’s just because it’s the default for new databases so when you create a new database and you don’t provide all of the settings, it uses model as the template and it’s just the default that Microsoft has set. I really wish that they would switch it to simple. If they’re going to have everything be default to full, then how about you default to setting up transaction log backups too?

Brent Ozar: Oh, yes. Yeah, I like that. We should have a connect item for that.

Tara Kizer: Closed. Won’t fix.

Erik Darling: Immediately.

 

Why does my log file grow even in simple recovery model?

Brent Ozar: Lakshmi says, “I have a database in a simple recovery model and we make a lot of changes on say Friday nights. The transaction log grows to like 20 gigs, why is it that my log file is growing even in simple recovery model?” Did I say simple recovery mode at the beginning? That’s Tara’s hot point.

Tara Kizer: It’s one of mine, yeah. Everyone says “recovery mode.”

Brent Ozar: Yeah, simple recovery model.

Tara Kizer: You’re logging of your transactions doesn’t change based upon your recovery model. It’s still the same amount of logging. So if you have 20 gigabytes of data changed, or at least logging, it’s going to be the same with simple or full. It’s what happens at the end of the transaction is what’s different between the two recovery models. With simple, that is cleared from the transaction log after it completes, either rolls back or commits. With full, it stays in there until something backs up the transaction log. So the size doesn’t change based upon recovery model.

Brent Ozar: Some yoyo doesn’t BEGIN TRAN, your log is going to be growing for a while.

 

Does antivirus slow down SQL Server?

Brent Ozar: Matthew says, “Have you guys ever experienced noticeably degraded performance as a result of having antivirus installed on a SQL Server?”

Tara Kizer: Yes. Definitely. That’s why you need exclusions in place. At one point we had something like 46 exclusions on the database servers. I think it went even higher. I forget what the number was, 60s or 70s I think. But there’s a list of exclusions that you need to have in place on a SQL Server. It’s SQL Server executable, your MDF files, LDF, NDF, all these… There’s just tons of them and I believe that this is documented somewhere, what your exclusions should be. I don’t know where it is though.

Brent Ozar: There’s a KB article. If you Google/Bing whatever “SQL Server antivirus.” Then you site: support.Microsoft.com. If you site: in Google and restrict it to a single site you’ll get the answers just from the Microsoft Knowledge Base.

 

Where can I read about lifecycle management?

Brent Ozar: Graham says, “Other than Microsoft’s content,” so I guess we know how he feels about Microsoft’s content, “are there any good end of life or lifecycle management references for SQL Server versions?” Did you guys ever see any good database documentation around when you should upgrade or how to handle upgrades other than Microsoft’s?

Erik Darling: Most of the ones that I’ve seen reference a lot of Microsoft’s stuff, like using the Best Practice Advisor and the Upgrade Advisor and using those as sort of jump off points for what you should do from then on out. Microsoft does, at least I think, a fairly decent job for the Upgrade Advisor stuff, like giving you the scripted out things that you need to do your migration with.

Brent Ozar: Tara, with all the SQL Servers that you guys managed, how did you do processes for upgrades? Did you like write out, “Here’s exactly what we’re going to do” and which order, or was it like different every time you guys did a SQL Server?

Tara Kizer: When I was at QualComm and I was the primary DBA for a specific division it was really based upon what projects were in the pipeline and if we could get an upgrade placed into that project. If they were doing a major release, maybe we could get that database upgraded. I’m sure to this day they still have SQL Server 2005 instances out there. I’ve been gone from them for about three years but I’m positive just because how many there were when I left. There weren’t any 2000 but tons and tons of 2005. Some of them probably just get stuck there. Not all companies care about not being supported by Microsoft. So what if you don’t have any security hotfixes. If you have a secure environment, you may not care about that type of stuff. Companies really care about getting releases out, making their customers happy. Do they really care about the SQL Server version behind the scenes? The DBA team cares, you know, what do we have to support.

Brent Ozar: That’s why we still at the company we still run Windows 95 here at Brent Ozar Unlimited. It works really well.

Richie Rump: Well I’m on 3.11, that Workgroup edition.

Brent Ozar: I’ve been trying to get him to upgrade that Trumpet TCP/IP stack is just not…

Richie Rump: It’s an oldie but a goodie, man.

Brent Ozar: Oldie but a goodie, that Solitaire, no Minesweeper like 3.11.

Richie Rump: Hot dog theme, that’s what it is.

Brent Ozar: Oh my god, the hot dog theme is correct. So for those of you who ever get out in the community if you ever get the chance to Erland Sommarskog speak, Erland Sommarskog is a presenter in the SQL Server community and he usually uses the hot dog theme on his desktop. It’s just awesome. Comic Sans everywhere, pastel colors.

Tara Kizer: That would drive me crazy.

 

How often should I patch my SQL Server?

Brent Ozar: Adam asks, “What do you guys recommend for patching? What is a good balance of staying up to date with SQL Server patches while not just being on the hamster wheel continuously and doing constant updates? How often do you recommend patching SQL Servers?”

Tara Kizer: So based upon these large corporations that I’ve worked with, most of them want to do monthly patches. As the person who has to do the patches, I don’t want to do monthly patches because it is painful when you have to do 100 servers, 700 servers. We had a lot of servers that had to be patched. It wouldn’t just be one person having to do all this patching, it would have to be a team of people. It was disruptive. There was always some issue on some server that patches would not work, maybe failovers weren’t working that day. There was always some issue. It was just painful. It took way too much time. I wanted to do quarterly but large enterprises want to ensure that their environments are secure. Now, that goes against what I just said about SQL Server 2005 and not having security hotfixes—and these are the same corporations I’m talking about. But yeah, as one of the people on the patching team, I did not want to do it monthly. I would much prefer quarterly. Oracle doesn’t do monthly. Or, I should say, Oracle teams don’t do it monthly. I don’t mean just one company. It’s very common for them to do quarterly or twice a year. Why is it so important on Microsoft? I mean I realize it’s because of all the bugs there are as far as security goes.

Erik Darling: The really good thing about quarterly patches too is that it gives you some time to apply them to a development environment and let them burn it and make sure that they don’t do anything too funky. Because you know a lot of times a patch will get released and within the first 24 hours a lot of people will be raising their hands like, “It broke something.” So it gives you time to test things out as well. Quarterly is a good span of time to run across something awful.

Tara Kizer: Yeah, one of the companies I worked for, the patches came out on Tuesdays, “Patch Tuesdays” was what everyone called it for Microsoft patches. Then by Thursday we had to have installed, in production. I mean that was very aggressive. We ended up switching it to the following Tuesday, but still, it’s still very aggressive.

Erik Darling: So was that the “see you next Tuesday?”

Tara Kizer: Yeah. I mean, yeah, they’ve been installed in a test environment and maybe there’s some automated testing that has occurred on those, but that’s not enough time to burn those patches in.

Brent Ozar: That would suck.

Erik Darling: That’s horrible.

 

Should I use Desired State Configuration for SQL Server?

Brent Ozar: Graham says he’s trying to move to DSC, Desired State Configuration for SQL Server upgrades and migrations. I haven’t seen anybody successfully use DSC for SQL Server. Stack was trying to use it and failed. They had to give up on it and they have really bright PowerShell guys over there. I don’t think it’s quite there yet for SQL Server. It makes total sense for Windows, just not quite there yet for SQL Server.

Tara Kizer: I’ve never even heard of that.

Brent Ozar: The idea is you programmatically declare what you want the server to end up as and then Windows takes care of patching, installing the right features and getting it to that point for you. It’s not quite there.

Tara Kizer: Isn’t that what System Center Configuration Manager does? I mean it just knows what you’re missing.

Brent Ozar: Yeah, except you’re also supposed to be able to declare in code the whole state you want to get to, server names, what IIS tools you have installed, SQL Server features, where you turn things on and off. Yeah, it’s pretty tough.

 

Thanks for putting the Everything Bundle on sale

Brent Ozar: Lakshmi says, “Thank you, I just made my manager purchase the Everything Bundle from your training. Hope I can learn much more from there.” Well what are you doing on our free webcast? Go over to the good stuff. Here we’re just answering questions. These people don’t know what they’re doing. Go learn the good things. No, but thank you for picking that up. That’s awesome. We have a good time building that stuff.

 

Why does the Database Tuning Advisor only tell me to drop indexes?

Brent Ozar: Anker says, “Hi, I’m using the database tuning advisor in SQL Server 2014 for recommendations but it’s only giving me drop index recommendations.” [Laughter]… listen, I was trying to keep a straight face while asking this question. “Does the DTA ever provide any index suggestions too?”

[Laughter]

Tara Kizer: Why are you using DTA?

Brent Ozar: So why should he not use the DTA?

Erik Darling: DTA is craaaaazy.

Brent Ozar: How else are you going to get those?

Erik Darling: We have a tool called sp_BlitzIndex which is very helpful for finding missing index requests. There are some differences between it and DTA. Missing index requests are fueled by a process in the query optimization process called index matching. The stuff in the DMVs that gets logged is per query. The stuff that DTA logs is per workload. So you might have something in your settings that’s not giving you missing index requests for your workload that you can do both and you can click a bunch of stuff where you want suggestions, even for index views I found out recently. So that’s the difference between what they each do. But if it is only generating drop index recommendations and you have missing index recommendations enabled, I would really like to see what comes out of sp_BlitzIndex for you because that’s a wild time.

Richie Rump: Yeah, I have a blogpost that I was working on and it introduces a script that prevents the DTA indexes from being created.

Brent Ozar: So the blogpost, I don’t think it’s gone live yet.

Richie Rump: No, I was working on it. I haven’t loaded it yet. Still writing it. But the script is—I wrote the script. The script is great.

 

Nice video on the Dell DBA Days blog post

Brent Ozar: Greg Smith says, “Great video on the Dell DBA Days blogpost today.” We’re excited about that. We’ll all be down in Round Rock sweating through our shirts in August. We’re going to try and stay in the air conditioning.

Richie Rump: What else is new, Brent?

Brent Ozar:          What else is new.

 

What would you say to a manager who wants…

Erik Darling: There’s a really good question up top, if you don’t mind.

Brent Ozar: Oh yeah, who asked?

Erik Darling: It’s a Scott Kelly question. “What would you say to a manager that says they need an active/active HA solution but has no idea what that means?”

[Laughter]

Erik Darling: I think the most diplomatic way to handle that is to ask them to sketch out what that would look like and provide a budget for it. Then go from there because—just don’t walk into your manager and say, “You don’t know what that means.” Ask them to spend some time sketching it out, you know, a Visio diagram or just a whiteboard of what it would look like and how much he thinks it would cost. That’s where I would go from there if I was in your shoes.

Tara Kizer: This is a topic that I had to deal with a few times at one of the companies. Every few years management would be like, “We want to do active/active.” So every single time, we would go to the vendors and say, “What can we do to make this happen?” Here comes the price tag and we’d send it to management and they’d say, “Oh no, never mind.” It is very expensive. We’re talking two million dollars probably.
Erik Darling: In hardware, not just licensing.

Tara Kizer: Yeah, exactly. That’s several years ago last time I looked at it.

Brent Ozar: And Graham, if you get our download pack, if you go to BrentOzar.com and click First Aid up at the top, we have a download pack. We have a high availability and disaster recovery worksheet in there that helps you sketch out how much data you’re willing to lose, how long you’re allowed to be down for. One of the options is a zero/zero failover solution. It’s a million bucks and up. So just be honest and talk through it with the manager. The other thing I’d ask the manager is, “What’s driving that? What is that’s making you want that?” It may just be that they want cost savings or whatever and I’d say, “All right, so can you introduce me to someone else you’ve worked with at one of your past places who can help us design that?”

“Well, I’ve never worked with anything…”

“Oh, me neither. Sounds like it’s a good time for me to go to training.”

 

Are a lot of companies running SQL Server on Windows Core?

Brent Ozar: Let’s see. I had another question in here. Tom asks, “From what your team is seeing, have many companies started using Core installations yet? Windows Core.” Have you any of you guys seen SQL Server on Windows Core yet?

Erik Darling: Nada.

Tara Kizer: I started playing with it as far as what we would have to do to install everything and what does a database team need to learn. But this was back on 2008 Core and it wasn’t very good.

Richie Rump: Yeah.

Tara Kizer: Yeah, so we abandoned it and I’m sure that company is now using Windows Core because we really wanted to get to that to do Microsoft patching because there’s a lot less to patch on Windows Core than there is on the full versions.

Richie Rump: Yeah, same thing here. Where I was at a company and they were testing it. The IT folks really wanted to go Core and the DBAs were like, “Whoa, wait a second. I don’t know how to handle some of this stuff.’” They tested it and they were like, “Maybe not.” So, they abandoned it.

Tara Kizer: Was it Windows 2008 Core where it first was released?

Brent Ozar: Yeah, I think so.

Tara Kizer: Yeah. It was terrible. We were researching how to set like lock pages in memory via code. All of these things that you have to do on the database server when you need to set up a SQL Server. Once it’s there, you probably have some PowerShell stuff. But back then, it was just painful. I mean we figured it all out, but we decided we just didn’t want to support it yet so we were waiting for Windows 2012 Core.

Brent Ozar: If I remember right on 2008 Core you also had to enable TCP/IP through the SQL Server configuration manager. And even that [inaudible 00:25:39] goes through the GUI so.

Tara Kizer: Yeah.

 

Do we have to pay for the First Responder Kit?

Brent Ozar: Anker says, “Hi, I love using your scripts from the First Responder Kit. I want to use them in production to get performance recommendations but my manager stopped me in doing so after reading the license agreement of yours. Do we have to buy it from you?” No, it’s totally free. In fact, if you go to firstresponderkit.org it has the open source license on there. When you go to firstrepsonderkit.org go click on the GitHub link at the top right. It has a license that’s very normal for open source products. You are free to use it in any capacity that you want to use it as long as you keep that copyright in there. If you try to for example bundle it into a paid product and you don’t include that node of where it came from, then it gets a little sketchy but you can use it for—if you want to take over the world with SQL Server using Service Broker and linked servers, whatever you want to use it for, go knock yourself out.

 

How do I grant access to stored procedures for low-privilege users?

Brent Ozar: Nate says—this will be the last one that we take because he says, as soon as I see a question that says, “I don’t know if you have time left,” that means I probably should have read it before I do it. It says, “When I was a young boy…” He says, no, “If I had a stored proc that needs to be executed with a low level user but requires higher permissions and I don’t want to get the low level user…” He wants to give people access to run stored procedures but they need to be peons. What does he need to do? If you go to BrentOzar.com/AskBrent, that sounds like it’s an advice column. It’s not, it’s about a stored procedure but it has a little instruction on there on how you use certificates in order to grant permissions to users to run stuff. Really simple. Sounds horrifying but it’s super simple. That certificate word sounds awful.

Erik Darling: We’re probably going to change that URL at some point now, huh?

Brent Ozar: At some point I need to move all the rest of the documentation over there because people are going to go, “What do you mean Ask Brent? That doesn’t make any sense.”

Erik Darling: Yeah. Which one?

[Laughter]

Brent Ozar: You changed your name on here too now.

Erik Darling: I did, magic.

Brent Ozar: Oh my goodness. Wow. We look professional. Well thanks everybody for hanging out with us. We’ll see you guys next week at Office Hours. Bye, everybody.

Erik Darling: Bye-bye.


New Windows Clustering Course for SQL Server DBAs by Edwin Sarmiento

Clustering, SQL Server
13 Comments

WSFC-SQL-LogoThe SQL Server community has been waiting a long time for an in-depth Windows clustering course on current versions of Windows.

See, AlwaysOn Availability Groups as well as good ol’ Failover Clustered Instances both rely on Windows clustering in order to manage uptime. If you’re going to do an AG or an FCI, you gotta know Windows.

Edwin Sarmiento, a well-respected cluster expert, has brought out an online course, and it’s big. I’m hearing great things about it from folks I know who have gone through it.

Go check out the launch specials he’s running. He’s been running deals that include his HA/DR deep dive course, Personal Lab course, and Azure HA/DR Hybrid Solutions course. If you’re doing clustering, you should go take a look.


Updated High Availability and Disaster Recovery Planning Worksheet

One of the most popular things in our First Responder Kit is our HA/DR planning worksheet. Here’s page one:

Page 1 - how our servers are doing now, and what the business wants
Page 1 – how our servers are doing now, versus what the business wants

In the past, we had three columns on this worksheet – HA, DR, and Oops Deletes. In this new version, we changed “Oops” Deletes to “Oops” Queries to make it clear that sometimes folks just update parts of a table, or they drop an entire database. We also added a column for corruption (since your protection & recovery options are different than they are for Oops moments).

When people first see this worksheet, they usually scoff and say, “The business is going to tell me we never want to lose data, and we’re never allowed to go down.” No problem – that’s where the second page of the worksheet comes in:

RPO/RTO cost range estimates
RPO/RTO cost range estimates

Find the amount of data you’re willing to lose on the left side, and the amount of downtime you’re willing to tolerate across the top. Where the boxes match up, that’s a rough price range of the solution.

In this version, we added an asterisk to a lot of supposedly synchronous solutions aren’t – for example, Always On Availability Groups don’t actually guarantee zero data loss. I still keep that sort of thing in zero data loss because most of the time, it’s zero data loss, but you just need to understand it’s not a guarantee.

I like printing those two pages front and back on the same piece of paper because it helps management understand that requirements and costs are two sides of the same coin. It’s management’s job to pick the right box (price range), and then it’s IT’s job to build a more detailed estimate for the costs inside the box. The third and final page of the worksheet breaks out the feature differences for each HA/DR option.

If you’re one of the tens of thousands of folks who’s signed up for email alerts whenever we update our First Responder Kit, then you’ve already got an email this week with these new changes. If not, head on over and pick it up now.


How to Contribute Code to the SQL Server First Responder Kit (Github)

So you’d like to fix a bug or contribute code to the First Responder Kit, but you’re new to Github. Let’s get started.

1. Open a new issue on Github.

Go to the Github issues list for this repo, and start a new issue. Here’s the example that I’m going to work on today:

Example of a Github issue
Example of a Github issue

In the issue title, put the name of the script, and some basic details so someone can see at a glance what’s going on.

In the body of the issue, put as much details as you can including:

  • Description of the symptom
  • What you wanted the code to do
  • What it actually did instead
  • Ideally, how you want to fix it

And submit the issue. If you’re going to work on it yourself, assign yourself to it if you have the permissions. (If you don’t, join the Slack channel, then hop in and ask.)

2. Get the Github issue number.

After you add the issue, the number will be in the title, and at the end of the URL:

Getting the Github issue number
Getting the Github issue number – here, #324

Now it’s time to work on some code!

3. Fork & download the repository.

First, get a Github account and install the Github desktop app. While this can all theoretically be done through the web, it’s a bit painful.

Then go to the First Responder Kit’s page, and click the Fork at the top right to fork this repo into your own account. (It’s kinda like doing a File, Save As to edit your own copy of something.) Then in your forked version of the repo, click Clone or Download button at the bottom right of this screenshot:

Download the repository
Download the repository

That will open your Github app and let you pick where to save the files locally.

 

4. Create a branch for your code.

In the Github desktop app, click on the new-branch button to create a branch for the specific issue you’re working on.

Creating a branch
Creating a branch

For the name, use the issue number from Github, plus a slash and your name. This indicates that it’s your code for that particular issue – remember that other people may end up working on this same issue.

5. Write your code, test it, and check it in.

For code guidelines, check out the CONTRIBUTING.md file in the repo.

Test your code against a case-sensitive instance of SQL Server 2008 and 2016, the oldest and newest supported versions. Other testers will also repeat your work, and automated testing will kick in later, but trust me, you want to do a sanity check first. Don’t get your pants pulled down in public.

Then click on the “1 Uncommitted Change” button at the top of the Github desktop (and there may be multiple changed files) and review your changes. They’re color-coded as to which lines have been removed from scripts, and which lines have been added.

Only check in changes and files that relate to your issue. If you accidentally changed a file that you didn’t mean to, in the Github app, right-click on that file name and click Discard Changes.

Checking in your fix
Checking in your fix

At the bottom left is your check-in title and description.

The title should:

  • Start with the issue number, like #324 – this is Github magic to link to an issue
  • Include the names of the files that were changed – this is helpful to others when they’re scanning a long list of titles of check-ins
  • A brief description of the issue

Then click Commit, and click Publish at the top right to sync your changes with the rest of the world. This doesn’t merge your code with the toxic sewer that is the dev branch just yet – you’re still on an island by yourself, but at least other people can get to your code.

7. Announce that your code is ready for testing.

Go back to your Github issue, and add a comment (don’t close it) that says you’re ready for testing. Include a link to your code, and any notes on how people can reproduce the issue, or know that your code works.

Ready for testing
Ready for testing

In the Labels dropdown on the right side, add the label ReadyToTest so folks can find your work easier.

Someone else has to test your code before it gets merged into the main branch, so it’s in your best interest to make it as easy and friendly as possible for someone to know that your code works.

8. Ready to merge? Add a pull request.

After someone else has tested your code, and you’re both happy that it works, open the Github app and click Pull Request at the top right:

Creating a pull request
Creating a pull request

For the remote branch, pick the Brent Ozar Unlimited First Responder Kit repo, the DEV branch. The check-in title and description will flow into here automatically, and click Send Pull Request. The maintainers will take a look at it and make sure things are kosher.

Stuff you don’t have to do

Release notes and changelog – you’ll notice that the top of each script has a changelog, version number, date, and release notes. Those are added when we consolidate a group of pull requests into a single release.

Documentation – if you don’t feel like writing it, that’s totally okay. We’re happy just to get code in at least, and we understand that the kinds of folks who write code don’t always overlap with the kinds of folks who like to write documentation. If you write things like a new check for new SQL Server problems, just know that we may not merge your code in with the dev branch until it’s also been documented.

For questions about the process, hop into the Slack channel. If you’re not already a member, sign up here. And thanks!


Announcing Dell DBA Days 2016: The SQL

SQL Server
21 Comments

This August, we’re goin’ back to Texas.

Last year, we flew the team out to Round Rock, Dell’s HQ, and ran all kinds of SQL Server experiments in their hardware labs. We broadcasted a whole bunch of webcasts live on the net for you to watch and see what we learned.

This year, we’re going back again, and the lineup includes sessions on:

  • How to Measure SQL Server
  • The Unbearable Lightness of BEGIN
  • Downtime Train
  • Performance Overhead of TDE, Query Store, BPE
  • And everybody’s favorite from last year: Watch SQL Server Break and Explode

Head on over and register now for free. Space is limited!


Well, That Sucked: Laying Off Half the Company

Company News, SQL Server
44 Comments
Company retreat, 2016
Company retreat, 2016

I always wanna be honest with you, dear reader, and let you see how running a lil’ consulting shop goes. It’s been fun sharing a lot of our happy growing moments along the way. This one, though, is a lot less fun to share.

Yesterday, we had to let go of Angie, Doug, and Jessica.

The background: we basically sell emergency room services for SQL Server, and we’ve consistently been backlogged with work. In early 2015, we figured that me doing sales was holding the company back. If we hired a salesperson, we believed we’d be able to acquire more customers faster, and sell more work to past customers. So we hired Jessica, and staffed up on consultants.

This year, we’ve learned that a salesperson can’t really bring in more emergency room work (seems obvious in retrospect, but we were hoping for followup work), so I had to make a tough decision. I had to right-size the company back to where we’re regularly busy, maybe even turning some work away, just to make sure that we’re profitable overall. The training side of the business is still doing really well, and the company overall isn’t in trouble – but it would have been soon.

I’m bummed out, obviously, because it’s a failure on my part. These people are my friends, and I wanted to build a place where they could thrive for the rest of their working careers if they so chose. I’d always heard it’s tough to bridge the chasm between a 3-FTE consulting shop and a 10-FTE one, and now I really understand why.

Wanna keep in touch with them? Here’s their emails:

  • Angie Rudduck in Portland, Oregon – if you’re looking for an incredibly upbeat, fast-learning junior DBA or SQL Server support person, I’d highly recommend Angie. She did a great job streamlining our backup/recovery/CHECKDB education process.
  • Doug Lane in Cedar Rapids, Iowa – want to make your SQL Server faster, fast? Doug has been doing high end performance tuning for the last few years, and he’d be a huge asset to any SQL Server shop that needs a DBA who can build bridges between developers and SQL Server.
  • Jessica Connors in Chicago, Illinois – in the database world, salespeople often have a reputation for being slimy, but Jessica is totally different. She listens to your pains and matches you up with the right relief. She was a pleasure to work with.

New SQL Server First Responder Kit for 2016-07

First Responder Kit, SQL Server
0

SQL Server 2012 SP3, 2014 SP2, and 2016 users are going to find a lot of stuff to love in here. The new per-query memory grants fields in the plan cache are exposed in sp_BlitzCache, and sp_BlitzFirst also shows memory grants for live running queries in @ExpertMode = 1. Here’s the details:

sp_BlitzCache v3.1:

  • Show cost for stored procedures. #339
  • Warn about trace flags added at the query level, and global trace flags. #361
  • Add warnings about Remote Queries. #315
  • Do not show Forced Plans warning if the real cause is forced parameterization. #343
  • Fix divide-by-zero error if Cost Threshold for Parallelism is 0. #358
  • Fix warning for unparameterized query. #334

sp_BlitzFirst v25 (The Artist Formerly Known as sp_BlitzFirst):

  • Add new memory grants columns to 2012-2016 live queries output. #362
  • Add SQL login to live queries output. #354
  • Filter Perfmon counter display to skip counters with zeroes. Still logged to table though. #356

sp_Blitz v53.1:

  • Warn about 2016 Query Store cleanup bug in Standard, Evaluation, Express. #352
  • Updating list of supported SQL Server versions. #344
  • Fixing bug in wait stats percentages. #324

sp_BlitzIndex v4.1:

  • Compression information in @Mode = 2. #18
  • Use recently-modified check to improve indexes-not-in-use recommendations. #220
  • Alphabetical sort for @GetAllDatabases = 1, @Mode = 2 output. #351
  • Remove per-day cost filter for missing indexes in @Mode = 4. #338
  • Missing index benefit is now labeled per-day to make it more obvious. #330

To get involved with the development & testing, head over to the Github repository, or download the full First Responder Kit with our checklists & guides.


[Video] Office Hours 2016/07/13 (With Transcriptions)

SQL Server, Videos
0

This week, Erik, Tara, Jessica, Doug, and Angie discuss queries, installing multiple instances of SQL server on a Windows VM, using DENY Database rules, migrating databases to Amazon RDS, availability groups, using filtered indexes, and more!

Here’s the video on YouTube:

Office Hours Webcast - 2016/07/13

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Office Hours Webcast – 2016-07-13

 

How should I convert a database from Access to SQL Server?

Jessica Connors: From Steven Mitchell, he says, “For converting MS Access 2010 to SQL 2012 would using SSMA tool or upsizing tool be preferred? Or some other method?” I have a feeling that we don’t know.

Tara Kizer: We don’t use Access.

Angie Rudduck: Sorry.

Jessica Connors: That’s important.

Doug Lane: Having used Access in the past, I would shy away from that because Access has a nasty habit of rewriting things in a way that just really makes no sense. It might be logically equivalent but you’re probably better off just importing. Like if you want to do the quick and dirty way, just do import data from Access data source. Go through all the tables, import the data that way. Because you’re going to end up with—and maybe you want this, but probably not—like keys coming through, constraints, weirdness. You’ll end up with fields that get converted into datatypes you don’t want. So you’re probably better off just importing that data and manually rigging up how you want those columns to come in.

Jessica Connors: All right, enough with Access.

 

When will Microsoft release an updated BPA?

Jessica Connors: Question from Henrico. He says, “When will MS release BPA for 2014?” Do we know?

Erik Darling: Ask MS.

Jessica Connors: Microsoft.

Doug Lane: I’m not even sure what BPA he’s referring to.

Tara Kizer: It’s the Best Practice Analyzer but it’s already out. I used it at my last job. Maybe the question is about 2016 since that was just released?

Erik Darling: Probably.

Jessica Connors: Do not know.

 

What is free virtual memory?

Jessica Connors: Wow, James really wants us to answer this question. He asked it twice, three times. “What is free virtual memory? What is free virtual memory? How to resolve the issue?”

Tara Kizer: What are you trying to solve? We need some context.

Jessica Connors: Yeah, anyone know what is free virtual memory? Is that a proper question?

Erik Darling: I guess it’s when you get virtual memory for free from the virtual memory store.

[Laughter]

Erik Darling: It was on sale that day.

Angie Rudduck: Yeah.

Jessica Connors: All right, James, we need more context.

 

Should I enable Lock Pages in Memory on VMs?

Jessica Connors: Onward to Edward. He says, “I see conflicting posts on the internet for lock pages in memory versus VMs. Thoughts? Pros and Cons? I’m thinking of adjusting the minimum memory to one half of max.”

Tara Kizer: I always set lock pages in memory as just say standard across servers. I worked in large enterprises so we just had a document to fall on, lock pages in memory was always set on the dedicated database servers. As far as min memory to half of max, that’s also what are standard was at these large corporations.

Erik Darling: Yeah, I think for VMs more specifically though you have to be careful because if you’re not setting reservations at the host level, then I think stuff can get ripped away anyway.

Tara Kizer: Yeah, we never even worried about that on VMs but maybe that’s just… I don’t know.

Erik Darling: I mean, it’s a VM, you don’t care about performance anyway.

Doug Lane: Didn’t this come up as a question last week where Brent answered it too and he said I could go either way but I usually do it?

Tara Kizer: That’s right, yeah.

Doug Lane: Rings a bell.

Angie Rudduck: He said it was not the first thing to worry about or something like that.

Doug Lane: Yeah, yeah.

Angie Rudduck: If that’s the only thing you have to worry about, good job.

 

What’s the fastest way to copy data between servers?

Jessica Connors: Question from J.H. He said, “Would select * into be the fastest way of copying large data, or some type of bulk method, or something else from one server to another via linked server?” He says, “My tests look like select into is very fast.”

Erik Darling: It depends on what version you’re on. SQL Server 2014 and up select into can run parallel which is a nice added bonus for that. If you’re going to move a large amount of data, I would probably shy away from select into and probably use a method that sort of respects your transaction log a bit more and breaks it up into batches.

Tara Kizer: I would actually use SSIS for this since it can do it in bulk. It has a fast way of transferring data between servers.

Doug Lane: What do you guys think about minimal logging in that case?

Erik Darling: If it works, it works. It doesn’t always work. Don’t forget that TABLOCK.

Doug Lane: Right.

Tara Kizer: The problem with minimal logging is you’re reducing your recovery points though. So, yeah, you can minimally log things but you don’t have recovery points in that window.

Erik Darling: Yeah, so if it’s a one-time data move and you don’t care about anything during that window, then go for it. But otherwise, if your database is in full recovery mode, you’re not going to get much out of minimal logging.

 

What’s the best way to recover free space from data files?

Jessica Connors: I’m going to butcher this name—I don’t know how to say it. “What is the best way to regain the free space in data files after deleting a large amount of data?”

Tara Kizer: Well, I mean, it’s shrinkfile. So a one-time shrink is okay if you have to delete a lot of data and you want to reclaim that space. If you don’t need to reclaim that space, leave it there. If you have enough storage, just leave it there because when you shrink the file down you’re going to be causing some performance issues, fragmentation. One time shrink though, go ahead and do it. DBCC SHRINKFILE.

Erik Darling: The thing with shrinking data files, at least when I’ve done it is that if you only shrink it to the point where you have free space, like if you just sort of truncate it to where your data ends or a little bit over that, like a gig or 512 megs over that, you don’t really see the fragmentation problems because you’re just getting rid of empty space at the end of the file. So you kind of don’t see the same problems, at least I haven’t. I just shrunk down a bunch of databases on my availability group VM thing because I blew up Stack Overflow with some horribleness. But I shrunk it down to just sort of where the [while 00:05:48] things ended and didn’t see much fragmentation change. So maybe that’s a 2016 thing though. Be careful with that.

Tara Kizer: Maybe. I’ve always just shrunk it down so that there was 10 percent or 20 percent free space that way autogrows didn’t have to occur immediately after we’re done shrinking.

Erik Darling: That’s a good plan.

 

How can I run cross-server queries without linked servers?

Jessica Connors: Question from our friend Justin. He says, “Other than using a linked server, do any of you SQL super geniuses know of a way to run a query from instance A against instance B?”

Tara Kizer: You can use OPENQUERY but usually we want you to do this work in the application. Have the application query the two different data sources. OPENQUERY, you can use OPENQUERY, it’s just like a linked server.

Erik Darling: It’s not necessarily better, it’s just different.

Tara Kizer: Yeah. It does have a benefit of getting around an issue where with a linked server it can query the entire table even though you have a where clause on it. So if you have a ten million row table and your results set is only going to be ten rows, linked server can pull over those ten million first and then do the filtering. OPENQUERY can get around that issue.

Erik Darling: There’s a really good talk by Conor Cunningham, one of the bigwigs at Microsoft in the SQL Server department, on the SQLBits website. So the SQLBits conference, they put all their videos online. They did one a few years back on distributed queries where he talks a lot about the pros and cons of different methods of querying across servers and how it works behind the scenes. You should definitely check that out if you’re really interesting in querying data across servers.

 

Why shouldn’t I install multiple instances in a VM?

Jessica Connors: Question from Raul. He says, “Are there any downsides to installing more than one SQL Server instance on a Windows VM?”

Tara Kizer: How many do you have, Erik?

[Erik puts up 10 fingers]

Tara Kizer: 10?

Erik Darling: More. I’d stick my foot up, I don’t want to gross anyone out.

Tara Kizer: On a test system, go for it. But on a production system, we don’t recommend it.

Erik Darling: Yeah, stacked instances in any environment, whether it’s physical or virtual is just bad news. If you’re already virtualized, why are you stacking instances? Spin up a new VM, it’s right there anyway.

Jessica Connors: I feel like we get that question a lot.

Angie Rudduck: Yep. That and shrinking, or replication, which there’s already one in there too.

Jessica Connors: Or, what are your favorite monitoring tools?

Erik Darling: I don’t mind answering that. That’s a nice question.

Jessica Connors: But it never changes. I mean, has it changed?

Doug Lane: Well, there all nice questions.

Jessica Connors: There’s no such thing as a dumb question.

Erik Darling: I just assume when someone asks that question it’s a vendor plant. “They always mention us! Do it again!”

[Laughter]

Doug Lane: “You don’t believe me? Watch, I’ll ask.”

Erik Darling: It’s the same thing every time, it’s amazing.

 

When would you use the DENY database roles?

Jessica Connors: Question from Samuel. He says, “What would be a scenario where you would use the DENY database roles?”

Erik Darling: Developers.

Angie Rudduck: Yeah, I think he says two things. He says, “Why does DENY even exist if you give datareader isn’t it the same as denydatawriter?” But not necessarily true. I think the key, my understanding anyway, is that SQL is the most restrictive. So if there’s a denydatawriter, even if you have datawriter as well, you’re getting denied. So I think it’s kind of safeguard, but I don’t know.

Doug Lane: It’s an easy way of making exceptions security wise. So you say, “They should be able to write across all these databases, except I’m going to DENY this particular one.” So the DENY is very explicit whereas it can be used as sort of an exceptions tool to just kind of cut out, “All right, don’t give them this. Give them this, give them this. But they can do everything else that the other permission that I gave them allows them to do.”

Tara Kizer: We used the DENY at my last job for sensitive data. We had active directory groups. We had a lot of people in certain active directory groups and a lot of people in a different one. One specific group got the denial on certain columns in a table, like address information, emails, sensitive information. The other group was allowed to see it but if you were in both groups, you got the denial because DENY overrides the grant.

Angie Rudduck: I like that.

Doug Lane: I was going to say you’ll get a lot of questions on that if you ever take a Microsoft certification exam.

Tara Kizer: Oh really?

Doug Lane: Yeah.

 

How do I get data into Amazon RDS?

Jessica Connors: Question from Kimberly Lu. She’s migrating to Amazon RDS. She says, “My first experience migrating a database to Amazon RDS has been painful because I could not use backup/restore. I had to backup/restore to a local server, generate scripts, recreate users, import/export data using wizard. Do you have a less painful process?”

Tara Kizer: Nope. It’s RDS. If you’re using Amazon EC2 it would be easier since it’s a virtual machine, but yeah, RDS, it’s like Azure SQL, right? Limited features.

Erik Darling: Yep.

Angie Rudduck: Somebody else asked about going to the cloud and how to do it. I think that that’s the key. If you’re doing the VM versus the hosted database, if you go to the hosted database at all, you have to script all of your data into or out of. There’s no backup/restore to hosted databases. So if you’re not prepared for that, maybe don’t go to a hosted database.

Erik Darling: Yeah, the one sort of exception is going to Azure. Well, you can use it for RDS too I believe. There’s something called the SQL Database Migration Wizard. It’s available on CodePlex. It just does sort of a BCP out and in to Azure. I’m pretty sure you can use it for RDS as well but I haven’t tried. So it’s okay for that but it’s still slow if you’re migrating a lot of data because you are doing a BCP out, writing a file, blah, blah, blah. Not the fastest thing in the world.

 

Can I use different drive letters with Availability Groups?

Jessica Connors: Availability group question, question from Tim Smith. He says, “One of the benefits of availability groups is not having to have identical hardware. Does that include drive/drive letters? I tested and it looked like you could have a database on different drives on each node but a recent outage has me questioning that.”

Tara Kizer: I’m pretty sure you can. I’ve always had the same drives across because it’s just so much easier with restores but I’m pretty sure it acts just like mirroring where you can have different drives because you’re telling it what to do during the restore command. I’m not too sure what issue he’s referring to on the failover though. But I’ve always used the same drives. I’m pretty sure that it supports it across different drive letters.

Erik Darling: Yeah, anytime that I’ve had to build a cluster or anything else, I’ve always made sure that the drive—even like different servers using the same sort of hardware, I always have the drive letters be the same and do the same pathing for my database and log files, just so that at least it’s consistent across all my stuff and I know where to expect things to be. So you can do it, but it’s not really a best practice to have things be totally different across servers like that, especially when you are counting on a one-to-one copy when you get over there.

Tara Kizer: Yeah, we would need more info on what happened when you did the failover.

Angie Rudduck: He said, “On the restore of the database after failover the NDF file was gone.” I feel like there was a different problem there. Like maybe you had a problem with your drive and somebody actually deleted something. Maybe there wasn’t just a failover. Maybe that’s why you had a failover.

Tara Kizer: I wonder if the NDF came later after the availability group was set up? If another file was added and it didn’t get added to the secondaries? I don’t know. I haven’t tested that.

(Postscript: yes, this is where different drive letters will fail.)

 

Can you turn off auto-updating for SQL Server 2016?

Jessica Connors: Question from Scott Kelly. Isn’t that an astronaut? Scott Kelly? Didn’t he spend a year in space?

Erik Darling: Yeah.

Jessica Connors: Scott Kelly is here, guys.

Doug Lane: That’s Commander Scott Kelly to you.

Erik Darling: How’s your bone density?

Jessica Connors: Yeah, they did an interesting test on him and his twin. Let’s see here. “Is SQL 2016 auto updating not able to be turned off?”

Doug Lane: Oh, boy.

Angie Rudduck: Like Windows updates but SQL updates?

Erik Darling: You can turn SQL updates off. You can turn automatic updates off for anything. Have you had an update? Because I haven’t.

Doug Lane: So there’s the phone home thing that I think in—what is it—Developer and in Express editions you can’t turn it off. If I remember right. But I don’t know if that’s the same as the auto update. I think you can disable auto updates from SQL Server.

Erik Darling: Yeah, the Customer Experience Improvement Program.

Jessica Connors: Yeah. Scott Kelly wants us to know that he did not work for NASA.

Angie Rudduck: No, he did. He’s not the astronaut but he did work for NASA.

Doug Lane: How about that.

Jessica Connors: Oh, he did work for NASA. That’s awesome.

Tara Kizer: I went backpacking with a guy who currently works for NASA this past weekend. He’s a physicist or something like that. It was really cool.

Angie Rudduck: Did he do better at the hike?

Tara Kizer: He stayed back with the person that was having high-altitude issues, or altitude sickness issues, I should say. He actually had to carry her out on his back because it was so bad. Yeah, I’ll tell you guys later.

Jessica Connors: You did a fourteener, right? You made it to the top…?

Tara Kizer: I completed it. It was rough. It was really rough.

Jessica Connors: Good job, Tara.

 

Can you do columnstore indexes on top of GUIDs in SQL 2012?

Jessica Connors: From Joshua. He says, “Is there any workarounds that allow columnstore indexes with a G-U-I-D in SQL 2012?”

Tara Kizer: GUID.

Jessica Connors: I’m like, it’s not GUI.

Tara Kizer: I don’t know, Erik?

Erik Darling: What is the question? “Are there any workarounds to using a columnstore index with a GUID?”

Doug Lane: Do they not allow GUIDs in 2012? I don’t even know.

Tara Kizer: I don’t like GUIDs so I don’t use them.

Erik Darling: Yeah, columnstore in 2012 in general is kind of bogus. You couldn’t update it and there were a lot of limitations so probably not. I haven’t tried explicitly to use a GUID with it and I certainly haven’t tried to use a GUID with it in 2012 because that was just a no fun version of columnstore. It was just lackluster.

 

Do you recommend Converge Solutions?

Jessica Connors: Question from Julius. He says, “Do you folks recommend running SQL Server on Converge Solutions? Any experience, drawbacks, benefits, etc.?”

Tara Kizer: I don’t even know what that is. Do you guys know?

Doug Lane: I have never heard of it.

Jessica Connors: All right, never heard of it, Julius.

Doug Lane: Sorry.

Tara Kizer: Our answer is no.

[Laughter]

Tara Kizer: I’ll have to look that up.

(Postscript: I’m guessing they weren’t referring to a brand name, and instead just general converged infrastructure, which is a pretty big/broad topic.)

 

Why is only one of my TempDB data files growing?

Jessica Connors: Ronny… oh boy, this looks long. Do you want to read it, Angie?

Angie Rudduck: Oh sure. So Ronny says, “I have multiple tempdb files and only one file is growing, actually the second file that was created. All other files are still reflecting the initial memory size.” He has trace flags 1118 and 1117 enabled—oh, no, just 1118, and wants to know that he also needs to add 1117. Does he need to delete the other files and start over before adding the additional trace flag?

Erik Darling: No. What you need to do is make sure that you grow all the other files to the same size as the second file. Elsewise, all of your queries will still be attracted to that larger data file and it will keep growing. So if you can, grow out your other files to match the size of the second file. If not, you’re going to have to deal with the kind of painful and possible corrupting experience of shrinking a tempdb data file.

Tara Kizer: They actually are saying it hasn’t been a problem—or, Paul Randal said corruption on shrinks has not been a problem in several years. They fixed it with 2012 I believe he said, or maybe even earlier. They just never made it public.

Erik Darling: Oh, okay, well, that’s good. I hope you’re using 2012 or later then, pal.

Tara Kizer: It might have been earlier. I can’t remember if it was an email or what it was.

Angie Rudduck: Yeah, if only they publicized when corruption is not happening anymore.

Erik Darling: They should just tell people that shrinking databases will cause corruptions so everyone will shut the heck up about it.

Doug Lane: So you also want to make sure that your file growths are set to zero, otherwise your tempdb files will keep growing. You don’t necessarily want that. It’s better to set it at the right size and then just kind of let it go with that because that may be why you ran into trouble in the first place is that you’ve got file growth on that one file and not on the others.

Tara Kizer: I actually quote/unquote disagree with that.

Doug Lane: Okay.

Tara Kizer: It’s not that that’s bad, it’s just it has to do with monitoring. So on the Brent Ozar website it says to grow your files all the way out to whatever that drive is for tempdb. If you do that, then your monitoring software is going to be alerting, the on-call DBA is about to get a disk space issue. Now of course, you could have an exception for the tempdb drives but I try to avoid exceptions. So instead, grow your files out to what they need to be and then keep autogrowth on. Have the file sizes be identical and you start out with the same exact autogrowth and they should grow proportionally based upon free space. Should not have an issue. If you do have an issue, I wonder what’s going on. You might have some kind of weird thing going on. I like to keep 20 percent free space to avoid alerts. Then when you do get an alert, grow out the mount point or drive.

Angie Rudduck: That’s for good IT teams that have alerts.
Tara Kizer: Yes.

Angie Rudduck: If you don’t have alerts, Doug has a good point. If you never allow them to grow, you never have your disk go over space, but your tempdb could go offline.

Tara Kizer: That’s the problem with that solution is that if you have a query, this wild query that runs, it’s going to fail if it needs more space in tempdb. Whereas if you had free space, well, if you have alerts in place, DBAs are going to start receiving alerts that we’re running out of space and you might be able to add the space before the query fails.

 

How can I move all my nonclustered indexes to another filegroup?

Jessica Connors: Speaking of running out of disk space, Gordon asks, “I’m running out of disk space for a one terabyte database with a single data file. So the plan is to move clustered indexes to a new file and a new filegroup. Given that everything is currently on the primary filegroup, would it be a good idea to move all the non-clustered indexes as well?”

Erik Darling: I would set up two different filegroups actually because if you’re doing this sort of work and you’re going to get some benefit out of it and you want to be able to restore the database as files or filegroups separately, you’re not going to want everything in the primary file group because you’re going to need that primary thing online first in order for the database to be accessible and restore other files and filegroups. So I would actually create one for clustered and one for non-clustered and move them off into one into each so that I can keep the primary part small and just have like the system databases and stuff in there. Be careful. Depending on what version you’re on because not all of the rebuilds are able to be online if you have max column types prior to 2012. So if you’re planning on a seamless move, mind your datatypes.

 

Are there any gotchas with SQL 2014 SP2?

Jessica Connors: Did SQL Server 2014 SP2 just come out? I thought they weren’t doing that anymore. You guys just told me they weren’t doing that anymore.

Tara Kizer: It will be rare.

Erik Darling: Starting in 2016.

Jessica Connors: You’re liars. Gotcha. “Any issues with this SP that y’all are aware of?”

Tara Kizer: Not yet.

Erik Darling: No, it’s a lot of fun.

Tara Kizer: It’s too new.

Angie Rudduck: It is too new.

Erik Darling: A lot of the stuff in Service Pack 2 was oddly catching SQL Server 2014 up to Service Pack 3 of 2012. It did add a couple of other things but most of it was just playing catch-up there with a lot of like the stuff that you can see execution plans and finding out about memory grants.

Tara Kizer: Oh, the memory grant issue. Memory grant columns are in Service Pack 2?

Erik Darling: They are and I’m going to be updating stuff.

Tara Kizer: Yeah, I’ve got a client that’s been waiting on those columns so I’ll let them know. Just real quick on the Service Pack 2 thing. When I said it’s too new to know about issues, just remember that service packs are fixing hundreds or thousands of bugs. So by not installing it, you’re more at risk of not installing it than you are of installing it. So service packs, hotfixes aren’t necessarily a bad thing and you don’t necessarily have to wait a while to see how other people are doing.

Erik Darling: If you’re worried about it, install it on a dev server. Let it hang out for a month and do your thing on it and then push it to prod. If you don’t have a dev server, then quit your job.

 

Are there any downsides to filtered indexes?

Jessica Connors: Question from Joe. He says, “Are there any particular downsides to using filtered indexes? I’ve recently been performance tuning some queries that have non-parameterized search arguments and they’re like magic.”

Tara Kizer: They are like magic. The only issue is if you have a query that has the opposite of what your filtering condition is, then it’s going to have to do a scan. But you know, you want your filtered indexes to be what your queries are usually going to do. Like is deleted equals zero for instance. If you have an is equals one, probably that’s going to be a maintenance job that runs at night to clean out data. It’s okay for those things to do a scan.

Doug Lane: The higher the cardinality on whatever it is you’re filtering on, the more effective that index will be. So if you’ve got data that’s split 50/50, it will be helpful. But if you’ve got data where you’re going to be looking for something where the filter condition there is down to say ten or fewer percent of the rows, then you’re in really good shape. Incidentally, I just put out a statistics course that does include filtered stats, which you get with filtered indexes.

Tara Kizer: All right.

Doug Lane: So you can check that out too.

Jessica Connors: I think about three months ago we just included access to our entire SQL training library for anyone that does the SQL Critical Care. If you want any of these guys to take a look and assess the health of a database, let me know. Reach out. For some reason, July and August are slower for us. I think it’s because summer.

Tara Kizer: Everyone is vacationing.

Angie Rudduck: Too much beer and barbecue.

Jessica Connors: Yeah, vacations. So if you are on this webinar and you reach out to me, Jessica@BrentOzar.com, we are supplying some special deals within the next two months for our SQL Critical Care product. So not only do you get the Critical Care but also the additional training, access to our SQL training library.

Angie Rudduck: Isn’t Brent come out with a new one? Not us three… but the other Brent?

Tara Kizer: Not these fabulous Brents.

[Laughter]

Angie Rudduck: Isn’t he coming out with a new video, Doug? I just expect you to know.

Doug Lane: I think so. I know I’ve got another one coming.

Angie Rudduck: Oh, see. There we go.

Tara Kizer: He painted his wall green so he must be working on something soon.

Doug Lane: Oh.

Tara Kizer: Not you—Brent.

Doug Lane: Oh, Brent, yeah. I’m thinking, “Is my tint that bad?” More stuff on like execution plans and querying.

Angie Rudduck: Nice.

Jessica Connors: Cool beans. All right, guys, that is all the time we have for today. I’ll let you go. Adios.

Erik Darling: Later.


Why Not Just Create Statistics?

Indexing, SQL Server, Statistics
16 Comments

Here at Brent Ozar Unlimited

We have a proud tradition of not blaming index fragmentation for everything. There are points you should deal with it, but they’re probably not 5% and 30% and 1000 pages. But that’s not what this blog post is about. I’m hoping to clarify why we’re more interested in up to date statistics, and also why statistics outside of indexes aren’t really the most helpful thing. If they were, we’d all just create statistics and every query would magically blaze from select to offset without a care in the world.

Statistics: It’s What’s Inside That Counts

Statistics are what SQL Server uses to figure out how many rows to expect from accessing a data structure. You can do some things that fundamentally break this, like using functions in joins or where clauses, using local variables or optimize for unknown, using table variables without recompile hints, and a sniffed parameter can just allow SQL to run a query without guessing at all. It already guessed once. No tag backs. As you may imagine, this is important information for SQL to have for running your queries optimally.

Indexes: Pride and Vanity

Indexes of the nonclustered variety contain subsets of your table or view’s data. Clustered ones are your table or view data ordered by the clustering key(s). Focusing on the nonclustered variety, they’re the “and the [band name]” to your clustered index’s “[lead singer name]”, and they’re great for providing SQL with a more appropriate data structure for your query.

If you don’t need to select all the columns, and you do need to filter, join, order by, or aggregate a column not in the key of the clustered index, nonclustered indexes get a solo after the chorus. Give the drummer some.

Nonclustered indexes will, under normal circumstances, get a statistics object created with rolling density information for the key columns going left to right, and a histogram on only the leftmost column.

I AM I SAID
I AM I SAID

With that out of the way

Why do we care more about statistics being updated than indexes being fragmented? Largely, because reading pages with some empty space from a fragmented index is oh-so-very-rarely the root cause of a performance issue. Especially if those pages are already in memory. Out of date statistics can allow SQL to continue to make some really bad guesses, and keep giving you a lousy execution plan no matter which way you tune your query.

The bigger your table gets, the worse the problem gets. Prior to 2016, if you don’t turn on Trace Flag 2371, about 20% of your table’s rows need to change before an automatic statistics update kicks in. For a 100 million row table, this can be a long ways off. Poor cardinality estimates here can really sink you. Rebuilding indexes for a 100 million row table is a B-U-M-M-E-R.

Log Shipping? Mirroring? Availability Group? Good luck with that.

Crappy server? Low memory? Slow disk? Dead man walking.

You may ultimately spend more time and expend more server resources defragmenting indexes than your queries will spend reading extra pages from fragmented indexes. Rebuilding or reorganizing large indexes can be a special kind of brutal.

Consider the process

Read a bunch of index pages with sys.dm_db_index_physical_stats to figure out if there’s fragmentation, reorganize or rebuild based on feedback.

  • Reorganize is online but single threaded and can take FOREVER on big tables, especially if you’re compacting LOBs.
  • Rebuild is offline and single threaded in Standard, online and potentially parallel in Enterprise, but you better hope you’re patched up so you don’t corrupt anything.

Is that worth it? Every night? For every index on every table in every user database? Only if you can prove it.

The one upside to Rebuilding is that it also updates statistics with a full scan. Think about this the next time you say something like “rebuilding the index fixed the problem”, you may have an epiphany on the way.

Wait for it.

Wait for it.

Wait for it.

Statistics with no indexes

SQL Server doesn’t make easy work of getting information about your Statistics, or finding out which statistics get used. Even at the query level, you have to use a spate of Trace Flags to find out what gets loaded and looked at. Cached plans don’t fare much better.

No wonder everyone cares about indexes and their fragmentation. Microsoft has made information about them easy and abundant, while Statistics are kept hidden in the basement next to piles of soft bones and a bowl of hot blood.

Head rush moment: SQL may use information from histograms outside of the index it chooses for cardinality estimation.

Back to earth: If you just create a bunch of statistics instead of indexes, you’re (at best) using your Clustered Index for everything (which is still bad), or you’re using a HEAP for everything (which is usually worse). You’re still generally better off creating good indexes for your workload. They’ll get statistics objects created and associated with them, and if SQL thinks another column is interesting, it will create a single column statistics object for it, as long as you haven’t turned off auto create stats.

Sure, you can put on your black cloak and goat mask and create some multi-column or filtered statistics, but in the words of a wise man (Doug), you end up with more stats to maintain and understanding query behavior gets more difficult.

Filtered statistics suffer from a problem where they don’t automatically update based on the filtered row count, but rather the table row count. Imagine you have a 100 million row table, and your filtered index is on 1 million rows. All million of those rows might change, but the statistics on that index won’t. 1 million is not 20% of 100 million. You’ll have to update the statistics manually, or rebuild the filtered index.

Multi-column statistics are hardly a replacement for a multi-column index, and it’s not like you get an enhanced histogram that includes the second column. It’s just like a normal histogram. All you get is the density information for the columns you throw in there. Boo hiss.

Moral of the story (B-B-B-B-B-B-BENNY WITHOUT THE JETS)

Indexes are a really important factor for performance, but index fragmentation very rarely is. Statistics are super helpful when they’re not out of date, and getting them up to date is much easier on your server’s nerves. Though not perfect, I’d rather take my chances here. Updating statistics can also cause a bad execution plan to get flushed out of the cache. On their own they can sometimes help queries, but you should only end up here after you’ve really tuned your indexes.

Unless you can establish a metric that makes nightly index defragmentation worthwhile, don’t jump to it as the default. Try just updating statistics. You may find that nothing at all changes, and you now have many extra hours a night to do maintenance. Like run DBCC CHECKDB. If you think index fragmentation is a performance problem, try corruption sometime. That’s not what Brent meant when he said “the fastest query is one you never make.”

Thanks for reading!