Blog

[Video] Office Hours 2017/02/22 (With Transcriptions)

Last Updated February 25, 2017

This week, Erik, Tara, and Richie discuss tempdb etiquette for developers, elastic search vs full-text search, server CPU spiking issues, source control in a database project, querying transaction logs, and more.

Here’s the video on YouTube:

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Enjoy the Podcast?

Don’t miss an episode, subscribe via iTunes, Stitcher or RSS.
Leave us a review in iTunes

Office Hours Webcast – 2017-02-22

Erik Darling: This is a question about indexes.
[Tara’s phone rings] Richie Rump: Shhh.

Erik Darling: Who’s on call?

Tara Kizer: Sorry, it’s now on vibrate.

Erik Darling: Tara is on call for her secret DBA job that she doesn’t want to tell us about. Her real job.

Tara Kizer: This one doesn’t pay enough.

What happens if I add clustering keys as index includes?

Erik Darling: Got to get that second paycheck in. All right, “In the event that I create a non-clustered index that needs a clustered index column, is it a problem to go ahead and add it even though I pick it up as a secret column? I want to add the clustered index for housekeeping purposes so it’s evident in the definition but I’m not sure if it causes any issues.”

Tara Kizer: I’ve wondered that. It doesn’t make a difference. It’s going to get added regardless. I don’t add them to the non-clustered index. Who cares if they’re secret as long as DBAs and developers know that that’s what’s happening. You don’t need it in there for explanation reasons.

Richie Rump: But that’s where you go wrong.

Tara Kizer: Okay.

Richie Rump: Developers don’t understand that it’s there.

Tara Kizer: That’s true. Maybe they don’t even need to know that though. Do developers need to know about the secret columns?

Richie Rump: If they’re running a lot of queries it would help them to understand, “Hey, use the index, follow the indexes,” and all that other stuff. I never used to put them in there because I understood that they were implicitly. But then someone called Kendra Little, she brought up a good point, that sometimes clustered indexes change and things like that. So if you don’t put it in there, whatever column was there, it’s going to go away and then everything just may not work the same way. So go ahead and explicitly put them in there so that way in case your clustered index changes, it’s always there.

Erik Darling: There’s also another interesting part to that where when you pick up the clustered index column, is it implicitly as involved in the non-clustered index. It’s not part of the key of the index. I believe it’s just like sitting at the leaf level somewhere. It’s not in the key. So if you need that clustered index key column to do something additional within the index, and I’d probably want to have it as a key column, right? So like if you had to like join or filter on it or do something else, I would want that to be right up in the key of the index. There’s no penalty for having it there. It’s not going to be in both places but I think that especially if my query is using the clustered index key column I would want to have it as part of the non-clustered index definition. So it depends on the queries.

Richie Rump: I put them in by default now. That’s just my answer.

Erik Darling: Wow. Look at you, big boy.

Richie Rump: Yeah, man.

Erik Darling: Congratulations.

Richie Rump: Yeah, because I can’t use to design a database, that can’t happen.

Erik Darling: No. You can ask me to fix it, I won’t design it. I can fix all sorts of stuff; I just can’t design them. I’m like a good janitor, I’m not a good construction worker.

Should I change my application timeouts?

Erik Darling: Let’s go to the next question. “A user has been complaining about query timeouts which when reviewing our monitoring tool looks like high concurrent other user activity querying a specific table. I recommended for them to increase the timeout setting on their app’s end. Anything on my DBA end that can also help?” I don’t think so. Make your server better so the queries run faster.

Tara Kizer: Figure out what the root cause is. J.H. also asks, “Isn’t it true that the timeout setting is unlimited on SQL Server side?” That part is true but I’m not sure that your reaction should be to increase the application’s timeout value. Maybe there are some queries that need a longer time at value and you can set that specific one to have a higher one. But as far as the connection stream goes, I would leave it and then from a DBA perspective figure out if you need to add any indexes, you need to fine tune the query. So you asked on your DBA end, is there anything that can help, take a look at the execution plan and see where that leads you.

Erik Darling: Sounds good to me.

What’s fill factor?

Erik Darling: Michella asks, “Can you explain fill factor?” Yes. Tara loves fill factor.

Tara Kizer: Fill factor drives me crazy, for the clients that have lowered the fill factor and the column that’s in the index is just the identity column. I just don’t see why this is being one because there is no reason to have anything but 100 percent fill factor for identity columns. Fill factor is when you have free space on a page, if it’s lower than 100 percent or 0, 100 and 0 are the same thing. This allows you to be able to do inserts into the page if there’s free space in there without having to do a page split. A page split is when you have a data page that is full and you need to insert a row in between and you’re inserting a row in between because of the clustered index that’s sorted by the key. So it has to do a page split. It creates another page and it keeps 50 percent of the rows on one page and 50 percent of the rows on the second page. Then it can do the insert into the proper order. So lowering the fill factor can reduce page splits but page splits don’t happen when you have an identity column. Then when you rebuild your indexes, you’re going to get back the fill factor. After the page split happens, eventually that page might fill up as well and then you rebuild the indexes later at some point if you’re doing that kind of maintenance and you get that free space back.

Erik Darling: Fill factor isn’t preserved when you insert or do anything else to your indexes. Fill factor is only preserved after you rebuild an index, which is something that not many people know about.

Tara Kizer: I don’t like seeing servers that have set the fill factor to a lower number. I think that if you are going to lower the fill factor, it should be on an index-by-index basis. Never set a lower than 100 at the—is it at the server level or database level? I can’t remember which one it is. But don’t lower the server-level or database-level. Do it at the index level. Really, one of the only times this should happen is on GUIDs. Don’t use GUIDs.

Erik Darling: Oh, GUIDs.

Richie Rump: Well…

Erik Darling: What I always say, it’s easy to run out of ints, it’s hard to run out of GUIDs. So if you’re concerned about having—what the bigint max, Richie?

Tara Kizer: It’s pretty gigantic.

Erik Darling: [Inaudible] followed by every other number repeated twice.

Tara Kizer: I don’t know that anyone needs that. I think that GUIDs should be used when you have to keep the uniqueness across servers, different types of servers such as say a [inaudible] or something else. Even just SQL Server instances, you need to keep that row unique amongst different servers and not just this one SQL Server instance. I don’t think that using the GUID is a good choice when it’s just on that server.

Richie Rump: So, we have a new architectures coming out as we do in the developer space. Some of those need [inaudible] GUIDs going across processes, especially in microservices, right? So as a microservice, I need an ID that doesn’t overwrite something else. So that kind of follows from service to service to service and makes those hops. So it’s tough not to have kind of a GUID in there, maybe it’s not to your clustered index but probably going to need to be in the database somewhere.

Erik Darling: Yeah, that’s a good point. A lot of times when you have the GUID that you need for that uniqueness, you don’t have to have it as the clustered index. You can have a non-clustered primary key as the GUID just to have the uniqueness there and have an index on it. But you can have the clustered index on a totally append only surrogate key like an ID column or based on a sequence next value hanging out in there. Anyway, let’s move on.

Richie Rump: I did that a couple weeks ago. Look at that. Look at me. I’m amazing.

Erik Darling: No you didn’t. Stop lying. Stop using my ideas. Pretending, using my ideas and that you’re good at something.

Richie Rump: It’s not going to happen, man. Come on.

Can I use mirroring to reduce downtime for database upgrades?

Erik Darling: Christopher has a question. He’s looking to replace his SQL 2014 Enterprise server with a shiny new 2016 server. He wants to know if he can use mirroring between 2014 and 2016 to minimize downtime.

Tara Kizer: He can, he just can’t go back. You can, if you ever do failover, you can’t go back to the 2014 instance using database mirroring’s failover. I’ve certainly done it—or I should say, I haven’t used database mirroring for my upgrades. I’ve just done custom log shipping just to get the full backup, restore, maybe apply a differential, get the log chain in place. Then at the time of the maintenance window do the final log backup and then restore that final one on 2016. So I like that method, but yeah, database mirroring works as well.

Erik Darling: I’ve used mirroring and log shipping for migrations between versions. I don’t have a favorite in either case. Mirroring is nice because you can set it to asynchronous and not have to babysit anything.

Tara Kizer: That’s true.

Erik Darling: Then when you’re ready, set it to synchronous and do your manual pushover. Log shipping is nice because it does some of the work for you. Like availability groups—for some reason, I don’t know why mirroring doesn’t do it as part of the wizard. Mirroring doesn’t initialize anything for you. Like with log shipping, it’s like, do you want us to initialize this? You’re like, “Yes. I want you to take that backup and do a restore for me. Because I’m lazy.” Mirroring doesn’t for some reason. Mirroring is just like, “Did you do it, dummy? Hey, dummy, did you do it?” So, I like log shipping just for that. Plus, log shipping is so old and bulletproof.

Tara Kizer: You trust it, yeah.

Erik Darling: So I don’t know. I don’t have a favorite between the two aside from a few setup quirks. But either way, like Tara said, it is a one-way migration. So do your failover, don’t let users in immediately. Do some smoke tests. Have like some [inaudible] devs ready to try this stuff out and make sure all the connection strings and all that garbage works. Then when you’re ready, then release the Kraken. But until then, because it’s much less painful to have to just like reinitialize mirroring or log shipping to the server again if the smoke test doesn’t work than it is to try to migrate new user data back, and if there’s identity columns and parent child tables, it’s terrible. Anyway.

What do you use for I/O testing?

Erik Darling: Another question from Michella, “Do you have a go-to script for I/O tests?” No. There are tools like CrystalDiskMark and diskspd which are much better for I/O tests. If you’re looking for just a SQL DMV one, sp_BlitzFirst, if you run it, if you download sp_BlitzFirst from FirstResponderKit.org, you can run sp_BlitzFirst either since startup, then it will look at some of SQL’s internal DMVs to tell you about I/O stuff. Or, you can run if for a sample to tell you about I/O stuff. But if you just want to do a pure test of your disk, CrystalDiskMark and diskspd, there are blog posts on our site about both of them. That’s where I usually go when I need to test a disk. When I wanted to test my brand spanking new fancy 1 terabyte M2, I used CrystalDiskMark. I made clients feel bad unfairly because I was like, “Look how fast this is compared to how you do things.”

Richie Rump: [inaudible] under your desk?

Erik Darling: Yes. When you say it like that…

How long will an index operation take?

Erik Darling: Reece has a question. “Is there a way to get a rough estimate of how long an index operation will take to complete?”

Tara Kizer: Test it. Store the database on a test server and test it.

Erik Darling: Yeah, and you know, when it’s in flight, not really. It’s unfortunate. Not really.

Richie Rump: Not like it would be right, right? You can see that little progress bar and then it stops at 96 and you’re there for two days.

Do you recommend using plan guides?

Erik Darling: Nestor asks, “Do you guys ever recommend using plan guides? Do you ever troubleshoot performance issues with plan guides?”

Tara Kizer: I do. I’ve implemented them in production out of necessity. At a company we were migrating from SQL Server 2005 to 2012. As part of that upgrade, we changed the clustered index of the core table of this critical system. That’s because the clustered index was pretty wide and that’s what we needed on SQL 2000. We wanted to go to say—it wasn’t an identity column but something else. So as part of the upgrade, we did that. Well, once we let the production users in after the upgrade suddenly performance was really bad. We realized that the execution plans weren’t picking this great index. So we converted that clustered index to non-clustered so it was still there but now the execution plans, they were not selecting that index. We proved that via index hints that that index was very helpful for a lot of queries. So rather than adding index hints into the stored procedure code, which this was all of our own custom codes so we could have added the hints directly into the stored procedures, we decided to use plan guides. So we tied a plan guide to the stored procedures that told it to go ahead and use this index or sometimes we did the optimize for, just depended upon what was needed. So I’ve definitely used them. I think only one time I talked about it with a client. That was just because we were trying to override Dynamics AX’s choice of using the optimize for unknown. When you don’t have ownership of the code, the stored procedure code, you shouldn’t be adding index hints in there. But attaching a plan guide might be something that you can do. But just be warned, when you do put a plan guide on vendor code that when the vendor application gets upgraded, you’re going to want to drop those plan guides because otherwise they won’t be able to drop and recreate stored procedure. They’ll get an error.

Erik Darling: Yep. What she said. She’s smarter than me.

Tara Kizer: Well this was when I got in the Microsoft engineer that we had flown in for this major upgrade that we were doing just in case we ran into problems. After the upgrade he did training with us, so we had him onsite and looking at our production issues. He’s the reason why we put plan guides in place and after that I knew what to do.

Richie Rump: I think the key to your entire statement there is, “When we had to.” It’s not the like the first thing, it’s not the second thing. This is like the very last thing you do is put a plan guide on. There’s nothing else. That’s when we do the plan guide.

Erik Darling: True that.

Should I use a CTE or a temp table?

Erik Darling: Brian has a question about—well, it’s not really about temp tables, it’s more about CTEs. So last week we talked about using temp tables to avoid mega join queries. Brian is asking if it would be better to use a common table expression rather than a temp table, and if not, why. Does anyone have an opinion on this matter?

Richie Rump: Sometimes I think CTEs make things a little harder to read than easier to read.

Tara Kizer: They certainly do.

Richie Rump: So, there you go. That’s my answer.

Tara Kizer: I like using temp tables when you’ve got a large query that’s getting the compilation time out, so then breaking apart a lot of those joins and putting that stuff into temp tables and then just joining to the temp table can avoid your compilation timeout. When you get a compilation timeout, you might not have gotten the best execution plan for your query. You got just whatever was the best so far.

Richie Rump: Yeah, and I could put indexes on those temp tables too if I really need them. So that adds something else.

Erik Darling: It’s true. When you start joining a lot of tables together, you may have optimal indexes for some joins but not for all of them. If your query from the final product of those requires different indexes, you don’t want to have 100 indexes on your table just to satisfy all these weird edge cases. Sometimes it’s better to dump stuff to the—especially for stuff like paging queries. There are plenty of times where there are perfectly good use cases for CTEs or using offset fetch or using top with CTEs or offset fetch to do paging queries. Other times, if you are letting people order by mounds of different columns on the result, you are better off dynamically creating temp tables and indexes so that you can support that ordering on a smaller set of data. For me though, with CTEs, it comes down to a couple other things. One, CTE results aren’t materialized. So if I’m accessing the results of the CTE multiple times, it has to execute that code multiple times.

Tara Kizer: I didn’t realize that until you talked about that yesterday with a client, so that was interesting information. So if you are going to be calling that CTE table query multiple times, throw that into a temp table to avoid running that multiple times.

Erik Darling: Yeah, one of the first goofy blog posts that I wrote on this site is called “CTEs, Inline Views, and What They Do.” I’ll throw that link into the ye olde chatter box so that you can reference that. It’s a blog post just about what happens if you reference a CTE multiple times with joins or something. What happens is that code re-executes. So CTEs, while they are sometimes good, especially with the top operator for give SQL a bit of a performance pathway or whatever you want to call it. They also do have their problems. They’re not perfect. Oracle has syntax to materialize a CTE, which I wish SQL Server did as well, but for now we have temp tables, which are perfectly good for materializing sets of data.

Tara Kizer: I feel like the fact that it does re-execute the query if you are using it multiple times that that could lead to weird issues where data has come in and the first time you accessed it, it didn’t have this and now it does. For that reason, I would definitely use a temp table if you’re going to be having to query it multiple times.

Erik Darling: For sure.

Tara Kizer: Unless, of course, you’re using RCSI then it’s just going to look at the snapshot I would assume.

Erik Darling: I don’t know. Who knows.

How do I get rid of TempDB spills?

Erik Darling: Let me see if I can figure out a short way to ask this. “I’m getting a tempdb spill on one of my queries that I can’t seem to fix. I’m doing a self join which tallies up a running total from previous days” and he can’t figure out how to get rid of a spill. Well, the only way to fix a spill is with indexes. It depends on what’s spilling. You’ll have to follow up on that, if it’s a sort or a hash spill.

(Note: theoretically, you could also fix it by stuffing the server full of RAM, adding a plan guide that includes artificially high row estimates, rewriting the query to dump stuff into TempDB first, etc., but you really should start with indexes.)

Should I use OPTIMIZE FOR UNKNOWN?

Erik Darling: Victor asks, and I’m just going through in order now, “I have a proc whose plan was somehow cleared from the cache the first execution of the plan causing parameter sniffing and high parallelism. CPU of 100 percent for 40 minutes.” Wow. “When this runs then [mumbles words]. Should I optimize for unknown along with MAXDOP?” So when would you use optimize for unknown?

Tara Kizer: Never. I know that we say use it sparingly or rarely. My answer is never. There’s just better ways of doing things. Figure out what is best. How it runs best for most executions of the parameter values. See if optimize for a value is helpful or maybe an index hint, but optimize for unknown, you’re getting a mediocre execution plan, maybe. So it does guarantee that you’ll have consistent executions—is that what it is?

Erik Darling: Yeah, so much like using a local variable, it will use the density vector to give you a row estimate. So if your data—here’s the thing that sucks about optimize for unknown. If your data is already so uniform that optimize for unknown fixes your queries, you don’t need optimize for unknown in the first place. You just need better indexes. If your data is so skewed that sometimes optimize for unknown is good and sometimes it’s not, then again, you need better statistics at that point. At that point, you’re looking at either filtered indexes or filtered statistics to try to help out things along the way. Optimize for unknown is one of those really dangerous things because as your data changes, unknown stops being a good flat estimate. Sometimes optimize for unknown turns into a whole different backfiring mess. Depending on how often the code runs, I may just want to through a recompile hint on there, especially if that’s for some reason harder than adding an index. I would much rather have a recompile hint if it’s not a frequently executing piece of code.

Tara Kizer: I’ve even put the option recompile on queries that were executed frequently just because we had the CPU overhead to do that.

Richie Rump: Yep. Yep.

What on-call software do you use?

Erik Darling: Here’s a question for Tara. “What page or texting on-call software do you guys use?” Anything free.

Tara Kizer: The first company where I was in an on-call rotation was at Qualcomm. We just had so many people in IT that we just made our own. We didn’t purchase one or use a free one. We just created our own. Then the last job, we didn’t use software. The knock just reached out to—we had a list and the knock just reached out to whoever was on call that week.

Erik Darling: Knock, knock, knocking on Tara’s door.

Richie Rump: Erik, why don’t you give your phone number out there? Because that’s what we do, any problems, we just call Erik.

Erik Darling: 1-900-MixALot.

Richie Rump: “And kick those…” Back to you, Erik.

How do I deal with TempDB spills without changing my query?

Erik Darling: Dave has sort of a follow-up to his question. “How do I deal with tempdb spills without drastically changing my query, if possible?” Any volunteers on that?

Tara Kizer: I’ll let you take that one.

Erik Darling: Without changing your query, so tempdb spills happen most frequently for two operators, and sometimes for one operator. Most frequently it’s a sort or a hash operation, so a hash join or a hash aggregate will most frequently cause a tempdb spill. You can see it sometimes with parallel exchange operator, sometimes those will spill out, but not too often. So for sorts, you need to have an index that supports whatever sort you’re doing so it doesn’t spill. That’s one way to help. For hash joins, again, hash joins get used frequently when the columns involved in the join are not lead columns and indexes that SQL is using so SQL kind of makes this horrible guess at how many rows are going to be coming in for the hash operation. So without drastically changing your query, without changing your query, change the indexes. Change the indexes so that you are no longer doing things that require sorts, like sometimes if your data is medium sized, SQL might choose a merge join and just insert a sort into the plan because it thinks that based on how many rows it’s getting back it’s going to be cheaper. Presort data and do a merge join and a hash join. Most common is because there’s a weird sort or hash in your execution plan. The only way to really get a grip on those is with indexes or with the right indexes I guess.

How do I prevent people from changing my recovery model?

Erik Darling: Graham asks, “I have a vendor who changes the recovery model from full to simple without informing us. Because where I work is cheap, I’ve created custom monitoring to report when database attributes are changed. Anyway, how do I get this special vendor to stop changing the recovery model?”

Tara Kizer: Why do they have access? I’m assuming they’ve hired a third-party DBA company and so that company is helping out? I don’t know but maybe you need to question their experience level if they’re doing that because that’s breaking your RPO. You’ve lost recovery points when it gets switched to simple. You can’t restore a full backup and the entire transaction log chain if went from full to simple. You have to do a full backup once you switch back to full so you can start that log chain again. So I would talk to the vendor and maybe management to say this is an inexperienced person touching my system and we can’t hit our RPO goal because of this.

Erik Darling: Sounds good to me.

What tools can help interpret execution plans?

Erik Darling: Doug asks, “Can you suggest any tools for helping interpret execution plans?” Interpreting execution plans? No. Getting more information, better looking information, is kind of the realm of Plan Explorer, SentryOne Plan Explorer. As far as interpreting them, you know, that’s a highly personal thing. That’s just something you have to learn over time.

Richie Rump: They’ve got books for that.

Erik Darling: They do. Grant Fritchey has a good book on—I think it’s free to download from either Simple Talk or Redgate or something.

Richie Rump: Is that book free? Maybe…

Erik Darling: They have a free download of it. But I think the previous edition is free and then the newest one is paid for. Something like that. I know that Redgate and/or Simple Talk has had his Reading Execution Plans book available for download before. I don’t know if they still do. Maybe they don’t but I remember at one point getting it from them.

Can CHECKDB cause a SAN to restart?

Erik Darling: “A friend of mine has had his DBCC CHECKDB cause his SAN to restart random vhosts. He expects performance degradation and [inaudible] stability issues. Have you guys seen anything like this? Not looking for a solution, just experience.”

Tara Kizer: I haven’t seen that specific with restarting servers that are pointed to the SAN but I’ve worked with large corporations that have dedicated SAN staff and they would come to us and say, “What are you doing in the middle of night that’s causing so much I/O load?” It was always CHECKDB that was running. So they could definitely see a spike but our hardware could support it. It didn’t cause other servers to restart. CHECKDB is a very important task that needs to be done. If your SAN is causing random hosts to restart, you need to get a better SAN.

Erik Darling: I’ve actually had this happen to a physical computer before. The reason that it was happening to me was running DBCC CHECKDB on a five or so terabyte database. This was on 2008 R2. What would happen is we had 512 gigs in there but max memory was set a touch too high. Max memory was set to like 500 gigs or something. We had to knock it down to 450 because for some reason when DBCC CHECKDB ran, it would just swallow memory and Windows would give up. So it could be a memory issue. I would want to make sure that data and memory were as close to even as possible or better than they are now maybe. I don’t know. That’s a tough one to troubleshoot from afar but that’s one thing that I’ve seen, but that was a physical host, not a VM. Anyway, we are at the 45 mark.

What is SQL Intersection in Orlando?

Richie Rump: Here’s a quick question for you though. “What is SQL Intersection in Orlando? Sounds like fun.”

Erik Darling: It’s a conference where everyone learns stuff and then and goes and does Vegas things. That’s all I know because I haven’t been.

Richie Rump: Well this is Orlando so they go and do Orlando things I guess.

Erik Darling: Oh, never mind.

Tara Kizer: Aren’t there two? Yeah, there’s the one in Vegas and one in Orlando. The one in May is in Orlando.

Richie Rump: Yeah, which Brent is speaking at.

Erik Darling: Yeah, go meet Brent.

Tara Kizer: They’ve got some really good speakers at SQL Intersections.

Richie Rump: Yeah, it’s a very curated list. They don’t bring just anybody. They really bring the best of the best there. It’s pretty good.

Tara Kizer: And it sounds like SQL Intersection is more about the learning whereas PASS is a lot about networking.

Erik Darling: … hanging out, beer. All right. That’s enough. I’ll see you guys next week for Office Hours, hopefully. Thank you for showing up and goodbye.

Tara Kizer: Bye.

I’m just saying it’s valid T-SQL syntax, that’s all.

Last Updated February 23, 2017

Brent Ozar

Humor

Max Worker Threads: Don’t Touch That

Last Updated February 15, 2017

Erik Darling

SQL Server

More isn’t faster

I’ve had people give me all sorts of janky reasons for changing Max Worker Threads. I’ve even had people tell me that someone from Microsoft told them to do it.

The thing is, all changing that setting does is help you not run out of worker threads. It doesn’t make queries run any better or faster. In fact, under load, performance will most likely be downgraded to Absolute Rubbish© either way.

What’s worse? Running out of worker threads and queries having to wait for them to free up, or having way more queries trying to run on the same CPUs? Six of one, half dozen of punching yourself squarely in the nose.

In the classroom

Brent is fond of teaching people about CXPACKET by putting you in a classroom, and having the teacher hand out work to students. I’m going to stick with that basic concept.

Your CPU is the teacher, the threads are students. With 10 kids in a classroom, a teacher may have a pretty easy time answering questions, grading tests and homework, planning lessons, etc. You know, teacher stuff.

If you add 10 kids, the teacher will still have to do all that stuff, but now poor ol’ teach is bringing work home nights and weekends, and classes are going off-schedule because more kids have more questions, three of them wrote disruptive reports, one of them keeps asking the same question 1000 times a second, five of them started blocking the chalkboard from each other, and another peed their pants waiting for the teacher to call on them.

Add 10 more kids, and, well… Ever see that movie The Principal?

The lesson

Adding more kids to your classrooms doesn’t make your school any faster. At some point, you need more teachers, too.

Thanks for reading!

Let’s Corrupt a SQL Server Database Together, Part 1: Clustered Indexes

Last Updated February 9, 2019

Brent Ozar

CHECKDB and Corruption

Hold my beer.

CREATE DATABASE [50Ways];
GO
ALTER DATABASE [50Ways] SET PAGE_VERIFY NONE; /* Normally a bad idea */
GO
USE [50Ways];
GO
CREATE TABLE [dbo].[ToLeaveYourLover]([Way] VARCHAR(50));
GO
INSERT INTO [dbo].[ToLeaveYourLover]([Way])
VALUES ('Slip out the back, Jack'),
('Make a new plan, Stan'),
('Hop on the bus, Gus'),
('Drop off the key, Lee')
GO
SELECT * FROM [50Ways]..[ToLeaveYourLover]; /* Yep, we have data */
GO
ALTER DATABASE [50Ways] SET OFFLINE WITH ROLLBACK IMMEDIATE;
GO

CREATE DATABASE [50Ways];

ALTER DATABASE [50Ways] SET PAGE_VERIFY NONE; /* Normally a bad idea */

USE [50Ways];

CREATE TABLE [dbo].[ToLeaveYourLover]([Way] VARCHAR(50));

INSERT INTO [dbo].[ToLeaveYourLover]([Way])

VALUES ('Slip out the back, Jack'),

('Make a new plan, Stan'),

('Hop on the bus, Gus'),

('Drop off the key, Lee')

SELECT * FROM [50Ways]..[ToLeaveYourLover]; /* Yep, we have data */

ALTER DATABASE [50Ways] SET OFFLINE WITH ROLLBACK IMMEDIATE;

Now, let’s corrupt it. Open it with a hex editor – my personal favorite is the free xvi32 because it doesn’t require installation. Just download it, fire it up (you’ll want to run it as administrator), and open the database’s MDF file:

Next thing you know, you’re looking at the contents of the MDF file. The scientific way to approach this would be to identify the exact 8K page you’re looking for, and jump to that point of the file. However, you have my beer, so I’m trying to finish this quickly so I can get back to that. We’ll just click Search, Find, and look for text:

Presto – now you can see the raw contents of your database. What, you thought SQL Server encrypted it or something? Heck no, Stan’s social security number, credit card number, and disease history are all in there, free for the reading.

And for the writing, it turns out. Close the Find box, click on the S in Stan your right hand window, and start typing. I’ll change his name to Flan, and click Save. Close XVI32, and bring the database back online:

ALTER DATABASE [50Ways] SET ONLINE;
GO
SELECT * FROM [50Ways]..[ToLeaveYourLover];

ALTER DATABASE [50Ways] SET ONLINE;

SELECT * FROM [50Ways]..[ToLeaveYourLover];

And see what you get:

Or rather, see what you don’t get: no corruption warnings, no dropped connection, no errors. As far as SQL Server is concerned, this is just nice good data. You can even run DBCC CHECKDB, and no errors will be reported.

What about clustered columnstore indexes?

Look, the only reason I’m doing this is because you’re still holding my beer hostage. Same script, but now with a clustered columnstore index:

CREATE DATABASE [50Ways];
GO
ALTER DATABASE [50Ways] SET PAGE_VERIFY NONE; /* Normally a bad idea */
GO
USE [50Ways];
GO
CREATE TABLE [dbo].[ToLeaveYourLover]
  ([ID] INT IDENTITY(1,1), [Way] VARCHAR(50), [Guy] VARCHAR(50));
GO
CREATE CLUSTERED COLUMNSTORE INDEX cci ON [dbo].[ToLeaveYourLover]
GO
INSERT INTO [dbo].[ToLeaveYourLover]([Way],[Guy])
VALUES ('Slip out the back','Jack'),
('Make a new plan','Stan'),
('Hop on the bus','Gus'),
('Drop off the key','Lee')
GO
SELECT * FROM [50Ways]..[ToLeaveYourLover]; /* Yep, we have data */
GO
ALTER DATABASE [50Ways] SET OFFLINE WITH ROLLBACK IMMEDIATE;
GO

CREATE DATABASE [50Ways];

ALTER DATABASE [50Ways] SET PAGE_VERIFY NONE; /* Normally a bad idea */

USE [50Ways];

CREATE TABLE [dbo].[ToLeaveYourLover]

([ID] INT IDENTITY(1,1), [Way] VARCHAR(50), [Guy] VARCHAR(50));

CREATE CLUSTERED COLUMNSTORE INDEX cci ON [dbo].[ToLeaveYourLover]

INSERT INTO [dbo].[ToLeaveYourLover]([Way],[Guy])

VALUES ('Slip out the back','Jack'),

('Make a new plan','Stan'),

('Hop on the bus','Gus'),

('Drop off the key','Lee')

SELECT * FROM [50Ways]..[ToLeaveYourLover]; /* Yep, we have data */

ALTER DATABASE [50Ways] SET OFFLINE WITH ROLLBACK IMMEDIATE;

After the database is offline, fire up your trusty XVI32, search for Stan, and he’s still visible in clear text:

Edit him, turn him into Flan, save the file, bring it back online, and run the SELECT query again:

A clustered columnstore index works just like a regular clustered rowstore index: if you don’t have page verification turned on, no corruption is detected, even when you run CHECKDB.

PAGE_VERIFY is really important.

When I created the database, I threw in this line:

ALTER DATABASE [50Ways] SET PAGE_VERIFY NONE; /* Normally a bad idea */

1	ALTER DATABASE [50Ways] SET PAGE_VERIFY NONE; /* Normally a bad idea */

That tells SQL Server not to do any page verification when pages are read or written from disk. You have 3 options:

NONE – just a dumb, suicidal idea.
TORN_PAGE_DETECTION – also a dumb, suicidal idea. You can retry this same demo with TORN_PAGE_DETECTION instead of NONE, and you’ll get the exact same results.
CHECKSUM – SQL Server includes a checksum as it writes each page from here on out, and then checks the page when it’s read.

If I repeat either the rowstore or columnstore demo, but this time change the PAGE_VERIFY NONE to PAGE_VERIFY CHECKSUM, I get a totally different result:

Shout out to Mr. Robot for getting me access to the source code

SQL Server didn’t detect the corruption simply by setting the database online – bringing it online didn’t read this particular data page. However, when I read the page, whammo, SQL Server detected that the data on the page didn’t match the checksum.

The checksum isn’t used for data recovery, mind you – only for alerting us that the data is wrong. This isn’t like parity in a RAID array where we can rebuild the data from scratch.

What you need to do next

Run sp_Blitz, which tells you if any databases don’t have checksum enabled
Set checksum on those databases
Do something that causes pages to be written (like rebuilding indexes, but keep in mind that it’s going to generate a ton of transaction log traffic)
Set up alerts to notify you when a corrupt page is read
Attach a corrupt database (like the one you just created) to your production SQL Server, and run the SELECT statement, which should trigger the alerts you just created
Make sure you’re doing CHECKDB regularly from here on out
Talk to stakeholders about what’s being stored unencrypted in the database
Give me back my beer

Read on for Part 2: Nonclustered Indexes

Crappy Missing Index Requests

Last Updated November 16, 2017

Erik Darling

Execution Plans, Indexing, sp_BlitzIndex

When you’re tuning queries

It’s sort of a relief when the first time you get your hands on it, you get the plan and there’s a missing index request. Even if it’s not a super high-value one, something in there is crying for help. Where there’s smoke, there’s a bingo parlor.

But does adding missing indexes from requests always make things better?

The question goes for any tool, whether it’s DTA, or the missing index DMVs, or your own wild speculation. Testing is important.

Not all requests are helpful

In fact, some of them can be harmful. Let’s look at a recent example from the Orders database. After running for a while, I noticed the UpdateShipped stored procedure was asking for an index. And not just any index, but one that would reduce query costs by 98.5003%. That’s incredible. That’s amazing. Do you take DBA Express cards?

CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[Orders] ([ShipDate],[OrderDate])

1 2	CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>] ON [dbo].[Orders] ([ShipDate],[OrderDate])

The code in question is the part where the update actually happens.

UPDATE o
 SET o.ShipDate = DATEADD(HOUR, 24, o.OrderDate)
FROM dbo.Orders as o
WHERE o.ShipDate IS NULL
AND o.OrderDate >= DATEADD(DAY, DATEDIFF(DAY, 0, @MinOrderDate), '19000101')
AND o.OrderDate <= DATEADD(DAY, DATEDIFF(DAY, 0, @MinOrderDate), '23:59:59')

UPDATE o

SET o.ShipDate = DATEADD(HOUR, 24, o.OrderDate)

FROM dbo.Orders as o

WHERE o.ShipDate IS NULL

AND o.OrderDate >= DATEADD(DAY, DATEDIFF(DAY, 0, @MinOrderDate), '19000101')

AND o.OrderDate <= DATEADD(DAY, DATEDIFF(DAY, 0, @MinOrderDate), '23:59:59')

The index that it’s currently using is very thoughtful. Extra thoughtful. Maybe the most thoughtful index I’ve ever created for free. Though somewhat forgetfully named.

CREATE UNIQUE NONCLUSTERED INDEX ix_Orders_OD_SD ON dbo.Orders (OrderDate, ID) INCLUDE (ShipDate) WHERE (ShipDate IS NULL)

1	CREATE UNIQUE NONCLUSTERED INDEX ix_Orders_OD_SD ON dbo.Orders (OrderDate, ID) INCLUDE (ShipDate) WHERE (ShipDate IS NULL)

What about the query plan?

Aside from some baked-in problems, it’s pretty normal. It has a cost of 2829 query bucks. Pretty high! Like I said, baked in problems.

El Stinko

The baked in problems are an exercise for you, dear reader.

So what happens to it when we add the missing index?

CREATE NONCLUSTERED INDEX Crappy ON dbo.Orders (ShipDate, OrderDate)

1	CREATE NONCLUSTERED INDEX Crappy ON dbo.Orders (ShipDate, OrderDate)

It’s crappy! The query doesn’t even use it, but we do now have to update it. ShipDate is in the index, ShipDate is being updated. We have to update the index, like, now. Duh. This query now has a cost of 6792 query bucks. That’s the opposite of a reduction, and a far cry from the 98 point some-odd percent reduction the missing index DMV promised us.

Totally crappy

If we go a step further. Or farther? You tell me. We can add an index hint to force the matter, and of course, forcing the matter makes matters worse. And that matters. The forced index query has a crappy cost of 6837 query bucks. This is why our cost based estimator does not choose this plan on its own.

Most crappiest

See?

Lies And DMV Lies

When running sp_BlitzIndex, we often recommend testing out any index with an estimated benefit of >1mm per day. But that’s the key word: testing. A missing index request that gets added and causes harm will rarely harm the query that’s asking for it. I got pretty lucky here in demoland with an example. Usually you have to add the index, make sure it doesn’t hurt the query you’re adding it for, and then do regression testing on other queries in play. This includes modification queries.

Thanks for reading!

It’s now easier to read the BrentOzar.com archives by category.

Last Updated February 20, 2017

Brent Ozar

Company News

With fifteen years of blog posts, we’d accumulated a lot of junk around here.

“Did I really write a blog post about buying pantyhose for turtles?”

Recently we cleaned out a lot of the trash – old personal posts about boring stuff – and categorized the remaining ~2k posts to make it easier to find stuff by topic.

Now, at the top of BrentOzar.com, when you hover over Blog, you’ll get flyout menus for categories like these:

Happy surfing.

[Video] Office Hours 2017/02/15 (With Transcriptions)

Last Updated February 19, 2017

Brent Ozar Unlimited Team

SQL Server, Videos

Here’s the video on YouTube:

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Enjoy the Podcast?

Don’t miss an episode, subscribe via iTunes, Stitcher or RSS.
Leave us a review in iTunes

Office Hours Webcast – 2017-02-15

How can I discourage people from using TempDB?

Erik Darling: There’s some long questions in here today.

Richie Rump: If you have a short question, get them in now.

Erik Darling: I’m going to ask the short questions and in my head I’m going to be paraphrasing the long questions. I’m going to try not to “Brent” anyone. Let’s see here—there isn’t a short question, I lied. Here’s one about tempdb. Brian asks if there are any suggestions to discourage developers from abusing tempdb to teach proper tempdb etiquette to developers by not just filter-lessly WHERE clause-lessly, join-lessly dumping tables of values into tempdb and then indexing them and going crazy. So what would you do to teach developers the proper use of tempdb?

Tara Kizer: He references a script that the developers asked to performance tune. It sounds like it’s a hefty script. It’s just really long and needs to do all sorts of stuff. That’s the type of script you probably need to be using temp tables. It’s hard to really answer the specific question, but as far as massive abuse of tempdb, I don’t really discourage developers from using tempdb. So what if it gets big?

Erik Darling: I have a similar take on that. So what if tempdb gets big? But there are uses for temp tables that are better than others. A lot of people will fall back to temp tables because when they do their joins with regular tables, they suck. They have like 20 joins or they have to do all this awful joining together. There’s like no way to index for them because there’s all sorts of different sorts in like every single column. I understand why people dump stuff off to tempdb. My main pet peeve is when they just like blindly dump data in there without filtering it down to a kind of reasonable result set first. That’s my big pet peeve, is when there’s like some SELECT * INTO FROM table, no WHERE clause, nothing else like that. Anyway.

Richie Rump: The other thing in that question is a few hundred gigs almost maxed out your drive, maybe you need to get a bigger drive.

Tara Kizer: I’ve supported a tempdb that was 500 gigabytes normally. This was a mount point so it was dedicated tempdb. We got a storage alert that we were running out of space. I was the on-call DBA and checked it. It was a business user running a query. This was production data but this was a read-only replica for availability groups. It wasn’t the writeable side, just the readable side, where business users could run these gigantic queries. In order to complete this specific query that caused it to grow past 500 gigabytes in size, we had to tell them they need to start breaking up that query. It was this massive, single SELECT query. All sorts of stuff going on. I don’t know why it was using tempdb, because it wasn’t a temp table. Maybe it was an ORDER BY, I forget. I don’t remember what the actual script was, but that is what was needed for this system. Yeah, you can break apart your queries to do smaller work at a time so a single transaction doesn’t cause it to grow that big, but sometimes the instance just needs a large tempdb. It really depends on your workload.

Erik Darling: If you’re dealing with a terabyte+ size databases eventually you’re going to need a terabyte+ size tempdb. That’s just the way things go, not even just for queries. You do maintenance on a big enough table and SQL goes—there are tons of questions. There’s even a specific error you get when you try to sort something in tempdb and there’s not enough space and the file grows and it goes … it’s not fun. Tempdb should be sized in reaction to the size of your data, not to how big you generically think tempdb should be.

Erik Darling: Nick asks, “What is the thing being clustered or not when creating an index?” The key.

Tara Kizer: Just the key. It’s sorted.

Erik Darling: Yeah.

Tara Kizer: Physically sorted that way.

Erik Darling: It’s really quite boring. Whatever columns you choose to be the key of your clustered index, SQL sorts the clustered index by those columns and then at the very leaf level of the index, it has all the rest of the columns in the table hanging out down there, ready to be referenced.
Richie Rump: Right. Most of the time it’s your primary key but it doesn’t have to be your primary key. It doesn’t have to be. You want to think about how the queries are being written and how you’re reading data from it and maybe you’ll have a better primary clustered index. Maybe.

Erik Darling: I think questions like this come up because different database platforms have different but close by terminology. In Oracle there’s a thing called a cluster index where you can actually cluster tables together with an index so that values that you want to join together will be closer together on disk. Not the Oracle expert, but from the brief time I was reading about Oracle, I’m like, “This is exciting.” That’s what I got from it. Think of it what you will.

How can I determine the overhead of full text indexing?

Erik Darling: Here’s an interesting one. By interesting I mean odd. “My devs want to implement full text indexing. How can I predetermine what the overhead of this will be? Is there some metric that can tie the amount of data in the indexes to the drain on system resources?” Not that I know of.

Tara Kizer: I don’t think so. To answer it you have to have a load test environment where you do a synthetic load and you measure it yourself.

Erik Darling: Yeah, this is one of those things where full text indexing is going to be a little different for everyone depending on how you use it. It’s really hard to sort of glass ball what the load would be on your system. Unfortunately, what I would say and probably what Richie is itching to say is that if you want to use full text indexing, you should be looking at a different tool all together. Richie, take it away.

Richie Rump: Yeah, full text indexing has been around a long time, hasn’t really been touched in a long time. During that period, there’s a lot of tools that have come on like Apache Solr and Elasticsearch—probably Elasticsearch is the one everyone gravitates to—that do a much better job of searching text than full text search does. There’s always these weird, odd little things that full text search does that you don’t really expect it to do. Elasticsearch is so much better at searching text-type stuff. This is what it was designed to do than kind of SQL Server is. It would be interesting to see if SQL Server in the future wants to jump in that one like it did with the Hekaton stuff but right now we’re not hearing anything for that. I say go do a prototype on Elasticsearch. You’ll feel much better about yourself. You’ll be able to give the users a lot more results and do a lot more with it than you will with full text.

Erik Darling: I like Elasticsearch a lot. I agree with Richie on that.

What should I do with my career?

Erik Darling: Thomas asks, this is a similar question to one we’ve gotten before. “I’m trying to decide where my career should go. I have no development skills.” Good.

Richie Rump: Time to learn, bud.

Erik Darling: He wants to know, “Should I go into cloud, security, data science…?” Talk about falling off on the end there.

[Laughter]

Erik Darling: Pick something else random. Should I go into seashell collecting?

Richie Rump: I hear bus driving is pretty good these days.

Erik Darling: Yeah, why not? So, take it away, folks. What would you do if you were young Thomas?

Tara Kizer: It’s confusing. Do you want to improve your DBA skills or are you looking to move your career on a different path? You talk about cloud and security and data science. Maybe cloud we’ll do a little bit of that as a DBA, if our company is using cloud stuff. But security, no. Data science, not for a DBA. What are you trying to do? Are you trying to go down a different path or do you want to remain a DBA?

Erik Darling: Security is, I guess, to me anyway, seems like a pretty limited field for databases. Like once you get past securing perimeter stuff and once you get into, “How am I going to manage security on the database?” You read a couple of chapters from BOL about users and logins and roles and permissions and stuff. That’s kind of database security in a nutshell. You figure out what port you’re going to run it on and then what? So I don’t if security is what I would go into if I was a database guy, unless I was really into security. But if you’re going to go really into security, then be prepared to step away from the database. There’s a lot wider world out there than just database security.

Richie Rump: Data science. I think data science comes up because it’s on the top of all these, “Most hottest technology jobs” or “hottest jobs ever.” Data science, woo! Until you meet an actual data scientist and not someone who just calls themselves a data scientist because maybe they read a book or two, until you meet a real one, you don’t really understand kind of what level they are at. I was lucky enough to work for a few hours with a real data scientist. My brain started leaking out of my ears. Essentially, they’re statisticians. They’re statisticians that understand the technology side of it as well. This particular individual was a PhD in statistics. So these are the type of individuals that go and excel in data science. They’re very heavy in the stats aspect of it. They call themselves data scientists now and it’s just a marketing term. I mean, am I programmer or am I a developer? Am I an app developer? It’s just the same thing and it will be something different in five years, they’ll call it something else. But that is the type of learning that I see when I see a data scientist. That’s the type of training that you would be looking into—PhD in statistics.

Erik Darling: If you want to read some DBA-friendly writing on data science stuff, Buck Woody has a blog, Backyard Data Science, where he occasionally—I don’t really know what his blogging schedule is. I see it pop up once in a while in my RSS feed but I don’t think he blogs regularly-regularly there. But his posts are always good and entertaining and there’s enough back fill data for you at this point to read. So it’s Backyard Data Science, Buck Woody’s blog. Really smart, funny guy. Good stuff on there. If you want a DBA-friendly-ish intro to data science stuff, I would head over there and read. Probably try to not let my boss see me doing that because he knows I’m trying to switch jobs at that point.

Richie Rump: Yeah, Buck is the best.

Erik Darling: Buck is the best.

Should I be using OPENROWSET for this?

Erik Darling: Joe asks—oh boy. “Can you recommend a replacement for OPENROWSET for putting the output of a stored procedure into a table variable or temp table?” Why is OPENROWSET not working for you? What is happening?

Tara Kizer: Why are you using OPENROWSET though just to put the output of a stored procedure? Just use INSERT INTO exact stored procedure. Is this for a linked server? What are you using OPENROWSET for?

Erik Darling: There’s different uses for it. Like I was using it in some test functionality for BlitzFirst to get BlitzFirst to run for the duration of some ad hoc SQL or a stored procedure because with OPENROWSET I can just pretty easily dump stuff into a temp table to relieve the server of having to return all the data. So a follow-up question, what about OPENROWSET isn’t working and what exactly are you trying to do with these rows?

How important is physical database design when I have SSDs?

Erik Darling: Here’s one from Matthew. I think Richie is going to have something on this. “How important is physical database design in an era of SSDs, SANs, etc.?”

Richie Rump: I typically still do the physical design. Mainly it’s because a lot of the tools still kind of force you to do a physical design still to get it out. A lot of times, if you’re designing for multiple databases, that’s kind of helpful to do your logical and then your physical design so then you can output the logical design into multiple databases. Don’t think of physical design as “I’m trying to make things faster.” I’ve never looked at it in that way. It’s a way that you can get your indexes in and things like that, stuff that doesn’t really fit well into your logical design. If you want to read more about that, Louis Davidson’s Pro SQL Server Relational Database and Design Implementation, good book on that. So physical design doesn’t necessarily mean I’ve got to care about what’s going on in the hardware aspect of it. It’s how it’s physically going to be implemented into the database, whatever platform that you’re running.

Erik Darling: A lot of people used to make a much bigger deal about separating data and log files and putting certain indexes on certain file groups off somewhere else. Which made a lot more sense when storage was direct attached. For SSDs, if they’re direct attached, sure, you can still seem some difference there. For the SAN, it just doesn’t make a lick of difference. Not a single drop. Tara, what say you, ye?

Tara Kizer: The question is “in an era of SSDs and SANs.” A lot of companies don’t have those even though they’ve been around for a lot of years. So physical databases are still really important to a lot of companies, your hardware aspect, your disks. We’ve got a lot of clients that have slow I/O issues. They’re just running 15k disks, no SSDs. A lot of them don’t have SANs. Physical databases, as far as hardware goes, is really important to those types of companies because they’re not getting it right, that’s for sure.

Richie Rump: Yeah, I haven’t seen a server like that in many years, probably maybe a couple decades. It’s been a very long time. In fact, I’d just go ahead and spin up a cluster in Amazon and, poof, I don’t have to deal with any of that stuff either.

Erik Darling: That’s not exactly true because you can pick different kinds of storage up in the cloud.

Richie Rump: Yes, but [inaudible] stuff I’m using right now.

Erik Darling: Of course you do.

Tara Kizer: Erik and I know how bad the disks are in the cloud, the default ones.

Richie Rump: Yeah, they’re terrible.

Tara Kizer: You can’t do anything.

Erik Darling: No, it’s like just sludge, man. Sludge.

Richie Rump: Well since you’ll be using this new system, I’m going to definitely choose the slow disks for your stuff.

Erik Darling: It’s okay because I’m going to choose a box with the most RAM on it so I don’t have to deal with that.

Tara Kizer: There you go.

Erik Darling: Stick my thumb in your eye.

Why does our SQL Server’s CPU go up at 9:30pm?

Erik Darling: Daniel has a question. He starts his question with, “I have a question.” Threw me off my whole game. “For some reason the CPU on our production server has been spiking regularly every night around 9:30 for the past week. I have Redgate monitor.” Okay. “Look at the timeframe and it is not reporting any heavy queries running. Is this a virtualization issue? The top ten queries at that time are running in less than 5 seconds each.” Anyone? Virtualized?

Tara Kizer: I had a client that we couldn’t figure out why the CPU was spiking. It was a virtual environment. Brent said to take a look at Perfmon and add the counter for CPU utilization and there’s another further down below, I think it was like virtual CPU. So that would have told you what the CPU utilization was on the virtual host, not just your guest. He was saying that there would be a discrepancy there. The issue ended up being on the host, not the actual VM.

Erik Darling: This is like one of those horrible downsides of VMs that no one really talks about, especially when you put your SQL Server out there. Because you have VM admins, who are not DBAs, who if they need to will shuffle a whole bunch of other guests onto your host for no particular reason other than—or they have automated load balancing set up. You deal with the noisy neighbor. So unless you have reservations set up for your VM, where you get to hang on to your CPUs and your memory and all that stuff, then you can run into some issues. It may not be your SQL Server’s fault. So outside of SQL Server, I would yell at my VM admins and see what they’re doing. See if they’re shuffling people onto my host or whatever. For a lot of VM environments if they are putting SQL Server on them and they’re hosting multiple SQL Servers, those hosts will be kind of like holy ground. They will not let other kinds of servers live on those. They will be for SQL Servers only and those SQL Servers will take up an even portion of the resources on there, like either half and half or in quarters. I would take a closer look at my VM setup for that.

Tara Kizer: I would wonder, CPU is spiking. Do you know that it is the SQL Server .exe process? Have you gone down to that level in Perfmon to see exactly which process is the one that’s spiking? If it truly is SQL Server, Redgate monitor and all of those other monitoring tools, they’re not looking at every single query that runs during a timeframe. They’re sampling it. Every few seconds they’re taking a look at what’s running. So maybe you don’t have any long-running queries but maybe you have a spike in the amount of queries that are running. Maybe even just do a server-side trace, an extended events session, and just gather everything for like two minutes. Don’t run it for a very long time, just for maybe five minutes or so. That way you can collect the data, put it into a table and then order by the CPU column and see what happened there.

Erik Darling: Works for me, man.

Do I need a data lake?

Erik Darling: Eric Swiggum, I wish I had an answer for you but I don’t know. “I had a developer come to me saying they need a data lake now but they seem to be confused about the concept, as am I, and I’m not sure if the implementation will be successful.” Aside from coming up with a POC and testing it, I don’t have any good advice. I don’t know if you need a data lake either.

Tara Kizer: If they’re confused about it also, why are they asking for it? I’d say if you don’t know what it is, you don’t get to use it.

Richie Rump: There will be a test at the end of the webinar.

Erik Darling: If I were you I’d make them watch a bunch of those Jason movies and see if they still want a data lake.

Should I add developer skills to my resume?

Erik Darling: Thomas follows up. He says, “I work mostly on the infrastructure side right now. I’m looking to expand beyond that.” He likes all the tech and it’s hard picking one thing. “Seems like most DBA jobs are developer DBA rather than production DBA work. That’s why I’m looking to get some focus on that side.”

Tara Kizer: I wonder what industry you’re in that you’re mostly seeing development DBA stuff—or not industry—what city you’re in because there are a ton of production DBA jobs out there when I get the LinkedIn emails and stuff. I don’t really see a whole lot of developer DBA jobs here. Sometimes it will be named a little bit differently but for the most part I’ve only seen production DBA jobs.

Erik Darling: At least when you look at job postings like that, try to take them with a grain of salt. Really read the job description. If you send in your resume and you go in for an interview, I would really prod the people about what exactly their expectations of a DBA are because they might expect you to develop something that is not like a performance thing. They may not expect you to develop like queries that are going to be running for clients or whatever. They may expect you to develop some other process that does stuff like if they want to set up partitioning and automate it. They may be looking at something like that and that sort of thing could be production DBA work because, of course, partitioning is not a performance feature.

Should I consider the cloud for DR?

Tara Kizer: I like Graham’s question.

Erik Darling: Yeah, go for it. You read that one.

Tara Kizer: “As DBA, I’ve asked about having a cloud DR strategy and in the past I’ve been told it’s too expensive. Our area has been experiencing a natural disaster.” I wonder if this is the spillway that’s having issue in northern California—I forget what the city name is—Oroville. It’s near Sacramento. I think it’s Oroville. But anyways. “I brought up exploring a cloud DR strategy and was told the same thing, too expensive.” I’m kind of surprised that you’re getting that answer for cloud servers. “By the way, we have no secondary center and on-site backups. If my org can’t grasp the need for a DR strategy now, will they ever?” If there’s a natural disaster happening near you and they aren’t freaking out already, if you’re in one of those towns that possibly could get flooded, I mean, they probably just don’t want to spend the money. I’ve always worked in environments of companies that had DR sites. These days, with the cloud solutions out there, you have a much cheaper option than having to have your own DR site and servers and managing all that. I know we’ve got a white paper coming up, and I know we’re allowed to talk about this now, that we wrote for Google about using Google’s cloud to copy your backups into their cloud, into a storage bucket. You don’t have a server up but if you ever had a natural disaster, you could then spin up a VM in the cloud and then copy those files down to that server and get up and running on that. That would be a pretty cheap solution because you’re only having your files in the cloud. You’re not hosting the server until you actually need it.

Richie Rump: Yeah, until maybe they lose everything, maybe their mind will change. It’s probably what’s going to have to end up happening. When the whole building gets wiped out and all of a sudden they maybe have a backup that maybe you took, that’s probably when they’re like, “You know, maybe we should be able to get something that’s a little, I don’t know, that we could get up a little faster than a couple months.”

Tara Kizer: Do you at least have your backups being copied to tape in off-site iron mountains, that type of thing?

Erik Darling: It doesn’t sound like it.

Tara Kizer: I would hope something is going offsite.

Erik Darling: I think if I were the DBA in that position I would be totally fine with that as long as I had it in writing.

Tara Kizer: Yeah.

Erik Darling: As long as I was like, “This is what I told you guys however many months back and you said no because it was too expensive. Now we’re in this position. I can’t do anything for you.” As long as I had that in writing I would be totally cool with being like, “I have one less computer to manage.”

Tara Kizer: Talk to them about RPO and RTO goals. A lot of companies have those numbers for on-site issues. It may be they have the same RPO and RTO goals for disaster recovery but show them what the actual reality is of those RPO/RTO. Your goals are not the same thing as your current state. Let them know how long it would take to bring up production if a natural disaster occurred and maybe you’d have total data loss if your backups aren’t going anywhere.

Richie Rump: Yeah, in accordance with that, I used to be a project manager and I used to always to talk to my sponsors about risk. I used to have like a punch board of just, hey, if this risk occurs and what’s the severity, what’s the likelihood that that’s going to occur: high, medium, or low. Then how would we fix that. Having these lists and sharing that with people and saying, “These are our risks: high, medium, and low. There’s a high risk for this to happen.” Or maybe it’s a low risk that it may happen but it will cost a lot of effort to get this thing back up and running, having those types of things in front of people will actually be like, “Oh, I didn’t know that that was a problem because I never thought about it because I’m the VP of whatever and frankly servers aren’t my deal.” So keeping that kind of list was really helpful when I was a PM.

What’s the best branching strategy for database source control?

Erik Darling: Richie, I’m sorry, you just got done talking but I’m going to pick on you again. Matthew asks, “What is the best branching strategy for source control in a database project? We are using VSO front end and a Git back end.”

Tara Kizer: What’s VSO?

Richie Rump: Visual Studio Online.

Tara Kizer: Online? Okay.

Richie Rump: It depends. I think when you talk about branching with databases, I still haven’t found a database strategy that I like. Code is a lot easier but databases are a little bit harder mainly because of the deployment aspect of it than anything else. I’d say go for it. Pick one, whether you’re using something like the Git strategy or whatever and see how it works. Whatever sticks with your team is usually the one that I would stick with. Pick one, try it for a few weeks. If it doesn’t work, go for something else that kind of works with your team. I think branching is more what works with your team than actually what works technically. So just try something with your team. Talk it over with them, get some feedback. If it works, stick with it. If it doesn’t, go somewhere else.

Erik Darling: Yeah, I think it’s a bit more of a cultural thing than a tech thing. But what do I know? I’m not a developer.

Does full recovery model slow down performance?

Erik Darling: J.H. has an interesting question, “Does putting a database in full recovery model slow down performance?”

Tara Kizer: He said model!

Richie Rump: It’s catching on.

Erik Darling: “And are there any native methods to query the transaction log?” Yes and yes. That’s my answer. Tara, what do you say?

Tara Kizer: Does it really affect performance though?

Erik Darling: Inserts, updates, and deletes.

Tara Kizer: Even if you use simple recovery model it’s still writing those transactions to the transaction log. You’re not getting rid of that step. They still get written to the transaction log it’s just what happens at the end of the transaction. If you’re using simple recovery model, that transaction is now gone from the transaction log. So, yeah, you do have to do log backups for full recovery or bulk-log but do they really slow down performance in the recovery model?

Erik Darling: Yeah, so like one thing that a lot of people might be getting in simple recovery and not know it is minimal logging.

Tara Kizer: Right.

Erik Darling: Minimal logging is pretty magical for inserts. So if you’re getting minimal logging sort of by accident, if like it’s Tuesday on Mars and everything else is aligned for you and you haphazardly get it, than, yes, it can totally way speed up inserts like crazy. But, other stuff, not really.

Tara Kizer: What version did that start out in?

Erik Darling: Minimal logging? Oh, god, 2005 maybe?

Tara Kizer: Oh really? Okay.

Erik Darling: It’s been around forever. So like your database has to be in simple or bulk-logged. And this is probably why—you probably never saw much of it because you’re Miss Full Recovery Model.

Tara Kizer: Yeah, I was. Only non-prod use. I never even used bulk-logged because your RPO is destroyed with bulk-logged.

Erik Darling: Bulk-logged is like a weird joke.

Tara Kizer: Yeah, I knew about minimal logging as far as like truncate and bulk inserts and things like that, but are you saying that the insert into T-SQL command can use minimal logging?

Erik Darling: Yeah.

Tara Kizer: Okay, I’ll look into that.

Erik Darling: …sometimes you have to use a trace flag.

Tara Kizer: Okay.

Erik Darling: It only works on tables of like an empty clustered index and no non-clustered indexes or a heap…

Tara Kizer: So a lot of rules.

Erik Darling: If you’re accidentally getting it, it’s really helpful. Sometimes even when if you’re doing everything right it doesn’t work. Minimal logging. One of those things.
Tara Kizer: Michael asked what is RPO because I kept saying RPO and RTO. RPO is your recovery point objective, that’s how much data can your company lose. RTO is your recovery time objective, basically how long can you be down. How long is it going to take you to restore a failover or whatever it is.

Richie Rump: I’m pretty sure we have a page on that.

Tara Kizer: We do.

Erik Darling: Like a billion of them. Billions of pages. All right, thank you for joining us. We’ll see you next week. Goodbye, everyone. Adios.

Guess the SQL Server 2017 Release Date Contest

Last Updated October 1, 2017

Brent Ozar

SQL Server 2017

286

They used to release doves, but that’s so John Woo

We’ve been doing this a while now, and time to kick it off again.

Leave one – and only one – comment here with your guess of the date. If you leave multiple comments, only the first/earliest one is going to count.
“The date” is the date that Microsoft announces as the release date. (Not the date they announce the release date, but the date where the final RTM bits will be downloadable to non-insiders, ordinary folks with MSDN access.)
Closest to win, without going over, wins an Everything Bundle.
In the event of a tie, the earlier comment wins.
Only comments more than 48 hours earlier than Microsoft’s public release announcement will count. If Microsoft makes their announcement, and you run over here trying to leave a fast comment with the release date, not gonna take it.
If Microsoft announces two release dates – one for Windows, and one for Linux – then we’ll pick a separate winner for each. (But you only get to leave one date in the comments.)

So what date are you feelin’?

Update 2017/02/23: Helpful reader Brian Boodman wanted to see what dates people were guessing, so he writes:

Quick and dirty C# snippet for “Guess the SQL Server vNext Release Date”. To use, save https://www.brentozar.com/archive/2017/02/guess-sql-server-vnext-release-date-contest/ to c:\tmp\toto\brentdate.html” using Chrome and run the code. The results of this script are intended to be read by a human. http://pastebin.com/nUTTEkqc Modify as needed. Dates earlier than 2017 are very unreliable.

Update 2017/09/25: Looks like the release date will be October 2, 2017! The winner is Martin Surasky, who guessed it first on July 19th. Runners-up are Stephen Holland (who guessed Oct 3 on Feb 17) and Joe Bednarz (who guessed Oct 1 on Feb 17).

When THREADPOOL Waits Lead To RESOURCE_SEMAPHORE Waits

Last Updated January 8, 2019

Erik Darling

Memory Grants, Wait Stats

Your server is underpowered

That’s an understatement. Your server sucks.

It has four cores in a single socket, data outpaces RAM by a country mile, the disks have whiskers, and the network card still has a phone jack in it.

Alright, so maybe it’s not that bad, but it’s bad enough that you run into trouble.

Always Be Blitzing

You’re a smart goofball, though. You run sp_Blitz, and sometimes you even read the output. For servers in really sad shape, it’ll warn you about something we call Poison Waits. These are a group of wait types that can really ruin your day.

Side effects range from login and query timeouts, degraded performance, nausea, vomiting, and diarrhea. Actually, those last three are limited to you, when your phone and inbox start lighting up with frantic high priority emails from people with office doors.

Unfortunate

One thing I see pretty frequently on these 99 cent bin servers, is people will respond to THREADPOOL waits by upping Max Worker Threads to accommodate additional requests. This may seem fine at first, and when your server is under similar load performance degrades… well, less. But adding all those threads to your CPUs that came with a download code for Blood II only gets you so far.

The problem with all those new worker threads is that each one takes up memory. How much? 2048 KB, or 2 MB for the KB impaired.

All those new threads, taking all that same old memory up. Any guesses where that memory comes from?

Hint: Stuff your users need. And eventually, if you have enough going on, stuff your system needs. I’ve seen people push MWT up to 2000, which is 4 GB of memory if they all get used. Depending on how much is in your system, and what else is going on, this can contribute to other Poison Waits like RESOURCE_SEMAPHORE and RESOURCE_SEMAPHORE_QUERY_COMPILE.

Better way?

The first thing to look at if you hit THREADPOOL waits are your parallelism settings. If you’re curious about those, checkout our Setup Guide. If you’ve already set those, you may need to cut MAXDOP in half, and/or increase Cost Threshold For Parallelism, so fewer queries go parallel, and fewer cores get used when they do.

After that, take a good hard look at your queries and indexes. You can use sp_BlitzCache and sp_BlitzIndex, which are also available at the First Responder Kit link above. Reducing query cost will further reduce CPU strain. This is a good thing.

Finally, tell your cheapskate boss to buy some decent hardware. If you need a starting point, this is what I have sitting under my desk. If that’s way more than what’s in your production server, you’ve got a problem.

Thanks for reading!

Brent says: when we see these symptoms, it’s almost always on VMs with 2-4 cores and 8-16GB RAM. Remember, folks, licensing is expensive. You wouldn’t bring home a present from Tiffany’s and wrap it in a Happy Meal box. Don’t run SQL Server on your grandpa’s laptop.

What is Batch Requests/sec?

Last Updated February 14, 2017

Tara Kizer

Load Testing, Monitoring

When I first look at a server, I want to know how busy it is, where its bottlenecks are, what is SQL Server waiting on and many other things. Batch Requests/sec is one of the data points that is used to measure how busy a server is.

WHAT IS BATCH REQUESTS/SEC?

Batch Requests/sec is a performance counter that tells us the number of T-SQL command batches received by the server per second. It is in the SQLServer:SQL Statistics performance object for a default instance or MSSQL$InstanceName:SQL Statistics for a named instance.

WHAT COMPRISES A BATCH?

If I execute a stored procedure that has multiple queries and calls to other stored procedures, what will Batch Requests/sec show? Let’s test it to find out.

I created three stored procedures:

Run four SELECT queries
Call the first stored procedure twice
Call the first stored procedure, call the second stored procedure and run a SELECT

CREATE PROC BatchRequestsTest1
AS
SET NOCOUNT ON;

SELECT DisplayName, CreationDate
FROM Users
WHERE Id = 1;

SELECT PostId, CreationDate
FROM dbo.Votes u
WHERE Id = 44;

SELECT *
FROM PostTypes;

SELECT TOP 100 * 
FROM Comments
ORDER BY Id DESC;

GO

CREATE PROC BatchRequestsTest2
AS
SET NOCOUNT ON;

EXEC BatchRequestsTest1;

EXEC BatchRequestsTest1;

GO

CREATE PROC BatchRequestsTest3
AS
SET NOCOUNT ON;

EXEC BatchRequestsTest1;

EXEC BatchRequestsTest2;

SELECT Text, Score, CreationDate
FROM Comments
WHERE CreationDate > '03/08/2015';
GO

CREATE PROC BatchRequestsTest1

SET NOCOUNT ON;

SELECT DisplayName, CreationDate

FROM Users

WHERE Id = 1;

SELECT PostId, CreationDate

FROM dbo.Votes u

WHERE Id = 44;

SELECT *

FROM PostTypes;

SELECT TOP 100 *

FROM Comments

ORDER BY Id DESC;

CREATE PROC BatchRequestsTest2

SET NOCOUNT ON;

EXEC BatchRequestsTest1;

CREATE PROC BatchRequestsTest3

SET NOCOUNT ON;

EXEC BatchRequestsTest1;

EXEC BatchRequestsTest2;

SELECT Text, Score, CreationDate

FROM Comments

WHERE CreationDate > '03/08/2015';

I have no other load on my instance. Batch Requests/sec is 0 (see next section for where to find it).

When I execute BatchRequestsTest1, Batch Requests/sec turns to 1 even though BatchRequestsTest1 contains four SELECT queries. It then goes back to 0.

EXEC BatchRequestsTest1;

1	EXEC BatchRequestsTest1;

Batch Requests/sec is 1 when I execute BatchRequestsTest2 and then goes back to 0.

EXEC BatchRequestsTest2;

1	EXEC BatchRequestsTest2;

You can probably already guess what Batch Requests/sec will be when I execute BatchRequestsTest3.

EXEC BatchRequestsTest3;

1	EXEC BatchRequestsTest3;

Highlight all three executions and then click execute. Batch Requests/sec is 1 as all three executions are in one batch.

EXEC BatchRequestsTest1;

EXEC BatchRequestsTest2;

EXEC BatchRequestsTest3;

EXEC BatchRequestsTest1;

EXEC BatchRequestsTest2;

EXEC BatchRequestsTest3;

No matter how many queries are inside a batch, it will add 1 to Batch Requests/sec. For systems that have very long, complex stored procedures, Batch Requests/sec may not be a good metric to determine how busy the server is. You have to combine the metric with everything else you look at.

WHAT IF MY SCRIPT HAS GO IN IT?

Let’s look at one more example. This script has two GOs. GO signals the end of a batch of T-SQL statements in some utilities, such as Management Studio. If I execute the below script all at once, what will Batch Requests/sec be?

BEGIN TRAN;

UPDATE Users
SET Reputation = Reputation + 1
WHERE Id = 1;

UPDATE Users
SET Reputation = Reputation + 1
WHERE Id = 1;

UPDATE Users
SET Reputation = Reputation + 1
WHERE Id = 1;

UPDATE Users
SET Reputation = Reputation + 1
WHERE Id = 1;
GO

UPDATE Users
SET Reputation = Reputation + 1
WHERE Id = 1;
GO

ROLLBACK TRAN;

BEGIN TRAN;

UPDATE Users