Blog

Execution Plans, Indexing, SQL Server, TempDB

People often don’t give this thought

Which is a shame, because I see people sticking fairly large amount of data into temp tables. On the rare occurrence that I do see them indexed, it’s a nonclustered index on a column or two. The optimzer promptly ignores this index while you select 10 columns and join 10,000 rows to another temp table with another ignored nonclustered index on it.

Now, not every temp table needs a clustered index. I don’t even have a rule of thumb here, because it depends on so many things, like

Are you joining it to other tables?
What does the execution plan look like?
Is this even part of the stored procedure that needs tuning?

Pretend That We’re Dem

Assuming that all three things lead us to index temp tables, let’s create some and see how they react to different indexing strategies.

SELECT TOP 100000 *
INTO    #TempUsers
FROM    dbo.Users AS u;

SELECT TOP 100000 p.*
INTO    #TempPosts
FROM    dbo.Posts AS p
JOIN #TempUsers AS tu ON p.OwnerUserId = tu.Id;

SELECT TOP 100000 c.*
INTO    #TempComments
FROM    dbo.Comments AS c
JOIN #TempUsers AS tu ON c.UserId = tu.Id
JOIN #TempPosts AS tp ON c.PostId = tp.Id;

SELECT  *
FROM    #TempUsers AS tu
JOIN    #TempPosts AS tp
ON      tu.Id = tp.OwnerUserId
JOIN    #TempComments AS tc
ON      tc.UserId = tu.Id
        AND tc.PostId = tp.Id;

SELECT TOP 100000 *

INTO #TempUsers

FROM dbo.Users AS u;

SELECT TOP 100000 p.*

INTO #TempPosts

FROM dbo.Posts AS p

JOIN #TempUsers AS tu ON p.OwnerUserId = tu.Id;

SELECT TOP 100000 c.*

INTO #TempComments

FROM dbo.Comments AS c

JOIN #TempUsers AS tu ON c.UserId = tu.Id

JOIN #TempPosts AS tp ON c.PostId = tp.Id;

SELECT *

FROM #TempUsers AS tu

JOIN #TempPosts AS tp

ON tu.Id = tp.OwnerUserId

JOIN #TempComments AS tc

ON tc.UserId = tu.Id

AND tc.PostId = tp.Id;

We have three temp tables with 100k rows in them. I joined them a bit to make sure we get good correlation in the data.

If we get the plan for select statement, this is our query plan. A bunch of table scans and hash joins. SQL is lazy. I don’t blame it one bit.

After frequent user complaints and careful plan analysis, we decide to add some indexes. Let’s start with nonclustered indexes.

/*Nonclustered*/
CREATE NONCLUSTERED	INDEX ix_tempusers ON #TempUsers (Id);
CREATE NONCLUSTERED INDEX ix_tempposts ON #TempPosts (Id, OwnerUserId);
CREATE NONCLUSTERED INDEX ix_tempposts2 ON #TempPosts (OwnerUserId, Id);
CREATE NONCLUSTERED INDEX ix_tempcomments ON #TempComments (UserId, PostId);

/*Nonclustered*/

CREATE NONCLUSTERED INDEX ix_tempusers ON #TempUsers (Id);

CREATE NONCLUSTERED INDEX ix_tempposts ON #TempPosts (Id, OwnerUserId);

CREATE NONCLUSTERED INDEX ix_tempposts2 ON #TempPosts (OwnerUserId, Id);

CREATE NONCLUSTERED INDEX ix_tempcomments ON #TempComments (UserId, PostId);

I’ve gone ahead and added two indexes to #TempPosts so the optimizer has a choice if joining in another order is more efficient.

How’d we do?

Not quite there. Two table scans and an index seek followed by a key lookup that SQL estimates will execute 62k times.

What's wrong with your hair? — What’s wrong with your hair?

Do over

Let’s try that again with clustered indexes. We can only give #TempPosts one, so I picked at random. In real life, you might want to do your job!

/*Clustered*/
CREATE CLUSTERED INDEX cx_tempposts ON #TempPosts (Id, OwnerUserId);
CREATE CLUSTERED INDEX cx_tempusers ON #TempUsers (Id);
CREATE CLUSTERED INDEX cx_tempcomments ON #TempComments (UserId, PostId);

/*Clustered*/

CREATE CLUSTERED INDEX cx_tempposts ON #TempPosts (Id, OwnerUserId);

CREATE CLUSTERED INDEX cx_tempusers ON #TempUsers (Id);

CREATE CLUSTERED INDEX cx_tempcomments ON #TempComments (UserId, PostId);

Our indexes get used and everyone is happy. Well, sort of. TempDB isn’t happy, but then it never is. It’s like your liver, or Oscar the Grouch.

Used and abused

When starting to index temp tables, I usually start with a clustered index, and potentially add nonclustered indexes later if performance isn’t where I want it to be.

You can do yourself a lot of favors by only bringing columns into the temp table that you need. If it’s still a lot, there’s a much higher chance SQL will judge a table scan cheaper than using a narrow nonclustered index along with a lookup.

Remember to test this carefully. You want to make sure that adding the indexes doesn’t introduce more overhead than they alleviate, and that they actually make your query better.

Thanks for reading!

Build Me A Build: What’s Your Hardware Budget?

Last Updated February 14, 2017

Hardware

UPDATED 12/15/2016

Spending Brent’s Money

Reason number infinity spacetime + 1000 why working here rules is having a generous annual ~~tax writeoff~~ hardware budget. The first time around, I basically blew it all on working from home essentials. This time around I’m building a cool toy: a big beefy desktop.

Why, in the age of the cloud, am I doing this? Well, our AWS lab is cool, and has versions going back to 2005 for testing script compatibility, but I have to turn it on and off, spinning up new machines adds new costs, and spinning up enough to account for the different HA/DR scenarios I’d like to explore would be annoying. There are other reasons the cloud can be difficult: networking, good I/O is expensive, and emulating shared storage is a bummer.

With a beefy home desktop, I can create my own VPC with all the VMs I want, my own AD, and with enough hardware, the limits on what I can set up are few. I can have VMs for Log Shipping, Mirroring, Failover Clusters, and AGs. I can also mix features to look at interoperability.

Doing that on my laptop would kinda stink, especially because most of the time I’m also trying to work on it. That means I have a bazillion Chrome tabs open, Slack, PowerPoint, Excel, SSMS, and various productivity apps like Spotify. With 32 GB of RAM, I have plenty of room, until I start doing anything with the Stack Overflow database. While I could get a second laptop, it costs a lot more to put in commensurate hardware, and you hit hardware caps a lot earlier than you do with desktops.

Shutting SQL down helps, but I still have a lot going on.

My CPUs are generally quiet. These aren’t a problem, until I start doing something stupid in a couple different SSMS windows while I’m trying to do other stuff. Then things get messy.

Choices

Since a desktop would be the best fit, I started trying to find one that would work for me out of the box. Cursory Google searches for “128 GB desktop” were useless. Sites like Logical Increments can give you a decent baseline for building a gaming computer, but the RAM tops out at 64 GB, and you can’t mix and match components to test different combinations for different price points. This didn’t help me because I don’t need a crazy graphics card, but if I had to build a gaming PC, I’d probably start here.

Where you can play the mix and match game is PC Part Picker. Here you can mix and match, you get warnings if parts are known to be incompatible, and you can keep an eye on your power draw vs your power supply. Good? Good.

Honorable mention: Buying a barebones Dell Precision workstation and outfitting it with the parts I want. You get the case, the motherboard, a power supply, and (usually) a fan. The rest is up to you. My processor choices here were limited to Xeons, which wasn’t a dealbreaker, but the sourcing made it a bit more difficult to check for incompatibility issues.

What did I end up with?

The link to the build is here, if you want to play with it.

During the process, I sent my potential build to a couple people who really helped me out. My Favorite College Boy, Jeremiah Peschka, and The Man Whose Tweets Change Software Development, Nick Craver. While I’d love to build the official Stack Overflow developer desktop, it’s a little too far above my budget cap of $3000. I could have split the costs into next year, since it’s so close to the New Year, but whatever. I’d rather save that just in case I have a laptop failure, or Mark Rippetoe designs a computer desk that’s also a power rack, for full

SSD

HIP

DRIVE

Are you out there?

So, hardware people, have at it. See any glaring problems? Obvious tweaks? Better choices? Let me know. I haven’t started ordering parts yet.

I’ll be blogging more about this when I do start ordering parts and building in a week or so. It’s my first from-scratch desktop build, so it should be a hoot.

Thanks for reading!

UPDATE!

After tons of great feedback in the comments section, I made some changes to my build. Most notably, upped the CPU, downed the GPU, and went with the slightly newer M.2, plus I got rid of the monitor, since I really don’t need it. I also took off one SSD to compensate on the price. I’ll probably end up ordering drives after the 1st to fill up the ICY DOCK, but it’s okay because I don’t need those immediately. I have no idea how long it will take me to get parts and build this, so waiting on extra drives isn’t a big deal. Of course, here’s the link if you want to mess with it.

Brent says: It’s crazy that you can get 128GB RAM in a desktop for under $3k. Also crazy: 38″ ultrawide curved monitor for $1500. When you get $3k/year for hardware, you can start making these longer-term investments in your working gear.

Things I Would Love to Put In sp_BlitzCache

Last Updated April 21, 2017

Execution Plans, First Responder Kit, SQL Server

This is my favorite script to work on

Getting stuff to work sp_BlitzCache is so rewarding because XML querying is so ridiculous. I wish Mikael Eriksson would write a book already; it’d make my life a whole lot easier.

Along the way, I see a lot of things that I would love to warn people about, but it’s either not present in cached plans, or not retained in the XML at all.

Here’s a list of my top 5 Query Plan XML wishes

Isolation Level

It would be great to be able to call out potentially harmful isolation levels.

Aside from read uncommitted, being able to warn about repeatable read or serializable would be nice. Especially when it’s set at the query level.

On a related note…

Locking Hints

There’s an open issue for this in the GitHub repo.

The best option so far is wildcard searching the query text for hints like ROWLOCK, PAGLOCK, HOLDLOCK, AND UPDLOCK. That stinks, especially for long queries.

This seems like really logical information to include. If someone didn’t put the hints in, the SQL Handle would be different, etc.

Partition Information

This is only available in actual plans! You can see which partitions were used, and which were eliminated.

At minimum it would be nice to tell people if they’re getting any partition elimination, but that’s a no go with cached plans.

TempDB Spills

Another actual-only piece of information. I understand why this isn’t in the XML. One execution might spill, another might not.

It would be cool to have aggregate information about spills available somewhere, like sys.dm_exec_query_stats though, since so much other related information is in there.

Cascading Foreign Keys

Did you know that when foreign keys perform cascading actions (update/delete), they take out serializable locks under the covers?

That stinks. And the actions just get tacked onto the plan without any notice. Catching these and triggers would be fun.

People might be surprised how much weight they can tack on.

Is there anything I’m missing?

If you take a look at the showplan schema, there’s tons of information to harvest. If there’s anything you’d like included in sp_BlitzCache, head on over to GitHub to make a request.

Just make sure it can be caught in cached plans, and that you can provide a repro script. There’s a lot of great stuff that’s actual only in all that mess, like the newly added wait stats.

Thanks for reading!

First Responder Kit Updates: New Checks Across the Board

Last Updated February 13, 2017

Brent Ozar Unlimited Team

First Responder Kit

It’s the December to Remember Script Event, and we’ve got updates to everything. Shout out to our contributors this month: @BlitzErik, @RichBenner, @runamuk0, @SQLMonger, and @BrentOzar.

Grab the newly updated First Responder Kit, and here’s what’s in it:

sp_Blitz Improvements

#591 – add check for 2016 Standard installed without Service Pack 1 (coded by @RichBenner)
#615 – add check for trace flag 834 used in combination with columnstore indexes (coded by @RichBenner)
#603 – add check for min server memory = max server memory (suggested by @BlitzErik, coded by @RichBenner)
#598 – added checks for non-default database configuration scoped settings in SQL 2016 (suggested by @HenrikStaunPoulsen, coded by @BrentOzar)
#593 – added check for Instant File Initialization using new sys.dm_server_services field in SQL 2016 SP1 (suggested by @BlitzErik, coded by @BrentOzar)
#592 – added check for Lock Pages in Memory using new sys.dm_os_sys_info field in SQL 2016 SP1 (suggested by @BlitzErik, coded by @BrentOzar)
#582 – added check for TempDB file creation error (suggested by @BlitzErik, coded by @BrentOzar)
#616 – improved Target Recovery Time warning because some SQL tools set it to 0 and some set it to 60 (suggested by @aclenscorp, coded by @BrentOzar)
#619 – fixed case sensitivity bug introduced in another patch (coded by @mikejwhitty, only visible to folks like us who work in the dev branch)

sp_BlitzCache Improvements

#617 – add alert for computed columns with scalar operators (coded by @BlitzErik)
#626 – add alerts for more kinds of expensive remote queries (coded by @BlitzErik)
#228 – skip offline databases (coded by @BlitzErik)

sp_BlitzFirst Improvements

#599 – made it more PowerShell-friendly by making all column names unique in the FileStats column output (coded by @runamuk0)

sp_BlitzIndex Improvements

#615 – added warning if trace flag 834 is used in combination with columnstore indexes (coded by @RichBenner)
#622 – stop counting hypothetical indexes in the total number of indexes on a table (coded by @BlitzErik)
#607 – now works in partially contained databases (reported and fixed by @SQLMonger)

sp_BlitzWho Improvements

#544 – added database name in the output to show the query’s context (coded by @BlitzErik)

[Video] Office Hours 2016/12/07 (With Transcriptions)

Last Updated December 10, 2016

SQL Server, Videos

This week, Brent, Tara, Erik, and Richie discuss replicating TRUNCATE TABLE command between nodes, how to cope with social isolation that comes with working from home, database shrinking, how to improve log wait times, how to figure out the mac stop and threshold cost setting for servers, and laziest methods to deploy objects and run scripts.

Here’s the video on YouTube:

Office Hours Webcast - 2016/12/07

Watch this video on YouTube

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Enjoy the Podcast?

Don’t miss an episode, subscribe via iTunes, Stitcher or RSS.
Leave us a review in iTunes

Office Hours Webcast – 2016-12-07

How do you convince the SAN admin that there’s problems?

Brent Ozar: Michella asks, “How do you convince a storage administrator that the database is having IO issues?”

Richie Rump: Chainsaw.

Tara Kizer: It’s a tall order. I’ve worked at large companies with large DBA teams and large SAN teams and talking with them about IO issues can be challenging because in their monitoring tools, they’re not seeing the IO issues that we’re seeing on the server. So they point the finger back at the server, at Windows, at SQL. We point the finger back at them. It can take a lot of tools to figure out where the actual issue is. Now the one problem I have with the monitoring tools that the SAN teams use is that they are averaged over a minute. So they’re not seeing the heavy spikes that we might be seeing because maybe it’s fine 45 seconds for each minute but that 15 seconds is where we’re being hammered and because it’s getting averaged out, they won’t see it. One thing, I’ve worked a lot with SAN teams as well as Microsoft to figure out where the issue is. You can open up a case with Microsoft for help in determining if it’s an issue on a server or somewhere outside the server. They use, it’s an ETW trace that they can help you with and they can interpret it. It will tell you, is the problem on your server or external to it. Now, if it’s external to it, you might have to get other people involved besides SAN teams. You might have to put some kind of—I forget what they’re called—but they can put some packet captures on there to see where the problem is because it could be fiber, could be a lot of different places.

Erik Darling: I would run CrystalDiskMark and I would show the SAN team the results I get from that. We have a blogpost on our site, if you just search for brentozar.com for CrystalDiskMark or diskspd, you’ll get some primers on a couple different ways to test your storage out. Now I would actually turn your question around and ask you why you think you have a storage problem. Then I would ask you to meet your SAN team in the middle on the connections to the SAN because that’s usually what the problem is. The SAN is usually cruising along, the SQL Server is usually cruising along, but you usually have a really crappy either single connection or like not a very good double connection to the SAN from the server. That’s what I would do there.

Brent Ozar: We also have a request for a shout out to Rich Hansel. So there you go, Rich. Howdy. Oh, wow. The hotel housekeeping staff said to send you the house shout out. I said, “Yep, you’re doing a much better job cleaning the room.” That’s totally not true.

Should I worry about trace waits?

Brent Ozar: Next question. “I am seeing trace write waits among my top wait stats. Should I be concerned about trace write waits?”

Erik Darling: Not really. I would only be concerned if I had like double the trace time waits as up time, because that might mean that a couple people are running traces. But if my trace write time to my up time is kind of one-to-one, I’m not that concerned. It usually means that there’s a monitoring tool or something else keeping an eye on your server, or it’s kind of one of those like benign waits that’s not really going to do much.

Do we still need maintenance plans with SSDs?

Brent Ozar: Man, Troy has a great question. I think we’re all going to like chiming in on this one. He says, “Our network engineers are replacing our spinning disk storage with solid state. Our vendor says with solid state we don’t need maintenance plans. Can I really get rid of my maintenance plans?”

Tara Kizer: I’m not sure what they mean by maintenance plans. Are they referring to the actual maintenance plans that SQL Server is providing or they’re referring to something else? Because I don’t know how disks are going to determine if you’re using maintenance plans or not. Regardless of my storage, I’m using Ola Hallengren solution for instance or some custom solution and I just stay away from maintenance plans regardless.

Erik Darling: We like to say that index fragmentation doesn’t matter anyway, so no matter what kind of drives you’re on, who cares. But don’t get rid of all your maintenance plans because some of them might be taking backups.

Tara Kizer: Oh, you’re saying that they’re probably referring to the index maintenance.

Erik Darling: Maybe, I don’t know.

Tara Kizer: So is the question if they can get rid of the index maintenance plan? Or maybe, you can still use a custom solution. So I’m not sure what exactly they’re saying.

Brent Ozar: I hope it’s not that they’re talking about getting rid of backups. That would be bad, or CHECKDB.

Does Metallica’s new album rock?

Brent Ozar: Next comment is, “Metallica’s new album rocks if you care.”

Tara Kizer: I do care. Metallica used to be my favorite band.

Brent Ozar: As long as I can still hear the Black Album, that’s all I really care about because I could keep hearing that forever.

Richie Rump: Jay Z’s Black Album? Or…?

Brent Ozar: Anyone who has a black album, I’m into it.

Richie Rump: I’m listening to The Hamilton Mixtape right now, so I’m not paying attention to you guys, so, sorry.

Brent Ozar: I believe it. I would sing songs off of there but I haven’t gotten that one yet.

What happens when I truncate a table in an AG?

Brent Ozar: Next up, Vladimir asks, “What happens when I do a TRUNCATE TABLE with Always On Availability Groups? How does the TRUNCATE TABLE command get replicated between nodes?”

Tara Kizer: It’s still a logged transaction, it just deallocates the pages, I think. So you’ll still get the truncate on all your replicas.

Brent Ozar: And it’s quick, it’s still quick.

Erik Darling: Well, faster than a delete anyway, usually.

Brent Ozar: Yes.

Should I add more memory incrementally or all at once?

Brent Ozar: Ryan asks, “When I add more RAM to the database server, like from 64 gigs to 128 gigs, is there any reason not to increase the max memory parameter all at once? Should I inch it up incrementally or just go straight up to the number I want to end up at?”

Tara Kizer: Just go straight up. SQL Server is going to incrementally start using it.

Does Agent have problems with daylight savings time?

Brent Ozar: Someone asks, “Doesn’t SQL Server Agent have the smarts to avoid daylight savings time problems?”

Erik Darling: Like what?

Brent Ozar: Elaborate on the problems you’re having. I would be curious to hear more about that.

Tara Kizer: I used to work for a job where our servers had to use GMT time zones that we could do easy conversions for all the time zones where our customers were. So there was a lot of code that you just had to go into—we were actually using UDF to do the time zone conversions at the time. A lot of work had to go into it once the daylight savings time rules changed. Now that was all custom code, there was no issue with SQL Server. I think that maybe around the DST savings time there was a hotfix and then it got put into a service pack. So just make sure you’re not running really old builds of whatever version of SQL Server you’re using.

Brent Ozar: Because those are really common for us. Whenever we code Agent jobs to put our own date logic in there. So that could be some of what’s going on with that.

How do you cope with the social isolation of telecommuting?

Brent Ozar: Fellow human says, “I’m starting a new job where I work 100 percent remotely.” Congratulations.

All: Yay, yeah.

Erik Darling: Welcome to the dream.

Tara Kizer: Pajama [inaudible] coming up.

Brent Ozar: “How do you guys cope with the social isolation of working from home?” Oh, that’s such a good question. Who wants to go first?

Tara Kizer: I don’t cope. I just stay inside now. [Laughter]

Brent Ozar: You like the isolation. You embrace the isolation.

Tara Kizer: I go hiking a lot. So my hiking game has definitely increased since working from home.

Brent Ozar: You can leave the computer for a period of time and not be on call. That is what I think is the most [inaudible].

Erik Darling: We’re talking to customers pretty much all the time over WebEx though. It’s a little bit less isolated. Plus, you know, I got a wife and kid here so if I’m ever feeling lonely, I can just stick my head out the door. But generally, I don’t know, I always kind of liked the alone time anyway even when I’m kind of locked away in my office pounding away at stuff. I kind of enjoy the solitude.

Richie Rump: I think I’m the only really isolated one where I don’t talk to customers. This is our once-a-week, everybody, where we actually get face-to-face time with one another. This is probably about it. I have a wife and kids here. They’re here all the time. Twitter is a good outlet for me when I just need to see what’s going on and chat with some folks. The company chat room is pretty active, especially when I start stirring the pot and then people know that, “Hey, you guys are too quiet.” But I’m with Erik, I prefer the solitude. I can buckle down. I can really think about things. I kind of need silence for that, or at least really loud music to drown everything out. So it doesn’t bug me much at all. But then again, I do my own podcast where I talk with people about their lives and stuff. I’ll go to user groups. I have my own user group. So there are outlets for me that are outside of really work stuff.

Brent Ozar: For me, I’ve been telecommuting since I think 2003. For me, it was really hard initially because I’m very much a people person. I love being around people. So I started with co-working. If you search for co-working spaces in your city, you can get a desk in an office with a bunch of other people. What I love about co-working is they’re not tech people. They’re wedding photographers, artists, salespeople, independent freelancers, just all kinds of things. But I do so much work with clients and training-type stuff that I can’t really be in an open [inaudible] anyway. So I just like, the instant 5:00 hits, I’m out. I like walk out and around. I’ll take a walk, go get coffee. 4:00, really. It’s 5:00 somewhere.

Richie Rump: Actually, Brent is a people person unless you put him in a conference with 32,000 of them. Then all of a sudden he does not become a people person anymore.

Tara Kizer: “Hope they don’t recognize me.” Brent’s got the most famous face in all of the SQL Server world.

Brent Ozar: Wear a hat down real low.

Erik Darling: Get you some of those Kool Moe Dee shades.

Richie Rump: Maybe a Geordi VISOR, that’d be awesome.

Can I add tempdb data files on the fly?

Brent Ozar: Alexandra asks, “We only have one tempdb file and then we have 16 cores on one production server. Can I just add another seven files during business hours, and if so, what’s the recommended size for each of them?”

Erik Darling: I would say yeah, go ahead. I add tempdb files whenever I wanted. Never really bothered me. As far as the recommended size, as long as it’s going to match the size and the autogrowth of the current tempdb file, or else that one tempdb file is just going to kind of black hole all the queries if it’s larger. Then if you make the other ones bigger, they’re going to absorb the work. So same size and autogrowth as the current file and then add them whenever you want.

Tara Kizer: If you don’t have enough storage to add seven more files of that same size of the first file, what I tell my clients is add seven files, divide your total size of your first file by eight and add seven new files with that size, same autogrowth as the first file, and then go back and shrink your first file down so it’s the same size as the other seven.

Erik Darling: Database shrinking? Oh no!

Tara Kizer: Sometimes we do say shrink. Shrinking is bad but there’s some instances where you need to do it.

Richie Rump: Shriek. They meant shriek, databases shrieking.

More about the Agent job issue (and bugs in general)

Brent Ozar: We have a follow up on the Agent job question. He says, “My SQL job was supposed to run every ten seconds all the time but when daylight time fell back, it did not run for an hour.” Oh, that’s really interesting.

Erik Darling: Oh.

Brent Ozar: I would want to know more about what happened with it because it could be a lot of interesting things around like job history tables or whatever. But if you wanted to, I would post that at dba.stackexchange.com and include as much logging detail as you can because I haven’t run into that. When times fall back, it still continues to run every minute for example but it would be interesting to see.

Tara Kizer: You might need to open up a support case with Microsoft on that one. It might just be a bug.

Brent Ozar: I’m pretty sure it’s not. I’m pretty sure it’s just you.

Richie Rump: We’ve never seen bugs in SQL Server. That never happens.

Brent Ozar: One of my bugs got fixed today.

Tara Kizer: Wow, which one?

Brent Ozar: The one when you save an execution plan and then you go to click file, open recent, it’s not there.

Tara Kizer: Oh, I don’t even use that feature.

Brent Ozar: … every Connect item I’ve ever put in, that would be the lowest priority.

Erik Darling: It was also the easiest one.

Richie Rump: Yeah, the MRU list, that one is the easiest one. Sorry, guys.

Brent Ozar: Chris says, “I had a job where I was the only IT person and now I have a whole team of goofballs and I think it has lowered my stress dramatically.” Well you know, you can have people who are goofballs who aren’t in IT as well.

Tara Kizer: All the weight is lifted off your shoulders when you have lots of people doing the work.

Why is my query slow in the app, and fast in SSMS?

Brent Ozar: Larry asks, “We have a simple select from a view. It has two where conditions. When I run the same query from the query window, it chooses a different index.” So he’s saying when he runs it in the application, it performs differently than when he runs it in SSMS.

Tara Kizer: Brent, are you going to say your article, “Slow in the App, Fast in SSMS?”

Brent Ozar: Go right ahead.

Erik Darling: Mark it.

Tara Kizer: Oh no, you guys go ahead. I don’t know the name of the author. I mean, I know how to answer the question but take it away because I don’t know who has that web page.

Erik Darling: Erland Sommarskog.

Tara Kizer: Okay. What the issue is likely is that when you connect from an application, you have different set operations than when we connect in SSMS. So when I’m trying to troubleshoot why it’s slow in the application and fast in SSMS, I’ll run a, let’s say just a very, very quick profiler trace. Only leave the existing connections in there, that event, and then start it and then immediately stop it. Then find your user ID and then grab those set operations that came out of the profiler trace. Dump those into Management Studio. Then run your query in that same window. So that query window has the same set options as your application user. So when the set options are different, you will have your own execution plan and it can be different than what the app has. You can also get this through a DMV query, I believe, the set operations, but I’m just so used to doing a profiler trace with existing connections and only running it for one second. So it’s easy for me.

Erik Darling: Also, if both of the plans are in cache and you can catch them with sp_BlitzCache, if you scroll all the way to the right in BlitzCache, it will give you the set…

Tara Kizer: It does? Wow, I didn’t know that.

Erik Darling: Yeah, no one ever scrolls that far.

Tara Kizer: I only ever scroll to the right to look for the cache time and last executed and the execution plan. I pretty much ignore columns, I should start looking I guess.

Brent Ozar: I forgot it was in there until this week.

Erik Darling: You should see what happens when you put in expert mode equals one.

Tara Kizer: Oh. See I didn’t even know it had an expert mode.

Erik Darling: Yeah.

Why are my tempdb data files in use?

Brent Ozar: Bert says, “I’m trying to reduce the number of tempdb data files I have. I had four, then I tried six, and now I want to go back to four but every time I try it, SQL Server tells me the file is in use, even after a reboot.”

Tara Kizer: Probably have to run alter—whatever it is—empty file. Yeah, it’s shrink command. You have to specify the empty file. It will empty the file and it will no longer be used and then you’ll be able to remove that file.

Brent Ozar: Yeah, there’s also a—one of you guys told me—blowing the plan cache. You can have a plan cache issue. Was it you, Erik? Or was it Tara?

Tara Kizer: It was me. I encountered that at my last job. So yeah, there’s a blog post on that. But the shrinking wasn’t working I believe. We couldn’t shrink it down and had to [inaudible] the procedure cache.

Richie Rump: You can’t blow in it like an NES cartridge?

[Laughter]

Erik Darling: You can, but you may have mixed results.

Richie Rump: You have to do it again, and again, and again.

If we have high log write waits, what do we do?

Brent Ozar: Someone says, “Thanks for the Christmas card. If we have high log wait times on an app and we can’t change the code,” it’s relativity, they say. So I’m teaching a class in here this week and immediately one of the guys from [inaudible] pops up, “what?” “Any suggestions on how to improve this?” So it’s long wait times, and he says log wait times. “I’ve read it helps to commit less often but since I can’t change the code, is there anything I should do?”

Erik Darling: Get faster drives.

Brent Ozar: One of the reasons why stuff like relativity will have high log wait times is you’re writing a lot of stuff from the database. You’re doing lots of inserts which take a long time to write. You could get faster transaction logs. You could—I have an awful idea, but I’ll wait to see what Erik is going to say.

Erik Darling: My whole thing for that is just the faster disk because you’ve got lawyers going through and updating one thing at a time and that triggers the audit record insert and then there’s just a whole bunch of other stuff going on under the covers when they change something, redact something, and make a production. There’s a lot of stuff that happens whenever you change one thing. You’re not just changing one thing. There’s a lot of little other things going on. So the faster drives is always a good one for me. But I’m sure Brent has something nefarious to say.
Brent Ozar: I have a really bad answer and you shouldn’t quote me on this and surely it’s going down in the recording, but if you’re on SQL Server 2014…

Erik Darling: Agh.

Brent Ozar: Yeah, right? … delayed durability, which means your inserts, updates, and deletes are consider committed as soon as SQL Server hears them. They don’t even harden to disk. Now, this is so bad for two reasons. One, you can lose data when you’re not expecting it, like when SQL Server fails over. Whenever there’s a failover, anything that didn’t make it to the disk to log file, you’re out of luck. Second, you could lose data even on a graceful shutdown. SQL Server does not flush the log file to disk when you do a graceful shutdown. So this really is like a last resort if you didn’t care about the data. And when it’s relativity data, I would find it very hard to believe you would not care about that data.

Brent Ozar: Jackie asks, “How do you guys calculate…?”

Richie Rump: Wow.

Brent Ozar: Right, I know, see? And aren’t you proud to work with me? This is why I’m not allowed in front of people very often.

Richie Rump: It’s like how do NoSQL SQL Server. Hmm.

Erik Darling: You know what? They’ll put out that as a feature but they won’t put out un-log tables as a feature. They’ll un-log everything but you can’t just choose a table for it. What a–.

Brent Ozar: Either you care about nothing or you care about everything. There’s nothing in-between.

How do you calculate MAXDOP and Cost Threshold for Parallelism?

Brent Ozar: Jackie says, “How do you guys figure out the MAXDOP and cost threshold settings for servers?”

Richie Rump: 20-sided dice.

Erik Darling: That’s not a bad answer.

Tara Kizer: We have recommendations but it’s going to be dependent upon your server. We recommend cost threshold for parallels be 50. Five is the default and it’s a really old value and we think Microsoft should really raise it but they like backwards compatibility. MAXDOP, maybe half of your processors, up to eight, usually don’t go over eight. Then if any queries that you have are suffering with those server settings, then you can add the OPTION MAXDOP to that individual query to affect it, maybe to use less processors or maybe even more.

Brent Ozar: You’ve got lots of options once you start playing around. You can even do things with like resource [inaudible] guides but of course that takes much more work. So the 50 and up to eight works really well.

Erik Darling: You know what, there’s actually a good script on dba.stackexchange.com, like how to calculate MAXDOP. I think that guy Ken wrote it and it’s just a script that you run and it basically gives you like however many cores are in a single socket up to eight. So if you have dual sixes or dual fours, it will give you six or four.

Brent Ozar: If you happen to find that, make yourself a note and send it to me because I’m doing like an all-Stack week’s link for next week.

How should I deploy changes to my database?

Brent Ozar: Next one, “I use a SQL command script to run deployment scripts.” Whoa, my god, my condolences. Richie immediately is even like backing up. Tell me more.

Erik Darling: Are you on Express Edition? Why are you doing that?

Richie Rump: Oh no, don’t tell me more, not at all. No, I don’t want anything to hear about this.

Brent Ozar: “I hear there are deployment tools that do sync and compare and I’m looking for the laziest method to deploy objects and run scripts.”

Tara Kizer: I don’t know, SQLCMD is easy. You’re just typing in a command and you’re running a SQL file through it. But as far as what other people have used, my last job used Visual Studio’s schema compare to come up with the scripts and then we would just run those in Management Studio. But I have used the SQLCMD method, not because I chose it but because that was what was delivered from development. That works fine too. It is a super easy way to do it, but who’s writing the SQL script? If you’re the one writing the SQL script, are you asking, how are you generating all the commands to compare, to development, to production so you do the deployment? Visual Studio’s schema compare does that as well as Redgate has a schema compare as well.

Richie Rump: Redgate is the really good one. Visual Studio is the free one. I’ve used both with good success on both. So have at it.

Is there a contemporary music artist you all like?

Brent Ozar: Same person continues their question with kind of a horrifying question, “Also speaking of music…” We weren’t speaking of music, but that’s okay, I’ll allow it. “Is there a contemporary artist you all like?” That is definitely a no.

Tara Kizer: Contemporary. [Laughter] I don’t know if we agree on anything.

Richie Rump: You should have seen us in Austin, man. It was not pretty. It was not pretty at all. Red Hot Chili Peppers came on and like half the room cleared out.

Tara Kizer: I had to leave.

Richie Rump: It was almost fighting words.

Tara Kizer: That’s like the worst band ever. I like that kind of music, that genre, I just do not like the Red Hot Chili Peppers.

Brent Ozar: Metallica might not be that bad of a shot. Richie? No, no on Metallica.

Richie Rump: No, I have to draw a line somewhere.

Erik Darling: I’m with you. It’s like pre-Black Album is all I can deal with.

Tara Kizer: That’s how I am too, but I say they were my favorite band, they are no longer. I kind of liked Richie’s playlist that he shared with us a few months ago in a Spotify list. It was a bunch of 80s music, so I’ve actually listened to a bunch of that stuff. It’s a really good list.

Richie Rump: Actually, that was Erik’s. That wasn’t mine. I was doing the alt stuff. That was Erik’s.

Tara Kizer: No, no, it was yours. Maybe it wasn’t 80s then. I just can’t remember what genre it was.

Erik Darling: I have never shared a Spotify playlist in Slack because I don’t want you guys poking around my other playlists.

Richie Rump: Mine is really easy now, it’s just like been all Hamilton stuff because the kids have been singing it nonstop so the past couple weeks have just been that.

Can you explain what reads means?

Brent Ozar: This one is interesting. “Can you explain to me what specifically this means?” I believe this has Richie all over it.

Tara Kizer: Stats IO output.

Brent Ozar: Scan count, some number, logical reads, some number, physical reads, some number, and read ahead reads, some number. I think those are lotto ticket suggestions is what that is.

Richie Rump: Yeah, we may win something here.

Brent Ozar: Yeah, so you read a table multiple times. Logical reads means you read those 8k pages out of memory. Physical reads means you read some of those pages out of disk. Read ahead reads means SQL Server saw you reading a bunch of pages off of disk and started reading in front of you even further.

Tara Kizer: I recently read that read aheads were an Enterprise Edition feature, is that true?

Brent Ozar: Standard will read ahead, just not as far as Enterprise Edition.

Tara Kizer: Oh, okay.
Brent Ozar: So in terms of what it means, you only read 28,000 pages, like the logical reads thing. So that actually doesn’t scare me that bad. That shouldn’t be terrible. Scanning isn’t necessarily bad either, so I wouldn’t just go based on that.

Erik Darling: Well the scan count isn’t actually scans, it’s just index accesses. So it’s not like you scanned the whole index that many times. If a query goes parallel, you can have multiple things reading from it and you get a whole mess of scan counts that aren’t actual full index scans over and over again. It’s just chunks of the index.

Why is CPU time low but elapsed time high?

Brent Ozar: There’s an interesting one. Kenneth asks, and I’m going to rephrase his question a little. Kenneth says, “I have a query where CPU time is only a couple seconds but elapsed time is 30 minutes. What are some things that could account for that?”

Erik Darling: Compiling.

Tara Kizer: [Laughter] That would be a lot of compiling, dang. We’d need to see what you’re waiting on, probably being blocked I would imagine, but you’re waiting on something. During that 30-minute window, check what you’re waiting on.

Brent Ozar: Disk could be reading from disk, but I like the locking thing a lot.

Can you use Standard Edition for development?

Brent Ozar: Sam asks, “Can you use SQL Server Standard Edition for development? If so, do you simply license your developers with MSDN?” Developer Edition is free, so I would just use Developer Edition instead. Stick with that one.

Richie Rump: But you could license with MSDN too. I think those images are up there as well. So if you say, “We really don’t want any developers touching any Enterprise Edition stuff,” you could do that.

Erik Darling: I can’t confirm but I’ve heard that if you are a Standard Edition shop and you have like a thinner edition on a laptop, Microsoft is not going to make you license it like Standard Edition, but you know, it’s up in the air a little bit. I wouldn’t bet on it.

Brent Ozar: But Erik’s answer is due to the fact that he’s covered in tattoos and when the licensing person came to audit, they just kind of were like, “Yeah, no, no, you’re good. You’re all right.”

Brent Ozar: The follow up on the SQL command deployment scripts, someone says, “I hate SQL command for deployments because it only returns one line of output. If there’s an error further down the script, you’d never know.” That’s a great point.

Tara Kizer: I haven’t seen that occur though. I’ve seen lots of outputs. I’m wondering if it’s maybe just the command. So you can use the -o, I think it is, parameter, to put that into a file. I haven’t seen that before.

How can I prevent a SELECT from blocking?

Brent Ozar: J.H. says, “Any methods to prevent blocking from a select statement?”

Erik Darling: From a…? Oh, so like, do you mean like the schema stability lock that it takes?

Brent Ozar: It might even be just pessimistic locking, where the lock is taking forever.

Erik Darling: Yeah.

Brent Ozar: What are some of the ways you could get around that?

[Crosstalk]

Erik Darling: Not even no lock prevents…

Tara Kizer: That’s true. I wonder what they mean though. They’re probably talking about the select is blocking other connections and maybe it’s not just a schema stability lock.

Erik Darling: Could be, not sure though.

Tara Kizer: Need more info.

Brent Ozar: If you search for isolation levels on our site or if you go to brentozar.com/go/rcsi, which is funny, it’s just a module I just got done presenting in class here. We have a whole section just of resources on that that you might be really interested in. The big takeaway is readers don’t block writers and writers don’t block readers.

Erik Darling: But writers will still block writers.

Brent Ozar: Yeah.

Brent Ozar: We’ll take one more question. I’m going through the ones that are 15 lines long and past that. Someone says, “Yes, do not use locks.” So follow up on the select thing. This person is doing an insert. He’s doing an insert into a table using select from. So you could do with no lock on your select thing. You just know you’re getting dirty reads, so you could see rows twice or not see them at all, or your insert could fail.

Erik Darling: I would like to know how many rows you’re selecting because if you’re selecting a lot of rows, that could lead to higher level locks. There’s a great a blog post by Michael Swart about batching modifications. I’m going to find a link to it and I’ll put that in chat. It’s a really good read about how to batch things properly. So if you’re selecting a lot of rows, you could be getting some weird locking patterns from that as well.

Brent Ozar: Perfect. All right, thanks everybody for hanging out with us this week. Head over to GroupBy.org. This is a totally free conference where you get to pick the sessions. Voting is going on right now. So go to groupby.org, pick what you want to see, and then we’ll be taking the top ten or top five to seven and we’re running those in January. Thanks, everybody. See you next week.

Vote Now to Pick the GroupBy Conference Lineup

Last Updated February 14, 2017

GroupBy Conference

Voting closes soon at GroupBy, the new free online conference where YOU pick the lineup. Here’s the sessions up for votes – there’s a lot of them, so maybe focus on rating the sessions you find the most interesting – quantity of votes counts for a lot here:

Analytics and Business Intelligence:

Database Administration:

Development:

Performance Tuning:

PowerShell:

Multiple Categories:

Voting closes next Friday. Make your voice heard and pick the conference lineup.

Should I Install Multiple Instances of SQL Server?

Last Updated April 9, 2017

Licensing, SQL Server

Instance stacking is the technique of installing multiple instances of SQL Server on the same Windows instance. For example, you might have a VM or server named SQLPROD1, and have:

SQLPROD1 – default instance of SQL Server, say SQL 2016
SQLPROD1\SQL2014 – for our older apps
SQLPROD1\SharePoint – because it’s supposed to be on its own instance
SQLPROD1\development – for our QA & testing

The Benefits of Instance Stacking

Lower SQL licensing costs – you only have to pay for one license, and then even Standard Edition lets you install dozens of instances on the same Windows base.

Lower Windows licensing costs – you only have to pay for one Windows.

Easier Windows patching – since you only have to patch one OS install.

The Drawbacks of Instance Stacking

Much harder performance tuning – all of the instances share the same CPU, memory, network pipe, and storage pipe. While SQL Server does offer tricks like affinity masking and memory settings to alleviate the first two, it’s got no answers for the second two. A backup on one instance will knock out performance on the other instances regardless of how much tuning work you put in. If none of the instances are performance-sensitive, this doesn’t matter – but how often does that happen? And how do you figure out what the “right” memory or CPU settings are? It takes so much human work and experimentation that it really only makes sense when you have plenty of free time per DBA per server.

Much harder reboot planning – you have to get all of the customers on all of the instances to agree on a time to patch Windows.

Security challenges – sometimes, we get those awful folks who insist on being able to RDP into the Windows instance that hosts their databases. If they insist on being sysadmin on the box altogether, then they can make changes that wreak havoc on the other running instances.

The Alternative: Virtualization

Whenever you think about carving up a single server into smaller pieces, think virtualization instead. It’s a great default place for new SQL Servers.

Every SQL Server deserves its own Windows instance. Yes, this does mean higher licensing costs – you’ll need to license SQL Server Enterprise Edition at the hardware host level, and then you can pack in as many VMs as possible into the host.

Then, each VM gets their own performance management, patch schedules, and security. Plus, surprise bonus: every VM, even the tiniest ones, get all of the features of Enterprise Edition.

Availability Groups: More Planned Downtime for Less Unplanned Downtime

Last Updated September 21, 2017

Always On Availability Groups, Breaking News, SQL Server

I often hear companies say, “We can never ever go down, so we’d like to implement Always On Availability Groups.”

Let’s say on January 1, 2016, you rolled out a new Availability Group on SQL Server 2014. It’s the most current version available at the time, and you deploy Service Pack 1, Cumulative Update 4 (released 2015/12/22). You’re fully current, and it’s a stable engine from 2014 – how many more bugs can they find, right?

Here’s what your patching schedule would look like:

2016/02/22 – Cumulative Update 5 – corrupted columnstore indexes when AG fails over, stack dumps on AG secondaries.

2016/04/19 – Cumulative Update 6 – non-yielding schedulers during AG version cleanup, FileTables unavailable after AG failover, canceling a backup causes the server to crash (not related, but cringeworthy) – whew! This one has a lot of big fixes. We should definitely apply this.

2016/05/31 – OH SNAP! CU6 broke NOLOCK. Sure hope you didn’t apply that. Time to take another outage to apply the revised version.

2016/06/21 – Cumulative Update 7 – SQLDiag fails in AGs. You could probably skip this one if you don’t use SQLDiag, and most shops don’t.

2016/07/11 – Service Pack 2 – improved lease timeout to prevent outages, filestream directory not visible after a replica is restarted (wait I thought we fixed that in CU6? no wait that was FileTables), missing error numbers in XE.

2016/08/26 – Cumulative Update 1 – memory leak on AGs with change tracking, error 1478 when you add a database back into an AlwaysOn availability group (sic).

2016/10/18 – Cumulative Update 2 – no AG fixes, woohoo!

What do you mean there's only one engine? — What do you mean there’s only one engine?

That’s 5-7 patch outages in 11 months (and I’m not even listing all of the fixes in these, which include things like incorrect results bugs, plus awesome new DMV diagnostic features that you definitely want.)

Here’s the way I like to explain it to companies: if you have an airplane, it’s absolutely imperative that its engines not fail mid-flight. In order to accomplish that, you have to have regular downtime for mechanics to examine and replace parts – and that doesn’t happen up in the air. With Availability Groups, we’re lucky enough to be able to transfer our ~~passengers~~ databases from one airplane to another quickly – but we still have to have those other airplanes getting constant examinations and patches from mechanics.

Selective XML Indexes: Not Bad At All

Last Updated February 14, 2017

Indexing, SQL Server

Promises, Promises

I said I’d follow up on this, and here it is. I’m by no means the world’s foremost authority on these, I just played with a simple example until I figured out what worked the best for my query. So you can all play along at home, I’m using XML we all have access to: Query Plans!

If any of you have ever used sp_BlitzCache, you’ll know there’s a ton of XQuery involved. I can’t promise that I’ll be adding XML indexes to it in the near future. Selective XML indexes are 2012+, so I’d have to wait for 2008R2 to go out of support. At this rate, that won’t happen until I throw SQL Server out a window and open a bar that I only let my friends drink in.

Other forms of XML indexes can end up being quite large, and often not worth the resource expenditure to create for the relatively small amount of XML processing we limit it to, by default. The performance gains just wouldn’t be there.

Hey, Bartender

So what is a Selective XML index? It’s an index! For XML! Where you pick the parts of the XML to index. Other XML indexes are rather all or nothing, and can end up being pretty huge, depending on the size of your documents. I think they’re at least size of data, if I recall correctly. Let’s take a look at some examples.

SELECT
ID = IDENTITY(INT,1,1),
deq.query_plan
INTO #xml_index_test
FROM sys.dm_exec_cached_plans AS dec
CROSS APPLY sys.dm_exec_query_plan(dec.plan_handle) AS deq

ALTER TABLE #xml_index_test ADD CONSTRAINT pk_id PRIMARY KEY CLUSTERED (ID)

SELECT

ID = IDENTITY(INT,1,1),

deq.query_plan

INTO #xml_index_test

FROM sys.dm_exec_cached_plans AS dec

CROSS APPLY sys.dm_exec_query_plan(dec.plan_handle) AS deq

ALTER TABLE #xml_index_test ADD CONSTRAINT pk_id PRIMARY KEY CLUSTERED (ID)

I’m lazy, so I’m just grabbing a bit of code from BlitzCache to query our XML.

Right now, there are a paltry 225 query plans in my temp table. I know, right? Some consultant I am. I’m not even generating a ridiculous workload to pollute my plan cache.

SET STATISTICS TIME, IO ON 

;WITH XMLNAMESPACES('http://schemas.microsoft.com/sqlserver/2004/07/showplan' AS p)
SELECT xit.ID, 
SerialDesiredMemory = xit.query_plan.value('sum(//p:MemoryGrantInfo/@SerialDesiredMemory)', 'float') ,
SerialRequiredMemory = xit.query_plan.value('sum(//p:MemoryGrantInfo/@SerialRequiredMemory)', 'float'),
CachedPlanSize = xit.query_plan.value('sum(//p:QueryPlan/@CachedPlanSize)', 'float') ,
CompileTime = xit.query_plan.value('sum(//p:QueryPlan/@CompileTime)', 'float') ,
CompileCPU = xit.query_plan.value('sum(//p:QueryPlan/@CompileCPU)', 'float') ,
CompileMemory = xit.query_plan.value('sum(//p:QueryPlan/@CompileMemory)', 'float')
FROM #xml_index_test AS xit;

SET STATISTICS TIME, IO ON

;WITH XMLNAMESPACES('http://schemas.microsoft.com/sqlserver/2004/07/showplan' AS p)

SELECT xit.ID,

SerialDesiredMemory = xit.query_plan.value('sum(//p:MemoryGrantInfo/@SerialDesiredMemory)', 'float') ,

SerialRequiredMemory = xit.query_plan.value('sum(//p:MemoryGrantInfo/@SerialRequiredMemory)', 'float'),

CachedPlanSize = xit.query_plan.value('sum(//p:QueryPlan/@CachedPlanSize)', 'float') ,

CompileTime = xit.query_plan.value('sum(//p:QueryPlan/@CompileTime)', 'float') ,

CompileCPU = xit.query_plan.value('sum(//p:QueryPlan/@CompileCPU)', 'float') ,

CompileMemory = xit.query_plan.value('sum(//p:QueryPlan/@CompileMemory)', 'float')

FROM #xml_index_test AS xit;

But the query plan I get looks ridiculous and costs 1355 query bucks. Ew. I already regret working with XML, again.

I'd rather be querying .jpgs — I’d rather be querying .jpgs

The first kind of Selective XML index we can try is defined on the two nodes we’re querying. You can create it like this:

CREATE SELECTIVE XML INDEX ix_xml_selective ON #xml_index_test(query_plan)
WITH XMLNAMESPACES('http://schemas.microsoft.com/sqlserver/2004/07/showplan' AS p)
FOR (
MemorygrantInfoNode = '//p:MemoryGrantInfo' AS XQUERY 'node()',
QueryPlanNode = '//p:QueryPlan' AS XQUERY 'node()'
);

CREATE SELECTIVE XML INDEX ix_xml_selective ON #xml_index_test(query_plan)

WITH XMLNAMESPACES('http://schemas.microsoft.com/sqlserver/2004/07/showplan' AS p)

FOR (

MemorygrantInfoNode = '//p:MemoryGrantInfo' AS XQUERY 'node()',

QueryPlanNode = '//p:QueryPlan' AS XQUERY 'node()'

);

If we run our query again… Nothing changes, only gets worse. Well, not literally worse. The plan and cost are the same, but now we have a useless index.

Well, nuts. What else can we try?

CREATE SELECTIVE XML INDEX ix_xml_value ON #xml_index_test(query_plan)
WITH XMLNAMESPACES('http://schemas.microsoft.com/sqlserver/2004/07/showplan' AS p)
FOR (
SerialDesiredMemory = '//p:MemoryGrantInfo/@SerialDesiredMemory' AS SQL INT SINGLETON,
SerialRequiredMemory = '//p:MemoryGrantInfo/@SerialRequiredMemory' AS SQL INT SINGLETON,
CachedPlanSize = '//p:QueryPlan/@CachedPlanSize' AS SQL INT SINGLETON,
CompileTime = '//p:QueryPlan/@CompileTime' AS SQL INT SINGLETON,
CompileCPU = '//p:QueryPlan/@CompileCPU' AS SQL INT SINGLETON,
CompileMemory = '//p:QueryPlan/@CompileMemory' AS SQL INT SINGLETON
);

CREATE SELECTIVE XML INDEX ix_xml_value ON #xml_index_test(query_plan)

WITH XMLNAMESPACES('http://schemas.microsoft.com/sqlserver/2004/07/showplan' AS p)

FOR (

SerialDesiredMemory = '//p:MemoryGrantInfo/@SerialDesiredMemory' AS SQL INT SINGLETON,

SerialRequiredMemory = '//p:MemoryGrantInfo/@SerialRequiredMemory' AS SQL INT SINGLETON,

CachedPlanSize = '//p:QueryPlan/@CachedPlanSize' AS SQL INT SINGLETON,

CompileTime = '//p:QueryPlan/@CompileTime' AS SQL INT SINGLETON,

CompileCPU = '//p:QueryPlan/@CompileCPU' AS SQL INT SINGLETON,

CompileMemory = '//p:QueryPlan/@CompileMemory' AS SQL INT SINGLETON

);

This also makes no difference whatsoever. I tried defining each path as the SQL type I’m returning. Again, same plan, same cost. What’s next?

CREATE SELECTIVE XML INDEX ix_xml_selective ON #xml_index_test(query_plan)
WITH XMLNAMESPACES('http://schemas.microsoft.com/sqlserver/2004/07/showplan' AS p)
FOR (
SerialDesiredMemory = '//p:MemoryGrantInfo/@SerialDesiredMemory' AS XQUERY 'xs:double' SINGLETON,
SerialRequiredMemory = '//p:MemoryGrantInfo/@SerialRequiredMemory' AS XQUERY 'xs:double' SINGLETON,
CachedPlanSize = '//p:QueryPlan/@CachedPlanSize' AS XQUERY 'xs:double' SINGLETON,
CompileTime = '//p:QueryPlan/@CompileTime' AS XQUERY 'xs:double' SINGLETON,
CompileCPU = '//p:QueryPlan/@CompileCPU' AS XQUERY 'xs:double' SINGLETON,
CompileMemory = '//p:QueryPlan/@CompileMemory' AS XQUERY 'xs:double' SINGLETON
);

CREATE SELECTIVE XML INDEX ix_xml_selective ON #xml_index_test(query_plan)

WITH XMLNAMESPACES('http://schemas.microsoft.com/sqlserver/2004/07/showplan' AS p)

FOR (

SerialDesiredMemory = '//p:MemoryGrantInfo/@SerialDesiredMemory' AS XQUERY 'xs:double' SINGLETON,

SerialRequiredMemory = '//p:MemoryGrantInfo/@SerialRequiredMemory' AS XQUERY 'xs:double' SINGLETON,

CachedPlanSize = '//p:QueryPlan/@CachedPlanSize' AS XQUERY 'xs:double' SINGLETON,

CompileTime = '//p:QueryPlan/@CompileTime' AS XQUERY 'xs:double' SINGLETON,

CompileCPU = '//p:QueryPlan/@CompileCPU' AS XQUERY 'xs:double' SINGLETON,

CompileMemory = '//p:QueryPlan/@CompileMemory' AS XQUERY 'xs:double' SINGLETON

);

Finally, I tried defining each path as the XQuery expression I’m using to retrieve them, along with the data type. This finally makes a difference, and a huge difference. Query cost is down to 0.55. That ain’t bad for an index.

Just Selecting XML

I’m not getting into the other possible XQuery methods, like .exist(), .nodes(), or .query(), or using sql:column to join data anywhere. That’s more XML than I care to mess with in one blog post. If you use XML a lot, and you’re on SQL Server 2012+, you may want to give SXIs a shot. They can be pretty cool when you get them working.

Thanks for reading!

Test: The Top Two SQL Server Problems I Find Everywhere

Last Updated February 13, 2017

CHECKDB and Corruption, SQL Server

Companies call us for performance or high availability issues, but over and over, the very first two things we find are:

They’re not taking backups to match the business’s RPO and RTO
They’re not doing CHECKDB weekly, or at all, and don’t understand why that’s an issue

So let’s walk through a simple scenario and see how you do.

It’s Thursday morning at 11AM, and you get an email: users are reporting corruption errors when they run SELECTs on a critical table. You run the query, and it turns out the clustered index on the table has corruption.

Here’s your maintenance schedule:

Full backups nightly at 11PM
Log backups every 15 minutes
Delete log backups older than 2 days (because you only need point-in-time restore capability for recent points in time, right?)
CHECKDB weekly on Saturdays at 9AM

Our national symbol is ashamed of your RPO/RTO

You can’t repair the corruption (it’s a clustered index, and there aren’t enough nonclustered indexes to cover all the columns), and the business needs that data back. You’re on: answer these questions:

What backups do you restore, in order?
Will they be free of corruption?
How much data will you have lost?
How long will that process take?
Given that, what’s your effective RPO and RTO?
If the business said that wasn’t good enough, what specific steps could you take to improve those numbers without spending money?

This week, while folks are working at low speed due to the holidays, double-check those backups and corruption check jobs.

The Cost of Adding a Reporting Server

Last Updated February 13, 2017

Architecture, Licensing, Log Shipping, SQL Server

“We’d like to offload our reporting queries to a separate SQL Server.”

The first costs are fairly obvious.

Hardware and storage – even if you’re running it in a virtual machine, you need to account for the costs of say, 4 cores and 32GB RAM. Not only will you need storage for the databases, but you’ll also need to decide whether this server gets backed up, and copied to a disaster recovery data center.

Software licensing – Standard Edition is ~$2k per core, and Enterprise Edition is ~$7k per core. Toss in Windows (especially now that it’s licensed per-core), your management/backup/antivirus tools, and your monitoring software.

Project planning – you’ll need to design how to get the data from production to the reporting server, like with Always On Availability Groups, log shipping, or transactional replication.

App modifications – the app running reporting queries will need a new connection string. Even with Always On Availability Groups, reads aren’t automatically offloaded to readable replicas – you have to use the connection string parameter ApplicationIntent = ReadOnly to tell SQL Server that you promise not to try to write anything. If you have a single app that does both reads and writes, and you only want to offload some of the queries, you’ll need to go through the code to switch those queries over to the new connection string.

The rest of the costs are surprises.

Adding a troubleshooting process – sooner or later, the data replication process will break. Depending on the method (AGs, log shipping, replication) and failure type, it’ll fail in different ways – maybe all of the data is old, maybe just some of it is, or maybe the reports aren’t accessible at all. You’ll want to list out the failure methods and explain what symptoms will look like. This helps business users recognize when their reports are wrong, and react appropriately. If you don’t do this step, then after the first failure, people are just always going to expect that there’s a bug in the report data.

Prepare for failure – for each of those failure methods, decide how you’re going to react. For example, if AG replication breaks and reports are out of date, will you point reports at the primary until the problem is resolved, or will users just have to deal with unavailable reports while you troubleshoot or resync the replicas? If you don’t do this step, then you’re going to be winging it every time, and you’ll look unprepared while reports are wrong or down.

Set realistic expectations for RPO and RTO – based on your process and preparation, make sure the business users understand how long their reports will be down when things break.

Measure the overhead of replication – AGs and transactional replication can add performance slowdowns beyond what the reports used to cost. For example, if you were only running a few reports an hour, and only hitting a subset of the data, then suddenly replicating every individual delete/update/insert operation can have a huge overhead.

Add monitoring – you need to start monitoring how far behind the reporting server is, and how performance is doing on both. Performance troubleshooting becomes a lot harder, too – for example, when you’re doing index tuning, you have to combine data across both the primary and the reporting servers in order to find the right mix of indexes across the board.

Are you sure you really need to offload reporting?

Before you embark on this expensive project, ask:

What’s the primary wait type that we’re facing?
(Find out with sp_BlitzFirst @SinceStartup = 1)
What’s the cheapest/easiest way to reduce that wait type?

Time and again, I see people facing PAGEIOLATCH waits (which mean waiting to read data pages from a data file), and they’re juggling a 1TB database with 16-32GB RAM. Don’t spend tens of thousands of dollars to fix that problem – buy $1,000 of RAM and spend some time doing index tuning.

7 Things I Learned About Aurora at AWS re:Invent 2016

Last Updated February 9, 2017

Amazon Web Services (AWS)

Richie and I attended the AWS re:Invent conference in Vegas last week. Here’s some of my favorite takeaways about Amazon Aurora, their homegrown relational database with MySQL compatibility.

1. AWS has a grudge against Larry Ellison. Andy Jassy’s keynotes made repeated jokes about Oracle’s cloud product (or lack thereof), the high cost of proprietary databases, and expensive sailboat hobbies. Larry’s big, bold personality makes for an easy target, but I couldn’t help but wonder if AWS will come after SQL Server’s costs next.

2. Amazon’s database products are lucrative. Their database products are their fastest-growing products in history. The cynic in me suspects that they’re measuring growth in terms of revenue, not customer count, and since databases are expensive, voila.

3. Amazon’s Aurora product is ambitious. Aurora is their home-built relational database product. They manage backups, availability, and patching. In 2014, they announced full MySQL compatibility – you could point any MySQL app at Aurora, and it’d work just fine. This year, they announced PostgreSQL 9.6.1 compatibility, too.

4. Aurora PostgreSQL was 3 years in the making. That’s gotta make you wonder: are they trying to build SQL Server compatibility? Granted, it’d be a lot of work – it’s a closed source database. Feature compatibility would be an arms race: Microsoft would rapidly try to add features to entice users onto newer versions, while at the same time Amazon would have to race to keep up. (Hey, interesting coincidence – Microsoft’s been suddenly shoving features in – even in service packs.)

5. I’m guessing Aurora Oracle compatibility will come next. In the PostgreSQL announcement session, the speakers pointed out that PostgreSQL is the most Oracle-compatible database, with some 60-70% of applications being able to switch over. Theoretically, AWS could fund open source development of further Oracle compatibility, but given AWS’s lack of contributions back to open source projects, I wouldn’t bet on that. Instead, I’d bet on them doing something internally that only Aurora would offer.

Before going to re:Invent, I would have bet against AWS aiming for Oracle compatibility. However, during the keynotes, I kept hearing enterprise stories over and over. AWS really wants to be the default data center for enterprises, and enterprises run Oracle (and SQL Server of course).

Amazon won’t have an easy road. Their edge is that they’re essentially getting two database platforms for free: MySQL and PostgreSQL. Amazon has developers, sure, but they don’t have to build a query optimizer, for example. Thing is, those platforms aren’t Oracle.

6. Amazon’s throwing in free performance monitoring tools. The new Performance Insights tool shows wait stats, top resource-intensive queries, lock detection, execution plans, and 35 days of data retention. It’s designed by a former Oracle (there we go again) performance tuning consultant to match his workflow. You don’t have to install an agent, configure a repository, or keep the thing running. It’s in preview for Aurora PostgreSQL today, and will be rolled out to all RDS databases (including SQL Server) in 2017.

7. You don’t need to learn Aurora today. This isn’t going to be one of those doom-and-gloom posts that says, “The cloud is coming for your job!” The cool thing about Aurora is that it works just like the MySQL and PostgreSQL you ~~know and love~~ don’t know anything about and aren’t particularly fond of – it just takes away the crappy parts of database administration, like backups, corruption checking, and HA/DR failovers.

If you were a MySQL or PostgreSQL query/index tuner, you could keep right on working in Aurora – only now, your skills relate in direct cost improvements. If you can tune queries and indexes well, you can cut your company’s cloud bill directly – and see the improvement in your very next bill.

The same thing will happen when Amazon comes for Oracle, and eventually, SQL Server. Your performance tuning skills will still work – and they’ll even be worth more.

As they say in Vegas, place your bets.

Date Math In The WHERE Clause

Last Updated February 13, 2017

Oh, THAT SARGability

I realize that not everyone has a Full Metal Pocket Protector, and that we can’t all spend our days plunging the depths of query and index tuning to eek out every CPU cycle and I/O operation from our servers. I mean, I don’t even do that. Most of the time I’m just happy to get the right result back!

I kid, I kid.

For those of you out there that have never heard the word before, go watch this. You’ll thank me later. Much later.

What does that have to do with me?

It has to do with you, because you’re still formatting your WHERE clause poorly. You’re still putting expressions around columns and comparing that output to a value, or another expression.

Huh?

Think about times when you’ve done something like First_Name + ‘ ‘ + Last_Name = ‘Meat Tuperello’, or even worse, when you’ve totally broken a date into YEAR(), MONTH(), and DAY() and compared them all to values.

Yes, you. Yes, that’s bad.

More Common

Sometimes people forget that DATEADD exists. They go right to DATEDIFF, because it sounds like it makes more sense.

What’s the difference between these two dates? Can I go home now? I’m so hungry. No one takes the Esperantan money you pay me with, Mr. Ozar.

But this can get you into a lot of trouble, especially if you’re either dealing with a lot of data, or if the WHERE clause is part of a more complicated series of JOINs. Not only does it not make efficient use of any indexes, but it can really screw up cardinality estimation for other operations. What does this peril look like?

SELECT COUNT(*)
FROM dbo.SalesOrders AS so
WHERE DATEDIFF(DAY, CONVERT(DATE, so.OrderDate), GETUTCDATE()) = 55

SELECT COUNT(*)

FROM dbo.SalesOrders AS so

WHERE DATEDIFF(DAY, CONVERT(DATE, so.OrderDate), GETUTCDATE()) = 55

Looks good to me

Of course it does. That’s why you’re reading this blog. You have questionable taste. There are some problems with this, though.

We didn’t do too bad with cardinality estimation here. The Magic Math guessed about right for our 10,000 row table. But breakthroughs in Advanced Query Plan Technology (hint: GET OFF SQL SERVER 2008R2) allow us to see that we read all 10,000 of those rows in the index, rather than just getting the 5003 rows that we actually need. Shame on us. How do we do better?

No Sets In The Champagne Room

We’re going to flip things around a little bit! We’re going to take the function off of the column and put it on our predicate. If you watched the video I linked to up top, you’d know why this is good. it allows the optimizer to fold the expression into the query, and push it right on down to the index access. Hooray for us. Someday we’re gonna change the world.

SELECT COUNT(*)
FROM dbo.SalesOrders AS so
WHERE CONVERT(DATE, so.OrderDate) = DATEADD(DAY, -55, CONVERT(DATE, GETUTCDATE()))

SELECT COUNT(*)

FROM dbo.SalesOrders AS so

WHERE CONVERT(DATE, so.OrderDate) = DATEADD(DAY, -55, CONVERT(DATE, GETUTCDATE()))

Now we get a cheaper index seek, we don’t read the extra 4997 rows, and the cardinality estimate is spot on. Again, it wasn’t too bad in the original one, but we got off easy here.

Face 2 Face

If you’re wondering what the plans look like side by side, here you go.

Both plans are helped by our thoughtfully named index on the OrderDate column, though the one with cheaper estimated cost is the bottom one. Yes, I know this can sometimes lie, but we’re not hiding any functions in here that would throw things off horribly. If you’re concerned about the Nested Loops join, don’t worry too much. There is a tipping point where the Constant Scan operator is removed in favor of just index access. I didn’t inflate this table to find exact row counts for that, but I’ve seen it at work elsewhere. And, yeah, the second query will still be cheaper even if it also scans.

Thanks for reading!

Brent says: this is a great example of how people think SQL Server will rewrite their query in a way that makes it go faster. Yes, SQL Server could rewrite the first query to make it like the second – but it just doesn’t go that extra mile for you. (And it shouldn’t, you wacko – write the query right in the first place.)

Spills SQL Server Doesn’t Warn You About

Last Updated February 13, 2017

Don’t make me spill

Table variables get a lot of bad press, and they deserve it. They are to query performance what the TSA is to air travel. No one’s sure what they’re doing, but they’ve been taking forever to do it.

One particular thing that irks me about them (table variables, now) is that they’ll spill their spaghetti all over your disks, and not warn you. Now, this gripe isn’t misplaced. SQL Server will warn you when Sort and Hash operations spill to disk. And they should! Because spilling to disk usually means you had to slow down to do it. Disks are slow. Memory is fast. Squatting in the Smith Machine is cheating.

Wouldn’t it be nice?

Why am I picking on table variables? Because people so frequently use them for the wrong reasons. They’re in memory! They made my query faster! No one harbors delusions about temp tables, except that one guy who told me they’re a security risk. Sure, we should get a warning if temp tables spill to disk, too. But I don’t think that would surprise most people as much.

So let’s see what hits the sneeze guard!

You’ve been here before

You know I’m going to use Stack Overflow. Here’s the gist: I’m going to set max memory to 1 GB, and stick the Votes table, which is about 2.4 GB, into a table variable. While that goes on, I’m going to run sp_BlitzFirst for 60 seconds in Expert Mode to see which files get read from and written to. I’m also going to get STATISTICS IO information, and the query plan.

SET STATISTICS IO ON	   

DECLARE @Votes TABLE (Id INT NOT NULL, PostId INT NOT NULL, UserId INT NULL, BountyAmount INT NULL, VoteTypeId INT NOT NULL, CreationDate DATETIME NOT NULL)

INSERT @Votes 
        ( Id, PostId, UserId, BountyAmount, VoteTypeId, CreationDate )
SELECT v.Id, v.PostId, v.UserId, v.BountyAmount, v.VoteTypeId, v.CreationDate
FROM dbo.Votes AS v

SELECT COUNT(*) FROM @Votes AS v

SET STATISTICS IO ON

DECLARE @Votes TABLE (Id INT NOT NULL, PostId INT NOT NULL, UserId INT NULL, BountyAmount INT NULL, VoteTypeId INT NOT NULL, CreationDate DATETIME NOT NULL)

INSERT @Votes

( Id, PostId, UserId, BountyAmount, VoteTypeId, CreationDate )

SELECT v.Id, v.PostId, v.UserId, v.BountyAmount, v.VoteTypeId, v.CreationDate

FROM dbo.Votes AS v

SELECT COUNT(*) FROM @Votes AS v

First, let’s look at stats IO output. The first section shows us hitting the Votes table to insert data. The second section shows us getting the COUNT from our table variable. Wouldn’t you know, we hit a temp object! Isn’t that funny? I’m laughing.

Table 'Votes'. Scan count 1, logical reads 309027, physical reads 0, read-ahead reads 308527, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(67258370 row(s) affected)

Table '#ACD2BA14'. Scan count 1, logical reads 308525, physical reads 36, read-ahead reads 308077, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Votes'. Scan count 1, logical reads 309027, physical reads 0, read-ahead reads 308527, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(67258370 row(s) affected)

Table '#ACD2BA14'. Scan count 1, logical reads 308525, physical reads 36, read-ahead reads 308077, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

The query plan doesn’t give us any warnings. No little yellow exclamation points. No red X. It’s all just kind of bland. Even Paste The Plan doesn’t make this any prettier.

The query plan doesn't warn us about anything. — Oatmeal

Well, unless you really go looking at the plan…

Okay, that sucks

Let’s look at sp_BlitzFirst. Only Young And Good Looking® people contribute code to it, so it must be awesome.

Boy oh boy. Boy howdy. Look at all those physical writes. We spilled everything to disk. That’s right at the 2.4 GB mark, which is the same size as the Votes table. We should probably know about that, right?

Is a temp table any better?

In short: kinda. There are some reasons! None of them are in stats IO. They’re nearly identical.

For the insert, the scan could go parallel, but doesn’t. Remember that modifying table variables forces query serialization, so that’s never an option to us.

In SQL Server 2016, some INSERT operations can be fully parallelized, which is really cool. If it works. Much like minimal logging, it’s a bit of a crapshoot.

The COUNT(*) query gets an accurate estimate and does go parallel. Hooray. Put those pricey cores to use. Unless you recompile the query, you’re not going to get an accurate estimate out of your table variable. They’re just built that way. It doesn’t even matter if you put an index on them.

Does sp_BlitzFirst tell us anything different?

Yeah! For some mystical magical reason, we only spilled out 1.8 GB, rather than the full 2.4 GB.

Party on, Us.

I still hate table variables

I mean, I guess they’re okay if the # key on your keyboard is broken? But you should probably just buy a new keyboard and stop using Twitter.

Anyway, I think the point was that we should have some information at the plan level about spills to disk for table variables and (possibly) temp tables. It wouldn’t help tools like sp_BlitzCache, because spill information isn’t in cached plans, but it might help if you’re doing live query tuning.

Thanks for reading!

Brent says – OMG THESE ARE SPILLS TOO. I never thought of it that way.

Announcing Group By: A New Kind of Free Community Conference

Last Updated November 26, 2016

Brent Ozar Unlimited Team

GroupBy Conference

Let’s try something new. Let’s build a free event by the community, for the community, where:

Speakers submit abstracts publicly online. Anybody can submit an abstract on any topic – community members, software vendors, even Microsoft’s Bob Ward has submitted a session on 2016: It Just Runs Faster.

Attendees suggest tweaks to the abstracts. You leave comments on the abstracts, and you rate the topic, the abstract, and your likelihood of attending. Your constructive criticism helps refine their abstracts. For example, Alex Yate’s session on DevOps 101 has had a lively discussion in the comments, and he’s using that feedback to help craft a better abstract. If you’re a presenter, you should put in an abstract just to get the detailed, insightful feedback people like Hugo Kornelis and Adam Machanic have been giving.

The 5-7 top-voted sessions are picked. You, dear reader, are building this conference for yourself. Voting closes on December 16th, and the ones you’re most likely to attend – those are the ones that will be picked.

On Friday, January 13th, the whole thing happens live. Register to watch, and we’ll do it live, with me as your cohost throughout the event. I’ll take your typed-in questions and ask ’em out loud, help with technical glitches, and manage the recording process. If you miss the event, no worries – the sessions will be posted to YouTube, a podcast feed, and the site.

Everything gets done transparently in public: the abstracts, the presenters, the comments, the voting, and the conference itself.

It’s called Group By. Go check it out, and rate the sessions you’d like to see!

[Video] Office Hours 2016/11/23 (With Transcriptions)

Last Updated November 27, 2016

SQL Server, Videos

This week, Brent and Richie discuss their hardware setups, lazy spools, SQL 2016 Service Pack 1, Always On Availability Groups, Database IDs, GroupBy.org, SQL on Linux, and new features they would like to see in the new version of SQL.

Here’s the video on YouTube:

Office Hours Webcast - 2016/11/23

Watch this video on YouTube

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Enjoy the Podcast?

Don’t miss an episode, subscribe via iTunes, Stitcher or RSS.
Leave us a review in iTunes

Office Hours Webcast – 2016-11-23

What do you guys think of my blog?

Brent Ozar: All right, let’s see what we’ve got for questions here. Nate says, “I don’t know if you guys are open to this or not–” No, we’re not, really. Next question. Just kidding. “I’d love to have you check out my blog.” He says, “I’m a new blogger and I write about DBA stuff since that’s my current job and passion. I’d love to have some feedback from the pros.” I assume he’s talking about Richie. URL is natethedba.wordpress.com. Nate, the one piece of advice I’d give you is go get your own URL. It’s only like $10 a year. You can get them from GoDaddy, from AWS, Namecheap. There’s a bunch of providers that do it. WordPress will charge you—I want to say it’s $30 a year to be able to use your own vanity URL but the thing is that way it’s totally you and it looks way more professional. You’re going to be telling it to people for years so you might as well just get the URL that you want there.

Richie Rump: Yeah, don’t do Jorriss.com because you’re going to have to tell people that there are two Rs, two Ss, and they’re like, “What’s Jorriss?” In the 90s, it made sense, not so much nowadays.

What should I do about lazy spools?

Brent Ozar: Fellow human says, “In one of the previous Office Hours webcasts, Erik mentioned that lazy spools are often an indication of bad performance. Can you elaborate on why and how to avoid them?” That hasn’t been my experience generally that I see a one-to-one when there is a lazy spool I go after that. I think what Erik was saying, and I think I agree with because of course, he’s not here so I can speak for him, is that when there’s a plan and I see a lazy spool, it’s a crappy query that I need to do something about or a crappy plan, not enough indexes, whatever. I don’t focus on the spool, I take a step back and just start looking at, all right, what’s this query doing? How am I assembling data together and what can I do in a better way? Often it involves things like index tuning or query tuning, but I don’t focus on the spool itself.

Is 403 page reads per second a problem?

Brent Ozar: Next up, someone says, “My Buffer Manager page reads per second is 403. However, my page life expectancy is 22,089. What’s the problem?” First of all, I don’t look at either of those numbers but I’ll tell you what’s happening. What’s happening is stuff is staying in memory for a long period of time. However, you’re having to read 400 pages from disk. 400 pages times 8K is nothing. It’s not that big of a deal. I could read 400 8K pages in one second. I’ve taken speed reading classes.

Richie Rump: Yeah, no.

Brent Ozar: I wouldn’t comprehend any of it. Richie has seen me read documentation, even when I read only one page a minute, I’m like, “What’s this mean again, Richie?”

Richie Rump: Even his emails. Just, no.

Brent Ozar: It’s about the rate of emails that I get. What I would say is, you don’t necessarily have a memory pressure problem. SQL Server is just reading some stuff from disk. What you want to look at is wait time instead. If you search for wait time on brentozar.com you’ll get a whole bunch of blog posts about how to do that. My personal favorite way to do that is sp_BlitzFirst. Sp_BlitzFirst will tell you what your performance problems are right now.

We changed a bit to an int. Why did .NET break?

Brent Ozar: Next up, Nate says, “Pretty please, I promise I have a real question to ask too, maybe even blog about. We had a view which had a BIT casted to an int column. When we changed it back to a BIT, it broke the .NET app with an invalid cast exception. It’s using Entity Framework. Any ideas why that would happen? I thought the app code would handle implicit gracefully but I guess not.” What on earth are you doing that for?

Richie Rump: Well, you do have to change your code. If you have an Entity Framework entity, because of what it is, and it’s mapped to a table or a view, it’s not implicit. You’re actually declaring it in the background as an INT or string or whatever. So you’ll have to go into the code and then tell it, “Hey, by the way, this is an INT or a BIT.” So that’s probably where you’re broken. You’ll just have to change that mapping and those should be fixed.

Brent Ozar: I also have this horrible vision that somewhere data is coming back as like a 9 or a 14 instead of a 5^th just with somebody’s crazy cast.

Should I switch from a Surface Pro 3 to a Mac?

Brent Ozar: “I’m thinking of a Mac since the battery-gate with my Surface Pro 3, even though they finally fixed it. How has your battery been on the Surface Pro 4?” You have a 4, right?

Richie Rump: It’s been fine but what I’ve been telling people about my Surface 4 is I like it. I really love it, but this is my third machine. Like I actually have three machines that I actively use. I’ve got one I’m using right now. I’ve got this baby that I use at night and then the Surface Pro 3 is essentially a large tablet for me. Everything has been fine on it though. I had one problem and it was a hardware problem. I’ve had it since day one. I went right into Microsoft’s store, they exchanged it right out and I was back up and running literally within the next hour or so. But the way I use my Surface Pro is not the way I would normally use a machine.

Brent Ozar: You don’t run like three instances of Visual Studio on there.

Richie Rump: No. No, no, not at all. I mean, it’s all there. But I don’t actively write code on it or anything like that.

Brent Ozar: So you’re the same as me. That’s what you’re saying. It’s installed.

Richie Rump: In ten years when I’m not actually doing anything with technology, I will be you, yes, exactly.

Brent Ozar: Yeah, might take longer than that. Yeah, sadly. To take your skills to atrophy, to this level, it’s, you know, measured in decades.

Richie Rump: You forget, developer skills, they go quick. Right? A lot quicker than DBA skills.

Brent Ozar: Yeah, wow, you guys have to learn new languages every six weeks.

Richie Rump: Haven’t we had this conversation before?

Brent Ozar: Yes.

Richie Rump: I watched a Pluralsight course yesterday, Brent.

Brent Ozar: To give people in the background an idea, we develop stuff like, so you take Paste the Plan for example, which is developed in AWS using Lambda which is function as a server, aka server-less code. It’s not really server-less. Nick Craver has a heart attack every time we say that but there are real servers behind…

Richie Rump: It’s just bad marketing.

Brent Ozar: It is bad marketing. But new frameworks come out with that like every sixty days on how you go about accomplishing this stuff.

Richie Rump: Yeah, and not even that but even some of the new features that have been around a little bit. What I’ve been playing with this week is ECMAScript 6 which is the next version of JavaScript. So I’ve been messing with that in Lambda this week. It just makes things a lot cleaner. All the spaghetti code I had to write with Paste the Plan, now with using ES6 syntax will be a lot cleaner. I’m smiling a lot more this week than I did when I was writing Paste the Plan. So that’s good.

Brent Ozar: Golly, I would also say don’t change from Windows to Mac—don’t take that lightly. Changing either way sucks, jumping platforms. I’ve tried to switch back over to Windows repeatedly because Windows 10 is amazing. Windows 10 is freaking fantastic.

Should I use SQL Server 2016 SP1 in production?

Brent Ozar: Someone says, “Now that SQL Server 2016 Service Pack 1 is out there, would you feel comfortable using that in production? Of course after testing in a test environment first.” I’m actually working with a client today on 2016. That’s going to be one of my recommendations is to go to Service Pack 1. People get really excited about having Standard Edition to have the same capabilities for developers that Enterprise Edition has. Forget that. There’s really cool DMV and execution plan changes that are awesome. In this case, it’s exactly going to help the client get better execution plan insight when they’re going and doing query performance tuning. So if you’re on 2016, I would totally go to SP1, just make sure you test it first. If you’re trying to consider a new deployment today, I would totally vote for 2016 and with Service Pack 1. It’s just great.

Richie Rump: Do you want to mention Erik’s post on 2016 SP1?

Brent Ozar: Yeah. Erik has an opinion, and I agree with entirely, Erik said that when Microsoft put the Enterprise Edition—and for those of you who are listening, I have air quotes around my head—Enterprise Edition features in 2016 Standard Edition in Service Pack 1, they really kind of took the leftovers. They didn’t take the best features. They just took these fringe features that are well tested and everybody kind of knows how they work but nobody really wants to use them. Great examples of that are Always Encrypted, partition tables, you hardly ever see that stuff out there and, yes, it’s going to make ISVs’ lives a little easier, but it really doesn’t help DBAs at all. DBAs, all the features that we wanted, online index rebuilds in Standard Edition, creating indexes online, merry-go-round reads, there’s a lot of things that I think we should have had in Standard Edition if we’re going to get features and we didn’t get those.

Richie Rump: Yeah. As a developer, I should be really jazzed about all the cool stuff they threw in Standard but without raising the memory limits, it’s meaningless.

Brent Ozar: Yeah.

Richie Rump: It’s almost like, “Hey, look all the stuff that you can get and you can do.” And I’d be like, “Yay, I get to play.” Then all of a sudden it’s like, but you only can stay on Standard and like why is it so slow? I don’t understand.

Brent Ozar: Yeah, if you need partitioning, the odds that you’re going to be able to still perform well on 128 gigs of RAM are not good. Usually, stuff like partitioning is designed for much larger datasets.

Richie Rump: Yeah.

Brent Ozar: I’m actually—like when I saw it drop I’m like, okay, I have a vision two years from now the amount of partitioning work we’re going to be doing is skyrocketing because everybody is going to go throw partitioning in as soon as they can and then they’re going to find out, “Oh, it actually makes queries slower instead of faster.”

Richie Rump: Yep, and, “Why I can’t do partition swapping because I’ve put an index across the entire table,” and, yeah. Yeah, that’s going to be fun.

Brent Ozar: Yeah, not a performance fix.

Brent Ozar: Someone says, “So is 2016 better than 2014 and worth going to?” Absolutely, for DMV improvements that are much better. Query Store is my personal favorite. I believe that feature alone is worth going to 2016 for. Thanks, Conor Cunningham, I love you, even though you don’t love me right now because I wrote a blog post that made you angry.

Why do I have user problems when I fail over an AG?

Brent Ozar: One person, let’s see here, says, “Doing some testing with Always On Availability Groups and I’m having trouble with SQL users in the replicated database losing their mapped link after a failover. Is there an easier way to reestablish this link after failover? It must be a common issue.” It is and in this case Always On Availability Groups has a lot of similarities to database mirroring. Database mirroring exhibits the same problem. The guy who wrote the book on database mirroring, his name is Robert Davis, aka SQLSoldier, if you search for SQLSoldier sync logins—not as in sank my battleship—but as in synchronize logins, he has a script out there that will help you go about doing that.

How do I get my RPO and RTO in writing?

Brent Ozar: “Any suggestions for getting RPO and RTO for high availability disaster recovery from an organization?” Yeah, absolutely, go to our site. Go to brentozar.com. Put in your email address or click on “free stuff” up at the top of the blog. We have a worksheet, an HA/DR worksheet for this exact purpose. What this does is it lets you fill in the current state first. You fill in how much data you’re going to lose when the server goes down, how long you’re going to be down for based off of things like your backups and whatnot. Then hand that to the business and say, “Here’s current state. If you want it to be nicer, you pick what you want.” The second page of that worksheet has prices to help you make that a lot easier with the business.

What causes database timeouts?

Brent Ozar: J.H. says, “What are some factors on the database side of things that cause timeouts?” So there’s really two parts to a timeout question. One is timing out before you connect to SQL Server. The other part is timing out waiting for your results. For timing out connecting to SQL Server you really have to watch on the application side of things because often SQL Server never even hears the call. For timing out on query executions, SQL Server doesn’t have a timeout. It will sit there and run your crappy query for years or decades. So then that’s also an application setting on how long you want to wait for your queries. So on both cases of those I start working with my developers and saying, “All right. If you’re having problems connecting, then let’s go set up ping monitors, any kind of monitoring utility from your application server over to the SQL Server so we can start to see when you’re not even able to ping the SQL Server. Then we can work with the network guys to figure out how to fix that.”

What should I do with statistics IO output?

Brent Ozar: Oh, there’s someone who asks an interesting question. “Set statistics IO on gives you great information where tuning in terms of reads.” I know where you can go with that information. Richie, tell them where they can go with their set statistics IO output.

Richie Rump: Thanks, Don Pardo, you could go to statisticsparser.com.

Richie Rump: Dang, I was almost there.

Brent Ozar: So close. The Don Pardo thing was what did it. I was like, ah, that’s good. He says, “Am I correct in assuming that each read is 8K so when I’m talking to my developers I can tell them how much data their current query is reading and start asking does it make sense that this thing is reading say 10 gigs worth of data when they’re writing their queries?”

Richie Rump: Yes.

Brent Ozar: Those are great, good—and it’s always a consistent number. It doesn’t change.

Richie Rump: Yeah. And if you go to statisticsparser.com it will not only parse it out for you so you can actually read it and know that that’s millions and not just kind of having to do the, “Oh, here’s a comma, here’s a comma, here’s a comma,” but also aggregate it for you as well. So it will give you the total number of page reads over a query.
Brent Ozar: Super helpful.

Can database IDs be different on AG replicas?

Brent Ozar: Someone says, “How do I resolve the issue where a database…” blah, blah, blah, “I have a database with one database ID say 6 on the primary and 8 on the secondary in Always On Availability Groups.” That’s totally okay. It doesn’t affect anything at all. Database IDs can be different on every server.

Thanks for helping me with Twitter.

Brent Ozar: Someone says, “Brent, you are awesome because two years ago I was new to Twitter and posted a question not using the #SQLHelp, instead I just named certain people. I was mocked by others while you helped me understand how to use #SQLHelp by reading your blogs.” I’m glad I could help. Twitter is not intuitive. Twitter is the opposite of intuitive. It’s like those old IRC—Richie, were you ever on IRC?

Richie Rump: Uh, of course, Brent.

Brent Ozar: Yeah, I thought so, and BBSs and all that.

Richie Rump: Oh, BBSs, oh. Let’s upload more stuff so I can get more download credits.

Brent Ozar: Or that deal—there was a ZMODEM hack where you could cancel your download at the last second and it would not fill up your account points. It would do it right after you sent the confirmation. Yeah, no, I’m all for that. Everything about databases is cryptic, online is cryptic, so whenever I can help people. Just make sure you can pay it forward.

Psst – introducing Brent’s newest project

Brent Ozar: My current attempt to go pay it forward is GroupBy.org. If you go to GroupBy.org, all one word, no spaces or minus signs, at GroupBy.org we’re building a platform where people can teach each other. Where you can go submit a session. You can vote on which sessions you’d like to see. You can leave feedback to presenters so that they can help make their own abstract better. So it’s like a live conference submissions process. We’re just going to take the top-rated submissions from readers and then go let them present them online on January 13^th.

Richie Rump: It’s such a great idea.

Brent Ozar: Thank you.

Richie Rump: It’s such a great opportunity, man.

Brent Ozar: It’s going to be so much fun. Plus, it uses the exact same platform that you’re looking at here. Those of you who are online, those of you who are listening to the podcast, same exact deal. So you’ll see talking heads. Like I’ll be in there as a cohost for whoever is doing the presenting. So if something goes up in flames, we can still have a good time talking about it. I could post the questions that people ask verbally, just really helps presenters relax as they see the talking heads and not freak out.

Richie Rump: Yeah. I could totally see how me presenting with Brent Ozar’s face right there is not going to freak people out at all. No, no. I mean, it’s totally normal. It happens to me all the time.

Brent Ozar: I could wear a mask. Yeah, see, now that you say that, I’ve kind of always thought of myself as being like—no, but yeah, no, I think you have a point there. I could be actually worse. And it won’t be me every time, too. We’re going to rotate in and out hosting duties. So we’ll do just like go contact people in the community who like doing that kind of co-hosting duty and different, other people can do it. Because no one wants to see me every webcast. It needs to be moving around to different people.

Richie Rump: Dude, you’ll totally get some troll in here, you know, like myself. “Wroooong! That is incorrect!”

Brent Ozar: Or just hold up little signs like, “This slide I’m voting a 2.”

Richie Rump: Yes.

Brent Ozar: Someone says, “When will the registration option be available at Group By?” (Update: it’s available now.)

Are you seeing much interest in SQL Server on Linux?

Brent Ozar: Next one. “Are you guys seeing much interest in SQL on Linux? Allan Hirt told me he was in August at his class in Chicago.” Nope. Nope.

Richie Rump: I’m surprised someone actually has interest in SQL on Linux. Please, tell me more.

Brent Ozar: Yeah, I don’t get it. I get that there are people who are Linux-only shops and they want an easier way to run databases. I find it kind of suspicious that people who have standardized on a free, open-source operating system are going to spend $2,000 a core on Standard Edition and $7,000 a core on Enterprise Edition.

Richie Rump: Yes.

Brent Ozar: There are some pretty good open-source databases out there. I’m not saying SQL Server sucks. It’s amazing. I love SQL Server. It’s fantastic. But if you’re really drinking the Linux kool-aid, which is a delicious kool-aid, and I endorse it highly, great stuff, I don’t see a lot of people going, “I sure can’t wait to give Microsoft my database money.” I just don’t see that.

Richie Rump: Yeah, let me just throw out this Postgres thing that does some things really, really well and let’s get into this SQL Server thing and it’s not really running on Linux but on this kind of faux hybrid OS-y thing that runs on top of Linux. Let’s go with that.

Brent Ozar: Yeah, it’s a virtualization layer basically. Drawbridge, which is now known as SQLPAL 2, SQL Platform Abstraction Layer 2. Anthony Nocentino has a great blog post about it that will be in next week’s weekly links, how he reverse engineered what Microsoft is doing in order to run SQL Server on Linux. Early testing, for example, Klaus Aschenbrenner has done some early testing with Hekaton and found that it’s almost twice as slow. Now, I have no reason to believe that Microsoft won’t fix that stuff as we get closer to production but I don’t see a lot of people going, “I’d like to use it on Linux and have it be slower, please.” I just don’t see that.

Richie Rump: Yeah, let me use an in-memory database that’s slower and it costs me more.

Brent Ozar: Yeah, yeah. Someone else says, “I realize it’s not a replacement for Windows, it’s more for people running a lot of Linux concurrently.” Linux isn’t starving for database choices. What Microsoft will tell you is that people who are using Oracle are looking for a way out. I would just say, as an independent database consultant, if you’re going to go rewrite your app, because essentially if you’re using PL/SQL, you’re going to be doing an app rewrite to move from one database to another. If you’re going to rewrite your app, why not go with open source? Why not go with something that has no licensing costs to it? Sure, you’re going to still pay support but Postgres has got some pretty good looking legs on it right now.

Richie Rump: [Whistles] I mean at that point if you’re thinking about changing your platform, why not go to the cloud? I mean, there’s a lot of stuff out there that’s looking really, really good right now and just because I get to play with it every day, I’m not just saying that, right? It’s true. There’s some good stuff out there right now.

Brent Ozar: Yeah, but AWS has Aurora, there’s all kinds of interesting scalable databases. I am a huge fan of Azure SQL db. I think if you’re going to write a .NET application that wants a relational database up there and you can build it from scratch inside one database, no cross-database queries if you can avoid it, Azure SQL db takes a lot of crappy work out of database administration, stuff that we just don’t like to do.

Brent Ozar: Follow up question, “Can you teach a fish to climb a mountain? Linux is a different animal. I don’t see that getting anywhere.” Well, so, because I happen to be an Alaska guy, I’m a real big fan of Alaska, I’ve seen salmon going upstream, so I do know that that can happen. And they taste fantastic.

What do sleeping queries mean?

Brent Ozar: Next question. “When queries are in a sleeping status, does that mean that the application is not closing out the connection to SQL?” Yep, exactly. If you look with one of my favorite diagnostic tools, sp_WhoIsActive, if you look with sp_WhoIsActive, that’s something that’s not included with SQL Server, it’s an additional diagnostic tool that Adam Machanic out of the Boston area wrote, it will not only show you if it’s sleeping, it will show you if it has open transactions too as well.

Why are partitioned table queries slower?

Brent Ozar: Next question. “Why are queries slower on a partition table? You just mentioned this.” If you go to brentozar.com/go/partitioning, we have links that explain it. Kendra Little did a few example blog posts using the Stack Overflow database. If you split a table up into a bunch of little, bitty child tables underneath, SQL Server has to recombine those results when it goes to join them to another table. So you can end up with even more expensive sorting. There have been a couple of clients where 80 to 85 percent of the cost of their select queries was simply reassembling the data from multiple partitions and then resorting it into a way that you could have had an index on a non-partitioned table. “Well, you can put indexes on a partition table.” Yes, but then you can’t do partition switching. You can’t drop a partition or load a partition at once.

Richie Rump: Yep. That’s why we always—I had a rule when I was working on that 60-terabyte [monster monitor] database that we always had to include the partition key. That way we didn’t get any sort of craziness, joining things back together and things like that.

Brent Ozar: You mean included in the select query, like in the where clause.

Richie Rump: That’s correct, yep.

Brent Ozar: As long as you can do partition elimination, as long as you can tell SQL Server, “Only look in these partitions” and it’s a minority of the partitions, you can get good performance improvements. And I’m not saying I don’t love partitioning, partitioning is awesome. You just have to find the right use case for it. Maintenance, making table maintenance faster is the great use case for partitioning. Making selects faster, not quite as much.

Richie Rump: But if you do the maintenance incorrectly, you could totally hose up a database. Let me tell you.

Brent Ozar: Oh, I need to have a blog post from the team on what were your worst database memories. Worst outages we’ve ever had.

We got an error…

Brent Ozar: Let’s see, we’ll take two more questions. One of them is, “We tried to install SP3 to SQL Server 2012 and got an error hexadecimal value, invalid character, blah blah blah. Tried redownloading the SP, same error.” I would just open a support call with Microsoft. Things like that they may actually even know about the problem and have a fix for it.

What feature do you wish SQL Server would add?

Brent Ozar: Last one we’ll take is, “What would be the number one feature you would want to see in the next version of SQL Server that some other database platform has but we don’t?” So I’ll let you think for a minute, Richie, because I have mine, and both me and Erik have blogged about this. I want non-logged tables. I want to be able to say that this specific table I don’t need to log it at all. Whatever data is in there I’m going to live with. If something hoses up mid-transaction whatever I have for records in there is totally fine for me. The great examples of that are sessions, state tables, reporting tables for like data marts that I could just clean out at the drop of a hat. You can’t just fix this by putting a database into simple mode—model. Tara will kill me if I don’t say “model.” Simple recovery model will still log all your inserts, updates, and deletes to disk. Whereas if like Postgres has non-logged tables, that would let me solve some problems with clients really quickly by just saying, “This table is garbage. Let’s turn off the logging.” Like if I had that over the last year how many real customer problems would I have solved? It’s probably on the small number of fingers on one hand, it’s just so cool, so I’m kind of jaded there. How about you, Richie?

Richie Rump: So I could go the, hey let’s get some of the windowing functions to spec fully because they’re mostly there but they could—but I’m not going to go there. There’s a feature that they just released for Aurora that I’ve been playing around with, probably finish playing around with this week, but you could load data from S3 into Aurora, so I’m going to say that, being able to load data from a JSON or XML file and just by one line on the command line and it just goes up and does it.

Brent Ozar: All right, so now I can think of another one. I would also love to have on Always On Availability Groups, I would love to have—and this is just pie in the sky, this is never going to happen. I would love to have one replica that doesn’t have indexes. I would love to have a replica that is just purely the clustered index, not any of the nonclustered indexes, or different nonclustered indexes somewhere else. So it would kind of give me the best of transactional replication plus Always On Availability Groups. It is never going to happen. I say that, I would love to have it happen, I just don’t see it happening. J.H. says, “How about table level restores?” All the third party backup software can already do that anyway so I’m kind of like, eh, I’m used to seeing that out of LiteSpeed, Redgate, and Idera. So if you want that, you can get that today for like $1,000 a server.

Richie Rump: There’s another comment, “Am I mistaken but isn’t no-logged table the same as in-memory tables? Or do those get logged?”

Brent Ozar: It’s not. With in-memory tables, first off, you have to accept all the restrictions of in-memory tables, which rule out a whole lot of capabilities of regular, plain old tables. Plus, if they’re durable, they are still written to disk. They’re just written to disk in a different format. You can say nondurable in memory tables but those are simply erased whenever you go restart that SQL Server. I do want the data there, I just don’t need it logged.

Richie Rump: Yeah, another great way to put session information inside of SQL Server because that’s always a great idea. Yeah, let’s do that.

Brent Ozar: And you know, so I actually came around. When they said that they’re going to allow Hekaton in Standard Edition I was like, okay. So finally, I will accept session state in Hekaton on Standard Edition. I’m not happy about it, not excited about, it still seems like flushing money down the toilet but if you had to flush money down the toilet, now you can just flush it down with Standard Edition money instead of Enterprise.

Richie Rump: Yeah, so if you want to flush any more money down the toilet, you can go to Richie Rump Palmetto Bay, Florida 33157. There you go.

Brent Ozar: There you go. And on that bombshell, we will end. Good night, everybody. Thanks for watching. We will see you next week. Adios.

SQL Server 2016 SP1 Bug with Graphical Execution Plans

Last Updated February 13, 2017

Early adopters: if you’re rolling on SQL Server 2016 SP1, you may have noticed that when you click on query plans in sp_BlitzCache or sp_WhoIsActive, they sometimes render as ugly XML rather than the beautiful graphical showplan.

Here’s a 3-minute video showing the bug:

SQL Server 2016 SP1 Query Plan Bug

Watch this video on YouTube

To work around it, save the XML plan as a file – but you’ll need to frame the name in double quotes, like “my beautiful query.sqlplan” and end it with a sqlplan extension. Then, open it up again, and SSMS renders it as a graphical plan.

To get this fixed, vote up these 3 Connect items:

Index Scans: Not Nearly As Bad As You Might Think

Last Updated February 13, 2017