Blog

First Responder Kit Release: Still Totally Lacking A Servicing Model

First Responder Kit
7 Comments

A lot of people would think we’re crazy for releasing a FRK update a few days before a new version of SQL Server drops, but we’ve been adding 2017 compatible features for months

Also none of you are on 2017

Most of you won’t live to be on 2017

If you’re lucky, your kids3000 will migrate the last 2008R2 servers to SQL Server 2016

In the year 10,016, when, Double Extra Extended WE MEAN IT THIS TIME Support finally ends

My uploaded AI will be laughing at you

You can download the updated FirstResponderKit.zip here.

sp_Blitz Improvements

— #1100 Ironic that someone would encrypt backups with a certificate but then not backup their encryption certificate, isn’t it? A bit like copying all your DVDs to VHS and then lending your VCR to a carny.
— #1110 There was a code switcheroo that took out the filtering for regular DBCC CHECKDB calls. @markions told us about it. Brent fixed it. I did absolutely nothing. I chose cozy.
— #1116 I got up off my cozy butt and fixed some typos in system XE session names to filter out. Boy were those stupid typos.

sp_BlitzCache Improvements

— #976 Is Paul White Electric? It’s a question we’ve all asked ourselves about the SQL Mastermind/Intergalactic Celestial Being masquerading as a humble blogging man. If you’ve always wanted a reason to email him but couldn’t think of one, perhaps finding a Switch operator will be your excuse.
— #1129 No matter how much gets written about Implicit Conversion, we still keep finding it. This code extends the analysis of implicit conversions to tell you which variables caused implicit conversion on which column when they effect a Seek Plan. In testing, it works for stored proc parameters and declared variables (procs and adhoc).

While we’re in there, we pull out the compile-time values and give you the exec command used to call the proc, in case you need to test a parameter sniffing scenario.

sp_BlitzFirst Improvements

— #983 Wait stats are categorized the way MS lines them up in sys.query_store_wait_stats. It’s like a wait stats decoder ring that you don’t need to buy 48 boxes of Count Chocula to get.
— #1092 There are 172 kinds of memory pressure in the known universe. We now warn on one more of them: external memory pressure. We try to figure that out by looking at if memory usage is way below max memory.

sp_BlitzIndex Improvements

Nothing this time.

sp_BlitzWho Improvements

— #1089 On pre-2012 versions of SQL Server, it was possible for multiple rows to be returned due to an errant LEFT JOIN clause that was later replaced by an OUTER APPLY. Thanks to @nirst for letting us know about this one!
— #1090 In tempdb usage, we weren’t subtracting deallocated pages from allocated pages. This is dumb. And Fixed.
— #1112 Welcome to the future. Or at least current. SQL Server 2016 introduced the sys.dm_exec_query_statistics_xml DMV. That’s s fancy way of saying we have another way of finding query plans for currently executing queries.

sp_DatabaseRestore Improvements

Nothing this time.

sp_BlitzBackups Improvements

Nothing this time.

sp_BlitzQueryStore Improvements

— #976 and #1129 from above
— #1118 Some stored procs were being flagged as having unparameterized queries when they weren’t.
— #1120 When examining a single stored proc, different runs had costs summed up across executions. Now they don’t. Weird, right? Accuracy. What a concept.
— #1122 Parameter sniffing symptom thresholds were relaxed from 10,000x difference between averages/min/max to 100x difference. This makes more sense. Long term, I’m going to work on making this percentage based. I just have to learn math first.

sp_AllNightLog and sp_AllNightLog_Setup Improvements

— #1005 Databases were left in a questionable state. I thought it was Maryland. I was wrong. It was “not backed up”.
— #1016 It was hard to disable restoring jobs, because we built this thing Ford Tough. It’s still Ford Tough, but it has an override to make it Kia Wimpy if you need to disable stuff.
— #1037 Added support for encrypted backup parameters!

sp_foreachdb Improvements

Nothing this time.

You can download the updated FirstResponderKit.zip here.


Adaptive Blog Posts

It turns out I can be dumb

In a previous blog post about Adaptive Joins, I thought that EXISTS wasn’t supported by the new feature. It turns out that EXISTS is totally 100% supported, as long as your indexes support EXISTS.

To show this, I need to show you a query that gets an Adaptive Join plan. After I show you a query that can’t get one.

Forget about the Users table. With no index on the Posts table that has OwnerUserId as the leading column, there’s no join choice.

There is no Dana, there is only hash join.

Probably more Canadian regulations

If I add an index that makes join choices possible, I get an Adaptive Join plan.

Re-running the same query, I get El Adaptivo Plana.

Ain’t It Fun?

Now, I’m going to run two slightly different queries. The main difference is that I’m now selecting PostTypeId. The where clause date changes a bit to show different types of plans with the new column.

The two plans that result from these two queries aren’t Adaptive at all.

Human Being

What Happened?

Well, that first plan looks just like the original plan. It uses an index where OwnerUserId isn’t the leading column, so no other plan choice is available. It has to hash.

That second plan, though. It uses the right index to give us an Adaptive Join, but it seems like the Key Lookup and downstream Nested Loops join does the whole caper in.

Since I decided to stop being dumb, I started up my Extended Events session to capture reasons why Adaptive Joins aren’t used.

Cryptic Walk

The reason it shows for that is eajsrUnMatchedOuter.

There are a number of reasons listed in that XE session for why Adaptive Joins might be skipped:

  • eajsrExchangeTypeNotSupported
  • eajsrHJorNLJNotFound
  • eajsrInvalidAdaptiveThreshold
  • eajsrMultiConsumerSpool
  • eajsrOuterCardMaxOne
  • eajsrOuterSideParallelMarked
  • eajsrUnMatchedOuter

I’ve tried some other methods to trigger other reasons, like hinting Merge Joins, generating Eager Spools, and joining to one row tables. So far nothing else has made something pop up.

Oh well. At least I’m less dumb now.

Thanks for reading!


SQL Server 2017 Showplan Schema Is Available

File this under…

Things probably only I care about. But the Showplan Schema for SQL Server 2017 popped up… well, I caught it last night. I probably could have written this sooner, huh?

So what’s new? Well, if I compare the 2016 SP1 schema using KDiff, a few things stand out. Some I’ve blogged about, and some… I have no idea what they are.

The first thing that shows up is the statistics used by the query. Both sp_BlitzCache and sp_BlitzQueryStore support this.

Stat Stud

After that is support for Interleaved Execution and Adaptive Joins. Unfortunately, from what I’ve been able to see, the Interleaved stuff is in actual plans only. Adaptive Joins are flagged in both cache-spelunking procs, but there’s not much analysis yet. Still gotta see if there are patterns to look for.

California Sun

There are a few odds and ends line that build plumbing in for AJ/IE, and then there’s this! Jackpot!

Anyone want a HJ?

So here’s the weird one.

ICKEY

I have no idea what these are for. If anyone from Microsoft wants to drop me a line, I promise I’ll… tell or not tell people about it. Whatever works for you.

After that, there’s this curious bit.

50 Shades of Optional

The reason this is curious is because TOP WITH TIES syntax has been around forever. I wonder what this is around for.

Nothing else was all that different or interesting. Which I guess means I can leave the plan cache procs alone.

Just kidding!

Thanks for reading!


How to Hire a Junior DBA

Interviewing
9 Comments

When you’re hiring a Junior DBA, ask yourself a few questions:

“What job did they have before this?”

Someone who’s currently employed as a Junior DBA isn’t looking for another Junior DBA job: they’re looking for a Senior DBA job. Because of that, you have to rethink how you write the job description and minimum required qualifications.

To explain DBA career paths, let’s start with a slide from The Senior DBA Class‘s first session:

DBA Job Roles

The left side talks about what’s involved in building and hosting a database application.

Developers usually start from the top and work down: they work with databases in terms of what they need to do to get data in & out of the SQL Server. They’re familiar with the T-SQL language, for example, but not intimately familiar with hardware, virtualization, storage, and Windows clustering.

Systems administrators start from the bottom and work up: they come from an infrastructure perspective, and they’re used to installing and troubleshooting SQL Server. However, they may not even be able to write a single T-SQL query.

You might want someone who knows that whole left side stack, but that person isn’t a Junior DBA. That person is a Senior DBA – and even then, they won’t have a deep level of knowledge across the entire stack.

“What part of that stack is most important to us?”

If you can’t have everything in a Junior DBA, then you’ll want to prioritize which roles in that stack they need to already have experience in when they start on day 1. If table & query design is more important to you, then look for developers first. If installation & troubleshooting is more important, then look for systems administrators.

Then, write a career development plan for the target hire. Where are you going to send them for training after 6-12 months in order to flesh out their knowledge in other areas where you need them to grow? This becomes part of your sales pitch – and it’s exactly the kind of pitch that will set your company apart from other offerings.

Then, write it into your job description. For example:

  • We want a systems administrator who’s been spending part of their time with SQL Servers for the last 1-2 years
  • You should have installed several production SQL Servers, and done troubleshooting on them when they went down
  • You don’t need to know how to design tables, indexes, or queries
  • After 6-12 months with us, we’ll send you to a query design class so you can become comfortable with T-SQL, and segue into full time database administration

“Do we have someone who fits this already?”

You might recognize that this person is already on your team. You might already have a developer or systems administrator who really seems to love databases, and they want to take their career to the next level. Promoting them into a database administrator position lets you keep their valuable company knowledge around, and give them some more career runway.

When given the choice, I’d usually rather promote someone internally because the DBA position can be politically challenging. This person’s job is to serve and protect the data, which means telling a lot of people “no.” Their job is a little easier if they’ve already built up some political capital internally in the company. If a new person comes in and starts telling everybody “no no no, we have to shut that off,” then they make enemies fast.


Which sp_configure Options Clear the Plan Cache?

When you change MAXDOP and run RECONFIGURE, SQL Server clears the plan cache. (It’s documented, by the way, as is the rest of the behavior I’m about to show you, but sometimes I like to learn the hard way by running experiments.) Here’s the easiest way to see it – but make sure to only do this on a development server, not in production:

When you check again, you’ll see that your plan cache has gotten the Men in Black treatment.

DBCC FREEPROCCACHE

It makes sense that changing parallelism from single-threaded (1) to any other option would require building plans anew, but you might be surprised to see that any change causes it to blow the plan cache – even from, say, 4 to 8.

Or even if you change it from 4 to, uh, 4.

There’s no logic built into RECONFIGURE – it just flat out blows the plan cache if you issue any sp_configure command to alter MAXDOP, regardless of whether the value was changed or not. (Remember that – it’s going to come in handy here in a second.)

So what about other options? To see the full list, check the sys.configurations table:

sys.configurations

Dang, there’s dozens! Clearly, I’m not going to run each of these by hand to write a blog post, dear reader. To the power of dynamic SQL! Knowing that a non-change still blows the plan cache, let’s write dynamic SQL that will:

  • Put a query in the plan cache
  • Run sp_configure with one of the configuration values, and its maximum value
  • Run reconfigure
  • Check to see if the plan cache was cleared
  • Run sp_configure again with the original value to set it back

DO NOT RUN THIS ON A SERVER YOU CARE ABOUT.

But here it is in case you wanna poke holes in my logic, because I’m pretty sure there’s some:

The end result:

sp_configuration options that clear the plan cache

Voila – these are options should probably only be changed during a maintenance window, and monitor for parameter sniffing issues afterwards.


Live Query Plans and Blocking

You’d think this would be cool

After all, you’ve got a query plan being executed LIVE ON NATIONAL TELEVISION! But it’s not. It’s actually one of those weird reasons why you still need other ways of looking at DMVs for issues.

Demo Library

Let’s take a closer look!

First, we need an index. Indexes are good for things, I’m told.

I mean, maybe not this index. It’s kind of okay for about one query, and it isn’t this one.

Don’t worry, I’m only setting my own Reputation back to zero. No other Users were hurt during this demo.

Over in another window, I’m going to turn on Live Query Plans and then start running my magnificent select query.

Giffy

 

Quick Rundown

So what’s happening here?

Our update is running in an open transaction and changing the Reputation to 0 for my Id.

Our select is running and trying to show the row of data for my Display Name.

In the query plan, we see our select query using the very narrow nonclustered index on Display Name to find the row we want, and then doing a Key Lookup back to the clustered index to fetch all the columns we need to return.

The nonclustered index returns the single row we’re interested in, but the Key Lookup is blocked by the update, so it can’t continue.

Intersection

What the live query plan doesn’t tell you about your query.

  1. That it’s being blocked
  2. What it’s waiting on

No seriously, nothing turns red or beeps or starts checking its watch, and there’s no Clippy asking you to hang on a moment. It just sits there.

And since the query hasn’t finished running yet, we don’t have any wait stats in the plan XML yet.

No, not Activity Monitor

If you run an activity snapshot tool like sp_WhoIsActive or sp_BlitzWho, you see the blocking quite easily.

Know how

All the lock waits and the blocking session Id is right up front. In sp_BlitzWho, we hit a DMV “new” to SQL Server 2016 called sys.dm_exec_session_wait_stats to return the top waits a session has encountered. Looking there, you can see just how long I was running demos for and writing this blog post.

Thanks for reading!

Brent says: I was kinda hoping there would be a red flashing X or something, showing the operator that was blocked. How cool would that be? Or maybe have the Soup Nazi pop up over the key lookup. As long as we’ve got an animated GIF the post, why not, right?


#PASSSummit Expert Performance Tuning Pre-Con: Stretch Goals

#SQLPass
0
Sure, they look trustworthy

Eric and I are doing a PASS Summit pre-con class called Expert Performance Tuning for SQL Server 2016 & 2017. To encourage signups, we’re giving every attendee a free Enterprise/Everything Bundle, plus some other surprises you’ll get during the event.

Response has been absolutely bonkers – over 300 of you have registered, making it PASS’s highest-selling pre-con ever.

So we thought, why not give you some Kickstarter-style stretch goals, extra benefits you get if attendance goes even higher?

At 325 – free recording of the class – when Erik & I do the online version of the course in November, we’ll record it for Instant Replay, and the Summit attendees will get access to that for 1 year.

At 350 – repeat the pre-con online – attendees get one free seat in the online version of the class in November, January, or April. You can use this yourself to catch up on parts that you didn’t quite absorb the first time, or give to a friend so they can get their learn on.

At 375 – free 1-day online class designed by you – we’ll give the attendees a menu of our training presentations to pick from, let them vote on their favorites, and then deliver it online with Instant Replay recordings included. You’ll be able to pick from topics like query tuning, wait stats, index design, and more.

At 400 – the online class will be TWO days long – same as above, but you get to pick more training presentations. Am I insane to give away a $1,995 value to every pre-con attendee? Probably. But if you people make this crazy stretch goal happen, then you deserve it. No Summit pre-con has ever broken 300 attendees before, let alone 400, so if you can pull it off, then you deserve something spectacular.

If the class sells out – the online class will be THREE days long. If PASS says there’s so many registrations that they have to stop taking new ones, then we’ll pull out all the stops and pour training into your brains for days.

Current registrations: 339 as of Sept 29! Wanna get the stretch goals? Send the pre-con info to a friend, get the word out, and let’s have fun in Seattle!

Update 2017/10/06 – congratulations, folks, you sold it out! Attendees will get to design their own 3-day online class.


[Video] Office Hours 2017/09/20 (With Transcriptions)

SQL Server, Videos
0

This week, Brent, Tara, Richie, and Erik discuss a most excepted solution for SA and securing an instance, how many VLFs should a database have, merge replication issues, query tuning tips, database restores, and Larry Ellison’s self-driving database.

Here’s the video on YouTube:

Office Hours Webcast - 2017/09/20

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Enjoy the Podcast?

Don’t miss an episode, subscribe via iTunes, Stitcher or RSS.
Leave us a review in iTunes

Office Hours Webcast 9-20-17

(Sorry, no titles for questions this week – Brent’s off in the Isle of Man. Insert joke here about him not being manly enough to be a permanent resident.)

 

Brent Ozar: Hannah asks a question… I guess we’ll get started with the technical stuff… Hannah says – of course, it’s a tough one to get started with, “In regards to SA, the SA account and securing an instance, what’s the most accepted solution? Should I disable SA and only use Windows Auth, or should I change the SA account?” What did y’all do when you were managing servers?

Tara Kizer: I renamed the SA account and then set the password to something ridiculous that nobody could ever type in unless you got to source control. We had to use SQL Authentication due to the applications. I would have preferred to be Windows Authentication only, but whether or not you had to set up the instance to mix mode authentication to allow SQL Authentication is dependent upon the applications that are connecting. So since I had to have SQL Auth enabled, I wanted to do something about the SA accounts, so we renamed it and then just set a ridiculous password and put that in source control, or the password manager, of course; but never use the renamed SA account to login.

Erik Darling: I was in the same boat as Tara, where that old [inaudible] account was a SQL login, so I couldn’t just use Windows Auth only. But I would at least make an attempt at keeping SA disabled. Other people with SA privs would come on and turn it back on to do things with, which was kind of aggravating. Room full of developers just changing SA settings, and I was like no, put that down, stop. Someone’s popular…

Tara Kizer: No, it was just spam calls; so many spam calls.

Brent Ozar: David says, “I scramble the SA account daily. I had a vendor that kept resetting it and using the [crosstalk].” That’s excellent.

 

Brent Ozar: Next up, Chris says, “Hi guys, curious what your thoughts are for how many VLFs my database should have. I hear lots of different recommendations, what’s a good number to watch out for?”

Tara Kizer: [crosstalk] I have these numbers in my head, just based upon all of the databases that I’ve gone through and shrunk and fixed the VLFs on. Because I have run into massive downtime because of extremely high VLFs, back when I was an intermediate DBA and learned about VLFs afterwards. Anyway, he says to keep them under 100 or 1000 VLFs. What about – for 100GB database, generally they have less than 2000 VLFs. 2000 is too high in my opinion for 100GB database. 100GB database, I’d expect around less than 200 VLFs. Shrink it down as far as possible, like even to 1MB, then start growing out into bigger chunks. But 2000 is too high, in my opinion, for a 100GB database.

Erik Darling: Yeah, I think the main thing with VLFs, though is that it really is an odd circumstance where they cause a big problem. So, if you’re at the point now where you’re kind of twiddling thumbs and looking for something to fix, go for it. Otherwise, I would probably concentrate on something else.

Tara Kizer: Exactly, that’s what we did at my last three jobs. Because we ran into this massive downtime because of high VLFs, it was just about overall health check of the systems. And so, my last job, two or three times per year we had this checklist of things we would do to ramp up for certain holidays that would cause extreme load on our servers. So, we had this checklist and one of them was check VLFs, so it’s about the health check.

Erik Darling: It’s one of those things like usually when you fix it once, you don’t have to go back and go crazy with it. Like I ran into a 5TB or 6TB database, when I first got it there were like 25,000 VLFs, which even I was like oh, I should do something about that. But even when I got it down to a normal number, I still had a restore take 21 hours with it because the VLFs ended up being…

Tara Kizer: Oh right, yeah that’s the balancing act, yeah.

Erik Darling: Ah that was cute.

 

Brent Ozar: Roy says, “Am I the only one who hates all the cloud talk around SQL Server? I prefer the simplicity and strength of database integrity, you know, ACID. There are developers struggling to keep up with the latest visual studio, much less constantly being bombarded with the cloud migration suggestions.” Richie, this has you written all over it.

Richie Rump: It kind of does, doesn’t it?

Erik Darling: Richie’s ears started whistling.

Richie Rump: Oh look, my ears are turning red, not that Brent could tell. But yeah, I kind of do, and there’s going to be a big push from Microsoft to kind of get everybody into Azure, because that’s going to be their next big business venture. It’s not going to be from licensing fees, it’s going to be from getting everybody in the cloud. So, it’s not going to go away anytime soon. The only question is, are you going to go there or are you going to stay on-prem? And I can only imagine they’re going to make it harder for us, as time goes on, to stay on-prem. Either through higher licensing costs or something other, because they really want us all to get to the cloud.

I think you still could do ACID in the cloud without a problem. Azure SQL should be able to do that, but things are in flux right now in the Microsoft world with all this stuff with .NET and visual studio constantly being re-upped, revamped and the next version of .NET not being where people expected it to be, and that constantly being turned over. So just hold on tight. If you’re going to stay where you’re at and wait to see what happens, then do that. If you’re going to grab the cloud hard and hold on, try that. But I don’t think there’s anything wrong with sticking with on-prem SQL Server until Microsoft really figures out what they’re going to do in the cloud.

Erik Darling: I think SQL Server’s going to be a tough zoom. I mean, they’re not going back to less frequent releases. From what I read, there’s going to be probably every two years, maybe even every year releases. I mean, if they throw in a 2017 and they called it 2017, not 2016 R2. So maybe even yearly releases, so that’s going to be tough to juggle even as a consultant. Like what version are you on, what can that do?

Richie Rump: For your lying DBA, right, they guy just working on-prem, not a very large stuff that’s going on, is there going to be stuff in the engine that’s going to be worth that upgrade? I mean that’s going to be the real question; do I need to upgrade this year or not, I think, is going to be a question that we’re going to be hearing a lot going down the line as we start to see more and more frequent updates. And a lot more stuff being put into the product, and stuff that we may not need or use.

Brent Ozar: I’m tired of it just because it feels like a sales pitch. Very often it feels like – I know they’re going to get monthly revenue from me, I get it. And they’re shoving it down DBA’s throats to go, “Buy now, save later.” And it just feels kind of old with that, but I get why they’re doing it, but it just feels like a spammy overreach right now.

Erik Darling: Obviously, part of what they want to suck people in for is so they can keep going towards the automated tuning thing. So, all the, “We can fix this problem for you if you just give us the keys to the car. We’ll just go in and watch the engine run and figure out what needs to get fixed and we’ll just keep doing that.” So, like we talked about last week, with the managed instances, a lot of that is just farmed out query tuning. So, they do the automatic query tuning, they do the automatic index tuning, they fix regression. So, all the tooling that they’re building in with query store, with query store wait stats, with the memory grant feedback, all the cool new batch mode stuff. So, all that stuff that comes back is like, oh okay, we just –  the more people we can get into this, the better we can get the automation to be.

 

Brent Ozar: Alex has a question. Alex, I’m just going to tell you, whenever you see more than multiple sentences in there, go ahead and put follow the instructions on the screen. Go to DBA.StackExchange.com. It’s a great place to go in and post really long involved questions. We do the same thing for ourselves. Erik posted a question today over at Stack, we answer questions over there all the time, but whenever you find multiple paragraphs involved, go ahead and post that over as Stack.

Erik Darling: We answer questions there too.

 

Brent Ozar: Sri says, “I’ve got merge replication problems, erroring at night due to virtual memory. I’ve increased the RAM on the VM to 32GB, unfortunately, I’m still having problems with SQL Server running low on virtual memory, please restart the merge agent. What should I do?” That’s actually a good one for Stack as well, I think probably. Have any of us used merge replication?

Tara Kizer: Just transactional. On this week’s client, I’m using it and I’ve had one other client using it.

Richie Rump: I’ve used it once, that was it. That was my first and last time, that was it.

Erik Darling: Oh wait, but if it’s virtual memory, couldn’t that be that the VM is getting sucked dry?

Brent Ozar: Yeah, it could be anything from outside, external memory pressure…

Erik Darling: I might try doing that…

Tara Kizer: But this is the error from within merge replication though, so it’s referring to virtual memory at the OS level, isn’t it? Wouldn’t that be the page file?

Erik Darling: Right, but if it’s a VM then something else is sucking memory away from the VM then.

Tara Kizer: Right, but I mean the virtual memory in the error is not necessarily referring to the virtual memory on VM.

Erik Darling: I think what I might try – the first thing I would do, off the top of my head, is try turning lock pages in memory on, to see if I could keep memory on the box and keep memory for SQL stuck in there.

Brent Ozar: Also, go Google for this, because I Googled and there’s a whole bunch of people with the same issue. So, go Google it and there’s a bunch of advice out there.

Erik Darling: Brent Google Ozar.

 

Brent Ozar: Joseph says, “Please don’t laugh. Have you ever seen a SQL Server database repeatedly fry an SSD drive?”

Tara Kizer: I wonder if it’s the drive that’s the problem and not the server.

Brent Ozar: I wonder if it’s not the server. Like a crappy RAID controller. Tell us ore about what fry means, because I’m curious. What kind of symptoms are you seeing? He says, “RMA.” Is that remotely Madonna assisted or?

 

Brent Ozar: Samuel asks, “How accurate is the tempdb estimate while running CHECKDB with estimate only?” Wow, I’ve never done that.

Tara Kizer: I haven’t either.

Erik Darling: I’ve done it, I just don’t remember how accurate it is.

Brent Ozar: He says, “Can you simply resize tempdb with the estimated value?” Tell us more information. Tell us how big the database is, like how many GB or TB, then how large is your tempdb, just ballpark. That would be interesting…

Erik Darling: Also, how big is your largest object in the database? I mean, I’ve run that before and I want to say the thing that they use is kind of like what they use to estimate – when you run that stored procedure that estimates like page RO compression; where it takes a rough guess at how much it will compress by looking at some of the data.

 

Brent Ozar: J.H. says, “Is it accurate when enabling TLS 1.2 support we would need to install a specific patch?” Your best bet there is to go post that on Stack Exchange as well. We don’t get involved with client-side encryption stuff at all.

 

Brent Ozar: Rex says, “Pedro Lopes of Microsoft has a solution to your VLF problems…” I’m going to say thanks just so I can stick that in everyone’s Q&A, if you want to go hit that link. There’s actually a really good post by Tracy Boggiano this week that will be in the weekly links this coming week. She has a script to automatically fix your VLFs. She checks VLFs, then shrinks your database down to nothing and regrows it out to an appropriate amount, and just runs it in a SQL agent job, which scared the hell out of me until I remembered it was Tracy Boggiano. Then I was like, I actually kind of trust. And it’s a big script. It’s huge.

Erik Darling: I was going to say, if you wanted to like take into account all the weird stuff, like is it okay to shrink a log? I’m like, what time is it?

Brent Ozar: Yes, and she’s got a lot of logic into there, and I was like – working hard on that.

Erik Darling: That’s cool.

 

Brent Ozar: Graham says, “I have high CXPACKET and latch waits. My application is slow, the database size is 90GB, the server has 32 cores and 64GB of RAM, and I can’t put any more RAM on the server. I see a lot of parallelism in query plans from my top ten most expensive queries. Should I play around with MAXDOP and cost threshold?

Tara Kizer: I mean, you need to make sure that they’re set to – yes, not play around but, I mean, make sure that they’re set appropriately for your server, probably eight for MAXDOP and 50 for cost threshold for parallelism. But the fact that queries are going parallel doesn’t mean that there’s a problem. I would tune the expensive queries.

Brent Ozar: And what would you look at in the queries? What are the kinds of things that you look for?

Tara Kizer: I mean, that’s more of an Erik question, but I mean, for my level – because he gets into the individual operators. I’m just like, I’m out. So, I’m looking for like, glaring problems, you know? The warnings, implicit conversions, compilation timeouts, missing indexes. I’m looking at the scans, but scans aren’t necessarily a problem. Expensive key lookups – you know, I always pull my plans into – what is it called – Plan Explorer, Century One to Plan Explorer, and then if the query is large enough, I’ll go into the top operations tabs so I can see exactly which one is the highest estimate percentage, because I can’t always find it if it’s so big and a lot of our clients have some really large plans. At least I’ll know what to be on the lookout for and then go back to the graph.

Brent Ozar: He says it’s an ISV app where he can’t change the queries. I would still index though.

Erik Darling: Yes, index tuning would be the biggest part of that.

 

Brent Ozar: Joseph says, “See howsmyplan.com.” Alright, I got to go look this up. If this is – there was a SME – oh yes, this is – I think Daniel, I forget his last name. Dan Janick has a plan analysis tool howsmyplan.com. I haven’t seen it improved a lot lately, like he just launched it once and that was kind of the end of it, but it was interesting.

 

Brent Ozar: Let’s see. Hannah says, “Is it possible to add schema binding to a current existing view or do I have to completely recreate it. My goal is to add a clustered index to speed up performance. The view returns four billion rows.”

Erik Darling: I mean, you have to alter the view to add schema binding, but you don’t have to like recreate the view.

Tara Kizer: Aren’t you adding a where clause when you query this view?

Erik Darling: Also, I’d be curious to know why you think adding a clustered index to the view would help you return four billion rows.

Brent Ozar: Well, imagine you got [crosstalk]

Richie Rump: Security reasons? I don’t know. That’s Brent’s question.

Brent Ozar: Say that the big cost of the view is joining you know, one million rows,cross-applying or cross-joining another million rows and producing billions of rows.

Erik Darling: You can’t make an index view with apply.

Brent Ozar: No, I mean, like an inner join, whatever, million times a million, inner join, one equals one. This is horrifying for so many reasons because it’s going to write four billion rows to disk. It’s going to write four billion rows, whatever space that takes in order to do the clustered index. Hannah, we like you, we want to know more about this.

Tara Kizer: It doesn’t matter how many rows the view returns when you just query the view without a where clause. Once you apply the where clause, it’s not churning four billion rows and then applying the where clause. It’s not like a linked server where it pulls all over the data first, then filters.

Brent Ozar: No, but if the view has four billion rows and you put an index on it, it’s going to take enough space for the [crosstalk].

Tara Kizer: I’ve tested index views, I’ve never ever gone down the route of deploying those to production because indexing the underlying table has been sufficient in all cases for me.

Erik Darling: Yes, unless you’re doing like really, like, you know, kind of big, involved aggregations, I’ve never seen really great returns on index views either. [crosstalk] so many rules.

Brent Ozar: Yes, it’s messy. It was for us, it was the overhead of joining tables together was so high and we were trying to get stuff down into milliseconds, but super rare.

Richie Rump: Yes, the only time I’ve had to deal with you know, four, five billion row tables, I’ve actually used table partitioning and then used the partitioning key to get into the data where I needed to, but on a view…

Brent Ozar: Somehow your audio went flawless again.

Erik Darling: Somehow.

Brent Ozar: And you’re not – I’ve seen you, you haven’t even gotten near the computer. You’ve been standing over by the couch. It’s a random number generator, maybe it’s the pacemaker that’s in your pocket maybe.

Richie Rump: It works better now.

Brent Ozar: You’re totally good. It’s flawless.

 

Brent Ozar: Samuel says his database size – this is about that CHECKDB with estimate only. Samuel says, “My database size is 100GB, the largest table is 90GB. My tempdb was 10GB. Don’t ask.” Yes, that’s going to be a problem. That’s going to be a problem. Especially if you have any indexes on that thing, you’re going to need – it’s not like you’re going to need 100GB, but I wouldn’t like, try to get by with 15 or 20.

Erik Darling: Yes, but read that again. His database is 100GB and his largest table is 90GB.

Tara Kizer: It’s basically one table.

Erik Darling: But I was going to say, does it have audit in the name? Like, log, what’s – because…

Richie Rump: Images, maybe? I don’t know.

Erik Darling: Because what I’m curious about is – because I see this a lot with databases of this type, where you will have that auditing or that logging table, and that table will be a heap, and that heap may have a whole bunch of empty space in it. So, I would be curious if that’s part of why that table is so massively weird, or this has some crazy like XML column in it. I don’t want to talk about it.

 

Brent Ozar: Mike says, “If I restore database to a temporary server to run as production for a day or two, do I need to restore any of the system databases with it?”

Erik Darling: No? I have never had to do that.

Brent Ozar: And you think about how you want to move your jobs around. Like agent jobs or stuff like that, but SSIS packages, security on them.

Richie Rump: I mean, if it’s restored to a temporary store just to do some dev testing, like unless you have objects that you rely on in system databases, then no. Even then, I would just script those out and stick them on the server.

Tara Kizer: I mean, but if you can take the downtime and you install on that second server using the same paths as the original server, stop SQL Server and copy all database files over, I mean, you know, the system databases over, and then both servers have to be stopped for this and that gives you a quick way to get all the logins and jobs and everything. That way you don’t have to script stuff out, but that requires the downtime to do it.

 

Brent Ozar: Srejath says – and I’m going to take this one because I know this – Srejath says, “Larry Ellison reveals self-driving database.” No. He says he’s going to build one. He doesn’t have one yet. Similarly, I would like to announce – I don’t know, I was going to come up with something. I can’t do it on the fly without being horrible. He says, “Fully autonomous Cloud database service, sounds cool, what’s your take?” My take is that it’s just like every time that Larry Ellison says things like, “My database is unbreakable. This database is self-tuning. Oracle is amazing, it’s really cool.” But you notice how it’s always surrounded by highly paid experts, you know, gently massaging it, feeding it cheese, whatever. It is a highly tuned database that requires highly paid professionals. I am sure that Larry is hearing the calls of people saying, “Come to SQL Server. Any knucklehead can manage this thing” and going, well, I better make it easier to manage. So, he’s going to paint a rosy picture of what it’s like down the road, it’s going to take years, or lots of releases to get there. But will Microsoft announce something similar? They already have. As your SQL DB, they make the same kinds of claims about automated index tuning and automated query tuning, and it doesn’t work yet.

Erik Darling: The other thing is that Larry has been making similar claims since I think, Oracle 9, that you know, doesn’t need a DBA, fixes itself, has like, you know, little settings you can change to just make everything fly right. So, you know, just another one of those things that will not happen down the line.

Brent Ozar: Samantha says, “He also says, buy our Cloud credits.” Yes, I mean, they got caught behind the game on in terms of the Cloud, it’s just like with Microsoft got caught behind with Internet Explorer. Netscape Navigator came, took out over the world, Microsoft had to backpedal fast and they were able to get past it, build out a web browser and did fine, but I don’t’ know how Oracle is going to do long-term in terms of sales, but they’ve got a fantastic database. I’m sure they’re going to do well.

 

Brent Ozar: Roy says, “What are we going to do when Ola Hallengren stops upgrading his maintenance solution? Like when he retires?” Great news is, he’s got it on GitHub. He put the whole thing out on GitHub, so you can make changes to it. We’ve already branched it ourselves and made changes to send back to Ola, so it’s cool. If he retired, I think we would probably be willing to take it over. It would be a lot of fun. I think that where we would fork it into like the DBA tools project, I think the community’s adopted it so much that it’ll keep going.

Tara Kizer: I felt bad when I retired from my eight solutions because I was one of the competitors back in the day. I had actually had mine out there longer but it was so hard to keep it up to date with all the versions and editions and all the various things, and eventually I gave up and we stopped using my solution at my work and deployed out all those codes to – I mean, there was about 700 servers we had it deployed to, and I knew from my emails and the blog comments that my solution had been used tons of servers around the world, and I’m just like, sorry. I just don’t have the time to do this anymore.

Erik Darling: Yes, I mean, like I’m going to be curious how Ola decides to keep up with releases. Just in 2017, there are a couple of really cool new editions to views for log files and for data files where it will tell you how much has changed. So, you can actually make a decision to flip to a full backup if enough stuff has changed. Like you don’t want to take a differential now, you’re going to go ahead and take a full backup because enough stuff has changed. So, I’m interested to see how he decides to keep up with that.

Tara Kizer: Brent’s frozen, and his audio. Maybe he’s gone.

Richie Rump: We got him.

Erik Darling: I was sitting there and I was looking at him, I was like, I’m saying something stupid. He’s just – he’s making that like school mom face. I’m like, crap, I’m saying something bad.

Richie Rump: Make fun of my audio one more time, Brent. One more time.

Erik Darling: Make fun of my video one more time.

Tara Kizer: That’s all the questions we have anyway, so we can either end the call here…

Erik Darling: All we know is that North Korea has bombed Chicago.

Richie Rump: Wow, let’s not mess with that.

Erik Darling: Well, we’re out of questions and just about out of time anyway, and I think Brent is actually silently called the end of this. So, let’s boogie people.

Tara Kizer: Alright, see you.


Announcing More Online SQL Server Classes

Company News
0

Our recent class lineup did pretty well, so now we’ve got a new round:

Always On Availability Groups: The Senior DBA’s Field Guide with Edwin Sarmiento – December 20-22, $2,995 – Learn clustering, design, topologies, readable secondaries, monitoring, and troubleshooting.

Expert SSIS Training with Andy Leonard – December 4-5, $1,995 – For SQL Server Integration Services (SSIS) professionals responsible for developing, deploying, and managing data integration at enterprise-scale.

Expert Performance Tuning for SQL Server 2016 & 2017 with Brent Ozar & Erik Darling – November, January, and April – $995 – Your job is making SQL Server go faster, but you haven’t been to a performance tuning class since 2016 came out. Get up to speed in one day. (This is the same class we’re teaching at the PASS Summit with over 300+ folks signed up – but we’ve heard from a ton of folks who can’t make it to Seattle, and wanted us to teach it online, so here you go.)

All of these classes include Instant Replay, the ability to watch the class recordings for one year after your class.

All of these are 50% off right now!

Why? Because we’re going to run a 50% off sale during Black Friday, and we don’t want you buying now, and then doing a giant facepalm when we do the sale. The classes do have a limited number of seats, too, so grab yours now and beat the Black Friday crowd. See you there!


Why Columnstore Indexes May Still Do Key Lookups

I was a bit surprised that key lookups were a possibility with ColumnStore indexes, since “keys” aren’t really their strong point, but since we’re now able to have both clustered ColumnStore indexes alongside row store nonclustered indexes AND nonclustered ColumnStore indexes on tables with row store clustered indexes, this kind of stuff should get a closer look.

Of course, the effects of the sometimes-maligned Key Lookup are sometimes pretty lousy.

When datatypes aren’t supported by columnstore

You may need to mix indexes in cases where you have columns with unsupported datatypes, like MAX, or perhaps just datatypes that don’t have aggregate pushdown support in ColumnStore yet. I hesitate to make a list here, since it could change in a CU, but here’s what the MS doc currently says about it:

The input and output datatype must be one of the following and must fit within 64 bits.
Tiny int, int, big int, small int, bit
Small money, money, decimal and numeric which has precision <= 18
Small date, date, datetime, datetime2, time

Got it? Also!

The aggregates are MIN, MAX, SUM, COUNT and COUNT(*).
Aggregate operator must be on top of SCAN node or SCAN node with group by.
This aggregate is not a distinct aggregate.
The aggregate column is not a string column.
The aggregate column is not a virtual column.

So uh, anything outside of those datatypes and aggregates can potentially inhibit optimal performance (in case you’re wondering, this sentence wasn’t written by a lawyer).

Demoing it with a clustered columnstore index

Here’s how I set the tables up, finally. Remember kids: loading large volumes of data into tables with nonclustered indexes is dumb.

Don’t do it.

We now have the most optimal setup to get a Key Lookup plan: tables with clustered indexes and God-awful single column nonclustered indexes (okay, so there are exceptions here, but for the most part…).

Come and take it

First up, we need a date to work with. I know most of you can sympathize with that problem.

Q: which function do DBAs have the biggest problem with in real life?
A: GETDATE()

Actually, I’m told DBAs make great spouses, because you barely have to see them.

What was I saying? Date? DATE!

Date.

If you want an explanation for the 1 = (SELECT 1), head over here. Otherwise, let’s write some queries.

For me, that query returns a value of 2018-08-23. The first thing I discovered is that you really have to jump through hoops to get the Key Lookup to happen. The optimizer’s adversity to choosing Key Lookup plans with ColumnStore indexes is well-meaning.

With regular row store indexes, returning 28k rows out of 10,485,760 with a Key Lookup plan would be a no-brainer for the optimizer.

Hooray for index hints.

Where the wood at?

Looking at screencaps of query plans isn’t too fun, is it? I stuck them on PTP for anyone interested.

And for extra credit, let’s see what sp_BlitzCache says about our queries.

Shortwide

The warnings shown by sp_BlitzCache

It’s nice when you don’t have to do any work to find problems. That’s why I do all the work I do on sp_BlitzCache and sp_BlitzQueryStore. I want to make your life better and easier. Let’s look at some of the warnings we have for each query:

  • Clustered ColumnStore: Missing Indexes (1), Parallel, Expensive Key Lookup, Plan created last 4hrs, Forced Indexes, ColumnStore Row Mode
  • Clustered Row Store:  Missing Indexes (1), Parallel, Expensive Key Lookup, Unused Memory Grant, Plan created last 4hrs, Forced Indexes

Without opening a plan or looking at a single tool tip or hitting f4, we know some things:

  • SQL is angry about missing indexes
  • We have expensive key lookups
  • We’re forcing indexes
  • We have a ColumnStore query operating in row mode instead of Batch mode
  • We have an unused memory grant
  • Both plans are relatively new in the cache (duh)

But focusing on the point of the post, which we should probably do, something kind of obvious happens.

The query that uses the nonclustered ColumnStore index does a typical Key Lookup. It’s able to (1) scan the ColumnStore index in Batch mode, (2) pass those rows to a Nested Loops join, and then (3) ‘join’ the nonclustered index to the clustered index on the clustered index key column.

1, 2, 3, and to the 4

But clustered ColumnStore indexes don’t have key columns.

Let’s look at what happens there!

Loc Out

I’m going to assume that this is a bit like a RID Lookup in a plan using HEAPs. Without a clustered index key column, we need to rely on internal metadata to locate rows. That’s what the Seek Keys[1]: Prefix: ColStoreLoc1000 = Scalar Operator([ColStoreLoc1000]) part of the Key Lookup is doing. Interesting!

Fun yet?

This isn’t a knock against clustered ColumnStore indexes. The team behind them has done awesome work to make them more usable and less painful (remember back when they didn’t work with Availability Group secondaries? Of course you don’t!). I wrote this because I’ve gotten increasingly interested in ColumnStore as it becomes more powerful, and as more people start hopping on newer versions of SQL Server where they’re a viable path to fixing real problems.

Thanks for reading!


Answering Questions For Fun And No Profit

Humor, SQL Server
2 Comments

I love answering questions about SQL

My forum of choice is dba.stackexchange.com because I find the voting and point system with badges for being a decent internet person addictive.

If you like more traditional forums, SQLServerCentral.com has a lot of really smart people on it as well.

When I post questions or answers, I try to make them at least vaguely entertaining. Sometimes it works better than others, and I’m sure the way I write annoys some people who prefer more dry Q&A. I just have a hard time not having a good time.

Famosity

After all, no one likes a finger-wagger.

Personality Plus

Because we send people over to dba.se so often, Brent used his magic internet power to make the company Twitter account send out a link to our answers. Why? So you know that when we send you there, we’re not just punting you off to someone else for help. We really do answer questions all the time.

If you’re going to post a question, make sure to include as much detail as possible. Error messages, query plans, example data, and this one is really important: if you’re going to ask for help with code, make sure you post what you’ve tried, or where you’ve gotten stuck.

The fates are cruel to those who don’t.

And remember! It’s not just SQL Server questions. Folks over there use Oracle, Postgres (the vegan version of Oracle), and even… MySQL. Weird.

Style Guide

If you’re an aspiring blogger, questions on sites like this can be great inspiration for posts.

Heck, doing Q&A is a great way to get comfortable blogging, because it gets you somewhat used to writing, formatting, adding in pictures and links, and all that fun stuff.

It can be a little intimidating getting over the fact that your questions and answers will be scrutinized by millions of strangers, so pick a topic that you’re comfortable with or that you’re interested in, and wait for a question to come along that’s in your wheelhouse.

I promise I’ll upvote you.

Probably.

But I still can’t answer your PowerShell questions.

Thanks for reading!

Brent says: for years, I’ve refused to answer questions privately for free because it just doesn’t help anybody. Post your answers in public, let other folks improve ’em, and let everyone find good answers for free, fast. Seriously, I read some of Erik’s answers just for entertainment purposes. Upvoting means laughter.


Breaking News: Using TDE on 2016? Doing Backups? Time to Patch.

Normally, when you use Transparent Data Encryption (TDE) to secure your database at rest, you can’t get backup compression. Encrypted data basically looks like random data, and random data doesn’t compress well.

SQL Server 2016 introduced the ability to compress your TDE databases. Yay!

Unfortunately, it has bugs that can leave you unable to restore the backup. Boo!

Even worse: compression can get turned on even if you weren’t aware. Booooo! When a database has multiple files created or you’re backing up to a URL, the compression is enabled automatically, and your restore might not work.

Microsoft now says you need to be on SQL Server 2016 RTM CU7 or SP1 CU4 or higher.

Facepalm 2.0

Database administration is hard.

When you install a new version of SQL Server, you get new features – and sometimes, you’re not told about them. For example, when 2016’s TDE compression came out, nobody told you, “If you back up across multiple files, your backups might suddenly be compressed.” You didn’t know that you had a new thing to test – after all, I don’t know a lot of DBAs who have the time to test that the new version of SQL Server successfully performs restores. They restore their production databases into the new version, test a few things, and declare victory – but testing restores FROM the new version’s backups isn’t usually on that list.

Then after you go live, hopefully you start testing your restores, implementing log shipping, and refreshing your dev boxes. That’s when you suddenly discover that your backups aren’t.

Change equals risk: every time you put new software into your environment, you’re gambling that it’s going to improve more than it breaks, and that you can fix the things it breaks. That doesn’t mean you shouldn’t ever change – it brings rewards, too – but just don’t go in naively thinking every piece of software is bug-free, and you need to keep up with all of the different Microsoft SQL Server team blogs. I wish that in 2017, there was a better way to get urgent SQL Server news from Microsoft – but until then, I’ll keep recapping to get the word out about urgent stuff like this. You’re welcome. You can buy me drinks at Summit.


What Would You Put in SQL Server 2019?

SQL Server 2019
186 Comments

If you were leading Microsoft’s development programs, what would you add or change in SQL Server 2019?

Forget licensing or pricing changes – that’s dictated by the marketing teams, and those decisions are made really close to the release date. Instead, let’s think bigger: what are the really big improvements you’d make that would really drive increased adoption?

What features do you think would win people over from expensive platforms like Oracle or DB2, or would you focus on the open source MySQL or PostgreSQL crowds, or the outright NoSQL users?

Erik Says: We used to make posts like this half jokingly, but with MS embracing Community Driven Development, stuff like this can potentially go a lot further. Seriously, take the time to comment, or write your own blog posts about features you want. If you keep quiet, it’s your fault if you don’t see what you want in the product. Me? I’m still pulling for unlogged tables.


The Ghosts of Temp Tables Past

Humor, SQL Server
18 Comments

True story

You may find it hard to believe, but I recently had to fix a small bug in sp_BlitzCache and sp_BlitzQueryStore.

Since both stored procedures have similar functions, they also share some temp table names (mainly the ones for parsing down XML nodes to more manageable chunks).

In the window where I was making adjustments to sp_BlitzCache, I had created one of those temp tables. When I went to run sp_BlitzQueryStore, I was flooded with invalid column name errors. Same temp table name, different column names.

Now, I hadn’t made any changes to BlitzQueryStore yet, and it had been running without error as of the last FRK release.

When I ran sp_BlitzQueryStore from a different SSMS window, it ran fine.

Even though the proc explicitly checks for and drops my temp tables, it was reading from temp tables my session created with different column names.

So what gives?

Demonstrate My Syntax

You’ll need a couple SSMS windows for this.

Run this in one of them. Note that the column name in #t1 is t, here.

In window #2, run this. Note that the column name in #t1 is c, here.

You should get this error (sometimes you have to run the create table statement a couple times in a row, then the proc — I can’t nail it down exactly).

Msg 207, Level 16, State 1, Procedure temp_maker, Line 13 [Batch Start Line 9] Invalid column name ‘t’.
Msg 207, Level 16, State 1, Procedure temp_maker, Line 17 [Batch Start Line 9] Invalid column name ‘t’.

Weird!  FWIW, it also throws the same error if I check for the OBJECT_ID of #t1 instead.

Another dumb creation

Strange newfound respect for tables named #dumb.

Chaos Chaos

I’m sure there’s a fun explanation for this. Perhaps it’s something to do with deferred compilation, or the magic of temp tables in stored procedures.

This isn’t the most practically helpful post. It’s mostly so I remember to avoid doing this again in the future.

And who knows, maybe it’ll save you some time, too.

Thanks for reading!

Reference posts:
Linchi Shea
Sebastian Meine


Kickstarter-Style Ideas for our PASS Summit Pre-Con

SQL Server
34 Comments

Boy, have I got a weird challenge for you, dear reader.

When Erik & I started building Expert Performance Tuning for SQL Server 2016 & 2017, our PASS Summit pre-con, I talked with the PASS staff about the attendance record for pre-cons. I thought, “Wouldn’t it be cool if we had the biggest pre-con ever?”

Turns out the record attendance number is 269. Gulp. That’s a pretty big number to beat, especially when pre-con sales might be down this year given the Halloween date, and the fact that there’s a good lineup of pre-cons on Tuesday. So you can imagine my surprise when I asked for a sales update last week, and…

Sure, they look trustworthy

You’ve already bought 279 seats.

You beat the record, and we’re still over a month out from the event!

So that got us to thinking in the company chat room: what would it be like to run a Kickstarter-type set of stretch goals? What would you folks want if we hit 300, 350, or 400 attendees?

Keep in mind that they shouldn’t be bulky physical items like t-shirts or hoodies because we can’t easily drag 300 t-shirts into the conference center. But what about virtual goods, or small things that are easy to carry in?

We’re already giving every attendee an Enterprise/Everything Bundle for 1 year, but here’s some of the other things we’ve thought about as stretch goals:

  • Longer bundle access (say, 2 years or 3)
  • Bundle access for a friend, too
  • Special 1-time-only commemorative stickers

That’s where you come in, dear reader: what would you want? We’ll pick our favorites this week, and if you were the first one to suggest something we end up doing, we’ll give you a free Enterprise/Everything Bundle too.

Update – here’s the goals we came up with.


Why You Should Use Top 100, But Not More Than That

Execution Plans, Memory Grants
0

Earlier this week, I showed you two nearly identical queries and their estimated plans:

One of these things is much like the other

They look pretty darned identical – so how much worse can one query be? Here’s how much worse:

  • Top 100: 51.0 seconds duration, 141.6 seconds CPU time
  • Top 101: 160.0 seconds duration, 244.0 seconds CPU time

That’s right: adding just one row made the query take over three times longer. Let’s investigate the actual plans:

Note the yellow bang on the 101’s sort

The TOP 101’s sort spills to disk – and it’s tempting to call that the root cause, because we all know spilling to disk is going to be slower. Hover your mouse over the sort, and you can see the spill details:

Crying over spilled pages

But just one row couldn’t cause a spill to disk by itself, would it? Surely SQL Server estimated enough memory in order to handle just one extra row coming back.

Dig a little deeper, and look at the amount of memory desired for each query. You can see this (in both the estimated and actual plans by right-clicking on the SELECT operator in the plan, click Properties, and head over to the right side properties window under Memory Grant Info. Here’s a summary:

Memory grant comparison

Gulp. I’m running a SQL Server VM with max server memory at 26,000, and this one TOP 101 query all by itself grabs 5GB of my memory – and holds it for the entire 2-minute duration of the query!

But why are the memory grants so different?

As Geoff Patterson pointed out in Part 1’s comments, SQL Server uses one sorting method for TOP 1-100 rows, and a different one for 101+ rows! Back in 2010, Paul White blogged about this TOP 100/101 problem and how even different data types can influence the memory grants.

Query authors rarely know this magic tipping point, so I see folks writing SELECT TOP 200 or SELECT TOP 1000 without understanding the additional cost. (And it’s not that TOP 100 is faster 100% of the time, either – read Paul’s post for an example where it’s actually slower.)

Query tuners rarely know it either, and they can’t spot it just by glancing at the execution plan. There’s nothing in the plan that clues you in to a different sort method – you just get a plain ol’ sort operator, and that’s it.

Thanks to the SQL Server team’s efforts lately, this kind of problem is much easier to spot.

How to Spot The Problem Faster in 2012+

First, get on a current patch level for 2012/2014/2016. Microsoft added memory grant details in the 2012 SP3 DMVs.

Then, get on SSMS 2017, too. If you’re lucky enough to have the actual plans for each query, you can also see this with the compare-showplan feature:

Why your boss needs to buy you the widescreen monitor

If you’re not that lucky – and I rarely am – then you probably have to troubleshoot it live as it’s happening, but without running the queries. In this case, check out sp_BlitzFirst @ExpertMode = 1, and/or sp_BlitzWho – both of which show you running queries, what memory grants they’ve picked up, and which queries are still waiting for memory grants:

We also have a Cajun version called sp_BlitzWhoDat

Wanna learn more of this stuff?

Erik and I will be covering these DMVs and the ways we use ’em in our Summit 2017 pre-con, Expert Performance Tuning for SQL Server 2016 & 2017. There’s over 200 seats sold, but there’s still seats available – get yours before they sell out.


Optional Parameters and Missing Index Requests

That’s when it all gets blown away

At one point or another in everyone’s SQL-querying career, they end up writing a query that goes something like this:

These are often called optional parameters, and if you spend any time looking at queries, this will make you shudder for many reasons. Poor cardinality estimates, full scans, etc.

One thing that often gets overlooked is that queries constructed like this don’t register missing index requests.

As usual, using the Stack Overflow database for a demo.

Askance

With literal values, the optimizer goes into index matching mode, finds nothing helpful, and tells you all about it. The missing index request is pretty predictable, on DisplayName and Reputation. Makes sense so far, right?

What about with NULL variables?

Kiss your missing index request goodbye

You may blame the NULLs, but it’s not their fault.

Milli VaNULLi

What about with a RECOMPILE hint? Someone on the internet told me that if I use RECOMPILE I’ll get an optimal plan.

Like Crest on plaque

Someone on the internet was, well, not wrong, but not right either. By most standards, you’ll get an optimal plan. But still no missing index request.

Using a stored procedure doesn’t help with that, either (unless you recompile, but that may not be ideal for other reasons).

Swamp of Sadness

Big Deal?

If we add an index with the correct definition, all of those queries will use it as written.

That’s not exactly the problem. Nor is the missing index request being missing a direct affront to anyone who has been tuning queries for more than 30 seconds. It’s obvious what a good enough index would be for this query.

What could be very misleading is if you’re using the DMVs to round up missing index requests, you’re unfamiliar with the overall schema and current index design, or if the optional parameter searches are part of a larger query where the index usage patterns being sub optimal aren’t apparent.

The bottom line on this type of search is that it’s not SARGable. Like using functions and other icky-messies across joins and where clauses, it will prevent missing index requests from popping up. And while missing index requests aren’t perfect, they are a valuable workload analysis tool, especially to beginners.

Thanks for reading!

Brent says: This is such a good example of why you need at least 3 tuning passes for performance tuning: run sp_BlitzIndex looking for obvious index improvements, then run sp_BlitzCache to tune the queries that are still slow, then after tuning them, run sp_BlitzIndex one more time to catch the new missing index requests.


Indexing for Windowing Functions: WHERE vs. OVER

Life Is Messy

Demo queries have this nasty habit of being clean. Even using a pit of despair like Adventure Works or World Wide Importers, it’s easy to craft demo queries that fit the scenario you need to make yourself look like a genius. Stack Overflow, in all its simplicity, makes this even easier (lucky me!) because there’s nothing all that woogy or wonky to dance around.

While working with a client recently — yes, Brent lets me talk to paying customers — we found a rather tough situation. They were using Windowing functions over one group of columns to partition and order by, but the where clause was touching a totally different group of columns.

The query plan wasn’t happy.

Users weren’t happy.

I was still dizzy from being on a boat.

Optimal?

If you’ve been reading the blog for a while, you may remember this post from about two years ago. Over there, we talked about a POC index, a term popularized by Itzik Ben-Gan.

But how does that work when your query has other needs?

Let’s meet our query!

 

We have a Windowing function that partitions and orders by three columns, and a where clause that uses three other columns. If we stick a POC index on the Posts table that prioritizes performance of the Windowing function, what happens? I’m going to put the three where clause columns in the include list to avoid troubleshooting key lookups later.

Now when I run the query, here’s my plan with — you guessed it! A missing index request.

You’re a wang

The missing index request is for nearly the EXACT OPPOSITE INDEX we just added. Oh boy.

96.8%! I must be a bad DBA. I made a backwards index. I hope someone automates this soon.

Okay, so, let’s create an index close in spirit to our original index. Just, y’know, backwards.

When we re-run our query, what happens?

Astronaughty

Oddball

Let’s pause here for a minute. Stuff like this can seem witchcrafty when it’s glossed over in a blog post.

The index I created is awesome for the Windowing function, and the index that SQL registered as missing was awesome for the where clause.

When I have both indexes, SQL chooses the where-clause-awesome-index because it judges the query will be cheaper to deal with when it can easily seek and filter out rows from the key of the nonclustered index, and then pass only those rows along to the Windowing function.

Now, it can still do this with the Windowing-function-awesome-index, because the where clause columns are included, just not as efficiently as when they’re key columns.

The trade-off here is a Sort operation to partition and order by for the Windowing function, but SQL says that will still be far cheaper to sort a bunch of data

Time bomb

If you’re query tuning with a small amount of data, you’ll take a look at these query costs, stick with the where clause awesome index, and go get extra drunk for doing a wicked good job.

Here they are back to back.

Sortless
Sorta kinda

What happens when we include more data?

Going back a year further, to 2015, the costs are close to even. The Sortless plan costs about 159 query bucks, and the Sorted plan costs about 124 query bucks.

Going back to 2013, the Sortless plan now costs 181 query bucks, the Sorted plan costs 243 query bucks, and the Sort spills to disk.

Little Blue Spills

So what’s the point?

Missing index requests don’t always have your long term health in mind when they pop up. Some may; others may just be a shot and a beer to get your query past a hangover.

If I go back and run the ‘2013’ query with only the original index on there (the one that helps the Windowing function), there’s still a missing index request, but with a lower value (75% rather than 98%). Part of this is due to how costs are estimated and where SQL expects the sort to happen (disk vs memory).

In our case, the Sort was a bit of a time bomb. At first, it didn’t matter. As we included more data, it got worse. This is the kind of challenge that a lot of developers face as their app goes from a couple hundred clients to a couple thousand clients, and exactly the kind of thing our Critical Care helps with.

Thanks for reading!

Brent says: this isn’t just about missing index hints in query plans, either: it’s also a great example of why you have to be a little bit careful with the missing index DMV recommendations, too. sp_BlitzIndex would report this index as missing, and you won’t know which queries are asking for it (or whether they’ve gotten better or worse.) Every now and then, you’ll add a missing index and performance will actually get worse – so you’ve also gotta be looking at your top resource-intensive queries via sp_BlitzCache. In this example, after you’ve added Clippy’s index, the now-slower query would show up in sp_BlitzCache with no missing index hints, and you’d need to know how to hand-craft your own.


How Much Can One Row Change A Query Plan? Part 1

Execution Plans
13 Comments

Last week, Erik showed two queries that were aaaaalmost identical, with only one extra column – and the execution plans were dramatically different. Adding just one eensy column made all the difference in the world.

Now, check out these two queries – the first asks for top 100, and the second asks for top 101:

They produce estimated plans that seem identical (PasteThePlan), even down to the estimated costs – note that they’re both 50%:

One of these things is much like the other

But one of these has a couple of very, very bad performance issues, and if you look very closely in the plan’s XML, you’ll discover the gotchas.

For the details, read on in Part 2.


How Much Can One Column Change A Query Plan? Part 2

What happened in Part 1?

Join Elimination, naturally. Until the end. My copy of the Stack Overflow database doesn’t have a single foreign key in it, anywhere.

If we go down the rabbit hole a couple steps, we end up at a very quotable place, with Rob Farley.

2. Duplicated rows

Not necessarily duplicated completely, but certainly a row in our original table may appear multiple times in our result set, if it matches with multiple rows in the new table. Notice that this doesn’t occur if the join-fields (those used in the ON clause) in the new table are known to be unique (as is the case with a Foreign-Key lookup).

When we select a distinct list from one column, or create a unique index on one column, the optimizer knows that that one column is unique and won’t produce multiples of a value. I’m assured by mathematicians that even if you left join two distinct lists, it won’t produce duplicates.

With more than one column involved in a DISTINCT/GROUP BY, there may be duplicates of a single value, which would change our results. There’s a little more information about this over here as well.

How does that apply to us?

The results are going to be every DisplayName in the Users table, but the way our left joins are written to DISTINCT/GROUP BY the list of Ids that each produces, we know that each would only occur once.

That isn’t true in the last join, where we messed with columns. That join may produce multiples of some Ids with the multi-column distinct, which means the join can’t safely be eliminated. You could end up needing to show some DisplayNames more than once, in other words.

Similarly

If I re-create all my joins by dumping them into temp tables, we get a similar effect. A difference I want to point out is that I’m not joining other temp tables to each other, like in the first query.  That’s why the “big” plan only has two joins. The multi-column-duplicate DISTINCT changed things up the whole tree of joins. Funny, right? Hysterical.

Here’s what happens.

Expected?

If I go back and add in the joins, the plan changes again. The duplicate producing join has a domino effect on the other joins — now they can’t be safely eliminated.

Dupes come out at night

Want a simple example?

If you’d like something a bit easier to follow along with, use this example.

Thanks for reading!