Blog

[Video] Office Hours 2018/6/13 (With Transcriptions)

This week, Brent and Richie discuss using full-text index, Elasticsearch, server restart issues, sp_blitzfirst, Always On availability groups, issues with users connecting to SQL Server, learning the basics of Python on SQL Server 2017, VMWare Hyper-V configuration issue, and more.

Here’s the video on YouTube:

Office Hours Webcast - 2018/6/13

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Enjoy the Podcast?

Don’t miss an episode, subscribe via iTunes, Stitcher or RSS.
Leave us a review in iTunes

 

Should I use SQL Server’s full text indexing?

Brent Ozar: Ron asks, “Recommendations for using full-text index. My database is about 50GB. I’m going to want to index about five tables with about a million rows each. 30% of the database is images, which doesn’t need to be indexed.” So I guess 70% times 50GB – about 35GB. Richie, if you wanted to full-text index 35GB worth of data, what would you do?

Richie Rump: Elasticsearch.

Brent Ozar: And that – how do you go about implementing that or where would you recommend that someone go read for more information about how?

Richie Rump: I don’t know, I guess Pluralsight would be as good a place as any to start. You know, because at least then you could get up and running really, really quickly, at least on easy scenarios and at least to get something out there and cranking. The one thing you do have to worry about is security; if you have any data that need to be secure, I’m not quite sure how you would necessarily handle that if it needs to be encrypted or something like that. I think they’ve got some stuff in there, but I haven’t taken a look at it since I started working here, oddly enough. So it’s at least three years or something like that, but yeah, Elasticsearch, there’s so much more you could do with Elasticsearch than you can with full-text search in SQL Server.

I mean, just the querying scenarios alone are just – they’re amazing, they’re phenomenal. You could do querying from URLs. I mean, it is, right now, the gold standard if you want to do any sort of things like that. So, take a look in Elasticsearch. I guess Solr is another one, but Elasticsearch is just a step ahead of Solr at any given time, so yeah, check that out.

Brent Ozar: You might be wondering why we don’t recommend SQL Server’s full-text indexing. It’s fine if you need to get started and you can literally only have one box. Like, they’re like, sorry, you can only have one server; that’s all you can ever have. But you can run Elastic on VMs, just as long as you can implement cheap VMs. It’s open source, you can run it on Linux, it’s free.

Richie Rump: Yeah, and you know, just from the what you can do from the query scenario, I mean, if you want to query – there’s like keywords you can’t use inside full-text search. And I mean, you start scratching your head, like, why can’t I do that? That doesn’t make any sense. Well, you can do it from a URL perspective in Elasticsearch and it’s all there and you can get pretty complicated, you know, searching scenarios, digging into certain things. It’s a big rabbit hole once you start getting down that, but man, it’s a fun rabbit hole when you start figuring stuff out.

Brent Ozar: If you’re going to start something in the year 2018, you know, it makes more sense to go that way. One of the bumps we hit at Stack Overflow was you couldn’t search for a term like .NET 3.5; that combination of things, just punctuation and all that.

Richie Rump: Yeah, and I believe you do need to move the data over to Elasticsearch server. It’s not like it can plug into SQL Server and everything is grand. So you will have to worry about getting data out and those types of things. But man, the power you get compared to what something you get in SQL Server, which they haven’t touched in how long, Brent? Was it 2005?

Brent Ozar: No, they did a redo of it in 2010 or – I think it was 2010. In 2008 R2 was when I think they did the redo, but…

Richie Rump: But even then, it just pales in comparison to what we can do these days with Elasticsearch and Solr.

Brent Ozar: This is also why you see people getting so excited about NoSQL databases. I was just talking to a user group last night about that. they said there was all this buzz about NoSQL databases and what is it that they can do that we can’t do. And they just make design choices. For example, with Elasticsearch, there’s not really the concept of security and auditing, it’s just anybody who is on the network could make queries against that server. And for a lot of stuff, that’s fine. You know, for a lot of kind of full-text stuff that you’re doing, products in a catalog et cetera, it’s totally okay and is so insanely blazing fast because that’s what it’s designed to do. With SQL Server’s full-text search, they kind of stapled it into the database engine.

Richie Rump: Yeah, and then you inverse that, right. And it’s like, if you want to start storing data in Elasticsearch, that doesn’t make any sense whatsoever, you know. So when you start looking at the specialized tools to do specialized things, like searching or document storage and things like that, that’s what you use NoSQL for and it’s in addition to what you need to do with SQL Server, not instead of.

Brent Ozar: That’s also a great way of saying it because I believe that if you could only pick one database, like if you’re only allowed to run just one data persistence layer, you’re only allowed to use one database, I think you could do a lot worse than SQL Server. SQL Server does a lot of things really well. But as soon as you start going, my needs are high enough that I want to add in a different layer or a different data persistence area, then that’s where these other solutions are phenomenal.

Richie Rump: And that’s where you really need to get to understand what my requirements are, what my needs are, what are the needs of the business in the future and those types of things, you know, and really pie in the sky. Well what do we really need to do and if I get this one thing done and it’s really successful, what’s the next step? And if they tell you, well we want to do more of that, then it’s more like, well, SQL Server can’t do more. I mean, we’ve hit the limit at version one. So at that point, would it make sense to go further out and what not? So these are the types of questions that you need to start asking your business and the people who know the functionality because you probably won’t know right off the bat. You could just say, oh yeah SQL Server will get that done, but maybe in their head, two years down the road, they’re thinking something else so you will need that room to grow.

Brent Ozar: Yeah, and I don’t think full-text is an area that Microsoft is really investing heavily in, in SQL Server. I think they’re adding lots of functionality to different areas; I don’t think full-text search is a big area of focus for them.

Richie Rump: You know, it’s one of those things, you’re not really growing, right. I mean, you kind of understand the scope of everything and Elasticsearch kind of handles all of that and they’re not really adding a ton of new features into Elasticsearch either. I mean, it kind of is what it is, so…

Brent Ozar: Yeah, does what it says on the label.

 

My SQL Server restarted. Where should I look?

Brent Ozar: Pablo says, “Ola amigos, the last four days, my virtual SQL Server got restarted due to an energy outage. No Windows logs about it but extended events shows me a resource… Mem-physical high. Can something related to memory take down the server? Where can I check?” You know, that’s really a tough one. So mem-physical high, that doesn’t sound like you’re running into a problem. That sounds like, from what I remember with extended events, that you’re totally okay, like there’s plenty of memory available. So I wouldn’t dig deeper into that one. That wouldn’t be the place that I would go. I would jump back to the Windows layer because usually, it’s not something at the SQL Server layer that causes the entire box to restart, usually it’s something else.

 

sp_BlitzFirst has no “more details” info….

Brent Ozar: Kevin says, “When I run sp_BlitzFirst and I click on the details link, it opens in a new window and no XML is there. Why is this?” It would depend on what was inside the details. I’m guessing it’s one of these click-to-see details. You could even just hover your mouse over it to see what it is or click on it in a different SSMS, like try someone else’s SSMS. But it might also be to do with the line that you’re on, like there’s no more details about that particular error.

 

How can I test connectivity between replicas?

Brent Ozar: Teshale says, “I was trying to add a node to my cluster for Always On high availability, but I keep getting an error that says the cluster node is not reachable. I can ping it both ways and also I can ping the clustered name.” Ooh, that’s good. What I bet it is, is it – so pings, when they go through, they’re on certain ports. Connecting to each other for clustering or for things like Always On Availability Groups are different ports

Though. So what I like to do is Telnet. So Telnet is – Richie and I are old enough to know Telnet. You,Teshale, are probably way not as old and grey as Richie and I…

Richie Rump: Before the web, Brent.

Brent Ozar: When I couldn’t access Gopher, I checked with Telnet…

Richie Rump: Finger, who needs that?

Brent Ozar: Oh, the finger command, and you would set a message so when someone fingered you, it would return a specific message and people would put jokes in there.

Richie Rump: Giving you the finger command.

Brent Ozar: Can’t understand why that didn’t catch on. So what you do is you Telnet into the ports that you’re trying to get access to. So for example, with Always On Availability Groups, it’s port number 5022 by default, you may have set a different one. So you would Telnet from one box to the other into port 5022 to see if you got a response back. Now, that’s pretty old school. I’m sure there are probably other ways to check to see if all the ports are reachable, but I would start there with 5022. And if you can’t, odds are there’s some kind of firewall going between those two SQL Servers. I guess I’d be a bad person if I didn’t also mention, I would make sure to check to see that you have exclusions set up on your firewall.

Brent Ozar: Teshale follows up and says that the firewall is off on both. That’s good. So at least it’s not coming from inside Windows. But what I worry about is that it’s somewhere in between the two boxes; network switches, stateful packet inspection, it could be any number of things that’s dropping connection in between the two of those. So Telnetting on port 5022, if you get a response or not, that will be the next indicator. telnet’s kind of weird to use. Like sometimes, the thing that comes back looks like it’s a blank screen when you successfully connect. So for example, to test it, test it locally first. Remote desktop into one of the boxes, Telnet into the port 5022. Make sure you get a response. You’ll see what it looks like. And then try another non-functional port, you’ll see how that looks.

 

SQL Server authentication stops working, needs a restart

Brent Ozar: R.P. says, “My users are connecting to SQL Server from an app and they’re providing SQL Server authentication. Sometimes they’re unable to connect and after we restart the SQL Server services, they’re able to connect. We’ve checked, but we didn’t find any issues.” You know what’s funny is I’m tempted to recycle the Telnet thing. What’s the exact error message that you get? Like, the exact error message from the client side, what’s the exact error message that they’re getting. And the second part of this, how long does it take to get the message? Does the message come back instantly, the error message, or does it take 30 seconds before it comes back?

Because if it’s 30 seconds, you may not be able to connect to the SQL Server. If it comes back instantly, it’s connecting, but there’s some kind of problem with the authentication. If it’s taking 30 seconds then the next level that I would try is, actually, what someone else earlier said – when Teschal said – is try to ping the SQL Server from that app server. Are you able to ping it? And if you’re not able to ping it, it means they can’t even connect to the SQL Server for some reason; can’t even connect to the Windows instance that it’s on.

Richie Rump: And, R.D. are you in good relationships with the network guys because maybe they’re in there going he-he, click, ha-ha, click, ha-ha, click…

Brent Ozar: Watch this…

 

Where should I go to learn Python on SQL 2017?

Brent Ozar: Let’s see, next, Heather says, “Do y’all know of a good resource for learning the basics of Python on SQL Server 2017?” Richie, if I put a gun to your head and said go learn Python, where would you go?

Richie Rump: Okay, so I think there’s a difference here, right. So Python, I’d go to Pluralsight because that’s kind of where I could learn the fastest. They distill it down and they say here are the things that are important. If I wanted to get deep knowledge, I’d buy a book. If I just wanted to get up and running, it’d be Pluralsight or some sort of other video training type thing.

I think your question was how do I learn Python on SQL 2017. I don’t know, it’s one of those things where it’s so new and it’s not in the bread and butter of the SQL community. So you would be – if you said R, there would be a little bit more resources because there’s more data-savvy people understanding R, but Python is a little different. I would say Google around. I don’t know of anybody giving any classes or anything. There may be something at PASS this year at the summit; maybe one or two on that. But I wouldn’t say, hey I’m going to go to the PASS Summit and learn Python on SQL 2017; you’ll get a brochure level of knowledge and that’s about it.

I would say, you may just have to get in and jump into the middle of it and just start cranking on things and reading blog posts from engineering groups and things like that and see how things are actually working out because that’s one of those things where it’s so new and people really haven’t gotten into it yet and there’s just not a lot of content out there. Just watch, somebody wrote a book and published it last week.

Brent Ozar: No, I bet you’re right. So there’s a couple of sessions on Python – they don’t focus just on Python, so there’s one, T-SQL, R and Python, there’s another one, Data Science Tips and Tricks. But what I liked about where Richie was going with that too, go talk to who the speakers are. And you don’t even have to wait for PASS. So you can go contact those speakers now. Dejan Sarka does so much SQL Server work with new features and I wouldn’t even be surprised if he’s in the midst of writing a book on this because he’s the kind of guy – I think he’s cranked out like 20 SQL Server books. The guy is a machine. He’s also really easily reachable on the web, does a lot of training classes, so I would just ping him and say, hey, what resources are out there?

Carlos Bossy, also a really nice guy. I don’t think he’s written a book on this, but I would totally ping on him as well just to go see if you can track him down and see what they know about how easy it is to get started. I’d also say, what pains are you trying to solve? Pinal points out learnpython.org. I’m going to put in that to send it to everyone, learnpython.org. And let me throw it up on the browser as well. There’s a lot of good stuff out there on just how to learn python in general. The trick is [crosstalk] – yeah, the SQL Server part.

Richie Rump: Yeah, and that’s the one that got me, like… Actually, when you said – when you said machine, it reminded me of that comedian Burt Kushner, I think. He’s got a really funny thing about the machine. I’ll forward it in chat. It’s a little blue, people, okay.

Brent Ozar: That’s our chat room. Our chat room is pretty blue.

 

I have users that start with a backslash

Brent Ozar: Julie says, “I’m running select star from sys.users, order by name, I’m seeing users that start with a backslash. It’s causing errors when I’m trying to add the user. I can’t run sp_user or drop user. Any ideas on how to fix?” No, you got me there, other than obviously escaping and [them] with square brackets, you know, just to make sure that you got that. But I’m sure – I know Julie is way beyond that. We see her all the time in here. So that’s the only thing that I can think of off the top of my head. I also think it’s an awesome repro to try and do. If you can create the user and then have problems dropping it, this is one of those questions that goes over awesomely well on Stack Overflow or on dba.stackexchange.com.

Richie Rump: Have you tried hitting the little pi icon in the bottom left of the screen, and then maybe that will unlock it.

Brent Ozar: The pi icon?

Richie Rump: I’m surprised you didn’t get that. That’s The Net with Sandra Bullock. That was like a whole thing.

Brent Ozar: Oh, I forgot all about that. That was so good. She had her little Volkswagen Cabriolet…

Richie Rump: Of course, you would know the car that Sandra Bullock drove in The Net.

Brent Ozar: Yes, I can’t remember the name of her co-star, the evil guy, but I can remember that she drove a VW Cabriolet.

Richie Rump: All I remember as a technologist, even a young one back then, I’m like, this is crap. This is nothing. This is stupid.

Brent Ozar: But you know, I remember – because one of the climax parts is she goes running into a computer show – I think there was a computer show – because she wanted to access the internet. I’m like, wow, there were computer shows back then. It was like car shows today, but they would have the latest models…

Richie Rump: And they were huge.

Brent Ozar: Yeah, and you would go – computer swap meets too. I would go to this school back in Memphis and we would go around – it was almost like you were going through LP records at a flea market. You were going through these different motherboards.

Richie Rump: Oh, this [mem slot] looks good. Let me pick that one up and…

Brent Ozar: Yeah, different CPU fans, you know, trying to figure out exactly what was the best one. Yeah, that was incredible. Bob Ward – uh-oh – Bob Ward says, “Oh never mind.”

Richie Rump: Computer Shopper, you know, magazines that look like phone books.

Brent Ozar: I would get so excited about that. Every time my computer shopper came out at the bookstore, I would go get it and I would fold over all the pages of the best-looking cases, the cases that I really wanted to buy. Those were good old days.

Richie Rump: And now, we just throw away machines.

 

I had CPU cores disabled and I had no idea

Brent Ozar: Daryl says, “Today’s blog post is fascinating…” Alright, let’s pull up today’s blog post, since, of course, on the recording, people won’t know exactly what Daryl’s talking about there. Today’s blog post was about a VMware configuration – VMware or Hyper-V configuration issue. He says, “So you’re saying I can put Standard Edition on a 24-core box and Standard Edition will only use four and if I had the issue, SQL ConstantCare would flag it?” Yes. Now it’s not that way in most hypervisors by default. Most of them will intelligently configure CPU, but a lot of times over the years, we’ve either inherited a setup from someone else or we just went and created a new VM, we didn’t really know what we were doing and we did the wrong CPU configuration.

SQL ConstantCare isn’t the only way to flag it. You can also flag it with sp_Blitz. Both of those works in terms of quickly telling you if you have that issue or not. And it’s funny because it’s not like a number one priority issue. It’s down in the teens, I think, teens or 20s. But it’s just that you have cores sitting around idle. And worst case, I have seen people with memory sitting around idle and not able to use it.

Richie Rump: Yep.

Brent Ozar: It’s so fun to now get all the data with ConstantCare and be able to trend this stuff and see, oh my god, we caught like 20 people with cores offline, you know, and have this immediate turnaround difference.

Richie Rump: Yeah, and everyone’s so, like, oh I never thought to look at that. and it’s like, well why would you, you know? It’s one of those things, like, you don’t even think about it. Well, why would they do that?

Brent Ozar: Yeah, or if they did it, why wouldn’t it be a big red flashing message that, yo, dog, I can’t use the rest of the cores. And I’ve talked to the rest of the – I don’t know why I always go to yo, dog.

Richie Rump: Yo, dog, I found some cores in your cores that are not being cored.

Brent Ozar: I heard you like cores. Turn your cores off. And in fairness to the Microsoft people I talk to when we go through this issue, they’re like, well look, we throw it in the startup event log. It’s in there as a line, we’re using this many cores, but no one ever looks. No one looks at their startup log. They never review it and there’s so much spam in there. There’s so much garbage inside there that you don’t read. Alright, well that’s all the questions that we’ve got this week. Thanks, everybody, for hanging out with us and we will see y’all next week at Office Hours. Adios.

 

Wanna attend our next Office Hours and ask questions live?


Can Non-SARGable Predicates Ever Seek?

Development, Indexing
4 Comments

Cheating At Candy Crush

The short answer is that yes, they can. But only with a little extra preparation.

Before I show you what I mean, we should probably define what’s not SARGable in general.

  • Wrapping columns in functions: ISNULL, COALESCE, LEFT, RIGHT, YEAR, etc.
  • Evaluating predicates against things indexes don’t track: DATEDIFF(YEAR, a_col, b_col), a_col +b_col, etc.
  • Optional predicates: a_col = @a_variable or @a_variable IS NULL
  • Applying some expression to a column: a_col * 1000 < some_value

Applying predicates like this show that you don’t predi-care.

They will result in the “bad” kind of index scans that read the entire index, often poor cardinality estimates, and a bunch of other stuff — sometimes a filter operator if the predicate can’t be pushed down to the index access level of the plan.

Most Common

Is the NULL replacement issue, I think.

If I had an index and query like this, life would be grand. I’d be able to perform two seeks into the index and only read ranges of rows that qualify.

Seekaroni

Life gets less grand if I change the query to this. We have to scan the entire index, replace null values with 0, and apply a predicate.

I’m sleepy.

This isn’t disastrous here, but I’d probably want to fix it.

Unless I couldn’t. This may be turd party code that I can’t alter.

The Workaround

We can cheat a little bit, by adding a computed column and indexing it.

Which means our original query now gets a Seek plan!

Iffy

If you compare the number of rows read between the Seek and Scan, the Seek does get just the rows it needs.

A pillar in the community

Surprises

I was a bit surprised that this worked out well. It’s a bit like function based indexes in other RDMBS platforms. It’s also nice that the expression matching portion of the optimizer was able to pick up on it easily.

Thanks for reading!

Brent says: indexed (not just persisted) computed columns are an awesome trick to have in your performance tuning bag. If you have trouble getting SQL Server to use ’em, read Paul White’s post about trace flag 176.


Why You Should Stop Depending On SQL Server Versions In Code

Development
3 Comments

It Used To Be

That when you wrote scripts that had to work across different versions of SQL Server, you were pretty safe.

Microsoft would release a new version, or a Service Pack that had something new in it, and it wouldn’t start showing up in older versions.

That’s not so true anymore. Microsoft has been doing some awesome work to add and back port new features. It’s doubly true when you’re dealing with cloud products (I’m looking at you, Azure SQL DB) where who knows what’s going to be or not be there, what version numbers will be, or anything like that. It’s the Wild West.

Which means if you write the kind of scripts that might throw errors if they go looking for a DMV or column that might not be there, you constantly have to figure out which major and minor versions you can go look for stuff in, often using dynamic SQL.

Nowadays

You still have to use dynamic SQL. That much is obvious. But it doesn’t make sense to do stuff like this anymore:

Why? Because all your code is going to do something like this:

This is gonna be a real pain in the butt to keep up with. Stuff can get released in Cumulative Updates that’s brand spanking new.

Here’s an example from 2017 CU3. That got backported to 2016 SP2.

Awesome! Unless your code looks like… that. Up there.

What’s Better?

It’s a whole lot more reliable (and a whole lot less work) to explicitly check for new things in DMVs, and use that to inform your code what to look for.

For example, this just makes sure all four columns are in dm_exec_query_stats:

Then your dynamic code becomes a bit simpler, too:

Now it doesn’t matter if Microsoft adds, removes, changes names, or definitions down the line. That stuff could normally cause problems or create a lot of busy work keeping dynamic SQL up to date. Now it doesn’t.

Hooray

Now, there’s still a whole lot of version checking in our code in the First Responder Kit. Cleaning it up isn’t a priority, but not writing more code like that is.

And this won’t work for everything, unfortunately. For example, if there’s a bug in certain versions and you need to skip those, you still need to rely on version numbers. There are likely other examples where it won’t work out there, too.

Thanks for reading!

Brent says – another thing I love about this approach is that when Microsoft backports a feature to an older SQL Server version by way of cumulative update, you don’t have to go revise your scripts right away. If the field is there, you’ll use it, whenever it happens to show up.


Nodes Offline: Why Your SQL Server VM Can’t Use All Its Virtual Hardware

Virtualization
31 Comments

When you install SQL Server Standard or Web Edition in a virtual machine with more than 4 cores, things can look like they’re just fine – when in reality, you’re not using anywhere near your server’s potential. See, these editions have a tricky limitation – let’s look at the documentation:

Standard and Web Edition Limitations

Here’s the catch: the lesser of 4 sockets or 24/16 cores. Not the greater – the lesserSo if you configure a virtual machine with, say, 8 virtual processors, you have to be really sure that you don’t accidentally end up with 8 sockets. Here’s a screenshot of Task Manager in a poorly configured VM:

8 sockets, each with 1 core

At the right, see how it says “Sockets: 8”? That means we set the VM up with 8 CPUs, each of which has a single core. Standard Edition will only use the first 4 cores here, and that’s it. No matter how much load our SQL Server gets, it’ll only use 4 cores – meaning CPU will only hit 50% busy. Our wait stats will show SOS_SCHEDULER_YIELD out the wazoo, but our sysadmins will say, “There’s plenty of available CPU power – you’re just not using it.”

Here’s an example of a reader who ran into the problem after reading this post – note how their first four cores are getting hammered, and the rest are sitting idle:

This is notoriously tricky because if you open SQL Server Management Studio, right-click on the server, and click Properties, it looks like everything’s okay. You see all 8 cores – you just can’t use ’em because they’re offline, as shown in sys.dm_os_schedulers:

There’s another clue buried in the SQL Server startup log:

SQL Server detected 8 sockets with 1 cores per socket and 1 logical processors per socket, 8 total logical processors; using 4 logical processors based on SQL Server licensing. This is an informational message; no user action is required.

Just informational. No big deal. Carry on.

How to tell if you’re having this problem – and fix it

Just run sp_Blitz, and look for the warning of “CPU Cores Offline” or “Memory Nodes Offline.” This is one of those things I found once in the field, figured I’d better check for it, and now I find it all over the place. I totally understand why – from the SQL Server side, nothing looks out of the ordinary. (SQL ConstantCare® finds this problem on dozens of servers per month, for free!)

The solution is easy:

  • Shut down the VM
  • Change the number of cores per socket – in my case, I could set it to a single-socket, 8-cores-per-socket VM, or a 2-socket quad-core, or a 4-socket dual-core
  • Turn the VM back on

And enjoy your newfound power. Be aware, of course, that your CPU percent-busy may suddenly get higher (worse) – but that’s a good thing, because it means you’re using all that horsepower you paid for.

Another cause: running Standard Edition on big servers

SQL Server Standard Edition is limited as to how many cores it can access: 2014 & prior cap out at 16 cores (32 with hyperthreading), and 2016 & newer cap out at 24 cores (48 with hyperthreading.) If you try to run one of these on, say, a 64-core server, you’ll encounter this error.

You could use all of your cores by upgrading to Enterprise Edition, but…really, who wants to pay $7,000 per core to do that? Instead, use affinity masking to configure alternate cores as active so that you can balance the workloads more effectively across the underlying physical CPUs.

For example, say you’ve got 2014 Standard (which maxes out at 16 cores, 32 hyperthreading), and you’re running it on a 2-socket, 16-core-per-CPU server with hyperthreading enabled (for 64 total cores.) You would configure SQL Server to use every other core so that it uses 32 of the 64 cores, but just every other core. That way, both of your 16-core CPUs will get activity, and it’ll balance out the heat & memory better between the processors (instead of only lighting up the threads on the first 16-core processor.)


New Stack Overflow Public Database Available (2018-06)

Stack Overflow
5 Comments

@Taryn and the kind folks at Stack Overflow publish their data export periodically with your questions, answers, comments, user info, and more. It’s available as an XML data dump, which I then take and import into SQL Server for teaching performance tuning.

You can download the 38GB torrent (magnet), which gives you a series of 7Zip files that you can extract to produce a ~304GB SQL Server 2008 database. You can then attach it to any 2008-2017 SQL Server. (If you want a smaller 10GB version, check out the mini 10GB version circa 2010.

Stack Overflow
The place that saves your job

The data goes up to 2018-06-03 and includes:

  • Badges – 27M rows, 1GB data.
  • Comments – 66M rows, 22GB
  • PostHistory -106M rows, 174GB (new)
  • Posts – 41M rows, 106GB
  • Users – 9M rows, 1GB.
  • Votes – 151M rows, 5GB.
  • And a few smaller supporting tables: PostHistoryTypes, PostLinks, PostTypes, and Tags.

As always, it’s open source, licensed under Creative Commons Attribution-Share Alike 3.0, as is the source data dump.

The new PostHistory table makes this one way bigger.

The Posts table holds questions and answers. The PostHistory table tracks events that have happened to a post over time, like edits. One of my favorite features of the Stack sites is the ability for people to edit each others’ questions and answers. You can jump right in and improve someone’s question by adding more details, or edit their answers, too.

To join them together:

Results for a very active question:

PostHistory

I haven’t included this table in the past because it’s so large, and I wanted people to be able to do Stack query demos on smaller machines. However, we’ve solved that problem by distributing a separate 10GB StackOverflow2010 database (1GB 7zip) showing data from 2008-2010.

To learn more about using the database:

  • BrentOzar.com/go/querystack – my page about the SQL Server export with more info about how I produce the database.
  • Data.StackExchange.com – a web-based SSMS where you can run your own queries against a recently restored copy of the Stack databases, or run other folks’ queries.
  • Watch Brent Tune Queries – free sessions where I take different queries from Data.StackExchange.com and tune it live.
  • How to Think Like the Engine – free videos where I show the Users table to explain clustered indexes, nonclustered indexes, statistics, sargability, and more.

Indexing Strategies For Cross Apply

Slicked Brad

The apply operator remains one of my favorite tools, but much like CTEs and windowing functions, they’re not totally magical unless you look at your indexes.

Starting with a dead simple query on the Users table, it’d be easy to come up with an ideal index for this one thing.

If you really cared the most about this, you’d probably create an index like this for it.

It’s got everything, really. It’s even ordered correctly.

Lambs To The Sausage

Now what if we complicate things a bit? What if we want to find items in the Posts table for all of those very active users?

Say we wanted to find their top 10 questions by Score. We could add this Cross Apply to do just that.

But it would be slow. Real slow. This thing does a lot of work.

There are some interesting things here!

  1. There’s no missing index request here, yet…
  2. The optimizer decided to create an index for us behind the scenes
  3. The index it created still required sorting data
It took a lifetime.

Sorted Affair

You may want to create an index like this. It has a lot of potential!

We’ll have the data sorted for our order by! Then it’s just a matter of grabbing the right users.

It’s filtered and everything.

What could go wrong?

Well, sit down for a spell, and lemme tell ya…

This thing flat out lies to us.

See where it says the Posts table was scanned once?

Malarkey!

There’s about 549 scans that we’re not being told about.

That top executed 550 times, and scanned the index for 10 rows.

Cool, huh?

Back To The Drawing Board

Our fear of needing to sort data may have gotten the better of us on that last one.

If we think back to an earlier post about how equality searches can support sorts with non-leading columns, we might try this index.

This finishes… yep. And hey, look, it’s back to being honest about the number of times we hit the Posts table.

The query plan is a less offensive version of the last one, still with no sort.

But this time sporting a much more efficient index seek.

Weary

Why Does This Work?

For the Nested Loops Join, we pass in a unique list of Ids from the Users table, and for each of them, the Top operator performs a single seek.

Since they’re processed iteratively, when we match to an OwnerUserId in Posts, Score is already ordered for us.

This takes care of almost all the work that we were doing in previous plans.

Thanks for reading!


Building SQL ConstantCare®: Why People Aren’t Using Query Store

When Microsoft first talked about Query Store, I was wildly excited. It’s basically a black box flight data recorder for SQL Server, tracking which queries run over time, how much work they do, and how their execution plans change.

And then it came out, and…hardly anybody used it.

We thought it was a good idea, so we took a few measures to encourage adoption:

  • I added a recommendation in sp_Blitz to turn it on
  • Erik wrote sp_BlitzQueryStore to make Query Store data easier to understand and use
  • SQL ConstantCare encouraged folks to turn it on where it made sense (where they had Cumulative Updates applied to fix QS issues)

And even amongst the population of SQL ConstantCare users, where we actively pester you to turn it on, adoption is still low. Amongst eligible databases, for customers willing to share their data for collective analysis:

  • 1,434 databases have it turned on (18%)
  • 6,635 databases don’t (82%)

So the question became – why?

We asked, and here’s what customers told us.

We emailed folks who hadn’t turned Query Store on for any of their databases, and here are some of the answers:

I just haven’t had the time to understand what it does, and best practice for implementation, any considerations for this specific server, etc.  I fear things I don’t completely understand.

Companies want paranoid people in charge of the databases, so this makes perfect sense. You don’t wanna go flipping switches unless you’re confident they’re going to make things better, not worse.

With each of these answers, I wanted to think about what would change that answer. Here, Microsoft needs to figure out the best practices for implementation based on their experience hosting Azure SQL DB across a lot of customer workloads, and then build those into the SQL Server product as the default settings for Query Store. (Right now, you can kinda feel during QS setup that the defaults are a shot in the dark.)

Basically, just too busy. I’m excited about it, but haven’t had a chance to really look into it too much, and I thought I saw something early on about a possible performance hit or memory usage with it turned on.

Several answers came in with that theme. There have been a few Query Store sessions at GroupBy that mentioned the overhead, and it’s measurable. Folks need more confidence in the expected overhead, plus guidance on whether their own system can safely handle that additional overhead. (This is where our work on telling you if your server is bored comes in handy.)

We have a monitoring tool (SQL Sentry/SentryOne) which allows us to see some of the plans, the database/application is not a critical one (not a high number of users/load), and we have not had a chance to really look into what’s needed from a setup and maintenance perspective.  When we migrate our critical application DBs to newer versions, we will most likely look at turning it on for them.

Makes total sense. Query performance tracking isn’t a new need – it’s been around forever – and a lot of folks are using tools that already relieve that pain point.

To change this answer, Query Store would need to be good enough at the “free” price point to beat SentryOne at its paid price point, and that definitely isn’t the case today.

We’re using SQL Server to host our ERP software from a third party vendor. I wasn’t sure what kind of benefit Query Store would give us.

I never thought about this before, but Query Store could be an excellent black box for ISVs to use with their customers. However, I don’t see that happening anytime soon – most ISVs don’t have T-SQL tuning staff on hand to leverage the plan cache, let alone build new tooling to analyze Query Store.

How that influences our development

We’re starting to work on query analysis in SQL ConstantCare®: telling you which queries are causing your performance issues, and how to go about fixing those queries. There are a few different ways we could collect the data:

  • Daily sample of the plan cache – which is really easy to do with low overhead, but it has huge blind spots, especially around servers under heavy memory pressure or building dynamic strings
  • Frequent samples of more sources – get running queries and the plan cache on a regular basis, like every X minutes – which gets much harder to do, but has less blind spots
  • Get it from Query Store – have customers turn on QS, leverage the code Microsoft has already built – which is easy, but only works on 2016+, and even then, people aren’t using it. Could overcome some of that with the open source OpenQueryStore, which works on 2008+, but there are costs and risks involved with building our products on top of that.

But these numbers make the choice clear:

  • 72% of customer databases can’t use Query Store (2014 & prior)
  • 23% are eligible, but would take training/convincing/testing to use it
  • 5% have Query Store enabled

As a tiny company, we gotta pick our battles. Right now, it makes more sense to focus on traditional plan cache analysis – something that will pay off instantly for every single customer we have – rather than try to roll the Query Store evangelism boulder uphill. (I tell you what, though: if I was a DBA again tomorrow, I’d have Query Store on for my databases.)


[Video] Office Hours 2018/6/6 (With Transcriptions)

This week, Brent, Erik, Tara, and Richie discuss index maintenance, stats updates, the acquisition of GitHub, using linked servers to move tables across, talking to developers as a DBA, the best cloud option, setting the number of tempdb files, and their favorite SQL Server features.

Here’s the video on YouTube:

Office Hours Webcast - 2018/6/6

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Enjoy the Podcast?

Don’t miss an episode, subscribe via iTunes, Stitcher or RSS.
Leave us a review in iTunes

Office Hours – 6-6-18

 

Should I run update stats before or after index maintenance?

Brent Ozar: David asks, “Is it better to do an update stats before I do index maintenance or after and why?”

Erik Darling: It’s a trick question…

Tara Kizer: I know, for real.

Brent Ozar: Do you still beat your father? No, no, wait, yes, no…

Erik Darling: I think so – at what? Chess? So, David, around these parts we’re not too keen on doing index maintenance like your indexes owe you money. It’s not really the right way to do things. All the time and effort that you expend figuring out if an index is fragmented and then, you know, doing actual work on it is kind of a lot of wasted effort from our point of view and it doesn’t really fix the kind of performance problems you want to fix.

Updating stats is a really good thing to do. I’m a fan of that. I enjoy the stats update because it does the two best parts of an index rebuild without all the crazy work involved. It update statistics on your indexes and it invalidates old or potentially bad plans in the cache. So a really smart guy named Michael Swart once commented on one of our blog posts that index rebuilds are the most expensive statistics update in the world.

So I would tend to run with that advice a little bit and say that you should not just blindly keep running index maintenance but keep on updating those stats.

Tara Kizer: And then, to answer the question though, if you are doing index maintenance, do your update stats after and make sure it’s intelligent enough to not bother updating stats on an index that just got rebuilt because you already got update statistics if you had an index rebuild. So I normally recommend to my clients, update stats daily and index maintenance much less frequently. Some of them are really – they don’t want to make big changes, so I’ll say, okay, weekly then for your index maintenance, but I really want to tell them monthly, quarterly, you know, every once in a while maybe. It’s update stats that matters.

 

What are your thoughts on Microsoft’s acquisition of GitHub?

Brent Ozar: Let’s see, Michael asks, “Hi team Ozar, what are your thoughts on Microsoft’s acquisition of GitHub?”

Tara Kizer: Good question for Richie…

Brent Ozar: yeah, Richie, what do you feel about it?

Richie Rump: I’d take the money – I’d take much less…

Brent Ozar: $7.5 billion…

Richie Rump: Sounds like a good deal for GitHub. Microsoft is different than it was 10 years ago. It’s different than it was in the Ballmer era, it’s different than it was in the Gates era. They’re much more open to open source. If you didn’t know, Microsoft is the number one contributor to Linux. I mean, who’d figure that? So I think it’s a good deal for GitHub, I think it’s a good deal for Microsoft. From what I’ve read, they say they’re not going to really do much to anything to GitHub, but if you take a look at any of their dev presentations for the last three years, they’re all using GitHub anyway, unless they’re specifically showing something from the guys out in Charlotte, South Carolina. That’s where the Visual Studio Team Services team is.

Unless they’re showing something like that, it’s all using GitHub stuff. And all the Microsoft stuff, even Microsoft Docs is now over on GitHub, so it just made sense that they have all this money, they might as well put it into a system that they use, they love and hopefully they’ll continue to have it grow and more focus on the Hub part as opposed to the Git. So I like it.

Brent Ozar: And we use it ourselves. We won’t change off of it. If anything, I get more excited about it because we’re Microsoft people. We do all kinds of stuff on a Microsoft stack, so anything…

Tara Kizer: What’s this, “We people?” I mean, we don’t all use GitHub…

Richie Rump: Yeah, hey Microsoft person, what computer do you use again, I forget?

Brent Ozar: Ah yes, an Apple, yes. Somebody was like, they were telling me, are you going to leave all your stuff off of GitHub because you’re one of those open source people? I’m like, I use an Apple, I make a living with SQL Server, I really don’t care whether my stuff is open source or not. I mean, we give out open source stuff, but the things that we build aren’t open source. Our SQL ConstantCare is closed source.

We had a discussion about that when we first started too. Like, PasteThePlan, should it be open source or not? And then SQL ConstantCare, should it be open source or not?

Richie Rump: I mean, I think we’re a perfect example of how Microsoft has changed, you know. Back 10 or 15 years ago, I was 100% Microsoft stuff. I didn’t use anything else other than that and now I’m using AWS, I’m using other database tools, I’m using NoSQL tools. I’m still using .NET; I’m still using all the Microsoft IDEs, I use Visual Studio code, but there’s all this other stuff that has come into my development environment and they’re great. Microsoft stuff is still great. We still use SQL Server, obviously; we’re a SQL Server shop. But there’s other stuff that’s great too and I think that Microsoft has just kind of embraced that as well. Say hey, we make good stuff, other people make good stuff. We could leverage this stuff in our product and purchase other stuff like GitHub and hopefully not make it suck. Hopefully, it’s better than their purchase of Skype.

Brent Ozar: Skype – one thing everybody immediately goes to. And LinkedIn seems to have set a good precedence. I don’t use LinkedIn – I use it as one of those things where I go, okay I’ll accept a bunch of connections. My general filter on LinkedIn is do you have a picture? Because if you don’t care enough to have a picture then I’m not even going to bother connecting. And it’s not like I look at what you look like, I just see, is there a picture or not? And then second, does it say recruiter? Because I’m not touching a recruiter. But as long as they have a picture and they’re not a recruiter, I’m like, yeah, sure, whatever, car dealer, sure, whatever, professional photographer, whatever.

 

Can I use a linked server to move data between two instances in different domains?

Brent Ozar: Teschal asks, “Is it possible to create a linked server between two SQL Server instances in different domains? I want to move tables from one instance to another, but they’re in different domains.”

Tara Kizer: Oh boy, why are you using linked servers to do this job? This is not a job for a linked server. Use SSIS. Use the import-export wizard. I mean, use other features. Yes, you can do it via linked server, but it is not a great feature as far as moving data around between two servers. And it’s really easy to set up a linked server and use SQL authentication, but there’s security concerns there/ it’s just not the right tool. Use the right tool for the job.

Erik Darling: Tara, I notice that you didn’t say replication…

Tara Kizer: I’m assuming this is like a onetime move, you know. I mean, if it’s moving data all the time and if you can’t use availability groups then I’ll say, okay, replication.

Richie Rump: SSIS, I mean, this sounds like an SSIS problem. If you need to move a planet, you use Superman, but if you need to go and kill a crime-lord, you use Batman. So this seems like an SSIS problem.

Tara Kizer: I like it.

Brent Ozar: That’s pretty good.

 

How can I approach the developers to say SQL Server is tuned and that the application needs to be looked at?

Brent Ozar: Corey says, “Sites like Stack Overflow scale by using tools like Elasticsearch and Redis to offload requests from their expensive database server. I work in a company that has SQL Server, but we don’t have any of those tools to offload requests. Developers complain about slow performance. Assuming I’m the greatest DBA…” I like that. that’s a good one. “And my server is tuned to the max, what’s the conversation I can have with developers to say I’ve tuned everything I could and maybe the application needs to take a different approach?”

Tara Kizer: I’d be looking at BlitzCache output and look at the XPM sort order, the transactions per minute and seeing what’s in there. I bet you, you’re running queries a lot that they don’t need to be running a lot. You could be caching the data.

Brent Ozar: I’ll fire it open to show you. So if we open up – sp_BlitzCache is a tool that originally was written by Jeremiah Peschka and is these days managed by our good friend here, Erik Darling, because it has XQuery in it, and that’s like kryptonite to the rest of us. We’re like, yeah, no, sorry, unsubscribe. BlitzCache sort order equals – XPM stands for executions per minute. This will show you the queries that are running the most often on your SQL Server. And if course, mine there’s going to be nothing in here because I just started it up and IntelliSense started running. But if you scroll across just a little, there’ll be an executions per minute column. Might as well throw in a select from users, have it run a few times, select count star from DBO users. Actually, you know what, I’ll do if I want high XPM, select top one star from DBO users, go 1000, 10,000 whatever. And then as the thing starts to run more often, it will float up to the top of sp_BlitzCache. Of course, it’s not going to happen yet because we haven’t had a minute pass by, but you can also do, in cases like this, check out sp_BlitzCache sort order equals executions.

Tara Kizer: That’s one I don’t use ever with my clients. I see it in our Excel file, but I just don’t use that one.

Brent Ozar: yeah, so here he goes, there’s our select top one star from users and then it shows you the query has run X number of times. Yeah, that’s good. What else would we tell him if he wants to tell his developers to look at the application code?

Richie Rump: You need to point out the queries that are going bad. The developers don’t know and if we go up against our development or test, everything is fast and then when it goes to production, everything is slow, so you need to be the one going into the queries and seeing, hey, do I need an index here, do I need this there? What if this query needs to be changed because we’re using recursive whatever and blah, blah, blah – that needs to be you, dawg. You’re the best DBA in the world and now you need to prove it. So yeah, you need to go into it, you need to take a look at all the queries and you need to say this query is slow, this query is slow, even if we shave a half second off this, because it’s running so frequently then we’ll be saving X amount of time in and the system would be that much faster. So yeah, you need to dig into that, bro.

Erik Darling: The other thing is that, you know, when developers complain that something is slow, you need to figure out A, what they feel slow is. Like, you need to set some basic guidelines, like okay, what’s slow, like 100 milliseconds, half a second, one second, two seconds, and then show me what’s slow to you. Like, physically run something in the application that you feel doesn’t happen fast enough, then let’s try to repro that on the SQL Server itself. Because if you run that query on the SQL Server and it’s not slow then it’s slow in the application and fast in SSMS and you might have a totally different set of problems.

Brent Ozar: Oh, then you know what – there’s one thing I should say too about the last question. And god bless, Corey, you probably are the best DBA in the world, but for the folks who are reading this later or watching the Youtube video, whatever, be really careful about pointing the finger somewhere. Very often, when I was a database administrator, I’m like, it’s your application; your application sucks. And they’re like, well we just hired in an outsider and they found out that it turns out we have no indexes in our tables. What, who, wait, what? You know, or I’d totally hosed all the storage.

Richie Rump: Brent bad guy Ozar.

Brent Ozar: Brent idiot Ozar too.

 

Should I add multiple tempdb data files and use trace flag 1118 even without tempdb contention?

Brent Ozar: Michael says, “If you have tempdb contention that isn’t related to PFS GAM or SGAM pages, is it still wise to increase the number of files and set the…” So basically, what he’s asking is, should I set the number of tempdb files that I have and use that trace flag 1118?

Tara Kizer: I like to go, you know, as soon as I install SQL Server I go to best practices and that’s just how I configure it. And Microsoft does recommend trace flags 1117 and 1118, so I’m just going to apply those and increase the number of files up to eight, depending on the number of cores. So right off the bat, I’m just going to best practices and then I might change things later on depending upon what I see.

Erik Darling: yeah, I mean, those things became the default in SQL Server 2016 installer, like trading one tempdb data file per core up to eight cores and trace flags 1117, 1118 and a whole host of other ones just became default behavior because they were such no-brainers to turn on on modern SQL Servers with, you know, any kind of workload on them. There’s really no reason not to have them. I would be interested to see you turn those on, come back next week and tell us if you’re still having the same problems. Because if you’re still having the same problems after that then this is a completely different discussion; like what the hell are you doing in tempdb, my friend. Like, 40 concurrent sessions with four billion row temp tables getting created and then – I don’t think all the trace flags in the world are going to help.

Brent Ozar: No files…

 

Why does Brent always have to do such awful things to databases?

Brent Ozar: Next up, Joshua says, “Why does Brent always have to do such awful things to databases, a la today’s post.” So I will show you guys today’s post because it’s awesome…

Richie Rump: So you don’t have to…

Brent Ozar: Yes, so you learn them and you can take them to work. I loved this particular thing. I wanted to write a select where everything was keywords. So select select, from from, where where, like like, and that’s valid and that compiles. That’s kind of cool. You can put spaces in a query. Like, you can break this up, you know, query across spaces. You can break them across lines, but this one is the one that I really find delightful. This is so gorgeous. It’s only a picture on the WordPress page. I’m going to go get it from GitHub because it’s really nice when you copy-paste it into SSMS and it looks like it shouldn’t even work.

I’m going to go paste it in here – oh, this tab is hosed because of this go 10,000 times thing. I didn’t think it was going to take me seriously. I thought it would error out when I said go 10,000 because I thought there was a max. SSMS is locked up tighter than…

Richie Rump: Fort Knox…

Brent Ozar: Fort Knox is good. Golly, which means we’re probably going to lose PowerPoint here in a while too as well, but it is what it is.

 

Which cloud provider do you recommend for a personal lab environment?

Brent Ozar: Next we have Pablo. Pablo says, “I want to buy myself some cloud for my SQL Server lab environments. What would you recommend?

Richie Rump: A credit card with no limit.

Brent Ozar: Richie, for the record, ours has a limit.

Erik Darling: You just can’t see the limit from here. It’s like a supposed limit.

Richie Rump: I don’t know, without even trying, I’m trying, right, so there we go.

Erik Darling: Brent’s cards have a theoretical limit. You just can’t see them with normal physics, but they’re out there.

Brent Ozar: I’ve always wanted, at some point in my life, to buy a car and just put the whole thing on a credit card, say watch this and see what happens, you know. Can I try that one over there?

Erik Darling: Can I return this one, get that…

Brent Ozar: Yes, if I don’t sign it with my regular signature, can I dispute that charge? Now what – when you guys work with cloud stuff – because we’ve had the opportunity to work with Azure, we’ve worked with Google, we’ve worked with Amazon, we’ve jumped around back and forth – if you were going to tell somebody to go get started, what would you tell them? And there’s no right or wrong answer.

Erik Darling: I would say to set up an account with each of them and continue to use whichever one you find you have the easiest time dealing with because that’s going to be your main – like, if you can’t set up a server or get into your lab or make changes easily, you are going to bail on ever doing anything with your cloud environment from there on in. so whichever one you have the easiest time interacting with, setting stuff up with, that’s the one that I would go with. Whichever one you feel the best about…

Brent Ozar: Which is kind of, I think, how we feel about monitoring tools too. You should get demos and evals and you’re going to bond with one of them and you should use the one you bond with.

Tara Kizer: I bond with sp_whoisactive.

Brent Ozar: I do too. We all love sp_whoisactive. And the other thing I would say too is if you use one at work you should probably consider using that just so you get good at it.

Richie Rump: Yeah, love the one you’re with.

 

Should I use more than 8 tempdb data files for an instance that has 40 cores?

Brent Ozar: Augusto says, “Is it a correct approach to implement eight tempdb files in an instance that has 40 cores?”

Tara Kizer: I think he’s saying should I go above eight, so it’s plus eight. So the recommendation is to stop at eight and then if you have tempdb contention and you’ve proved that adding more files resolves it, then go higher, but otherwise stop at eight. At least that’s what I’ve read.

Brent Ozar: Yeah, a long time ago in a galaxy far, far away they said one file per core, and that’s not a good idea today.

Tara Kizer: It was the correct thing back in the day before we had more than eight cores. [crosstalk]

Erik Darling: That was an issue. Four cores, 3GB of RAM…

Tara Kizer: I remember when my database was reaching 1GB it was having some performance issues.

Erik Darling: Thomas [Grosser] tells a really funny story about how when he first started with SQL Server on, like, 4. Something…

Tara Kizer: 4.2.1, get it right…

Erik Darling: I’m not going to remember those. He used to have to go into the office 45 minutes early to hit the power button so when people started showing up for work, the SQL Server would be up and running and ready to go. Now, 25 something years later, whenever he reboots a SQL Server, it takes just as long because there’s 4TB of memory in it, it’s got to check all this stuff, go through things. So that was like 45 minutes all over again just to get the hardware up and running.

Richie Rump: There goes five nines.

Brent Ozar: Yeah, wow. Well, that’s all the questions we’ve got for today. Thanks, everybody, for hanging out with us and we will see y’all at the next Office Hours. Adios y’all.

Erik Darling: Bye.


Announcing a Very Special #SQLFamily Webcast with Dr. David DeWitt

Conferences and Classes
2 Comments

On Thursday, June 28th, at noon Eastern (see in your local time zone), join us for a very special presentation:

SQL Query Optimization. Why is it so hard to get right?

Dr. David DeWitt

The first paper on cost-based query optimization was published in 1979 by Patricia Selinger from IBM Research. This paper laid the basic framework for optimizing relational queries that is still in place today. While there have been many technical enhancements since 1979, query optimizers still fail to pick the best plan when faced with a complex SQL query.

In this talk I will describe the basic mechanisms used by modern query optimizers including plan enumeration, the use of histograms to estimate selective factors, and plan costing. I will also talk about a new approach to query optimization that I believe will revolutionize the optimization of queries in the cloud.

About the presenterDr. David DeWitt has positively wowed audiences at the PASS Summit over the years, consistently delivering amazing technical keynote presentations. You can read his bio at LinkedIn, or check out his Wikipedia page, or his past work at the University of Wisconsin. He could talk about pretty much anything, and I’d listen.

SQL Query Optimization. Why is it so hard to get right?

Here are the slides, but note that they use animations extensively – for the best effect, download the PowerPoint directly and step through it in presenter mode to see the animations.

https://www.slideshare.net/BrentOzar/sql-query-optimization-why-is-it-so-hard-to-get-right


Demoing Latch Waits with Stupid Tricks

Development
0

Say you’ve got PAGELATCH_UP or PAGELATCH_EX waits, and you’re wondering what might be causing ’em.

I’m going to keep the brutally simple approach of building a stored procedure to simply dump hard-coded values into a table at high speed. I don’t want to select from other tables since they might introduce some other types of waits, especially when I’m demoing concurrency quickly.

So let’s create a stored procedure that does an insert into a table:

And then use SQLQueryStress to run 200 threads with it:

SQLQueryStress, stressin’ the queries

When we run sp_BlitzFirst @ExpertMode = 1 to take a live sample of wait stats, the results aren’t all that impressive:

Small fries

Because frankly, our code isn’t all that impressive. We’re only inserting 3 rows. What if we insert a lot more? Let’s build SQL dynamically:

That gives me a nice long list of values:

Valuable values

That I can dump into my stored procedure:

I could go on and on

And now, when I run it from 200 threads of SQLQueryStress, the waits for inserting 1,000 hard coded values at a time – each of which needs a page for its whopper CHAR(8000) field – look a little different:

Like cowboy up, but different

That’s what it looks like when your system is under heavy contention due to a lot of queries partying in table variables: in each second, each core on this system is spending 48 seconds waiting on PAGELATCH_UP. That’s awesome.

What about temp tables instead, you ask? Same symptoms: PAGELATCH_UP.

How about PAGELATCH_EX?

You don’t see exclusive locking on temp tables and table variables because they’re per-session. Global temp tables, however, that’s another story:

Again, use your system-generated list of values to throw in 1,000 rows each time. Don’t bother selecting anything from the temp table or deleting the rows – that’d only slow us down. The result: exclusivity:

All my PAGELATCH_EXes live in Texas

Now, everybody’s fighting over the right to insert into a single shared object, the global temp table. The same thing happens if you use a user table:

You end up in a brutal fistfight for exclusive rights to insert rows into an existing page:

PAGELATCH_EX waits hitting user tables

You can change the severity of the waits by adding or removing fields on the tables involved. Want to simulate more contention on a single page? Remove the CHAR(8000) field. Want to throw big fields in the mix? Toss in some off-row NVARCHAR(MAX) – but like we generated a hard coded list of IDs ahead of time, do something similar with your other fields, too, lest you end up locked in contention for the source table.

When I’m facing unusual waits like these in production, I love demoing them with the simplest queries possible. This way, we can quickly show how hardware helps (or doesn’t help) a particular bottleneck – especially as opposed to just changing the code or indexes.


How Table Variables Mess With Parallelism

This Is A Parallel Query

Hi There

This Is A Temp Table

This Is A Parallel Insert Into A Temp Table

Sort of asterisk

(The Insert isn’t parallel, but the part of the plan leading up to it is)

This Is A Table Variable

This A Serialized Insert Into A Table Variable

Yeah Nah®

The XML Tells Us Why

This Is A Parallel Query Against A Temp Table

I Feel Pretty

The Estimate Tells Us Why

Good Guess®

This Is A Serial Query Against A Table Variable

Face Gym

The Estimate Tells Us Why

Bad Guess®

Two Different Limitations

Inserts and other modifications to table variables can’t be parallelized. This is a product limitation, and the XML warns us about it.

The select could go parallel if the cardinality estimate were more accurate. This could potentially be addressed with a recompile hint, or with Trace Flag 2453.

Temp Tables don’t have those limitations, however they do incur some overhead. Accurate table and column cardinality aren’t free.

There isn’t much more of a point to this post except to consolidate this stuff in one place where it’s not all side notes.

Thanks for reading!


Stupid T-SQL Tricks

Bad Idea Jeans, T-SQL
33 Comments

Presented without comment:

Next up, can you break up a query with spaces? Yep:

Well if you can do that – can you break up a query across lines? Sure you can:

And now, brace yourself: this one is so weird that I can’t even embed it in the blog. I’m just going to show you a picture of it first:

Look carefully. This shouldn’t work, right?

You’re probably going to go through a few phases as you look carefully at that code:

  1. “Is it going to work?”
  2. “No – that can’t possibly work.”
  3. (Then you’re going to copy/paste the code from a Github gist and try it, but here’s the key: don’t try to edit it yet, just run it.)
  4. “How did that work?”
  5. “Why doesn’t the proc return any results?”

And then you’re going to try editing it, and things are going to get really weird. I wish I had a webcam to see your face as you worked through it. I’ll follow up with the secret in a later post. Here’s your only clue: Solomon Rutzky knew what it was as soon as he saw the code.


Is Your Database Databasic?

Pumpkin Spice

When I’m looking at someone’s server, it’s easy to tell the kind of care it’s been under.

We’ve written a lot about servers that have gotten the wrong kind of care. You know, no backups, no checkdb, but someone managed to turn auto create stats off for every database, and enable affinity masking. That’s not what I’m here to talk about today.

No, no. We’re here to talk about those servers that lack the self-awareness to realize they’re getting the bare minimum out of life.

Juicy

The first sign is always the hardware

You’ll see two or four CPUs, and a pittance of RAM.

Someone may have set MAXDOP, Cost Threshold for Parallelism, and Max Server Memory, but those settings will make about as much difference as putting your Sauvignon Blanc in Riedel stemware.

The second sign is always the software

Q: How do you spell apathy?

A: RTM

Uptime is great. A year of uptime is impressive.

It’s impressive in the same way that Elon Musk met Grimes; so many people had to not do their jobs for that to happen, one may find themselves literally-can’t-even-ing.

The third sign is always the indexes

Sure, you have them. They’re there. Clustered, nonclustered. Some included columns. Some more included columns.

But most of them have DTA in the name, and then some incomprehensible list of Ks and numbers. Some much longer than others.

This is the equivalent of letting Jenny McCarthy pick your pediatrician. Sure, she’ll pick a doctor, but now you’re partially responsible for the next plague.

The fourth sign is always the code

You don’t even have to scroll down past the comments usually.

Snake Oil

But when you do, you’ll be greeted by a DISTINCT, and a dozen LEFT JOINs, finally culminating in WHERE final_left_join.id IS NULL. While the code is old, it certainly doesn’t predate NOT EXISTS.

This is as morally reprehensible as juice cleansing.

Ugg

It doesn’t take much to get your database out of basic mode. Our First Responder Kit is a great place to start.

Thanks for reading!


First Responder Kit Release: Just When You Think There’s Nothing New Left To Do

T*m* f*r An*th*r F*rst R*spond*r K*t R*l**s*.

All joking aside! A big thank you goes out to a few people this go around:

@jadarnel27 for not only contributing a bunch of Super Professional T-SQL, but also for writing a web scraping application to compile a list of current SQL Server versions.

@nedotter @ktaranov and Aleksey Nagorskiy for putting together a brand new stored proc: sp_BlitzInMemoryOLTP, to examine your In Memory stuff. For documentation on that, head on over to http://nedotter.com/archive/2018/06/new-kid-on-the-block-sp_blitzinmemoryoltp/.

@EmanueleMeazzo has done an outstanding job working on both PowerBI and the backing queries in sp_BlitzFirst lately. I’m amazed anyone has this much patience — both for dealing with us, and with PowerBI.

Now that we’re done being all emotional, let’s get to the nitty gritty.

You can download the updated FirstResponderKit.zip here.

sp_Blitz Improvements

#1565: When determining processor speed and power settings, we weren’t accounting for people being on AMDs. I know, I know. Who would ever do that to their SQL Server? (@jadarnel27)
#1585@rsocol let us know that we weren’t ignoring DBCC USEROPTIONS. Consider it done!
#1592@Tisit let us know about some row duplication when checking for procs with recompile. This has been fixed with the healing power of DISTINCT.
#1606@CCUMan let us know about an arithmetic overflow issue they were hitting. Darn milliseconds.
#1603: Microsoft announced some new ignorable AG waits, along with some other surprises and gotchas around AGs:

When the host server has 32 or more CPU cores, each database will occupy 16 parallel redo worker threads and one helper worker thread. It means that all databases starting with the 7th database (ordered by database id ascending) that has joined availability group it will be in single thread redo or serial redo irrespective which database has actual redo workload. If a SQL Server Instance has a number of databases, and it is desired for a particular database to run under parallel redo model, the database creation order needs to be considered. Same idea can be applied to force a database always runs in serial redo model as well.

sp_BlitzCache Improvements

#1559: Fixed an issue where unsafe XML characters weren’t displaying correctly in clickable columns
#1561 A friendly user finally helped us track down when some data was causing truncation errors when looking at cached paramaters. Blame it on CASE expressions in WHERE clauses.
#1564 & #1578: SET OPTIONS for stored procs and statements are now logged in the cached execution parameter clickable column.
#1575: Fixed math for tallying proc and statement costs
#1594@tompazourek hates leading spaces. He let us know by changing the SUBSTRING arguments for the warnings column

sp_BlitzFirst Improvements

#1557: When we show CPU usage from ring buffers, we were only showing the most recent one. Now we show you all of them so you can see if something went terribly wrong recently.
#1581: We fixed sp_BlitzFirst to only alert when only > 20GB of memory is free. Thanks to @dbadave87for letting us know about that one!
#1586@EmanueleMeazzo fixed a little bug around dynamic view creation, reported by @smcnaughton. Yay, we didn’t have to do any work.
#1603: Same as in sp_Blitz.

sp_BlitzIndex Improvements

#1588: More clear wording around what aggressive index warnings are and mean, and how to approach fixing them.

sp_BlitzQueryStore Improvements

Just about the same stuff as sp_BlitzCache.

PowerBI

#1579: Standard deviations! @EmanueleMeazzo added a couple new pages to help you figure out if any current metrics are much higher than usual. This is badass.

sp_BlitzLock

@jadarnel27 added some great stuff:
#1572: Added deadlock priority to the results
#1573: Filtered out Heaps from deadlock index info.
#1597: Added the entire deadlock graph for easy distribution

I’m starting to think Josh has a deadlock problem.

sp_BlitzInMemoryOLTP Improvements

Initial release! Mazel tov! This is a soft release, meaning it’s not in any of the bulk installer scripts yet. After you nice people have had some time to kick it around, we’ll throw it in there as well.

sp_BlitzWho Improvements

Nothing this time around

sp_DatabaseRestore Improvements

Nothing this time around

sp_BlitzBackups Improvements

Nothing this time around

sp_AllNightLog and sp_AllNightLog_Setup Improvements

Nothing this time around

sp_foreachdb Improvements

Nothing this time around

For Support

When you have questions about how the tools work, talk with the community in the #FirstResponderKit Slack channel. If you need a free invite, hit SQLslack.com. Be patient – it’s staffed with volunteers who have day jobs, heh.

When you find a bug or want something changed, read the contributing.md file.

When you have a question about what the scripts found, first make sure you read the “More Details” URL for any warning you find. We put a lot of work into documentation, and we wouldn’t want someone to yell at you to go read the fine manual. After that, when you’ve still got questions about how something works in SQL Server, post a question at DBA.StackExchange.com and the community (that includes us!) will help. Include exact errors and any applicable screenshots, your SQL Server version number (including the build #), and the version of the tool you’re working with.

You can download the updated FirstResponderKit.zip here.


New Training Videos on Memory Grants, Paging, Reporting, Variables, and More

Company News
3 Comments

This week, Erik’s added several new videos to his T-SQL Level Up: Level 2 course:

And to celebrate their release, you can watch ’em free this week.

Students with eagle eyes will notice new “Mark as Complete” buttons beneath each video, too. This makes it easier for you to track which sessions you’ve already watched, versus which ones you still need to nail down.

Go get your learn on!


[Video] Office Hours 2018/5/30 (With Transcriptions)

Videos
0

This week, Brent, Erik, Tara and Richie discuss severity 16 error warning, best way to move databases from SQL 2008 R2 to SQL Server 2014, adding Analysis services to a cluster, taking to higher management, deadlocks in a transaction log shipping setup, syncing date between an on-prem and Azure SQL DB server, log backups, renaming SQL Server VMs, Docker, SP_Blitz question, and more.

Here’s the video on YouTube:

Office Hours Webcast - 2018/5/30

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Enjoy the Podcast?

Don’t miss an episode, subscribe via iTunes, Stitcher or RSS.
Leave us a review in iTunes

Office Hours – 5-30-18

 

Can I ignore severity 16 errors?

Brent Ozar: Alright, let’s see here. Steve asks, “I set up the alerts on my SQL Server per y’all’s suggestions, but why do all the references I can find say to ignore the warning severity 16 occurred on server name?” I want to say sev16 is one of the ones that you can – it usually indicates a T-SQL problem, so people say you shouldn’t worry about T-SQL problems. I still want to know. I still want to know if those kinds of errors happen. But if you have the kind of crappy application that throws sev16 errors all the time, I can understand why people would want to mute it.

Erik Darling: Yeah, you know, if you have so many errors coming out of your application that you’re getting error fatigue or alert fatigue, I would just limit it to the really high severity stuff like 19 through 25 so I know when stuff is really hitting the fan and not necessarily not every little peep and squeak the SQL Server makes.

Tara Kizer: It looks like our script does do 16 to 25, so maybe 19 to 25 is a better choice.

Brent Ozar: And I always wanted to know 16, but you’re not the first person to say that too.

 

Should I upgrade in place or build a new server?

Brent Ozar: Augusto asks, “Hi, if you have to choose between in-place upgrades or migrations and you’re moving between SQL 2008 R2 to SQL Server 2014 and you have a tight maintenance window and a very large database, what would you do?”

Erik Darling: Log shipping.

Brent Ozar: From the old cluster to the new one?

Erik Darling: Yeah, man.

Brent Ozar: And why would you do that?

Erik Darling: That’s my preferred way to fly just because it’s so dead simple to set up. Like, with mirroring, you know, you could add a lot of overhead to your system if you do synchronous. You might only be on Standard Edition, only be able to do synchronous. So if you’re on Enterprise, you can do async and then flip it to sync at the last minute and then flip over, but I would rather go with log shipping just because it’s so dead simple. It doesn’t really add any overhead because you’re most likely doing log backups anyway. And then when you go do cutover, it’s pretty easy to cut things off at that last log backup, bring things live on the new server and then let people, you know, mess around in there for a little bit. If anything goes terribly wrong, you can still point them back to the old server and have something to go back to. You don’t have that if you do an in-place upgrade. If you do an in-place upgrade, like you know, you get things upgraded, like people get when they start working and something goes terribly wrong, you don’t have a backup plan. You are just miserably stuck with whatever awful thing you did to your server.

Brent Ozar: I could see management saying you could need to do in in-place upgrade if it’s a very large database, depending on what they mean by very large, just because it’s less storage. And I’d be like, no man, that’s plan B.

Erik Darling: Like, how much of your maintenance window are you willing to dedicate to setting up a new server and then in the middle of the night moving things over to that to reassemble what used to work? I don’t want to do that.

Tara Kizer: I will never do an in-place upgrade in production ever, period, end of story. And if management is saying I need to, sorry, not going to do it. I just have to put my foot down. For production, it’s just asking for problems.

Brent Ozar: I like saying no. it sounds weird to say no to management, but at some point, you go, you’re paying me for my advice. You hired me for my expertise. I wouldn’t come to work wearing an asbestos suit; why would I do an in-place upgrade? There’s some things you don’t want to do.

Erik Darling: Why just paint over lead paint?

Brent Ozar: That’s right. That’s what kids are for. You give the kids a scraper and put them to work.

Erik Darling: And if they’re teething, it’s even easier; they just latch right on.

Brent Ozar: Which brings the question, Richie, during your home renovations, why aren’t you making this part of a home-schooling project where you just teach the kids construction? Of course, I guess you have to deal with the aftermath.

Richie Rump: Part of the same reason I don’t do a ton of work offshore; because you can’t control the quality.

Brent Ozar: You can control the quality with your kids it’s just continuously low quality.

Richie Rump: It’s difficult to control the quality; how’s that?

Brent Ozar: Samantha follows up with – she says, “Why aren’t you going to SQL Server 2016? Just bypass 2014. If you get 2016, you also get query store.” For that matter, why not go 2017? 2017 you start to get adaptive query plans, you can run the whole thing on Linux, Docker containers.

Erik Darling: Let your developers start using R and Python live in the database.

Richie Rump: So many bad ideas; it’s amazing.

Tara Kizer: None of my 2016 or 2017 clients that I’ve had so far, and I’ve had two 2017 clients so far, none of them have had query store enabled. It’s very disappointing.

Brent Ozar: It’s so weird. We have it in sp_Blitz, hey, you should put in query store. Nobody puts it in. it’s really weird.

Erik Darling: I wrote a whole stored procedure for it hoping that, like, this is the future, and I’ve used it twice. Every client that I’ve had has been on 2016 – well not every, but, like, a lot of them have had some variation on the story. But they’re like, yeah I tried to turn it on then my CPU shot off the radar for a half hour and I finally had to turn it off. I’m like, okay, well welcome to V1.

Brent Ozar: Weird.

 

Can I add Analysis Services to an existing cluster?

Brent Ozar: Justin asks, “Once you already have a clustered SQL Server, is there any way to add analysis services to the cluster?”

Tara Kizer: Even if there is, why would you? I don’t want those extra products on my SQL Server. The database engine is what I’m going to have on my cluster. I’m going to have supporting applications, such as my antivirus, you know, net backup or whatever it is, that supports SQL Server instance. But SSIS, SSAS, SSRS, they all get to go on their own boxes somewhere else. They do not get to pollute the SQL Server instance at all. I segregate everything. Nothing goes on the SQL Server boxes.

Brent Ozar: Preach. And too, if you think you’re getting high availability by clustering analysis services, you’re not really getting it. It doesn’t really do health checks on there. So if you go to Books Online they talk about how you can cluster it; it just doesn’t really have any health checking kind of things. What you’re better off, if you want high availability, is you build a couple of separate VMs or a couple of separate physical boxes and you put them behind a load balancer.

 

Should I switch to log backups every 15 minutes?

Brent Ozar: Next up, Steven says, “My company has a two-hour data loss policy.” Like us, we have a two drink minimum, so I kind of agree with that. He says, “Any suggestions of how to approach convincing them to go to 15-minute transaction log backups?” Oh, that’s an interesting question.

Tara Kizer: What, who are you trying to convince? I set up the job schedules and make sure that I at least hit their goals or better and 15 minutes is better than two hour anyway. Who is responsible for the job schedule here? My managers wouldn’t even have access to look; I would just be telling them frequently.

Brent Ozar: And y’all think she’s joking, but that’s how we roll here inside the company too as well. I don’t have access to do most of the stuff on our production SQL Servers. Richie owns this stuff. When one of us wants access to something, we put in a ticket to Richie. I have no business poking around in there, you know. It’s a bad idea.

Erik Darling: It’s a business’s job to tell you how much data they want to lose. It’s your job to figure out the technical setup that you need to meet those goals. So they shouldn’t be in there saying, no you have to take log backups every two hours. It doesn’t work like that – cart before the horse; whatever that saying is.

Brent Ozar: And I like doing log backups more frequently only because if one fails or two fail you still have enough time there to meet your two hour. If you’re only doing it exactly every two hours and you’re only supposed to lose two hours of data, the first job that fails, you’re screwed. You lost it.

 

Why are my restores being deadlocked?

Brent Ozar: Oscar asks, “I’ve got a shipped transaction log set up in one of our SQL Server databases and it works just fine, but every now and then it gets deadlocked in the restore job when it’s trying to restore the log backup. Is there any way to prevent deadlocks in a transaction log shipping setup?” I bet you’re getting deadlocks in MSDB. I bet you’re trying to insert update or delete record in MSDB and I bet that’s where you’re getting log shipped. And if you want, you can use sp_BlitzLock – sp_BlitzLock is a really slick stored procedure that Erik wrote to go get you the deadlock graphs and show what you’re being deadlocked between. And it’s probably something like somebody’s manual scripting that’s going and doing things inside the MSDB backup and restore tables. The one I’ve seen it happen a lot is where someone’s cleaning up history continuously. Like every five minutes, they’re trying to delete old records in MSDB. Just do that daily.

Erik Darling: Or if your MSDB history has just never been pruned and is gigantic and the god-awful thing that goes and looks for where to put in rows, where to delete stuff from, is just getting all lost on its own. I actually blogged a little while back where I put up some of the worst T-SQL that I’ve ever seen in a stored procedure and I wanted people to look at it. And it’s from sp_deletebackuphistory, or whatever that thing is called, and it’s all table variables and poorly written T-SQL with no indexes on anything. It’s like hell on earth whenever anyone wants to try and delete…

Tara Kizer: Like hell on earth, but it works starting with SQL Server 2005. You should have seen the crap-show back in SQL Server 2000. We all had to roll our own code back then and my manager back then realized that if he put an index on one of the tables, MSDB did that fix built in. I’m like, I don’t want to put indexes on tables in MSDB, so we all rolled our own back then. And in SQL Server 2005 or greater, they at least fixed that performance issue. I understand it’s not as fast as it could be, but SQL Server 2000 and earlier was horrible.

 

How should I keep data in sync with Azure SQL DB?

Brent Ozar: Daniel says he’s inherited an environment where they’re doing a custom database sync between on-premises SQL Server and then Azure SQL DB, which gets manipulated somehow in the production database. “What would you do in order to keep data…

Erik Darling: Excuse me…

Brent Ozar: “How would you recommend syncing data…” Richie… “How would you recommend syncing data between an on-premises SQL Server and Azure SQL DB?”

Tara Kizer: What are they trying to sync? Is that for a failover on-system or is it so you can run reports out there? So I maybe would use transactional replication out there, but maybe use another feature if it’s just for failover purposes. It just depends on what you’re trying to do.

Brent Ozar: Daniel says reporting, he’s doing reporting up in Azure SQL DB. Oh, so then I like Tara’s idea of transactional replication.

Tara Kizer: Well it does work well when it’s working and you don’t have to troubleshoot it and you can tolerate whatever latency that’s there when large transactions happen. It gets a bit of latency.

 

Where should I do log backups in an AG?

Brent Ozar: Sri says, “I’ve got a multi-site Always On Availability Group. What’s the best way to do log backups?”

Tara Kizer: It says log backups on all the servers. Do your log backups on the primary, that’s it. You certainly can move it to the secondary replica, but it’s going to be out of date, you know. So I do all my backups on the primary replicas. I don’t want my backups to be possibly not up to date.

Erik Darling: I really never understood the point of offloading backups with the AG. I just never got it…

Tara Kizer: What problem are they experiencing that they have to offload this simple task?

Erik Darling: Like, can you even have a good AG environment where backups screw you that hard?

Tara Kizer: I know. What does your I/O look like if that’s your problem?

Erik Darling: Like network disk, like, what is it that’s going on that’s making your server barf that hard taking a backup? It’s odd to me. Odd problem to solve…

Brent Ozar: I’m going to offload readable queries; I’m not going to offload backups because you’re going to pay the same licensing fee no matter what you’re offloading. Just go get rid of the locking problems; move those off.

 

Should I rename a SQL Server VM?

Brent Ozar: Next up, Guillermo asks, “What should I take into account when I rename a SQL Server VM?”

Erik Darling: Which part? Like just the VM or the SQL Server or…

Brent Ozar: I bet – there’s a Microsoft Knowledge Base article on what you have to do and it has to do with renaming – sp_rename, changing the server’s rename.

Tara Kizer: And it used to be that you had to do extra steps, but it’s much simpler these days.

Erik Darling: Well, I asked because something – I used to always goof-up whenever I made a new VM. Like, I would give the VM a new name in, like Hyper-V or whatever and I would be, like, this is my VM. And then I would like install Windows and go to connect to the VM name that I gave it in the virtualization platform and it’d be like, no that doesn’t exist. And I always forget that I have to change the computer name too to like match the VM name so I would connect right.

Tara Kizer: And there was one operating system – this does not have to do with virtualization, but one operating system where the name needed to be in upper case, and if you had it in lower case, it caused some kind of weird issue because we encountered that and it was horrific. And I forget what operating system it was, but it was a big deal. And this was on a clustered system.

Brent Ozar: And I wouldn’t change names on anything in a cluster either. Forget that. Guillermo says he wants to change the DNS computer name, all of it. I think we would take the same approach here that we would talk about with in-place upgrades. I’d rather build a new SQL Server with the name that you want and everything correct from the get-go and log ship or DR over or database mirror over to it. But this Knowledge Base article from Microsoft is actually fantastic. Books Online has gotten way better over the last several years and you can even be a contributor. You see how there’s these contributors things up here. You can go over and click edit on the right-hand side and you can go in and just edit these. They’re stored in GitHub, but you don’t even have to get fancy with creating a GitHub account or anything like that.

If you scroll all the way down when you’re logged in – I guess you do have to create an account, but you don’t have to go and fork the repo or anything. There’s an edit the page thing down at the bottom where you can just edit the page right there inside your web browser. You don’t have to do a whole lot of the more complex GitHub stuff.

Richie Rump: That’s why it’s gotten better.

Erik Darling: I’ll tell you one thing that bit me real bad with VM setup too is I had an old AG one that was all 2014. And when 2016 came along, I wanted to replace it with that. So it’s deleted all my old 2014 VMs, spun up 2016 VMs with the same name and then expected them to just work out of the box. But then I had to delete a bunch of DNS records, like other crap because it was looking at the wrong IPs, nothing was connecting. It was just a nightmare. So make a new one – please, just make a new one. Name it what you want, think about the children; awful.

 

Should I use Docker?

Brent Ozar: next up, Steven says, “I’ve heard a lot lately about Docker. What are your stances for Docker for DBAs in 2018? Is this something that DBAs need to worry about? Is this something that developers use?”

Erik Darling: Just stay away from the ones with pleated fronts. Pleated fronts are just miserable. I did it… I did it…

Brent Ozar: What you can’t see, ladies and gentlemen, is that he’s wearing plaid pants.

Richie Rump: Shorts – we know he’s not wearing any pants.

Brent Ozar: And socks and sandals… Now, have any of y’all played around with Docker?

Richie Rump: No. Docker is fun for the dev, but the question is, what problem will you be solving suing Docker? Because Docker is a way to install something very quickly. I believe it runs in its own container and does all this other kind of nifty stuff for the developer or I need to get some code out running. But for a database, what problem are you trying to solve here? So there’s a lot of questions that I would ask – why would I want something in a container? Am I moving it somewhere? Am I installing a lot of these in all these different servers all over the place? If this is going on physical hardware or if it’s going on a VM, just the ease of installation and configuration settings is what you’re getting. And if you’re a DBA, that’s part of your bread and butter, so even I, as a developer, haven’t really seen, for me in what I do, a really big need to be moving stuff to Docker. Serverless does that all for me.

Erik Darling: DBAs are like the most insecure people. Like every new technology that comes along, they’re like is this going to take the DBA job? I read a blog post not too long ago; DBAs worry about R taking their job. Like, what is R going to do that takes a DBA job? Like, what facet of R makes you nervous about your job? Like, no, no one’s going to, you know, look at statistical analysis of your log backups and decide you’re a terrible DBA and fire you. I just don’t understand. Everything that comes out, like, this is going to kill me. I don’t know what’s going to happen.

Brent Ozar: Artificial intelligence…

Erik Darling: Machine learning, the cloud, like everything, people are like the DBA is dead, done, buried, forget about it. And I’m like, you know, five, ten years later there’s still this big stinking pile of us that show up in Seattle every year, so I don’t know.

Richie Rump: You would think after 30 years of hearing your job’s going to be obsolete in the next five years that it would just kind of go over people’s heads, but no, everyone still freaks out about it. It just doesn’t make sense.

Erik Darling: you know, the [Mane Tack] repairman is still hanging out doing stuff.

Richie Rump: He’s still getting paid, I don’t understand…

Erik Darling: Residuals at least.

 

Why isn’t SQL Server recommended in virtualization?

Brent Ozar: Mark says, “Why is it not recommended to have SQL Server running on a VM?” Whoever’s telling you that, you need to go have a talk with them. He follows up with, “Are there best practices when running SQL Server on a VM?” yeah…

Erik Darling: VMware publishes them. VMware has a whole architecting VMware with SQL Server guide that I would say is a pretty good place to start for advice.

Brent Ozar: SQL Server VMware best practices – the one thing that I would warn you about is in some of these, they’ll talk about, for example, single core SQL Servers by default. So just read some of it with a grain of salt.

Erik Darling: That’s not in this one though. This one is actually pretty sane about how to configure stuff around numa and other things. I’m fairly behind this guide. Like, not 1000%, but…

Brent Ozar: See if it says anything about cores per – oh good, cores per socket, oh that’s great. That’s a good topic to get into. So that’s nice. We’ll throw that back up at the very top. So that is architecting SQL Server on VM vSphere and I got there from Googling that right there.

 

The First Responder Kit didn’t update itself

Brent Ozar: Marci says, “I installed the latest sp_Blitz scripts on my dev and DR boxes, but then when I ran the install all script on my prod box, it didn’t update the stored procedures.” I bet you ran it in the wrong database.

Erik Darling: Yep.

Brent Ozar: This happens to me every week to the point where Erik had to write a trigger for user databases to make sure that when we try to install them in the wrong database, they’d just get rolled back. A long time ago in a galaxy far, far away, we put a use script up at the top, like use master, so that it would enforce that it would go only in the master database. And then we have users who are like, no I want to put it in any database and I want to just be able to run the script. So we took the use part out…

Erik Darling: Now we have the extra fun part of figuring out which database they got installed in, then go delete all those. Context is everything, folks. Context is everything.

Brent Ozar: Alright, well that’s all the questions we have this week. Thanks, everybody for hanging out with us this week at Office Hours and we will see y’all next week. Adios, everybody.

Erik Darling: Goodbye.

 

Wanna attend the next Office Hours podcast and ask your own questions? Register here.


Mysterious Forwarded Records

Indexing
7 Comments

Thinking Of You

When people think about Heaps and the problems they can cause, they don’t often think of temp tables and table variables as Heaps.

Of course, without a clustered index, any table is a Heap.

This isn’t an argument for or against indexing temp tables, but while working with a client we came across something strange!

sp_BlitzFirst was reporting hundreds of thousands of Forwarded Records, but there were no Heaps in user databases.

When we dug in closer to what queries were doing, we found lots of places where temp tables had an insert/update pattern.

Setting Up

A close approximation of the problem part of this code looked like this:

A temp table got created, some amount of data was inserted to it, the temp table got updated with values from another table, and then there was a final select from the temp table.

Setting up SQL Query Stress to run that proc across a bunch of threads…

STRASS

And then running sp_BlitzFirst for a 30 second sample:

We get these results back!

A Crapload By Any Other Name

Changing The Pattern

The reason why we warn about forwarded records is that they can cause a lot of additional random I/O. Reads need to jump around from where a row originally was, to where it lives now, and then back to continue reading rows.

In this case, it was simple enough to change the pattern to insert with the same join that performed the update.

If you ever happen across a Forwarded Fetches warning in sp_BlitzFirst, but you’re Heap-free in your user databases, they may be lurking in tempdb.

This can happen with temp tables or table variables — they both occupy space in tempdb, regardless of what someone told you about table variables being in memory only.

Thanks for reading!


Wanna Attend All My Classes for a Year?

Company News
1 Comment
Live Class Season Pass

When I introduced my Live Class Season Pass, buyers got access to my entire new Mastering series for 2018.

I wasn’t sure how buyers would react to the new class lineup – these classes are tough. They’re 3 days of alternating between deep lectures and challenging labs.

You get hands-on access to a beefy SQL Server 2017 VM in the cloud, with a series of live workloads running against the Stack Overflow database.

The reviews have been great!

Mastering Index Tuning reviews

“Excellent class, the ratio explanation to labs worked really well for me. Expectations were high and got met.”

“Excellent class! I used to be scared of interpreting execution plans, but after this class I feel much more confident. Brent is an AWESOME teacher, and his enthusiasm for SQL Server and performance tuning are inspiring!” – Marie McAndrews

“As always Brent exceeded expectations! He has the ability to review the material to make it stick and the demos and labs worked great together. A natural born teacher who has mastered his craft; I highly recommend his training courses; webinars and conference talks. Although I knew enough to get by with index tuning and of course SentyOne plan explorer has index analysis that is great for finding the right index, the material the Brent covered made so much sense and cleared up a lot of the finer points of parsing execution plans that I learned many new tricks. He also takes the time to answer any questions. Take his courses, you won’t be disappointed!!” – Becky Harter

“Fantastic class. Brent is constantly refining his teaching skills to make complex topics understandable. If you want to learn what an index does, go watch How to Think Like the Engine. If you’re ready for the next step, index tuning, take this class.” – Tim Cockerham

Mastering Query Tuning reviews

“This class was a thorough reminder of how skilled Brent Ozar and Erik Darling are at query tuning and how much more I have to learn. The structure of the course was unique and wonderful in that the labs were realistic representations (not AdventureWorks) of many of the same problems that we fight in production every day – with not necessarily one “right answer” to the problem. It was also very cool to see the presenters go off-script and adapt their query tuning approach on the fly. This class is training that is well worth the time.” – Bryan Rebok

“This class gave me a giant toolkit for improving query performance. I found holes in my current process as well. I look forward to finding bad queries now because I can make them so much better. I highly recommend you take this course if you have queries that need to be tuned. You won’t be disappointed or bored while the Brent Ozar team provides a fun but very intense class.” – Kevin Kelso

“Mastering Query Tuning is another fantastic training offering from Brent Ozar Unlimited. Brent and Erik did a fantastic (and entertaining) job of presenting the material and associated triage and tuning methods. The interactive discussion in Slack was very beneficial. The well planned course labs allow attendees to experience, triage, and improve on all of the issues present in the demo database. After spending hours with the lab environment I feel confident that I can diagnose and solve (or at least improve) any of the issues in our production environment.” – Todd Nelson

Mastering Server Tuning reviews

“As with all of Brent’s classes – fantastic. So much great information is presented and the hands-on portion reinforces the concepts.” – Alan O’Neal

“Great mix of lecture and hands-on lab plus the use of the SLACK website for real time communication was amazingly helpful. I always learn a lot from Brent’s classes and the convenience of doing this on-line is perfect when our budget is limited.” – Lori Fry

“Brent has the rare qualities of being knowledgeable, understandable and fun all at the same time. The exercises are well-designed. Plus there is a good balance between instruction and hands-on learning. I highly recommend this course as well as the “Mastering Index Tuning” and “Mastering Query Tuning” courses.” – David Hicks

So what am I changing?

Before, when you bought the Live Class Season Pass, it stuck within one calendar year, like 2018. I designed it that way because I thought I might tweak the lineup for 2019.

The verdict is in, though: y’all love the classes, so I’m keeping the same lineup – just adding more material and continuing to change up the labs over time.

So now, when you buy the Live Class Season Pass, it’s good for one year from the day you buy it. That means if you buy now, you get access to the class through mid-2019. I’ve added new class dates going through May 2019. You can buy with confidence today, and have fun with me over and over through the next year. It’s been so much fun seeing the same students again and again, taking classes over to see if they can do a better job on the lab challenges the next time around. (Plus, we’re giving this out retroactively too – anyone who already bought the Live Class Season Pass after Jan 1st gets a full year of access too, not just the 2018 class dates.)

To celebrate, it’s on sale for $3,995 through the end of June (down from $5,995). If you want to pay via purchase order or check, just choose that option during check-out. As long as we get your payment before the end of June, you’re in.

See you in class – there’s still time to join the 26 other folks in Monday’s Mastering Index Tuning.


How Many Kinds Of Joins Are There?

Quite So

There are three in SQL Server, not counting the Adaptive Join which is just a placeholder for either Nested Loops or Hash Joins.

If you look closely at each of those joins, you’ll see all sorts of interesting things pop up.

Hopefully none of them will be Sorts.

NOT EXISTS

Nested Loops has:

  • Unordered Prefetch
  • Ordered Prefetch
  • And might be Optimized!

Merge has:

  • One to many
  • Many to many
  • With a Bitmap!

Hash has:

  • Residuals
  • “Perfect” hashes
  • With a Bitmap!

But going even further, there’s stuff that isn’t exposed about Hash Joins.

Internally, they might be In-Memory, Grace, or Recursive. You can’t see that anywhere in a query plan, but it’s out there.

This Is Jeopardy

There are things I’d much rather you know and understand about joins, of course.

But the next time someone asks you what types of Joins SQL Server supports, you can have a much longer answer.

Thanks for reading!

Brent says: and then there’s remote joins, which aren’t really a separate kind of join, but just using a join hint to declare which side the join happens on. But I know that you, dear reader, are going to come up with a gazillion edge case examples of other kinds of joins in the comments. That’s how you roll.


Behind Every Trivial Plan Is A Good Demo

Execution Plans, Indexing
0

A While Back

I started this Q&A on Stack Exchange about ISNULL. It’s one of the most common client issues I see, and putting stuff like that out in a different community to get more opinions on it is valuable.

Also, I’m saving up points for a Stack Overflow Swiss Army Knife. That’s what you do with points, right? Like Marlboro Miles?

Hiding Out

It wasn’t until I was looking at something else entirely, that I realized my starting query was a Trivial plan.

You’re Just Too Trivial

Now… Looking at this plan, you don’t really expect it to be Trivial. There’s kind of a lot going on there.

Via Paul White:

The details of which types of query can benefit from Trivial Plan change frequently, but things like joins, subqueries, and inequality predicates generally prevent this optimization.

Now, I’m not sure if the details have changed, or if this query is just much more Trivial than the syntax or plan make it appear.

But let’s go one step further.

Laying Low

Let’s run this query with a known Trivial plan killer.

The plan changes quite a bit, and uses a far less common strategy called Index Union.

Oh you’re like a little bunny rabbit.

This is a far more interesting plan, and I was a bit surprised to see it.

If I tried for a week to write a query that got this plan on its own, I’d fail utterly.

Plan Collector

I wish I had this plan in time for this post. It is indeed a sort we did not ask for.

The optimizer chose something between index intersection and sort-merge and a weird kinda-but-not-really Key Lookup.

It performs two ordered seeks into the same index, sorts the side with the inequality predicate on Id, and merges the two inputs together.

For those wondering, the seek without the sort doesn’t need one, because our index is just on Age, and is non-unique.

That means the clustered index key column (Id) is in the key of the index, and since IS NULL is an equality, the Id column is ordered by it next.

After that are Stream Aggregates to do some grouping and counting.

And to think, all this interesting stuff was hiding just a couple search phases beyond a Trivial plan.

Thanks for reading!

Brent says: when people see this, their hands immediately shoot up in the air to ask, “Should I add WHERE (1 = SELECT 1) to all my queries?” I lean towards no on that one because most of your trivial queries just don’t need full optimization, and that further compilation takes longer. I think I can count on one hand the number of times in the last couple years where an under-optimized trivial plan was the main cause of a client’s problems, and I wasn’t even lucky enough to hit that myself. Tara hit it, and the rest of us were all excited to see it, hahaha.