Blog

Hidden in SQL Server 2017 CTP v1.1: sys.dm_db_stats_histogram

It’s Friday night, so I’m waiting for new CTP Releases

As soon as I got the email, I started reading the release notes. Some interesting stuff, of course.

Batch mode queries now support “memory grant feedback loops,” which learn from memory used during query execution and adjusts on subsequent query executions; this can allow more queries to run on systems that are otherwise blocking on memory.

‘New’ string functions like CONCAT_WS, TRANSLATE, and TRIM.

STRING_AGG WITHIN GROUP (ORDER BY)s.

Cool.

Poking around

I started looking for new objects in sys.objects, comparing it to what’s in 2016.

GREAT GOOGLY MOOGLY

I don’t know how this didn’t make the release notes. But there it is. My first Connect baby with Microsoft. I’m a proud papa.

What and how

So, what does it look like, and how do you use it?

It’s a pretty simple CROSS APPLY, like many other system functions.

What you get out looks like a histogram. Because, well, it’s a histogram, but in rows.

THIS IS

sysrowcols whatever GIMME

AWESOME!

It’s not exactly perfect

For instance, if you just let it loose without filters, you get a severe error. The same thing happens if you try to filter on one of the columns in the function, rather than a column in sys.stats, like this.

Oops

But I don’t even care. It’s brand new, and it’s so much better than trying to tabularize DBCC SHOW_STATISTICS output.

Thanks, whomever worked on this.

And thanks for reading!


[Video] Office Hours 2016/12/14 (With Transcriptions)

This week, Brent, Erik, and Richie discuss their craziest SQL server troubleshooting issues, common causes of SSPI handshake errors, setting deadlock priority “high” on a query, vmotion high availability, best practices on Windows Server 2012 page files, and whether a career as a Database Administrator will be good for the next 40 years.

Here’s the video on YouTube:

Office Hours Webcast - 2016/12/14

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Enjoy the Podcast?

Don’t miss an episode, subscribe via iTunes, Stitcher or RSS.
Leave us a review in iTunes

Office Hours Webcast – 2016-12-14

 

Brent Ozar: Let’s see, I suppose we should get into questions. It is 15 minutes after the hour. We’ll start off with the first one, it says, “I have a replication question. Just kidding.” I guess this person sees that we don’t have Tara here, so we’re fresh out of replication.

 

Brent Ozar: Fellow human says, “I have a nephew who is very interested in what I do. Is database administrator still a good career choice for someone who’s in high school now? Is this career going to be good for the next 40 years?”

Erik Darling: In some form, but maybe not in like a traditional DBA sense. I think data is always going to need a steward. I think it’s always going to need someone to take care of it and to protect it in some way and to watch to make sure that certain things are happening that safeguard it. I don’t think that it’s always going to be the same backup and restore and corruption checks and, you know, query and index tuning that it is now. But I think in some way data is always going to need someone to make sure that the right processes are happening with it. It’s good to learn what’s happening now and maybe get into it now and get your feet wet so 40 years from now you can say, “Back when I started, I used to do backups myself. I used Ola Hallengren—the senior, not the junior.” So it’s good to get your feet in and start doing stuff now but whether this job, in this role, and the way it is now is going to last for forty years, I doubt it. Think about cab drivers five years ago.

Richie Rump: Look at me. I am the future of your data. This is it. My role.

Brent Ozar: That’s terrifying.

Richie Rump: It’s the future.

Brent Ozar: So what do you think, Richie?

Richie Rump: I think it’s going to be the person that understands the development side as well as the data side, that is going to be your steward. That is going to be the person that in the future with the cloud and all this other stuff that is going down, I think that’s going to be the person that is going to understand the data, how it works. It’s not going to be this low-level “Oh, I do the backups,” because those are all going to be automated in SQL Server 2030, whatever that is.

Erik Darling: But who runs the cloud?

Richie Rump: Someone who understands the code, someone who understands the data, that is going to be the future of the DBA.

Erik Darling: Who clouds the cloud, Richie? Who clouds the cloud?

Brent Ozar: I think if you step back generally, too, if you say, is any career going to be good for 40 years? That answer is probably no. With our parents, or grandparents, like my grandpa held the same job for I want to say 40 years at the same factory, he was a union’s rep or a union relations rep. Retired, he knew everybody in the community. My dad has held—I want to say four different careers. I’ve held a bunch of different jobs even, just every two or three years switching out. The things that you have to learn so much quicker is just very different.

Erik Darling: My dad has been doing the same kind of thing forever and ever doing sort of like technical sales writing and marketing but the companies and the stuff he’s been doing it for have been all over the place over the years. He’s at Oracle now but he started a little company called Data General in Massachusetts that was doing, at the time, tiddlywink hardware where like NT clusters were like holy crap.

Brent Ozar: Yeah.

Richie Rump: So now you don’t have to go anywhere else, Erik. You’re in your last job.
Erik Darling: That’s true. Ever.

Brent Ozar: We think he’s going to die with the rate that we eat and drink. So you can have one career, it’s just a really short lifespan now.

Erik Darling: I basically have the lifespan of a fly.

Brent Ozar: Watch him age in front of our eyes.

 

Brent Ozar: Next question, we’ve got some interesting soft skills questions here today. “What is the craziest thing you’ve heard someone wanting to do with SQL Server? Or the craziest troubleshooting issue that you’ve had?” Well, Richie went second last time, we’ll make him go first this time. Richie, what’s the craziest thing you ever heard someone wanting to do with a database server, or just in general?

Richie Rump: With a database server? Oh, geez, I’d have to think about that one.

Brent Ozar: Then Erik goes first.

Erik Darling: Well, a couple things that pop out in memory since I’ve been here was one person who basically, it was using CDC, change data capture, but it wasn’t working the way they wanted so they rewrote a layer over change data capture with Dynamic SQL. It was like CDC broke for us so we had to rewrite it. So they basically had this layer over CDC that did stuff for them. Another one that came to mind was people who are partitioning data by day but when they receive data it could update prior day’s stuff, so all these merge statements that they wrote were just going kablooey. Like they were just not getting the kind of partitioning magic that they were hoping for with things. There was sparse little partition elimination happening in that scenario.

Brent Ozar: How about you, Richie?

Richie Rump: The craziest thing, and actually it didn’t turn out to be so crazy, but I was working with a QSR company, fairly large. They had decided to write their own NoSQL database.

Brent Ozar: Oh, okay, which is common now these days.

Erik Darling: Everyone is doing that.

Richie Rump: Yeah, so it’s all this payment processing that’s coming in and they decided to write their own SQL database to store all this information. Where I came in is that they had no way to report off of that. So they started dumping it into SQL Server and that’s how I came into the project. It ended up working out for them. They put a ton of effort into building their own NoSQL stuff, but it worked out. So again, maybe crazy like a fox, maybe a bit. So I can’t wait to hear them upgrade to Cassandra. That would be fun.

Brent Ozar: I’d say the weirdest thing—there are a couple, two come to mind. One was storing Access databases inside a SQL Server database and they were redoing a terabyte of Access databases every day. So that was spectacularly challenging. The thing that I was most horrified by was I went to go troubleshoot queries and I’m like, “Where are these queries from?” The guy says, “Perl.” He goes, “Yeah, we have a Perl application, a Perl script that goes through and generates dynamic T-SQL.” I’m like, “Okay. All right. I’m curious now. You have to show it to me.”

Erik Darling: So where does Perl sit? She writes some interesting T-SQL. What’s her email? Let’s bring her in here. Another crazy thing is a friend of mine works for—he’s a lead developer at a rather large company that does news stuff. They’re still currently, as far as I know, using Oracle. Years ago, when I first started working with databases, I was like, “Do you guys do anything with databases?” He’s like, “Yeah, we use Oracle but we’re migrating away. We’re going to start doing everything with files on disk because there’s no locks and it’s faster.” So like every six months I’ll be like, “Hey, how’s that migration going?” He’s like, “We still haven’t got off Oracle.” Still working on it.

Richie Rump: The Oracle rep was not worried about losing that account, at all.

Erik Darling: He’s like, “Oh, cool.”

Brent Ozar: “Yeah, you want another renewal?”

Erik Darling: “Whenever you’re ready, just sign them.”

Richie Rump: “No discount, sorry.”

 

Brent Ozar: Let’s see here, next question, “What are some common causes of SSPI handshake failed errors?” Dagnabbit, what I’ll do is I’ll add in the weekly notes a link that I’ve got. Robert Davis has a post. If you search for Robert Davis SSPI and setspn, set service provider name, just all one word, setspn, he has a great post on how to fix that.

Erik Darling: Yeah. His posts offer the T-SQL Tuesday really cool about backups and restores.

Brent Ozar: Yeah.

Erik Darling: Because I don’t do a lot of backup and restores anymore so it’s cool to like reread someone who is like, “No, this is my thing.” Talk about LSNs and, you know, catching stuff up with differentials and logs, and like NoRecovery posts. Interesting stuff. I love reading his blogs.

Brent Ozar: You know, he has a function to convert the LSNs from fn_dump_dblog.

Erik Darling: Yeah, but it’s a scalar valued function so I can’t use it.

Brent Ozar: That’s not fast enough for me.

Erik Darling: No, it’s against my religion. I can’t do it. I can’t have the parallelism inhibited when I’m converting LSNs.

Richie Rump: So it’s against the Jedi religion? I didn’t know that. That’s good to know.

 

Brent Ozar: “Have any of you found useful situations in which to utilize SET DEADLOCK_PRIORITY high?” He’s got a whole paragraph of other stuff in there, but basically, have you ever thought about setting DEADLOCK_PRIORITY high on a query?

Erik Darling: No, only low, and only for like index rebuilds which I think was what it was for, for the most part. But, no.

Brent Ozar: Yeah, I always get worried overriding someone else’s session and like making them rollback when they may not be the cheaper one to rollback.

Erik Darling: Yeah.

Brent Ozar: I would also say if you find yourself doing DEADLOCK_PRIORITY high and if it’s anything like a report, maybe it’s time to think about a reporting server or a different isolation level.

Erik Darling: Or at least there’s NOLOCK hints, man. Come on.

Brent Ozar: Yeah. Dirty reads.

Erik Darling: Wrong data.

Brent Ozar: We need to kill a DBA on a song called dirty reads.

Richie Rump: Yes.

Brent Ozar: Someone make this happen.

Richie Rump: Straight out of Detroit, it’s Dirty Reads.

 

Brent Ozar: Fellow human asks, “We run update stats with full scan on our key tables every night. I noticed the other day that there was a command update statistics with full scan that was threaded out, meaning it went parallel. My concern is that I currently have MAXDOP set to 1. Do some Microsoft commands simply ignore MAXDOP and do what they want?”

Erik Darling: Yes, they do.

Brent Ozar: Yes, they do. Does update stats do that? I don’t even know.

Erik Darling: Yeah, I want to say it’s Bob Ward, but it might have been another Bob or someone else at Microsoft, but I just associate most of the really good blogs with Bob Ward—or the Bobs, the Bob blog. But there was one, I think it might have been during his 2016 “It Just Rans Faster” talk. I think he was talking about how sometimes even on a server with MAXDOP set to 1 you’ll still see some CXPACKET waits for some things. That’s because some system commands will use parallelism behind the scenes, whether you want them to or not.

Brent Ozar: Which is cool. I like that.

Erik Darling: I think stats is one of them. There was that trace flag where you could update more than one stat at a time on a table. That was one of those things where it was like, “No, these are definitely going to go parallel because you probably need it if you’re doing all that.” But there was something about parallelism in that one too.

Brent Ozar: Nice.

 

Brent Ozar: Oh, god, Tammy asks a question. I’ll try to keep the poker face on for Tammy there. “You are all smart so I thought I’d ask.” There’s your first mistake. “I want to query Active Directory.” Okay, strike one. “Using PowerShell.” Strike Two. “To get all the CNAMEs for my servers.” Strike three. “Any suggestions?” Mike Fal. You want Mike Fal. It’s like the word fall but only with one L. Mike Fal—really nice guy, PowerShell guru, knows systems-type stuff and he may be able to answer that one for you. If not, I would actually post it on Stack Exchange, dba.stackexchange.com, and I bet somebody has got a script for it already.

Erik Darling: If you’re more of a Slack person, the SQL community chat has a big PowerShell room where they work on DBA tools. Someone in there might be able to help because, shot in the dark, but I’m willing to bet someone in there is maybe doing something with log shipping where getting CNAME alias would be really helpful.

Richie Rump: As opposed to going to DBA, Server Fault is probably the better one to post that in.
Brent Ozar: Oh yeah, you’re right. Yeah.

Richie Rump: Server Fault, so not the dba.stackexchange.

Erik Darling: It’s weird that there’s no powershell.stackexchange.com, right? You’d think they’d want to keep those people out of general population.

Brent Ozar: I thought when Erik said, “go to the Slack channel,” it’s basically because if you tell someone, “Can you do this in PowerShell?” the answer is always yes.

Richie Rump: Yes, there’s lots of Thors wielding the PowerShell hammer in that room over there.

Brent Ozar: Which is good, we love them. Nothing against them. DBAtools.io is where you’ll go find their site and their Slack channel.

Erik Darling: I change my underwear with PowerShell.
Brent Ozar: Only once a week though.

Erik Darling: True story.

 

Brent Ozar: “Would like to get Brent’s take on vMotion for high availability. Can it replace SQL Server clustering?” Erik, how about your take?

Erik Darling: I don’t have one. You know me and VMs, man. I never spin those up. So VM, vMotion, no, because if SQL crashes, all right, where a cluster would pick that up and say, “Hey, we’re going to try to restart SQL or I’m going to try to restart SQL on another node,” VMware doesn’t really cover that too well as far as I know. So I don’t know if that’s the way I would go with it.

Richie Rump: We could actually say that we got a call on Sunday with this exact problem, where they had a vMotion on a backup and they thought that was adequate and the whole database got corrupt and there was a whole bunch of other stuff and they couldn’t get it back. So there’s your answer.

Brent Ozar: That wasn’t even availability, let alone high availability. It was un-availability.

 

Brent Ozar: Next question, “What’s the best practice on Windows Server 2012 page files? As my Window’s admin says we only need 16 megabytes for the crash dump and then leave the rest to physical RAM but I’m seeing a scenario where SQL Server needs more memory and it’s showing RESOURCE_SEMAPHORE waits.” Okay, so hold that thought. So, Erik, do you want to answer that one? Would you increase the page file size…?

Erik Darling: No. Because I really wouldn’t want SQL Server paging that to disk. I would probably want to add more actual memory or increase the max memory if it’s undersubscribed right now. Like if you have more memory in the server than you’re giving to SQL, but really RESOURCE_SEMAPHORE means that SQL Server does not have enough physical memory. That is not a page file problem. Page file should be teeny tiny on a SQL Server because you shouldn’t be doing anything with a page file there. You should avoid that like the plague. Eek, no. You know what’s funny? During the first DBA days someone was like, “Should I have a 512 gig page file?” I thought that you guys were messing with me. I thought I was being hazed. It was like, “Did you ask me that right?” I was like 512 gigs? No. I was like they’re screwing with me. Like Jeramiah is going to come up behind me and give me an atomic wedgie after I answer this question and is going to like Indian burn. There’s no way this is a real question. Really, no. RESOURCE_SEMAPHORE waits are horrible. They are poisonous. So RESOURCE_SEMAPHORE, you may also see RESOURCE_SEMAPHORE query compile, when that pops up and that is stuff just does not have enough memory to do anything when that happens. You have queries waiting to compile because it can’t get memory to compile and queries waiting to run because they can’t get memory to run. I don’t see the question but if you’re on a newer version of SQL Server, you can run sp_BlitzCache and sort by memory grants. So if you’re on SQL Server 2012 SP3, or 2014 SP2, or 2016 any ole version, you can run BlitzCache and sort by memory grants and get all sorts of good information about what queries are using giant chunks of memory and maybe get a hint there. But really, if your server has like 12 gigs of memory in it, that’s your bigger problem, not the query memory grants.

Brent Ozar: I have to say, that’s one of my favorite questions ever. I like that quite a bit.

Erik Darling: Well then you answer it.

Brent Ozar: I saw your face go and I’m like, there he goes, that’s awesome.

Erik Darling: It just reminded me of that DBA Days thing that I got all nostalgic and I was like they asked me if I should have a 512 gig page file.

Brent Ozar: And now those of you watching on the webcams can see Jeremiah sneaking up and giving Erik an atomic wedgie.

 

Brent Ozar: Next, “We have a transaction log backup that keeps hanging, meaning it runs for hours until I manually cancel and restart it. Normally it only takes ten seconds to run and it just happens randomly. Any suggestions on where to start investigating?” I like WhoIsActive for that. I would run sp_WhoIsActive and see if there’s either that something is blocking it, because I can envision scenarios where MSDB, there’s blocking in MSDB, and the thing isn’t able to update the MSDB completion-type stuff. Or, that it’s waiting for something, like a wait stat shows something, growing a file, creating a file on a network file share. That’s probably where I would start.

Erik Darling: So like what waits would you look for? Would it be backup I/O or async I/O completion or backup thread? Or just one of those things maybe just going?

Brent Ozar: I have a feeling that it’s—and of course, I’m just making stuff up—but I have a feeling that it’s creating the file. That it’s either creating a file or writing to the file. Like if you have a network blip and suddenly there’s a problem writing to the network server on the other end.

Erik Darling: Or like if it happens at a particular time where it’s kind of like it’s repeatable, I bet a bunch of stuff is writing to that drive at the same time and it’s all just kind of saturated, that could be it too.

Brent Ozar: Yeah, over and over again I see DBAs go set up all their log backups to run every 15 minutes and everybody is in like nuclear clocks synchronized so exactly at 15 minutes the file share just falls over.

 

Brent Ozar: Fellow human says, “vMotion cannot get guarantee transactional integrity.” Put the crack pipe down. It totally works just fine in terms of transactional integrity. What you might be thinking is it can cause a cluster to failover, or database mirroring to fall over, but it is completely transparent to SQL Server otherwise.

Erik Darling: One thing that does come up sometimes with vMotion is that if Windows crashes, you have the reboot time wherever it brings back up. So you know, it’s not as fast as a cluster where that other node is just sitting there like, “I got Windows. I got some ready to go.” It’s just like where you fail SQL Server over and it’s just pretty much the SQL Server startup time and whatever resource failover happens.

 

Brent Ozar: Follow up on the page file question. He says, “We did increase the physical RAM but then earlier we’d had a paging issue and modified memory grew. Even after we added physical RAM the modified memory is not getting released.” I’m not sure what you mean by modified memory. If you want, search for—just hit Google and Brent Ozar “Sysadmin’s Guide to SQL Server Memory.” We explain how the basic memory process works in there and whether or not it gets freed up and why.

 

Brent Ozar: Next one, “I need to migrate a bunch of larger databases and I can’t take a significant outage on Saturday night.” Okay, well, so how about Sunday? Sunday works for me. Sunday works not for me, I’m going to be somewhere else. “I was thinking about Saturday morning restoring the last full backup and diff and then an hour before a short outage, I take this…” All you really want to do is log shipping. Either do log shipping or database mirroring. I’m kind of a fan of log shipping for this because it works kind of like what you’re describing. Database mirroring is a little bit more painful to setup. It’s still pretty easy, but log shipping is really slick. We actually cover this exact technique for migrations in our Senior DBA class, which I happen to be teaching this week online. Yesterday, I was talking about how you use log shipping to migrate databases from one place to another using the WITH NORECOVERY for your last transaction log backups, so check into that.

Erik Darling: I think I’m working again.

Brent Ozar: I don’t know that you’ve ever really worked, but you’re audio is working.

Erik Darling: Before I worked here. Before I got hired here rather. I did quite a bit of work, sir.

 

Brent Ozar: Eric asks, “I went away from Idera’s defrag manager and I went to Ola Hallengren scripts. However, the update usage and stats step is trashing tempdb. Why is this? I’m on SQL Server 2008 R2.”

Erik Darling: Update usage? Ola Hallengren runs update usage?

Brent Ozar: I’m surprised. I can see it in the CHECKDB step maybe, maybe in the CHECKDB, it has DBCC update usage, but I’d be surprised at that too.

Erik Darling: No, no.

Brent Ozar: No.

Erik Darling: I’ve never seen that. I used to run that all the time and I never ran into that.

Brent Ozar: Stats step too, I mean, updating statistics that would trash tempdb, I would imagine, so I’m going to say something horrible. Anything Ola’s stuff does, Idera should have been doing. So that would kind of make me wonder the opposite thing. If Ola script is doing it, I wonder why Idera’s is not.

Erik Darling: Yeah. I would also be curious about what values you’re running Ola scripts with. That’s what I would have some questions about because you know like for the years and the number of machines that I’ve run Ola scripts on, I never had a problem with them trashing tempdb. The one time that a stats update trashed tempdb was when I updated with a full scan all the stats on a table used for some kind of logging that was around 2.8 terabytes in size. So that was the one time I messed up tempdb doing a stats update. That wasn’t Ola though, that was me. That was one file that got real big.

 

Brent Ozar: “What kind of advice do you guys have for managing batch loads to a database while end users are still querying it? Like do you manage locking when you’re doing batch loads?” Richie, how about you? Have you got experience with that? Like doing batches?

Richie Rump: While they’re querying? No, not that I’ve had any problems with locks or anything like that. So not on some really high transactional system. We’ve always been able to dump data and locks are handling it. Usually if your hardware is good enough, you should be able to handle pretty much anything.

Brent Ozar: So they’re basically saying that your hardware isn’t good enough. That’s what Richie is saying here. Have an SSD, good luck. How about you, Erik? Any advice on that one?

Erik Darling: It depends on what kind of situation you’re in and how you’re doing your loads. If you’re using bulk insert, you can control the batch size so that maybe you don’t put a million rows in at once. You can do like smaller chunks going in. If you’re on expensive edition or 2016 SP1, you have some partitioning tricks you might be able to do so you don’t totally wonk up people’s stuff when you’re loading data into a new partition and then swapping it in. I guess as a third option, if you’re on just about any edition and you sort of have some competency, you could always try using optimistic isolation levels to cut down on the amount of locking problems you have while you’re pushing data into a thing, jam it on in there, like Al Bundy with a fat woman’s foot.

Richie Rump: I love bulk loads. It’s so awesome.

Brent Ozar: I’m just going to leave that there. I’m not going to ask anything about that. I’m just going to leave that just where it is.

 

Brent Ozar: J.H. asks, “I have SQL Server 2008. I’ve got the latest and last service pack applied to it. Is it possible to apply vulnerability hotfixes afterwards? Should a specific KB article be applied to fix it?” That’s a great kind of question that you can go ask Microsoft Support or go into stackexchange.com. Usually, if you go filter for those kinds of KB articles, you can find more details about those. We don’t do any patching on SQL Server 2008. Nothing against it, it’s just not something that we go dig that much into.

 

Brent Ozar: Fellow human says, “I’ve discovered ten custom statistics. Thank you for sp_Blitz. I deleted them. As a result of deleting my custom statistics, CPU utilization dropped by 20 percent. Does that mean that custom statistics were giving me suboptimal execution plans?”

Erik Darling: Maybe.

Brent Ozar: Maaaaybe. If you suspected that a change to SQL Server was making execution plans better, where would you go look?

Erik Darling: If a change to SQL Server was making execution plans better, where would I look? Well that’s really hard for statistics because you don’t know what statistic SQL Server is using unless you use some black magic trace flags. So it’s tough to look there for statistics. You don’t get a lot of information about stats the way you do about indexes. You don’t get like usage counts or like how fragmented they are. Just kidding. But you don’t get that kind of information about statistics unless you use some crazy trace flags out there, which you can Google about. Paul White talks about them in several blog posts. I use them in several blog posts but you should really go back to the source and look at Paul White’s blog posts about where stats are getting used.

Brent Ozar: You would have to know the queries ahead of time and run them with those trace flags in order to know whether or not it was getting better or worse. I would just, if I was you, take full credit for it and tell your boss, “Yep, I totally did that and then keep right on trucking.”

 

Brent Ozar: On that point, “We dropped 400 hypothetical indexes the other day. Hypothetical indexes have real stats. My mind was blown. In all my years of being—in all my year—of being a DBA, I’d never seen that.” Well, yeah, because SQL Server has to figure out whether or not it would use those hypothetical indexes. So it builds stats on them when it goes and builds the hypothetical index.

 

Brent Ozar: Christopher says, “I have a SQL Server 2012 Always On Availability Group created and it works well. However, a dev has asked me for a reporting server off of this data. I don’t want their reports to interfere with production. Do I make a new separate SQL Server or can I just add it to the existing secondaries and make sure they put Application Intent = Read Only and how do I police that?” That’s a good set of questions.

Erik Darling: I wish Tara was here.

Brent Ozar: So the first thing you want to know is like is it synchronous AGs? If you’re doing asynchronous AG, then putting queries on the secondary can totally slow it down. Even in an async, you can get into positions where queries can cause locks that will slow down applications on the primary. I wouldn’t let that slow me down. If you’re interested in it, what I would let it slow me down is the licensing because if you now have to run reports against a server, it now has to be licensed. So I would want to check with management and see, and of course that’s Enterprise Edition if you’re talking readable secondaries, that’s real money.

 

Brent Ozar: Interesting question. “Have you guys considered talking about any add-on tools such as MDS, SSIS or Analysis Services in your podcasts?” No. Only because we don’t use it. We’re super specialized in just the engine. We just focus on the database engine. Even if you just look at the database engine, the size and scope of that thing has grown so much in the last two releases: R, PolyBase, Hadoop, columnstore indexes, CLR, DLLs. I mean you could go on and on for all the stuff in the engine that we don’t even cover. So yeah, we try to just be good about queries, indexes, high availability, disaster recovery.

Erik Darling: There are so many people out there who are already good at that stuff and specialize in that stuff that if I need to figure something out or help someone with something or learn something about a particular topic outside of the core engine realm, then I can go Google it and read about it. But me, I still haven’t opened a single one of those and I install SQL on my laptop. I only install the core engine, and well Management Studio, it’s a separate, but I don’t touch that other stuff.

Richie Rump: If I have to touch SSIS again, I’m going to run out of this room screaming.

Brent Ozar: Every time there’s a new release I always add in all the extra features that are new and then I discover within a month that I don’t touch them. Like I installed R and I saw that it added like 20 service accounts to my engine and I’m like what in the Sam Hill is this? I don’t even know what an R script looks like, so yeah.

Erik Darling: It’s not that I wouldn’t want to learn R, because like Buck Woody talks about R and makes it super interesting. I always want to look at his blog and be like, “Oh, cool, let me see what’s going on here.” But then like at the end of the day, I just have so much regular SQL stuff to do and learn and write about that I just can’t. Unless I was really forced to, I don’t think I could invest that much time in R.

Brent Ozar: Yeah, and I’m not going to be the guy who forces you to do that.

Richie Rump: So the answer is can’t touch this, that’s pretty much…

Brent Ozar: [Hums U Can’t Touch This by MC Hammer] If you want to learn about the stuff that’s outside of SQL Server’s engine what you should do right now is, for those of you who are watching the live podcast, go over to groupby.org—as in the SQL statement—groupby.org has sessions available for voting right now. There are sessions in the running for R, columnstore, Apache Zeppelin. One of the ones that’s highly ranked right now is SSRS, a high-performance SSRS session that looks like it’s going to make the cut. So if you’re listening to the podcast afterwards, voting is already finished. Voting closes Friday, December the 16th. Then the live free webcast will be on Friday, January 13th. So you can go over to Group By and register to attend the free webcast.

Erik Darling: One of the crazy things about all these new features is that sometimes you just have to be your own expert in them, because there isn’t one. There’s no, you know, no one knows all the best practices around PolyBase and that stuff these days. There’s no one who’s an expert in it. So maybe you can be the expert in it. Maybe you figure that all out.

Richie Rump: Maybe you can be Lewis or Clark—which one shot himself in the head? Lewis?

Brent Ozar: Every time there’s a new feature out, like before when columnstore came out, I’m like, you could be the Kalen Delaney of columnstore. You can be the Paul Randal of R. It’s just wide open to anybody who wants to blog about it. But you are going to have to use it in production, get your hands dirty, and then blog and share what you learn. All right, well thanks everybody for coming and hanging out with us this week. We will see you guys next week in Office Hours. Adios.


Indexing Temp Tables

People often don’t give this thought

Which is a shame, because I see people sticking fairly large amount of data into temp tables. On the rare occurrence that I do see them indexed, it’s a nonclustered index on a column or two. The optimzer promptly ignores this index while you select 10 columns and join 10,000 rows to another temp table with another ignored nonclustered index on it.

Now, not every temp table needs a clustered index. I don’t even have a rule of thumb here, because it depends on so many things, like

  • Are you joining it to other tables?
  • What does the execution plan look like?
  • Is this even part of the stored procedure that needs tuning?

Pretend That We’re Dem

Assuming that all three things lead us to index temp tables, let’s create some and see how they react to different indexing strategies.

We have three temp tables with 100k rows in them. I joined them a bit to make sure we get good correlation in the data.

If we get the plan for select statement, this is our query plan. A bunch of table scans and hash joins. SQL is lazy. I don’t blame it one bit.

I'm so parallel.
I’m so parallel.

 

After frequent user complaints and careful plan analysis, we decide to add some indexes. Let’s start with nonclustered indexes.

I’ve gone ahead and added two indexes to #TempPosts so the optimizer has a choice if joining in another order is more efficient.

How’d we do?

Not so hot.
Not so hot.

 

Not quite there. Two table scans and an index seek followed by a key lookup that SQL estimates will execute 62k times.

What's wrong with your hair?
What’s wrong with your hair?

Do over

Let’s try that again with clustered indexes. We can only give #TempPosts one, so I picked at random. In real life, you might want to do your job!

Blastoff
Blastoff

 

Our indexes get used and everyone is happy. Well, sort of. TempDB isn’t happy, but then it never is. It’s like your liver, or Oscar the Grouch.

Used and abused

When starting to index temp tables, I usually start with a clustered index, and potentially add nonclustered indexes later if performance isn’t where I want it to be.

You can do yourself a lot of favors by only bringing columns into the temp table that you need. If it’s still a lot, there’s a much higher chance SQL will judge a table scan cheaper than using a narrow nonclustered index along with a lookup.

Remember to test this carefully. You want to make sure that adding the indexes doesn’t introduce more overhead than they alleviate, and that they actually make your query better.

Thanks for reading!


Build Me A Build: What’s Your Hardware Budget?

Hardware
51 Comments

UPDATED 12/15/2016

Spending Brent’s Money

Reason number infinity spacetime + 1000 why working here rules is having a generous annual tax writeoff hardware budget. The first time around, I basically blew it all on working from home essentials. This time around I’m building a cool toy: a big beefy desktop.

Why, in the age of the cloud, am I doing this? Well, our AWS lab is cool, and has versions going back to 2005 for testing script compatibility, but I have to turn it on and off, spinning up new machines adds new costs, and spinning up enough to account for the different HA/DR scenarios I’d like to explore would be annoying. There are other reasons the cloud can be difficult: networking, good I/O is expensive, and emulating shared storage is a bummer.

With a beefy home desktop, I can create my own VPC with all the VMs I want, my own AD, and with enough hardware, the limits on what I can set up are few. I can have VMs for Log Shipping, Mirroring, Failover Clusters, and AGs. I can also mix features to look at interoperability.

Doing that on my laptop would kinda stink, especially because most of the time I’m also trying to work on it. That means I have a bazillion Chrome tabs open, Slack, PowerPoint, Excel, SSMS, and various productivity apps like Spotify. With 32 GB of RAM, I have plenty of room, until I start doing anything with the Stack Overflow database. While I could get a second laptop, it costs a lot more to put in commensurate hardware, and you hit hardware caps a lot earlier than you do with desktops.

TABLES

 

Shutting SQL down helps, but I still have a lot going on.

Shut It Down

 

My CPUs are generally quiet. These aren’t a problem, until I start doing something stupid in a couple different SSMS windows while I’m trying to do other stuff. Then things get messy.

Palooza.

Choices

Since a desktop would be the best fit, I started trying to find one that would work for me out of the box. Cursory Google searches for “128 GB desktop” were useless. Sites like Logical Increments can give you a decent baseline for building a gaming computer, but the RAM tops out at 64 GB, and you can’t mix and match components to test different combinations for different price points. This didn’t help me because I don’t need a crazy graphics card, but if I had to build a gaming PC, I’d probably start here.

Where you can play the mix and match game is PC Part Picker. Here you can mix and match, you get warnings if parts are known to be incompatible, and you can keep an eye on your power draw vs your power supply. Good? Good.

Honorable mention: Buying a barebones Dell Precision workstation and outfitting it with the parts I want. You get the case, the motherboard, a power supply, and (usually) a fan. The rest is up to you. My processor choices here were limited to Xeons, which wasn’t a dealbreaker, but the sourcing made it a bit more difficult to check for incompatibility issues.

What did I end up with?

A beast by any other name

 

The link to the build is here, if you want to play with it.

During the process, I sent my potential build to a couple people who really helped me out. My Favorite College Boy, Jeremiah Peschka, and The Man Whose Tweets Change Software Development, Nick Craver. While I’d love to build the official Stack Overflow developer desktop, it’s a little too far above my budget cap of $3000. I could have split the costs into next year, since it’s so close to the New Year, but whatever. I’d rather save that just in case I have a laptop failure, or Mark Rippetoe designs a computer desk that’s also a power rack, for full

SSD

HIP

DRIVE

Are you out there?

So, hardware people, have at it. See any glaring problems? Obvious tweaks? Better choices? Let me know. I haven’t started ordering parts yet.

I’ll be blogging more about this when I do start ordering parts and building in a week or so. It’s my first from-scratch desktop build, so it should be a hoot.

Thanks for reading!

UPDATE!

After tons of great feedback in the comments section, I made some changes to my build. Most notably, upped the CPU, downed the GPU, and went with the slightly newer M.2, plus I got rid of the monitor, since I really don’t need it. I also took off one SSD to compensate on the price. I’ll probably end up ordering drives after the 1st to fill up the ICY DOCK, but it’s okay because I don’t need those immediately. I have no idea how long it will take me to get parts and build this, so waiting on extra drives isn’t a big deal. Of course, here’s the link if you want to mess with it.

Brand new

Brent says: It’s crazy that you can get 128GB RAM in a desktop for under $3k. Also crazy: 38″ ultrawide curved monitor for $1500. When you get $3k/year for hardware, you can start making these longer-term investments in your working gear.


Things I Would Love to Put In sp_BlitzCache

This is my favorite script to work on

Getting stuff to work sp_BlitzCache is so rewarding because XML querying is so ridiculous. I wish Mikael Eriksson would write a book already; it’d make my life a whole lot easier.

Along the way, I see a lot of things that I would love to warn people about, but it’s either not present in cached plans, or not retained in the XML at all.

Here’s a list of my top 5 Query Plan XML wishes

Isolation Level

It would be great to be able to call out potentially harmful isolation levels.

Aside from read uncommitted, being able to warn about repeatable read or serializable would be nice. Especially when it’s set at the query level.

On a related note…

Locking Hints

There’s an open issue for this in the GitHub repo.

The best option so far is wildcard searching the query text for hints like ROWLOCK, PAGLOCK, HOLDLOCK, AND UPDLOCK. That stinks, especially for long queries.

This seems like really logical information to include. If someone didn’t put the hints in, the SQL Handle would be different, etc.

Partition Information

This is only available in actual plans! You can see which partitions were used, and which were eliminated.

At minimum it would be nice to tell people if they’re getting any partition elimination, but that’s a no go with cached plans.

TempDB Spills

Another actual-only piece of information. I understand why this isn’t in the XML. One execution might spill, another might not.

It would be cool to have aggregate information about spills available somewhere, like sys.dm_exec_query_stats though, since so much other related information is in there.

Cascading Foreign Keys

Did you know that when foreign keys perform cascading actions (update/delete), they take out serializable locks under the covers?

That stinks. And the actions just get tacked onto the plan without any notice. Catching these and triggers would be fun.

People might be surprised how much weight they can tack on.

One po-table, Two po-table
One po-table, Two po-table

Is there anything I’m missing?

If you take a look at the showplan schema, there’s tons of information to harvest. If there’s anything you’d like included in sp_BlitzCache, head on over to GitHub to make a request.

Just make sure it can be caught in cached plans, and that you can provide a repro script. There’s a lot of great stuff that’s actual only in all that mess, like the newly added wait stats.

Thanks for reading!


First Responder Kit Updates: New Checks Across the Board

First Responder Kit
1 Comment

It’s the December to Remember Script Event, and we’ve got updates to everything. Shout out to our contributors this month: @BlitzErik, @RichBenner, @runamuk0, @SQLMonger, and @BrentOzar.

Grab the newly updated First Responder Kit, and here’s what’s in it:

sp_Blitz Improvements

  • #591 – add check for 2016 Standard installed without Service Pack 1 (coded by @RichBenner)
  • #615 – add check for trace flag 834 used in combination with columnstore indexes (coded by @RichBenner)
  • #603 – add check for min server memory = max server memory (suggested by @BlitzErik, coded by @RichBenner)
  • #598 – added checks for non-default database configuration scoped settings in SQL 2016 (suggested by @HenrikStaunPoulsen, coded by @BrentOzar)
  • #593 – added check for Instant File Initialization using new sys.dm_server_services field in SQL 2016 SP1 (suggested by @BlitzErik, coded by @BrentOzar)
  • #592 – added check for Lock Pages in Memory using new sys.dm_os_sys_info field in SQL 2016 SP1 (suggested by @BlitzErik, coded by @BrentOzar)
  • #582 – added check for TempDB file creation error (suggested by @BlitzErik, coded by @BrentOzar)
  • #616 – improved Target Recovery Time warning because some SQL tools set it to 0 and some set it to 60 (suggested by @aclenscorp, coded by @BrentOzar)
  • #619 – fixed case sensitivity bug introduced in another patch (coded by @mikejwhitty, only visible to folks like us who work in the dev branch)

sp_BlitzCache Improvements

  • #617 – add alert for computed columns with scalar operators (coded by @BlitzErik)
  • #626 – add alerts for more kinds of expensive remote queries (coded by @BlitzErik)
  • #228 – skip offline databases (coded by @BlitzErik)

sp_BlitzFirst Improvements

  • #599 – made it more PowerShell-friendly by making all column names unique in the FileStats column output (coded by @runamuk0)

sp_BlitzIndex Improvements

  • #615 – added warning if trace flag 834 is used in combination with columnstore indexes (coded by @RichBenner)
  • #622 – stop counting hypothetical indexes in the total number of indexes on a table (coded by @BlitzErik)
  • #607 – now works in partially contained databases (reported and fixed by @SQLMonger)

sp_BlitzWho Improvements

  • #544 – added database name in the output to show the query’s context (coded by @BlitzErik)

[Video] Office Hours 2016/12/07 (With Transcriptions)

This week, Brent, Tara, Erik, and Richie discuss replicating TRUNCATE TABLE command between nodes, how to cope with social isolation that comes with working from home, database shrinking, how to improve log wait times, how to figure out the mac stop and threshold cost setting for servers, and laziest methods to deploy objects and run scripts.

Here’s the video on YouTube:

Office Hours Webcast - 2016/12/07

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Enjoy the Podcast?

Don’t miss an episode, subscribe via iTunes, Stitcher or RSS.
Leave us a review in iTunes

Office Hours Webcast – 2016-12-07

 

How do you convince the SAN admin that there’s problems?

Brent Ozar: Michella asks, “How do you convince a storage administrator that the database is having IO issues?”

Richie Rump: Chainsaw.

Tara Kizer: It’s a tall order. I’ve worked at large companies with large DBA teams and large SAN teams and talking with them about IO issues can be challenging because in their monitoring tools, they’re not seeing the IO issues that we’re seeing on the server. So they point the finger back at the server, at Windows, at SQL. We point the finger back at them. It can take a lot of tools to figure out where the actual issue is. Now the one problem I have with the monitoring tools that the SAN teams use is that they are averaged over a minute. So they’re not seeing the heavy spikes that we might be seeing because maybe it’s fine 45 seconds for each minute but that 15 seconds is where we’re being hammered and because it’s getting averaged out, they won’t see it. One thing, I’ve worked a lot with SAN teams as well as Microsoft to figure out where the issue is. You can open up a case with Microsoft for help in determining if it’s an issue on a server or somewhere outside the server. They use, it’s an ETW trace that they can help you with and they can interpret it. It will tell you, is the problem on your server or external to it. Now, if it’s external to it, you might have to get other people involved besides SAN teams. You might have to put some kind of—I forget what they’re called—but they can put some packet captures on there to see where the problem is because it could be fiber, could be a lot of different places.

Erik Darling: I would run CrystalDiskMark and I would show the SAN team the results I get from that. We have a blogpost on our site, if you just search for brentozar.com for CrystalDiskMark or diskspd, you’ll get some primers on a couple different ways to test your storage out. Now I would actually turn your question around and ask you why you think you have a storage problem. Then I would ask you to meet your SAN team in the middle on the connections to the SAN because that’s usually what the problem is. The SAN is usually cruising along, the SQL Server is usually cruising along, but you usually have a really crappy either single connection or like not a very good double connection to the SAN from the server. That’s what I would do there.

Brent Ozar: We also have a request for a shout out to Rich Hansel. So there you go, Rich. Howdy. Oh, wow. The hotel housekeeping staff said to send you the house shout out. I said, “Yep, you’re doing a much better job cleaning the room.” That’s totally not true.

 

Should I worry about trace waits?

Brent Ozar: Next question. “I am seeing trace write waits among my top wait stats. Should I be concerned about trace write waits?”

Erik Darling: Not really. I would only be concerned if I had like double the trace time waits as up time, because that might mean that a couple people are running traces. But if my trace write time to my up time is kind of one-to-one, I’m not that concerned. It usually means that there’s a monitoring tool or something else keeping an eye on your server, or it’s kind of one of those like benign waits that’s not really going to do much.

 

Do we still need maintenance plans with SSDs?

Brent Ozar: Man, Troy has a great question. I think we’re all going to like chiming in on this one. He says, “Our network engineers are replacing our spinning disk storage with solid state. Our vendor says with solid state we don’t need maintenance plans. Can I really get rid of my maintenance plans?”

Tara Kizer: I’m not sure what they mean by maintenance plans. Are they referring to the actual maintenance plans that SQL Server is providing or they’re referring to something else? Because I don’t know how disks are going to determine if you’re using maintenance plans or not. Regardless of my storage, I’m using Ola Hallengren solution for instance or some custom solution and I just stay away from maintenance plans regardless.

Erik Darling: We like to say that index fragmentation doesn’t matter anyway, so no matter what kind of drives you’re on, who cares. But don’t get rid of all your maintenance plans because some of them might be taking backups.

Tara Kizer: Oh, you’re saying that they’re probably referring to the index maintenance.

Erik Darling: Maybe, I don’t know.

Tara Kizer: So is the question if they can get rid of the index maintenance plan? Or maybe, you can still use a custom solution. So I’m not sure what exactly they’re saying.

Brent Ozar: I hope it’s not that they’re talking about getting rid of backups. That would be bad, or CHECKDB.

 

Does Metallica’s new album rock?

Brent Ozar: Next comment is, “Metallica’s new album rocks if you care.”

Tara Kizer: I do care. Metallica used to be my favorite band.

Brent Ozar: As long as I can still hear the Black Album, that’s all I really care about because I could keep hearing that forever.

Richie Rump: Jay Z’s Black Album? Or…?

Brent Ozar: Anyone who has a black album, I’m into it.

Richie Rump: I’m listening to The Hamilton Mixtape right now, so I’m not paying attention to you guys, so, sorry.

Brent Ozar: I believe it. I would sing songs off of there but I haven’t gotten that one yet.

 

What happens when I truncate a table in an AG?

Brent Ozar: Next up, Vladimir asks, “What happens when I do a TRUNCATE TABLE with Always On Availability Groups? How does the TRUNCATE TABLE command get replicated between nodes?”

Tara Kizer: It’s still a logged transaction, it just deallocates the pages, I think. So you’ll still get the truncate on all your replicas.

Brent Ozar: And it’s quick, it’s still quick.

Erik Darling: Well, faster than a delete anyway, usually.

Brent Ozar: Yes.

 

Should I add more memory incrementally or all at once?

Brent Ozar: Ryan asks, “When I add more RAM to the database server, like from 64 gigs to 128 gigs, is there any reason not to increase the max memory parameter all at once? Should I inch it up incrementally or just go straight up to the number I want to end up at?”

Tara Kizer: Just go straight up. SQL Server is going to incrementally start using it.

 

Does Agent have problems with daylight savings time?

Brent Ozar: Someone asks, “Doesn’t SQL Server Agent have the smarts to avoid daylight savings time problems?”

Erik Darling: Like what?

Brent Ozar: Elaborate on the problems you’re having. I would be curious to hear more about that.

Tara Kizer: I used to work for a job where our servers had to use GMT time zones that we could do easy conversions for all the time zones where our customers were. So there was a lot of code that you just had to go into—we were actually using UDF to do the time zone conversions at the time. A lot of work had to go into it once the daylight savings time rules changed. Now that was all custom code, there was no issue with SQL Server. I think that maybe around the DST savings time there was a hotfix and then it got put into a service pack. So just make sure you’re not running really old builds of whatever version of SQL Server you’re using.

Brent Ozar: Because those are really common for us. Whenever we code Agent jobs to put our own date logic in there. So that could be some of what’s going on with that.

 

How do you cope with the social isolation of telecommuting?

Brent Ozar: Fellow human says, “I’m starting a new job where I work 100 percent remotely.” Congratulations.

All: Yay, yeah.

Erik Darling: Welcome to the dream.

Tara Kizer: Pajama [inaudible] coming up.

Brent Ozar: “How do you guys cope with the social isolation of working from home?” Oh, that’s such a good question. Who wants to go first?

Tara Kizer: I don’t cope. I just stay inside now. [Laughter]

Brent Ozar: You like the isolation. You embrace the isolation.

Tara Kizer: I go hiking a lot. So my hiking game has definitely increased since working from home.

Brent Ozar: You can leave the computer for a period of time and not be on call. That is what I think is the most [inaudible].

Erik Darling: We’re talking to customers pretty much all the time over WebEx though. It’s a little bit less isolated. Plus, you know, I got a wife and kid here so if I’m ever feeling lonely, I can just stick my head out the door. But generally, I don’t know, I always kind of liked the alone time anyway even when I’m kind of locked away in my office pounding away at stuff. I kind of enjoy the solitude.

Richie Rump: I think I’m the only really isolated one where I don’t talk to customers. This is our once-a-week, everybody, where we actually get face-to-face time with one another. This is probably about it. I have a wife and kids here. They’re here all the time. Twitter is a good outlet for me when I just need to see what’s going on and chat with some folks. The company chat room is pretty active, especially when I start stirring the pot and then people know that, “Hey, you guys are too quiet.” But I’m with Erik, I prefer the solitude. I can buckle down. I can really think about things. I kind of need silence for that, or at least really loud music to drown everything out. So it doesn’t bug me much at all. But then again, I do my own podcast where I talk with people about their lives and stuff. I’ll go to user groups. I have my own user group. So there are outlets for me that are outside of really work stuff.

Brent Ozar: For me, I’ve been telecommuting since I think 2003. For me, it was really hard initially because I’m very much a people person. I love being around people. So I started with co-working. If you search for co-working spaces in your city, you can get a desk in an office with a bunch of other people. What I love about co-working is they’re not tech people. They’re wedding photographers, artists, salespeople, independent freelancers, just all kinds of things. But I do so much work with clients and training-type stuff that I can’t really be in an open [inaudible] anyway. So I just like, the instant 5:00 hits, I’m out. I like walk out and around. I’ll take a walk, go get coffee. 4:00, really. It’s 5:00 somewhere.

Richie Rump: Actually, Brent is a people person unless you put him in a conference with 32,000 of them. Then all of a sudden he does not become a people person anymore.

Tara Kizer: “Hope they don’t recognize me.” Brent’s got the most famous face in all of the SQL Server world.

Brent Ozar: Wear a hat down real low.

Erik Darling: Get you some of those Kool Moe Dee shades.

Richie Rump: Maybe a Geordi VISOR, that’d be awesome.

 

Can I add tempdb data files on the fly?

Brent Ozar: Alexandra asks, “We only have one tempdb file and then we have 16 cores on one production server. Can I just add another seven files during business hours, and if so, what’s the recommended size for each of them?”

Erik Darling: I would say yeah, go ahead. I add tempdb files whenever I wanted. Never really bothered me. As far as the recommended size, as long as it’s going to match the size and the autogrowth of the current tempdb file, or else that one tempdb file is just going to kind of black hole all the queries if it’s larger. Then if you make the other ones bigger, they’re going to absorb the work. So same size and autogrowth as the current file and then add them whenever you want.

Tara Kizer: If you don’t have enough storage to add seven more files of that same size of the first file, what I tell my clients is add seven files, divide your total size of your first file by eight and add seven new files with that size, same autogrowth as the first file, and then go back and shrink your first file down so it’s the same size as the other seven.

Erik Darling: Database shrinking? Oh no!

Tara Kizer: Sometimes we do say shrink. Shrinking is bad but there’s some instances where you need to do it.

Richie Rump: Shriek. They meant shriek, databases shrieking.

 

More about the Agent job issue (and bugs in general)

Brent Ozar: We have a follow up on the Agent job question. He says, “My SQL job was supposed to run every ten seconds all the time but when daylight time fell back, it did not run for an hour.” Oh, that’s really interesting.

Erik Darling: Oh.

Brent Ozar: I would want to know more about what happened with it because it could be a lot of interesting things around like job history tables or whatever. But if you wanted to, I would post that at dba.stackexchange.com and include as much logging detail as you can because I haven’t run into that. When times fall back, it still continues to run every minute for example but it would be interesting to see.

Tara Kizer: You might need to open up a support case with Microsoft on that one. It might just be a bug.

Brent Ozar: I’m pretty sure it’s not. I’m pretty sure it’s just you.

Richie Rump: We’ve never seen bugs in SQL Server. That never happens.

Brent Ozar: One of my bugs got fixed today.

Tara Kizer: Wow, which one?

Brent Ozar: The one when you save an execution plan and then you go to click file, open recent, it’s not there.

Tara Kizer: Oh, I don’t even use that feature.

Brent Ozar: … every Connect item I’ve ever put in, that would be the lowest priority.

Erik Darling: It was also the easiest one.

Richie Rump: Yeah, the MRU list, that one is the easiest one. Sorry, guys.

Brent Ozar: Chris says, “I had a job where I was the only IT person and now I have a whole team of goofballs and I think it has lowered my stress dramatically.” Well you know, you can have people who are goofballs who aren’t in IT as well.

Tara Kizer: All the weight is lifted off your shoulders when you have lots of people doing the work.

 

Why is my query slow in the app, and fast in SSMS?

Brent Ozar: Larry asks, “We have a simple select from a view. It has two where conditions. When I run the same query from the query window, it chooses a different index.” So he’s saying when he runs it in the application, it performs differently than when he runs it in SSMS.

Tara Kizer: Brent, are you going to say your article, “Slow in the App, Fast in SSMS?”

Brent Ozar: Go right ahead.

Erik Darling: Mark it.

Tara Kizer: Oh no, you guys go ahead. I don’t know the name of the author. I mean, I know how to answer the question but take it away because I don’t know who has that web page.

Erik Darling: Erland Sommarskog.

Tara Kizer: Okay. What the issue is likely is that when you connect from an application, you have different set operations than when we connect in SSMS. So when I’m trying to troubleshoot why it’s slow in the application and fast in SSMS, I’ll run a, let’s say just a very, very quick profiler trace. Only leave the existing connections in there, that event, and then start it and then immediately stop it. Then find your user ID and then grab those set operations that came out of the profiler trace. Dump those into Management Studio. Then run your query in that same window. So that query window has the same set options as your application user. So when the set options are different, you will have your own execution plan and it can be different than what the app has. You can also get this through a DMV query, I believe, the set operations, but I’m just so used to doing a profiler trace with existing connections and only running it for one second. So it’s easy for me.

Erik Darling: Also, if both of the plans are in cache and you can catch them with sp_BlitzCache, if you scroll all the way to the right in BlitzCache, it will give you the set…

Tara Kizer: It does? Wow, I didn’t know that.

Erik Darling: Yeah, no one ever scrolls that far.

Tara Kizer: I only ever scroll to the right to look for the cache time and last executed and the execution plan. I pretty much ignore columns, I should start looking I guess.

Brent Ozar: I forgot it was in there until this week.

Erik Darling: You should see what happens when you put in expert mode equals one.

Tara Kizer: Oh. See I didn’t even know it had an expert mode.

Erik Darling: Yeah.

 

Why are my tempdb data files in use?

Brent Ozar: Bert says, “I’m trying to reduce the number of tempdb data files I have. I had four, then I tried six, and now I want to go back to four but every time I try it, SQL Server tells me the file is in use, even after a reboot.”

Tara Kizer: Probably have to run alter—whatever it is—empty file. Yeah, it’s shrink command. You have to specify the empty file. It will empty the file and it will no longer be used and then you’ll be able to remove that file.

Brent Ozar: Yeah, there’s also a—one of you guys told me—blowing the plan cache. You can have a plan cache issue. Was it you, Erik? Or was it Tara?

Tara Kizer: It was me. I encountered that at my last job. So yeah, there’s a blog post on that. But the shrinking wasn’t working I believe. We couldn’t shrink it down and had to [inaudible] the procedure cache.

Richie Rump: You can’t blow in it like an NES cartridge?

[Laughter]

Erik Darling: You can, but you may have mixed results.

Richie Rump: You have to do it again, and again, and again.

 

If we have high log write waits, what do we do?

Brent Ozar: Someone says, “Thanks for the Christmas card. If we have high log wait times on an app and we can’t change the code,” it’s relativity, they say. So I’m teaching a class in here this week and immediately one of the guys from [inaudible] pops up, “what?” “Any suggestions on how to improve this?” So it’s long wait times, and he says log wait times. “I’ve read it helps to commit less often but since I can’t change the code, is there anything I should do?”

Erik Darling: Get faster drives.

Brent Ozar: One of the reasons why stuff like relativity will have high log wait times is you’re writing a lot of stuff from the database. You’re doing lots of inserts which take a long time to write. You could get faster transaction logs. You could—I have an awful idea, but I’ll wait to see what Erik is going to say.

Erik Darling: My whole thing for that is just the faster disk because you’ve got lawyers going through and updating one thing at a time and that triggers the audit record insert and then there’s just a whole bunch of other stuff going on under the covers when they change something, redact something, and make a production. There’s a lot of stuff that happens whenever you change one thing. You’re not just changing one thing. There’s a lot of little other things going on. So the faster drives is always a good one for me. But I’m sure Brent has something nefarious to say.
Brent Ozar: I have a really bad answer and you shouldn’t quote me on this and surely it’s going down in the recording, but if you’re on SQL Server 2014…

Erik Darling: Agh.

Brent Ozar: Yeah, right? … delayed durability, which means your inserts, updates, and deletes are consider committed as soon as SQL Server hears them. They don’t even harden to disk. Now, this is so bad for two reasons. One, you can lose data when you’re not expecting it, like when SQL Server fails over. Whenever there’s a failover, anything that didn’t make it to the disk to log file, you’re out of luck. Second, you could lose data even on a graceful shutdown. SQL Server does not flush the log file to disk when you do a graceful shutdown. So this really is like a last resort if you didn’t care about the data. And when it’s relativity data, I would find it very hard to believe you would not care about that data.

Brent Ozar: Jackie asks, “How do you guys calculate…?”

Richie Rump: Wow.

Brent Ozar: Right, I know, see? And aren’t you proud to work with me? This is why I’m not allowed in front of people very often.

Richie Rump: It’s like how do NoSQL SQL Server. Hmm.

Erik Darling: You know what? They’ll put out that as a feature but they won’t put out un-log tables as a feature. They’ll un-log everything but you can’t just choose a table for it. What a–.

Brent Ozar: Either you care about nothing or you care about everything. There’s nothing in-between.

 

How do you calculate MAXDOP and Cost Threshold for Parallelism?

Brent Ozar: Jackie says, “How do you guys figure out the MAXDOP and cost threshold settings for servers?”

Richie Rump: 20-sided dice.

Erik Darling: That’s not a bad answer.

Tara Kizer: We have recommendations but it’s going to be dependent upon your server. We recommend cost threshold for parallels be 50. Five is the default and it’s a really old value and we think Microsoft should really raise it but they like backwards compatibility. MAXDOP, maybe half of your processors, up to eight, usually don’t go over eight. Then if any queries that you have are suffering with those server settings, then you can add the OPTION MAXDOP to that individual query to affect it, maybe to use less processors or maybe even more.

Brent Ozar: You’ve got lots of options once you start playing around. You can even do things with like resource [inaudible] guides but of course that takes much more work. So the 50 and up to eight works really well.

Erik Darling: You know what, there’s actually a good script on dba.stackexchange.com, like how to calculate MAXDOP. I think that guy Ken wrote it and it’s just a script that you run and it basically gives you like however many cores are in a single socket up to eight. So if you have dual sixes or dual fours, it will give you six or four.

Brent Ozar: If you happen to find that, make yourself a note and send it to me because I’m doing like an all-Stack week’s link for next week.

 

How should I deploy changes to my database?

Brent Ozar: Next one, “I use a SQL command script to run deployment scripts.” Whoa, my god, my condolences. Richie immediately is even like backing up. Tell me more.

Erik Darling: Are you on Express Edition? Why are you doing that?

Richie Rump: Oh no, don’t tell me more, not at all. No, I don’t want anything to hear about this.

Brent Ozar: “I hear there are deployment tools that do sync and compare and I’m looking for the laziest method to deploy objects and run scripts.”

Tara Kizer: I don’t know, SQLCMD is easy. You’re just typing in a command and you’re running a SQL file through it. But as far as what other people have used, my last job used Visual Studio’s schema compare to come up with the scripts and then we would just run those in Management Studio. But I have used the SQLCMD method, not because I chose it but because that was what was delivered from development. That works fine too. It is a super easy way to do it, but who’s writing the SQL script? If you’re the one writing the SQL script, are you asking, how are you generating all the commands to compare, to development, to production so you do the deployment? Visual Studio’s schema compare does that as well as Redgate has a schema compare as well.

Richie Rump: Redgate is the really good one. Visual Studio is the free one. I’ve used both with good success on both. So have at it.

 

Is there a contemporary music artist you all like?

Brent Ozar: Same person continues their question with kind of a horrifying question, “Also speaking of music…” We weren’t speaking of music, but that’s okay, I’ll allow it. “Is there a contemporary artist you all like?” That is definitely a no.

Tara Kizer: Contemporary. [Laughter] I don’t know if we agree on anything.

Richie Rump: You should have seen us in Austin, man. It was not pretty. It was not pretty at all. Red Hot Chili Peppers came on and like half the room cleared out.

Tara Kizer: I had to leave.

Richie Rump: It was almost fighting words.

Tara Kizer: That’s like the worst band ever. I like that kind of music, that genre, I just do not like the Red Hot Chili Peppers.

Brent Ozar: Metallica might not be that bad of a shot. Richie? No, no on Metallica.

Richie Rump: No, I have to draw a line somewhere.

Erik Darling: I’m with you. It’s like pre-Black Album is all I can deal with.

Tara Kizer: That’s how I am too, but I say they were my favorite band, they are no longer. I kind of liked Richie’s playlist that he shared with us a few months ago in a Spotify list. It was a bunch of 80s music, so I’ve actually listened to a bunch of that stuff. It’s a really good list.

Richie Rump: Actually, that was Erik’s. That wasn’t mine. I was doing the alt stuff. That was Erik’s.

Tara Kizer: No, no, it was yours. Maybe it wasn’t 80s then. I just can’t remember what genre it was.

Erik Darling: I have never shared a Spotify playlist in Slack because I don’t want you guys poking around my other playlists.

Richie Rump: Mine is really easy now, it’s just like been all Hamilton stuff because the kids have been singing it nonstop so the past couple weeks have just been that.

 

Can you explain what reads means?

Brent Ozar: This one is interesting. “Can you explain to me what specifically this means?” I believe this has Richie all over it.

Tara Kizer: Stats IO output.

Brent Ozar: Scan count, some number, logical reads, some number, physical reads, some number, and read ahead reads, some number. I think those are lotto ticket suggestions is what that is.

Richie Rump: Yeah, we may win something here.

Brent Ozar: Yeah, so you read a table multiple times. Logical reads means you read those 8k pages out of memory. Physical reads means you read some of those pages out of disk. Read ahead reads means SQL Server saw you reading a bunch of pages off of disk and started reading in front of you even further.

Tara Kizer: I recently read that read aheads were an Enterprise Edition feature, is that true?

Brent Ozar: Standard will read ahead, just not as far as Enterprise Edition.

Tara Kizer: Oh, okay.
Brent Ozar: So in terms of what it means, you only read 28,000 pages, like the logical reads thing. So that actually doesn’t scare me that bad. That shouldn’t be terrible. Scanning isn’t necessarily bad either, so I wouldn’t just go based on that.

Erik Darling: Well the scan count isn’t actually scans, it’s just index accesses. So it’s not like you scanned the whole index that many times. If a query goes parallel, you can have multiple things reading from it and you get a whole mess of scan counts that aren’t actual full index scans over and over again. It’s just chunks of the index.

 

Why is CPU time low but elapsed time high?

Brent Ozar: There’s an interesting one. Kenneth asks, and I’m going to rephrase his question a little. Kenneth says, “I have a query where CPU time is only a couple seconds but elapsed time is 30 minutes. What are some things that could account for that?”

Erik Darling: Compiling.

Tara Kizer: [Laughter] That would be a lot of compiling, dang. We’d need to see what you’re waiting on, probably being blocked I would imagine, but you’re waiting on something. During that 30-minute window, check what you’re waiting on.

Brent Ozar: Disk could be reading from disk, but I like the locking thing a lot.

 

Can you use Standard Edition for development?

Brent Ozar: Sam asks, “Can you use SQL Server Standard Edition for development? If so, do you simply license your developers with MSDN?” Developer Edition is free, so I would just use Developer Edition instead. Stick with that one.

Richie Rump: But you could license with MSDN too. I think those images are up there as well. So if you say, “We really don’t want any developers touching any Enterprise Edition stuff,” you could do that.

Erik Darling: I can’t confirm but I’ve heard that if you are a Standard Edition shop and you have like a thinner edition on a laptop, Microsoft is not going to make you license it like Standard Edition, but you know, it’s up in the air a little bit. I wouldn’t bet on it.

Brent Ozar: But Erik’s answer is due to the fact that he’s covered in tattoos and when the licensing person came to audit, they just kind of were like, “Yeah, no, no, you’re good. You’re all right.”

 

Brent Ozar: The follow up on the SQL command deployment scripts, someone says, “I hate SQL command for deployments because it only returns one line of output. If there’s an error further down the script, you’d never know.” That’s a great point.

Tara Kizer: I haven’t seen that occur though. I’ve seen lots of outputs. I’m wondering if it’s maybe just the command. So you can use the -o, I think it is, parameter, to put that into a file. I haven’t seen that before.

 

How can I prevent a SELECT from blocking?

Brent Ozar: J.H. says, “Any methods to prevent blocking from a select statement?”

Erik Darling: From a…? Oh, so like, do you mean like the schema stability lock that it takes?

Brent Ozar: It might even be just pessimistic locking, where the lock is taking forever.

Erik Darling: Yeah.

Brent Ozar: What are some of the ways you could get around that?

[Crosstalk]

Erik Darling: Not even no lock prevents…

Tara Kizer: That’s true. I wonder what they mean though. They’re probably talking about the select is blocking other connections and maybe it’s not just a schema stability lock.

Erik Darling: Could be, not sure though.

Tara Kizer: Need more info.

Brent Ozar: If you search for isolation levels on our site or if you go to brentozar.com/go/rcsi, which is funny, it’s just a module I just got done presenting in class here. We have a whole section just of resources on that that you might be really interested in. The big takeaway is readers don’t block writers and writers don’t block readers.

Erik Darling: But writers will still block writers.

Brent Ozar: Yeah.

Brent Ozar: We’ll take one more question. I’m going through the ones that are 15 lines long and past that. Someone says, “Yes, do not use locks.” So follow up on the select thing. This person is doing an insert. He’s doing an insert into a table using select from. So you could do with no lock on your select thing. You just know you’re getting dirty reads, so you could see rows twice or not see them at all, or your insert could fail.

Erik Darling: I would like to know how many rows you’re selecting because if you’re selecting a lot of rows, that could lead to higher level locks. There’s a great a blog post by Michael Swart about batching modifications. I’m going to find a link to it and I’ll put that in chat. It’s a really good read about how to batch things properly. So if you’re selecting a lot of rows, you could be getting some weird locking patterns from that as well.

Brent Ozar: Perfect. All right, thanks everybody for hanging out with us this week. Head over to GroupBy.org. This is a totally free conference where you get to pick the sessions. Voting is going on right now. So go to groupby.org, pick what you want to see, and then we’ll be taking the top ten or top five to seven and we’re running those in January. Thanks, everybody. See you next week.


Vote Now to Pick the GroupBy Conference Lineup

GroupBy Conference
0

Voting closes soon at GroupBy, the new free online conference where YOU pick the lineup. Here’s the sessions up for votes – there’s a lot of them, so maybe focus on rating the sessions you find the most interesting – quantity of votes counts for a lot here:

Analytics and Business Intelligence:

Database Administration:

Development:

Performance Tuning:

PowerShell:

Multiple Categories:

Voting closes next Friday. Make your voice heard and pick the conference lineup.


Should I Install Multiple Instances of SQL Server?

Licensing, SQL Server
46 Comments
Marina Towers in Chicago
Marina Towers in Chicago

Instance stacking is the technique of installing multiple instances of SQL Server on the same Windows instance. For example, you might have a VM or server named SQLPROD1, and have:

  • SQLPROD1 – default instance of SQL Server, say SQL 2016
  • SQLPROD1\SQL2014 – for our older apps
  • SQLPROD1\SharePoint – because it’s supposed to be on its own instance
  • SQLPROD1\development – for our QA & testing

The Benefits of Instance Stacking

Lower SQL licensing costs – you only have to pay for one license, and then even Standard Edition lets you install dozens of instances on the same Windows base.

Lower Windows licensing costs – you only have to pay for one Windows.

Easier Windows patching – since you only have to patch one OS install.

The Drawbacks of Instance Stacking

Much harder performance tuning – all of the instances share the same CPU, memory, network pipe, and storage pipe. While SQL Server does offer tricks like affinity masking and memory settings to alleviate the first two, it’s got no answers for the second two. A backup on one instance will knock out performance on the other instances regardless of how much tuning work you put in. If none of the instances are performance-sensitive, this doesn’t matter – but how often does that happen? And how do you figure out what the “right” memory or CPU settings are? It takes so much human work and experimentation that it really only makes sense when you have plenty of free time per DBA per server.

Much harder reboot planning – you have to get all of the customers on all of the instances to agree on a time to patch Windows.

Security challenges – sometimes, we get those awful folks who insist on being able to RDP into the Windows instance that hosts their databases. If they insist on being sysadmin on the box altogether, then they can make changes that wreak havoc on the other running instances.

The Alternative: Virtualization

Whenever you think about carving up a single server into smaller pieces, think virtualization instead. It’s a great default place for new SQL Servers.

Every SQL Server deserves its own Windows instance. Yes, this does mean higher licensing costs – you’ll need to license SQL Server Enterprise Edition at the hardware host level, and then you can pack in as many VMs as possible into the host.

Then, each VM gets their own performance management, patch schedules, and security. Plus, surprise bonus: every VM, even the tiniest ones, get all of the features of Enterprise Edition.


Availability Groups: More Planned Downtime for Less Unplanned Downtime

I often hear companies say, “We can never ever go down, so we’d like to implement Always On Availability Groups.”

Let’s say on January 1, 2016, you rolled out a new Availability Group on SQL Server 2014. It’s the most current version available at the time, and you deploy Service Pack 1, Cumulative Update 4 (released 2015/12/22). You’re fully current, and it’s a stable engine from 2014 – how many more bugs can they find, right?

Here’s what your patching schedule would look like:

2016/02/22Cumulative Update 5 – corrupted columnstore indexes when AG fails overstack dumps on AG secondaries.

2016/04/19 – Cumulative Update 6 – non-yielding schedulers during AG version cleanup, FileTables unavailable after AG failover, canceling a backup causes the server to crash (not related, but cringeworthy) – whew! This one has a lot of big fixes. We should definitely apply this.

2016/05/31 – OH SNAP! CU6 broke NOLOCK. Sure hope you didn’t apply that. Time to take another outage to apply the revised version.

2016/06/21 – Cumulative Update 7 – SQLDiag fails in AGs. You could probably skip this one if you don’t use SQLDiag, and most shops don’t.

2016/07/11 – Service Pack 2 – improved lease timeout to prevent outages, filestream directory not visible after a replica is restarted (wait I thought we fixed that in CU6? no wait that was FileTables), missing error numbers in XE.

2016/08/26 – Cumulative Update 1 – memory leak on AGs with change tracking, error 1478 when you add a database back into an AlwaysOn availability group (sic).

2016/10/18 – Cumulative Update 2 – no AG fixes, woohoo!

What do you mean there's only one engine?
What do you mean there’s only one engine?

That’s 5-7 patch outages in 11 months (and I’m not even listing all of the fixes in these, which include things like incorrect results bugs, plus awesome new DMV diagnostic features that you definitely want.)

Here’s the way I like to explain it to companies: if you have an airplane, it’s absolutely imperative that its engines not fail mid-flight. In order to accomplish that, you have to have regular downtime for mechanics to examine and replace parts – and that doesn’t happen up in the air. With Availability Groups, we’re lucky enough to be able to transfer our passengers databases from one airplane to another quickly – but we still have to have those other airplanes getting constant examinations and patches from mechanics.


Selective XML Indexes: Not Bad At All

Indexing, SQL Server
19 Comments

Promises, Promises

I said I’d follow up on this, and here it is. I’m by no means the world’s foremost authority on these, I just played with a simple example until I figured out what worked the best for my query. So you can all play along at home, I’m using XML we all have access to: Query Plans!

If any of you have ever used sp_BlitzCache, you’ll know there’s a ton of XQuery involved. I can’t promise that I’ll be adding XML indexes to it in the near future. Selective XML indexes are 2012+, so I’d have to wait for 2008R2 to go out of support. At this rate, that won’t happen until I throw SQL Server out a window and open a bar that I only let my friends drink in.

Other forms of XML indexes can end up being quite large, and often not worth the resource expenditure to create for the relatively small amount of XML processing we limit it to, by default. The performance gains just wouldn’t be there.

Hey, Bartender

So what is a Selective XML index? It’s an index! For XML! Where you pick the parts of the XML to index. Other XML indexes are rather all or nothing, and can end up being pretty huge, depending on the size of your documents. I think they’re at least size of data, if I recall correctly. Let’s take a look at some examples.

I’m lazy, so I’m just grabbing a bit of code from BlitzCache to query our XML.

Right now, there are a paltry 225 query plans in my temp table. I know, right? Some consultant I am. I’m not even generating a ridiculous workload to pollute my plan cache.

But the query plan I get looks ridiculous and costs 1355 query bucks. Ew. I already regret working with XML, again.

I'd rather be querying .jpgs
I’d rather be querying .jpgs

 

The first kind of Selective XML index we can try is defined on the two nodes we’re querying. You can create it like this:

If we run our query again… Nothing changes, only gets worse. Well, not literally worse. The plan and cost are the same, but now we have a useless index.

Well, nuts. What else can we try?

This also makes no difference whatsoever. I tried defining each path as the SQL type I’m returning. Again, same plan, same cost. What’s next?

Finally, I tried defining each path as the XQuery expression I’m using to retrieve them, along with the data type. This finally makes a difference, and a huge difference. Query cost is down to 0.55. That ain’t bad for an index.

What time is it, even?
What time is it, even?

Just Selecting XML

I’m not getting into the other possible XQuery methods, like .exist(), .nodes(), or .query(), or using sql:column to join data anywhere. That’s more XML than I care to mess with in one blog post. If you use XML a lot, and you’re on SQL Server 2012+, you may want to give SXIs a shot. They can be pretty cool when you get them working.

Thanks for reading!


Test: The Top Two SQL Server Problems I Find Everywhere

Companies call us for performance or high availability issues, but over and over, the very first two things we find are:

  1. They’re not taking backups to match the business’s RPO and RTO
  2. They’re not doing CHECKDB weekly, or at all, and don’t understand why that’s an issue

So let’s walk through a simple scenario and see how you do.

It’s Thursday morning at 11AM, and you get an email: users are reporting corruption errors when they run SELECTs on a critical table. You run the query, and it turns out the clustered index on the table has corruption.

Here’s your maintenance schedule:

  • Full backups nightly at 11PM
  • Log backups every 15 minutes
  • Delete log backups older than 2 days (because you only need point-in-time restore capability for recent points in time, right?)
  • CHECKDB weekly on Saturdays at 9AM
Our national symbol is ashamed of your RPO/RTO
Our national symbol is ashamed of your RPO/RTO

You can’t repair the corruption (it’s a clustered index, and there aren’t enough nonclustered indexes to cover all the columns), and the business needs that data back. You’re on: answer these questions:

  1. What backups do you restore, in order?
  2. Will they be free of corruption?
  3. How much data will you have lost?
  4. How long will that process take?
  5. Given that, what’s your effective RPO and RTO?
  6. If the business said that wasn’t good enough, what specific steps could you take to improve those numbers without spending money?

This week, while folks are working at low speed due to the holidays, double-check those backups and corruption check jobs.


The Cost of Adding a Reporting Server

“We’d like to offload our reporting queries to a separate SQL Server.”

The first costs are fairly obvious.

Hardware and storage – even if you’re running it in a virtual machine, you need to account for the costs of say, 4 cores and 32GB RAM. Not only will you need storage for the databases, but you’ll also need to decide whether this server gets backed up, and copied to a disaster recovery data center.

Software licensing – Standard Edition is ~$2k per core, and Enterprise Edition is ~$7k per core. Toss in Windows (especially now that it’s licensed per-core), your management/backup/antivirus tools, and your monitoring software.

Project planning – you’ll need to design how to get the data from production to the reporting server, like with Always On Availability Groups, log shipping, or transactional replication.

App modifications – the app running reporting queries will need a new connection string. Even with Always On Availability Groups, reads aren’t automatically offloaded to readable replicas – you have to use the connection string parameter ApplicationIntent = ReadOnly to tell SQL Server that you promise not to try to write anything. If you have a single app that does both reads and writes, and you only want to offload some of the queries, you’ll need to go through the code to switch those queries over to the new connection string.

The rest of the costs are surprises.

Adding a troubleshooting process – sooner or later, the data replication process will break. Depending on the method (AGs, log shipping, replication) and failure type, it’ll fail in different ways – maybe all of the data is old, maybe just some of it is, or maybe the reports aren’t accessible at all. You’ll want to list out the failure methods and explain what symptoms will look like. This helps business users recognize when their reports are wrong, and react appropriately. If you don’t do this step, then after the first failure, people are just always going to expect that there’s a bug in the report data.

Prepare for failure – for each of those failure methods, decide how you’re going to react. For example, if AG replication breaks and reports are out of date, will you point reports at the primary until the problem is resolved, or will users just have to deal with unavailable reports while you troubleshoot or resync the replicas? If you don’t do this step, then you’re going to be winging it every time, and you’ll look unprepared while reports are wrong or down.

Set realistic expectations for RPO and RTO – based on your process and preparation, make sure the business users understand how long their reports will be down when things break.

Measure the overhead of replication – AGs and transactional replication can add performance slowdowns beyond what the reports used to cost. For example, if you were only running a few reports an hour, and only hitting a subset of the data, then suddenly replicating every individual delete/update/insert operation can have a huge overhead.

Add monitoring – you need to start monitoring how far behind the reporting server is, and how performance is doing on both. Performance troubleshooting becomes a lot harder, too – for example, when you’re doing index tuning, you have to combine data across both the primary and the reporting servers in order to find the right mix of indexes across the board.

Are you sure you really need to offload reporting?

Before you embark on this expensive project, ask:

  • What’s the primary wait type that we’re facing?
    (Find out with sp_BlitzFirst @SinceStartup = 1)
  • What’s the cheapest/easiest way to reduce that wait type?

Time and again, I see people facing PAGEIOLATCH waits (which mean waiting to read data pages from a data file), and they’re juggling a 1TB database with 16-32GB RAM. Don’t spend tens of thousands of dollars to fix that problem – buy $1,000 of RAM and spend some time doing index tuning.


7 Things I Learned About Aurora at AWS re:Invent 2016

Richie and I attended the AWS re:Invent conference in Vegas last week. Here’s some of my favorite takeaways about Amazon Aurora, their homegrown relational database with MySQL compatibility.

1. AWS has a grudge against Larry Ellison. Andy Jassy’s keynotes made repeated jokes about Oracle’s cloud product (or lack thereof), the high cost of proprietary databases, and expensive sailboat hobbies. Larry’s big, bold personality makes for an easy target, but I couldn’t help but wonder if AWS will come after SQL Server’s costs next.

2. Amazon’s database products are lucrative. Their database products are their fastest-growing products in history. The cynic in me suspects that they’re measuring growth in terms of revenue, not customer count, and since databases are expensive, voila.

3. Amazon’s Aurora product is ambitious. Aurora is their home-built relational database product. They manage backups, availability, and patching. In 2014, they announced full MySQL compatibility – you could point any MySQL app at Aurora, and it’d work just fine. This year, they announced PostgreSQL 9.6.1 compatibility, too.

4. Aurora PostgreSQL was 3 years in the making. That’s gotta make you wonder: are they trying to build SQL Server compatibility? Granted, it’d be a lot of work – it’s a closed source database. Feature compatibility would be an arms race: Microsoft would rapidly try to add features to entice users onto newer versions, while at the same time Amazon would have to race to keep up. (Hey, interesting coincidence – Microsoft’s been suddenly shoving features in – even in service packs.)

5. I’m guessing Aurora Oracle compatibility will come next. In the PostgreSQL announcement session, the speakers pointed out that PostgreSQL is the most Oracle-compatible database, with some 60-70% of applications being able to switch over. Theoretically, AWS could fund open source development of further Oracle compatibility, but given AWS’s lack of contributions back to open source projects, I wouldn’t bet on that. Instead, I’d bet on them doing something internally that only Aurora would offer.

Before going to re:Invent, I would have bet against AWS aiming for Oracle compatibility. However, during the keynotes, I kept hearing enterprise stories over and over. AWS really wants to be the default data center for enterprises, and enterprises run Oracle (and SQL Server of course).

Amazon won’t have an easy road. Their edge is that they’re essentially getting two database platforms for free: MySQL and PostgreSQL. Amazon has developers, sure, but they don’t have to build a query optimizer, for example. Thing is, those platforms aren’t Oracle.

AWS Performance Insights
AWS Performance Insights

6. Amazon’s throwing in free performance monitoring tools. The new Performance Insights tool shows wait stats, top resource-intensive queries, lock detection, execution plans, and 35 days of data retention. It’s designed by a former Oracle (there we go again) performance tuning consultant to match his workflow. You don’t have to install an agent, configure a repository, or keep the thing running. It’s in preview for Aurora PostgreSQL today, and will be rolled out to all RDS databases (including SQL Server) in 2017.

7. You don’t need to learn Aurora today. This isn’t going to be one of those doom-and-gloom posts that says, “The cloud is coming for your job!” The cool thing about Aurora is that it works just like the MySQL and PostgreSQL you know and love don’t know anything about and aren’t particularly fond of – it just takes away the crappy parts of database administration, like backups, corruption checking, and HA/DR failovers.

If you were a MySQL or PostgreSQL query/index tuner, you could keep right on working in Aurora – only now, your skills relate in direct cost improvements. If you can tune queries and indexes well, you can cut your company’s cloud bill directly – and see the improvement in your very next bill.

The same thing will happen when Amazon comes for Oracle, and eventually, SQL Server. Your performance tuning skills will still work – and they’ll even be worth more.

As they say in Vegas, place your bets.


Date Math In The WHERE Clause

Execution Plans, SQL Server
20 Comments

Oh, THAT SARGability

I realize that not everyone has a Full Metal Pocket Protector, and that we can’t all spend our days plunging the depths of query and index tuning to eek out every CPU cycle and I/O operation from our servers. I mean, I don’t even do that. Most of the time I’m just happy to get the right result back!

I kid, I kid.

For those of you out there that have never heard the word before, go watch this. You’ll thank me later. Much later.

What does that have to do with me?

It has to do with you, because you’re still formatting your WHERE clause poorly. You’re still putting expressions around columns and comparing that output to a value, or another expression.

Huh?

Think about times when you’ve done something like First_Name + ‘ ‘ + Last_Name = ‘Meat Tuperello’, or even worse, when you’ve totally broken a date into YEAR(), MONTH(), and DAY() and compared them all to values.

Yes, you. Yes, that’s bad.

More Common

Sometimes people forget that DATEADD exists. They go right to DATEDIFF, because it sounds like it makes more sense.

What’s the difference between these two dates? Can I go home now? I’m so hungry. No one takes the Esperantan money you pay me with, Mr. Ozar.

But this can get you into a lot of trouble, especially if you’re either dealing with a lot of data, or if the WHERE clause is part of a more complicated series of JOINs. Not only does it not make efficient use of any indexes, but it can really screw up cardinality estimation for other operations. What does this peril look like?

Looks good to me

Of course it does. That’s why you’re reading this blog. You have questionable taste. There are some problems with this, though.

Your mom.
Your mom.

 

We didn’t do too bad with cardinality estimation here. The Magic Math guessed about right for our 10,000 row table. But breakthroughs in Advanced Query Plan Technology (hint: GET OFF SQL SERVER 2008R2) allow us to see that we read all 10,000 of those rows in the index, rather than just getting the 5003 rows that we actually need. Shame on us. How do we do better?

No Sets In The Champagne Room

We’re going to flip things around a little bit! We’re going to take the function off of the column and put it on our predicate. If you watched the video I linked to up top, you’d know why this is good. it allows the optimizer to fold the expression into the query, and push it right on down to the index access. Hooray for us. Someday we’re gonna change the world.

Now we get a cheaper index seek, we don’t read the extra 4997 rows, and the cardinality estimate is spot on. Again, it wasn’t too bad in the original one, but we got off easy here.

Just you, and nobody else but you.
Just you, and nobody else but you.

Face 2 Face

If you’re wondering what the plans look like side by side, here you go.

Gateway Goth
Gateway Goth

 

Both plans are helped by our thoughtfully named index on the OrderDate column, though the one with cheaper estimated cost is the bottom one. Yes, I know this can sometimes lie, but we’re not hiding any functions in here that would throw things off horribly. If you’re concerned about the Nested Loops join, don’t worry too much. There is a tipping point where the Constant Scan operator is removed in favor of just index access. I didn’t inflate this table to find exact row counts for that, but I’ve seen it at work elsewhere. And, yeah, the second query will still be cheaper even if it also scans.

Thanks for reading!

Brent says: this is a great example of how people think SQL Server will rewrite their query in a way that makes it go faster. Yes, SQL Server could rewrite the first query to make it like the second – but it just doesn’t go that extra mile for you. (And it shouldn’t, you wacko – write the query right in the first place.)


Spills SQL Server Doesn’t Warn You About

Execution Plans, SQL Server
28 Comments

Don’t make me spill

Table variables get a lot of bad press, and they deserve it. They are to query performance what the TSA is to air travel. No one’s sure what they’re doing, but they’ve been taking forever to do it.

One particular thing that irks me about them (table variables, now) is that they’ll spill their spaghetti all over your disks, and not warn you. Now, this gripe isn’t misplaced. SQL Server will warn you when Sort and Hash operations spill to disk. And they should! Because spilling to disk usually means you had to slow down to do it. Disks are slow. Memory is fast. Squatting in the Smith Machine is cheating.

Wouldn’t it be nice?

Why am I picking on table variables? Because people so frequently use them for the wrong reasons. They’re in memory! They made my query faster! No one harbors delusions about temp tables, except that one guy who told me they’re a security risk. Sure, we should get a warning if temp tables spill to disk, too. But I don’t think that would surprise most people as much.

So let’s see what hits the sneeze guard!

You’ve been here before

You know I’m going to use Stack Overflow. Here’s the gist: I’m going to set max memory to 1 GB, and stick the Votes table, which is about 2.4 GB, into a table variable. While that goes on, I’m going to run sp_BlitzFirst for 60 seconds in Expert Mode to see which files get read from and written to. I’m also going to get STATISTICS IO information, and the query plan.

First, let’s look at stats IO output. The first section shows us hitting the Votes table to insert data. The second section shows us getting the COUNT from our table variable. Wouldn’t you know, we hit a temp object! Isn’t that funny? I’m laughing.

The query plan doesn’t give us any warnings. No little yellow exclamation points. No red X. It’s all just kind of bland. Even Paste The Plan doesn’t make this any prettier.

The query plan doesn't warn us about anything.
Oatmeal

 

Well, unless you really go looking at the plan…

Oh, that terrible estimate
Oh, that terrible estimate

Okay, that sucks

Let’s look at sp_BlitzFirst. Only Young And Good Looking® people contribute code to it, so it must be awesome.

Just too physical
Just too physical

 

Boy oh boy. Boy howdy. Look at all those physical writes. We spilled everything to disk. That’s right at the 2.4 GB mark, which is the same size as the Votes table. We should probably know about that, right?

Is a temp table any better?

In short: kinda. There are some reasons! None of them are in stats IO. They’re nearly identical.

For the insert, the scan could go parallel, but doesn’t. Remember that modifying table variables forces query serialization, so that’s never an option to us.

In SQL Server 2016, some INSERT operations can be fully parallelized, which is really cool. If it works. Much like minimal logging, it’s a bit of a crapshoot.

The COUNT(*) query gets an accurate estimate and does go parallel. Hooray. Put those pricey cores to use. Unless you recompile the query, you’re not going to get an accurate estimate out of your table variable. They’re just built that way. It doesn’t even matter if you put an index on them.

Promised Land
Promised Land

 

Does sp_BlitzFirst tell us anything different?

SCANDAL
SCANDAL

 

Yeah! For some mystical magical reason, we only spilled out 1.8 GB, rather than the full 2.4 GB.

Party on, Us.

I still hate table variables

I mean, I guess they’re okay if the # key on your keyboard is broken? But you should probably just buy a new keyboard and stop using Twitter.

Anyway, I think the point was that we should have some information at the plan level about spills to disk for table variables and (possibly) temp tables. It wouldn’t help tools like sp_BlitzCache, because spill information isn’t in cached plans, but it might help if you’re doing live query tuning.

Thanks for reading!

Brent says – OMG THESE ARE SPILLS TOO. I never thought of it that way.


Announcing Group By: A New Kind of Free Community Conference

GroupBy Conference
2 Comments

group_by_conference_logoLet’s try something new. Let’s build a free event by the community, for the community, where:

Speakers submit abstracts publicly online. Anybody can submit an abstract on any topic – community members, software vendors, even Microsoft’s Bob Ward has submitted a session on 2016: It Just Runs Faster.

Attendees suggest tweaks to the abstracts. You leave comments on the abstracts, and you rate the topic, the abstract, and your likelihood of attending. Your constructive criticism helps refine their abstracts. For example, Alex Yate’s session on DevOps 101 has had a lively discussion in the comments, and he’s using that feedback to help craft a better abstract. If you’re a presenter, you should put in an abstract just to get the detailed, insightful feedback people like Hugo Kornelis and Adam Machanic have been giving.

The 5-7 top-voted sessions are picked. You, dear reader, are building this conference for yourself. Voting closes on December 16th, and the ones you’re most likely to attend – those are the ones that will be picked.

On Friday, January 13th, the whole thing happens live. Register to watch, and we’ll do it live, with me as your cohost throughout the event. I’ll take your typed-in questions and ask ’em out loud, help with technical glitches, and manage the recording process. If you miss the event, no worries – the sessions will be posted to YouTube, a podcast feed, and the site.

Everything gets done transparently in public: the abstracts, the presenters, the comments, the voting, and the conference itself.

It’s called Group By. Go check it out, and rate the sessions you’d like to see!


[Video] Office Hours 2016/11/23 (With Transcriptions)

This week, Brent and Richie discuss their hardware setups, lazy spools, SQL 2016 Service Pack 1, Always On Availability Groups, Database IDs, GroupBy.org, SQL on Linux, and new features they would like to see in the new version of SQL.

Here’s the video on YouTube:

Office Hours Webcast - 2016/11/23

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Enjoy the Podcast?

Don’t miss an episode, subscribe via iTunes, Stitcher or RSS.
Leave us a review in iTunes

Office Hours Webcast – 2016-11-23

 

What do you guys think of my blog?

Brent Ozar: All right, let’s see what we’ve got for questions here. Nate says, “I don’t know if you guys are open to this or not–” No, we’re not, really. Next question. Just kidding. “I’d love to have you check out my blog.” He says, “I’m a new blogger and I write about DBA stuff since that’s my current job and passion. I’d love to have some feedback from the pros.” I assume he’s talking about Richie. URL is natethedba.wordpress.com. Nate, the one piece of advice I’d give you is go get your own URL. It’s only like $10 a year. You can get them from GoDaddy, from AWS, Namecheap. There’s a bunch of providers that do it. WordPress will charge you—I want to say it’s $30 a year to be able to use your own vanity URL but the thing is that way it’s totally you and it looks way more professional. You’re going to be telling it to people for years so you might as well just get the URL that you want there.

Richie Rump: Yeah, don’t do Jorriss.com because you’re going to have to tell people that there are two Rs, two Ss, and they’re like, “What’s Jorriss?” In the 90s, it made sense, not so much nowadays.

 

What should I do about lazy spools?

Brent Ozar: Fellow human says, “In one of the previous Office Hours webcasts, Erik mentioned that lazy spools are often an indication of bad performance. Can you elaborate on why and how to avoid them?” That hasn’t been my experience generally that I see a one-to-one when there is a lazy spool I go after that. I think what Erik was saying, and I think I agree with because of course, he’s not here so I can speak for him, is that when there’s a plan and I see a lazy spool, it’s a crappy query that I need to do something about or a crappy plan, not enough indexes, whatever. I don’t focus on the spool, I take a step back and just start looking at, all right, what’s this query doing? How am I assembling data together and what can I do in a better way? Often it involves things like index tuning or query tuning, but I don’t focus on the spool itself.

 

Is 403 page reads per second a problem?

Brent Ozar: Next up, someone says, “My Buffer Manager page reads per second is 403. However, my page life expectancy is 22,089. What’s the problem?” First of all, I don’t look at either of those numbers but I’ll tell you what’s happening. What’s happening is stuff is staying in memory for a long period of time. However, you’re having to read 400 pages from disk. 400 pages times 8K is nothing. It’s not that big of a deal. I could read 400 8K pages in one second. I’ve taken speed reading classes.

Richie Rump: Yeah, no.

Brent Ozar: I wouldn’t comprehend any of it. Richie has seen me read documentation, even when I read only one page a minute, I’m like, “What’s this mean again, Richie?”

Richie Rump: Even his emails. Just, no.

Brent Ozar: It’s about the rate of emails that I get. What I would say is, you don’t necessarily have a memory pressure problem. SQL Server is just reading some stuff from disk. What you want to look at is wait time instead. If you search for wait time on brentozar.com you’ll get a whole bunch of blog posts about how to do that. My personal favorite way to do that is sp_BlitzFirst. Sp_BlitzFirst will tell you what your performance problems are right now.

 

We changed a bit to an int. Why did .NET break?

Brent Ozar: Next up, Nate says, “Pretty please, I promise I have a real question to ask too, maybe even blog about. We had a view which had a BIT casted to an int column. When we changed it back to a BIT, it broke the .NET app with an invalid cast exception. It’s using Entity Framework. Any ideas why that would happen? I thought the app code would handle implicit gracefully but I guess not.” What on earth are you doing that for?

Richie Rump: Well, you do have to change your code. If you have an Entity Framework entity, because of what it is, and it’s mapped to a table or a view, it’s not implicit. You’re actually declaring it in the background as an INT or string or whatever. So you’ll have to go into the code and then tell it, “Hey, by the way, this is an INT or a BIT.” So that’s probably where you’re broken. You’ll just have to change that mapping and those should be fixed.

Brent Ozar: I also have this horrible vision that somewhere data is coming back as like a 9 or a 14 instead of a 5th just with somebody’s crazy cast.

 

Should I switch from a Surface Pro 3 to a Mac?

Brent Ozar: “I’m thinking of a Mac since the battery-gate with my Surface Pro 3, even though they finally fixed it. How has your battery been on the Surface Pro 4?” You have a 4, right?

Richie Rump: It’s been fine but what I’ve been telling people about my Surface 4 is I like it. I really love it, but this is my third machine. Like I actually have three machines that I actively use. I’ve got one I’m using right now. I’ve got this baby that I use at night and then the Surface Pro 3 is essentially a large tablet for me. Everything has been fine on it though. I had one problem and it was a hardware problem. I’ve had it since day one. I went right into Microsoft’s store, they exchanged it right out and I was back up and running literally within the next hour or so. But the way I use my Surface Pro is not the way I would normally use a machine.

Brent Ozar: You don’t run like three instances of Visual Studio on there.

Richie Rump: No. No, no, not at all. I mean, it’s all there. But I don’t actively write code on it or anything like that.

Brent Ozar: So you’re the same as me. That’s what you’re saying. It’s installed.

Richie Rump: In ten years when I’m not actually doing anything with technology, I will be you, yes, exactly.

Brent Ozar: Yeah, might take longer than that. Yeah, sadly. To take your skills to atrophy, to this level, it’s, you know, measured in decades.

Richie Rump: You forget, developer skills, they go quick. Right? A lot quicker than DBA skills.

Brent Ozar: Yeah, wow, you guys have to learn new languages every six weeks.

Richie Rump: Haven’t we had this conversation before?

Brent Ozar: Yes.

Richie Rump: I watched a Pluralsight course yesterday, Brent.

Brent Ozar: To give people in the background an idea, we develop stuff like, so you take Paste the Plan for example, which is developed in AWS using Lambda which is function as a server, aka server-less code. It’s not really server-less. Nick Craver has a heart attack every time we say that but there are real servers behind…

Richie Rump: It’s just bad marketing.

Brent Ozar: It is bad marketing. But new frameworks come out with that like every sixty days on how you go about accomplishing this stuff.

Richie Rump: Yeah, and not even that but even some of the new features that have been around a little bit. What I’ve been playing with this week is ECMAScript 6 which is the next version of JavaScript. So I’ve been messing with that in Lambda this week. It just makes things a lot cleaner. All the spaghetti code I had to write with Paste the Plan, now with using ES6 syntax will be a lot cleaner. I’m smiling a lot more this week than I did when I was writing Paste the Plan. So that’s good.

Brent Ozar: Golly, I would also say don’t change from Windows to Mac—don’t take that lightly. Changing either way sucks, jumping platforms. I’ve tried to switch back over to Windows repeatedly because Windows 10 is amazing. Windows 10 is freaking fantastic.

 

Should I use SQL Server 2016 SP1 in production?

Brent Ozar: Someone says, “Now that SQL Server 2016 Service Pack 1 is out there, would you feel comfortable using that in production? Of course after testing in a test environment first.” I’m actually working with a client today on 2016. That’s going to be one of my recommendations is to go to Service Pack 1. People get really excited about having Standard Edition to have the same capabilities for developers that Enterprise Edition has. Forget that. There’s really cool DMV and execution plan changes that are awesome. In this case, it’s exactly going to help the client get better execution plan insight when they’re going and doing query performance tuning. So if you’re on 2016, I would totally go to SP1, just make sure you test it first. If you’re trying to consider a new deployment today, I would totally vote for 2016 and with Service Pack 1. It’s just great.

Richie Rump: Do you want to mention Erik’s post on 2016 SP1?

Brent Ozar: Yeah. Erik has an opinion, and I agree with entirely, Erik said that when Microsoft put the Enterprise Edition—and for those of you who are listening, I have air quotes around my head—Enterprise Edition features in 2016 Standard Edition in Service Pack 1, they really kind of took the leftovers. They didn’t take the best features. They just took these fringe features that are well tested and everybody kind of knows how they work but nobody really wants to use them. Great examples of that are Always Encrypted, partition tables, you hardly ever see that stuff out there and, yes, it’s going to make ISVs’ lives a little easier, but it really doesn’t help DBAs at all. DBAs, all the features that we wanted, online index rebuilds in Standard Edition, creating indexes online, merry-go-round reads, there’s a lot of things that I think we should have had in Standard Edition if we’re going to get features and we didn’t get those.

Richie Rump: Yeah. As a developer, I should be really jazzed about all the cool stuff they threw in Standard but without raising the memory limits, it’s meaningless.

Brent Ozar: Yeah.

Richie Rump: It’s almost like, “Hey, look all the stuff that you can get and you can do.” And I’d be like, “Yay, I get to play.” Then all of a sudden it’s like, but you only can stay on Standard and like why is it so slow? I don’t understand.

Brent Ozar: Yeah, if you need partitioning, the odds that you’re going to be able to still perform well on 128 gigs of RAM are not good. Usually, stuff like partitioning is designed for much larger datasets.

Richie Rump: Yeah.

Brent Ozar: I’m actually—like when I saw it drop I’m like, okay, I have a vision two years from now the amount of partitioning work we’re going to be doing is skyrocketing because everybody is going to go throw partitioning in as soon as they can and then they’re going to find out, “Oh, it actually makes queries slower instead of faster.”

Richie Rump: Yep, and, “Why I can’t do partition swapping because I’ve put an index across the entire table,” and, yeah. Yeah, that’s going to be fun.

Brent Ozar: Yeah, not a performance fix.

Brent Ozar: Someone says, “So is 2016 better than 2014 and worth going to?” Absolutely, for DMV improvements that are much better. Query Store is my personal favorite. I believe that feature alone is worth going to 2016 for. Thanks, Conor Cunningham, I love you, even though you don’t love me right now because I wrote a blog post that made you angry.

 

Why do I have user problems when I fail over an AG?

Brent Ozar: One person, let’s see here, says, “Doing some testing with Always On Availability Groups and I’m having trouble with SQL users in the replicated database losing their mapped link after a failover. Is there an easier way to reestablish this link after failover? It must be a common issue.” It is and in this case Always On Availability Groups has a lot of similarities to database mirroring. Database mirroring exhibits the same problem. The guy who wrote the book on database mirroring, his name is Robert Davis, aka SQLSoldier, if you search for SQLSoldier sync logins—not as in sank my battleship—but as in synchronize logins, he has a script out there that will help you go about doing that.

 

How do I get my RPO and RTO in writing?

Brent Ozar: “Any suggestions for getting RPO and RTO for high availability disaster recovery from an organization?” Yeah, absolutely, go to our site. Go to brentozar.com. Put in your email address or click on “free stuff” up at the top of the blog. We have a worksheet, an HA/DR worksheet for this exact purpose. What this does is it lets you fill in the current state first. You fill in how much data you’re going to lose when the server goes down, how long you’re going to be down for based off of things like your backups and whatnot. Then hand that to the business and say, “Here’s current state. If you want it to be nicer, you pick what you want.” The second page of that worksheet has prices to help you make that a lot easier with the business.

 

What causes database timeouts?

Brent Ozar: J.H. says, “What are some factors on the database side of things that cause timeouts?” So there’s really two parts to a timeout question. One is timing out before you connect to SQL Server. The other part is timing out waiting for your results. For timing out connecting to SQL Server you really have to watch on the application side of things because often SQL Server never even hears the call. For timing out on query executions, SQL Server doesn’t have a timeout. It will sit there and run your crappy query for years or decades. So then that’s also an application setting on how long you want to wait for your queries. So on both cases of those I start working with my developers and saying, “All right. If you’re having problems connecting, then let’s go set up ping monitors, any kind of monitoring utility from your application server over to the SQL Server so we can start to see when you’re not even able to ping the SQL Server. Then we can work with the network guys to figure out how to fix that.”

 

What should I do with statistics IO output?

Brent Ozar: Oh, there’s someone who asks an interesting question. “Set statistics IO on gives you great information where tuning in terms of reads.” I know where you can go with that information. Richie, tell them where they can go with their set statistics IO output.

Richie Rump: Thanks, Don Pardo, you could go to statisticsparser.com.

Richie Rump: Dang, I was almost there.

Brent Ozar: So close. The Don Pardo thing was what did it. I was like, ah, that’s good. He says, “Am I correct in assuming that each read is 8K so when I’m talking to my developers I can tell them how much data their current query is reading and start asking does it make sense that this thing is reading say 10 gigs worth of data when they’re writing their queries?”

Richie Rump: Yes.

Brent Ozar: Those are great, good—and it’s always a consistent number. It doesn’t change.

Richie Rump: Yeah. And if you go to statisticsparser.com it will not only parse it out for you so you can actually read it and know that that’s millions and not just kind of having to do the, “Oh, here’s a comma, here’s a comma, here’s a comma,” but also aggregate it for you as well. So it will give you the total number of page reads over a query.
Brent Ozar: Super helpful.

 

Can database IDs be different on AG replicas?

Brent Ozar: Someone says, “How do I resolve the issue where a database…” blah, blah, blah, “I have a database with one database ID say 6 on the primary and 8 on the secondary in Always On Availability Groups.” That’s totally okay. It doesn’t affect anything at all. Database IDs can be different on every server.

 

Thanks for helping me with Twitter.

Brent Ozar: Someone says, “Brent, you are awesome because two years ago I was new to Twitter and posted a question not using the #SQLHelp, instead I just named certain people. I was mocked by others while you helped me understand how to use #SQLHelp by reading your blogs.” I’m glad I could help. Twitter is not intuitive. Twitter is the opposite of intuitive. It’s like those old IRC—Richie, were you ever on IRC?

Richie Rump: Uh, of course, Brent.

Brent Ozar: Yeah, I thought so, and BBSs and all that.

Richie Rump: Oh, BBSs, oh. Let’s upload more stuff so I can get more download credits.

Brent Ozar: Or that deal—there was a ZMODEM hack where you could cancel your download at the last second and it would not fill up your account points. It would do it right after you sent the confirmation. Yeah, no, I’m all for that. Everything about databases is cryptic, online is cryptic, so whenever I can help people. Just make sure you can pay it forward.

 

Psst – introducing Brent’s newest project

Brent Ozar: My current attempt to go pay it forward is GroupBy.org. If you go to GroupBy.org, all one word, no spaces or minus signs, at GroupBy.org we’re building a platform where people can teach each other. Where you can go submit a session. You can vote on which sessions you’d like to see. You can leave feedback to presenters so that they can help make their own abstract better. So it’s like a live conference submissions process. We’re just going to take the top-rated submissions from readers and then go let them present them online on January 13th.

Richie Rump: It’s such a great idea.

Brent Ozar: Thank you.

Richie Rump: It’s such a great opportunity, man.

Brent Ozar: It’s going to be so much fun. Plus, it uses the exact same platform that you’re looking at here. Those of you who are online, those of you who are listening to the podcast, same exact deal. So you’ll see talking heads. Like I’ll be in there as a cohost for whoever is doing the presenting. So if something goes up in flames, we can still have a good time talking about it. I could post the questions that people ask verbally, just really helps presenters relax as they see the talking heads and not freak out.

Richie Rump: Yeah. I could totally see how me presenting with Brent Ozar’s face right there is not going to freak people out at all. No, no. I mean, it’s totally normal. It happens to me all the time.

Brent Ozar: I could wear a mask. Yeah, see, now that you say that, I’ve kind of always thought of myself as being like—no, but yeah, no, I think you have a point there. I could be actually worse. And it won’t be me every time, too. We’re going to rotate in and out hosting duties. So we’ll do just like go contact people in the community who like doing that kind of co-hosting duty and different, other people can do it. Because no one wants to see me every webcast. It needs to be moving around to different people.

Richie Rump: Dude, you’ll totally get some troll in here, you know, like myself. “Wroooong! That is incorrect!”

Brent Ozar: Or just hold up little signs like, “This slide I’m voting a 2.”

Richie Rump: Yes.

Brent Ozar: Someone says, “When will the registration option be available at Group By?” (Update: it’s available now.)

 

Are you seeing much interest in SQL Server on Linux?

Brent Ozar: Next one. “Are you guys seeing much interest in SQL on Linux? Allan Hirt told me he was in August at his class in Chicago.” Nope. Nope.

Richie Rump: I’m surprised someone actually has interest in SQL on Linux. Please, tell me more.

Brent Ozar: Yeah, I don’t get it. I get that there are people who are Linux-only shops and they want an easier way to run databases. I find it kind of suspicious that people who have standardized on a free, open-source operating system are going to spend $2,000 a core on Standard Edition and $7,000 a core on Enterprise Edition.

Richie Rump: Yes.

Brent Ozar: There are some pretty good open-source databases out there. I’m not saying SQL Server sucks. It’s amazing. I love SQL Server. It’s fantastic. But if you’re really drinking the Linux kool-aid, which is a delicious kool-aid, and I endorse it highly, great stuff, I don’t see a lot of people going, “I sure can’t wait to give Microsoft my database money.” I just don’t see that.

Richie Rump: Yeah, let me just throw out this Postgres thing that does some things really, really well and let’s get into this SQL Server thing and it’s not really running on Linux but on this kind of faux hybrid OS-y thing that runs on top of Linux. Let’s go with that.

Brent Ozar: Yeah, it’s a virtualization layer basically. Drawbridge, which is now known as SQLPAL 2, SQL Platform Abstraction Layer 2. Anthony Nocentino has a great blog post about it that will be in next week’s weekly links, how he reverse engineered what Microsoft is doing in order to run SQL Server on Linux. Early testing, for example, Klaus Aschenbrenner has done some early testing with Hekaton and found that it’s almost twice as slow. Now, I have no reason to believe that Microsoft won’t fix that stuff as we get closer to production but I don’t see a lot of people going, “I’d like to use it on Linux and have it be slower, please.” I just don’t see that.

Richie Rump: Yeah, let me use an in-memory database that’s slower and it costs me more.

Brent Ozar: Yeah, yeah. Someone else says, “I realize it’s not a replacement for Windows, it’s more for people running a lot of Linux concurrently.” Linux isn’t starving for database choices. What Microsoft will tell you is that people who are using Oracle are looking for a way out. I would just say, as an independent database consultant, if you’re going to go rewrite your app, because essentially if you’re using PL/SQL, you’re going to be doing an app rewrite to move from one database to another. If you’re going to rewrite your app, why not go with open source? Why not go with something that has no licensing costs to it? Sure, you’re going to still pay support but Postgres has got some pretty good looking legs on it right now.

Richie Rump: [Whistles] I mean at that point if you’re thinking about changing your platform, why not go to the cloud? I mean, there’s a lot of stuff out there that’s looking really, really good right now and just because I get to play with it every day, I’m not just saying that, right? It’s true. There’s some good stuff out there right now.

Brent Ozar: Yeah, but AWS has Aurora, there’s all kinds of interesting scalable databases. I am a huge fan of Azure SQL db. I think if you’re going to write a .NET application that wants a relational database up there and you can build it from scratch inside one database, no cross-database queries if you can avoid it, Azure SQL db takes a lot of crappy work out of database administration, stuff that we just don’t like to do.

Brent Ozar: Follow up question, “Can you teach a fish to climb a mountain? Linux is a different animal. I don’t see that getting anywhere.” Well, so, because I happen to be an Alaska guy, I’m a real big fan of Alaska, I’ve seen salmon going upstream, so I do know that that can happen. And they taste fantastic.

 

What do sleeping queries mean?

Brent Ozar: Next question. “When queries are in a sleeping status, does that mean that the application is not closing out the connection to SQL?” Yep, exactly. If you look with one of my favorite diagnostic tools, sp_WhoIsActive, if you look with sp_WhoIsActive, that’s something that’s not included with SQL Server, it’s an additional diagnostic tool that Adam Machanic out of the Boston area wrote, it will not only show you if it’s sleeping, it will show you if it has open transactions too as well.

 

Why are partitioned table queries slower?

Brent Ozar: Next question. “Why are queries slower on a partition table? You just mentioned this.” If you go to brentozar.com/go/partitioning, we have links that explain it. Kendra Little did a few example blog posts using the Stack Overflow database. If you split a table up into a bunch of little, bitty child tables underneath, SQL Server has to recombine those results when it goes to join them to another table. So you can end up with even more expensive sorting. There have been a couple of clients where 80 to 85 percent of the cost of their select queries was simply reassembling the data from multiple partitions and then resorting it into a way that you could have had an index on a non-partitioned table. “Well, you can put indexes on a partition table.” Yes, but then you can’t do partition switching. You can’t drop a partition or load a partition at once.

Richie Rump: Yep. That’s why we always—I had a rule when I was working on that 60-terabyte [monster monitor] database that we always had to include the partition key. That way we didn’t get any sort of craziness, joining things back together and things like that.

Brent Ozar: You mean included in the select query, like in the where clause.

Richie Rump: That’s correct, yep.

Brent Ozar: As long as you can do partition elimination, as long as you can tell SQL Server, “Only look in these partitions” and it’s a minority of the partitions, you can get good performance improvements. And I’m not saying I don’t love partitioning, partitioning is awesome. You just have to find the right use case for it. Maintenance, making table maintenance faster is the great use case for partitioning. Making selects faster, not quite as much.

Richie Rump: But if you do the maintenance incorrectly, you could totally hose up a database. Let me tell you.

Brent Ozar: Oh, I need to have a blog post from the team on what were your worst database memories. Worst outages we’ve ever had.

 

We got an error…

Brent Ozar: Let’s see, we’ll take two more questions. One of them is, “We tried to install SP3 to SQL Server 2012 and got an error hexadecimal value, invalid character, blah blah blah. Tried redownloading the SP, same error.” I would just open a support call with Microsoft. Things like that they may actually even know about the problem and have a fix for it.

 

What feature do you wish SQL Server would add?

Brent Ozar: Last one we’ll take is, “What would be the number one feature you would want to see in the next version of SQL Server that some other database platform has but we don’t?” So I’ll let you think for a minute, Richie, because I have mine, and both me and Erik have blogged about this. I want non-logged tables. I want to be able to say that this specific table I don’t need to log it at all. Whatever data is in there I’m going to live with. If something hoses up mid-transaction whatever I have for records in there is totally fine for me. The great examples of that are sessions, state tables, reporting tables for like data marts that I could just clean out at the drop of a hat. You can’t just fix this by putting a database into simple mode—model. Tara will kill me if I don’t say “model.” Simple recovery model will still log all your inserts, updates, and deletes to disk. Whereas if like Postgres has non-logged tables, that would let me solve some problems with clients really quickly by just saying, “This table is garbage. Let’s turn off the logging.” Like if I had that over the last year how many real customer problems would I have solved? It’s probably on the small number of fingers on one hand, it’s just so cool, so I’m kind of jaded there. How about you, Richie?

Richie Rump: So I could go the, hey let’s get some of the windowing functions to spec fully because they’re mostly there but they could—but I’m not going to go there. There’s a feature that they just released for Aurora that I’ve been playing around with, probably finish playing around with this week, but you could load data from S3 into Aurora, so I’m going to say that, being able to load data from a JSON or XML file and just by one line on the command line and it just goes up and does it.

Brent Ozar: All right, so now I can think of another one. I would also love to have on Always On Availability Groups, I would love to have—and this is just pie in the sky, this is never going to happen. I would love to have one replica that doesn’t have indexes. I would love to have a replica that is just purely the clustered index, not any of the nonclustered indexes, or different nonclustered indexes somewhere else. So it would kind of give me the best of transactional replication plus Always On Availability Groups. It is never going to happen. I say that, I would love to have it happen, I just don’t see it happening. J.H. says, “How about table level restores?” All the third party backup software can already do that anyway so I’m kind of like, eh, I’m used to seeing that out of LiteSpeed, Redgate, and Idera. So if you want that, you can get that today for like $1,000 a server.

Richie Rump: There’s another comment, “Am I mistaken but isn’t no-logged table the same as in-memory tables? Or do those get logged?”

Brent Ozar: It’s not. With in-memory tables, first off, you have to accept all the restrictions of in-memory tables, which rule out a whole lot of capabilities of regular, plain old tables. Plus, if they’re durable, they are still written to disk. They’re just written to disk in a different format. You can say nondurable in memory tables but those are simply erased whenever you go restart that SQL Server. I do want the data there, I just don’t need it logged.

Richie Rump: Yeah, another great way to put session information inside of SQL Server because that’s always a great idea. Yeah, let’s do that.

Brent Ozar: And you know, so I actually came around. When they said that they’re going to allow Hekaton in Standard Edition I was like, okay. So finally, I will accept session state in Hekaton on Standard Edition. I’m not happy about it, not excited about, it still seems like flushing money down the toilet but if you had to flush money down the toilet, now you can just flush it down with Standard Edition money instead of Enterprise.

Richie Rump: Yeah, so if you want to flush any more money down the toilet, you can go to Richie Rump Palmetto Bay, Florida 33157. There you go.

Brent Ozar: There you go. And on that bombshell, we will end. Good night, everybody. Thanks for watching. We will see you next week. Adios.


SQL Server 2016 SP1 Bug with Graphical Execution Plans

Execution Plans, SQL Server
19 Comments

Early adopters: if you’re rolling on SQL Server 2016 SP1, you may have noticed that when you click on query plans in sp_BlitzCache or sp_WhoIsActive, they sometimes render as ugly XML rather than the beautiful graphical showplan.

Here’s a 3-minute video showing the bug:

SQL Server 2016 SP1 Query Plan Bug

To work around it, save the XML plan as a file – but you’ll need to frame the name in double quotes, like “my beautiful query.sqlplan” and end it with a sqlplan extension. Then, open it up again, and SSMS renders it as a graphical plan.

To get this fixed, vote up these 3 Connect items:


Index Scans: Not Nearly As Bad As You Might Think

Execution Plans, SQL Server
14 Comments

Using our good old buddy the StackOverflow demo database, let’s look at the Users table. I want to get the first 10 users in the table ordered by Id, the primary clustered key, which is an identity field.

Here’s the actual execution plan:

https://www.brentozar.com/pastetheplan/?id=HyuSDxZye
https://www.brentozar.com/pastetheplan/?id=HyuSDxZye

Ooo – clustered index scan – that’s bad, right?

Not so fast. Click on the plan to view it in PasteThePlan.com, and hover your mouse over the Clustered Index Scan:

Only 10 rows read. Hmm.
Only 10 rows read. Hmm.

We didn’t scan the whole clustered index – just part of it.

And you’ve always heard seeks are great, right?

Let’s test that out too with a very similar query:

https://www.brentozar.com/pastetheplan/?id=B1laKeWJg
https://www.brentozar.com/pastetheplan/?id=B1laKeWJg

That query plan gets a clustered index seek. Perhaps we’ve been told that seeks are better than scans. So is this query less impactful than the first query? Absolutely not – this query takes a heck of a long time to run, and if you click on the plan and hover your mouse over that seek operator, you’ll see a lot more than 10 rows get returned.

The only difference between seeks and scans: where you start.

Seek means we know the starting point’s value. The starting value might happen to be the first row in the table, but that’s irrelevant. Seek means we know the value – in the case of our latter query, we knew the Id would be 1. Sure, as human beings, you and I know that we started our identity field with a value of 1 – but SQL Server doesn’t know that, nor does it know for sure that you never set identity insert on, and backloaded a bunch of data with negative identity fields.

Scan means we start at one end of the index, regardless of what value we find there.

Neither seek nor scan has anything to do with where we stop reading. We can do a scan that only reads a few 8KB pages, and we can do a seek that reads the entire table.

This has two big performance tuning implications. First, when you see a plan, you can’t just look at seek vs scan. Second, when you read the index usage DMVs, you can’t judge efficiency based solely on seek vs scan.