Blog

There’s Something Funny About Variable Assignment

Humor, SQL Server
35 Comments

Guess who

I’m going to give you three queries, and you have to guess what the output will be before you run them.

Here they are:

Go ahead

Feel free to sub in the compatibility level that your databases are actually in before running these on your own server.

I promise, they won’t bite.

But let’s talk about your guesses first!

Will it return…
master
tempdb

Or

master
master
tempdb

You have to pick one before you run it.

Hardweirded

If you guessed two masters, congratulations! You’re smarter than I was a few days ago.

While working on a script, I ran across a weird bug in it. Whatever database was processed last would get processed over and over again until another database was ready.

Picture having 5 databases — on the first run, the loop would hit database 5 and process database 5 several times. On the second loop, it would happen with database 4, and so on, then repeat.

I couldn’t figure out why at first, but then it hit me — the variable wasn’t resetting when it hit a NULL value! It only reset when it hit an actual value. I had to hard code a reset to NULL in between runs.

SET vs. SELECT

I could have avoided some of this trouble by using SET instead. There are some basic differences between the two, but this is an important one.

If I replace my code with this, it correctly assigns the NULL and returns nothing.

Thanks for reading!


Announcing Expert SSIS Training with Andy Leonard

Company News
2 Comments

SQL Server Integration Services has changed a lot over the last decade, but one thing has remained the same: Andy Leonard has been the go-to name in the business. I’m excited to announce that we’ve partnered with Andy for online delivery of his master-level course, Expert SSIS Training.

Here’s the abstract:

Data integration is the foundation of data science, business intelligence, and enterprise data warehousing. This instructor-led training class is specifically designed for SQL Server Integration Services (SSIS) professionals responsible for developing, deploying, and managing data integration at enterprise-scale.

Expert SSIS Training by Andy Leonard

You will learn to improve data integration with SSIS by:

  1. Building faster data integration.
  2. Making data integration execution more manageable.
  3. Building data integration faster.

We surveyed thousands of subscribers, and based on your choices, here’s the topics Andy will teach:

  • SSIS Design Patterns for Performance – how to build SSIS packages that execute and load data faster by tuning SSIS data flows and implementing performance patterns.
  • SSIS Deployment, Configuration, Execution, and Monitoring – the “Ops” part of DevOps with SSIS using out-of-the-box tools and the latest utilities.
  • Automation – how to use Business Intelligence Markup Language (Biml) to improve SSIS quality and reduce development time.

It’s a live online course on September 11-12, Monday-Tuesday. It’s $1,995, but if you register in June with coupon code JoinOnline, you can save $1,000 off. Learn more and register now.


[Video] Office Hours 2017/06/21 (With Transcriptions)

This week, Brent, Erik, Tara, and Richie discuss cross-database transactions in AGs, agent job owners, column store indexes, linked servers, migrating SQL Server databases to AWS cloud, synchronous readable secondary, rebuilding indexes, backups using SANs, licensing, Always On Availability Groups on CDC, Azure Cosmo DB, compression, and encryption.

Here’s the video on YouTube:

Office Hours Webcast - 2017/06/21

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Enjoy the Podcast?

Don’t miss an episode, subscribe via iTunes, Stitcher or RSS.
Leave us a review in iTunes

Office Hours Webcast – 2017-6-21

 

Cross-database transactions work in AGs – why aren’t they supported?

Brent Ozar: Brian asks, “In SQL Server 2016 and below, Microsoft says cross-database transactions aren’t supported with AGs, but I’m testing and it seems to be working, what’s the big deal? Why shouldn’t I do cross database transactions and AGs?”

Tara Kizer: It’s not that you can’t do them. What’s going to happen on the rollback, isn’t that where the issue – there could be an issue on the rollback, so as long as your servers are up and there’s no issue, it’s going to look like it’s working. I would test – do a BEGINTRAN in one of them and do a bit of changes and then restart one of the servers. Do this in a test environment, obviously. Or if it’s on the same server, crossing the same instance and just, you know, take the other database offline and see what happens on rollback.

Erik Darling: What’s the gotcha on distributed transactions, like when you actually use DTC and not just cross database?

Tara Kizer: [crosstalk] with AG 2016 – you know, I just did an AG client yesterday so I have this fresh in my memory, but you need Windows 2016, and our deck says that it’s somewhat supported. DTC support and Windows 2016, sort of…

Brent Ozar: It’s so bad. It’s only supported between two databases on different servers [crosstalk] database transaction, they can’t be in the same AG, they can’t be on the same server. So it’s – if you go to the Books Online page about AGA support for DTC, scroll all the way to the bottom, there’s a beautiful example just in plain English, like a thought experiment that shows you why you’ll lose data.

Erik Darling: I think they fixed that up a little bit for 2017, like it’s fully supported.

Brent Ozar: Yeah, it’s totally supported…

Erik Darling: On the same server and everything.

Brent Ozar: There’s a gotcha, when you set up the AG, you have to define where you want to lose data. If you have extra data in one database, do you want to discard it or how do you want to react to it?

Erik Darling: Oh that gave me the chills, I’m sorry.

Brent Ozar: But at least it’s supported, because how often does that really happen, you know, that you failover mid transaction? I say it sarcastically…

Richie Rump: On purpose?

Brent Ozar: Right, on purpose you would just do a plan failover and it would be pretty graceful, it wouldn’t be a very [crosstalk]…

 

Should I change my Agent job owners to SA?

Brent Ozar: J.H. says, “is it safe and best practices to switch all my agent job owners to SA?” He said, “I’m worried about someone leaving and having their login disabled.” What do you guys for agent job owners?

Erik Darling: That.

Tara Kizer: Yes, that’s the best practice. I had a client recently where it had created an sp_whoisactive job, so that it would start collecting some data for us, and they saw me adding SA as the job owner. They were like, oh wait a second, SA is disabled. It’s like, it doesn’t matter, still is going to work.

 

I’m trying to compare 60M rows over linked servers…

Brent Ozar: M.M. says, this is probably going to be good for Richie, M.M. says, “I’m trying to help a developer…” I already love this guy, or woman, who knows? M.M. I can’t tell. “I’m trying to help a developer and myself to improve performance. We have a query that every day it joins a table across network servers using linked servers or open query, and it’s trying to compare 60 million rows, 11 columns on both of them, one to one. What are my best options?”

Tara Kizer: Unsubscribe. You need to use SSIS for that type of thing. Don’t do this type of stuff in T-SQL. SSIS can handle this more efficiently than T-SQL can.

Brent Ozar: Richie, what’s your thoughts too?

Richie Rump: Yeah, my first thought was move everything to one server, but yeah, SSIS would be better, especially with that kind of load, you know, 60 million rows. Yeah, probably better.

Brent Ozar: And if it’s only once a day, if you wanted to move everything onto the same server, you just do a restore. Just backup – take your nightly backups and restore them over onto the other server, because it doesn’t have to be up to the exact moment in time.

 

Can you explain columnstore indexes?

Brent Ozar: Let’s see here, Michaela – see I remembered your name from last week. Michaela says, “can you guys explain column store indexes?” I’ve got to start reading these questions before I ask them.

Tara Kizer: Erik, that’s on you… And go.

Erik Darling: Well you see, there’s this dictionary… So column store indexes are neat. They are meant for big analytical or data warehouse type workloads. They have excellent compression, and assuming you have all your row groups and you’re not using any data types that disagree with column store indexes, like for example, strings, and you don’t care much about constraints, they’re great. But there are a lot of gotchas with them; probably too many for me to get into here. Also, they’re not good for OLTP type workloads because when you’re doing lots of little inserts, updates and or deletes, you end up with all sorts of bad things. Well, you end up with things that don’t get compressed correctly or fully and it’s just not really a good scene for that.

 

Is Microsoft getting rid of linked servers?

Brent Ozar: So let’s see, Dorian says, “does Microsoft have any plans to get rid of linked servers?”

Erik Darling: I don’t know how you would.

Tara Kizer: I don’t think you could.

Erik Darling: One could only hope.

Brent Ozar: And cursors and triggers and…

Erik Darling: Everything I don’t like.

Brent Ozar: And I would like a pony. No, yeah so probably not, because as long as anything’s being used out there in the wild, Microsoft’s all about adoption rate. Like they want to get more people using the product and taking stuff away is a sure fire way to get them to stop upgrading.

Erik Darling: And plus, if you take away linked servers, you take away a very easy path that people have to take data from another platform and put it into SQL Server and start paying Microsoft money for licensing.

Brent Ozar: That’s true.

Erik Darling: How else do you get off MySQL?

Brent Ozar: [Laughs] … Well, you make that face, you’re using Postgres right now, so at least better than MySQL [crosstalk].

Erik Darling: Different universes, man.

 

Will AWS help me migrate into their cloud?

Brent Ozar: J.H. says, “does AWS offer contractors to help us migrate SQL Server databases up into the AWS cloud?”

AWS Snowball

Erik Darling: They’re called Snowballs.

Brent Ozar: Oh that’s true, talk about what a Snowball is.

Erik Darling: They send you a cooler with a disk drive in it and you move all your data to it, and then you ship it back to Amazon and then Amazon puts it magically up into the cloud. So if you’re dealing with terabytes and terabytes of data, it’s awesome because it’s much safer and quicker to do that. Imagine the feeling of upload failed on like ten terabytes. So it’s a lot safer and easier way to do things. Probably they wipe the disk that they sent you before they give it to anyone else.

Brent Ozar: AWS Snowball, very cool. And do they have consultants? Yeah, sure, absolutely, and there are tons of people out there. That’s not something that we do, we don’t help you with the migration process, although we do help you sketch out should you go to AWS or not, and then should you do RDS or EC2.

 

What’s the best way to use SSRS with AGs?

Brent Ozar: Steven says, “Thank you so much for running these sessions…” You’re welcome. “I watch every week on Youtube and this is my first time watching live…” Oh, very cool. He says, “what’s the best way to use SSRS with AlwaysOn? We’re moving to AlwaysOn with a synchronous readable secondary. Okay, so let me stop there and ask, well what are your thoughts when you hear synchronous readable secondary?

Tara Kizer: I’m like, yes. I’ve supported, you know, extremely mission critical system that had high-performance requirements and, you know, an extremely busy system and you have to be careful on that synchronous readable secondary, because the load there can impact your writeable side. And we certainly had issues with that, but our reports had to be near real time, and an asynchronous readable secondary could have significant latency, depending upon your transactions.

We didn’t do index rebuilds on this system anymore because of the latency on the synchronous server, the blocking that it would cause there. But a lot of people are doing index rebuilds, and just imagine what the latency would be on the asynchronous readable secondary. So I’m okay with what he’s saying, it’s just that you have to make sure you have the hardware to support it, the indexes to support your queries and the right queries on that synchronous readable secondary. If you don’t, you’re going to cause problems on your primary replica.

Erik Darling: You know, and Tara’s talking about rebuilding indexes, which I totally agree with, that’s a pain in the butt with AGs, but even if you find that after you’ve put stuff over on the secondary and all of a sudden you have this additional reporting workload or a change in reporting workload and you need new indexes, even creating indexes over there, which has the same effect as rebuilding, can be difficult. So words of caution there as well.

Brent Ozar: He has a follow-up part, he says, “would it be best to install SSRS on that secondary node or have another VM with SSRS on it?” And that probably answers your question there. As little workload as you can get on that VM is probably a good idea.

Tara Kizer: Well that and the best practice is not to put SSRS on your SQL Servers anyway. This needs to go on a separate server…

Brent Ozar: And why?

Tara Kizer: Well I’m all about SQL Servers being dedicated to SQL Server. Yes, SSRS is a SQL Server product but it’s really a separate product, I’m talking about the database engine, only supporting software like maybe antivirus software or tape backup software should go on your server but everything else – SSRS can go on a smaller VM. It doesn’t need the capacity that we have for the database engine.

Erik Darling: That’s generally our recommendation for any of the, sort of, tertiary non-engine components, SSIS, SSRS, put them on a different server. You don’t want them competing with your SQL Server, just because those resources are precious and, you know, unless you have things really well defined with how much stuff actually needs. They’re going to walk on each other and it’s not going to be pretty.

 

If I have two SANs, where should I store my backups?

Brent Ozar: James says, “we have two SANs, we have one SAN just used for data and one SAN is for backups. To make life easier in the event of either SAN failure, where would I store my SQL Server backups? Should I store my SQL Server backups to the data SAN or to the backup SAN where my VM backups are kept?”

Erik Darling: Yes.

Brent Ozar: That’s kind of interesting, yeah…

Erik Darling: I would do both. I would have them in both places.

Brent Ozar: I kind of like where you’re going with that. Well yes, because you don’t want to lose point-in-time recoverability if you lose either SAN, that’s a good point. And your backups aren’t usually that big. I mean, you don’t have to keep 30 days of backups on both SANs.

 

Does SSRS need a license when run separately?

Brent Ozar: Ben says, “if SSRS is on a separate SQL Server, does that require a separate license?”

Tara Kizer: Sure does. Sorry, best practice.

Brent Ozar: It was cheap – you might be able to get by with Standard too, you may not need Enterprise Edition there.

Erik Darling: And it can be a much smaller VM, you know. Like on your regular one, you might have 16, 24 cores on the SSRS when you might be able to get away with 4, 6, 8.

 

Have you used 5 AG replicas with CDC?

Brent Ozar: Seybou asks, “do you guys have any experience of using Always On with five replicas with Change Data Capture (CDC) enabled? What are some of the issues that might happen?” Any Of you guys used CDC?

Tara Kizer: I have used CDC but it was not on an availability group server. I can’t really think of issues – well I can think of one issue, CDC uses the transaction log. That’s where everything gets logged, so you get CDC reading the transaction log and you’ve got availability groups reading the transaction log on what to send out to the other servers. So I know when you have replication and database mirroring, those could have contention because everything’s reading the transaction log, so there’s trace flags in place that can help you to have – I think it’s to have replication skip over the mirroring records and things like that. I wonder if there’s something for CDC that would allow it to – you know, so it doesn’t have to scan everything in the transaction log, because I’ve had issues with that in the past. I have had supported systems with five replicas, but not with CDC enabled.

Brent Ozar: Gotcha, Seybou, if you want more info in there too, you might want to let us know what you’re going to use those five replicas for, what you need all five for, and maybe what you’re using CDC for as well.

 

What is Microsoft CosmosDB?

Brent Ozar: Folks, we’re down to a couple of questions left in the queue. If you have questions, feel free to put them in over in the questions pane and GoToWebinar. Alexandre asks, “guys, what is Microsoft Cosmo DB?”

Erik Darling: It’s what Document DB used to be.

Brent Ozar: And go on about – so was it just a rename or?

Erik Darling: I’ve read nothing about it. I know that it used to be Document DB and it’s a NoSQL document store with lots of JSON involved and that’s where I bail out. That’s where I pull the parachute whether I’m jumping or not.

Richie Rump: Yeah, didn’t they add some other types to that, as opposed to Document, there’s key value pairs and stuff like that?

Brent Ozar: Yeah, they had Document DB for the longest time and it wasn’t catching on. And I looked at it, we looked at it I think, using it briefly for execution plans, but the maximum document size was like 400KB. So it was like a key value store originally, then they just kept layering more capabilities on top of it. And now with the new Cosmo DB, they renamed it. They clicked file save as in visual studio and they saved it under a different name. Now Cosmo DB has all kinds of other things it can store instead of just 400KB key value pairs, but in terms of, like, us knowing any more than that, no, because stuff – like all the NoSQL databases, you generally have to write your frontend app differently or your backend apps differently to access data in different ways. It’s not like they’re just capable accepting inserts, updates and deletes, you have some different work to do, so it’s not lightweight.

 

Can I compress backups of varbinary(max) data?

Brent Ozar: James says, “our beautiful and attractive developers store attachments in the database, so I have a lot of columns that are varbinary max. Is it possible to compress my backups?” Well so generally, if it’s binary files they’re probably not all that compressible to begin with, like if it’s Excel or Powerpoint.

Erik Darling: Yeah I’m just thinking about, like, the way that gobbledygook compresses when you use TDE, and I don’t think that does too well either. I wonder if even, like, page compression would work well on those at all?

Brent Ozar: If it’s off-row data you’re screwed. Yeah, as soon as it goes off-row, you’re out of luck. What I would do is, if you have anything like – I always blast dedupe appliances. I’m like dedupe appliances are horrible for SQL Server backups, but this is kind of what they’re designed for. They use additional CPU power in order to do like single instance storage. It might actually help you out there. But the other thing you could try is you could just try zipping or RAWRing your backups, just the files. See if they compress at all, if they don’t you’re screwed.

Erik Darling: Yeah, we have a 7zip script. What I would focus on, rather than that, is trying to talk my developers out of doing that. Just storing pointers to the files on the file system, getting rid of the blob files in the database and just, you know, pointing to wherever they live and letting SQL find them – or letting the application fins them that way, just doing the lookups through SQL.

Richie Rump: If the developers are beautiful, does that mean they took beauty over brawn… Brains.

Brent Ozar: He actually said they were stupid, but I didn’t want to say…

Tara Kizer: I was going to say. I read the question.

Richie Rump: Oh there it is, yes, stupid developers. Well I can’t argue with that one.

Brent Ozar: Man, … there’s my little soapbox is – you know, we always think – I say we, as database administrators, a lot of us have this inflated ego, like we know everything and those other people in the shop are stupid. And then we run sp_Blizt on your servers and I go, you don’t have any pants on, you know, just stop calling other people stupid, you don’t have backups set up.

 

What backups can I take on an AG replica?

Brent Ozar: Brain says, “For availability groups, can you take differentials and log backups on my read – which backups can I take, fulls, differentials or logs on my secondary replicas?”

Tara Kizer: No differentials and only full copy only, but you can take log backups there. I don’t recommend taking backups on secondaries, honestly, because they could be out of date. Even a synchronous commit replica can be out of date. SQL Server could drop it to an asynchronous replica silently because of some timeout issues with syncing the data over there. So I’ve only ran all my backups on the primary replica, and if you’re trying to offload that kind of task, I wonder if maybe you’ve got some I/O issues, you know. Why are you having to offload a backup task? Since when did backups cause issues?

Erik Darling: Tara, you said one thing in there very causally but it’s something that no one – it’s something that I forget a lot about AGs, is that behind the scenes, they could get dropped to async if SQL Server detects enough latency. And that’s something [crosstalk] insane.

Tara Kizer: I actually didn’t even know that, you know, until I saw the blog post from, I think it was Brent, you know. The quote from MSDN is like oh, I wonder if that actually ever happened to me. I’ve no idea because it’s doing it silently, you don’t know, and you know, I’ve never had an issue with failovers. Obviously a failover is not going to occur if it is an asynchronous replica. It’s going to say it can’t do it right now, or whatever it’s going to say. So I don’t even know if it’s every occurred on the systems I’ve supported.

Erik Darling: That is not a bolded bullet point in the Microsoft pamphlet about availability groups. That is not up front.

Brent Ozar: [crosstalk] I should totally do a blog post of my favorite little hidden nuggets in Books Online, just gems. Because the one like the async index DMVs, column order is not a part of our recommendations, we’re just giving you a CSV list of fields, and everybody assumes that when they see the missing index creation in an execution plan or in the DMVs, they’re like oh this must be the order, right, they’re comma delimited.

Tara Kizer: Did that issue exist for database mirroring, for synchronous database mirroring?

Brent Ozar: I don’t remember if it was in Books Online or not, because I hardly ever use synchronous database mirroring. It’s one of those, looked really cool but…

 

Is full text indexing a performance drain?

Brent Ozar: M.M. asks, “some of our servers have full text indexing turned on and other ones don’t. The architecture was set up by an accidental DBA; could this be affecting performance? How do I know if full text index is being used, or can I just turn it off?”

Tara Kizer: Turn it off and wait for the problems. I’ve had it installed on most of my servers, you know, on failover cluster instances where it might be hard to add later. It doesn’t cause any performance issues, you know. When I’ve had performance issues on a cluster, I’m looking to maybe stop some services and I have had issues where that service was failing, that cluster resource was failing. And I didn’t want to bother with trouble shooting, so I just deleted the resource, you know, because I didn’t need it. I’ve never had it cause performance issues where it wasn’t being used.

Erik Darling: Yeah, I’m trying to think of ways you can figure it out, like maybe if you looked at the last crawl date – like if you look at some of the DMVs for full text indexes, you could see the last crawl date and the last crawl completed date. You might want to look through your plan cache to see if there are any queries that have contains or contains table. And I’ll make a deal with you, if you can send me a query that has XML that uses full text, I’ll see if I can get something into BlitzCache that detects when full text indexes get used. I can’t guarantee it, because I don’t know if it’s going to be in the cache plan versus the actual plan, but I’ll see if I can do something with it.

 

Is there a future in SQL Server health checks?

Brent Ozar: Dorian says, “I saw an ad from somebody where companies can pay to have a DBA look at their databases. Do you see this as becoming the norm?” Well given that that’s what we do for a living, yes.

Erik Darling: No, we should be the only ones doing it [crosstalk] huge outlier.

Brent Ozar: We got in – it’s funny, several of us on here have been through – I originally got started with, it was Microsoft’s RAP, they have this – I think it’s called a risk assessment program, or risk assessment something or other.

Tara Kizer: Something, yes.

Brent Ozar: But it was – I went through that process and I was like, this is amazing, you learn so much. They give you like a 400 page PDF of everything about the database. And I was like, well I really just want the TLDR, I just want to too long didn’t read. Give me the five things I need to do or whatever, but it seems to be working out well; we get clients. Knock on wood, I should touch wood.

 

More about that 5-replica AG…

Brent Ozar: Seybou on his five node availability group, he clarifies back in, he says he’s, “got two nodes for high availability, two nodes for disaster recovery in another location, and then a third node for reporting that’s all set up in with the primary [crosstalk]

Tara Kizer: Same exact configuration I had when I first deployed availability groups. We actually had two synchronous replicas, secondary replicas on SQL 2012. One was for reporting, because it needed near real-time reporting, and the other one was the failover node, and then two asynchronous servers out in DR. We actually had three out there, because we needed exact hardware that we did at the primary site in case we had to do a failover to that environment, we had to run full power. But on 2012 you could only have four secondary replicas, so we had a cold standby. Microsoft wanted us to log ship to it, but I was like – once we did the manual failover, that’s when I’ll pull that one into the AG, yeah.

 

How can we encrypt data on Standard Edition?

Brent Ozar: Let’s see, Steven says, “if we don’t have transparent data encryption because we’re only on Standard Edition, what’s the best way to encrypt our data? Should I consider BitLocker?” It’s going to kill me – Netlib… So there’s a company called Netlib that offers a filter driver that sits in between SQL Server and the file storage, and they’ll encrypt the data. It’s way cheaper than Enterprise Edition. I’ve never used it. I did a webcast for them years ago and I was like looking at the product and I was like, this is kind of amazing. But I’ve heard good things about it since, I just haven’t used it myself. So Netlib…

Tara Kizer: That’s interesting that they picked that for their company name, because that’s the net library. That’s how you connect to the database drivers, I believe. Netlib was the GUI to them, I think.

Brent Ozar: It was an odd name [crosstalk]

Richie Rump: Nuclear option, have the app encrypt it.

Brent Ozar: And so talk about what the advantages of that are.

Richie Rump: So the advantages of that is, is that from the app and all the way to the database, the data will be encrypted. So that’s the big thing; and then if you don’t want anybody who has database access to be able to get to that data, it will be encrypted in the data. The problem is that if you want to do a SQL query on that, it’s kind of encrypted too.

Brent Ozar: Linked server queries, replication, etcetera…

Erik Darling: I would just avoid BitLocker, because that only encrypts the drive and if you move data off the drive, it’s not encrypted. And if you have BitLocker on other drives, like I have it on my laptop, and every time I boot my laptop up, I have to unlock the D drive. If you have a server go down and come back up, you have to have someone unlock that drive or it’s not really all that bit locked. If it’s just available when you start up, it’s not as useful.

 

Can you provide reference links?

Brent Ozar, Srijith says, “can you please provide a reference link for an AG secondary backup can cause an issue due to loss of connectivity?” I don’t know if we can do a reference link or whatever; I would just say kind of think through it. If the primary stops sending data to the secondary, for example, if the secondary restarts or gets behind, you can backup that thing every minute, but if it doesn’t have fresh data and your primary goes down, I hope that resume’s up to date.

Erik Darling: even just a network hiccup could put things behind a little bit.

 

Can my SAN encrypt my data?

Brent Ozar: Ben says – and I hear this about the cloud too. Ben says, “if you’re on a SAN, don’t most SANs have an option for encrypted storage?” oh man, so this is kind of tricky. So a lot of SANs have the opportunity – or, like Azure calls it SSE, self signed encryption stuff, where they go, scouts honor, promise, everything’s encrypted and we can’t access it. You don’t have to set up a key; we just do it for you automatically. And that’s true; right up until you do something like take a backup. You can take a backup of your database to another location, it doesn’t matter if it’s encrypted or rest, you’re still kind of screwed there. Same thing with querying the data, like Richie pointed out. If you’re worried about the DBA selling your data to the Russians, you’re still kind of out of luck there.

Richie Rump: [crosstalk] very much, yes.

Erik Darling: That was a full sentence.

Brent Ozar: There are a lot of Russians in South – I keep wanting to say North Cuba or South Miami, right. There’s a Russian contingency there.

Richie Rump: Yeah, they’re like a North Miami Beach, yeah, so…

 

Can high performance power mode cause problems?

Brent Ozar: M.M. asks, “somehow all our power plans reverted to balanced.” I’ve seen this with group policy, group policy’s done this. “Have you ever seen an instance where going from basic to high performance has caused a problem?” That’s a good question; I don’t think so. Not that I can think of. I’ve seen it going the other way, you know, going from high performance to basic has caused a problem.

Tara Kizer: He should be able to just do it live, don’t even have to reboot the server. The only comment, you need to look at your BIOS and if this is a VM environment, also look at the BIOS level setting.

 

Would you rather…

Brent Ozar: And then the last question we’ll take for today. Steven asks, “would you rather be chased by 1000 duck sized horses or one horse sized duck?” It’s a tongue twister even.

Erik Darling: I’ll take the one duck.

Richie Rump: yeah, I’ll take the one.

Brent Ozar: I don’t know, see I think I would die of laughter if I was being chased by 1000 duck sixed horses. I would – I’d just lay down and go see what happens, until they start peeing on me, that would probably…

Tara Kizer: If I was being chased by one horse sized duck, I would wonder if I’m in one of those movies from a few decades ago where it’s like these giant insects are coming through the city.

Brent Ozar: Honey I Shrunk The Kids…

Tara Kizer: I’d wonder if I was dreaming.

Erik Darling: You know what the problem with the little horses is? They’d be too cute. Remember there was that commercial? I forget what it was for, but the guy had a miniature pet giraffe, and at the end of the commercial – it was like a CGI pet giraffe and he would like give it a little kiss and giggle. I forget what that was, but I saw it and I was like holy cow, if you could genetically engineer me a miniature giraffe, that would be my pet. It would sit right here on my desk with a little pillow and hang out, I would be [crosstalk]. So when I think of miniature horses, I think about that giraffe, and I’m like, that would be adorable.

Richie Rump: Yeah, but think about it, horses are still going to be pretty fast, even if they’re miniature, right, but a duck, that’s going to be whiling around pretty slow. I could probably out-maneuver a horse sized duck as opposed to these thousand tiny horses, almost like fast moving zombies on me. Yeah, I’m not about that.

Brent Ozar: That’s true; I buy that. And with that, ladies and gentlemen, we will end this edition of Office Hours, thanks, everybody.


Does Separating Data and Log Files Make Your Server More Reliable?

Architecture
98 Comments

The old advice went something like this: “Put your data and log files on separate drives and your server will be more reliable. If you lose the data drive, you can still do a tail-of-the-log backup, and you won’t lose any data.”

It’s advice. But is it actually good advice?

Let’s think through it.

  • If SQL Server loses connection to shared storage, you’re still screwed. No surprise there.
  • If it loses connection to just one volume, and it happens to be the log file volume…you’re still screwed.
  • But if it happens to lose connection to just your data file volume, you’re safe! Well, you’re down, but you didn’t have any data loss (assuming you know how to do a tail of the log backup.)

At first, that sounds like you’ve cut your risks by 50% – but let’s dig deeper. This scenario correctly assumes that a single volume can fail. I’ve certainly had that happen:

  • A SAN admin accidentally remapped one of my volumes to a different server
  • A messed-up snapshot ran out of space (the SAN admin had accidentally taken a snapshot of one of my server’s volumes)
  • A raid array became corrupt

I can’t even begin to estimate how often these things happen, so just to pick a number, let’s say any given volume has a 1 year time between failures.

So it’s time for a quiz:

  1. If you put all of a SQL Server’s data files & logs on a single volume, how many failures will that server experience per year?
    • Bonus question: what kinds of data loss and downtime will each of those failure(s) have?
  2. If you split a SQL Server’s data files onto one volume, and log files onto another volume, how many failures will that server experience per year?
    • Bonus question: what kinds of data loss and downtime will each of those failures have?

Think carefully about the answers – or read the comments to see someone else’s homework, hahaha – before you move on.

“I disagree with your volume failure rate idea.”

I hear some of you saying, “Wait – I believe failure rates are not tied to volumes. It’s not that each volume can fail – it’s that a server will have a failure rate. I believe a server will lose a volume once a year.

OK, hotshot, let’s say that once a year (again, just picking a number), your server will lose one of its volumes. In that case, which design would give you the least data loss and downtime?

  1. Just 1 volume, with both your data & logs on it
  2. 2 volumes, 1 with data files and 1 with logs
  3. 10 volumes, 9 with data and 1 with logs
  4. 100 volumes, 98 of which are empty, then 1 with data and 1 with logs

If you’re arguing for answer #2, keep in mind that when your server has its annual volume failure, you stand a 100% chance of downtime and a 50% chance of data loss.

Now THESE are single points of success.

Whereas #4 has a 2% chance of downtime, a 1% chance of data loss. Brilliant! 1,000 empty volumes probably equals five 9s of uptime, woohoo!

Except now you stand a 100% chance of getting a bottle to the face from your storage admin. Ask for this configuration, and they’ll be happy to explain why adding more empty volumes to your server doesn’t magically protect any valuable volume.

If you still have a single point of failure in your log file volume, adding other volumes to do other things doesn’t help you. You’re just adding more single points of failure with their own failure rates.

On a related note, check out my 2009 post, Adding Reliability to Your Infrastructure.


Trainable SQL Servers

Funny thing

The human body is pretty good at adaptation. When you get sick or injured, your body responds and (most of the time) heals itself up. When you stress your body in a certain way, over time, your body will feel less stressed by the same level of that activity. If you do 10 push-ups every day, after a week you’ll barely notice you’re doing them.

One of my favorite sources for information about human body training is Mark Rippetoe. Perhaps it’s a bit of confirmation bias, because I like to eat, sleep, and not do cardio. I do like to lift something heavy a few times in a few different ways, then come back (after eating and sleeping and not doing cardio) and lift something a little heavier. That’s sensible training. To me, anyway. Again, confirmation bias.

Almond Butter and Creatine Sandwich

If I were a runner, I’d want to try run some distance faster, or be able to run a longer distance.

If I were into CrossFit, I’d try to juggle kettlebells while riding a unicycle on a balance ball and deadlifting a barbell I’m standing on so next time I can fight a truck tire and drag a fridge up a ramp while wearing a fire hydrant for a hat. Or something? Cool shorts, though.

But that’s enough about the human body. It’s all gross inside. Gross and smelly.

What if your SQL Server adapted?

Azure has something a little bit like that. It will monitor missing index recommendations, test the changes, and then decide whether to keep them or not based on some further analysis. But this is more like a doctor trying different medications on you until one works.

There’s also, somewhere in the future, a feature called Adaptive Query Processing coming to a vNext near you. But again, this isn’t training. This is more like you saying “doctor, it hurts when I do this” and the doctor saying “stop doing that”. Even a flatworm turns away from pain.

This also… is not training.

This doesn’t help your CPUs if your app takes on lots of new users, who start running more queries. This doesn’t help your disks as users start committing more transactions and reading more data. It doesn’t help memory when your data doubles or triples past your currently allotted RAM. It doesn’t help tempdb when the 50 GB drive you gave it is all of a sudden 150 GB too small.

Your server faithfully uses the same hardware, and level of hardware, with absolutely no adaptation.

Just sits there. Does not change. Despite the fact that you stressed it (workload), and fed it (data), and it got some rest (users went to bed), it did not change.

It had the entire feedback cycle necessary available to it, but it did nothing to react. That part is still up to you.

What would adaptation look like?

The first thing to figure out is where and how this is possible.

Physical servers are pretty much out, but it could work if you had a several node AG and some kickass ROBOTS that had access to shelves of compatible hardware. You’d also have to trust some internal process to failover, take a node offline, replace hardware, bring it online, and so on down the line. If you think this sounds ridiculous, you’re right. If you don’t, you might be one of those phone company employees who thought the internet getting popular meant everyone was going to need two phone lines.

Physically situated VMs are… okay. But if you underpowered your host(s), you’re stuck with the same kind of scenario as above. You can’t give a VM more RAM than exists in the host. Again, ROBOTS! And ridiculous.

The cloud is the only place this is really doable. There’s just gobs of hardware and redundancy already.

No offense to AWS or GCE, but Azure is sort of a natural fit. It’s Microsoft, and they can stick their fingers in the source code to add whatever additional feedback a server would need to adapt to workload changes. Granted, any of them could use wait stats and other DMVs, or even Extended Events, and probably do a really good job of automating self-tuning hardware. It would still be a process, but the sweet part is that you spin up a new VM with powered up parts where deficiencies are detected and fail right over to it. That part of the cloud is spectacular.

Of course, having read some Microsoft RAP reports, I’d be a little nervous about them just making any ol’ change to my prod server.

“What did we find?”
“Context switching!”
“By God, you know what to do.”
[Sets stats to update async for msdb]

And yet I digress! You probably wouldn’t want this to kick in for a momentary spike. You’d probably want some period of sustained load around 5-10 minutes before the ROBOTS started organizing to upgrade something. Much in the same way you don’t want to fail your entire AG over for a 10 second network blip. You’d want an “Is Waffle House Open?” strategy. You don’t want a reaction until something serious is happening.

Any monitoring tool, too, could feed these metrics out.

Unless you’re hitting THREADPOOL or some RESOURCE_SEMAPHORE* waits. Then nothin’ is feedin’ nothin’ nothin’.

Which just might be a good time to start thinking about the ROBOT stuff.

You also may want to set some thresholds for the ROBOTS to respond to. There are all sorts of business rules that could dictate automated hardware increases. Increases in user counts or data size, after a code release, or if you on-board a new client. The possibilities are… possible. Sorta. I think there’s some “machine learning” fad kicking around that could help with this kind of thing.

We all know how that ends though.

The future

Stuff like this is fun to speculate about. It’s like IT science fiction. Which I guess is regular science fiction. Trans-serverism? I don’t know.

It’s probably already in the works somewhere, maybe under a secret patent.

Sorry if I spoiled anyone’s big reveal by writing this.

Thanks for reading!


24 Hours of PASS: Last Season’s Performance Tuning Techniques

#SQLPass
6 Comments
Brent, last season (and I have so many good bad photos to share)

I’m excited to announce that Erik and I are presenting the opening session at this year’s free online 24 Hours of PASS webcasts! Here’s our session:

Last Season’s
Performance Tuning Techniques

You’re kinda-sorta comfortable doing performance tuning on SQL Server. You’ve read a few blogs, you monitor Page Life Expectancy, you rebuild your indexes, and you add an index here or there. However, you haven’t been to a day-long performance tuning class yet, and you’re wondering what you’re missing.

In SQL Server, performance tuning changes fast. One day Perfmon counters are in, and the next day they’re out. It’s totally okay if you’re still using SQL Server 2008 – that’s not the problem. You shouldn’t be wearing hammer pants, and you shouldn’t be looking at PLE, monitoring Disk Queue Length, or putting your data and log files on separate drives.

In this rapid-fire session, we’ll show you specific examples of why performance tuning tips and tricks have fallen out of fashion, and why they’re not too legit to quit.

Erik, last season (his parents made him cover up the tattoos during the holidays)

Come for the bad fashion.
Stay for the SQL training.

We have a dangerous combination: a box full of old family photos, and no sense of shame.

You’ll see bad hairdos, fashion choices, and SQL Server tuning techniques. You’ll probably recognize some of your own bad decisions in there.

We won’t be able to bring your skills entirely up to date within an hour, but at least we’ll be able to show you why we’re teaching a Summit pre-con, Expert Performance Tuning for SQL Server 2016 & 2017. We see so many DBAs with years of experience, but they haven’t bothered to update their skills – or their wardrobe. Let’s get you fixed up.

Register for the free 24 Hours of PASS, then talk to your boss about getting to the Summit. Let’s get you dressed for 2017.


Builder Day: Cloudbuilding Episode 1 – The Road to HANA

Staving off obsoletion

Brent has recently set aside “free” time to learn about stuff up in the cloud. I only put free in quotes because we do have to do something with a cloud theme, but we can pick whatever we want.

Naturally, I chose the hardest thing with the least practical use.

I chose to set up SAP’s in-memory HANA database, and load some data.

That seems to me like a natural way to learn about a database platform; after all, that’s the first thing I did with SQL Server.

What follows is a slow descent into madness, where I learned that ports need to be open, people still use PuTTy, and the German word for “damned”.

Fair warning: I already had a VPC and security set up for me, because we use AWS for a bunch of stuff. If you’re unsure about how to do that, join the club. You’ll need to figure that stuff out if you want to follow along at all.

Bright and early

My day started full of hope and optimism. It ended with this blog post.

Everything is going to be fine.

When you’re setting up your HANA instance for the first time, you need to put it in a VPC if you want to use certain instance types.

Why?

I don’t know.

Is there a point?

You’ll also need a Key Pair set up.

Why?

The reason for this is a bit more straightforward. You need this for your PuTTY session!

But, funny story — PuTTY can’t use the native .pem files that AWS gives you. You have to use PuTTYgen to convert them to a ppk.

So, with my freshly converted .ppk, I set off to SSH in to my brand new AWS HANA instance!

HANA shot first

No experience with SSH is easy. The first problem I had: Ports!

I had to go into my security group settings and open up the SSH port.

Being a DBA is cool.

Which means everything was cool, right?

Wrong.

Thanks, pal.

Question:

Q

Answer:

A

Language Lessons

Okay, look, I’m not a networking guy.

But still.

C’mon.

In search of

If you’re gonna tell me to go to a site, at least make the path clear.

NOT COOL

At that point, I went in search of a drink. Since it’s Monday, that means there’s nothing left in the house.

With an empty liver, I then went in search of the right word to describe my feelings, which usually leads me to the German section of Google Translate.

They’re emotional people. Like me.

Nothing more than feelings.

If I ruled the world

I’d open all the ports.

Okay, so not all of the ports.

But I’d open this one.

Sillypants

With that done, things finally started cooperating. Temporarily.

Man who thought he’d lost all hope loses last additional bit of hope he didn’t even know he still had.

 

Summagun
NO YOU HAVE FUN SSH HEAD

Where are we now?

Well, nowhere.

I still don’t have a running HANA instance, or data loaded. But I do have access.

Grumpy

This went on for a long time.

Staaaaaaaaaaahp

But as far as setup goes, things generally went okay. I had to generate a new Access Key ID, because I couldn’t remember what I did with the .csv export of the one I made when Jeremiah first made me create one in 2014. Those Secret Access Keys are not easy to remember.

Straight up jibberish, really.

But eventually, I had a working instance. Connecting was also straightforward. I just needed the IP address and the SYSTEM account password. You set that up in the web gui that appears after you okay the software agreement.

I got a license key error initially, but it was easy to fix using the instructions here.

One thing to keep in mind is that when you connect to your instance, you can mess with elastic IPs and host files if you want, but you’re most likely going to need to ’00’ in the Instance Number field. This had me stumped for a bit; I didn’t see it in any of the standard documentation, but then again my eyes tend to glaze over when I hit standard documentation.

Walkaway

And I still haven’t loaded any data

It’s been hours. I had like four birthdays. What the heck, cloud?

One last thing stood between me and having sample data to mess with.

SAP documentation.

This is one of the worst written pages on the internet, and it links to equally poorly written pages. Let me save you a ton of time and liver fat.

When the documentation page tells you to use the “Import Utility”, and the linked documentation pages all talk about the list of stuff in the SAP HANA Modeler, which looks like this:

Obviously

You don’t want that. You don’t want anything in there.

Do not click on any of this time wasting nonsense

You will spend a stupid amount of time clicking on things that don’t work.

You want the import option here:

FOREEEEVERRRRRR

You can pretty easily navigate the next couple screens, and then FINALLY

😀

AT LONG LAST

😀 😀

YOU WILL HAVE SAMPLE DATA

😀 😀 😀

And then you can be just as disappointed as I was when you find out the largest table (ORDERDETAILS) is just shy of 3000 rows.

But hey, we did it.

Now we can go to the gym.

Thanks for reading!

Brent says: I wanted to have him do a point-in-time restore, but as the day dragged on, I didn’t have the heart to tell him to finish that part. Maybe I should make them work with other databases so they appreciate just how good their job is. If one of our team members (myself included) ever comes running up to you, sobbing, and screams, “MY GOD I LOVE SQL SERVER,” then you’ll know it’s Builder Day at Brent Ozar Unlimited.

Erik says says: What Brent doesn’t know is that I did backup and restores, it was just really boring and easy with the HANA GUI, so the footage ended up on the cutting room floor.

She’s In GUIs

Builder Day: Doing a Point-in-Time Restore in Amazon RDS for SQL Server

Brent had an idea that we should set aside a day per month where we could build things in the cloud and blog about our experiences. I was tasked with Amazon RDS for SQL Server: create an instance, configure backups and test point-in-time restores.

After grabbing some coffee, I logged into the AWS console. We use AWS EC2 instances for our lab environment, so we already had much of the environment created, such as networking and security. With those out of the way, I figured my task would be easy and quick. I would soon learn that the motto of the day was HURRY UP AND WAIT.

Creating a New RDS Instance

In the RDS Dashboard, I clicked the option to launch a new database instance and selected SQL Server Standard Edition.

As one of our goals was to see what our clients go through, I selected the production option.

Next I was presented with a bunch of fields and dropdowns.

I selected my options in the dropdowns and filled out the fields.

I was presented with the advanced setting configuration page.

The only changes I made were to the VPC security group and the Windows Authentication directory.

It was now time to launch the instance.

It took about an hour before the instance was “available” and usable.

From a machine inside our AWS lab environment, I was able to connect to the RDS instance via SSMS.

RDS adds an rdsadmin database. You’ll notice that it’s in a “Principal, Synchronized” state. RDS uses Synchronous Database Mirroring if you kept the “Multi-AZ Deployment” option at the default. This option gives you a standby option in another Availability Zone which makes the databases highly available. One thing to note about this that could be significant for your application is that Synchronous Database Mirroring could cause dramatic performance degradation. Except for one system that was low volume, I’ve only used synchronous when the other server was in the same data center. When using the multi-az configuration, the two servers are in different Availability Zones, meaning different data centers not close to each other. Be sure to test if your application is okay with the performance degradation of Synchronous Database Mirroring.

I knew backups weren’t “normal” in RDS, but I wanted to see what kind of errors I’d get if I tried to create Ola’s database maintenance stored procedures. I was not disappointed.

No sysadmin for you.

You don’t get sysadmin in RDS. You get processadmin and setupadmin.

Creating a Database and Putting Data In It

I created a database and copied some data into it from an EC2 instance that we have in the lab. I was kind of surprised that moving data from EC2 into RDS was so easy. I was expecting errors using the Import/Export wizard.

After getting some data into the database, it was time to do a point-in-time restore.

Restoring an RDS Database to a Point in Time

I picked 11am since that was after the instance had been created and before I had imported the data. Note that you must select a new instance name. If your connection strings are referring to the RDS instance, then you’ll need to change them to the new instance name if you ever do a point-in-time restore. Or use a DNS alias.

It took about 45 minutes for the point-in-time restore to complete.

After sql2016rds2 was “available”, I tried connecting to it.

I am very familiar with that error, so I tried a few things before throwing in the towel. Brent noticed that the security group wasn’t correct and said I needed to change that. It wasn’t one of the options when I did the point-in-time restore, so apparently this is an extra step you have to do. I modified the instance to use the correct security group.

(Brent says: in fairness, Amazon’s point-in-time restore documentation does say you have to manually change the database’s security group after the restore, but, uh, none of us read that until the next day. So there you go. We don’t read the manual either, dear reader.)

After it was finished being modified, which took just a few minutes, I tried to connect again and received the same connection error as before.

I then did a comparison between the two instances and saw a discrepancy. sql2016rd2 was missing the Directory information in the “Security and Network” section. I modified the instance again and added our directory.

That change took almost 30 minutes. I was finally able to connect to the instance!

But the rdsadmin database was is in a restoring state, which prevented me from creating a database. Well shoot. I guess I had restored to a point that was too early.

Restoring, Take Two

I deleted sql2016rd2 and tried another restore, naming it sql2016rd2 again and this time selecting 11:45am for the restore time. The lab directory was already filled in, so I wondered what it looked like the first time I tried it. Unfortunately, I didn’t take a screenshot of it the first time around.

After the instance was created, I modified it to use our security group and then connected to the new instance. The two databases were in the proper state and ready for connections!

Summary

It has become quite clear to me that I am a tactile learner. I can’t just listen to someone present a topic, read a blog post or documentation. No matter how many times I’ve read about RDS, it just wasn’t sinking in. I’m no expert as a result of this experience, but at least I’ve touched RDS now, even if it was minimal work. Most of the time was spent waiting for things to become available. I now understand some of Richie’s frustrations with the cloud.

Brent says: In the cloud, your boss probably expects you to be able to do a point-in-time restore fairly quickly. Demos make it look like a simple wizard. Just try this kind of thing ahead of time so you can give realistic time estimates – I was totally surprised at the half hour it took for Amazon to modify a database’s security group, and we had nearly zero status messages along the way. During a business outage, that’d be a heck of a problem.


Builder Day: Doing a Point-in-Time Restore in Azure SQL DB

We’re trying something new at the company: Builder Day. We define a slightly-out-of-the-norm task, and then the team splits up and tackles the task on different cloud platforms, writing it up as we go. These posts aren’t going to make you an expert on the topic – they’re just meant to let you skim through technologies you might not otherwise get the chance to play with. The cloud changes fast, so some of this is just about seeing what providers are up to these days.

This week, our task is doing a point-in-time restore:

  1. Create an empty database
  2. Set up point-in-time backups
  3. Create a table with some stuff in it
  4. Make sure backups are happening
  5. Do a point-in-time restore

I’m up first with Azure SQL DB. Tomorrow, Tara will cover Amazon RDS SQL Server, and Erik’s on SAP HANA for Thursday.

1. Create an empty database in Azure SQL DB.

In the Azure portal, go into SQL databases, and click Create database. Since it’s my first database in this subscription, I’ll also have to create a server. (You might think pricing/sizing would be tied to the server, but it’s not – it’s actually tied to the database.)

In this screenshot, I’m in the midst of creating my new database, but I have to create the new server first. The Azure portal has this weird left-to-right navigation where modal dialog boxes appear on the right. Even though you can still edit stuff on the left, like the database, you have to finish creating the stuff on the right before you can move on:

Creating my server and database

After configuring my server, I’m asked how fast I want it to go:

Performance size defaults

Performance sizing isn’t the point of this post – we’re talking about point in time backups and restores – but the performance tiers can affect your backup/restore choices:

Azure SQL DB tiers as of mid-2017 – note backups at the bottom

I love how Azure SQL DB has the same backup retention across the upper tiers, kinda like how they make the same security options available across the board. They’re not trying to Enterprise-Edition you up into higher tiers by restricting basic features from Standard.

One drawback: if you want more than 35 days of backup history, you’re about to meet my friend from Spain, Mañuel Labor. You’ll need to export the data, restore it to another server, etc.

I’ll go with the defaults here of a Standard S2 with 50 DTU, 250GB data, since I’m not worried about performance.

After I click Create, the server and database takes a few minutes to deploy.

I’ve seen you on TV

While that happens, I’d like to point out what the wizard didn’t mention:

  • Backup schedules & frequencies
  • Backup destination & redundancy
  • Corruption checking schedules
  • Security & firewalls (more on that in a minute)

2. Set up point-in-time backups.

Well, dear reader, this step is going to be a bit of a letdown in Azure SQL DB.

After the deployment finishes, if you go into database properties, there’s nothing about backups:

She really shows you all she can

Even if you use the search for backup:

No results found

This is one of the things I love about Platform as a Service (PaaS): it’s a new religion. The worst piece of drudgery about database administration, the thing you hate the most – protecting them – is simply not your problem anymore. It’s Microsoft’s problem.

Even encryption is just taken care of by default. Click on Transparent Data Encryption, and:

You’ve built your refuge

TDE doesn’t even merit capital letters anymore. It’s just plain transparent data encryption.

3. Create a table with some stuff in it

Because it’s our Builder Day, and I feel like learning things, I’m going to:

After installing the extension, I can hit F1 and type in SQL, and get options:

Visual Studio Code – F1

I thought I wanted to Connect, but hold back the rain – that user experience is actually painful. Then I tried with Manage Connection Profiles, and things got – well, the best way is to show you:

Connecting to Azure SQL DB with Visual Studio Code

It turns out that to connect to SQL Server, you also have to install openssl, then .NET Core, then restart Visual Studio Code.

Next, you’ll get an error about not having the firewall open:

Because you’re lonely in your nightmare let me in

Unless you have the world’s largest monitor, you can’t see the entire alert, but the short story is that you have to go into the Azure Portal, hamburger menu, SQL Servers, click on your server, click Firewalls, and add your IP to allow remote access.

A brief word about security

“Wait a minute,” you might be saying to yourself, “can’t I VPN into Azure or something to access Azure SQL DB? Am I really about to open up access to the outside world?”

This is a known issue:

2,466 users are hungry like the wolf

Your request is important to us, please hold:

Don’t say a prayer for me now

For now, I’ll just open up my Azure SQL DB to my home IP address.

Back to creating a table with some data in it

I used to use a char(8000) field to stuff a lot of data in fast, but ever since I saw Erik using sys.messages to create a lot of junk fast, I’ve been a convert:

I’m just dumping a bunch of data into a new table then dropping it, over and over. This database isn’t going to grow after the first insert, but the change rate is going to mean that I should see backups happening on a pretty regular basis.

Here’s what it looks like in Visual Studio Code:

Adding data with Visual Studio Code

While it runs, I’ll check the server’s wait stats with sp_BlitzFirst @ExpertMode = 1:

That wait is pooooiiisoooon – wait, wrong band

And Eureka! We have a poison wait detected, which means our SQL Server is going to feel like it’s locked up or slowed down across the board. In on-premises boxes, common examples are RESOURCE_SEMAPHORE or THREADPOOL, but here it’s LOG_RATE_GOVERNOR. Azure is limiting our queries because we’re using too many resources.

(Frankly, that’s fair, right? What kind of moron has two thumbs and dumps sys.messages into a table over and over? This guy!)

When I go into the Azure portal to see what’s up, Microsoft politely points out that my DTU usage is rather high:

You make me feel alive, alive, alive

If I drill into details, I get a Query-Store-driven dashboard of my top queries, and you can guess which query it is – but I digress.

4. Make sure backups are happening.

I don’t get to see a list of backups in Azure SQL DB, but I’m a little paranoid. I wanna check via wait stats to see if SQL Server’s spent any time waiting on backups. Even with super-fast storage, backups usually pop up in wait stats.

To see, I’ll run sp_BlitzFirst @SinceStartup = 1, which works fine in Azure SQL DB and gives me wait stats since, uh, startup:

sp_BlitzFirst @SinceStartup = 1

The HADR_FABRIC_CALLBACK wait is undocumented (listed as “TBD”), but given that Always On waits typically start with HADR, it’s gotta have something to do with data protection. I’m not gonna call that one backup, though.

LOG_RATE_GOVERNOR is SQL Server giving us the hand, telling us to slow down our writes.

SOS_SCHEDULER_YIELD indicates CPU issues, so I’ll set that aside.

Coming in at #4, we finally have BACKUPIO. There have been 22,403 times where SQL Server has waited on backups, albeit only 10.8ms, which is pretty doggone quick. I’m just happy to see any backup waits at all, which means my data’s probably going to be protected. I’m a trusting guy.

5. Do a point-in-time restore.

My little load query has been running for about half an hour straight, so I’m going to click on Restore at the top of the window, and the fun starts:

Pick any date, like a birthday or a pretty view

Here’s where things start to get a little different: restores are a new database.

First, you don’t restore over the top of an existing database on the same server. If you want the same database name, you’ll need to drop the old database first, or restore to a different server.

Second, this also means that you’re paying for a new database – see the Pricing Tier option at the bottom? Your restore could be on a faster or slower tier. If you need to restore production databases into a dev/test server, this is really handy. (This can also be a bit of an eye-opener for folks who didn’t realize that there’s no such thing as a “free” Development Edition database in Azure SQL DB once you’re over your monthly free credits. That’s totally fair, too – cloud hardware still costs money.)

The restore point is also a little confusing: the monitoring graphs are in local time, the oldest restore time is in 24H UTC, and the restore point time entry form is in 12H UTC. Just type in the recovery point time you want – any point of seconds is available. You’re not restricted to 15-minute-increments or anything like that.

Next, the Azure portal shows a little popup with status:

Restoring

My database restore is off and running. (And when the restore finishes, that new database will have automatic backups too, all without me lifting a finger.)

After 9 minutes, my newly restored database was available for queries.

I’ve got my own way

How long will your restore take?

In the documentation, Microsoft writes:

How long will my database restore take?

That’s completely fair, but here’s the thing: with Azure SQL DB, you don’t know most of that data. You’re not going to know how many logs, the amount of replay, the amount of network bandwidth, or the number of concurrent restore requests.

You don’t get a percentage completion in the portal, and you can’t check DMV tools like sp_WhoIsActive because the database isn’t available yet.

What I learned (or remembered) today

The Azure Portal is still fingernails-on-chalkboard. The billing, subscriptions, and permissions setup is a complete pain in the butt, but – how often do you actually do that? Almost never. (Except today, when I had to add Erik and Tara into my Azure account, and I flipped a table.)

Creating a database still has some awkwardness around creating a server first. The “server” concept feels artificial because it’s not useful for much – there’s no three-part-name cross-database queries allowed, and it’s not like you get Agent jobs.

But once the server & database are up, backups are a hassle-free dream. Love it. Two slight exceptions:

  • Backup space is 2x your database size. If you have a high change rate (like my absurd demo query), you’ll need to buy more storage space or retain less backup history. See the FAQ for details.
  • If you want >35 days of backups, you’ll need to export the database contents, like to a BACPAC. If you want transactional consistency in that copy, you’ll need to copy the database first, then export the copy. Small price to pay though.

Restores are super-easy, but not necessarily fast or measurable. But hey, that’s the cool part about the cloud – the restore speed is somebody else’s problem. If your boss is pissed about how slow the restore is going, you just conference them in with your Microsoft support rep while you do other stuff. (Is that how the cloud works? I think that’s how the cloud works.)

And hey, Visual Studio Code is making progress! Not quite a bird of paradise yet, and I think I’ll stick with SSMS in a VM, but it’s coming along.


[Video] Office Hours 2017/06/14 (With Transcriptions)

SQL Server, Videos
0

This week, Brent, Erik, Tara, and Richie discuss setting up an archive process, CTEs vs temp tables, deadlocks, alternatives to replication, Erik and Brent’s pre-con for this year’s PASS Summit, and more.

Here’s the video on YouTube:

Office Hours Webcast - 2017/06/14

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Enjoy the Podcast?

Don’t miss an episode, subscribe via iTunes, Stitcher or RSS.
Leave us a review in iTunes

Office Hours Webcast – 2017/6/14

 

How should I archive data?

Brent Ozar: Alright, we’ll go ahead and get started. So James says, “do you know any good blog posts on setting up an archive process?” And if you don’t have good blog posts, I would say what are some of the lessons you’ve learned around setting up an archive process?

Erik Darling: Batching.

Brent Ozar: Go on…

Erik Darling: Batching is super important with the archive process, especially if you’re not using, sort of, table partitioning, where it’s easy to swap things in and out, or if you’re not using partition views where, like, you sort of have free reign to knock a table out if you don’t want the data in it anymore. If you don’t feel like dealing with it then you have to be really careful, kind of dealing with, sort of, one giant monolithic table, that you don’t lock the whole thing down trying to delete a few hundred thousand or million rows at a time. You kind of want to break that into smaller bites.

Brent Ozar: The thing that I didn’t ever understand until, I don’t know, I started doing the whole consulting thing was the whole lock escalation process.

Erik Darling: Yeah, it’s no fun.

Brent Ozar: I’m just like, I’m only deleting 15,000 rows, you know, what’s the big deal?

Erik Darling: Why do I need a table lock? Oh…

Brent Ozar: Yeah, and when there’s like five million rows in the table, 50 million rows in the table, what’s the big deal? Tara, what about you, what other lessons have you learned along the way with building an archive process?

Tara Kizer: I mean databases that I’ve supported have had to, maybe not archive the data, but purge the data based upon whatever the data retention policy was for customers. And, you know, these systems have existed since SQL Server 2000, and it was always just some kind of delete process, but batched, you know. Definitely batching, you know, while loop and delete top whatever. But, for me, if it were a new system, I’d be putting table partitioning or partition views in place right off the bat because I don ‘t want to have to put those in place later on when the database gets bigger, because it’s harder to put in place later. So start off with it, even if you don’t think your database is ever going to be big. I want to start avoiding the deletes and having to – most of the systems I have supported have been 24/7, you know. Maybe there’s a slower period, but it’s still – there’s users using the system and those deletes are just problematic; even in small batches. Or it would be because you do such a small batch, your job ends up taking hours because it’s only doing a little bit of work at a time, so it starts creeping into the busy times.

Brent Ozar: I would say, another thing I’d recommend with archiving is, if you can archive at certain days and times – say it’s Sunday at 10pm, if you can only archive at that one time, which is awesome, after you’re done archiving, put the database in read only mode. The other database where you’re archiving to, if you have the wonderful luxury of being able to archive to a different database, seal that database as read only, and that’s when you take your backup. You don’t have to backup that database again until your next round of archiving hits. Granted, that’s more technical data 0:02:43.2], It’s more work that you got to go take on, but when you start to grow to large sizes of databases, it’s totally awesome.

 

How is “Michaella” pronounced?

Brent Ozar: Ah man, Michaela, you got to send me an instruction on how to pronounce your name, because I’m going to see you all the time in here, you’re a faithful listener…

Tara Kizer: It’s usually Michaella.

Brent Ozar: That’s what I thought too…

Erik Darling: But then last week he said Michaella, something else, and she said you got it right, but we didn’t know which one it was for.

Brent Ozar: Like a Hawaiian kind of thing. Michae-la [crosstalk] there we go. See, the thing is Michaella, I’m not going to remember next week.

Tara Kizer: He can’t even say my name right, so you have no hope.

Brent Ozar: I literally had – when Tara started I literally put a Post-it note on my screen that said “tear a piece of paper”, because that’s the instruction that she gave me on how to pronounce her name.

Tara Kizer: That’s what people teased me for in school.

Brent Ozar: And it works. So Michae-la piece of paper, I might be able to remember that.

 

What’s better, CTEs or temp tables?

Brent Ozar: Michaelle says, “CTEs versus temp tables; I notice that on SQL 2016 Service Pack one that the CTE is using indexes from the table…” which might imply that maybe the temp table is not. So Erik, speaking of your T-SQL Level Up level two, I believe you talk about CTEs inside there.

Erik Darling: Oh I sure do. I think I talk a little bit about temp tables as well. So CTEs are okay for certain things. I think what you’re looking at is, is there a performance difference? Well the CTEs can use the base tables and they can use any indexes you have on the base tables. Whereas when you stick stuff into a temp table, you’re responsible for indexing that temp table. CTEs have always been able to use indexes because they’re accessing the base tables in the query. I think the main difference that I would put between them is that CTEs don’t materialize, whereas temp tables do. So if you, you know, run a CTE, you’re just essentially running a query and you can use the results of that query and, you know, anything after that. Whereas temp tables, you can stick data in there and then you can query them all day long.

Brent Ozar: And the other thing I would say is, if you know you’re going to reuse something 50 times, temp tables make so much sense there.

Erik Darling: Because there’s a weird quirk with CTEs where if you – the more times and, like you know, subsequent SELECTs that you join to a CTE, the more times that syntax has to execute. So if you, you know, you can just write a simple mockup of this. I think I actually have a blog post about it; I’ll find the link. Much easier than explaining it…

Brent Ozar: And it seems so obvious, when you look at the query I’m like, I only wrote the CTE once, shouldn’t you just execute it? It’s like some kind of sub-query, you just execute against once and then everything gets joined to it. Michaella – see, I got it right there Michaella, and the reason why is I left it on the screen where your answer was. Michaella says “yes, the CTE is called eight times”. Yeah, that’s one of those examples where it might make more sense to do a temp table.

Erik Darling: Hang on, I’ve got a link for coming up for you then Michaella. I’m going to say it as not Michelob. That’s how I’m going to remember it.

Brent Ozar: Oh my goodness.

Tara Kizer: But, you know, if the CTE is recursive, then if you switch that to a temp table, you’re having to update that temp table, right, in order to achieve the same recursiveness?

Brent Ozar: And then recursive CTEs have their other problems too, which is, don’t those ones go single threaded?

Erik Darling: Well the recursive part runs serially. Anything on the outside can go parallel; it’s funny.

 

What’s an alternative to replication?

Brent Ozar: James says, “what is…” Oh Richie’s here. James says, “what is an alternative solution besides replication?” He hates it.

Erik Darling: Buy a printer.

Tara Kizer: Do you have Enterprise Edition? That’s always my first question, because if you don’t, replication is the right solution, in my opinion, for reporting; if you’re asking about reporting.

Brent Ozar: Or log shipping…

Tara Kizer: I hate log shipping as a solution for reporting.

Brent Ozar: What?

Tara Kizer: Well because as far as – if you can put a delay in your restores, log shipping’s a good solution, but every single time that transaction log has to be restored, it kicks out the reports, and generally speaking you’ll have long running reports that could take a couple of hours to run; so you have to have a delay in your restores. And the systems that I’ve supported cannot have that long of a delay in the data, you know. Even just the latency that’s caused by say the archival process is problematic for the systems that I’ve supported. Or rebuilding indexes causes a lot of latency, so log shipping delay – it just depends on your company, of course. But if you have Enterprise Edition, I’d go for a different solution.

Brent Ozar: What would be that solution?

Tara Kizer: I’d go with availability groups with a readable secondaries.

Brent Ozar: it’s so much easier – it doesn’t matter if the database changes, it doesn’t matter if people change – if they add 50 tables tomorrow, if they drop 50 tables tomorrow, AGs are just going to work.

Tara Kizer: one of the drawbacks with AGs though is that in transactional replication, a lot of times your reporting solution would have different indexes because you just didn’t need those on the OLTP side, you needed more indexes on the reporting solution. That’s not possible in AGs; you’re stuck with putting all of your objects, tables, indexes, everything has to go on the writable side so that the readable side has it. It’s a big drawback and if you do a synchronous readable secondary, then that can cause performance issues to the writable side. So you may want to do asynchronous where there’s some latency.

Brant Ozar: You can have different amounts of data on both sides, for example. You cannot replicate deletes over to your secondary, so then that you can keep forever history over on your replication subscribers. Pretty slick stuff.

 

How do I resolve deadlocks?

Brent Ozar: Uday says, “do you guys have any recommendations or blog posts on how to approach resolving deadlocks?”

Tara Kizer: Only thing I can think of is anything Jonathan Kehayias writes, you know. He’s the deadlock expert, that I know about at least, you know. I’ve attended a deadlock presentation he did at PASS a few years ago; it’s good information. It’s a really complicated topic. If there’s anyone who has a blog post out there, it would probably be him. They’re not fun to troubleshoot – I know he thinks it’s fun, but most of us really want to stay away from that. I’ve had to do significant deadlock troubleshooting where just one stored procedure would take like three weeks to figure out because it was just – the stored procedure wasn’t necessarily complex, but understanding everything else that went into it, it was hard to figure out. On systems that I’ve supported where I know what the application can do, I’ve recommitted snapshot isolation level to help reduce deadlocks and blocking, but not all systems can support that.

Erik Darling: And the other thing with RCSI is that if you’re dealing with dueling write operations, then you don’t get much dice out of that. And enabling snapshot isolation to deal with, you know, dueling writes is complicated. [crosstalk] Yeah, you have to have all this crazy error handling and…

Brent Ozar: I have it as a bold point on a slide, you can use this for updates, any questions? Okay no, great, moving on. I just kind of cheat on deadlocks. So what I would do is I would run sp_BlitzIndex. With sp_BlitzIndex, it gives you this warning about aggressive indexes, meaning indexes that have had a lot of lock waits on them. This isn’t smoking gun proof that it’s tied into your deadlocks, it’s just that in the kinds of places where I see aggressive index warnings, those are usually the tables that are involved. And then I’ll just look at what are the right non-clustered indexes to add into this scenario and maybe do I have – one of two things is happening, either I have way too many indexes and updates are taking forever to lock all of them, or I don’t have any. Like there’s a clustered index on a table and every time I’m trying to do an update I’m locking the whole honking table.

I’d also say that this is a great time for monitoring software too. If you have a SQL Sentry performance advisor, or what do they call it, SQL Sentry One? Sentry One SQL Sentry for SQL Server Sentry… whatever. Idera SQL Diagnostic Manager or Quest Spotlight, they will all send you emails with the deadlock graph every time deadlocks occur. I’m not saying that makes it easy.

Tara Kizer: I had performance advisor at my last job and that three week stored procedure – it took three weeks because we were doing load testing, trying to simulate production load to cause the deadlocks to happen, because you can’t necessarily replicate this in your test environment, dev environment. You need a production load on it, so it took us a while, plus I was also working on other things. But I had the deadlock graph, I’m just like, okay. You know, so I knew what indexes were involved and I knew what processes are involved, I was like oh [crosstalk]…

Erik Darling: Then the other SSMS window…

Tara Kizer: Yeah, and this is – they use NOLOCK like it was the turbo button, you know, because this was implemented before I joined the company. But NOLOCK was everything and it still is deadlocking.

Erik Darling: You know, if you don’t have a monitoring tool, what – I would probably just fall on the sword and fire up extended events to catch deadlocks, because you get the whole deadlock graph…

Tara Kizer: It’s pretty low, you know, it doesn’t use a lot of resources when it’s just – you know, it fires after it happens. It’s not like you’re gathering execution plans, which is a major hit. [crosstalk]

Brent Ozar: Lou Fritz says, “Kendra Little has a great free session on deadlocks.” I totally forgot about this. So if you go to SQLworkbooks.com, this is Kendra’s new site for training courses, and she has a course that’s free right now. She may end up charging for it later, so if you want it, I would go sign up for it now, look at her trouble shooting blocking, locking and deadlocks for beginners.

Tara Kizer: I might even check that out, except I hate deadlocks; it’s a tossup. Do I really want to learn more? I do want to learn from Kendra, she’s a great presenter, you know.

 

What’s up with licensing?

Brent Ozar: When we first started carving up our video courses – our in person courses, Jeremiah, Kendra and I were carving up modules, like which ones of us were going to tackle which courses. And one of them was licensing, where like we have to have licensing in the senior DBA class and all of us made the same face that you did. All of us were like screw that, and I lost the rock paper scissors. And now I love licensing…

Tara Kizer: Oh good.

Brent Ozar: Well I had to spend like three weeks of my life researching licensing to write training decks. Now I’m like, it’s so cool.

Erik Darling: Learning to love licensing.

Tara Kizer: It’s funny about licensing, the Microsoft employees don’t even understand the licensing. They will tell you, you need to buy licensing for this server, and it ends up being completely false, you now, it’s just a salesperson trying to get you to buy more licenses; but they don’t understand it either. So you pull the legal documents and say, no right here it says I don’t have to.

Erik Darling: Yeah, they just hope you won’t do that.

Tara Kizer: Exactly.

Erik Darling: No, no you need licenses for that.

Richie Rump: I just always installed dev edition, and I’m good.

Brent Ozar: Says the guy who’s doing all of our production work…

Erik Darling: No one ever checks. They have so many customers, what are the odds? It’s like cigarettes, what are the odds? …

Brent Ozar: Oh god, this is why our company’s going to go down the drain, people, you’ll remember this moment.

Tara Kizer: You know what’s funny, two jobs ago I was assigned the licensing task and it was about a month before I left my job because…

Brent Ozar: It’s going to take me about 35 days.

Richie Rump: Now, you leaving has nothing to do with that licensing thing does it? Oh no, of course hot. Oh no…

Brent Ozar: Can you hand over the work that you’ve gotten done so far? Oh my dog ate it, oh dang, I had a hard drive crash…

Erik Darling: There was a printer jam and that was it.

Richie Rump: I lost my 3.5 floppies, it was here somewhere. I’m missing floppy 35 of 35 [crosstalk]

Tara Kizer: Installing Windows NT 351 on floppies, there was like 40 or something.

Brent Ozar: Yes, I remember signing up for a Windows beta, just so I could get 30 some free floppy disks. Like this is awesome, they’re just going to give them to me?

Erik Darling: Free storage.

 

Introducing our first-ever podcast sponsor

Brent Ozar: So we’ve come to a point in the podcast where – we’ve never really wanted to take ads on the podcast, but we’re going to go and start something new. We always wanted to keep it purely educational, but Microsoft came to us and they said, hey, you know what? There’s some parts of SQL Server that just don’t get the attention that they deserve, and we want to start publicizing these parts of SQL Server so that people can get more mileage out of them. So with that in mind, I’d like to introduce our first podcast sponsor.

Now we’re not perfect. All of us have our little individual quirks, we don’t highlight the right things in the script when we go to execute. We write things we didn’t really mean to intend. Sometimes we import the wrong data or sometimes we just code a really crappy stored procedure that has the wrong impacts that we don’t want. You don’t want to redefine all your table from scratch.

Tat’s why our podcast today is brought to you by the DELETE command. The DELETE command doesn’t judge. Whether it’s a single row or thousands, DELETE is there for you in your darkest hours to hide all of your mistakes. DELETE is there when you need to hide the bodies.

So remember, when you’re in a dark place, there’s always the DELETE command. So thanks very much to the DELETE command for being the first sponsor on our webcast.

Erik Darling: A subsidiary of Truncate co…

Brent Ozar: A subsidiary of Truncate Incorporated…

Richie Rump: Yeah don’t use those crappy WHERE clauses, they tend to foul things up, so…

Brent Ozar: Well it’s a performance tuning technique; the WHERE just slows you down. WHERE would require you to use an index.

Erik Darling: It’s complicated, there’s logic, you have to think about them. Who wants to do that? I don’t have time for that.

 

How do I preserve row order during ETL?

Brent Ozar: So Ben says, “when we’re using SSIS and we’re importing from a flat file, can the  identity column in the destination table ensure preservation of original order of rows in the source file?” Oh you’re getting tricky. Am I the only one who’s done this? Judging by thee looks…

Tara Kizer: I’ve done it.

Erik Darling: With SSIS, yeah, maybe. I haven’t done that with SSIS.

Brent Ozar: So the way that the table is sorted, even if you have an identity column, that doesn’t mean that table’s sorted in that order. Even if the clustered index is on it, the order that the rows, paired with the clustered index, can matter. So you want to have an ORDER BY basically on whatever you’re pulling out of the source file; you can sort the stuff in SSIS to determine the direction of what’s coming out. I just would never rely on it, because what happens, if you start loading duplicate rows, you’re going to over – if you tune the SSIS package, you’re going to update existing rows, so they won’t be in the same order in the table.

Tara Kizer: If I’m moving data via SSIS and I want to preserve the identity column value, which is a little bit different than what Ben’s asking, then I’m using the identity insert on option, so that I’m copying over exactly the value. That’s a sysadmin function though, so you have to make sure that, you know, the job, the process, has sysadmin. But I don’t want different values on two systems for the same row. So I’m preserving the identity value with that function.

Richie Rump: Okay, so why do we care about order with an identity column, right? So identity is a surrogate key; as a surrogate key, it’s meaningless. It should never have any sort of meaning. So, when we put some sort of order on it, now we’re inferring some sort of meaning onto that. Just dump it. Have a separate column and put an order there, if you really need it., but it shouldn’t be on the identity column, at all.

Brent Ozar: He adds, “there’s no identity in the source file, but the problem is that the order of the rows in the source table is critical, it’s just not specified in the source file.” Yeah, with like SSIS, you can add your own fields as part of the import, and that’s what you would add a row number, as part of that.

Richie Rump: Exactly.

Brent Ozar: Don’t rely on the database’s side, because you’re going to import the file over and over again, and you want that row number every time.

Tara Kizer: So the order of the data in the file is sorted, from what I’m gathering. So it’s in the correct order, and when you are inserting from a file to SQL Server with an identity column, it should preserve the order, because it’s going to be inserting row by row by row over.

Brent Ozar: Yeah, I just worry about somebody doing something like resetting an identity field, you reseed it and go in the negative direction, you know, anything like that. He says, “ah derived field, you rock”. Yeah, every now and then we get one right.

Erik Darling: Can that be parallelized? Because I wonder if that might screw up order as well.

Richie Rump: It will screw up order if it is parallelized. If you want to do a bulk upload on that, it will screw it up. So yes, you want to do that before you do all your, sort of, your imports.

 

What’s up with the PASS Summit this year?

Brent Ozar: Nice. Coleen says she “went to Kendra’s PASS 2016 session on deadlocking” and said, “it rocked.” Yeah, Kendra’s a great teacher, really fun to learn from.

Richie Rump: I was there and I was completely lost, but I’m not [crosstalk]…

Tara Kizer: Yeah, wasn’t that the session that we all showed up and she said you guys don’t belong here, this is for beginners.

Erik Darling: Yeah, that was the one.

Tara Kizer: I still learned some stuff.

Brent Ozar: I think that’s true with every session at Summit. I don’t think I’ve ever gone somewhere and not learned something, you know. There’s always some like tidbit where you’re like, whoa, I never thought of that. So speaking of which, Erik and I have pre-con this year. So Erik, what’s our pre-con about this year at PASS Summit?

Erik Darling: Perf [crosstalk] perf tuning, expert perf tuning. In the year 2017, so we’re not going to sit there and talk about profiler with you. We’re not going to…

Brent Ozar: Oh man, I’d written a session already.

Erik Darling: You’re fired. Fired from the pre-con.

Brent Ozar: Damn it.

Erik Darling: You can still get…

Brent Ozar: So yeah, it’s about 2016 and 2017, so what are you talking about during the pre-con?

Erik Darling: Well, I’m going to talk about that new stored procedure that’s up on the screen – oh, that was up on the screen. Now I look dumb, now I look quite foolish. So I wrote I stored procedure recently, another one. This one is called sp_BlitzQueryStore, and this one does for the query store what sp_BlitzCache does for your plan cache. It does have a little bit of a different method to it. It goes in and kind of looks at your worst periods of time for certain metrics and then collects anything that ran during those time periods. So it’s a little bit different, you’re not just giving it one sort order and saying give me the whole thing by this. But I’m going to be talking about how to use that to sort of delve into query store the way you delve into the plan cache with BlitzCache. So that will be fun.

Brent Ozar: Yeah, I’m going to start off with looking at wait types. I’m like, screw perfmon count, or screw this queue length, screw page life expectancy. All those things, they’re not really relevant anymore. What makes so much more sense is just asking SQL Server, hey, what’s your bottleneck? And then focusing on that one thing, because none of us can do a perfect job of systems administration. You’re going to have a slow tempdb, you’re not going to have enough RAM, you’re not going to have perfect indexes. Just focus on the one thing that the business users are going to notice as quickly as possible. So we’re going to talk about how to analyze wait stats with sp_BlitzFirst, how to look them at short periods of time or since the server startup, and then a quick decoder ring as to which wait types mean different things.

 

Are you coming to the mid-Atlantic?

Brent Ozar: Ben says, “when are any of you in the Mid Atlantic?” I don’t think ever. Mid Atlantic is like North Carolina, right? [crosstalk]

Erik Darling: I can’t row that far either.

Brent Ozar: Yeah, I don’t think we have anything planned for that anytime soon. I’m in Boston at the end of this month, but that’s as close as I get. Richie…

Erik Darling: I think south, you know, if I went any further south, I’d be out of Brooklyn, so I don’t want to do that.

Brent Ozar: And yeah, if Richie went any further north, he’d be out of Miami Dade, so yeah, neither of those work.

Richie Rump: I’d be out of Northern Cuba…

Erik Darling: There be snakes.

Brent Ozar: Coleen says, “one more and I’ll shut up. I went to your session too, Brent, and waited for a selfie, but the line was too long.”

Tara Kizer: That’s crazy.

Erik Darling: There was no line for me. Tara had a longer line than me.

Tara Kizer: I was surprised when people would come up to me while we were walking and Brent would be right there and they were like, hey Tara, I recognize you from [inaudible] Why are you recognizing me, here’s Brent.

Brent Ozar: Yes, it will be even funnier as more of those cartoons get out; it’s fun. Tara has a new, like, It Rocks cartoon, and it’s pretty cool. Alright, well thanks everybody for hanging out with us this week at Office Hours, and we will see you guys next week. Adios everybody.


Query Plans: Trivial Optimization vs Simple Parameterization

Facemaking

You know when you think you know something, and the obviousness of it makes you all the more confident that you know it?

That’s usually the first sign that there’s a giant gotcha waiting for you.

And that’s what happened to me over the weekend.

The setup

I answered a question.

And after I answered it, I got to thinking… Would partitioning help? After all, smart people agree. Partitioning is ColumnStore’s friend.

So I started looking at it a little bit more differenter.

First, I had to set up a partitioning function and scheme. I used dynamic SQL to create the function because no way in Hades am I typing numbers from 1-999.

Next, I loaded in a billion rows.

Yes, a billion.

And you know, it turns out generating a billion rows at once is terribly slow. So I wrote an awful loop to insert 1 million rows 1000 times.

Then, you know, I needed some indexes.

Lemme be straight with you here: don’t create the indexes first. After about the 4th loop, things grind to a halt.

This is why other smart people will tell you the fastest way to load data is into a HEAP.

Lemme be straight with you about something else: I couldn’t figure out how to just create the ColumnStore index on the partitioned table. I had to create a regular clustered index, and then the ColumnStore index over it with DROP_EXISTING.

So there I had it, my beautiful, billion-row table.

Partitioned, ColumnStore-d.

What could go wrong?

Plantastic

Let’s look at one query with a few variations.

The plan for it is alright. It’s fairly straightforward and the query finishes in about 170ms.

We can see from the graphical execution plan that it’s been Simple Parameterized. SQL Server does this to make plan caching more efficient.

GOLD STARS FOR EVERYONE

With parameters in place rather than literals, the plan is accessible to more queries using the same predicate logic on the Id column.

We can also see the plan is Trivial. If you’re following along, hit F4 while the SELECT operator is highlighted in the query plan , and a Properties pane should open up magically on the right side of the screen.

Pallies

Well, I can change that! I know a trick! I learned it from Paul White!

If I add 1 = 1 somewhere in my WHERE clause, I can get around Simple Parameterization and the Trivial plan.

Right?

Heh heh heh, that’ll show you.

Stringy

Oh but, no. The plan is still trivial, and it still runs in about 172ms.

D.R.A.T.

So how do you beat the trivial plan?

You use yet another trick from Paul White.

Dear Paul,

Thanks.

— The rest of us

If you stick a goofy subquery in, the optimizer will regard it with contempt and suspicion, and give it a full pat down.

Here’s the plan for this query.

Still not Simple!

And, amazingly, it gets FULL optimization.

Full of what, exactly?

Why is this better?

Astute observers may have picked up that the plan changed a little bit. Earlier plans that got Trivial optimization used a Stream Aggregate operator, where the Full optimization plan uses a Hash Match Aggregate.

What’s the deal with that?

Aggregatin’

To make matters more interesting, the Full optimization query finishes in 11ms.

That’s down from 172ms for the Trivial plan.

What else was different?

There’s a really important difference in the two plans.

The Trivial plan was run in RowStore execution mode. This is the enemy of ColumnStore indexes.

It’s essentially using them the old fashioned way.

Voices carry

The plan that gets Full optimization runs in Batch execution mode, and this makes everyone very happy.

CAUGHT

This is why sp_BlitzCache will warn you about both Trivial plans, and ColumnStore indexes being executed in RowStore mode.

Because that’s why.

Thanks for reading!


New #SQLPASS Summit Pre-Con: Expert Performance Tuning for SQL Server 2016 & 2017

#SQLPass, Company News
0

Going to the PASS Summit in Seattle this year? Join me & Erik Darling on Tuesday, the day before the conference, at our pre-con session, Expert Performance Tuning for SQL Server 2016 & 2017.

Your job is making SQL Server go faster, but you haven’t been to a performance tuning class since 2016 came out. You’ve heard things have gotten way better with 2016 and 2017, but you haven’t had the chance to dig into the new plan cache tools, DMVs, adaptive joins, and wait stats updates.

In one fun-filled day, Brent Ozar and Erik Darling will future-proof your tuning skills. You’ll learn our most in-depth techniques to tune SQL Server leveraging DMVs, query plans, sp_BlitzCache, and sp_BlitzFirst. You’ll find your server’s bottleneck, identify the right queries to tune, and understand why they’re killing your server. If you bring a laptop with SQL Server 2016 and 120GB free space, you can follow along with us in the Stack Overflow database, too.

Along the way, we’ll give out great prizes for the best questions and run an end-of-day competition to win the best stuff. You’ll go back to the office with free scripts, great ideas, and even a plan to convince the business to upgrade to SQL Server 2016 or 2017 ASAP.

Can’t upgrade to 2016? We’ll even show you memory grant and compilation tracking tricks that work in newer service packs for 2012 and 2014.

This is not an introductory class: you should have 2-3 years of experience with SQL Server, reading execution plans, and working on making your queries go faster.

Attendees will even get a one-year Enterprise (Everything) Bundle – with that, the $495 pre-con pays for itself!

PASS Summit 2017

Head over to PASS.org and register now. During registration, pick our Tuesday pre-con. See you in Seattle!

Update 2017/09/22 – over 300 seats are sold, so if you still wanna join in, move quick before it sells out.

Update 2017/9/25 – we’ve added stretch goals! If we hit 325, 350, 375, or 400 attendees, you’ll win even more training.

Update 2017/10/06 – the pre-con is now sold out! If you couldn’t make it to Seattle or didn’t get your tickets in time, we’re teaching it online, too.


New Class: Data Science Fundamentals with R

Now that SQL Server ships with R, and you can use R code in T-SQL, your managers might be asking, “Hey, can you start doing some analysis and visualization?”

Good news! We’ve got a new class for that.

Come learn the fundamentals of analysis & machine learning in R, and how to operationalize your work. In this live two-day course, we’ll take you from no data science knowledge to understanding the fundamentals and being able to put them into production using SQL Server.

Focusing on using R, you’ll learn how to important exploratory data analysis, build common predictive models, and embed them with best practices.

You won’t be a PhD in Machine Learning by the end of this, but you’ll be comfortable coding in R and understand the basics of data science.

Steph’s Data Science Fundamentals with R

Let’s get together for a live 2-day online class to cover:

  • Basics of R – Understand how R fits into the data science world, get comfortable with the coding environment, and perform basic manipulations of key R objects.
  • Data Manipulation in R – Learn how to read data in from different sources, work with columns and rows, join tables, and more. This is the biggest task in any data science project so we need to do this well.
  • Data Visualization in R – Get practiced with exploring and presenting data in static and interactive graphics to help you and others gain understanding.
  • Machine Learning Fundamentals – Understand the most common types of models, what they’re used for, and how to code them in R. Work through the data science process, from data preparation to sampling to building models to evaluating them.
  • Working with R from SQL Server – Understand how you can work with R in SQL Server and what architectural and operational considerations you should think about. Embed models into SQL Server and learn how to use these models for batch and real-time predictions.

I’ll be attending myself – Lord knows with my math skills, I need it – and I’m excited to announce that our medical school teacher for this course is Steph Locke (@SteffLocke). If you’ve been around the SQL Server data science community at all, you’ve seen Steph around community events. She’s an MVP with a decade of BI and data science experience, and she’s got a ton of blog resources out there too.

Wanna watch a free presentation from Steph? Register for GroupBy.org’s September event, where she’s giving Statistics 101.

The class is October 10-11 online. Read more about it, and use coupon code JoinOnline to save $1,000 if you register in the month of June.

See you there!


New Class: Always On Availability Groups: The Senior DBA’s Field Guide

Availability Groups are all the rage right now, especially since they’re included with SQL Server 2016 Standard Edition. Our Availability Groups blog post category is one of the most popular on the site, and in my 4-day Senior DBA class, people have always been asking for more in-depth coverage of clustering and AGs.

Let’s get together for a live 3-day online class to cover:

Edwin’s Senior DBA Field Guide to AGs
  • Fundamentals of Windows Server Failover Clustering
  • Designing & implementing a cluster for both HA and DR
  • Designing & implementing AGs
  • Leveraging readable secondaries, distributed AGs, and basic AGs
  • Managing & monitoring AGs
  • Troubleshooting when things go wrong

To do it, we’ve partnered with Edwin Sarmiento, a Microsoft Certified Master and MVP out of Canada who I’ve known for years. He’s been cranking out great HA/DR training material – I’ve heard great stuff from readers who have been through his classes.

The class is September 5-7 online. Read more about it, and use coupon code JoinOnline to save $1,000 if you register in the month of June.

See you there!


Ola Hallengren’s Maintenance Scripts are Now on Github.

For some of you, the headline is self-explanatory, and you just want the URL, so here it is: https://github.com/olahallengren/sql-server-maintenance-solution

For the rest of you, here’s what it means.

What are Ola Hallengren’s maintenance scripts?

Ola Hallengren’s maintenance scripts are a widely used replacement for maintenance plans. His backup, integrity check, and index optimization stored procedures are powerful and flexible. The whole thing is open sourced with the MIT License, which means you can use them at work or even bundle them with paid products. (For example, kCura ships a modified version with Relativity.)

Ola’s site has phenomenal documentation, but if you prefer video format, or if you just wanna see & hear Ola himself, here’s his session from the 2014 PASS Summit.

I highly recommend his scripts for backups and index maintenance.

Why do I care that they’re on Github?

If you use Ola’s scripts by downloading them from Ola.Hallengren.com, configure them once, and just rely on ’em, then you don’t care. You can keep using them exactly the same way – and for most people, that’s the right answer. Ola’s download process is a piece of cake and his documentation is great.

But 3 kinds of people care:

  1. People who modify the scripts for their own use, but want to sync their versions with Ola’s latest changes
  2. People who want to give their own code to Ola, but want to make it as easy as possible for him to see the differences, and decide whether to put their code in his main branch
  3. People who deploy servers via automation scripts, and want to just fetch Ola’s latest version from Github automatically during deployment

If you’re not in one of those 3 categories, ignore this blog post. You simply don’t need to do anything at all. I can’t emphasize this enough – the stuff I’m about to describe isn’t intuitive or straightforward for most database professionals. Just go on about your day.

But in the Faux PaaS project, we happen to fall in all 3 categories – more about that soon – so Scott Ellis and I asked Ola if he’d be willing to host his scripts in Github. (The way they’re licensed, we could have hosted them ourselves, but I believe Ola should still be the Commander in Chief for these. He’s got the best vision for what makes them successful.)

How do I get started with Github?

Ola’s Github repo

Watch means you’re going to get an email every time he makes a change.

Star is like bookmarking it as one of your favorite Github repos. For example, here’s my starred repos.

Fork means create a copy of Ola’s scripts in your own Github account. It’s kinda like clicking File, Save As, and putting the project in your own home directory. For example, I’ve forked Ola’s repo, so it shows up in my own Github profile, but I’m only linking to it so I can explain how this will look on your own account.

How do I make changes to Ola’s scripts?

The easy way is to simply contact Ola. That works just as well as it ever has – he’s really responsive for a guy who must get a gazillion emails, especially all the thank-you emails that you folks are sending him because you rely on his work every day to save your job. (You DO send him thank-you emails, right? Right?)

Right now, Ola’s using Github as a mirror for his download page. It’s not his main development workflow. However, if you, dear reader, are an excellent open source citizen and you do a great job of submitting clean, easy-to-test pull requests, we might be able to make his life easier with Github.

If you haven’t used Github at first, seriously, stop here. It’s not easy or intuitive. I’m going to explain the big picture only to talk about why it’s tougher than emailing Ola directly.

Github is confusing at first. On Ola’s repo, there’s a “Clone or download” button, and if you install the Github app on your machine, it’ll open and let you edit Ola’s scripts. However, when you try to do submit your changes to Ola, you’ll get permission-denied errors.

In summary, you have to:

  1. Fork Ola’s repo into your own account (so you’re working independently)
  2. Create a branch for the specific change you want to give back
  3. Check your code into your branch in your repo
  4. Submit a pull request back to Ola explaining what you changed, and why

This stuff is way, way harder than it looks. Here’s learning resources to get started:

After you step through this grind – and I’m not gonna lie, it took me months to get vaguely comfortable with Github – you might have a little bit of a grumpy attitude. (I certainly did.)

As a result, when I checked in changes to somebody else’s stuff, my pull requests had an attitude. I had put in so much work that I felt like I was doing the code author a favor. “Look,” I said, “I’ve gone through all this work to give you my code – the least you can do is just click Merge to accept my changes.”

Don’t be that person.

If already you’re a Github pro, great – but then again, you probably stopped at the first line of the post. If you’re new to Github, I just wanted to explain why the rest of us are excited. You don’t need to use Github to be a good DBA. But if you do start down the Github path, keep an upbeat, thankful attitude because the code authors on the other side are doing this in their spare time for the love of the community.

And take a moment to thank Ola.

He’s at ola@hallengren.com, and he’s been making DBAs’ lives easier for almost ten years. Now, he’s taking things to the next level again, and that’s awesome.

And if you run into Ola at a conference, Scott tells me Ola’s a fan of Johnnie Walker Blue Label, so buy him some.


[Video] Office Hours 2017/06/07 (With Transcriptions)

This week, Brent, Erik, and Richie discuss running SQL Server 2017 CTP on Linux, receive side scaling on servers and VMware, query tuning, nested roles vs. assigning roles to users, hierarchical data type, apps and scripts for automating restores, and learning Azure and AWS.

The audio’s kinda rough this week because Brent & Erik were in the same room, using a webcam microphone instead of our usual high-quality headset-type microphones. We feel sorry for our transcriptionist on this one, and we didn’t even try to make edits on it because reading it makes us chuckle as-is.

Here’s the video on YouTube:

Office Hours Webcast - 2017/06/07

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Enjoy the Podcast?

Don’t miss an episode, subscribe via iTunes, Stitcher or RSS.
Leave us a review in iTunes

Office Hours Webcast – 2017-06-10

 

Have you worked with SQL Server on Linux?

Brent Ozar: Michael says, “have either of you installed or configured SQL Server 2017 CTP, like the community previews of SQL Server 2017; any initial thoughts or impressions?

Erik Darling: There’s a follow up down there, it says [crosstalk]

Brent Ozar: Oh have you done it on Linux, oh god. Oh that changes everything, I liked your original question better.

Erik Darling: I don’t I could hang out with this Linux…

Brent Ozar: So why haven’t you?

Erik Darling: I’m just not a Linux guy. I feel like I would not learn anything that I want to learn by installing SQL Server on Linux. I’ve used Linux before, I’ve installed it. I’ve tried to install Arch Linux once, it didn’t go well. I’m not a fan of just all the weird command line stuff you have to do on Linux. It doesn’t do it for me – not want to spend time fixing that.

Brent Ozar: Richie, how about you? Out of your array of laptops, do you got any VMs that are running Linux or any ones running Linux [inaudible 0:00:57.5]?

Richie Rump: No, no. I mean I’ve got plenty of other stuff in the cloud that are running stuff, but everything that I run, I can run on Windows. And it’s not really about the OS at that point, it’s about the language and the application, for me. And the question, for me, with SQL Server on Linux is, why? Right, is there a reason for you running Linux in SQL Server, this brand new product, essentially, from the ground up? And if the answer is, because that’s the only knowledge we have is Linux, I’d be like, well why not Postgres? Right, why not MySQL, why not MariaDB, right? [crosstalk] it’s alright.

 

Should I use receive side scaling in VMware?

Brent Ozar: Let’s see here. So next up we’ll take – Gregg says, “what’s the recommendation on enabling receive side scaling on both servers and VM network adapters? This is VMware, I’ve got SQL Server 2008 R2.” If you’re playing around with receive side scaling, make sure you Google for receive side scaling and SQL Server and VMware, because there was a huge bug that performance just died with SQL Server, which I should have remembered on that client the other day. But I don’t know what the symptoms are or anything like that, I just remember reading a blog post and going ooh wee. I’ve never had a problem that I went back and went, this, this is the answer here, you know, receive side scaling. But I would also argue that most of the problems that I run into are either CPU and memory, that they surface bigger that way and it’s easier to play around within – I have no problem with receive side scaling, I just haven’t used it to solve a problem yet.

 

Why do Excel queries use more resources than SSMS ones?

Brent Ozar: Let’s see, Michaela asks – I’m just going to pronounce your name different every single time, so this time I’ll say Michae-la. Michaela asks, “why do Excel queries seem to use more CPU and memory than the same query in Management Studio?” That’s a great question, which one of you wants to answer it? No, come on… There’s the same post that we – Michaela says, “got my name right”. So it is maybe Michae-la, maybe I got that right.

So my thing would be probably parameter sniffing, so there’s this thing called slow in the app, fast in SSMS. And you know what’s funny, somebody said the other week, why don’t you guys show the blog post up when you’re talking about it? You know, like it made a huge difference with them with Office Hours. And you know what, that’s exactly what I’ll do. So if you search for slow in the app, fast in SSMS; if you want, for bonus points, if you want to spell Earland Sommarskog’s name out, you totally don’t have to. But this thing is an epic monster post, it’s huge, it has a table of contents and it lists out the different things that can cause your query to be slow in one app and then fast in another, and vice versa.

So that’s the place I would start is looking at the two different execution plans on them and see if you can get them to be the same just by blowing the plan cache, using the trick techniques that he shoes inside there. But otherwise it shouldn’t make a difference – no, that’s not true. Okay, so there’s one other scenario where it could make a difference. Let’s say that your Excel user is in, like, Tijuana and you SQL Server is in Alaska, whatever the network pipe is across there, if you’re facing ASYNC network I/O waits, meaning it’s taking a long time to push the data across the…

Erik Darling: It’s asking pretty specifically, why does Excel use more CPU and memory…

Brent Ozar: That’s true. [crosstalk]

Erik Darling: It seems like they’re hitting some sort of resource there. I’d be curious about what they’re actually pulling, because…

Brent Ozar: How many rows…

Erik Darling: Just because I, for a very short time, gave project managers access to SQL Server via Access, because they were incapable of getting it any other way. So I would just set up Access as a data provider to SQL Server and they would be off to their tracks, you know, looking at their various supports and stuff. And writing access queries is awful, so even if you’re hooking it into SQL Server, it’s not fun. So at this point I would just want to know what’s going on with it, why Excel’s the choice for expressing the data.

Brent Ozar: And then if you want to share, if you can share, the execution plan to see what the differences are between the SSMS query and the Excel query, there’s this tool that Richie Rump built, Paste The Plan. So if you go to PasteThePlan.com; Richie, tell them what they’ve won…

Richie Rump: You’ve won a website that will actually format the crap for you. Actually that’s another website, oh my gosh, you just got me all mixed up with the different websites. No, this will actually – you paste in your XML and it will then give you a link, and then with that link you can then share that with other people. I went Statistics Parser on you, oh my gosh.

Brent Ozar: They’re totally different. So it shows the plan – this one’s not a good example because it’s a WHERE condition. [crosstalk]

Richie Rump: Actually, we’ll have a release here pretty soon that will remove that and will format all that cursor-y stuff. It will actually show you a plan for all that.

Brent Ozar: Don’t you go breaking my heart. That sounds amazing; I want this. Let’s see here – oh, and Michaela asks – Michae-la says, “report people love Excel and charts.”

Erik Darling: Ah, there you go.

Brent Ozar: I’m with them on that.

Erik Darling: You use a graph database.

Brent Ozar: Whoa, a graph database, that’s good.

 

I have a normalization argument with a developer…

Brent Ozar: Wes says, “I’m writing a back end database for a coworker, moving things from Access to SQL Server. I have it in perfect Boyce-Codd normal form and he wants to make changes to break data integrity in normalization. What’s the best way to pushback?” Richie, that’s got you written all over it.

Richie Rump: You could roll up tables for that, I mean, if you’re against it. I’m not one of those guys who says that everything needs to be in perfect third normal form, or fourth or fifth normal form. I’m one you go to third and if your performance or your needs need to back off and de-normalize, then go ahead and de-normalize it. It depends on the scenario that you need. And the normal forms are kind of primarily to save disk space, because disk space was extremely expensive back in the day. We don’t really have that problem now. So if you need to have some sort of roll up tables where or reporting tables that the data is duplicated or you actually, you know, break out into second normal form on some of your tables; I think that’s perfectly fine. You just need to take a look and say, okay, what’s our options between to fix this problem in the application, and if it is, means we de-normalize for speed, then you de-normalize. I mean, that’s just the way I look at it.

Brent Ozar: He might get excited by views too, if there’s – if he wants to pull data out more easily, you can write him some views to do easier queries.

Erik Darling: Lewis Davidson’s book.

Brent Ozar: [crosstalk] We’ll pull that up. Good, keep talking…

Richie Rump: Yeah, the cool thing about Lewis Davidson’s book is that when I was initially reading it, I didn’t know Lewis at all. I had never met him, and I had a question about the book, and so I Tweeted him the question about the book as I was reading it, and we started having a conversation about whatever question I had in real time, over Twitter, while I’m reading the book.

Brent Ozar: And, I got [inaudible 0:08:23.4]. most authors are probably excited to do that too because it’s like, oh my god, someone’s actually reading my book.

Erik Darling: We’ll pay you a fifth [inaudible 0:08:33.0]…

Brent Ozar: Yeah, and hopefully you didn’t tell him you downloaded it off of BitTorrent either. I mean, that’s always heartbreaking for authors.

Richie Rump: Yeah, this guy, Brent Ozar, send me a PDF, I don’t know where I got it from, it’s crazy.

Brent Ozar: Ys, he’s got a few of those in there. The most recent one, the very one on the top of the list is the newest one, you don’t need the older versions.

Erik Darling: I brought the newer version. It’s an interesting read; there’s some good stuff in there about storing names that even I thought was interesting, as someone who used to store a crap-ton of names, never thought of. When I read it, I thought oh it’s so good. But I’m not going to tell you what it is because that would ruin it. No spoilers from me.

Brent Ozar: If you’re the kind of person who likes saying the name Boyce-Codd normal forms, stuff like that, you’ll probably enjoy this book.

Erik Darling: Absolutely.

Richie Rump: There was another book that I used to go to all the time, it was the Data Model Design Patterns, and there was like three volumes of this thing. The cool thing about that book is it had different verticals. So if you were in healthcare, it gave you a bunch of models for that, if you were in certain order taking type stuff, it gave you certain patterns for that. That was pretty awesome; I’ll see if I can find the title of it and I’ll put it into the cat window.

Brent Ozar: There’s a few in there. Oh, there’s a volume three; data model resource book?

Richie Rump: I’ve got to look it up. You know, it’s usually on the top of my head, but because I’m on the spot and the book is in the library, I’ve got to – it’s not here, it’s over there somewhere, behind my wall.

Brent Ozar: behind my wall… Behind door number one…

 

Can I change passwords in a transaction on multiple servers?

Brent Ozar: J.H. says, “Will creating a stored proc with begin and end, will that rollback a user changing their own password if it succeeds…” Oh my goodness, so you’re trying to do a transaction across multiple servers through linked servers, probably, in one stored proc changing passwords. I can’t imagine that’s going to get honored.

Erik Darling: No, just begin and end [inaudible 0:10:29.4] I think you would have to do some sort of explicit BEGIN TRAN and…

Brent Ozar: Yeah, or even beyond that, what I would do is I’d like create a table of, here’s all the servers I’ve got to change, and not shucks I’m going to need their passwords, yeah it doesn’t work. You know, have you heard of Windows logins?

Erik Darling: Yeah, that would solve a much bigger problem for you.

Brent Ozar: Yeah, or just do all the changes on just one server and take the logins from there and replicate it out to others.

Erik Darling: You know what I would do, I would use a registered server query for that, because that way you can put all the servers that you need to change the password on into a group [inaudible 0:11:06.1] and do all your password changes from one place.

Brent Ozar: I like that.

Erik Darling: I mean, it doesn’t work for a stored procedure, but…

Brent Ozar: Yeah, but if it’s still like changing and enlarging. Yes, I like that a lot. So if you search for either registered servers or a central management server, just lets you execute queries across lots of servers. They may fail on one, it may still fail on one, one times out, whatever, but then at least you’ll know which one it was and you can go retry it.

Erik Darling: Would agent jobs be a good fit for that? Like if you had an agent job that [inaudible 0:11:33.3] to other servers?

Brent Ozar: The problem I worry about is, if someone’s changing their password, this has got to be like a parameter for the stored proc, and if you’re going to retry it, you’re going to write it to a database somewhere and like…

Erik Darling: Yeah, because you’re executing a stored procedure with someone, then that’s in the plan cache maybe and…

Brent Ozar: Cool… [crosstalk]

Erik Darling: Weird stuff could go on with that…

Brent Ozar: Don’t do that. So instead, if you want to sync logins across multiple servers, Robert Davis Sync Logins; Robert Davis wrote the book, literally, on database mirroring, and so he’s got scripts out there on how to copy your logins to a database. And the same trick works for even just standalone SQL Servers.

 

Should I have nested roles or lots of roles?

Brent Ozar: let’s see what we got next here. Deborah says, “is it better to have nested roles or assign many roles to users?”

Erik Darling: It’s much better to have an empty nest.

Brent Ozar: Yeah, much more peaceful, says the guy with kids [crosstalk]…

Richie Rump: Ten more years they go to college, yeah.

Brent Ozar: You’re going to be so lonely. I don’t think any of us have an opinion on that one.

Erik Darling: No, we don’t have a lot of security opinions in general.

Richie Rump: Do security, that’s my opinion, do security.

Brent Ozar: Someone else do security. Look Securing SQL Server by Denny Cherry. It’s now, I think, in its third edition. So just check to make sure you’re getting the most recent version – yeah, third edition. Normally I say SQL Server security book and people are like, whoa. It’s great, really well written, it’s fun to read, it’s a really enjoyable digest for reading. And if you know Denny Cherry, you’ll think that someone else probably wrote it, because there’s not a typo in the book as far as I remember seeing. It’s really nice and cleanly spelled. I kid, because I love Denny, he’s wonderful.

Richie Rump: No, you kid because it’s kind of true.

Brent Ozar: It is completely true, yes. When Denny said his wedding vows to his wife, he couldn’t even get I do correctly. It’s like yeah….

Erik Darling: Do we, do I?

 

Have you used the hierarchical data type?

Brent Ozar: Wes Palmer says, “have you ever used the hierarchical data type, what are your views on it?” We just had a client question about that…

Erik Darling: Yeah, it’s okay, there are…

Brent Ozar: Gleaming endorsement.

Erik Darling: It’s okay, it does solve the hierarchical problem pretty well. There are some great articles out there that, I think, are much better choices. Jeff Moden, over on SQL Server Central, has some cool hierarchy on steroids articles, where he goes through how to calculate left and right bowers and traverse trees. And then Adam Machanic has a really good one, using CTEs and some crazy CLR code [inaudible 0:14:27.9] and that was a good read. The funny thing about that article was, one of my first blog posts when I was on your site was about – just a simple thing about common table expressions and how if you join a copy table expression it re-executes the syntax, blah, blah, blah. There was a comment on it from Adam, because this thing, this monster dropped like the same day or the day after or the day before, and he left a comment on my post, like I guess we both blogged about CTEs this week, and I was like yes, I guess…

Brent Ozar: I was so proud. And then he does the big monster post, of course.

Erik Darling: Yes, and so he does a lot of cool stuff. And [inaudible 0:15:10.6] kind of thing with the same author, the T-SQL querying book [crosstalk] there’s a pretty good chunk of space dedicated to hierarchies in there as well. So those are the – I would explore those choices before committing to the hierarchy ID type, just because, you know, it can be a little bit more clean and you don’t have to have all those crazy functions in there. I’ve seen, not CLR waits, but weird sleep task waits that can kind of dominate query execution when you’re waiting on those hierarchy IDs parent of, and whatever you calculate and you come back and forth.

Brent Ozar: Just saw that…

Richie Rump: Yeah, with everything, I think I say this ever week, but test it. You know, just go ahead, try one, try the other, test it, write a CTE. I’ve written a ton of recursive CTEs; if you know what you’re doing, you can get it to be fairly fast. Just go ahead and try it, test it and see what works in your environment.

Erik Darling: The one thing that I will say is cool about the hierarchy ID type is that the functions that you can use on that to, like, calculate [crosstalk] you can make computed columns from those and index those. So you can help yourself out a little bit in those situations. Where that’s a little bit harder to do, like persist, with a recursive CTE.

Richie Rump: And it really depends what you’re doing in a hierarchy. You could actually pre-build your trees and kind of do that, if you’re not going to touch it very much. But if your trees are always, kind of, moving in flex and if you kind of want to go from one node down to another one, or move up. There’s a ton of stuff that you could do with the tree, but do you need to do those things with the tree is really the question.

Erik Darling: Yeah, I played around with it a little bit just using the stack overflow database, because in the post table [crosstalk] and then the comments table, you can track that back. So you can make these kind of cool lists. You know, here’s like the initial question, here are answers to the question, here are comments to the answers and – so you can do kind of cool stuff like that.

Brent Ozar: Sum up like total score for a question [crosstalk]

 

Brent Ozar: let’s see, Jacqui says, “I bought your online everything classes and looking forward to getting to review them, hope to learn a lot.” Awesome, cool, I hope you learn a lot as well. Erik’s been adding performance classes too, and added some new T-SQL classes.

Erik Darling: Yeah, I’m going to start rolling out some of the triage-y stuff. [crosstalk] Yeah, closer to DBA stuff and performance stuff. Sort of how some DBA fundamentals work.

James says, “your T-SQL level up, level two was awesome.”

Erik Darling: Oh thanks, you’re the one who watched it [crosstalk] You know, I didn’t give my mom a log in because I didn’t want her to be that one view. So thank you for being the one.

 

Any good apps to automate database restores for testing?

Brent Ozar: Dorian says, “are there any good apps or scripts that can automate database restores for testing and development. [crosstalk]

Erik Darling: Yeah, it’s awesome for that, they have a crap-ton of automation built in. they even have a really cool one that Jason [inaudible 0:18:27.8] was telling us about at DBA Days, where they automate the backup restore, CHECKDB and drop of databases to a secondary server, if you want to get really crazy and offload your CHECKDB somewhere else. That is a really cool [problem 0:18:41.5] with a lot of good stuff built into it.

Brent Ozar: People often say, you know, like does anyone still buy backup utilities, you know, what’s the use now that compression is in the product. Stuff like that, adaptive compression, wizards for log shipping are better and they have cool extended stored procedures you can call for log shipping; some neat stuff.

Erik Darling: They also – the stuff that always blows me away about Litespeed though is, not to get too product endorse-y, you can do object double restores, you can restore a table, they’re really good at reading – you can read through the logs, to a certain point, behind the transaction you want. So like, if you have log backups every 15 minutes, you need to restore to a point within a log backup, it makes it dead simple to bring you to like the seven, eight, nine minute mark to a certain transaction. Like the stuff that the DBA [inaudible 0:19:31.5] if you had to do that on your own at 3am, you would be [inaudible 0:19:36.5].

Richie Rump: I have a question about that question. Why are we restoring production to test? Why are we restoring to dev? Our dev and test environments should be something completely different so that the developers can test, not just the 80% case that is typically in production, but the case that is kind of out there that usually doesn’t happen when I got data, kind of, all messed up. So our test databases need to be different than production and our development databases need to be different than test. So I’ve never been a fan of just restoring production into test and calling that a good test, because typically, you’re going to get the same thing in production and not the oddball whack-a-do problems that we need to be testing in test.

Erik Darling: But production’s a good thing to base that on. If you restore production data, you can then mangle that in some way to get your edged cases in, or you can, you know, do some scrubbing.

Brent Ozar: You can sell it to the Chinese…

Erik Darling: Yeah, Russians…

Richie Rump: But at that point, put it into a stage environment and test your deployment that way. [crosstalk] You should have certain test cases, be testing out your tests in a test environment.

Brent Ozar: Yeah [inaudible 0:20:42.4] You should, dear readers, also know that we will often advocate for things we don’t do here. So like, I think our dev databases actually are different here; in this case, they’re all with Stack Overflow. Ooh, the other thing I was going to say was – mainly on CHECKDB – if you’re the kind of shop where you can’t buy tools and you’re doing the restores in order to offload CHECKDB, the Minion Ware folks have this free tool out called Minion CHECKDB. And this is so neat to me because they built in all this load balancing, and they’ve also built in a popup that I can’t seem to close. Okay hold on [crosstalk] there we go. So this thing will let you, kind of, round robin your CHECKDBs. If you only have one server to do restores, it will restore different production boxes each day. This isn’t the kind of, like Litespeed has a GUI, click your things and a wizard. This isn’t that; you’re going to be 100% table-driven here, so it’s just more work, but it has a lot of cool capabilities.

 

Should I start learning Azure?

Brent Ozar: Alright, let’s see, next one, Ronnie says, “based on your experience in the field, do you think it’s necessary to start learning Azure or support cloud databases to the backups restores? Is that the direction of the big money jobs or is that a specialty skill set?” Well, we should do our round robin, so…

Erik Darling: You should learn R, I don’t know.

Brent Ozar: Oh yeah, sure.

Erik Darling: No, it’s – I think it’s always worth it to know this stuff and be prepared if, you know, someone wants you to start doing it. But would I make that my primary job focus? Probably not, you know, just looking at specifically Azure adoption rates. I’m not sure that that’s where, like, the big money – you could probably find some niche work in there. I don’t think you’re going to find, you know, the end of your career in it.

Brent Ozar: And so we should explain that, because other people out there are probably like is everyone doing Azure? So the work that you do out of there, what percentage of the work is like normal [inaudible 0:22:41.7] versus Azure versus AWS?

Erik Darling: Oh I think I’ve worked on maybe one or two Azure boxes in my time consulting with you. I work on a ton of AWS boxes. I haven’t hit a Google Cloud box yet aside from when we did the white paper work, but the majority is on-prem and on-prem virtualized.

Brent Ozar: Richie, how about you?

Richie Rump: I’ve kind of never been a fan of learning any knowledge without an understanding on where you want to go with it, right. So if you don’t have a plan of what I’m going to do with this cloud knowledge, then it’s probably going to be wasted time and wasted knowledge. I’ve always preferred a, kind of, multi-step approach to learning knowledge. I want to be able to know enough about the cloud so that I can have an intelligent conversation about it and maybe not necessarily how to implement it. And if I get into a gig where I need to implement it, then I’m going to take that next step and know a little bit more on how it actually works and maybe how to implement it and how – what the nuts and bolts are. And then maybe after that, I take another step to where I’m actually teaching about it, where I’m really intimate with the product and I can kind of show others how to do it.

The second part of your question is, well is the jobs going to the cloud? Kind of, maybe. I think a lot of companies are struggling with that question, especially, you know, we’ve got CIOs saying, we’ve got to go to cloud because we’re going to save all this money; and that’s, kind of, really out for debate about what you’re trying to do in the cloud. I think those companies that are, you know, doing Greenfield development in the cloud are saving money, and those that are putting Legacy stuff into the cloud are probably not saving as much. That’s just kind of my gut feeling for it. And the whole Azure versus AWS thing, I try to be agnostic, but I do a ton of Amazon work here, and I kind of really learned to love it, and as a Microsoft geek. And 20 years of my career has all been at Microsoft, I kind of feel bad about that, but then I yell at the Microsoft PMs in Azure and I just say make it better, you know. Because this is what AWS does, and now make this better; and it kind of makes me feel good that I’ve actually, you know, unloaded stuff off my chest and say Microsoft, you need to be more like AWS because I don’t know what Azure does, because it does like four or five different things, but it’s the exact same thing.

Erik Darling: One other thing you might run into is if you – say you spend the next three to six months and you learn a whole bunch of Azure stuff, and then you put it down and you come back to it in a year, year and a half, or even six months later, a lot of what you know might not be all that relevant. It probably won’t be as true as it was when you were using it, you know, it just changes so drastic.

Richie Rump: No, it’s totally going to change, and here’s a perfect example. So Lambda when it was first released, and it was only released what, two years ago, not even that?

Brent Ozar: [inaudible 0:25:28.5].

Richie Rump: Yeah, so when it was first released, there was a certain way you needed to go and end a function, right, and the way that you would end a function, you had to call certain things and it’s done. And that is still in the documentation, because it technically still works, but now they have a new way to do it. Essentially you call a callback and it’s done, and that old way is now essentially saying we’re deprecating it. It works but at some point, we’re going to pull it. And, you know, if you kind of knew that and then you came to it now and you start writing this old way, then a few months later they actually pull it out of the product, then you’re like – well one, what happened? And two, how does it actually work now? And three, oh boy, I was way behind the curve. Or even worse, you come back to it a year later and it doesn’t even work [crosstalk]

Erik Darling: Then their apps stop working because you wrote the function.

Richie Rump: It’s so crazy on how the cloud is just constantly changing; it is maddening. Then you get to an environment like Javascript that is also constantly changing as well, right. So moving target on moving target, and you know, trying to say that this is where we’re going to be for the next five, ten years, it’s not going to happen. This isn’t like we’re writing Cobalt code and it’s not going to change for 30 years, it’s a whole different animal.

Brent Ozar: Yeah, I would concur. Especially that you don’t want to waste your time learning something you’re not using. Literally, we wrote the Google white papers, they’re out of date within three or four months, yes, things start changing. I do either an Amazon or an Azure blog post about all the stuff around backup speeds. It’s going to be out of ate within a short number, a couple months, so just change [crosstalk]

Erik Darling: Like a lot of the stuff I wrote in the perf white paper is now not exactly true anymore because AD couple memory and CPUs from being tied together. Like usually have X amount of GBs per CPU and it has to even it out and we actually need more CPUs and more RAM and stuff. I should go back and revisit…

Brent Ozar: You asked one thing that was interesting, is that the direction of the big money jobs? It’s always supply and demand. So when there’s a huge demand and low supply, all of a sudden, those people will be raking in the big bucks, but it may not be for a long period of time. So for example, R and Python right now, there’s a huge demand, there’s very little supply. Those people are raking in the bucks right now. That will change as more R and Python people just simply come out of college, because those technologies are free.

Richie Rump: I mean that was true with Hadoop too right, a couple of years ago? And now, where’s Hadoop? I mean, you don’t hear about it anymore.

Brent Ozar: We blogged about it, like Jeremiah was doing Hadoop work, and really quickly realized that, oh my god, all these college kids are coming in and they’re making, you know, $25-$30 an hour, why would you want to compete with that? It’s a bad direction. Alright, well thanks everybody for hanging out with us this week at Office Hours, and we will see you guys next week. Adios.


Hyper-V: Getting Your Host And Guest Connected

Because I always forget stuff like this

I have a confession to make, dear readers:

I am a horrible VM admin.

I’m also not great at networking.

When I go to set up a VM to do something like install SQL vNext, I want to be able to talk to it.

Why bother?

Well, SSMS is no longer part of the SQL install. I don’t really want to copy another 800 MB file over to my VM and install it.

I’m also a devoted fan of Red Gate’s SQL Prompt, because I’m lazy and forgetful. I only have the one license, and I do my very best to respect that by not installing it (even temporarily) in more than one place.

And, you know, I should also probably practice what I preach about not RDPing into a server and running SSMS. That’s not very nice. Not very nice at all.

ICU TCP

So how does one get SSMS connectivity between a Hyper-V host and guest?

The magic of magic numbers. And by magic numbers, I mean IP addresses.

In your Hyper-V Manager, click on the Virtual Switch Manager

Switch Hitter

You’ll want to create an Internal network

Internal Affairs

Next, in your VM guest settings, you’ll want to assign it to use the internal network.

Doot doot

Now, on your host, head on into the Networking Control Panel where your connections are listed. You should have a new one for the Internal network you created in Hyper-V

Iconic

Get the properties of that adapter, and go into the IPV4 properties. I set mine to a generic subnet, and a random 192.168.X.X IP address. Just be careful here if you have hard coded IPs sitting around somewhere.

IP Freely

Now when you start up your VM guest, you’ll want to repeat those steps and assign it the same subnet and a different random IP address.

IP Green

Et Voila

You may have to kill Windows Firewall, or add in the appropriate exceptions on your Guest and Host to get things working in both directions. But this should be the end of it, and now you should be able to connect SSMS directly to your Guest from your Host.

Thanks for reading!


4 DBA Resume Anti-Patterns

Every now and then, clients ask me to interview candidates for their open DBA positions. I get to see a lot of resumes, and there are a few warning signs that pop up regularly.

1. The Keyword Encyclopedia. I get it: you’ve opened SSMS before, and you clicked on all the options and wizards, and you think that makes you qualified to use ’em. However, let’s do some quick math: take the number of weeks you’ve been at this job, and divide it by the number of features you mention. If you’ve been at a job for a year, and you’ve mentioned 50 features, I’m going to bet you don’t really know any of them deeply.

2. Red Fuzzy Underlines. Sure, SQL Server keywords don’t show up in all dictionaries, so doing a spell check on your resume can be an exercise in clicking the add-to-dictionary button. Trust me, it’s worth the time. It’s hard to overcome the manager’s skepticism when you don’t spell or capitalize SQL Server correctly.

3. Homer’s Odyssey. Just because you’ve been doing this for ten years doesn’t mean you have to write an epic poem about your journey. Get your resume down to 1-2 pages. While you may still have the battle scars from that DTS package implementation in 2003, you don’t need to wear them with pride on your resume.

4. Complete Lack of Soul. You’re going up against dozens of other candidates who are all typing the same exact keywords and job duties into their resume. Bring a little fun and personality in. If you’re going to put an objective or mission statement at the top, make it warm and inviting. Make the interviewer want to bring you in just to meet the fun person behind the keywords.

If you recognize these anti-patterns in your resume, just take a quick step back and ask, “What specific tasks do I love doing? What parts of this job am I in love with?” And then figure out how to communicate that level of excitement about those tasks.


Automated Tuning And The Future Of Performance Tuners

Before you get too excited

This isn’t a dive into any of the new automated tuning features of SQL Server 2017.

I’m interested in them because Microsoft thinks they can put us out of business with them, but…

Fat chance.

See, a lot of the automated performance tuning features assume you EVER HAD GOOD PERFORMANCE IN THE FIRST PLACE.

I mean, sure, it may have seemed okay when your database was 5, 10, or 20 GB, but now it’s 200 GB and things have just been getting worse for months. Or years.

These features are cool, but they don’t go in and fix your crappy code, your wrong settings, or get you off that VM with 8 GB of RAM that your VM Admin swears by the grace of his body pillow girlfriend is all a SQL Server will ever need.

After all, he read a book once.

Best of the worst

Our typical customer is worried about…

Keeping the lights on:

  • If they’re on the right hardware
  • If they can or should move to the cloud
  • If they’re meeting RPO/RTO
  • If their HA/DR strategy is correct

Decoding hieroglyphics:

  • How do I validate server settings?
  • How do I figure out what my problems are?
  • How do I find my worst queries?
  • How do I tune my indexes?
  • Bonus points: How do I interpret output from your free scripts?

Some customers are more savvy, have pains around parameter sniffing that they’ve identified, and want help with those. They’re tired of restarting the server, freeing the proc cache, or recompiling queries. But the thing is, there’s usually so much other stuff that gets uncovered, parameter sniffing getting fixed is almost a side effect of other changes.

Some examples:

  • An index adjustment took the bad plan choice away
  • It wasn’t parameter sniffing, it was a scalar function
  • There was a blocking problem
  • Non-SARGable predicates buried four nested views deep

The future

So, while I’m totally keen on automated performance tuning, and SQL Server 2017 being able to compare and correct bad plan choices, I’m still not worried about my job.

We still talk to people on 2008-2012 who are locked in because of vendor requirements, or other reasons, who just aren’t going to see those new features in the near future.

And plus, they still assume performance was good in the first place.

Thanks for reading!


Introducing sp_BlitzQueryStore

Oh, so you’re the one

While SQL Server 2016 adoption is still pretty low, and 2017 still quite isn’t out yet, that’s no reason to not invest a few days of dev time into a stored procedure that is only usable by the smallest portion of SQL Server users. Right?

Look, someone around here has to be future-proofing things.

With that in mind

I set out to write a stored procedure that does for Query Store (BOL) what sp_BlitzCache does for the plan cache, but with a new approach. sp_BlitzCache is glorious, don’t get me wrong! You ask it a question by giving it a sort order, and it gives you all the stuff in your plan cache ordered by that metric.

But, the plan cache is a fickle flower. Prone to clearing for various reasons, and often riddled with inconsequential items. I’m not saying Query Store isn’t also riddled with inconsequential items, but we can hopefully just do a bit more to avoid them.

I also didn’t want to reinvent the wheel – after all, there’s a GUI for Query Store. It has built in reports. I don’t wanna rewrite someone else’s query and dump it into a table and blah blah blah. That’s not fun for me, and it’s not helpful to you.

What we look for when we examine slow servers

When we come in to look at a SQL Server that’s in trouble, we wanna know what was going on when things were bad.

When you run sp_BlitzQueryStore, it basically:

  • Looks at your workloads over the last 7 days (that’s configurable)
  • Finds the highest resource consuming periods of time
  • Finds the top 3 queries that used the most of that resource in that time (the top is configurable)

So, if your highest CPU usage was on 2017-05-31 between noon and 3pm, we’ll grab the top three queries that used CPU, then we’ll repeat that for all the other metrics we grab. As of now, that’s: duration, cpu, logical reads, physical reads, writes, memory, and rows returned. By sampling each of those, we can get a pretty well-rounded view of the worst stuff in your Query Store data.

After we have a list of plans to go after, we circle back to grab additional information. Doing this all at once wasn’t very efficient, especially considering the breadth of data available in Query Store, plus collecting query plans and text. Also a departure from sp_BlitzCache, we’re no longer using one impossibly wide table to store everything. Data is reasonably normalized across temp tables that have specific purposes. Weird, huh?

Parameters you pass in

Right now, these are configurable parameters:

  • @DatabaseName NVARCHAR(128) — Database you want to look at Query Store for
  • @Top INT — How many plans per metric you want to bring back
  • @StartDate DATETIME2 — Start range of data to examine (if NULL, will go back seven days
  • @EndDate DATETIME2 — End range of data to examine (if NULL, will default to today’s date, though this changes if @StartDate isn’t NULL)
  • @MinimumExecutionCount INT — Minimum number of executions a query must have before being analyzed
  • @DurationFilter DECIMAL(38,4) — Minimum length in seconds a query has to run for before being analyzed
  • @StoredProcName NVARCHAR(128) — If you want to look for a particular stored procedure
  • @Failed BIT — If you want to look for only failed queries
  • @ExportToExcel BIT — Backwards compatibility, skips the generalized warnings, doesn’t display query plan xml, and cleans/truncates query text
  • @HideSummary BIT — Hides the general warnings table
  • @SkipXML BIT — Skips XML analysis entirely, just returns unanalyzed plans
  • @Debug BIT — Prints out any dynamic SQL used for debugging

Example calls

--Debug
EXEC sp_BlitzQueryStore @DatabaseName = 'StackOverflow', @Debug = 1

--Get the top 1
EXEC sp_BlitzQueryStore @DatabaseName = 'StackOverflow', @Top = 1, @Debug = 1

--Use a StartDate
EXEC sp_BlitzQueryStore @DatabaseName = 'StackOverflow', @Top = 1, @StartDate = '20170527'

--Use an EndDate
EXEC sp_BlitzQueryStore @DatabaseName = 'StackOverflow', @Top = 1, @EndDate = '20170527'

--Use Both
EXEC sp_BlitzQueryStore @DatabaseName = 'StackOverflow', @Top = 1, @StartDate = '20170526', @EndDate = '20170527'

--Set a minimum execution count
EXEC sp_BlitzQueryStore @DatabaseName = 'StackOverflow', @Top = 1, @MinimumExecutionCount = 10

Set a duration minimum
EXEC sp_BlitzQueryStore @DatabaseName = 'StackOverflow', @Top = 1, @DurationFilter = 5

--Look for a stored procedure name (that doesn't exist!)
EXEC sp_BlitzQueryStore @DatabaseName = 'StackOverflow', @Top = 1, @StoredProcName = 'blah'

--Look for a stored procedure name that does (at least On My Computer®)
EXEC sp_BlitzQueryStore @DatabaseName = 'StackOverflow', @Top = 1, @StoredProcName = 'UserReportExtended'

--Look for failed queries
EXEC sp_BlitzQueryStore @DatabaseName = 'StackOverflow', @Top = 1, @Failed = 1

For each of these, I’ve set @Top to 1, but you can change it to whatever. I’d ask you to be reasonable (say, less than 10), unless you’re skipping XML processing; it can be time consuming. Though we do de-duplicate collected plan_ids to reduce overhead, the higher number you use here does tend to increase the chances that we’ll have more distict plans.

What sp_BlitzQueryStore passes out

Maybe that was a bad choice of words. Anyhoo, here are some example of what output looks like, with some explanation.

I wanna take you to the Query Store

No Blitz proc would be complete without some thoughtful (hey, we try) analysis of collected data. This should look familiar to you if you’ve used sp_BlitzCache:

You’re a superstar

We have the database examined, query cost, query text, if it was a statement, or if it came from a stored proc or function, the query plan, any warnings we generated, and new here is the pattern column.

We also use similar routines to sp_BlitzCache to examine plan XML for actionable items. This is still limited to data in cached plans, but hey. It’s not like you’re used to better. Tee-hee.

One new feature in sp_BlitzQuery store is a column called ‘parameter_sniffing_symptoms’. Since we have fairly robust historical data about past runs of a query, we go and look for sharp discrepancies across a range of metrics, and given enough variation, we’ll list things out for you. This hits all the metrics we have data for (cpu, duration, et al.)

We’ll give you back a list of queries in a very familiar output table with all the analysis stuff, and important metrics. We also give you a generalized warnings table, just like in sp_BlitzCache, though with some new items in it.

Remember when I said we de-duplicate plan_ids? Before we do that, we set up a csv list of all the patterns a plan popped up in, since it’s possible that a query was running during more than one top metric. Also new is the parameter sniffing symptoms column. This is a csv list of all the large deviations in execution metrics, so you know how different parameters might be causing trouble.

Swipe right for more detailed metrics

We also show you a bunch of collected metrics from the query store views. These should’t need much explanation.

Yeehaw!

One thing to keep in mind is that data in Query Store is noted in microseconds, 8k pages, and KB. We standardize that to milliseconds and MB across the board. That makes things a bit easier for you human beings to consume.

If you’re on 2017 (or if the wait stats DMV here gets back ported to 2016), we’ll show you the top 3 wait stats for each query if they’re significant. Right now significant means >= 5ms. Maybe that’ll change, or be configurable in the future. Let’s see how this thing takes off first.

If you’re on a supported version (right now, 2017) this is what the top three waits column will look like. You’ll have to forgive this screenshot being boring, I haven’t done a lot on this particular VM lately. I promise, it’ll look snazzier when you do it 🙂

It turns out MSTVFs don’t de-duplicate well.

Introduce yourself

That’s a quick introduction post to our newest stored procedure. It’s available in our First Reponder Kit, which is still free. Please give it a shot and send me any feedback. This is v1, so the water is pretty blue for suggestions.

Thanks for reading!