5 Questions to Ask When You Upgrade SQL Server

Whether you’re looking to go cutting-edge and hop on SQL Server 2019 as soon as it’s available, or whether you’re just trying to evacuate from SQL Server 2008 to anything that’s going to be supported after this summer, you’re probably doing upgrade projects. This is a good time to ask yourself 5 questions:

1. Do we really need the higher compatibility level right away? When you flip the switch at the database level to enable a newer compat level, you’re taking a risk that your performance will be better across the board for every single query. The reality is that sometimes, it’s not. You probably tested your slow queries and found they went faster with the new compat level, but you didn’t test your fast queries to see if they regressed. After migrating to a newer version, just leave the compat level at the same one you had before for a week or two. Later, flip the switch on the newer compat level, watch what queries perform better or worse, and be ready to pop back to the older compat level temporarily if necessary. This just buys you and your team time to do performance mitigations on queries that regressed.

2. Will we enable Query Store? This built-in tool helps you do a better job of identifying which queries are regressing, and why. It’s not intuitive, and the default settings are a really bad idea, but you can work around those if you spend some time learning it ahead of time. It’s just a question of whether you’re willing to invest that time. (The tool isn’t going away, and I think learning how to configure and use it is a good investment that would pay you dividends in the long term.)

3. What’s our back-out plan? I’m not a fan of upgrading in place because it leaves you without a safety net. However, even if you build new VMs and migrate over to them, you still need to do a round of user acceptance testing to make sure everything’s okay on the new environment. Once you go live on a newer version, and you start doing deletes/updates/inserts (DUIs) on that new version, your data just flat out now lives in the new version. You can’t restore a database to an earlier SQL Server version. So before you let the users in, designate a team to do a round of quick performance tests, and have everyone agree on the success criteria. If they don’t sign off on go-live data, scratch it, and stay on the older infrastructure for another week or two (or more) until you iron out the performance issues.

4. What sp_configure settings and trace flags are we using, and why? Over the years, we might have implemented different tweaks in order to fix performance or reliability issues. When you upgrade, that’s a good time to stop and revisit those settings to see if they’re still relevant. Plus, their behavior can change – here’s a problem I’m running into right now with trace flag 834 and CTP 3.0:

Microsoft just doesn’t have the bandwidth to test every possible combination of trace flags and sp_configure settings. The more you venture away from the defaults, the more you’re taking risks that your particular combination won’t behave the way it has in the past.

Update May 29: with the help of the brilliant @sqL_handLe, we traced down that it’s a bug in the Hybrid Buffer Pool feature that’s trying to use conventional memory as a PMEM device. Boy, talk about a good example of it being difficult to test every combination of feature – I never would have thought to investigate Hybrid Buffer Pool as a culprit, and I’m still not sure how to do good root cause analysis on that feature given the limited instrumentation that I found during my investigation.

5. How often will we patch? SQL Server 2017 doesn’t have Service Packs anymore, only Cumulative Updates. Going to 2017 or 2019 is a good time to stop and talk with management about their expectations for patch outages. If they’re used to only taking annual outages for Service Packs, they’re going to be in for a big surprise with the frequency of Cumulative Updates: every 30 days for the first year of a product, and then every 60 days after that. When managers tell me they want less frequent maintenance windows, I walk them through the list of things fixed in recent updates. Take SQL Server 2017 Cumulative Update 15, for example: 2017’s been out for almost two years now, and yet CU15 still found & fixed bugs involving incorrect query results, Dynamic Data Masking security leaks, incorrect query results, stack overflows during backups, slight memory issues, and incorrect query results. Bugs are real, y’all.

What, you’re upset that SQL Server has bugs?

Well, it could be worse. The bugs could be there, and not getting fixed. Like, uh, the SQL Server 2008 that you’re supposed to be binning this summer.

Setting MAXDOP During SQL Server 2019 Setup

Just a short note – in SQL Server 2019, the setup process has a new MAXDOP tab:

The hyperlink about MAXDOP configuration points here.

I did a few quick test installs:

  • 4 cores gave me a MAXDOP 4
  • 8 cores offered MAXDOP 8
  • 12 cores offered MAXDOP 6

Historically, the 0 default meant unlimited cores. I like this setting instead because if somebody amps up the VM’s core count later, MAXDOP won’t automatically rise to match. If you jump from, say, 4 cores to 8, then I still kinda like the 4 setting better.

In vaguely-related-news, this week’s CTP 3.0 adds a lot of diagnostics for Azure SQL DB Hyperscale, plus a new sp_configure option for hybrid_buffer_pool (no trace flag required.)

My classes have really sucked lately, and I apologize.

I teach my Mastering classes in a regular rotation: Mastering Index Tuning, then Mastering Query Tuning, then Mastering Server Tuning. I go 1-2-3 like that because they build on each other, and I repeat ’em over and over through the year so people can gradually make progress.

I just finished another rotation, so it’s time to go through the reviews – not just the good ones, mind you, but all of the reviews, in order, to understand how I’ve been doing lately. Here goes:

Mastering Index Tuning Reviews

Mastering Index Tuning

Absolutely incredible! Indexing is something we can all work on, and honestly this is the place to brush up. Brent’s class was entertaining and engaging, and with the labs he sets up well worth the money. I couldn’t be more happy with the class, will definitely be looking into Mastering Query Tuning next!” – Brandon, March 11

“Throrough and entertaining. Brent hits Sql Server indexing’s high points, low points and all points in between. Then he gives you a lab to do the work to let the lecture sink in. During the lab, he’ll help you out, in real time. And later you can watch the session again and do the labs again, on your own. It’s a great way to learn. I wish there were similar courses for C#, Angular and Javascript.” – Jay Taylor, March 8

Brent’s way of teaching is unmatched! This class is an absolute must for SQL Server professionals. Perfect amount of lecture, lab time, and Q&A.” – Mohammad Darab, March 8

“Absolutely loving it. The class, Brent style, everything. I have to admit, I had my reservations before the class, but boy I was wrong about it. Less than 5 minutes in the first lab anyone can see the insane amount of work what was put into the class. This admiration only grows through the 3 days as everything is marching toward the last lab. I am really grateful to Brent and his team to use their extensive real-life knowledge and put this class together. Cannot wait till Query Tuning starts, probably I will start the previous recordings” – Kalotai, March 8

This class was just top notch. Brent is very entertaining and he explains the material in such a clear and concise manner. I learned a ton. Looking forward to the next class” – Katie, March 7

“The Master Index Tuning class is great. From the way the material is presented, to working the labs I really enjoyed this class. In fact, I have take 4 of Brents classes and they all are wonderful. I do highly recommend. And since I purchased the season pass, I look forward to the next class I take.” – Michael DeMore, March 7

Brent is a great instructor with deep SQL Server knowledge and has a great sense of humor. Class was extremely educational and I highly recommend it. Looking forward to signing up for additional courses in the future.” – C. Brien, March 7

Mastering Query Tuning Reviews

Mastering Query Tuning“If Mastering Index Tuning is a must have, Mastering Query Tuning is an absolute must have. I already know I will be in trouble for Mastering Server Tuning in about a month time :). Back to Mastering Query Tuning: it is not only informative, but the knowledge transfer is very entertaining too. And there is a great amount of knowledge in the class. By the end of the third day my head was tried to burst open a bit with the extra information, as it turned out I was not alone with this. I feel about Mastering classes as I feel about electric heated motorcycle gear: until you try it, you cannot even imagine what you missing from life. Once you had it, you do not want to live without it. If you work with SQL and you can only have one training course in your life, you really need to pick the Mastering classes. Go for all 3.” – Kalotai, April 8

“Another very educational, eye-opening course in the Mastering series! I’m not a developer by any means but after this course I am pretty comfortable tackling slow performing queries. Thank you Brent!” – Mohammad Darab, April 5

“After taking the Mastering Index Tuning class, I was expecting to come into Mastering Query Tuning and learn a lot. I was not disappointed. I would argue this class is harder than Index Tuning. There are just more moving parts to understand and take advantage of. One of the things I enjoy about this training is watching someone do things that are not necessarily part of the “plan” and then figure out what went wrong. Brent does a good job of it live. He’s not just a “trainer” who is parroting things off an MS SQL Server Training slide deck. I really like this training format. Brent does a great job in front of the camera. P.S. The difference between people that want 10 and 20 minutes breaks made me laugh out loud !” – Andy Benner, April 4

Mastering Server Tuning Reviews

Mastering Server Tuning“Brent is making harder and harder to write a review and not to sounds like a fanboy (fangirl). In my previous reviews (Mastering Index Tuning, Mastering Query Tuning) I have highlighted how great is the material, how well the lectures are built. The true appreciation for the classes comes though after you did all 3. I can only repeat myself that these courses a must have if you ever just looked at MS SQL server. The best thing what you can do for yourself is that you are not working on the days of the course, and you dedicate your whole day for the classes. I consider myself very lucky that I signed up for this. One of my best decisions in my life. Thank you Brent for creating these courses and doing a fantastic job to transfer the knowledge. Oh, and I am definitely a fanboy for life.” – Kalotai, May 14

“I highly recommend taking Brent’s “Mastering Server Tuning with Wait Stats” class. For three days I learned more about how to performance tune my database servers than I have in any other class I’ve taken. What I liked best about this class is that the examples and labs were not just some steps to follow to get to a predetermined solution, as I’ve had in other classes, but they were specific problems set up to make the students think about and work through the solution using the knowledge from the lecture. Brent’s lecture style is easy to follow along with and a lot of fun. He does a really good job of keeping the subject material interesting and keeping the students engaged. I found myself wanting more lecture just because he’s a lot of fun to listen to. I will say this class would be more beneficial if you take either “Mastering Index Tuning”, “Mastering Query Tuning”, or both first. I did not and felt slightly confused at times. It’s not a big hindrance, and you will eventually catch up, but it would give the student a better background. Regardless, Brent’s teaching style and real-world examples easily help you overcome any slight deficiencies from not taking those classes first. Either way, you can be positive that you will be a better at tuning SQL Server after taking this class.” – Chris Fair, May 9

“I LOVED this class! Before this class, I would feel flustered when tuning issues came up and I had one hammer that I would turn to when I was in trouble. Now I have a full tool kit and more confidence to find how I can improve a servers performance. The labs help tons and are great to help me fully understand the problem before I see the solution presented. While I took the online only class, I felt there was a ton of great interaction through the slack channel. I loved the stories that help me better understand the concepts and the easy and kind way that Brent answers questions. He doesn’t make you feel stupid for not understanding a concept and is patient to help you get the best result. I will absolutely be back for more! Thank you!” – Andrea Allred, May 7

“This class was well worth the price of admission. Brent is constantly trying to improve his classes and it shows. The labs and lectures show a great deal of effort has gone into them. The labs are rewarding when you get them right and there is a depth to them so that it’s not a cakewalk. On the other hand, Brent’s not afraid to tackle a problem live with everyone watching. I think that is one of the more enjoyable aspects of his classes. I really like the format of this training and I am hoping I can convince my boss to get another season pass.” – Andy Benner, May 6

“Awesome content, the labs are hands on and they have real day to day opportunities to analyze and solve . I highly recommend this training.” – CHaro, May 6


Live Class Season PassForgive me for a moment while I bask in the glow of your reviews. I’ll give you fifteen minutes to stop that. Okay, maybe twenty.

I’ve seriously poured my heart into building a balance: a balance of lecture versus hands-on labs, a balance of tough challenges versus things you can actually accomplish in the time given, a balance of theory versus real-world stuff you can put to use the very same day.

I’m really proud of what I’ve built, and how you’ve reacted in classes while we’ve worked together. I’ve loved the questions you ask, the excitement you share with me when we get to go off-track on new demos, and your willingness to collaborate with others in Slack on your homework. I love seeing y’all work together – and compete against each other, too!

I think my Mastering classes are the best way to learn things that will really make a difference in your career. However, what I think doesn’t matter – all that matters is what y’all think.

And I’m so excited and humbled and fist-pump-y that y’all like ’em too.

Another round is about to start.
Wanna join us?

Heck yeah, you do. Time to pitch it to the boss. Here’s my upcoming lineup for the next quarter, in order:

And if you wanna take all of ’em, that’s where my Live Class Season Pass comes in.

I’m not gonna lie: my classes aren’t cheap. If you want inexpensive classes taught by people who’ve never done this stuff in real life, who don’t build the very open source scripts you use in your everyday performance tuning, you’ve got plenty of choices. There are plenty of local instructors who’d love to read you the slides that somebody else wrote.

You should read the reviews on those, though.

Oh, the students don’t have the ability to leave public reviews?

Huh. Wonder why that is.

9 Tips for Faster SQL Server Applications

Wanna make your SQL Server applications go faster? Here are 9 simple tips that will give you a better chance for success. They’re not carved in stone, but they’re a good place to start.

Query Design Tips

1. Code for readability first. The easier your query is for humans to understand, the more likely SQL Server will be able to understand what you’re trying to do, and the better execution plan you’re likely to get.

2. Avoid scalar user-defined functions. Yes, functions are great for reusable app code, but SQL Server scalar functions are notoriously slow. They run row-by-row against your result set (possibly much more than that, depending on the query), and queries that call scalar functions are single-threaded, regardless of how much work they’re doing. Your functions might go faster in SQL Server 2019, but until then, don’t gamble.

3. Avoid table variables (@tablename). They sound like they’re going to be faster than temp tables, but they get bad row estimates, which drives the rest of your query to get a bad execution plan instead. If you need to store data somewhere temporarily, embrace temp tables (#mytable.) They’re the better of those two choices.

4. Never take user input to build a query. Crafty users will bypass your security using a technique called SQL injection. I know, you think you’ve built in a lot of security, but your code has to be perfectly bug-free every time. The hacker only has to be right once.

When you’re ready to learn more about query design, pick up the book T-SQL Fundamentals by Itzik Ben-Gan. The title sounds deceivingly easy, but this book will take you through several years’ worth of learning, and is great for veterans as well as beginners.

Table Design Tips

1. Start every table with a clustered index on an identity. If you’re just getting started with table design, use this tip to avoid problems with empty pages and forwarded fetches. Here’s an example of how to create a table with this ID built for you:

There are exceptions to the clustered index rule. If you need to generate unique values client-side, you can use GUIDs. If you have a staging table that is loaded from scratch and truncated every day, then a table without a clustered index (a heap) can be fine. However, if you’re not sure about what that means, just stick with a clustered index on an identity as a starting point.

2. Avoid NVARCHAR fields unless you really need ’em. The difference between NVARCHAR and VARCHAR fields is that NVARCHAR can store Unicode, but NVARCHAR requires twice as much storage space (even if you’re not using Unicode.)

3. Avoid MAX datatypes unless you really need ’em. You can use VARCHAR(MAX) and NVARCHAR(MAX) to store up to 2GB of data, but that comes at a performance price later when it’s time to index that data or compare it to other columns. You’re better off naming your length, say VARCHAR(50) or NVARCHAR(1000) based on the longest data you’ll put in the column. Don’t go overboard: SQL Server bases its memory grants in part based on the column definition. Oversize your column, and you’ll oversize your memory grant – which sounds good until you hit concurrency issues.

Hey buddy!

4. Recommended indexes are not smart, nor in order. When you see index recommendations that would supposedly make your query run faster, they’re generated by the modern equivalent of Clippy, the Office assistant. They’re better than nothing, but they’re rarely accurate. Think of them as nothing more than a hint that it’s time to consider hand crafting an index, and I teach you how in Fundamentals of Index Tuning.

5. Avoid storing files in the database. Store files in a filesystem, and in the database, just store a pointer to the file (either its http location or UNC path.) This keeps your database size lean and mean, lowers your licensing costs (because performance requirements will be lower), and makes your backups & restores faster.

When you’re ready to learn more about table design, pick up the book Pro SQL Server Relational Database Design by Louis Davidson and Jessica Moss.

Follow those tips, and your app has a much better chance of scaling.

Does the Rowmodctr Update for Non-Updating Updates?

Update May 20 – make sure to read to the end for an update.

Okay, look, it’s a mouthful of a blog post title, and there are only gonna be maybe six of us in the world who get excited enough to check this kind of thing, but if you’re in that intimate group, then the title’s already got you interested in the demo. (Shout out to Riddhi P. for asking this cool question in class.)

The system table sys.sysindexes has a rowmodctr that tells you how many times rows have been updated in an object. This comes in handy when we need to figure out if it’s time to update the statistics on a table.

To set the stage, let’s create a table, put a couple of indexes on it, and check rowmodctr to see how many changes have been made:

No rows have been modified yet since no data has been loaded. Let’s load 1,000 rows of data, and then check rowmodctr to see how many modifications have been made to the table:

Rowmodctr is 1,000 because 1,000 rows have been modified in the table – hey, inserts count as modifications.

Now the fun starts: let’s update everybody’s PersonName to be the same thing:

Remember, we had rowmodctr 1,000 for both indexes just a second ago – and now the index has gone up (saying it was modified), and the clustered index stayed the same (indicating that it wasn’t.)

I’m gonna be honest with you, dear reader: this was not the result I expected.

As a naive, delicate flower, I expected SQL Server to treat the rowmodctr the same way on both the clustered and nonclustered indexes. Instead, here we’re showing zero new modifications on the clustered index, but 1,000 new modifications on the name index.

In the big scheme of things, this isn’t a showstopper problem. It’s not like I’m over here waving my hands going, “OMG, this violates the laws of space and time!” It’s a system table, and if anybody’s relying on this thing for razor-sharp accuracy, they probably deserve what’s coming to them. I just find it so amusing that it’s handled differently on the two kinds of indexes – even though neither of them were modified. Huh.

How about if we explicitly tell SQL Server to set it to the same value:

Again, the nonclustered index shows modifications, and the clustered index doesn’t.

So what’s the takeaway?

I don’t think this is a performance-shattering problem: it’s probably fairly rare that you find an application that runs updates even when no changes were made.

To be clear, the fix isn’t to switch to a delete-the-old-row-and-insert-a-new-row design pattern: that would be twice as bad, since now you’re doubling the number of rows modified with each operation.

People who bought this post also bought “The Impact of Non-Updating Updates” by Paul White and the excellent answers to this DBA.StackExchange question about updates.

Update May 20 – hoooweee, boy are my skills dated on rowmodctr. Since SQL Server 2005, Books Online points out that rowmodctr isn’t a counter for the number of modified rows, as the name might imply. Instead, it’s about the number of updates that would have affected the statistics on the index involved. Since the clustered index here is just about the Id column, and the updates weren’t on the Id, the rowmodctr doesn’t increment there. Doh! Shout out to the folks who DMd me on Twittter and left comments here. Good catch!

Building SQL ConstantCare®: Letting You Mute Recommendations

When we first launched SQL ConstantCare®, I saw it as a mentoring service, not a monitoring service. I wanted to give you advice that I’d learned the hard way across managing lots of systems.

With that in mind, I told users as they came onboard, “If there are any recommendations that you want me to mute, just let me know.” Sometimes, folks would email in about why they didn’t want to do something, and I agreed, so I muted it. Other times, we had a mentoring conversation, and they learned about why an issue might be more important than they’d assumed.

Here are the recommendations you muted most often:

  1. Consider Enabling Query Store – muted by 42 users (and I blogged about that)
  2. Server Triggers Enabled – 33 users (which makes sense, once we talked through what their triggers were doing – turns out a lot of you use these in smart, creative ways)
  3. Transaction Log Larger than Data File – 28 users (mostly due to SSIS)
  4. Check for Corruption ASAP – 28 users (offloading to secondaries or restored servers)
  5. Move User Databases Off the C Drive – 28 users (mount points, servers on death marches)
  6. Take a Full Backup – 27 users (offloaded to AG secondaries not being monitored, dev servers, scratch databases, truly ginormous archive databases)
  7. Move TempDB Off the C Drive – 20 users (see #5)
  8. Check Unusual Database States – 18 users (offline databases kept around for safekeeping)
  9. Take a Log Backup – 15 users (see #6)
  10. Too Much Free Memory – 15 users (usually queries with comically large memory grants that ran simultaneously, and the DBAs couldn’t fix it for various reasons)

And…I’m okay with those. (There were plenty of others that I talked people out of, like auto-close, auto-shrink, CPU schedulers offline, and the plan cache being erased daily.) But for the most part, y’all made pretty good decisions about what you wanted to mute.

So now, you can do it yourself in the emails.

Starting with this week’s release, there are Mute links next to the recommendations, plus next to the individual details, like which database hasn’t been backed up:

You can either mute the recommendation altogether on this server, or just for one (or more) specific databases.

I’m really curious to see how this changes user behavior because I’m positive that folks are going to go mute-crazy. I have a hunch that y’all don’t want to tell me you’re going to ignore something – but you just want that recommendation to go away. (We send regular followups reminding you of what you’ve muted – because in a lot of companies, managers are included in the emails too, and we don’t want them to think their employees have a perfect bill of health when in reality they’re just muting everything.)

Behind the scenes, here’s how it works.

Normally, when you click on a web link, you expect to open a web page, which is served to you by a web server. That server takes your request, makes a connection to a database, and does your bidding.

That’s so 2008.

Instead, we’re using Function-as-a-Service, aka serverless. Serverless is a terrible name since there are still servers involved, but like I blogged about a year ago, it just means the servers aren’t my problem. Here’s how it works:

  • You click on a link in the email – and the link contains a unique string that represents you and the thing you can mute (think GUID, but it’s different)
  • Your web browser opens a static HTML web page – no fancypants web server required, just an HTML file (and supporting documents) that live in Amazon S3
  • Your web browser itself calls Amazon API Gateway – which runs the appropriate Lambda function, which can log your requested GUID (which we could write straight to the database, or just log it to a queue to be picked up later, since it’s not urgent)
  • If that request goes successfully, your browser renders a success message – or else it gives you info to send to us for diagnostics

It’s exactly how PasteThePlan works, too. We don’t have to provision app servers, patch them, manage load balancers, patch databases, etc. Richie can just focus on the code, the unit tests, and fixing my crappy Postgres functions. It’s not that serverless is easy, mind you – it’s hard, especially since it’s still in its infancy – but it just moves the work from systems administration into development.

Hosting? Well, Amazon takes care of the rest and gives us nice reports on how often y’all are using it:

This is one of those posts where I step away from the DBA role for a minute and tell you about something I just find personally mesmerizing. The serverless code costs so absurdly little that it doesn’t even pay to bother performance tuning it:

Whereas the database pricing, that’s another story. Isn’t it always…<sigh> So looking forward to serverless databases becoming more powerful and more mainstream for these kinds of sporadic use cases.

Oh while we’re here, next up in the release list: combining all your servers into one daily email if you want. Code’s done, and we’re testing it now.

Why Order Isn’t Guaranteed Without an ORDER BY

Start with the Stack Overflow database, with no nonclustered indexes. If you’ve been playing around with demos, run my DropIndexes proc just to clear things out, which leaves the clustered indexes in place.

Say you run this query:

By default, with a clustered index on the Id field (identity, starts at 1 and goes up to a bajillion), your results will probably look like this, ordered by Id:

Because if you look at the execution plan:

SQL Server is doing a clustered index scan. Since we’re clustered on Id, the results happen to come back in Id order.

But watch what happens when I create an index on DisplayName and Id, and then run the exact same query again:

Now the data comes back ordered by DisplayName due to the way SQL Server ran the query, as shown in the execution plan:

As I talk about in How to Think Like the Engine, SQL Server doesn’t have to do a clustered index scan for queries like these. It can use the smallest copy of the table that achieves its objectives, and in this case, it’ll use the nonclustered index because it takes less time to scan.

If you have to scan the Users table to get DisplayName and Id, which of these two objects would you rather scan:

The clustered index is 58MB, but our index on DisplayName and Id is only 9MB. It’s faster to scan that nonclustered index.

SQL Server may use indexes you didn’t expect.

Say we’ve got an index on DisplayName that happens to include Reputation, and we frequently run this query:

Whenever we run the query, we always get the data back sorted by DisplayName, so we assume that it’s always going to be that way:

But that’s just because SQL Server happened to use the index on DisplayName that includes Reputation:

Because SQL Server knows a whole lotta users have Reputation = 1, so it can scan that index and quickly find the rows it needs. It doesn’t need to scan the ENTIRE index, just enough of it to find 100 users with Reputation = 1.

But the instant we create a “better” index – one that allows a seek, for example – and say that we only include DisplayName (not sort it):

The data doesn’t come back ordered. SQL Server did an index seek on our new index:

But since DisplayName was only included, not sorted, even just a 100-row result set doesn’t come back in the order you might expect.

So in summary: if you need stuff ordered, add an ORDER BY, or else your order can get changed based on the execution plan SQL Server chooses.

Which Queries Should You Cache in the Application?

I’ve talked about how the fastest query is the one you never make, and why when you’re scaling up a database application, you probably want to start caching query results in the app tier. Caching means:

  • Less licensing costs for SQL Server
  • Lower CPU usage both for query execution and for compilation time
  • Lower locking/blocking overhead (even NOLOCK requires a schema stability lock, for example)
  • Less demands on query workspace memory

But how do you go about finding which queries to cache?

Find your most frequently executed queries, then ask 3 questions about them.

Run sp_BlitzCache @SortOrder = ‘executions’, or look on the Plans Executions tab of the Consultant Toolkit. Look at the executions per minute column – and on seriously high-volume servers, I’ll usually divide those numbers by 60 to get executions per second, which drives the point home even more. With those queries in mind, ask:

Question 1: How often does this data change? Often, we’re dealing with single-table queries that fetch a value from a global config table. The data just doesn’t change often, and could easily be fetched just once at application server startup. If we needed to change a config value, worst case, we could restart the app servers gradually and have them fetch the updated data.

When you’re not sure how often data changes, look at the table’s contents:

In particular, you’re looking at the statistics on each index to show you how many writes the table has had recently. It can be tough to discern exactly when this number resets: it varies based on your version, patch level, failovers, restores, etc., but just the ratio of reads vs writes can be really enlightening. For example, at a recent client, we noted that we were fetching employee phone numbers millions of times per day, when only a handful of changes had been made over the last several months.

Question 2: Will out-of-date data really hurt? Some data absolutely, positively has to be up to the millisecond, but you’d be surprised how often I see data frequently queried out of the database when the freshness doesn’t really matter. For example, one client discovered they were fetching live inventory data on every product page load, but the inventory number didn’t even matter at that time. During sales, the only inventory count that mattered was when the customer completed checkout – inventory was checked then, and orders only allowed if products were still in inventory then.

Question 3: How much data are we talking about? Is it a config table with only a handful of rows, or is it a product list with millions of variations? The less data involved, the easier it is to cache, especially if we’re trying to cache in memory on the application servers. The more data involved, the more it’s going to hurt if we try to duplicate that data on every app server – and the more we need to start investigating a dedicated caching system like Redis, AWS ElastiCache, Azure Cache, or Google Cloud Memorystore.

Cache rules everything around me, yo

Cache rules everything around me

Which leads me to the second of my caching truths:

  1. The fastest query is the one you never make, and
  2. The first query you cache is the hardest.

When you implement caching for the very first time in an application, you’re going to be torn about a million little decisions. What caching platform do you use? How do you set it up? How do you test that it’s working? How much redundancy do you need? How will you add the code to your application?

But when you see the first results – millisecond-level response times for frequently accessed data, and lower loads on your database server – you’ll be sold. You’ll cache first, and ask questions later. That’s when you’ll start running into problems with cache invalidation, but for now, find your most frequently executed queries and ask those 3 questions.

I’m coming to SQL Saturday Salt Lake City in August!

Performance Tuning in 21 DemosI’m coming to Salt Lake City in August for SQLSaturday #887. On Friday, I’ll be teaching a one-day pre-con:

Performance Tuning in 21 Demos – You’re a developer or DBA who wants to tune indexes and queries to make your application go faster. You don’t like learning by watching slides or reading white papers – you want to jump right into the code and see what happens. You’re the kind of person who loves watching YouTube videos that start with, “Hold my beer and watch this.”

Attendees get their choice of a Recorded Class Season Pass or a Consultant Toolkit – so the pre-con pays for itself! Learn more and register now. There are only 50 seats available.

Can’t make it to Salt Lake City? Here’s where to find me this summer:

I won’t be at Summit this fall – I’m skipping that this year for a long-overdue Hawaii cruise – so if you wanna catch me this year in person, those are your best bets. (There’s always my live online classes, too.)

Book Review: Learn T-SQL Querying by Pedro Lopes and Pam Lahoud

Learn T-SQL Querying by Lopes and Lahoud

Learn T-SQL Querying by Lopes and Lahoud

You’ve been writing queries for at least a year or two, and you’re curious about what’s happening under the hood when SQL Server runs a query. You want to know what you can do to improve your queries, tune your indexes, and change SQL Server’s configuration settings so that your results will return faster.

The book Learn T-SQL Querying is for curious people like you.

When I first read the Table of Contents, I wondered how the authors were going to cover so much ground in ~450 pages – and they did a great job striking a balance between giving you enough to get started, versus dragging you into the weeds with obscure internals details. Great example: pages 170-171 only spend two paragraphs on why you shouldn’t be using cursors – nuff said. There were so many times where I read one paragraph and thought, “I sure hope the readers get how important this is.” It’s a lot of killer, and not a lot of filler.

Pam & Pedro give you enough to pique your curiosity and make your queries go faster, and if you’re interested in learning more about a specific thing, well, that’s what Google is for. This book teaches you enough to Google for the right things.

Despite the title, it is by no means a level 100, here’s-how-to-write-a-query book. The first couple dozen pages act as if they’re a gentle on-ramp to the parts of a query, but just jump past those to chapter 2, understanding query processing, and Pedro Lopes (@SQLPedro) and Pam Lahoud (@SQLGoddess) start getting into how a query is optimized.

Or if you’re not quite so curious, and you just want faster queries, jump to chapter 6 and learn about anti-patterns.

Or if you’re not sure what queries you should review first, and you’re curious about what it would take to build your own plan cache analysis tool, turn to page 268 and start reading about the plan cache DMVs.

I realized the picture should have two thumbs up, not one

That is really my only small objection to the book, and it’s a tiny one: it seems just a little bit out of order, with some really challenging concepts coming before the easier parts about simple things you can do to make your queries go faster. (Page 17 introduces the WHERE clause as if you’ve never heard of it before, and by page 30, we’re into worker threads, tasks, and degrees of parallelism.)

That’s easy to work around, however: just hit the Table of Contents, find the parts that interest you, and read those first. You’ll come back to the others later anyway because this is the kind of book that rewards repeated reading. More than that, it also rewards using the contents as a jumping-off point for additional learning.

I highly, highly recommend this book for folks who want to start a curious journey into SQL Server execution and tuning. Get it, and let me know what you think.

Developers: Azure SQL DB Serverless can save you money.

If you’ve got a database that doesn’t need to be on all the time, Azure SQL DB Serverless is in public preview now. It makes a lot of sense for those development databases you have to keep around, but don’t really want to pay for.

If your database doesn’t get any queries for 6 hours, it’ll pause billing, and automatically resume again when your queries kick in. Here’s how to set it up in the portal:

The gotchas list looks fair, but a few things stand out:

  • This tier only has 0.5-4 cores (~2-12GB RAM)
  • The minimum auto-pause duration is 6 hours (that’s some pretty doggone slow response time)
  • After auto-pause, the first login attempt will fail
  • Any queries whatsoever reset the clock (including login attempts, which means this won’t really work for databases exposed to the Internet, which you probably shouldn’t be doing without IP restrictions anyway)
  • Not currently available in Australia Central, China East, China North, France South, Germany Central, Germany Northeast, India West, Korea South, South Africa West, UK North, UK South, UK West, and West Central US

This is a total no-brainer for non-public-facing, development-only or archive databases. For hobby apps, the 6-hour time with zero queries doesn’t really make sense: it’s just too long, especially when search engines are going to hit your site at least that often, let alone things like monitoring tools and automated builds.

Update 2019/07/08 – Microsoft just updated the minimum auto-pause duration, dropping it down to 1 hour (from 6). That makes it a much more viable option for cost savings.

The Silent Bug I Find in Most Triggers

I don’t have a problem with triggers. They get the job done when you need to implement business logic in a hurry, and you’re not allowed to change the application. As long as you keep the number of statements to a minimum (say, 2-3), and don’t try to do something really slow like fire up a cursor, triggers can be an efficient way to solve hard problems quickly.

However, most triggers I run across have a really, really dangerous bug.

Let’s say we want to add a trigger on the Stack Overflow database‘s Users table. Whenever someone’s Reputation is over 1,000 points, we’re going to set their AboutMe to declare that they’re famous:

That trigger assumes only one row will be in the INSERTED table, which would be true – as long as we only update one row at a time. But what if a query updates multiple rows in a single transaction?

Let’s say we want to award a free reputation point to everyone in Brazil. Here are some of the people we’re going to affect:

There are lots of people involved:

Many of them have reputations over 1,000 points, so those folks are all going to be set to famous, right? Let’s see what happens when we run the update:

And then check to see their new famous AboutMe:

Wait…something went wrong. There are users with a reputation over 1,000, but don’t have “FAMOUS” in their AboutMe. A bunch of ’em simply got ignored.

That’s the bug.

When you declare variables and set them using one row from the INSERTED or DELETED virtual table, you have no idea which row you’re going to get. Even worse, sometimes this trigger will update one row, and sometimes it won’t – because it might happen to grab a row with a reputation under 1,000!

Here’s a better way to write that trigger.

Don’t use variables – instead, think set-based:

In that version, we’re simply joining with the INSERTED table, thereby ensuring that we get all of the updated rows. We operate on ’em all at once, too – think set-based, not a cursor.

Then, when we update multiple rows at once:

The right ones are affected:

tl;dr: when you see the variables being set with the contents of a single row from the DELETED and INSERTED tables, you need to raise an urgent alarm because that trigger probably produces incorrect results when multiple rows are deleted/updated/inserted in a single statement.

“Hey, that’s not SSMS in those pictures.”

Yeah, you caught me: I’m using Azure Data Studio on my Mac, and building SQL notebooks with it. Here’s a quick runthrough of how it works:

How to get started:

Are your CPU-intensive queries running slowly? Check your VM type.

When you build a VM for SQL Server in Amazon EC2, Azure VMs, or Google Compute Engine, it’s all too easy to overlook the CPU speeds that you’re getting. Most of the time, it doesn’t really matter because SQL Server isn’t usually CPU-bottlenecked: your queries are doing table scans with 16GB RAM, starving for IO throughput.

But if you’re seeing high CPU usage, or you’re seeing SOS_SCHEDULER_YIELD waits on your queries, stop to check the VM type you used when building this VM. Once you’ve found it, the easiest ways to look up your CPU speed are:

In AWS, for example, I run across a lot of customers in the 2xlarge tier: 8 cores and 61GB RAM. Most of the CPUs in that price range are tragically slow:

From AWSinstances.info

That means if you’ve got queries that suffer from a CPU bottleneck, and they’re not able to go parallel (like because they call scalar functions), you’re not going to have a great time with these instance types.

To pick a better instance type, do a little more digging. EC2instances.info shows some clock speeds as “unknown” when the processors are custom silicon built for Amazon:

Top secret

Top secret

To learn more about those models, hit the AWS documentation pages:

  • r5.2xlarge: custom Xeon with “sustained all core Turbo CPU clock speed of up to 3.1 GHz”
  • z1d.2xlarge: custom Xeon with “sustained all core frequency of up to 4.0 GHz”

The prices aren’t much higher than the instance types you’re already running, either. In fact, it’s been just the opposite for me on two clients in a row! Both had built their SQL Server VMs a couple years ago, and newer instance types not only gave them faster CPU speeds, but reduced costs as well. On a mature application, once it’s been up for a year or two with a stable user base, it’s easy to measure the SQL Server’s performance needs to figure out if it’s constrained by CPU, memory, storage, locks, or something else. Then, you’re better equipped to build the right VM type to save yourself money while making queries go faster.

And before you say, “But I wanna tune the code and indexes!” – you can totally do that too. Why not both? (Especially if changing the VM type saves you money.)

This is also a good reason to touch base with management to see if you’re using reserved instances, and if so, when those reservations expire. If you’ve had a reservation for more than a year, it’s probably time to revisit the latest instance types for a free speed boost.

Pop Quiz: SQL 2000 Bug or SQL 2017 Bug?

2002: Me and Bailey, mom’s Great Dane

When I was thinking about this site’s 17th birthday, I started wondering: back in 2002, what bug might I have been working with in SQL Server?

I started reading back to the list of bugs fixed in SQL Server 2000 Service Pack 1, and it hit me…these bugs aren’t all that different than the ones we’re facing today.

Before you click on each bug, guess whether it was in SQL 2000 Service Pack 1, or last month’s SQL Server 2017 CU14. When you click on each one, you’ll see the release notes for it. (Unfortunately, the 2000 bugs all point to the SP1 page – the individual bug pages have been lost to the sands of time.) Good luck!

  1. CREATE for existing object with IDENTITY column causes duplicate IDENTITY values
  2. Statistics maintenance causes significant bottleneck on SQL Servers that use 16 or more CPUs
  3. Assertion failure occurs when you try to back up database in limited disk space
  4. TCP Timeout or login time-out error occurs when you connect to SQL Server using Integrated Authentication
  5. Error removing log shipping on secondary when database name has a quote
  6. Upgrade fails with an error when sysadmin account “sa” is renamed
  7. SQL Server resource DLL may fail to log to the event viewer in a cluster
  8. Configuration option network.enablekdcfromkrb5 now set to false instead of true
  9. SELECT may not return all rows if it contains a large number of values in an IN clause on a NUMERIC column
  10. Query results are not as expected when you run a particular query from Excel
  11. Complex DISTINCT or GROUP BY query can return unexpected results with parallel execution plan
  12. “Non-yielding” error occurs when there is a heavy use of prepared statements
  13. INSERT statement incorrectly conflicts with constraint
  14. DBCC STACKDUMP doesn’t generate dump file
  15. Combination of multiple EXIST, OR, and subquery clauses may give a sub-optimal plan (you know, I’m gonna go out on a limb and say that still happens)
  16. Incorrect record is deleted or updated when a clustered index is in descending order
  17. Assertion occurs when a parallel query deletes

Let me know how you did in the comments.

Updated First Responder Kit and Consultant Toolkit for April 2019

This month, lots of improvements and bug fixes, including checking for paused resumable index operations, Evaluation Edition expiration dates, sp_BlitzCache shows the number of duplicated plans for each query, sp_BlitzIndex will skip a list of databases for you, and more.

To get the new version:

sp_Blitz Changes

  • Improvement: check for First Responder Kit stored procs that are out of date (in case you’re updating sp_Blitz, but not its relatives.) (#1994, thanks JeffChulg!)
  • Improvement: check for resumable index operations that have been paused. (#2010)
  • Improvement: warn on Evaluation Edition expiration dates. (#2017)

sp_BlitzCache Changes

  • Improvement: shows the new air_quote_actual plans from sys.dm_exec_query_plan_stats on SQL Server 2019 and Azure SQL DB. (#2019)
  • Improvement: show the number of multiple plans for a query in the Warnings column, like “Multiple Plans (105)”. (#2026)
  • Fix: arithmetic overflow when calculating terribly bad total reads for a plan. (#2018, thanks IanManton for the bug report.)
  • Fix: Azure SQL DB wasn’t allowed to sort by memory grants or spills because we were checking version numbers rather than sys.all_columns to check if the necessary columns exist. (#2015)
  • Fix: the output table wouldn’t get created when called with @SortOrder = ‘all.’ The table would get populated if it already existed, it just wouldn’t get created for the first time. (#2009)

sp_BlitzFirst Changes

  • Improvement: when we call sp_BlitzCache in the 15-minute Agent job for the Power BI Dashboard, we now run it with @SkipAnalysis = 1, @SortOrder = ‘all’. This runs faster (because it doesn’t analyze the plan XML looking for anti-patterns), and simultaneously gets you more query plans because we sort by all methods, every time. (#2009)

sp_BlitzIndex Changes

  • Improvement: new @IgnoreDatabases parameter lets you skip a comma-delimited list of database names. We use that same logic to populate databases with over 100 partitions, which means @GetAllDatabases = 1 works normally, but skips databases with over 100 partitions (and shows them in the output list as to why they were skipped.) To examine those, use @BringThePain = 1. (#1944, thanks Richard Hughes for the idea and Rich Benner for the starting point for the code.)

sp_BlitzLock Changes

sp_BlitzQueryStore Changes

  • Fix: fixed error “The multi-part identifier “s.is_cursor” could not be bound.” (#2024)

sp_ineachdb Changes

  • Fix: now it actually does stuff even if you don’t specify @Help = 1. I tell you what, people, this is how I can tell you’re not using this proc yet, hahaha. (Pull #2035, thanks Aaron Bertrand.)

For Support

When you have questions about how the tools work, talk with the community in the #FirstResponderKit Slack channel. If you need a free invite, hit SQLslack.com. Be patient – it’s staffed with volunteers who have day jobs.

When you find a bug or want something changed, read the contributing.md file.

When you have a question about what the scripts found, first make sure you read the “More Details” URL for any warning you find. We put a lot of work into documentation, and we wouldn’t want someone to yell at you to go read the fine manual. After that, when you’ve still got questions about how something works in SQL Server, post a question at DBA.StackExchange.com and the community (that includes us!) will help. Include exact errors and any applicable screenshots, your SQL Server version number (including the build #), and the version of the tool you’re working with.

SSMS v18 and Azure Data Studio: The Golden Age of SQL Server Tooling

Yesterday, SQL Server Management Studio 18.0 shipped. Here’s the official announcement, the download page, and the release notes.

Yes, it’s still free, and yes, they’re still adding features. Yes, they removed database diagrams and the debugger. (Update Apr 27: if you want the debugger, use Visual Studiodownload.) No, SSMS 18 doesn’t run on Windows 8 or older. Yes, it still has bugs (all software does), but they’ve been working hard on making it less buggy, as evidenced by the release notes:

“Crashes And Freezes Into Me” is the name of my Dave Matthews Band cover band

You, dear reader, are so lucky. Every time there’s a new release of SQL Server or SQL Server Management Studio, you can grab the latest version of SSMS and keep right on keepin’ on. Your job still functions the same way using the same tool, and the tool keeps getting better.

And it’s free. You don’t have to ask the boss for upgrade money. You can just download it, install it, and take advantage of things like the cool new execution plan est-vs-actual numbers (which also cause presenters all over to curse, knowing that they have to redo a bunch of screenshots.)

I spend a lot of time jumping back & forth between SQL Server and Postgres, and lemme just tell you, the tooling options on the other side of the fence are a hot mess. It’s a sea of half-ass, fully-expensive tools with a mishmash of feature coverage. Pick one, and you just pray that the maker keeps updating it and adding new features. Sometimes they don’t. Often, they don’t.

Microsoft isn’t just building one good free tool.

In April 2019, most people who work with Microsoft SQL Server should be using SQL Server Management Studio (SSMS.) It’s free, very robust, and it’s easy to find tutorials that show you how to do what you want to do.

However, for some of us – a very small minority, and I’m in that group – mid-2019 marks the time to add a new tool to your toolkit. Azure Data Studio (ADS) is kinda like SSMS, but for developers. It’s less about management, and more about code authorship. As of this writing, it’s best suited for people who:

  • Use Macs or Linux as their daily workstation
  • Query both SQL Server and PostgreSQL
  • Don’t need to tune those queries to make them go faster
  • Need to source-control queries, say with Github
  • Want to share some queries in an easy-to-follow-along format (Jupyter notebooks)
  • Don’t mind using a less mature, less documented, rapidly changing tool

I’m being exceedingly careful here about marking a narrow audience. People who fall slightly outside of that audience – say, Windows users who need to performance tune queries – may also use ADS, but have a less satisfactory user experience. They might download & install it, use it a few times, and go, “What the hell, this tool is garbage.” The tool just isn’t ready for those users yet – maybe it will be in time, but right now, let’s focus on who it’s ready for.

Azure Data Studio is ready for me.

I’ve used Macs for over a decade, and for the last couple of years, I’ve been splitting time between SQL Server and PostgreSQL. Recently, I’ve been spending more and more time in Azure Data Studio accomplishing my work on both databases.

You still won’t see me using it in most of my training classes or presentations yet. As a trainer, I need to meet my students where they are, and the vast, vast majority of the SQL Server world is still in SSMS – and that’s totally fine. I’m not on a mission to migrate you.

If you have job duties like me, go give Azure Data Studio a shot, and if you need help, join the #azure-data-studio channel in the SQL Server community Slack.

If you’re more mainstream, go get the new release of SSMS 18.

Either way, it’s free, and you’re lucky.

BrentOzar.com is Turning 17 Years Old.

I registered it before then, and it used to be a static HTML site, but I consider the birthday to be the first published blog post from May 7, 2002. A few fun stats:

2002 Brent: at home with a Thinkpad with a webcam module on top, reading HAL-PC Magazine

To celebrate, take 50% off your first year with coupon code Happy17th:

It’s good for the next 17 days. Here’s to another year of fun!

Fixing Performance By Throwing Cloud At It

Your databases are hosted in the cloud – either in VMs or in a database-as-a-service – and you’re having a performance emergency that’s lasted for more than a day. Queries are slow, customers are getting frustrated, and you’re just not able to get a fix in quickly.

Just ask management if they want to throw hardware at it.

Here’s what your email should look like:

Right now, everyone’s unhappy with performance, and I’m in the middle of researching the issue. I’m not going to have a definitive answer – let alone a solution – for at least a few days.

In the meantime, do you want to spend $1,500 per week to temporarily increase performance?

If so, I can change our VM from an r5.4xlarge (16 cores, 128GB RAM) to an r5.24xlarge (96 cores, 768GB RAM.) Our VM costs will go from $293/week to $1,757/week. Those costs don’t include licensing because I’m not familiar with how we’re licensing this VM – that would be a separate discussion for Amy in Engineering, who manages the licensing.

Let’s buy stuff

The email is short and to the point because:

It ignores blame – I don’t care whose fault it is, whether it’s my own fault for being unable to fix it faster, or a growing customer base, or bad code, or maybe the database server should have been larger in the first place.

It doesn’t include an end date – maybe I’m going to be able to solve it in 3 days, but maybe it’ll take a month. This open-ended temporary solution buys us breathing room to do a really good job on the fix rather than duct taping something crappy.

It doesn’t quibble over VM sizes – because during an unforeseen, unsolved performance emergency, you simply don’t know which exact VM size is the best bang for the buck. Just go with the largest size possible, end of story. If someone wants to quibble over sizes, then they’re welcome to do the research to figure out what the right size is. You are not the person to do that research – you need to keep your head down solving the problem. If you don’t even know the root cause and the software fix yet, then you certainly can’t make an exact determination on which VM size will give you the right fix.

It translates technical debt into real dollars – because sometimes the business makes the conscious decision to ship less-than-optimal code in order to bring in revenue. Sometimes, they also want the ability to continue shipping that less-than-optimal code, and they’re willing to spend money to keep bringing in that revenue. That might frustrate the anal-retentive perfectionist in me that wants every line of code to be flawless, but the reality is that sometimes we just need to keep shipping and taking customer money.

It gives the business a choice – and choices help take heat off you. When a manager is screaming at me that we have to fix things faster, I love being able to say, “It sounds like this is really important to you, and you want to move quickly. Here’s a way you can make that happen right away. Oh, what’s that you say? You’re not really that interested? I see. Well, in that case, I’m not going to work 9AM-9PM 6 days per week to fix this, because I have a family at home.”

When your company chose to migrate to the cloud, I bet one of the reasons was to gain flexibility. Let that be your friend.

Free SQL Server Load Testing Tools

So you wanna run a load test against SQL Server.

One thing I need you to understand first: you have to provide the database and the queries. Almost all of the tools in this post, except the last one, are designed to help you run queries, but they don’t include the queries. The whole idea with load testing is that you’re trying to mimic your own workloads. If you’re just trying to test a server with generic workloads, start with my post, “How to Check Performance on a New SQL Server.”

Single-Query Load Tests

These tools let you write a query, and then run that query lots of times to stress load. Of course, this doesn’t really mimic production workloads, which will run lots of different queries concurrently. You can kinda mimic that with my technique to call random queries from a stored procedure, but even with that, we’re talking about a pretty low-tech technique here.

Microsoft RML Utilities and OStress (downloadinstructions) – very barebones command line tool that lets you run a query across a number of sessions, for a number of threads. No graphical user interface here – it’s very stripped-down and non-intuitive, but that’s also kind of the draw: it’s fast and simple.

(Open source) SQLQueryStress (downloadinstructions) – graphical utility originally written by Adam Machanic, and now maintained by Erik Ejlskov Jensen. I find this way easier to use, but it’s also slower than OStress. When you have a really well-tuned workload that returns hundreds or thousands of rows per query, it’s not unusual to hit ASYNC_NETWORK_IO waits with SQLQueryStress when it’s unable to keep up with digesting the results, something I don’t see as often with OStress.

Commence the load test

Load Replay Tests

These tools capture a workload (either from production, or from a group of users clicking around in dev/QA), and then help you replay that same workload again and again, either on the same SQL Server or different servers.

(Open source) WorkloadTools (download and instructions) – Gianluca Sartori’s command line tools to collect a workload (either with tracing or XE), analyze the data, and replay the workload in real time. If I was a full time DBA in charge of SQL Server load testing at a company, this would be the one I’d start with.

Microsoft Profiler – (built in) – in theory, you can use Profiler to capture a trace of queries in production, and then replay that trace against another server. In practice, it doesn’t work well because it doesn’t accurately mimic the concurrency issues you hit in production. I don’t recommend this as a load testing solution.

Microsoft Distributed Replay (instructions) – this tool aimed to solve Profiler’s weaknesses by capturing a Profiler-style trace, but then replaying that trace in a coordinated fashion across multiple load test servers, all aimed at the same SQL Server. Before you go too far down this road, watch Gail Shaw’s Pluralsight course on it, and read the tales of woe in the Microsoft forums. The learning curve on this is pretty steep – think days and weeks of setup & experimentation time to get this working really well, not hours.

Microsoft Database Experimentation Assistant (downloadinstructions) – this builds atop Distributed Replay to let you run A/B tests with different server settings, indexes, even SQL Server versions, and then tells you which queries got better (and which ones got worse.) The drawback here is that you gotta get Distributed Replay working before you can progress to this point.

(Open source) HammerDB (downloadinstructions) – can be used either to run a standard synthetic workload (like a TPC benchmark), or your own custom queries. This one can test not only SQL Server, but also MySQL, Oracle, PostgreSQL, and more. I would only go down this road if I was a performance tuning specialist in a large company with lots of databases – this thing is really specialized.

Regardless of which tool you choose, remember to use the tips from my post, “How to Load Test a Database Application.”

15 Reasons Your Query Was Fast Yesterday, But Slow Today

SQLSaturday Israel 2019 speaker shirt

In rough order of how frequently I see ’em:

  1. There are different workloads running on the server (like a backup is running right now.)
  2. You have a different query plan due to parameter sniffing.
  3. The query changed, like someone did a deployment or added a field to the select list.
  4. You have a different query plan due to a statistics change.
  5. You’re getting a different share of hardware (perhaps someone changed a hardware setting, introduced a noisy neighbor VM, or you have hardware competition in the cloud.)
  6. Someone added or removed an index.
  7. Someone added or removed a trace flag or other server-level config setting.
  8. Your application itself changed, and it’s not able to digest the results as quickly.
  9. Someone patched, and it had an unexpected side effect.
  10. You have the same plan, but different memory grants.
  11. Someone’s modifying more rows at a time, so you’re hitting lock escalation.
  12. Your query is being throttled by Resource Governor.
  13. It never really was that fast to begin with.
    And starting with SQL Server 2017 (if you have columnstore indexes) and SQL Server 2019 (for everyone):
  14. You’re getting a different memory grant due to adaptive memory grants.
  15. You’re getting a different join type due to the adaptive join threshold changing.