Date Tables are Great for Users, but Not So Great for Performance

Date tables help users filter data based on day of week, holiday, this year vs last year, and other kinds of filters that business users usually wanna add to their reports. If you haven’t used a date table before, here are a couple of primers on how to create one and use it:

And for the purposes of this post, I’m going to use the date_calendar table created in the second primer above. Here’s what it looks like:

Typically when people use date tables, they join from their real tables over to the date table, and they do their date calculations & filtering by just specifying columns in the date table.

For example, if you wanted to find out what day of the week people usually sign up for Stack Overflow accounts (database), you might write a query like this:
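The exact query depends on how your calendar table is laid out. Here’s a sketch, assuming date_calendar has calendar_date and day_name columns (adjust to whatever yours uses):

    SELECT dc.day_name, COUNT(*) AS UsersCreated
    FROM dbo.Users u
    INNER JOIN dbo.date_calendar dc
        ON CAST(u.CreationDate AS DATE) = dc.calendar_date
    GROUP BY dc.day_name
    ORDER BY COUNT(*) DESC;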

The results look like this:

Pretty cool! But…does it perform well? To find out, let’s write an old-school version of the query that doesn’t use a date table, and we’ll compare the two using SET STATISTICS IO, TIME ON to measure CPU, duration, and logical reads:
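Here’s a sketch of the old-school version, which just computes the weekday straight off the Users table, plus the measurement setup:

    SET STATISTICS IO, TIME ON;
    GO
    /* Old school: no calendar table, just DATENAME against the base column */
    SELECT DATENAME(WEEKDAY, u.CreationDate) AS day_name, COUNT(*) AS UsersCreated
    FROM dbo.Users u
    GROUP BY DATENAME(WEEKDAY, u.CreationDate)
    ORDER BY COUNT(*) DESC;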

The metrics tell an interesting story:

  • Date table method: 100,074 logical reads, CPU time 3,422 ms, elapsed time 3,425 ms
  • Old school method: 20,095 logical reads, CPU time 3,671 ms, elapsed time 1,467 ms

The actual execution plans are obviously different:

So why did the date table not perform as well as the old-school way?

SQL Server doesn’t understand the relationship between these two tables. It simply doesn’t know that all of the rows in the Users table will match up with rows in the calendar table. It assumes that the calendar table is doing some kind of filtering, especially given our CAST on the date. It doesn’t expect all of the rows to match.

SQL Server decides it’s going to scan the date_calendar table first, and then for each date that it finds, it does an index seek into the Users index on CreationDate:

If you hover your mouse over that index seek on CreationDate, the number of executions is worth a laugh:

This index seek was done over 18,000 times – once for every day in our calendar table.

Because SQL Server thought the date table would do some filtering, it also underestimated the number of rows it’d find in Users, so it also underestimated the CPU work required, so the date table approach only went single-threaded. The old school way understood that lots of rows would be involved, so the work was parallelized across multiple threads.

That’s a simple query where I’m not even doing filtering – but query plans get even less predictable as you build atop this, adding filters on the date table looking for weekdays, specific days of the month, or ranges. SQL Server just has a hard time guessing how many rows are going to come out of the related tables. The more complex your filtering becomes, the crazier the estimates become.

So if I have date tables, what should I do next?

This quick experiment doesn’t mean that you should abandon your date tables and force everyone to do calculations against the base tables. Date tables are WONDERFUL, and I recommend them highly.

However, you just need to be aware that:

  • If you filter on the date table rather than the base tables, you’ll likely get inaccurate estimates
  • The more you rely on functions or ranges of data in the date tables, the more likely you are to get increasingly bad estimates
  • To fix these estimates, you may need to add relational columns on your base tables that define exactly which rows in the calendar table match which rows in your base table (see the sketch after this list)
  • When doing performance tuning on a query that uses a date table, consider briefly tweaking the query to not use the date table, and see how much better the execution plan gets. If it’s a night and day difference, you may need to work on the date table’s relationships or switch this one particular query to filter directly on the base tables rather than the date tables.
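For example, here’s a sketch of that relational-column idea from the third bullet above – the column and index names are made up:

    /* Persist the date portion so the join to the calendar table is a plain equality */
    ALTER TABLE dbo.Users
        ADD CreationDateKey AS CAST(CreationDate AS DATE) PERSISTED;

    CREATE INDEX IX_Users_CreationDateKey ON dbo.Users (CreationDateKey);

    /* Then join on the persisted column rather than wrapping CreationDate in a CAST */
    SELECT dc.day_name, COUNT(*) AS UsersCreated
    FROM dbo.Users u
    INNER JOIN dbo.date_calendar dc
        ON u.CreationDateKey = dc.calendar_date
    GROUP BY dc.day_name
    ORDER BY COUNT(*) DESC;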

Got questions? Want to try different approaches to my queries above? Grab the Stack Overflow database and try running your own experiments. That’s why I love using the Stack Overflow database as a sample: it contains real-world distribution on stuff like dates so you can prove your own ideas or disprove mine.

Watch me write this post

I streamed this post live on Sunday, July 19:


[Video] The Top 10 Developer Mistakes That Won’t Scale on Microsoft SQL Server

You’re a C# or Java developer building an application that stores data in Microsoft SQL Server, and you want to avoid common pitfalls that will cause you scalability problems down the road.

In this session from my live stream, you’ll learn why you probably want to avoid triggers, SELECT *, user-defined functions, dynamic SQL, big datatypes, heaps, storing binaries in the database, old indexes, TempDB, and SQL Server’s v1 features:

This slide deck is a decade old: I wrote it back in 2010, but I’m revisiting it now to see what’s changed. I’m kinda surprised at how much of it I still agree with a decade later! My feelings have changed about a lot of SQL Server tactics, but 9 out of 10 of these are still a problem.

Wanna see more of this? Every Saturday & Sunday morning from 9AM-11AM Eastern (iCal), you can catch me on my Twitch, YouTube, or Facebook channels.


A One-Slide Summary of the Differences Between TDE and Always Encrypted

The folks on Twitter liked this, so sharing it here:

It’s a one-slide summary from a SQL Critical Care client’s deck, so obviously it’s abridged, but I think it does a pretty good job of summing things up.

Some highlights of the conversation:

That’s it. That’s the post. If you want more words and pictures, follow @BrentO on Twitter.


Remember “Nothing Stops a Hekaton Transaction?” Yeah, About That.

When In-Memory OLTP first came out, Microsoft liked to go around bragging at conferences that “Nothing stops a Hekaton transaction.” I remember hearing that and kinda chuckling at the time, and I was reminded of it by a recent client who was running into blocking issues with…yep, Hekaton.

To reproduce it (there’s a T-SQL sketch of these steps after the list):

  1. Do a large insert into an in-memory OLTP table
  2. In another window, try to truncate that table (which gets blocked by query #1)
  3. In another window, try to do another insert into that table (which gets blocked by query #2)
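Here’s a rough T-SQL sketch of those steps – the table and column names are made up, and each block goes in its own session:

    /* Session 1: a large insert that runs for a while */
    INSERT dbo.HekatonDemo (Id, Filler)
    SELECT Id, REPLICATE('x', 100)
    FROM dbo.SomeBigTable;          /* made-up source table; anything big will do */

    /* Session 2: needs a schema modification lock, so it waits on session 1 --
       and it would fail anyway, since TRUNCATE TABLE isn't supported on In-Memory OLTP tables */
    TRUNCATE TABLE dbo.HekatonDemo;

    /* Session 3: blocked behind session 2's pending schema modification lock */
    INSERT dbo.HekatonDemo (Id, Filler) VALUES (1, 'y');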

The results of sp_WhoIsActive show blocking:

The reason is rather amusing: In-Memory OLTP tables don’t actually support TRUNCATE TABLE. If you try it, you hit an error:

But SQL Server doesn’t check for that until after the TRUNCATE TABLE command gets a schema modification lock. In order to get that, the first query has to finish – and until he does, query #3 can’t run. Presto, good ol’ blocking. (I can’t call that a SQL Server bug with a straight face, either.)

The solutions:

  • When your code throws errors, don’t swallow ’em. Log them somewhere, and actively track them down. In this case, the TRUNCATE TABLE was a leftover from before the table was migrated to In-Memory OLTP.
  • Before you migrate, read the T-SQL limitations of In-Memory OLTP and unsupported features.
  • If you need to quickly delete the contents of an entire In-Memory OLTP table, don’t bother trying to swap back and forth between different In-Memory OLTP tables for staging and production, either. sp_rename is supported as of SQL Server 2017, but it faces the same schema mod lock problem as TRUNCATE TABLE.

Shout out to the client manager who said, “You have to blog about that to show that we were ambitious but rubbish.” Cheers, Tiny Irish: y’all were smart enough to hire me, so there’s always that.


What SQL ConstantCare® Has Been Finding In Your Servers Lately

SQL ConstantCare® analyzed over 3,000 of your SQL Servers this month. Here are the things we check for, and the number of servers where we found each problem:

Backup

Corruption

Patching

Performance


“I’ve seen things you wouldn’t believe.”

Planning

Query Tuning

Reliability

And The Top 10 Overall

  1. Update Available – 1,676 servers affected. I get it, patching is hard, and nobody’s doing it.
  2. Queries may have non-SARGable predicates – 1,076 servers affected.
  3. Consider Enabling Query Store – 1,025 servers affected. Keep in mind that this is 1,025 servers that are patched to the level where I’d be comfortable telling them to turn on Query Store, and…they’re still not doing it. I haven’t gotten to the point where I’m going to stop recommending it, but I’m getting pretty close.
  4. Check for Corruption ASAP – 824 servers affected. This one just blows me away: even when people know they need to do it, they won’t. Sure, sometimes they offload it, but there are even tiny sub-100GB servers in here that aren’t getting checked. That’s why I’ve asked Microsoft to run DBCC CHECKDB in the background.
  5. Query Forced to Run Single-Threaded – 819 servers affected. People love them some scalar functions, and we’re only warning you about queries in your top ~50 resource consumers.
  6. Transaction Log Larger than Data File – 736 servers affected. This alert filters for log files over 5GB, too, so it’s not even like these are small databases. Folks just don’t know how to get their log files under control for some apps, and they’ve given up trying. (I need to write a blog post about the rules that are getting ignored the longest.) Why would I be concerned? Because it affects restore times: the log file doesn’t get Instant File Initialization.
  7. Query with Missing Index Request – 733 servers affected. This isn’t just about Clippy’s bizarre suggestions – it’s also about third party vendor apps where folks can’t touch the indexes.
  8. Compilation Timeout – 724 servers affected. We’re specifically warning here about queries that are in your top ~50 resource consumers.
  9. Joining to Table Valued Functions – 599 servers affected. SQL Server 2019’s scalar function inlining doesn’t help this, either.
  10. Heap Needs to Be Rebuilt – 598 servers affected. A lot of folks are using Ola Hallengren’s maintenance solution, and they’re surprised to hear that Ola purposely ignores heaps.

What I Think About All This

“There’s so much corruption in here, there isn’t even any real data left.”

Now, I know you, dear reader. You have great confidence in your ability to manage your environment. You believe you’ve got everything under control, and you couldn’t possibly have the above problems on your servers, right? Right?

Except you do.

When companies hire me for consulting, they get SQL ConstantCare® for free, and the first email is usually pretty shocking. Just last week, a customer found out that SQL Server was dutifully reporting database corruption, but they didn’t know because they didn’t have corruption alerts configured correctly and no one was monitoring the suspect_pages table.

I get it: monitoring is expensive. Except it isn’t: SQL ConstantCare® is $495 a year, and that isn’t per-server: that’s one price to keep an eye on all of your servers. You’re busy. Offload some work to me.


[Video] How to Think Like Clippy (Subtitle: Watch Brent Wear Another Costume)

Remember Clippy, the Microsoft Office assistant from the late 1990s? He would pop up at the slightest provocation and offer to help you do something – usually completely unrelated to the task you were trying to accomplish.

Sadly, the Office team told Clippy that he didn’t make the stack rankings cut, so he relocated over to the SQL Server team. He knows better than to share his face in SSMS, but he toils away tirelessly suggesting indexes you should add. “Hey buddy! It looks like you’re querying the Users table by Reputation. Would you like help?”

In my Fundamentals of Index Tuning class, I explain how humans think about designing nonclustered indexes. After a few hours of the fundamentals, we switch gears and show how Clippy comes up with his column orders. Students are pretty disappointed at Clippy’s simplicity, but on the flip side, he gets these recommendations done in a millisecond – something you and I would be hard-pressed to pull off.

I decided Clippy deserved his own webcast, so this weekend, I let him take over my stream and defend himself. Enjoy!

Wanna see more of this? Every Saturday & Sunday morning from 9AM-11AM Eastern (iCal), you can catch me on my Twitch, YouTube, or Facebook channels.

Here’s the demo script Clippy used in the session:


How to Install Azure Data Studio to Work with SQL Server

Download Azure Data Studio. Do this on your desktop, not on the SQL Server itself: you always want to run management tools from your own machine to reduce the overhead on the server. After it downloads, install it – just accept the defaults. During the installation process, it may ask:

You don’t need to register Azure Data Studio. You can leave the checkbox enabled for Add to PATH (and no, I have no idea why PATH is CAPITALIZED), but you don’t need to restart immediately for the purposes of this blog post, nor do you need to do it quickly.

Launch Azure Data Studio. After the launch, it may ask if you want to enable preview features; if it does, click Yes.

Open the Extensions pane. Click View, Extensions. This is where you can enable all kinds of cool add-ons that give Azure Data Studio more power. Today, we’re looking for a power-up that will give us the ability to manage SQL Server Agent jobs, so type in SQL Server Agent in the search box:

And click the Install button. Which button? It doesn’t matter. Microsoft just likes to make it easier for you by spraying identically named buttons all over the screen. Relax. It’s not like these extra buttons cost you extra. This is all free to you. That’s the beauty of open source.

Now, you’re ready to connect to a SQL Server.

Click the Connections button up at the top left. It looks like a game of Qix. I would say that it looks like a physical server, but it’s been so long since you’ve seen a physical server that the comparison is about as relevant as a Nintendo game.

You can also access the Connections window by pressing Ctrl+Shift+D.

Click on the New Connection button to add a connection to your SQL Server, and then fill in the details. For example, here’s my server’s connection details:

Server – the DNS name or IP address where your SQL Server answers connection requests. This is usually the same as the server name itself, but if you have fancier setups like named instances or non-default port numbers, you’ll need to specify those here.

Authentication type – probably Windows, try that first. If that fails, you either don’t have access to the server, or it’s configured with SQL authentication.

Database – leave at Default. If you pick a specific database, then you won’t be able to see the list of databases after you connect.

Server group – leave at default.

Name – this is a user-friendly name just for your own purposes. For me, I’m calling mine “Production” because sometimes my servers have long & complicated names, and it’s easier to tell at a glance which one I’m connected to. If you have lots of servers, you wouldn’t want to use “Production” – you might want to use, say, “Production – Accounting” to clarify which server you’re on.

If everything went well, you’ll see a list of databases like this:

If things did NOT go well, here are a few places to get help:

Just keep in mind that the folks on the other side of the screen are volunteers, so say please and thank you a lot when they help you out with your challenges.


How to Get Started with SQL Server Management Studio

In the big picture, you need to:

  1. Download the free Microsoft SQL Server Management Studio and install it, accepting the defaults
  2. Connect to your SQL Server
  3. Create a low-privileged login so that you don’t accidentally drop objects
  4. Learn to switch back & forth between the low-privileged account and your regular one

Let’s get started!

1. Download and install SSMS (but not on the server itself)

The first step is really easy: go here to get the latest version of SSMS and install it. If you already have a version of SSMS, the installer will automatically update it to the latest version. It doesn’t matter what version of SQL Server you’re running in production: as long as you’re running a currently supported version of SQL Server (2012 & newer as of this writing), you always wanna run the latest version of SSMS, which will include a ton of bug fixes.

Do this installation on your desktop or laptop, not on the SQL Server itself. Over time, you’ll learn that running SSMS (as with any other app) on the SQL Server itself will slow it down. You wouldn’t remote desktop into the SQL Server and start playing Fortnite, now, would you? Don’t answer that. You probably would. You’re the kind of person who reads this blog, after all, and I … let’s just stop there. Install SSMS on your desktop.

2. Connect to your SQL Server

After launching SSMS, you get a connection dialog:

Server name – the DNS name or IP address where your SQL Server answers connection requests. This is usually the same as the server name itself, but if you have fancier setups like named instances or non-default port numbers, you’ll need to specify those here.

Authentication – probably Windows, try that first. If that fails, you either don’t have access to the server, or it’s configured with SQL authentication. You might have a username & password on a post-it note somewhere, like from the person who installed it, and it might have a username of “sa”. In that case, go ahead and use that for now. In the next step, we’ll set you up a low-privilege account.

Click Connect, and you’ll be handsomely rewarded with a window that looks like this:

At this point, you’re able to do … all kinds of dangerous things, actually. We need to fix that so you don’t do something stupid.

3. Create yourself a low-privileged login.

Start a new query by clicking File, New, Database Engine Query, or right-click on the SQL Server name and click New Query:

You’ll get an empty new-query window. Copy/paste the below into your new query window so we can find out if you have Windows-only authentication turned on:
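Something like this does the trick:

    SELECT SERVERPROPERTY('IsIntegratedSecurityOnly') AS WindowsAuthOnly;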

If the result is 1, that means your SQL Server only allows Windows logins. In that case, I deeply apologize, but I’m not covering that yet in the scope of this blog post.

If the result is 0, good news! You can create SQL logins. To do that, copy/paste the below into your new-query window, and note that you have some changes to make:
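A minimal sketch of the idea looks like this – the VIEW SERVER STATE grant is my own addition, handy for a read-only troubleshooting login:

    USE master;
    GO
    CREATE LOGIN [brent_readonly] WITH PASSWORD = N'changeme', CHECK_POLICY = ON;
    GO
    GRANT VIEW SERVER STATE TO [brent_readonly];
    GO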

Change brent_readonly to be whatever username you want: typically your regular username, but append _readonly to it so you know that it’s your less-privileged account. Note that you have to change “brent_readonly” in two places.

Change the password from “changeme” to whatever you want.

If you get an error that says:

Then you need to use a more complex password.

Finally, let’s make sure the new login can’t write anything in your existing databases. Copy/paste this in, change brent_readonly to your login name (note that it’s in 3 places), and run it:
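Here’s a rough sketch of the idea – sp_MSforeachdb is undocumented and can occasionally skip databases, so treat it as a starting point rather than gospel:

    /* Create a user for the login in each user database and deny writes */
    EXEC sys.sp_MSforeachdb N'USE [?];
    IF DB_ID() > 4  /* skip the system databases */
    BEGIN
        CREATE USER [brent_readonly] FOR LOGIN [brent_readonly];
        ALTER ROLE db_denydatawriter ADD MEMBER [brent_readonly];
    END';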

That only affects the databases you have in place today. If someone restores a database into this environment, you’ll be able to write to it. If you’ve got some time and you’re willing to roll up your sleeves a little, here’s how to deny writes in all new databases long term.

4. Learn to switch back & forth between accounts.

Now that you have a low-privileged account, you’re going to want to use it by default when you connect to SQL Servers with Management Studio. This helps prevent you from accidentally causing a resume-generating event.

Close SSMS, and reopen it again. This time, at the connection dialog:

Choose SQL Server authentication because we created a new SQL login, and then type in your low-privileged username and password. Click Connect, and you’re now working a little more safely, without the superpowers of your regular domain login.

If you do need to switch over to your regular high-permission account, you could close SSMS entirely and reopen it, but there’s an easier way. To change accounts for the current query window, click the change-connection button at the top right. It looks like a plug/unplug button:

That’ll give you the connect-to-server dialog box. Note that the login change only affects the currently open query window: that’s the safest way to minimize the damage. Then, as soon as you’re done doing high-permissions stuff, close that window, and you’re back to your regular low-permission stuff.

Next steps for learning

Now that you have an account that’s safe to use for learning, here are a few tools to explore next to get to know your SQL Server better.


Announcing Saturday & Sunday Office Hours.

Office Hours Podcast

During the quarantines, I’m doing free training. It looks like we’re going to be stuck inside here for another long while, so for the next couple of months, I’m bringing back Office Hours, but on a different schedule.

Every Saturday & Sunday morning from 6AM-8AM Pacific (iCal), you can catch me on my Twitch, YouTube, or Facebook channels. I’ll be:

I totally understand that weekend mornings aren’t a great fit for everybody’s schedule, but it’s the time that works best for my own schedule (given my existing work). But if you can’t make it live, no worries – all of the streams will be recorded and available on Twitch, YouTube, and Facebook. If I have a particularly good stream, I’ll put it here on the blog as well.

Wanna see what they’re like? Here’s this past Saturday’s stream when I worked on pull requests in the First Responder Kit:

And here’s Sunday’s stream when I wrote a blog post about date table query plans and told the story of when Microsoft offered me a job:

 

If you follow me on my Twitch, YouTube, or Facebook channels, I also do other surprise broadcasts throughout the day. As I work, I just check to see if it’s something I can share with y’all publicly, and if so, I start a stream. I won’t usually announce topics ahead of time, but I will say that this coming Saturday’s stream will involve Clippy, a green spandex suit, and a pair of giant white hands. See you there!


Get Alerted When Your SQL Server Restarts with sp_SendStartupEmail

When your SQL Server restarts, how do you find out?

I’ve seen a few senior DBAs do a really smart trick: they write a stored procedure to send them an email, and then they mark that stored procedure as a startup stored procedure so that it runs automatically whenever SQL Server starts up. (You could also try this with an Agent job scheduled to run at Agent startup, but … Agent may not actually start.)
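If you want to try the trick yourself, the startup flag gets set with sp_procoption, and the procedure has to live in the master database:

    USE master;
    GO
    EXEC sp_procoption @ProcName = N'dbo.sp_SendStartupEmail',
                       @OptionName = 'startup',
                       @OptionValue = 'on';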

So I wrote sp_SendStartupEmail to do just that, and the body of the email lists any databases with unusual states. It looks like this:

You can get it from my Github repo here, but it won’t be part of the First Responder Kit – I just wanted to put it out in Github so y’all could check in changes if you wanted. The documentation is in that script. (No, it’s not self-documenting code, hahaha – the documentation is in the comments.)

You can watch me write it during one of my recent Twitch / YouTube / Facebook streams:


What If You Really DO Need to Shrink a Database?

You’ve heard that shrinking a database is bad because it introduces both external and internal fragmentation, it causes blocking, it causes transaction log growth while it runs, and it’s slow and single-threaded. You understand that if it’s just a matter of 10-20-30% of a database, and the database is only 100-200GB, you might as well just leave the space there, because you’re gonna end up using it anyway.


No, you don’t need to shrink your 50GB database

But your situation is different:

  • Your database is 1TB or larger
  • You’ve deleted 50% of the data
  • You have 500GB+ empty space
  • You’re never going to need that space because you’re now doing regular deletions and archiving

You’ve been tasked with reducing the database size to make restores to other environments easier, like when you need to restore to a QA or development environment. You don’t want those servers to need so much useless space.

Don’t shrink. Do this instead (there’s a T-SQL sketch after the list):

  1. Add a new filegroup, add new empty files in it, and set it to be the new default filegroup
  2. Move objects to the new filegroup using ALTER INDEX…REBUILD commands
  3. When you’ve moved all the objects over, shrink the old filegroup, and it’ll shrink super-quick
  4. Just leave it there – this is your new database layout
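Here’s a rough T-SQL sketch of those four steps. Every database, file, and index name here is made up, and note that moving an index to a different filegroup actually takes CREATE INDEX…WITH (DROP_EXISTING = ON) rather than a plain ALTER INDEX…REBUILD, which can’t change filegroups:

    /* 1. New filegroup, new file(s), make it the default */
    ALTER DATABASE MyBigDB ADD FILEGROUP NewFG;
    ALTER DATABASE MyBigDB ADD FILE
        (NAME = N'MyBigDB_NewFG_1', FILENAME = N'D:\MSSQL\DATA\MyBigDB_NewFG_1.ndf', SIZE = 100GB)
        TO FILEGROUP NewFG;
    ALTER DATABASE MyBigDB MODIFY FILEGROUP NewFG DEFAULT;

    /* 2. Rebuild each index onto the new filegroup, on your own schedule */
    CREATE CLUSTERED INDEX CX_BigTable
        ON dbo.BigTable (Id)
        WITH (DROP_EXISTING = ON, ONLINE = ON, MAXDOP = 4)
        ON NewFG;

    /* 3. Once everything has moved, shrink the old, now mostly empty data file */
    DBCC SHRINKFILE (N'MyBigDB_OldFile', 1024);   /* logical name of the old file, target size in MB */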

The ALTER INDEX…REBUILD approach has several key advantages over shrinking:

  • It can use parallelism: you can specify how many cores to use with MAXDOP hints. This is really helpful on servers with MAXDOP = 1 at the server level because your query-level hint will override it, even overriding it upwards.
  • It can move a lot more than one 8KB page at a time. Heck, you can even do several index rebuilds simultaneously on different threads if you want to, really churning through the work quickly.
  • You pick the days/times for each object: maybe some of your objects have very heavy concurrency during some time windows, and you want to avoid touching those until a maintenance window.
  • You pick the settings for each object: rebuilding an index with ONLINE = ON is slower. Some of your objects might be archive tables or unused during certain days/times, so you can use ONLINE = OFF on those to get faster performance.

But it does have a few drawbacks, too:

  • It’s faster, but it’s also higher load: shrinking a database is a low overhead process: it’s hard for anybody to notice you moving just one 8KB page at a time, with just one CPU core. Index rebuilds, buckle up: people are gonna notice when you throw a ton of CPU cores at the problem and really start hammering your storage. This is just the flip side of the coin to finishing faster: if you wanna finish faster, you’re gonna do more work in less time.
  • This also means it generates more logs, faster: because we’re moving so much data and it’s a fully logged process, this can present problems for transaction log backups, log file sizes, database mirroring, Always On Availability Groups, and storage/VM replication. Ease into this gradually, starting with your smaller tables first, so you can see the impact it’s having on your transaction log sizes.
  • ONLINE = ON isn’t fully online: even online index rebuilds need a momentary schema mod lock in order to finish their work. Thankfully, since SQL Server 2014, we’ve had the WAIT_AT_LOW_PRIORITY option to help mitigate that problem.
  • You have to do some prep work: it’s easy to run DBCC SHRINKDATABASE, but ALTER INDEX…REBUILD will take much more work if you want to leverage all the cool advantages I discussed above.

Thankfully, there’s help on that last one: Bob Pusateri has an in-depth writeup and a script to help.


Where is the SQL Server Community Networking Online?

You’ve stumbled upon this blog, and you’re having a pretty good time. Now you’re wondering, “Where else can I network with database folks, get to know them, and talk shop?” You’re not getting out due to the quarantines, so your local user group isn’t fulfilling your networking cravings. You wanna know what else is going on. I’ve got a ton of good news.

We’re freakin’ everywhere, and it’s all free to join. Here are some of my favorite community gathering spots – but I don’t want you to think I use all of ’em all the time, by any means. Dabble around a little, then pick the ones that work the best for your own style.

SQLCommunity.Slack.com – get your instant free invite, then join channels like #SQLhelp for Q&A, #welcome for general chat. To see the list of popular channels, click the + sign at the top right of the channel list.

Then in the Sort method at the top right, choose most members to least, and that’ll find the lively rooms. I would disable notifications in your Slack client though – people tend to talk a LOT in here.

#SQLhelp, #SQLfamily, and #SQLserver on Twitter – the more you use Twitter, the more you start to follow specific people, have discussions, and build up your own personal network. This is probably my favorite out of the options in this post. In the beginning, before you’ve picked the folks you want to follow, just read posts with the #SQLfamily tag and find folks to follow.

DBA.StackExchange.com – like StackOverflow.com, but for database questions. I’m now up to 3 different ways to ask questions – Slack, Twitter, and DBA.se – but here’s how I think of ’em: Slack & Twitter are best for short, clear questions that are going to require a lot of back-and-forth discussion to get an answer, but you’re pretty sure you’ll get an answer quickly. DBA.se is for questions that will require more detailed documentation, and may also need to stick around on the web for a while in order to get a really good answer. (Questions disappear pretty quickly from Slack and Twitter.) Before you post a DBA.se question, check out Writing the Perfect Question by Jon Skeet. It’s a lot of work, but when you have a complex question, the free consulting help from the community pays off.

Reddit – /r/sqlserver, /r/sql, and /r/database –  because these are general subreddits, the discussion isn’t very targeted: you get a wide mix of technical questions, career advice, and just interesting links. If you had to only pick one place to network, this probably wouldn’t be a bad choice because it gets you a wide variety. There are a lot of homework questions posted here, though.

SQLServerCentral.com forums – forums are so 2000, I know, but hear me out: a lot of us in the database industry aren’t exactly spring chickens. If you’re the kind of person who enjoys a good old-school forum, there are hundreds of other folks actively chatting over at SSC. Check the Active Threads and Latest Topics.

Blog comments: I know it’s gonna sound strange, but the best part of some web sites are actually the comments. For example, I read Hacker News and BringATrailer auctions mostly for the insightful (but also crazy) comments. If you read blog posts for the comments, it’s best to read the posts (and the comments) about a week after they originally go live. That way you’re seeing the regular readers’ comments, and the author’s responses, but you aren’t subjected to the crazy Googlers who stumbled in six months later and want to ask totally irrelevant stuff. Here’s how I do it: I share my favorites, all my subscriptions, or OPML file – there are a lot of people publishing a lot of good free content these days. I read hundreds of posts per week to keep up, so if you haven’t got time for the pain, just hit my live favorites or my live bookmarks. But just check into those once per week, and read the comments then.

In summary, the SQL Server community is one of the most active, open, and friendly free communities I’ve ever seen. Folks love sharing their knowledge and meeting new peers. Find the platform that works the best for you, and you can enjoy free networking with your peers all year long.

Just be aware that as you start to do networking, especially in the year 2020, you’re going to encounter a lot of folks who use networking spaces to talk about other stuff, not just SQL Server. People have all kinds of opinions about all kinds of subjects, and thanks to the online disinhibition effect, they sure like to share their opinions publicly. It’s up to you to find the right medium that works best for you. Don’t fall into doomscrolling, spending all your time reading everyone’s every opinion about every news story.

Curate what you see on a daily basis. If someone isn’t bringing you joy, unfollow them. If a social network isn’t making you happy, leave it. Use free networking to help you survive and thrive during these tough times: it can be a force for good and happiness.


Updated First Responder Kit for July 12, 2020: Fixing sp_BlitzFirst @ExpertMode

I don’t normally do multiple releases per month, but sp_BlitzFirst had a pretty big bug in the last release for those of us who do live performance tuning. In the July release, a well-meaning contributor tried to make sp_BlitzFirst faster when @OutputType = ‘NONE’, but inadvertently also made it faster when @ExpertMode = 1 – by skipping the output. Oopsie.

If you’re working directly with the releases on Github, note that the “master” branch has been renamed to “main” instead.

To get the new version:

sp_BlitzFirst Changes

  • Fix: wait stats weren’t being returned when @ExpertMode = 1. (#2463, thanks Rich Benner and Joseph LaBonde.)

sp_BlitzIndex Changes

  • Fix: SQL Server has a bug that ignores isolation level hints on sys.identity_columns (note: no response from Microsoft on that one despite being open for several months, and hello, blue badges, that is why people get so pissed off about feedback.azure.com being like yelling into the void. This is a bug in your product, and you’re not even taking the time to read the bugs that get reported. HELLOOOO) Anyhoo, we get blocked by offline index rebuilds. To fix it, we now have a 1-second lock timeout, and sp_BlitzIndex will simply bail out if there is blocking. (#2176)

For Support

When you have questions about how the tools work, talk with the community in the #FirstResponderKit Slack channel. If you need a free invite, hit SQLslack.com. Be patient – it’s staffed with volunteers who have day jobs.

When you find a bug or want something changed, read the contributing.md file.

When you have a question about what the scripts found, first make sure you read the “More Details” URL for any warning you find. We put a lot of work into documentation, and we wouldn’t want someone to yell at you to go read the fine manual. After that, when you’ve still got questions about how something works in SQL Server, post a question at DBA.StackExchange.com and the community (that includes me!) will help. Include exact errors and any applicable screenshots, your SQL Server version number (including the build #), and the version of the tool you’re working with.


Want Your Own Private Conference?

Your team has been reading my blog, watching my free public streams, and using the First Responder Kit scripts for years. They’ve gone as far as they can go on their own, learning from Google, and now they’re ready for the next level.

I can bring the conference to you – except way better than a conference, because all of the sessions are specifically relevant to you, and you have plenty of private Q&A time with me to discuss SQL Server issues your company is facing – things you can’t ask in a busy public conference.

Here’s how it works:

Step 1: choose topics from my training catalog so you can figure out how many days of training you want.

Step 2: Check my availability, add the dates you want, then check out here. My available dates are at the bottom of that page. Payment is required to hold your dates. If you check out with the Check/PO option, I’ll contact you when I’ve received your check payment (address is in the invoice), and we can coordinate dates at that time.

Step 3: we get together on your dates and have fun learning! Let’s do this.


The 201 Buckets Problem, Part 2: How Bad Estimates Backfire As Your Data Grows

In the last post, I talked about how we don’t get accurate estimates because SQL Server’s statistics only have up to 201 buckets in the histogram. It didn’t matter much in that post, though, because we were using the small StackOverflow2010 database.

But what happens as our data grows? Let’s move to a newer Stack Overflow database, the 2018-06 one that I use for my Mastering training classes. We’ll create the same index, giving us the same statistics, and then look at the histogram to see how Miami grew over time:
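Here’s a sketch – I’m assuming the index from the last post was named IX_Location:

    CREATE INDEX IX_Location ON dbo.Users (Location);
    GO
    DBCC SHOW_STATISTICS ('dbo.Users', IX_Location);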

Now we’re up to about 9 million users:

And the statistics histogram for Miami’s area:

The data distribution has changed, but Miami still isn’t big enough to have its own bucket. Now:

  • Miami, FL is between México and Minneapolis (note that Michigan no longer has its own bucket)
  • AVG_RANGE_ROWS = 13.3 means that any location in between México and Minneapolis will get an estimate of 13.3 rows

So when we run our Miami query, here’s what it looks like now:

The estimated number of rows for Miami, FL is 13, but in actuality, 625 rows come out of the index seek. The variance is starting to grow now that our data size has grown larger.

So what do we do about Miami’s inaccurate estimates?

Again, in this case…nothing. 625 rows vs 13 rows isn’t really that big of a deal. In my Fundamentals of Query Tuning class, I talk about how you want to read query plans from right to left, looking for places where estimates vs actual are 10x off or more. In this case, we’re 48x off, but…it doesn’t really matter that much. SQL Server still allocated enough memory to do the sort, and it didn’t spill to disk. (Note there’s no yellow bang on the sort operator.)

However, as I noted yesterday, it will become a big deal if we join to other tables. Let’s see what happens if we add a join to Comments, for example, to display the top comments left by people from Miami:
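The exact query doesn’t matter much – here’s a sketch that matches the plan described below:

    SELECT TOP 100 u.DisplayName, c.Score, c.Text
    FROM dbo.Users u
    INNER JOIN dbo.Comments c ON c.UserId = u.Id
    WHERE u.Location = N'Miami, FL'
    ORDER BY c.Score DESC;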

The actual plan:

We read plans from right to left, top to bottom, to understand what SQL Server did in what order. (You can also read from left to right in some cases, and I’m just legally required to say that, because if I don’t, Grant Fritchey and Hugo Kornelis will smother me as I sleep.)

So from right to left:

  1. SQL Server did an index seek on Location = Miami, FL, and expected only 13 rows to come back
  2. In reality, 625 rows came back
  3. So we also had to do 625 key lookups
  4. We had to do 625 index seeks on Comments.UserId because each person has their own set of comments
  5. In total, we found 14,176 comments left by people in Miami, FL, so that poor Sort operator never stood a chance. He has a yellow bang because he ended up spilling to disk.

So, really, now, seriously, what do we do about Miami’s inaccurate estimates?

In this example, the query still runs fairly quickly. But if you run into this kind of problem with a larger query, larger spills to disk, larger estimate variations, and so forth, here are some techniques you can consider for performance tuning:

But the big takeaway for me is that 201 buckets in a histogram just isn’t enough to portray real-world data skew, especially as your data size grows into the millions of rows and beyond.


The 201 Buckets Problem, Part 1: Why You Still Don’t Get Accurate Estimates

I’ll start with the smallest Stack Overflow 2010 database and set up an index on Location:
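Something like this – the index name is my own:

    CREATE INDEX IX_Location ON dbo.Users (Location);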

There are about 300,000 Users – not a lot, but enough that it will start to give SQL Server some estimation problems:

When you create an index, SQL Server automatically creates a statistic with the same name. A statistic is one 8KB page with metadata about the object’s contents. In this case, it’s a list of up to 201 Location values, and the number of people who live in each of those locations. You can see the statistics with DBCC SHOW_STATISTICS, as we talk about in my free class How to Think Like the SQL Server Engine:
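Assuming the index name above, the command looks like this:

    DBCC SHOW_STATISTICS ('dbo.Users', IX_Location);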

This helps SQL Server estimate how many rows will be returned. If the location you’re looking for happens to be one of the outliers that SQL Server chose as large enough to get its own bucket in the histogram, like Ahmadabad, India above, then SQL Server knows a lot about it:

  • RANGE_ROWS = 54: this means that in the range of Locations between Adelaide, Australia and Ahmadabad, India, there were 54 rows, excluding Adelaide and Ahmadabad.
  • EQ_ROWS = 159: this means that Ahmadabad (the RANGE_HI_KEY) had exactly 159 rows.
  • DISTINCT_RANGE_ROWS = 25: this means that of the 54 rows in the RANGE_ROWS group, there were 25 distinct locations.
  • AVG_RANGE_ROWS = 2.16: this means that if you pass in a location between Adelaide and Ahmadabad, SQL Server will guess that it has 2.16 rows.

For example, if I query for Ahmadabad, India, which happens to be one of the outliers that got its own bucket, your estimates will be pretty doggone accurate, like this:

Because Ahmadabad is an outlier, and it has its own bucket as shown in the screenshot above, SQL Server estimates exactly 159 rows. (See the EQ_ROWS value in the screenshot above.)

This works really well – BUT, only if you’re one of the outliers. What if you’re a big location, but not big enough of an outlier to get your own statistics bucket? Like, say, Miami, FL?
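Here’s a sketch of the query – the ORDER BY and the SELECT * are my own guesses at its exact shape:

    SELECT *
    FROM dbo.Users
    WHERE Location = N'Miami, FL'
    ORDER BY DisplayName
    OPTION (RECOMPILE);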

Here, I’ve even used OPTION (RECOMPILE), and yet I’m getting an estimate of just 10 rows when there are actually 99 Floridians? To find out, scroll down in the DBCC SHOW_STATISTICS output until you’re in the Miami area:

There’s your problem. Miami, FL is in between Mexico and Michigan (strangely appropriate), and so the histogram means:

  • Miami, FL isn’t there
  • Miami, FL is in the bucket between Mexico and Michigan
  • AVG_RANGE_ROWS = 10.36 means that if you’re looking for any location between Mexico and Michigan, SQL Server will estimate that the location has 10.36 people in it – and in this case, that’s not so accurate for Miami.

Miami, FL isn’t big enough to be one of the 201 outliers featured in the statistics buckets, but it’s large enough that it has a relatively unusual number of people (and boy, as Richie can tell you, the people in Miami are unusual.)

Will updating statistics fix this?

To find out, we’ll give SQL Server the best possible chance at tackling the problem:
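That means updating the stats with a full scan of the table – something like:

    UPDATE STATISTICS dbo.Users IX_Location WITH FULLSCAN;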

Which still produces the same stats:

No matter how many times you update stats, there are still only 201 buckets max, and Miami doesn’t have enough data to be one of those outliers. To find similar locations, do a TOP 1000, and the folks in the 200-1000 range are probably going to be your outliers that get bad estimates:
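A query along these lines does it:

    SELECT TOP 1000 Location, COUNT(*) AS PeopleInLocation
    FROM dbo.Users
    GROUP BY Location
    ORDER BY COUNT(*) DESC;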

And sure enough, down past the 200 mark is our Floridian friend:

Note that the top 200 rows by count(*) aren’t necessarily the ones who get their own buckets. Depending on data distribution and who’s next to each other, some of these may be grouped together. Overall, the 200-1000 range are great examples of possible problematic estimates, though.

So what do we do about Miami’s inaccurate estimates?

Well, in this case…nothing. 99 rows vs 10 rows isn’t really that big of a deal, in the grand scheme of things. It will become a big deal if:

  • The data size starts to grow – note that I’m using the StackOverflow2010 database in this post, but stay tuned for tomorrow’s post
  • We join to other tables – in which case the number of rows in each related table will start to cause a problem

Stay tuned for tomorrow’s post in which we’ll start to explore situations where this would be a bigger issue.


Updated First Responder Kit and Consultant Toolkit for July 2020

This month, 15 community folks contributed code – I think that’s an all-time high for a single month’s release! Good work, y’all. Thanks for payin’ it forward. The bad news is that they’re almost all bug fixes, hahaha. I do love that, though – given enough eyeballs, all bugs are shallow.

If you’re working directly with the releases on Github, note that the “master” branch has been renamed to “main” instead.

To get the new version:

Consultant Toolkit Changes

Updated the First Responder Kit with this month’s script updates, plus:

  • Improvement: the config file now lets you specify default values for the --upload and --deepdive command line switches. This way, you can pass even fewer instructions on to your clients – just have ’em run the utility, and you get their diagnostic data in your Amazon S3 bucket within minutes. Read the auto-upload documentation for more details.
  • Improvement: new --applicationintent parameter lets you connect to readable secondaries in an AG. Possible options are ReadWrite or ReadOnly.
  • Fix: no arithmetic overflow even if you have truly ginormous bigint sp_BlitzCache metrics. (#2270.)

sp_Blitz Changes

  • Improvement: the installation scripts create the SqlServerVersions table if it doesn’t exist, and repopulate it. (#2429, thanks Curtis Browne.)
  • Fix: Update descriptions of operating systems and no longer call Windows 2012 & R2 “pretty modern.” (#2418, thanks Randolph West.)
  • Fix: when alerting about tables in the master database, ignore the Dynamics NAV license table. (#2426, thanks Johan Parlevliet.)

sp_BlitzCache Changes

  • Maybe fix: possible arithmetic overflow when checking for many duplicate plans in the cache. Not entirely sure whether this actually fixes it, though, because I couldn’t reproduce it. (#2425, thanks smcnaughton, Randolph West, and Santi Swarup Choudhury.)

sp_BlitzFirst Changes

  • Improvement: now runs faster if @OutputType = ‘NONE’. (#2423, thanks Jefferson Elias.)
  • Improvement: new check for >10% of your memory being used by the USERSTORE_TOKENPERM cache. (#2134)
  • Fix: last month’s new update-stats check would fail if you had a LOT of stats updates at once, like, well, an update-stats job. Doh! Now we just show the first 4,000 characters of updated stats, sorted by rows in the table descending, so you can get a rough idea of when you had big churn. (#2409, thanks CJR aka camaro322hp.)
  • Fix: remove Page Life Expectancy warning. It’s a bad metric to monitor in the year 2020. (#2433)
  • Fix: no longer alert on >10000x cardinality UNDER-estimations because they’re false alarms when the query just started and hasn’t retrieved data yet. (#2438, thanks Nicklas Bjälemark.)
  • Fix: skips last month’s new update-stats check when you have >20 databases. (#2439, thanks Will McCardell.)

sp_BlitzIndex Changes

  • Improvement: In-Memory OLTP indexes now have a call to sp_BlitzInMemoryOLTP in the “More Info” column. (#296)
  • Improvement: in the readme, documented the levels of support for specialized index types (columnstore, graph, spatial, temporal tables.) I’m not really actively adding support for those, but just wanted to make it clear in the documentation that they show up in the results, just with varying levels of details. For example, the sizes of most of ’em aren’t shown.
  • Fix: for disabled indexes, the “Create TSQL” column now shows a create rather than a drop. (#2447, thanks Filip Cornelissen.)

sp_BlitzLock Changes

  • Fix: now works on case sensitive instances. (#2440, thanks pnauta.)

sp_BlitzWho Changes

  • Fix: handles long database & schema names when creating the output tables. (#2421, thanks Jefferson Elias.)
  • Fix: adds compatibility with Azure SQL DB by avoiding querying sysjobs. (#2435, thanks Jacob Golden.)

sp_DatabaseRestore Changes

  • Improvement: file paths now take a comma-delimited list of paths for striped files. (#2180, thanks CubsRep.)
  • Fix: won’t error out if you specify both @StopAt and @OnlyLogsAfter at the same time. (#2348, thanks Greg Dodd.)

sp_ineachdb Changes

For Support

When you have questions about how the tools work, talk with the community in the #FirstResponderKit Slack channel. If you need a free invite, hit SQLslack.com. Be patient – it’s staffed with volunteers who have day jobs.

When you find a bug or want something changed, read the contributing.md file.

When you have a question about what the scripts found, first make sure you read the “More Details” URL for any warning you find. We put a lot of work into documentation, and we wouldn’t want someone to yell at you to go read the fine manual. After that, when you’ve still got questions about how something works in SQL Server, post a question at DBA.StackExchange.com and the community (that includes me!) will help. Include exact errors and any applicable screenshots, your SQL Server version number (including the build #), and the version of the tool you’re working with.


What Do You Do Better Than the Cloud?

It sounds like a trick question, but I’m serious.

If your company’s management is just now starting to consider the cloud in 2020, your reaction shouldn’t be to cast a shadow on cloud vendors. Instead, think of it as writing your own resume: what are the capabilities that you and your team are really proud of?

What are you currently doing better than anybody else?

It’s a serious question with legitimate answers.

For example, at one of my clients with 10-20TB of data per server, the production DBA team works really hard on their restores. They have the whole process automated, and they regularly performance tune it so they can restore to a point-in-time as quickly as possible. In a matter of seconds, any DBA on their team can run a script that will:

  • Put the soon-to-be-nuked database in a read-only state, keeping their application somewhat online in a degraded state
  • Initiate the restore process onto the primary replica using a database with a different name
  • Start communications with the affected teams (“your database restore has started” – “your database restore is now 25% complete” – “your database restore is now 50% complete – current status, seeding the AG secondaries” – and so forth)
  • Rename the soon-to-be-nuked database (so that it’s still online for reads, but with a different name, so application owners can do some triage) and take it out of the Availability Group
  • Rename the newly restored database to take the old one’s place
  • Email the affected teams, stop the timers, and track their success metrics for their quarterly RPO/RTO service level agreement review meetings

(Don’t even get me started on why they need to do big restores so often and so quickly, but let’s just say that not everyone’s deployment processes can be done with an undo script.)

I think the time has come and gone for small businesses to even have this discussion. For small businesses, it just doesn’t make sense to try to run your own IT for new applications. However, for the kinds of readers I’ve got here – who tend to work in more specialized environments that have full time employees focused on the data – the question still has plenty of legitimate answers in the year 2020, despite what cloud vendor brochures say.

So I’m curious: what do you think you do better than the cloud?


You Can Disable Parameter Sniffing. You Probably Shouldn’t.

During my parameter sniffing classes, people get a little exasperated with the complexity of the problem. Parameter sniffing is totally hard. I get it. At some level, it’d be great to just hit a magic button and make the whole thing go away.

So inevitably, somebody will ask, “What about the database-level setting called Parameter Sniffing? Can’t I just right-click on the database, go into options, and turn this damn thing off?”

What they THINK is going to happen is that SQL Server will do an OPTION(RECOMPILE) on every incoming query, building fresh plans each time. That ain’t how this works at all. Instead, I wish this “feature” was named “Parameter Blindfolding.” Here’s what it really does.

Equality searches often benefit when you disable parameter sniffing.

I’ll take the Stack Overflow database, create an index on Users.DisplayName, and then build a stored procedure that queries users by their DisplayName:
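Here’s a sketch of the setup – the procedure name and the ORDER BY are my own, but the shape matches what’s described below:

    CREATE INDEX IX_DisplayName ON dbo.Users (DisplayName);
    GO
    CREATE OR ALTER PROCEDURE dbo.usp_UsersByDisplayName
        @DisplayName NVARCHAR(40)
    AS
    BEGIN
        SELECT *
        FROM dbo.Users
        WHERE DisplayName = @DisplayName
        ORDER BY Reputation DESC;   /* something that needs a sort (and a memory grant) */
    END
    GO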

The classic problem with parameter sniffing is that if I call the proc for a relatively rare name like Brent Ozar, SQL Server builds a plan that’s great for a limited number of rows – but then the plan gets reused for more common display names like Alex. There are about 10K Alexes, and the index seek + key lookup may make less sense there, and we get a spill to disk because SQL Server didn’t grant enough memory for the sort:

Folks think, “Well, that’s bad – I want to avoid parameter sniffing. So let’s turn off parameter sniffing, and that’ll fix it, right?”
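Disabling it at the database level is one right-click away in the database options, or in T-SQL:

    ALTER DATABASE SCOPED CONFIGURATION SET PARAMETER_SNIFFING = OFF;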

Then try running it for Alex. The good news is that SQL Server doesn’t bother sniffing the parameters:

The bad news is that we have a different problem: SQL Server is now ignoring the incoming parameters and optimizing for the average DisplayName rather than the one being passed in.

But you know what? In some cases…that’s totally okay. You might actually want to optimize for the average value. In normal data distributions, this means you’re going to be optimizing for index seeks rather than table scans, and who doesn’t like index seeks? I like index seeks. I mean, they’re not my fetish, but I like ’em. They’re alright.

But range searches suck.

You’re not likely to do a range search on a DisplayName, so let’s switch columns and look at the LastAccessDate. We’ll turn parameter sniffing back on:
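Here’s a sketch of that setup – again, the names and the ORDER BY are my own:

    ALTER DATABASE SCOPED CONFIGURATION SET PARAMETER_SNIFFING = ON;
    GO
    CREATE INDEX IX_LastAccessDate ON dbo.Users (LastAccessDate);
    GO
    CREATE OR ALTER PROCEDURE dbo.usp_UsersByLastAccessDate
        @StartDate DATETIME,
        @EndDate   DATETIME
    AS
    BEGIN
        SELECT *
        FROM dbo.Users
        WHERE LastAccessDate BETWEEN @StartDate AND @EndDate
        ORDER BY DisplayName;       /* something off the index key, so there's a sort */
    END
    GO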

When parameter sniffing is ON, the small date range sets the stage, uses the index, and gets the small grant, and that backfires for the big date range:

Classic parameter sniffing problem. So when we “fix” it by disabling parameter sniffing, and then run the small date range:

This causes several interesting problems:

  • The row estimates are completely insane: 16% of the table
  • This is way over the tipping point, so SQL Server ignores indexes
  • The query will always use parallelism, regardless of how tiny the date range is
  • The memory grant is staggeringly large: 5GB of memory gets erased when this query runs

Parameter Blindfolding makes a lot of queries worse.

If you only have equality searches, not range searches, you could probably disable parameter sniffing at the database level and do okay. If you’ve got range searches, though, Parameter Blindfolding is going to backfire, and you’re still going to have a lot of query tuning to do.


Things To Consider When Submitting Sessions to Paid Online Conferences

Lots of paid conferences are happening online these days, so some of you will be presenting an online session or pre-con for the first time. I’ve done a bunch of these and been burned in a bunch of ways, so lemme save you some time and heartache.

Does the conference own the recordings? And if so, what are they allowed to do with ’em? Do you get any veto or say over what they’d like to do with the recordings? For example, one site I used to work with in the past would take the live session recordings and then re-present them online later, acting as if they were a “new” event. They kept bundling past recordings slightly differently, changing the titles and the abstracts as if they were new material, and then running them at specific dates/times as if I was standing in front of the camera, live. I was furious because the material wasn’t really new, and they kept doing it even when the material was outdated.

Do you own the recordings too? If you have a successful event and you’re happy with the recordings, can you get a copy and use them as you see fit? Can you distribute them to your clients? Can you upload them to your YouTube account to help spread awareness? Can you sell them online at Shopify or Gumroad? That’s an easy way to make a little side money, by blogging about your recordings and linking to a shopping page at those sites.

How will the conference distribute and protect the recordings? If the attendees get the recordings in a downloadable format, they will share it with all their friends, and that’s the end of your sales. I’ve sat in local user group sessions where the leader said, “If anyone wants the recordings from Conference X, just borrow this USB drive, copy the files to your laptop, and hand it to the next person. Who wants it first?”

Can your recordings be sold in a bundle with others? And if so, how is each speaker compensated? For example, one conference used to take all their pre-con recordings and sell them in an attractively priced bundle. Sure, attendees loved getting all the pre-cons for $200, but when you sliced out the revenue per speaker, $5 for an all-day training class ain’t great.

Can you and the conference agree on a minimum price for your recordings? Even if it isn’t bundled, the conference could pull a Udemy and sell your material for $5-$10 because it’s still free money to them. The problem is that you won’t be able to sell that material anywhere else when they’re selling it so cheaply, and you’ll even have a tough time getting private clients to pay you for training. They’ll just buy it online for $5-$10.

Do you get veto rights on the recording? If you have a really crappy pre-con day with failed demos, bad Internet, or a person asking harassing questions all day, can you nix the recording sales and get a do-over on the recording? Or what if you decide you want to sell that material online yourself, can you get them to stop selling it?

Can you give them the recordings? If you record the session on your own side locally, the video & audio quality can be way better than the conference’s recording, which happens after the video & audio are heavily compressed by their conference platform.

Is there an exclusivity clause on the live session? Some conferences don’t want you to present this same material at any other conference in the months before or after their session. They’re trying to protect their own revenue, which makes sense. However, now that all conferences are online for the foreseeable future, this can also limit the number of conferences you get involved with. If you’re going to tie yourself down, make sure you’ve got the right partner who’s going to maximize your audience in that time range.

Is there an exclusivity clause on the recording? Are they the only ones who are allowed to sell it online? Is there an expiration date?

When will you get paid? How long after the conference, and if they’re selling the recordings going forward, how often will those payouts be done? Even though the conference organizer gets their money ahead of time, don’t think that you will. For example, one event organizer kept telling me “we’re working on balancing the books” for months after the event finished.

What happens if the session delivery fails? If the conference’s online meeting platform has a terrible experience for the end users, like constant disconnects or users can’t log in or can’t hear things, do you still get paid? If users complain about the conference platform, your audio, your demos failing, etc – how will revenue disputes be handled?

Just like you, most online conference organizers are new at this right now too, and you’re probably going to find that they don’t have answers for these questions right away because they haven’t thought about ’em either. Having these discussions together helps everyone come to an agreement that works well for both sides.

