Query Hints You Can Use to Avoid Blocking

No, not NOLOCK. That one can give you incorrect query results, even when the underlying data isn’t changing.

I’m talking about when you need to do writes, but you want your query to get along with others in high concurrency environments, without creating a blocking firestorm.

SET LOCK_TIMEOUT 1000 – if you run this before your query starts, SQL Server will wait patiently for that many milliseconds (1,000 here) before giving up and aborting your query. This is useful when a lot of short queries are running all the time, and I don’t want to give up instantly – but I do want to give up if there’s a long-running query blocking me. If you pass in a value of 0 milliseconds, SQL Server gives up instantly. For example, in the below query, I’m trying to delete all of Jon Skeet’s posts, but I want to abandon my query if anyone holds locks that block me on either the Users or Posts tables for over 1 second:
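Here’s a sketch of what that might look like, assuming the Stack Overflow database schema (this isn’t necessarily the post’s original demo code):

    SET LOCK_TIMEOUT 1000; /* milliseconds */

    DELETE p
    FROM dbo.Posts AS p
    INNER JOIN dbo.Users AS u
        ON p.OwnerUserId = u.Id
    WHERE u.DisplayName = N'Jon Skeet';

When the timeout hits, the delete fails with error 1222, “Lock request time out period exceeded.”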

If you use this technique, or the other techniques described in this post, your query will fail with an error if it was blocked by someone. Most of us (myself included) don’t do a great job of checking for errors and retrying our query gracefully, so we could probably all use an hour to dig into Erland Sommarskog’s series on error handling to learn how to retry automatically.


WITH (NOWAIT) – this is kinda like SET LOCK_TIMEOUT 0, but it’s a table-level hint like NOLOCK, meaning it only affects a single table in your query. For example, in the below query, I’m trying to delete all of Jon Skeet’s posts, but I want to abandon my query if someone has a lock on the Users table – but I’m willing to wait forever for locks on the Posts table:
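A sketch of that – note the hint only goes on Users, so only that table gets the zero-tolerance treatment:

    DELETE p
    FROM dbo.Posts AS p
    INNER JOIN dbo.Users AS u WITH (NOWAIT)  /* bail instantly if Users is locked */
        ON p.OwnerUserId = u.Id
    WHERE u.DisplayName = N'Jon Skeet';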


SET DEADLOCK_PRIORITY LOW – this isn’t about blocking, but specifically deadlocks. When I set this at the beginning of my batch, I’m saying, “I’m willing to wait for blocking, but if a deadlock situation comes up, go ahead and kill my query, because I don’t mind trying it again.” You call it like we did LOCK_TIMEOUT:
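A sketch of that:

    SET DEADLOCK_PRIORITY LOW;

    DELETE p
    FROM dbo.Posts AS p
    INNER JOIN dbo.Users AS u
        ON p.OwnerUserId = u.Id
    WHERE u.DisplayName = N'Jon Skeet';

If a deadlock does break out, my low-priority session is the one that gets chosen as the victim (error 1205), and I can catch that error and retry.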


WAIT_AT_LOW_PRIORITY – this one is just for index maintenance, and sadly, it doesn’t work for creates or drops. “Online” index rebuilds aren’t completely online: they need a brief schema modification lock to swap in the new copy of the index. This hint lets your completed index lurk patiently in the background while other queries finish, and then swap in later:
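The syntax looks like this – the index name here is made up for the example:

    ALTER INDEX IX_LastAccessDate ON dbo.Users
        REBUILD WITH (ONLINE = ON (
            WAIT_AT_LOW_PRIORITY (
                MAX_DURATION = 1 MINUTES,      /* how long to wait politely */
                ABORT_AFTER_WAIT = SELF        /* then cancel my rebuild, not the blockers */
            )));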

If that last one is helpful, then your next step is to do similar blocking avoidance with DDL operations by reading Kendra Little’s post on which operations support online & resumable.


Updated First Responder Kit and Consultant Toolkit for January 2021

You know how a lot of app release notes say “bug fixes and improvements”? We’re not gonna lie: this one’s all bug fixes. Huge shout out to the community contributors for catching stuff and contributing fixes.

New behavior for SQL Server 2008 & R2 users: sp_BlitzLock has never worked on SQL Server 2008, but starting this month, even the installation of sp_BlitzLock fails with an error: one of this month’s checkins uses THROW, which isn’t available on 2008. This also means that if you use the Install-Core-Blitz-No-Query-Store.sql script to do installations, you’re going to get an error. It’s okay – all of the other stored procs install successfully. This behavior wasn’t intentional, and we’d rather it didn’t happen. If you’d like to improve sp_BlitzLock’s installer to bail out gracefully without an error, we’d gladly accept a pull request for that.

To get the new version:

Consultant Toolkit Changes

To help troubleshoot plan cache pollution due to unparameterized queries, there’s a new tab for the top 10 queries with the most duplicated plans in cache. It’s based on this troubleshooting query. We also capture the top 3 plans for each query, sorted by duration descending.
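For reference, the general shape of a duplicated-plans query looks something like this – my own sketch, not the Toolkit’s exact query:

    SELECT TOP 10
           qs.query_hash,
           COUNT_BIG(*)            AS plans_in_cache,
           SUM(qs.execution_count) AS total_executions
    FROM sys.dm_exec_query_stats AS qs
    GROUP BY qs.query_hash
    ORDER BY plans_in_cache DESC;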

This does mean a new querymanifest.json and spreadsheet. If you’ve customized those files, you’ll need to copy your changes into the new querymanifest.json and spreadsheet if you want to get the new tab.

sp_Blitz Changes

  • Fix: standardized the @Help = 1 behavior across this & all of the procs. When you run ’em with @Help = 1, they return the help info and then exit immediately. (#2717, thanks Todd Chitt.)
  • Fix: offline databases no longer show up in auto-shrink or page verification checks. (#2750, thanks Erik Darling and Ali Hacks.)

sp_BlitzFirst Changes

  • Improvement: if you turn on @Debug = 1, it now raises “Running CheckID #___” messages in the messages tab in SSMS to help you debug. (#2739, thanks Adrian Buckman.)
  • Fix: corrected a warning typo on queries with max memory grant, and clarified that the query’s memory grant is a percentage of max workspace. (#2732, thanks Ali Hacks.)
  • Fix: forwarded fetches in table variables now show that it’s a table variable rather than a temp table. (#2742)
  • Fix: forwarded fetches in tempdb objects were reporting the cumulative number of fetches since the object was created rather than the forwarded fetches created during the sp_BlitzFirst sample. (#2743)

sp_BlitzIndex Changes

  • Fix: if a missing index recommendation included columns whose names included ampersands, sp_BlitzIndex was failing with an error. (#2734, thanks snckirkmarken.)

sp_BlitzLock Changes

  • Improvement: now runs better without sysadmin or db_owner permissions by wrapping an update statement in a try/catch block. (#2751, thanks Chad Baldwin.)

sp_ineachdb Changes:

  • Fix: case sensitivity bug introduced in last month’s version. (#2737, #2744, #2747, and #2748, thanks Jefferson Elias and sm8680.)
  • Fix: GO was removed at the end of last month’s version, which broke the Install-All-Scripts.sql installer when running on brand-new servers. (#2753, thanks TLSystemsLLC.)

sp_BlitzWho Changes

  • Fix: we’re supposed to sort queries from longest-running to shortest-running, but last month’s version introduced a bug in that. This should fix it. (#2721, thanks Erik Darling and Ali Hacks.)

For Support

When you have questions about how the tools work, talk with the community in the #FirstResponderKit Slack channel. If you need a free invite, hit SQLslack.com. Be patient – it’s staffed with volunteers who have day jobs.

When you find a bug or want something changed, read the contributing.md file.

When you have a question about what the scripts found, first make sure you read the “More Details” URL for any warning you find. We put a lot of work into documentation, and we wouldn’t want someone to yell at you to go read the fine manual. After that, when you’ve still got questions about how something works in SQL Server, post a question at DBA.StackExchange.com and the community (that includes me!) will help. Include exact errors and any applicable screenshots, your SQL Server version number (including the build #), and the version of the tool you’re working with.

Watch Us Working On It

I live streamed while I was working on the forwarded fetches fixes for sp_BlitzFirst:


We Lost Gareth Swanepoel.

Thirteen years ago, I met Gareth Swanepoel (@GarethSwan) for the first time, interviewed him for his first real sysadmin job, and … turned him down.

Gareth Swanepoel circa 2007

Thankfully, my manager overruled me and hired him anyway.

Thank goodness – and thank goodness Gareth forgave me for my incredibly bad judgment, hahaha.

Gareth proved to be one of the hardest-working, most-determined people I’d ever met. He was endlessly curious – it wasn’t enough to solve a problem – he wanted to know how the thing worked, whatever it was. He wanted to know everything about hardware, operating systems, networking, science, and eventually, databases.

When I left Southern Wine & Spirits, Gareth had the “good fortune” of cleaning up the messes I’d left behind. He grew to love databases, moving on to Pragmatic Works, then teaching classes to share what he knew, and then achieving the big goal that most SQL Server people dream of accomplishing: he got a job at Microsoft.

Along the way, we shared so many laughs, beers, and meats. When Tim Ford & I launched SQLCruise, Gareth immediately wanted to do a SQLBBQ at his house, cooking the pre-cruise party, sharing his pool, and introducing everyone to his wonderful family.

To know Gareth was to smile and to share.

Gareth greeted everyone with incredibly wide smiles, and he wanted to know everything about you. If you knew him, you surely witnessed him nodding excitedly and saying, “Yeah?” He wanted to encourage you to go on about whatever you were interested in, and he wanted to develop the same interests too. He wanted to know what made you happy, and it made him happy, too.


I have so many great memories with Gareth. The last time we were hanging out, we were racing a Porsche around in the Nevada desert and eating a crazy dinner with Ryan & Andrea Allred. He said yes to everything, and the joy that he found everywhere was inspiring.

Gareth brought the best parts of childhood into adulthood. He was endlessly curious and excited. From his Lego collection to his love of Disney to his passion for databases, he lived life to its fullest.

Sadly, we lost Gareth last week due to COVID-19.

The data community was so lucky to have him while we did. He was an incredible blessing to those who knew him, and he touched so many of us with his excitement and generosity.

There’s a GoFundMe to help his two children get started on the right foot, and you can watch his memorial service, part 1 and part 2. Gareth’s sister begins speaking at 48 minutes into part 1.


Which Microsoft Certification Should You Get?

You work with data, and you’re thinking about getting a Microsoft certification. Which one should you get?

If you need the certification to get a job…

Then get the certification you see mentioned the most often in the descriptions of the job you’re looking for.

This is just a little tricky because certifications shown on jobs often lag behind by a couple/few years. The HR department may just have copy/pasted the job requirements from the last job ad they posted. That certification may no longer even be offered by Microsoft. (Microsoft has a tendency to reinvent their certification lineup every 2-3 years, throwing the whole thing into the SlapChop for a remix, seemingly just to make their customers spend more money on new certs while learning more about Microsoft’s cloud products.)

I gotta tell you that the certificate absolutely, positively will NOT get you the job. It’s just that when you’re applying for a publicly posted job, you’re competing with a huge, huge stack of wildly unqualified applicants. In order to cut down the stack of resumes, the hiring department will use any kind of filter they can think of, and often, that’s going to be whether someone mentions a Microsoft cert number on their resume.

I don’t agree with that strategy, but I understand why they have to do it.

Long term, though, I’d say that a better strategy for getting a job is to start building your brand online. Start writing blog posts and delivering presentations, and after a couple/few years, you’ll have people coming to you with jobs. You won’t have to worry about standing out in a stack of resumes because you’ll be the only candidate. It takes time, but it’s a much better long term gamble than trying to stand out in a stack.

If you need the certification for your current job…

Sometimes companies require their staff to show that they’re continuously improving, and one way they measure it is by seeing you pass industry tests.

In that case, get whatever your manager or HR department wants you to get.

If they don’t care which one you get, then get the oldest one that’s relevant to your job. That might sound counter-intuitive: you might think you want the latest and greatest cert. The problem is that the new certs will have less training material available. If you’re only trying to check a box for your HR team, an older cert with tons of free YouTube prep videos and cheap PDFs will be easier to pass.

If you want the certification for personal growth…

Then get the newest certification that calls to you personally. It’s going to have less established training material available, but that’s okay: you’re doing this for personal growth anyway.

The best personal growth doesn’t come from reading a book with a bunch of canned test questions and answers. The best growth comes from understanding what the test is trying to measure, and then building your own experience working with that tool. Make a list of the tools and techniques you want to experiment with, and then check out Microsoft’s SQL Workshops repository on Github. It’s full of free self-learning resources for people like you who want to grow their own skills.

No matter why you want the exam,
just take it first.

You have a serious case of impostor syndrome: you believe you’re not qualified to pass the test. Because of that, you’re going to fill yourself with self-doubt as you prep for the exam, and you’re going to think the exam is incredibly challenging. It’s not – it’s just a test. It’s not the final boss of your life’s video game. After all, I bet you’ve met plenty of people who were supposedly certified, but couldn’t solve the most basic problems at work.

Go take it first. You’re going to fail, but you’ll be comfortable with that failure – because you didn’t expect to pass. Then, you’ll have a better feeling of what the test-taking process is like, the level of detail in the questions, and how much work you need to put into studying.

Go get ’em!


Want to Avoid Deployment Downtime? Replication Probably Isn’t the Answer.

You’re tired of taking outages to deploy changes to your database schema.

Wouldn’t it be cool to have two different SQL Servers so that you could:

  • Point your customer traffic at ServerA
  • Apply your deployment scripts to ServerB (which isn’t taking any customer traffic)
  • Point your customer traffic over to the newly deployed ServerB
  • Apply your deployment scripts to ServerA (which isn’t taking any customer traffic)

Presto! All your servers are up to date and in sync, right? Well, no, not in SQL Server at least.

Illustrating the problem with a simple schema change

We’ll use the Posts table in the Stack Overflow database as an example:
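For reference, the relevant columns look something like this (abbreviated – the real table has many more columns):

    CREATE TABLE dbo.Posts (
        Id INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
        OwnerUserId INT,
        Tags NVARCHAR(150)  /* up to 5 tags, like '<sql-server><t-sql>' */
        /* ...plus Title, Body, Score, and many more */
    );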

The Posts table holds questions & answers. In the dark days of the original design, the Tags column held up to 5 tags to describe a question. In our new design, we’d like to break Tags out into a separate child table, one tag per row.

If we only have one SQL Server, our deployment script looks like this:

  1. Create a new PostsTags table:
    CREATE TABLE dbo.PostsTags (PostId INT, Tag NVARCHAR(40));
  2. Shut down the app
  3. Populate that table (which is going to require more work than I’m prepared to type out here)
  4. Alter the stored procedures that insert/update rows in dbo.Posts so that they now populate the new dbo.PostsTags table
  5. Start the app back up again

Why shut down the app? Because people can still be inserting & updating rows in Posts while steps 3 & 4 are happening. If I don’t shut down the application, I’ll need to do some kind of additional coding steps afterward to detect the dbo.Posts rows that changed while steps 3 & 4 ran, and make the appropriate changes in dbo.PostsTags.

There are absolutely ways around this problem if you have just one SQL Server:

  • Deploy the dbo.PostsTags table early, and modify the code to keep BOTH the dbo.Posts.Tags column and the dbo.PostsTags table in sync with new inserts/updates, or
  • Write triggers to keep both the column and the table in sync (see the sketch after this list), or
  • Write another process (like SSIS, ADF, or your own custom app) to keep them in sync
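For example, the trigger option might look roughly like this – a sketch of my own (it assumes SQL Server 2016+ for STRING_SPLIT, and it isn’t production-hardened):

    CREATE OR ALTER TRIGGER dbo.Posts_SyncTags ON dbo.Posts
    AFTER INSERT, UPDATE AS
    BEGIN
        SET NOCOUNT ON;

        /* Rebuild the child rows for every post touched by the insert/update */
        DELETE pt
        FROM dbo.PostsTags AS pt
        INNER JOIN inserted AS i ON pt.PostId = i.Id;

        /* Tags are stored like '<sql-server><t-sql>', so swap the
           delimiters out and split on the remaining separator */
        INSERT dbo.PostsTags (PostId, Tag)
        SELECT i.Id, s.value
        FROM inserted AS i
        CROSS APPLY STRING_SPLIT(
            REPLACE(REPLACE(REPLACE(i.Tags, '><', '|'), '<', ''), '>', ''), '|') AS s
        WHERE i.Tags IS NOT NULL;
    END
    GO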

But none of those are built into SQL Server. They’re all additional development work, not traditionally considered part of database administration.

Will replication solve this?

The line of thinking is that if we had TWO SQL Servers, then we could just make these changes to a server while it isn’t facing any user traffic:

  1. Create a new PostsTags table:
    CREATE TABLE dbo.PostsTags (PostId INT, Tag NVARCHAR(40));
  2. Populate that table (which is going to require more work than I’m prepared to type out here)
  3. Alter the stored procedures that insert/update rows in dbo.Posts so that they now populate the new dbo.PostsTags table

All the while that these changes are happening, some other SQL Server somewhere is handling all the application traffic – so we don’t have any data consistency problems, right?

Wrong: that other SQL Server is still getting inserts & updates to the Posts table. Replication only keeps the Posts table in sync: it doesn’t translate changes between the Posts table and PostsTags.

Replication COULD be used to translate changes between two different database schemas – but that’s up to you. It’s not as simple as checking a box inside replication. You have to write your own custom replication stored procedures, and that’s left as an exercise for the reader.

And if you’re willing to do that…you don’t need replication. Remember back in the last section how I said there were absolutely ways around this problem if you only had one SQL Server? Go back to that section – those options are simpler than introducing replication.

Replication also makes high availability way harder.

That sounds backwards for folks who grew up with other database platforms where replication is a valid solution for high availability and disaster recovery, hahaha.

However, SQL Server’s high availability and disaster recovery mechanisms rely on the transaction log. ServerA and ServerB will both have completely different transaction logs – after all, they’re taking different changes at different times. (One gets live customer data, then gets patched – the other is happening in the opposite order.)

Because they have different transaction logs, you have to protect them each individually. If you’re using Always On Availability Groups for HA/DR protection, that means you would have two separate Availability Groups, likely each in their own set of servers. (You can’t use the same servers if you want to use just one database name: database name has to be unique per server.)

So you’d end up with a convoluted set of replicas for both ServerA and ServerB. You’ve just doubled your server count, patching, and disaster recovery work.

Note that I didn’t say it was impossible.

You could absolutely do all this – and I’ve had one client in the past who actually did. They were even able to reduce their (perceived) outage windows because:

  • Their deployment scripts were extremely disk-intensive: they liked to rewrite entire tables to delete or reorder columns
  • Their storage sucked, so big logged changes took forever
  • They didn’t mind replication getting way out of date (like hours) – as long as users could see any data on the site, they didn’t care if it was accurate or not, so the replication lag during deployments was tolerable
  • They simply didn’t bother with high availability or disaster recovery for the replication systems – they snapshotted entire servers daily, and they were willing to lose a day of data
  • They were willing to massively staff up to write their own custom replication stored procedures – they had a team of 5-6 people who just wrote replication stored procs for deployments (because they were a mature application with hundreds of tables, each of which required their own custom procs when they made schema changes, and Lord knows they loved to make schema changes, but they wanted the replication procs to be extremely well-tested)

You can do all that. Frankly, though, I just wouldn’t bother – use the single-server methods instead, or standardize on only doing additive, not destructive, changes.


Are You Underpaid? Find Out in the Data Professional Salary Survey Results.

We asked what you make, and 1,747 of you in 69 countries answered. Altogether, you made $171,879,034 this year. Hubba hubba, there’s some earning power in this audience.

Download the Data Professional Salary Survey results (XLSX).

A few things to know about it:

  • The data is public domain. The license tab makes it clear that you can use this data for any purpose, and you don’t have to credit or mention anyone.
  • The spreadsheet includes the results for all history since 2017. We’ve gradually asked different questions over time, so if a question wasn’t asked in a year, the answers are populated with Not Asked.
  • The postal code field was totally optional, and may be wildly unreliable. Folks asked to be able to put in small portions of their zip code, like the leading numbers.
  • Frankly, anytime you let human beings enter data directly, the data can be pretty questionable – people put in questionable units of measure for salary, and we discarded a few responses who were obvious inconsiderate trolls about their gender.

I did some quick slicing and dicing, focusing on SQL Server DBAs in the United States, and deleted the top & bottom 10 outliers (because they were kinda bananas.) Looks like DBA salaries are up again this year:

With the wildly turbulent year we just had, I was wondering if y’all would change your career goals. You said that in 2021, your plans are to:

And that mix almost identically mirrors last year’s responses, so it looks like the pandemic didn’t change your plans too much. 2/3 of you usually plan to stay in the same company doing the same thing.

What’s the newest version of SQL Server you have in production?

The numbers for SQL Server 2019 are way up this year, from 11% last year. That’s great! Now, on the flip side, what’s the oldest version you have in production?

Oof. About 1/3 of y’all are still supporting something that Microsoft refuses to support. You’ve got about another year and a half on SQL Server 2012, but that still means you want to start making your transition plans this year. Interestingly, the numbers for 2008 & 2008R2 didn’t drop all that much: last year, 40% of you were running those.

Hope this helps make your salary discussions with the boss a little bit more data-driven, and hope it helps justify keeping your salary competitive. If you’ve got questions about the data, go ahead and jump in – download the results in Excel. Don’t ask me to slice and dice the numbers for you – you’re a data professional, remember? Get in there and analyze it to answer your own questions. You’ve got this!


Your Ten Favorite Blog Posts from 2020

The most popular posts this year were when I ran my Fundamentals classes for free throughout the year. I ran a different class per month, and posted signups and videos here on the blog. I’m setting those aside, though, because they’re not evergreen: they’ve since disappeared, because I only ran ’em free for a limited time.

My favorite new pic by Eric Larsen

Here are the still-present posts I wrote this year that y’all visited the most:

Over the course of the last couple of years, I’ve been writing more “how do I ___?” posts – and to understand why, here are the posts you visited most in 2020 regardless of when we wrote ’em:

“How to” posts are particularly evergreen: as long as they’re about long-lasting pain points, they’re the kinds of things you can write and then gradually get more and more viewers over time. It’s also the key to more subscribers: as readers stumble across your stuff repeatedly in Google, they’re more likely to subscribe to your work because they know you write useful stuff.


[Video] Office Hours: SQL Server Career Q&A, Part 3

Here’s the last video in my vacation series with the answers to your most-upvoted professional development questions. Enjoy!

I won’t be taking new questions in the comments, but I’m leaving the comments open on this post so that y’all can discuss the video together with each other. Just be mindful that your name will be publicly visible, and if you use your real email address, your Gravatar icon will show up next to your comment. If you have something you want to say anonymously, best to use a throwaway name/email on this one.


As an Attendee, What’s Important to You In a New Data Community?

With PASS shutting down, it’s time to think about what the community builds in its place.

There are a lot of efforts going towards building replacements for PASS, including Microsoft’s own efforts. Because these efforts are often led by active community volunteers, they’re very well in touch with what event organizers, sponsors, and speakers want in their community.

But I’d like to ask you, dear reader – when you attend (not volunteer or speak at) an event, whether it’s in person or online, whether it’s a one-time annual event or a monthly user group meeting:

  • What’s your idea of a great data community?
  • What are things that you want to see included?
  • What are features or attributes that are important to you?
  • How do you want to interact with other folks in the community?

I’ll ask clarifying questions in the comments just to help hone your point of view, but I’m not going to give any answers here. I’d rather just pose this and let y’all talk through it with each other, and then let organizers read through this for insights.


[Video] Office Hours: SQL Server Career Q&A, Part 2

Last week, I posted a call for questions about SQL Server careers and professional development, and y’all asked a lot of good ones. Here’s the next video with the questions you upvoted the most, and I’ll post the last round of ’em next week.

I won’t be taking new questions in the comments, but I’m leaving the comments open on this post so that y’all can discuss the video together with each other. Just be mindful that your name will be publicly visible, and if you use your real email address, your Gravatar icon will show up next to your comment. If you have something you want to say anonymously, best to use a throwaway name/email on this one.


How to Batch Updates A Few Thousand Rows at a Time

You’ve got a staging table with millions of rows, and you want to join that over to a production table and update the contents. However, when you try to do it all in one big statement, you end up with lock escalation, large transaction log usage, slow replication to Availability Groups, and angry users with pitchforks gathered in the Zoom lobby.

In this post, I’ll explain how to use a combination of two separate techniques that I’ve blogged about recently: fast ordered deletes, and using an output table to track which rows you touched.

Setting up the problem

I’ll start with the Stack Overflow database (any version will work) and make a copy of the Users table called Users_Staging, and I’ll change a bunch of values in it to simulate a table we need to import from somewhere else:
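Something like this – the specific value changes are arbitrary, since we just need staging rows that differ from production:

    SELECT *
    INTO dbo.Users_Staging
    FROM dbo.Users;

    CREATE UNIQUE CLUSTERED INDEX CX_Id ON dbo.Users_Staging (Id);

    /* Simulate changes arriving from an external system: */
    UPDATE dbo.Users_Staging
        SET Reputation = Reputation + 100,
            LastAccessDate = GETUTCDATE();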

If I try to run a single update statement and update all of the rows:
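Something like this (a sketch – pick whatever columns you’re actually importing):

    UPDATE u
        SET u.Reputation     = s.Reputation,
            u.LastAccessDate = s.LastAccessDate
    FROM dbo.Users AS u
    INNER JOIN dbo.Users_Staging AS s
        ON u.Id = s.Id;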

Then while it’s running, check the locks it’s holding in another window with sp_WhoIsActive @get_locks = 1:

See how the Users table says “OBJECT” request_mode “X”? That means my update query has gotten an eXclusive lock on the Users table. That’s your sign that other queries will be screaming with anger as they wait around for your update to finish.

Now, sometimes that’s what you actually want: sometimes you want to rip the Band-Aid off and get all of your work done in a single transaction. However, sometimes you want to work through the operation in small chunks, avoiding lock escalation. In that case, we’re going to need a batching process.

How to fix it using the fast ordered delete technique and the output table technique

I’m going to encapsulate my work in a stored procedure so that it can be repeatedly called by whatever application I’m using to control my ETL process. You could do the same technique with a loop (say, while rows exist in the staging table), but I’m choosing not to cover that here. When you get your own blog, you’ll realize that you also get to control what you write about … and everyone will complain regardless. Here we go:
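Here’s a sketch of the proc – the proc name and column list are mine, and the real post’s version may differ:

    CREATE OR ALTER PROCEDURE dbo.usp_Users_BatchUpdate AS
    BEGIN
        SET NOCOUNT ON;
        DECLARE @ProcessedIds TABLE (Id INT NOT NULL);

        BEGIN TRAN;

        /* Fast ordered technique: grab exactly 1,000 rows by the clustered key */
        WITH StagedRows AS (
            SELECT TOP 1000 *
            FROM dbo.Users_Staging
            ORDER BY Id
        )
        UPDATE u
            SET u.Reputation     = s.Reputation,
                u.LastAccessDate = s.LastAccessDate
            OUTPUT INSERTED.Id INTO @ProcessedIds   /* output table technique */
        FROM dbo.Users AS u
        INNER JOIN StagedRows AS s
            ON u.Id = s.Id;

        /* Remove only the rows we just processed from staging */
        DELETE s
        FROM dbo.Users_Staging AS s
        INNER JOIN @ProcessedIds AS p
            ON s.Id = p.Id;

        COMMIT;
    END
    GO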

When I execute that stored procedure, the locks look like a hot mess, but note the lock level on the Users object:

Now, they say “OBJECT” request_mode=”IX”, which means INTENT exclusive as opposed to just straight exclusive. This means that SQL Server is starting some work, and it might need to escalate the locks to table level…but as long as you keep the number of rows & locks low, it won’t have to. In this case, my stored procedure runs & finishes quickly without escalating to a table-level lock.

There are two parts of the proc that make this magic happen. This part:
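(from the sketch above)

    SELECT TOP 1000 *
    FROM dbo.Users_Staging
    ORDER BY Id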

Tells SQL Server that it’s only going to grab 1,000 rows, and it’s going to be easy to identify exactly which 1,000 rows they are because our staging table has a clustered index on Id. That enables SQL Server to grab those 1,000 rows first, then do exactly 1,000 clustered index seeks on the dbo.Users table.

The second magical component:
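(also from the sketch)

    OUTPUT INSERTED.Id INTO @ProcessedIds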

Tells SQL Server to track which 1,000 Ids got updated. This way, we can be certain about which rows we can safely remove from the Users_Staging table.

For bonus points, if you wanted to keep the rows in the dbo.Users_Staging table while you worked rather than deleting them, you could do something like this (see the sketch after this list):

  • Add an Is_Processed bit column to dbo.Users_Staging
  • Add “WHERE Is_Processed IS NULL” filter to the update clause so that we don’t update rows twice
  • After the update finishes, update the Is_Processed column in dbo.Users_Staging to denote that the row is already taken care of
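A rough sketch of those changes – again, the names are my own:

    ALTER TABLE dbo.Users_Staging
        ADD Is_Processed BIT NULL;

    /* In the proc's CTE, add: WHERE Is_Processed IS NULL */

    /* And replace the DELETE with: */
    UPDATE s
        SET s.Is_Processed = 1
    FROM dbo.Users_Staging AS s
    INNER JOIN @ProcessedIds AS p
        ON s.Id = p.Id;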

However, when you add more stuff like this, you also introduce more overhead to the batch process. I’ve also seen cases where complex filters on the dbo.Users_Staging table would cause SQL Server to not quickly identify the next 1,000 rows to process.

If you’re doing this work,
you should also read…

This blog post was to explain one very specific technique: combining fast ordered deletes with an output table. This post isn’t meant to be an overall compendium of everything you need to know while building ETL code. However, now that you’ve finished this post, I want to leave you with a few related links that you’re gonna love because they help you build more predictable and performant code:

Or, if you’d like to watch me write this blog post, I did it on a recent stream:

To see more of these, follow me on Twitch or YouTube.


The End of the Professional Association for SQL Server #SQLPASS

This afternoon, PASS announced that their operations will cease in January:

We are saddened to tell you that, due to the impact of COVID-19, PASS is ceasing all regular operations, effective January 15, 2021. We encourage you to take full advantage of any access that you have to PASS content between now and January 15, 2021. After that point, PASS servers will cease to function and members will no longer be able to access any PASS resources. So, in the meantime, you can watch sessions on our website, and download session recordings that you’ve purchased or that you have access to as a PASS Pro member. Please take full advantage of this exclusive content while you can.

For perspective on the financial numbers involved, read the final board meeting minutes (PDF):

Tim outlined the outstanding debt as 1.87M with a total potential deficit of $3.2M. There is also future hotel and convention center cancellation fees at around $6M over the next 5 years. Tim presented the PASS Executive’s recommendation outlining that with the magnitude of debt, no cash on hand, nor forecasted in person event for revenue generation, PASS has no choice but to move forward with insolvency.

There’s going to be a lot of discussion around this over the coming days: folks will be mourning what’s lost, working together to save the community’s resources like the videos and the chapter lists, and planning new communities.

It’s too soon for me to have any written thoughts together, but I let y’all post questions for me, and I answered them from vacation:


Frequently Asked Questions from Office Hours

In my Twitch & YouTube live streams, some questions seem to come up every week. I’m not expecting this post to stop ’em by any means – most folks on the live stream don’t seem to read the blog regularly – but I thought I’d share ’em here.

Q: Why is the same query sometimes fast and sometimes slow, depending on where I run it?

It’s most likely parameter sniffing.

Q: My index maintenance jobs take forever. What should I do?

When you rebuild a 50GB table’s indexes, you’re basically reloading that table from scratch. SQL Server has to make a brand-new copy of the table on new data pages, and it logs all this stuff in the transaction log – which means your backups take longer and your Availability Group gets way far behind.

If you’ve been rebuilding indexes nightly, consider easing that off to weekends instead. If you’re worried that will affect performance, you’re probably mixing up the difference between rebuilding indexes and updating statistics. Read about how out-of-date statistics cause bad query performance, and then consider doing daily stats update jobs rather than rebuilding your indexes.
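For example, a stats-only nightly job can be as simple as this – a blunt instrument, but far lighter than nightly rebuilds:

    /* Update statistics across the current database without rebuilding indexes */
    EXEC sp_updatestats;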

The more you stick around my blog and live streams, the more you’ll see me point out that daily stats updates are a bad idea for most databases too, but in the grand scheme of things, they’re still way better than daily index rebuilds.

Q: Which cloud provider is better for SQL Server?

Companies don’t pick their cloud hosting based on one specific database. They pick on an array of things including overall cost, available services, existing licensing agreements, etc. What’s better for you might be worse for somebody else.

Having said that, at this moment in time:

  • If you run SQL Server in a VM, you can get more VM selection & performance at AWS
  • If you want to rent SQL Server as a service, Microsoft’s Azure SQL DB and Azure SQL DB Managed Instances are quite a bit ahead of Amazon RDS

Brent's office

Q: What’s that thing on your wall?

It’s an Every Day Calendar by Simone Giertz. I use it to track the days that I work on building stuff that will produce passive income, like writing new classes or designing new apps. It’s a helpful visual reminder that I need to stay focused on building for the future.

Q: How should I get started learning SQL?

One of the nice things about SQL is that it’s a mature language. It changes very slowly and gradually over the years. Because of that, there’s a ton of training material out there available at a very low cost, like the edX classes on SQL or the book SQL for Dummies.

If you don’t find yourself immediately interested in the material, bail out and pick another author/speaker.

Q: Why would someone choose SQL Server over MySQL or Postgres?

Microsoft SQL Server costs about $2,000 USD per CPU core for Standard Edition, and around $7,000 per core for Enterprise Edition. What do you get for all that money? I’m going to give you a few reasons why companies choose SQL Server. I know you’re going to be tempted to argue with me, and you might have some great points – but you don’t need to convince me, dear reader. I’m a huge believer in other database platforms – I’ve already written about why we use AWS Aurora, plus we also use DynamoDB and MySQL. I’m just explaining why other companies often choose SQL Server, like my clients:

  • Support – some companies want to know that they can call the company who wrote the database and get support 24/7.
  • Enterprise-friendly features like auditing, encryption, and single sign on security – for example, if you’re a hospital, you may need to track everyone who queries George Clooney’s medical records.
  • High availability and disaster recovery – SQL Server has so many built-in features around clustering, Availability Groups, log shipping, etc. that make it easier to support higher uptime. I’m not saying it’s easy by any means – but easier than some open source alternatives.
  • Inertia – a lot of companies have already standardized on SQL Server, have a big install base of it, and have a lot of staff who know how to develop for it and manage it.

Q: You use a Mac – how do you manage SQL Server?

With a jump box, a VM set up with all of the tools I need. When I do long term work with a client, I have them set up a jump box for me. That way I can just use a VPN client and a remote desktop client from anywhere.

When I’m teaching training classes, I spin up fairly beefy (8-core, 60GB-RAM, 2TB SSD) VMs in the cloud with SQL Server. That approach lets me run really ugly performance workloads, maxing out CPU and storage, without worrying about the high workload interfering with my video stream.

Q: Why don’t you use Azure Data Studio during streams?

For one, the execution plan experience isn’t anywhere near as good as SQL Server Management Studio. It’s not bad, it’s just not complete – there are tons of property details you just can’t see with ADS. Since I teach performance tuning, I need those details.

Also, I gotta meet my audience where they are. The vast majority of SQL Server professionals are still using SSMS, not ADS. If I’m going to teach you something inside the span of an hour or a 1-day class, I need to pick my battles.

I do use Azure Data Studio a ton myself because I split my time between Microsoft SQL Server and AWS Aurora PostgreSQL.

Q: Will you teach classes on other databases?

Never say never, but it’s pretty unlikely. I’m terrible at Postgres.

I wouldn’t wanna teach a Postgres tuning class until I could explain how the optimizer works. I don’t have plans to even learn that, let alone build classes to explain it.

There’s another question that comes up constantly, but it deserves its own blog post, so stay tuned for that this week.


[Video] Office Hours: SQL Server Career Q&A, Part 1

I posted a call for questions about SQL Server careers and professional development, and y’all asked a lot of good ones. Here were the questions you upvoted the most, and I’ll post another round of ’em next week.

The audio on this isn’t up to my usual home-studio quality, of course!

I won’t be taking new questions in the comments, but I’m leaving the comments open on this post so that y’all can discuss the video together with each other. Just be mindful that your name will be publicly visible, and if you use your real email address, your Gravatar icon will show up next to your comment. If you have something you want to say anonymously, best to use a throwaway name/email on this one.


Wanna Ask Me Anything About Careers & Professional Development?

Brent Ozar in Cabo

I’ll be answering your questions from right here.

Ever wanted to ask me questions about careers or professional development?

Now’s your chance.

Post your questions here (update: questions are closed) – and you may wanna use a fake name if you’re asking something that’s a little revealing about your current job or employer.

Don’t have a question? Vote on the questions that other folks are asking if there’s something you’d like to see me talk about.

This evening, I’ll flip on my camera and record an Office Hours stream. It won’t be live – I’m down in Mexico on vacation with pretty low bandwidth – but I’ll work my way through the highest-voted questions and talk through ’em. I’ll upload the recording and post it here on the blog.

Nothing’s off limits: ask whatever you like, as long as it’s related to careers or professional development. I’m doing a non-technical Office Hours because it’s easier to answer those kinds of questions without a computer. It’s just gonna be you, me, and tequila.

Comments are disabled on this post just because I want to make sure you ask your questions at the right place, here. (Update: questions are closed.)


Updated First Responder Kit and Consultant Toolkit for December 2020

sp_BlitzLock gets better identification of parallel deadlocks, sp_DatabaseRestore uses Ola’s scripts for logging, and lots of bug fixes this month.

To get the new version:

Consultant Toolkit Changes

I updated it to this month’s First Responder Kit, but no changes to querymanifest.json or the spreadsheet. If you’ve customized those, no changes are necessary this month: just copy the /Resources/SQLServerSetup checklist folder over your existing files to replace the SQL scripts.

sp_Blitz Changes

sp_BlitzFirst Changes

  • Improvement: in @ExpertMode = 1, when we show the top files by read & write stalls, show 20 files instead of 5, and also sort them by the worst stalls. (#2707)
  • Fix: ignore FT_IFTSHC_MUTEX waits. (#2697)

sp_BlitzIndex Changes

  • Fix: index suggestions on really long table names could be truncated. (#2680, thanks Ralf Pickel.)
  • Fix: columnstore visualization only worked on tables in the dbo schema. (#2683, thanks Ali Hacks.)
  • Fix: nonclustered columnstore visualization didn’t report the correct columns. (#2684, thanks Ali Hacks.)
  • Fix: if you passed in databases to be ignored, they were still counting against the 50-database limit before we made you pass in BringThePain = 1. (#2693, thanks skrishnan31.)
  • Fix: temporal tables had table.schema in the detail names rather than schema.table. (#2694, thanks Mark Hions and Mikey Bronowski.)

sp_BlitzLock Changes

sp_DatabaseRestore Changes:

  • Improvement: commands are now run with CommandExecute so that they get logged if things go wrong. (#2700, thanks Frederik Vanderhaegen.)
  • Fix: the new @SkipBackupsAlreadyInMsdb switch was ignoring transaction logs if you were restoring them on the same server where you were taking backups (since they were already in msdb.) (#2710, thanks Greg Dodd, and you can watch a video of him fixing it.)

sp_ineachdb Changes

  • Fix: now handles database names with spaces in ’em. (#2702, thanks renegm.)

For Support

When you have questions about how the tools work, talk with the community in the #FirstResponderKit Slack channel. If you need a free invite, hit SQLslack.com. Be patient – it’s staffed with volunteers who have day jobs.

When you find a bug or want something changed, read the contributing.md file.

When you have a question about what the scripts found, first make sure you read the “More Details” URL for any warning you find. We put a lot of work into documentation, and we wouldn’t want someone to yell at you to go read the fine manual. After that, when you’ve still got questions about how something works in SQL Server, post a question at DBA.StackExchange.com and the community (that includes me!) will help. Include exact errors and any applicable screenshots, your SQL Server version number (including the build #), and the version of the tool you’re working with.

Watch Us Working On It

When Greg Dodd found bug #2710 in sp_DatabaseRestore, he live streamed his investigation and the fix:

You can also subscribe to Greg’s YouTube channel if you want to be alerted the next time he streams while working.

And then on the repo maintainer side, I often live stream when I’m merging pull requests. Here’s one of my sessions from this month’s release:



Free Webcast: 3 Ways to Prove Your SQL Server Storage is Slow

Your users are complaining that their queries are slow. You’ve tried changing SQL Server settings and tuning indexes, but the problem just won’t go away. You’ve got a hunch that it’s a storage problem, but you need real, empirical proof. You need numbers that no one can argue with, and you need to know acceptable targets for those numbers.

In this session, I will give you three queries to run on your SQL Server. I’ll talk through what good and bad numbers look like, and you’ll share your numbers with the overall audience to get a feel for whether your storage is hot or not.

Register here for the free webcast with me & Argenis Fernandez from Pure Storage.


Are You Underpaid? Let’s Find Out: The Data Professional Salary Survey is Open.

It’s time for our annual salary survey to find out what data professionals make. You fill out the data, we open source the whole thing, and you can analyze the data to spot trends and do a better job of negotiating your own salary:

We pay Richie in query bucks

Take the Data Professional Salary Survey now. (Update: the survey has closed.)

The anonymous survey closes Friday, Jan 1, 2021. The results are completely open source, and shared with the community for your analysis. (You can analyze ’em now mid-flight, but I’d wait until the final results come in. I’ll combine them into a single spreadsheet with the past results.)

Thanks for your help in giving everybody in the community a better chance to talk honestly with their managers about salary.

Update: here are the 2020 Data Professional Salary Survey results.


New Class: Fundamentals of TempDB – Instant Replays Ready Now

Yesterday, I taught my first Fundamentals of TempDB class, and as always, you can watch the Instant Replays right after class completes. Here’s the abstract:

You’ve been working with Microsoft SQL Server for a couple of years, and you know a little bit about TempDB. You’ve heard that temp tables and table variables have different performance characteristics than regular user database tables, and you’ve heard that you’re supposed to have multiple data files.

You’re wondering what exactly goes on behind the scenes.

This class is for curious folks who want to learn:

  • What uses TempDB, and how that affects performance: not just temp tables and table variables, but also triggers, cursors, sorting indexes, workspace spills, the version store, and more
  • How to host TempDB depending on how you use it: understanding what goes in the log file, the data files, why the number of data files matter, and whether your particular workload should be on SAN, local storage, or ephemeral storage, and why
  • How to monitor it: after you understand what uses it and how to host it, you need to keep tabs on how its performance changes over time using DMVs, Perfmon counters, and the First Responder Kit scripts

This course is 90% demos: the only slides are the introductions at the start of the day, illustrations to support a few topics, and the recap at the end of the day. The rest of the time, we’ll be working in SQL Server Management Studio. Roll up your sleeves and join me!

You can watch the Instant Replays now if you’ve got a Live Class Season Pass or a Recorded Class Season Pass. If you’d like to join the next live class, it’ll be on February 8, 2021 at 9:00-5:00 Eastern (click to see in your own time zone) – iCal.


SQL ConstantCare® Population Report: Fall 2020

Ever wonder how fast people are adopting new versions of SQL Server, or what’s “normal” out there for SQL Server adoption rates, hardware sizes, or numbers of databases? Let’s find out in the fall 2020 version of our SQL ConstantCare® population report.

Out of the 3,650 servers sending in data recently, the most popular version of SQL Server is still 2016. The combination of 2014, 2016, and 2017 make up 74% of market share right now:

On a percentage basis, SQL Server 2008, 2008R2, and 2012 all lost some share this quarter – but they didn’t lose it to SQL Server 2016 or 2017, both of which actually went down too. They lost it to 2019 and Azure, which went up a combined 5%. The newer products are at the top of this chart, and the new data’s at the right, so you can see the new stuff gradually pushing down the old stuff over time:

I love that with a year’s worth of data, we’re able to start seeing clear trends.

My thoughts:

  • SQL Server 2016 is still the juggernaut, with 1 in 3 instances overall.
  • This marks the first quarter where SQL Server 2019 has more adoption than SQL Server 2008R2! That’s awesome. It’s also up to 1/2 of the market share of SQL Server 2012, which doesn’t sound like much, but looking at its speed of growth, I’m hoping to see 2019 beat 2012 by mid-2021.
  • Azure SQL DB is getting close to the market share of 2008R2 too, which is good. I’m sure Azure SQL DB is underrepresented in these metrics, though, because we haven’t been marketing specifically to Azure SQL DB users.

In last quarter’s survey, I asked readers to respond in the comments why they weren’t adopting Azure SQL DB. They said things like:

  • We use cross-database queries, filestream, filetable
  • Lack of control over backups, inability to restore your own data
  • Lack of feature parity: xp_cmdshell, CLR, Service Broker
  • Size challenges, like a 14TB database
  • Cost predictability

None of these are permanent showstoppers: it’s all fixable over time as Microsoft continues to invest more in Azure SQL DB. It’s just a question of, “when are they going to ship enough improvements that a good chunk of users will be able to adopt it?”

Right now, we’re in a bit of a holding pattern. Microsoft hasn’t announced much around those above problems recently. However, I gotta think there’s gonna come a point where, BAM, they’re gonna remove several of those barriers, and suddenly companies are going to be much more interested in SQL-as-a-Service instead of SQL-in-a-VM.

As a consultant & trainer, I have to keep my finger on the pulse of the real market (not the marketing market) too. We all have to figure out where the right place is to invest our learning time and which skills to learn: we just can’t afford to pick ’em all up. I look at the calendar for 2021 and ask myself, “Alright, given this adoption curve, what stuff do I want to learn next year? What skills will I need to pick up in the spring that will pay off by the time fall rolls around?”

This is where it gets so tricky: the features just aren’t out yet, and the adoption isn’t there yet either. With those two things in mind, looking at the calendar, I think I can continue to punt on learning Azure SQL DB specifics for another 6-12 months, and focus on where 96% of the market is: good ol’ SQL Server.

That new Babelfish sure does look interesting, though….

