As an Attendee, What’s Important to You In a New Data Community?

Conferences and Classes
51 Comments

With PASS shutting down, it’s time to think about what the community builds in its place.

There are a lot of efforts going towards building replacements for PASS, including Microsoft’s own efforts. Because these efforts are often led by active community volunteers, they’re closely in touch with what event organizers, sponsors, and speakers want in their community.

But I’d like to ask you, dear reader – when you attend (not volunteer or speak at) an event, whether it’s in person or online, whether it’s a one-time annual event or a monthly user group meeting:

  • What’s your idea of a great data community?
  • What are things that you want to see included?
  • What are features or attributes that are important to you?
  • How do you want to interact with other folks in the community?

I’ll ask clarifying questions in the comments just to help hone your point of view, but I’m not going to give any answers here. I’d rather just pose this and let y’all talk through it with each other, and then let organizers read through this for insights.


[Video] Office Hours: SQL Server Career Q&A, Part 2

Professional Development, Videos
0

Last week, I posted a call for questions about SQL Server careers and professional development, and y’all asked a lot of good ones. Here’s the next video with the questions you upvoted the most, and I’ll post the last round of ’em next week.

I won’t be taking new questions in the comments, but I’m leaving the comments open on this post so that y’all can discuss the video together with each other. Just be mindful that your name will be publicly visible, and if you use your real email address, your Gravatar icon will show up next to your comment. If you have something you want to say anonymously, best to use a throwaway name/email on this one.


How to Batch Updates A Few Thousand Rows at a Time

T-SQL
17 Comments

You’ve got a staging table with millions of rows, and you want to join that over to a production table and update the contents. However, when you try to do it all in one big statement, you end up with lock escalation, large transaction log usage, slow replication to Availability Groups, and angry users with pitchforks gathered in the Zoom lobby.

In this post, I’ll explain how to use a combination of two separate techniques that I’ve blogged about recently: fast ordered deletes and output tables.

Setting up the problem

I’ll start with the Stack Overflow database (any version will work) and make a copy of the Users table called Users_Staging, and I’ll change a bunch of values in it to simulate a table we need to import from somewhere else:
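Something like this works as a setup sketch – the exact columns you scramble don’t matter, as long as the staging rows end up different from production and the staging table keeps a clustered index on Id (the index name and column choices here are just illustrative):

SELECT *
  INTO dbo.Users_Staging
  FROM dbo.Users;
GO
/* The batching technique below relies on a clustered index on Id: */
CREATE CLUSTERED INDEX CIX_Id ON dbo.Users_Staging (Id);
GO
/* Scramble a few values so the staging rows differ from production: */
UPDATE dbo.Users_Staging
   SET Reputation     = Reputation + 100,
       LastAccessDate = GETUTCDATE();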

If I try to run a single update statement and update all of the rows:
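A minimal version of that one-shot update might look like this (assuming we’re syncing a couple of columns from staging into production):

UPDATE u
   SET u.Reputation     = s.Reputation,
       u.LastAccessDate = s.LastAccessDate
  FROM dbo.Users u
 INNER JOIN dbo.Users_Staging s ON u.Id = s.Id;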

Then while it’s running, check the locks it’s holding in another window with sp_WhoIsActive @get_locks = 1:

See how the Users table says “OBJECT” request_mode “X”? That means my update query has gotten an eXclusive lock on the Users table. That’s your sign that other queries will be screaming with anger as they wait around for your update to finish.

Now, sometimes that’s what you actually want: sometimes you want to rip the Band-Aid off and get all of your work done in a single transaction. However, sometimes you want to work through the operation in small chunks, avoiding lock escalation. In that case, we’re going to need a batching process.

How to fix it using the fast ordered delete technique and the output table technique

I’m going to encapsulate my work in a stored procedure so that it can be repeatedly called by whatever application I’m using to control my ETL process. You could do this same technique with a loop (like while exists rows in the staging table), but I’m choosing not to cover that here. When you get your own blog, you’ll realize that you also get to control what you write about … and everyone will complain regardless. Here we go:
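Here’s a sketch of what that proc could look like – the proc name and the column list are illustrative, but the shape is the point: grab the next 1,000 staging rows in clustered index order, update production while OUTPUTting the affected Ids into a table variable, then delete only those Ids from staging:

CREATE OR ALTER PROCEDURE dbo.usp_Users_BatchUpdate AS
BEGIN
    SET NOCOUNT ON;

    /* The output table technique: remember which Ids we touched */
    DECLARE @RowsAffected TABLE (Id INT PRIMARY KEY);

    /* The ordered TOP: take the next 1,000 staging rows by clustered index order */
    WITH StagingBatch AS
    (
        SELECT TOP (1000) *
          FROM dbo.Users_Staging
         ORDER BY Id
    )
    UPDATE u
       SET u.Reputation     = s.Reputation,
           u.LastAccessDate = s.LastAccessDate
    OUTPUT INSERTED.Id INTO @RowsAffected (Id)
      FROM dbo.Users u
     INNER JOIN StagingBatch s ON u.Id = s.Id;

    /* Remove only the rows we just processed from the staging table */
    DELETE us
      FROM dbo.Users_Staging us
     INNER JOIN @RowsAffected r ON us.Id = r.Id;
END
GO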

When I execute that stored procedure, the locks look like a hot mess, but note the lock level on the Users object:

Now, they say “OBJECT” request_mode=”IX”, which means INTENT exclusive as opposed to just straight exclusive. This means that SQL Server is starting some work, and it might need to escalate the locks to table level…but as long as you keep the number of rows & locks low, it won’t have to. In this case, my stored procedure runs & finishes quickly without escalating to a table-level lock.

There are two parts of the proc that make this magic happen. The first is the ordered TOP (1000) against the staging table – the part that selects the next 1,000 rows by Id. It tells SQL Server that it’s only going to grab 1,000 rows, and that it’s going to be easy to identify exactly which 1,000 rows they are because our staging table has a clustered index on Id. That enables SQL Server to grab those 1,000 rows first, then do exactly 1,000 clustered index seeks on the dbo.Users table.

The second magical component is the OUTPUT clause. It tells SQL Server to track which 1,000 Ids got updated. This way, we can be certain about which rows we can safely remove from the Users_Staging table.

For bonus points, if you wanted to keep the rows in the dbo.Users_Staging table while you worked rather than deleting them, you could do something like:

  • Add an Is_Processed bit column to dbo.Users_Staging
  • Add “WHERE Is_Processed IS NULL” filter to the update clause so that we don’t update rows twice
  • After the update finishes, update the Is_Processed column in dbo.Users_Staging to denote that the row is already taken care of

However, when you add more stuff like this, you also introduce more overhead to the batch process. I’ve also seen cases where complex filters on the dbo.Users_Staging table would cause SQL Server to not quickly identify the next 1,000 rows to process.

If you’re doing this work,
you should also read…

This blog post was to explain one very specific technique: combining fast ordered deletes with an output table. This post isn’t meant to be an overall compendium of everything you need to know while building ETL code. However, now that you’ve finished this post, I want to leave you with a few related links that you’re gonna love because they help you build more predictable and performant code:

Or, if you’d like to watch me write this blog post, I did it on a recent stream:

To see more of these, follow me on Twitch or YouTube.


The End of the Professional Association for SQL Server #SQLPASS

#SQLPass
44 Comments

This afternoon, PASS announced that their operations will cease in January:

We are saddened to tell you that, due to the impact of COVID-19, PASS is ceasing all regular operations, effective January 15, 2021. We encourage you to take full advantage of any access that you have to PASS content between now and January 15, 2021. After that point, PASS servers will cease to function and members will no longer be able to access any PASS resources. So, in the meantime, you can watch sessions on our website, and download session recordings that you’ve purchased or that you have access to as a PASS Pro member. Please take full advantage of this exclusive content while you can.

For perspective on the financial numbers involved, read the final board meeting minutes (PDF):

Tim outlined the outstanding debt as 1.87M with a total potential deficit of $3.2M. There is also future hotel and convention center cancellation fees at around $6M over the next 5 years. Tim presented the PASS Executive’s recommendation outlining that with the magnitude of debt, no cash on hand, nor forecasted in person event for revenue generation, PASS has no choice but to move forward with insolvency.

There’s going to be a lot of discussion around this over the coming days: folks will be mourning what’s lost, working together to save the community’s resources like the videos and the chapter lists, and planning new communities.

It’s too soon for me to have my thoughts together in writing, but I let y’all post questions for me, and I answered them from vacation:


Frequently Asked Questions from Office Hours

In my Twitch & YouTube live streams, some questions seem to come up every week. I’m not expecting this post to stop ’em by any means – most folks on the live stream don’t seem to read the blog regularly – but I thought I’d share ’em here.

Q: Why is the same query sometimes fast and sometimes slow, depending on where I run it?

It’s most likely parameter sniffing.

Q: My index maintenance jobs take forever. What should I do?

When you rebuild a 50GB table’s indexes, you’re basically reloading that table from scratch. SQL Server has to make a brand-new copy of the table on new data pages, and it logs all this stuff in the transaction log – which means your backups take longer and your Availability Group gets way far behind.

If you’ve been rebuilding indexes nightly, consider easing that off to weekends instead. If you’re worried that will affect performance, you’re probably mixing up the difference between rebuilding indexes and updating statistics. Read about how out-of-date statistics cause bad query performance, and then consider doing daily stats update jobs rather than rebuilding your indexes.

The more you stick around my blog and live streams, the more you’ll see me point out that daily stats updates are a bad idea for most databases too, but in the grand scheme of things, they’re still way better than daily index rebuilds.
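If you do go the stats-update route and you use Ola Hallengren’s maintenance scripts, a statistics-only IndexOptimize job looks roughly like this – the parameter values are a starting point, not a prescription, so check Ola’s documentation for what fits your environment:

EXEC dbo.IndexOptimize
     @Databases = 'USER_DATABASES',
     @FragmentationLow = NULL,      /* skip index maintenance entirely */
     @FragmentationMedium = NULL,
     @FragmentationHigh = NULL,
     @UpdateStatistics = 'ALL',
     @OnlyModifiedStatistics = 'Y';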

Q: Which cloud provider is better for SQL Server?

Companies don’t pick their cloud hosting based on one specific database. They pick based on an array of things including overall cost, available services, existing licensing agreements, etc. What’s better for you might be worse for somebody else.

Having said that, at this moment in time:

  • If you run SQL Server in a VM, you can get more VM selection & performance at AWS
  • If you want to rent SQL Server as a service, Microsoft’s Azure SQL DB and Azure SQL DB Managed Instances are quite a bit ahead of Amazon RDS

Q: What’s that thing on your wall?

It’s an Every Day Calendar by Simone Giertz. I use it to track the days that I work on building stuff that will produce passive income, like writing new classes or designing new apps. It’s a helpful visual reminder that I need to stay focused on building for the future.

Q: How should I get started learning SQL?

One of the nice things about SQL is that it’s a mature language. It changes very slowly and gradually over the years. Because of that, there’s a ton of training material out there available at a very low cost, like the edX classes on SQL or the book SQL for Dummies.

If you don’t find yourself immediately interested in the material, bail out and pick another author/speaker.

Q: Why would someone choose SQL Server over MySQL or Postgres?

Microsoft SQL Server costs about $2,000 USD per CPU core for Standard Edition, and around $7,000 per core for Enterprise Edition. What do you get for all that money? I’m going to give you a few reasons why companies choose SQL Server. I know you’re going to be tempted to argue with me, and you might have some great points – but you don’t need to convince me, dear reader. I’m a huge believer in other database platforms – I’ve already written about why we use AWS Aurora, plus we also use DynamoDB and MySQL. I’m just explaining why other companies often choose SQL Server, like my clients:

  • Support – some companies want to know that they can call the company who wrote the database and get support 24/7.
  • Enterprise-friendly features like auditing, encryption, and single sign on security – for example, if you’re a hospital, you may need to track everyone who queries George Clooney’s medical records.
  • High availability and disaster recovery – SQL Server has so many built-in features around clustering, Availability Groups, log shipping, etc. that make it easier to support higher uptime. I’m not saying it’s easy by any means – but easier than some open source alternatives.
  • Inertia – a lot of companies have already standardized on SQL Server, have a big install base of it, and have a lot of staff who know how to develop for it and manage it.

Q: You use a Mac – how do you manage SQL Server?

With a jump box, a VM set up with all of the tools I need. When I do long term work with a client, I have them set up a jump box for me. That way I can just use a VPN client and a remote desktop client from anywhere.

When I’m teaching training classes, I spin up fairly beefy (8-core, 60GB-RAM, 2TB SSD) VMs in the cloud with SQL Server. That approach lets me run really ugly performance workloads, maxing out CPU and storage, without worrying about the high workload interfering with my video stream.

Q: Why don’t you use Azure Data Studio during streams?

For one, the execution plan experience isn’t anywhere near as good as SQL Server Management Studio. It’s not bad, it’s just not complete – there are tons of property details you just can’t see with ADS. Since I teach performance tuning, I need those details.

Also, I gotta meet my audience where they are. The vast majority of SQL Server professionals are still using SSMS, not ADS. If I’m going to teach you something inside the span of an hour or a 1-day class, I need to pick my battles.

I do use Azure Data Studio a ton myself because I split my time between Microsoft SQL Server and AWS Aurora PostgreSQL.

Q: Will you teach classes on other databases?

Never say never, but it’s pretty unlikely. I’m terrible at Postgres.

I wouldn’t wanna teach a Postgres tuning class until I could explain how the optimizer works. I don’t have plans to even learn that, let alone build classes to explain it.

There’s another question that comes up constantly, but it deserves its own blog post, so stay tuned for that this week.


[Video] Office Hours: SQL Server Career Q&A, Part 1

I posted a call for questions about SQL Server careers and professional development, and y’all asked a lot of good ones. Here were the questions you upvoted the most, and I’ll post another round of ’em next week.

The audio on this isn’t up to my usual home-studio quality, of course!

I won’t be taking new questions in the comments, but I’m leaving the comments open on this post so that y’all can discuss the video together with each other. Just be mindful that your name will be publicly visible, and if you use your real email address, your Gravatar icon will show up next to your comment. If you have something you want to say anonymously, best to use a throwaway name/email on this one.


Wanna Ask Me Anything About Careers & Professional Development?

Professional Development
Brent Ozar in Cabo
I’ll be answering your questions from right here.

Ever wanted to ask me questions about careers or professional development?

Now’s your chance.

Post your questions here (update: questions are closed) – and you may wanna use a fake name if you’re asking something that’s a little revealing about your current job or employer.

Don’t have a question? Vote on the questions that other folks are asking if there’s something you’d like to see me talk about.

This evening, I’ll flip on my camera and record an Office Hours stream. It won’t be live – I’m down in Mexico on vacation with pretty low bandwidth – but I’ll work my way through the highest-voted questions and talk through ’em. I’ll upload the recording and post it here on the blog.

Nothing’s off limits: ask whatever you like, as long as it’s related to careers or professional development. I’m doing a non-technical Office Hours because it’s easier to answer those kinds of questions without a computer. It’s just gonna be you, me, and tequila.

Comments are disabled on this post just because I want to make sure you ask your questions in the right place. (Update: questions are closed.)


Updated First Responder Kit and Consultant Toolkit for December 2020

sp_BlitzLock gets better identification of parallel deadlocks, sp_DatabaseRestore uses Ola’s scripts for logging, and lots of bug fixes this month.

How I Use the First Responder Kit
Watch and learn

To get the new version:

Consultant Toolkit Changes

I updated it to this month’s First Responder Kit, but no changes to querymanifest.json or the spreadsheet. If you’ve customized those, no changes are necessary this month: just copy the /Resources/SQLServerSetup checklist folder over your existing files to replace the SQL scripts.

sp_Blitz Changes

sp_BlitzFirst Changes

  • Improvement: in @ExpertMode = 1, when we show the top files by read & write stalls, show 20 files instead of 5, and also sort them by the worst stalls. (#2707)
  • Fix: ignore FT_IFTSHC_MUTEX waits. (#2697)

sp_BlitzIndex Changes

  • Fix: index suggestions on really long table names could be truncated. (#2680, thanks Ralf Pickel.)
  • Fix: columnstore visualization only worked on tables in the dbo schema. (#2683, thanks Ali Hacks.)
  • Fix: nonclustered columnstore visualization didn’t report the correct columns. (#2684, thanks Ali Hacks.)
  • Fix: if you passed in databases to be ignored, they were still counting against the 50-database limit before we made you pass in BringThePain = 1. (#2693, thanks skrishnan31.)
  • Fix: temporal tables had table.schema in the detail names rather than schema.table. (#2694, thanks Mark Hions and Mikey Bronowski.)

sp_BlitzLock Changes

sp_DatabaseRestore Changes

  • Improvement: commands are now run with CommandExecute so that they get logged if things go wrong. (#2700, thanks Frederik Vanderhaegen.)
  • Fix: the new @SkipBackupsAlreadyInMsdb switch was ignoring transaction logs if you were restoring them on the same server where you were taking backups (since they were already in msdb.) (#2710, thanks Greg Dodd, and you can watch a video of him fixing it.)

sp_ineachdb Changes

  • Fix: now handles database names with spaces in ’em. (#2702, thanks renegm.)

For Support

When you have questions about how the tools work, talk with the community in the #FirstResponderKit Slack channel. If you need a free invite, hit SQLslack.com. Be patient – it’s staffed with volunteers who have day jobs.

When you find a bug or want something changed, read the contributing.md file.

When you have a question about what the scripts found, first make sure you read the “More Details” URL for any warning you find. We put a lot of work into documentation, and we wouldn’t want someone to yell at you to go read the fine manual. After that, when you’ve still got questions about how something works in SQL Server, post a question at DBA.StackExchange.com and the community (that includes me!) will help. Include exact errors and any applicable screenshots, your SQL Server version number (including the build #), and the version of the tool you’re working with.

Watch Us Working On It

When Greg Dodd found bug #2710 in sp_DatabaseRestore, he live streamed his investigation and the fix:

You can also subscribe to Greg’s YouTube channel if you want to be alerted the next time he streams his work.

And then on the repo maintainer side, I often live stream when I’m merging pull requests. Here’s one of my sessions from this month’s release:

 


Free Webcast: 3 Ways to Prove Your SQL Server Storage is Slow

Storage
7 Comments

Your users are complaining that their queries are slow. You’ve tried changing SQL Server settings and tuning indexes, but the problem just won’t go away. You’ve got a hunch that it’s a storage problem, but you need real, empirical proof. You need numbers that no one can argue with, and you need to know acceptable targets for those numbers.

In this session, I will give you three queries to run on your SQL Server. I’ll talk through what good and bad numbers look like, and you’ll share your numbers with the overall audience to get a feel for whether your storage is hot or not.

Register here for the free webcast with me & Argenis Fernandez from Pure Storage.


Are You Underpaid? Let’s Find Out: The Data Professional Salary Survey is Open.

Salary
19 Comments

It’s time for our annual salary survey to find out what data professionals make. You fill out the data, we open source the whole thing, and you can analyze the data to spot trends and do a better job of negotiating your own salary:

We pay Richie in query bucks

Take the Data Professional Salary Survey now. (Update: the survey has closed.)

The anonymous survey closes Friday, Jan 1, 2021. The results are completely open source, and shared with the community for your analysis. (You can analyze ’em now mid-flight, but I’d wait until the final results come in. I’ll combine them into a single spreadsheet with the past results.)

Thanks for your help in giving everybody in the community a better chance to talk honestly with their managers about salary.

Update: here are the 2020 Data Professional Salary Survey results.


New Class: Fundamentals of TempDB – Instant Replays Ready Now

TempDB, Videos
2 Comments

Yesterday, I taught my first Fundamentals of TempDB class, and as always, you can watch the Instant Replays right after class completes. Here’s the abstract:

You’ve been working with Microsoft SQL Server for a couple of years, and you know a little bit about TempDB. You’ve heard that temp tables and table variables have different performance characteristics than regular user database tables, and you’ve heard that you’re supposed to have multiple data files.

You’re wondering what exactly goes on behind the scenes.

This class is for curious folks who want to learn:

  • What uses TempDB, and how that affects performance: not just temp tables and table variables, but also triggers, cursors, sorting indexes, workspace spills, the version store, and more
  • How to host TempDB depending on how you use it: understanding what goes in the log file, the data files, why the number of data files matters, and whether your particular workload should be on SAN, local storage, or ephemeral storage, and why
  • How to monitor it: after you understand what uses it and how to host it, you need to keep tabs on how its performance changes over time using DMVs, Perfmon counters, and the First Responder Kit scripts

This course is 90% demos: the only slides are the introductions at the start of the day, illustrations to support a few topics, and the recap at the end of the day. The rest of the time, we’ll be working in SQL Server Management Studio. Roll up your sleeves and join me!

You can watch the Instant Replays now if you’ve got a Live Class Season Pass or a Recorded Class Season Pass. If you’d like to join the next live class, it’ll be on February 8, 2021 at 9:00-5:00 Eastern (click to see in your own time zone) – iCal.


SQL ConstantCare® Population Report: Fall 2020

Ever wonder how fast people are adopting new versions of SQL Server, or what’s “normal” out there for SQL Server adoption rates, hardware sizes, or numbers of databases? Let’s find out in the fall 2020 version of our SQL ConstantCare® population report.

Out of the 3,650 servers sending in data recently, the most popular version of SQL Server is still 2016. The combination of 2014, 2016, and 2017 make up 74% of market share right now:

On a percentage basis, SQL Server 2008, 2008R2, and 2012 all lost some share this month – but they didn’t lose it to SQL Server 2016 or 2017, both of which actually went down too. They lost it to 2019 and Azure, which went up a combined 5%. The newer products are at the top of this chart, and the new data’s at the right, so you can see the new stuff gradually pushing down the old stuff over time:

I love that with a year’s worth of data, we’re able to start seeing clear trends.

My thoughts:

  • SQL Server 2016 is still the juggernaut, with 1 in 3 instances overall.
  • This marks the first quarter where SQL Server 2019 has more adoption than SQL Server 2008R2! That’s awesome. It’s also up to 1/2 of the market share of SQL Server 2012, which doesn’t sound like much, but looking at its speed of growth, I’m hoping to see 2019 beat 2012 by mid-2021.
  • Azure SQL DB is getting close to the market share of 2008R2 too, which is good. I’m sure Azure SQL DB is underrepresented in these metrics, though, because we haven’t been marketing specifically to Azure SQL DB users.

In last quarter’s survey, I asked readers to respond in the comments why they weren’t adopting Azure SQL DB. They said things like:

  • We use cross-database queries, filestream, filetable
  • Lack of control over backups, inability to restore your own data
  • Lack of feature parity: xp_cmdshell, CLR, Service Broker
  • Size challenges, like a 14TB database
  • Cost predictability

None of these are permanent showstoppers: it’s all fixable over time as Microsoft continues to invest more in Azure SQL DB. It’s just a question of, “when are they going to ship enough improvements that a good chunk of users will be able to adopt it?”

Right now, we’re in a bit of a holding pattern. Microsoft hasn’t announced much around those above problems recently. However, I gotta think there’s gonna come a point where, BAM, they’re gonna remove several of those barriers, and suddenly companies are going to be much more interested in SQL-as-a-Service instead of SQL-in-a-VM.

As a consultant & trainer, I have to keep my finger on the pulse of the real market (not the marketing market) too. We all have to figure out where the right place is to invest our learning time and which skills to learn: we just can’t afford to pick ’em all up. I look at the calendar for 2021 and ask myself, “Alright, given this adoption curve, what stuff do I want to learn next year? What skills will I need to pick up in the spring that will pay off by the time fall rolls around?”

This is where it gets so tricky: the features just aren’t out yet, and the adoption isn’t there yet either. With those two things in mind, looking at the calendar, I think I can continue to punt on learning Azure SQL DB specifics for another 6-12 months, and focus on where 96% of the market is: good ol’ SQL Server.

That new Babelfish sure does look interesting, though….


Autoparameterized Trivial Queries May Not Get Partition Elimination

Partitioning
5 Comments

How’s that for a niche topic? You’re probably never going to need to know this, but since I had to figure it out the hard way, I’m writing this down so I don’t forget it: queries that get trivial optimization may not get partition elimination.

This post is going to be a little on the long side just because I have to set up a partitioned table. I also have to warn you that if you’re a partitioning pro, you’re not going to like the way I’m doing it in this post, but the thing is, I’m reproducing an exact client problem – setting up partitioning exactly the way their table had it. Let’s not waste time flaming the partition setup.

Setting up a partitioned table for the demo

I’m going to start with the Users table from the Stack Overflow database.

To reproduce the client issue that brought me here, let’s say we always filter users by Location and DisplayName, like this:
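For example, using the same literals that show up later in the post:

SELECT *
  FROM dbo.Users
 WHERE Location = N'San Diego'
   AND DisplayName = N'Brent';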

And say that we decided to partition the users by ranges of locations, alphabetically. I’ll create a partition scheme, function, and then copy the Users table into a Users_partitioned table.
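Here’s a simplified sketch of that setup – the object names and single-filegroup placement are illustrative, but it reproduces the alphabetical RANGE LEFT approach described below, giving 27 partitions:

CREATE PARTITION FUNCTION pf_Location (NVARCHAR(100))
    AS RANGE LEFT FOR VALUES
    (N'A', N'B', N'C', N'D', N'E', N'F', N'G', N'H', N'I', N'J',
     N'K', N'L', N'M', N'N', N'O', N'P', N'Q', N'R', N'S', N'T',
     N'U', N'V', N'W', N'X', N'Y', N'Z');
GO
CREATE PARTITION SCHEME ps_Location
    AS PARTITION pf_Location ALL TO ([PRIMARY]);
GO
/* Copy the table, then move it onto the partition scheme: */
SELECT *
  INTO dbo.Users_partitioned
  FROM dbo.Users;
GO
CREATE CLUSTERED INDEX CIX_Location
    ON dbo.Users_partitioned (Location)
    ON ps_Location (Location);
GO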

Yes, you’re going to be angry that Location isn’t a great candidate for partitioning because data isn’t evenly distributed by Location, and you’re probably going to be angry about RANGE LEFT, and you’re going to have questions about different collations. Zip it. That’s not the point of this demo. This was the client’s existing partitioning strategy, and I have to do several demos for them to show different issues that we’re having with the setup. I love you a lot – no, a LOT – but I’m not going to craft a different demo for you. I’m just sharing this one demo with you because it’s easy to share publicly, whereas some of the rest I need to show can’t be shown publicly because it relates to their IP. I’m typing all this out because partitioning people are passionate about proper performance, and I just know they’re gonna flame me in the comments as if *I* designed this strategy. Anyhoo, moving on.

Running a trivial query

Let’s try our query to see if it divebombs into just one partition – the one containing San Diego. The actual query plan looks simple enough:

And it looks like we divebombed straight in, but right-click on the Clustered Index Seek and click Properties:

“Actual Partitions Accessed” shows that we accessed partitions 1 through 27. Another way to see it is SET STATISTICS IO ON, which shows that we read all 27 partitions of the alphabet.

Why?

Right-click on the SELECT operator in the plan and look at the properties, and scroll down to Optimization Level. SQL Server believed this query was trivially simple, so it didn’t put much thought into building an execution plan.

It’s not that SQL Server didn’t put any thought into it – note the 122ms of Compile CPU time. Partitioned tables generally see longer compilation times, even when they’re trivial. (And the more partitioned tables you have in a query, the worse this seems to get.)

Even though our query didn’t have parameters, SQL Server thought, “They’re probably going to run queries like this a few times, so I’m gonna automatically turn those literals (San Diego and Brent) into parameters.” Note the “Parameter List” in the screenshot – this is autoparameterization.

Now look, I gotta pick my battles here: I can’t teach you all of the intricacies of autoparameterization and trivial optimization inside one blog post.

Let’s set that aside, and keep going by building a stored procedure.

Put that same query in a stored procedure, and magic happens.

Instead of passing the literals in directly from the app, let’s make a stored procedure. Making a proc isn’t the only way of fixing this problem by any means, but it’s just a way of fixing it:
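A sketch of that proc, with parameter types sized to match the table’s columns (the proc name is arbitrary):

CREATE OR ALTER PROCEDURE dbo.usp_UsersByLocationDisplayName
    @Location    NVARCHAR(100),
    @DisplayName NVARCHAR(40)
AS
BEGIN
    SELECT *
      FROM dbo.Users_partitioned
     WHERE Location = @Location
       AND DisplayName = @DisplayName;
END
GO

EXEC dbo.usp_UsersByLocationDisplayName
     @Location = N'San Diego', @DisplayName = N'Brent';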

Now the logical reads tell a different story:

And while the actual execution plan looks the same at first, showing a clustered index seek:

Look at the clustered index seek’s properties, and it shows that we only accessed 1 partition (partition count), partition #20:

And it isn’t because this query got full optimization, either!

It’s still trivial. That’s kinda wild.

The morals of the story are NOT:

  • “You should put every query in a stored proc”
  • “You should use variables”
  • “You should add complexity to your queries to make them get full optimization”

The morals of the story ARE:

  • Just because you see a clustered index seek on a partitioned table doesn’t mean it really seeked into one specific area.
  • Partition elimination investigation requires looking at partitioned tables in a query plan and checking to see how many partitions were accessed.
  • Just because you should get partition elimination doesn’t mean you will.
  • Just because you do get partition elimination in a stored procedure doesn’t mean you will get it everywhere that you run a similar query.
  • Partitioning really isn’t a query performance feature: it’s a maintenance feature, making data loading & index maintenance easier, especially for columnstore tables. This whole article just plain ol’ wouldn’t have mattered on a non-partitioned index at all: we would have gotten a plain ol’ clustered index seek, less logical reads, and faster plan compilations.
  • You can get faster query performance with table partitioning – but in most of the shops where I see it, it’s making query performance worse, and we have to put additional work in just to get equivalent query performance that a non-partitioned table would get.

I said you couldn’t flame me for the partition design on this table, but…you’re totally allowed to flame me for that last bullet point. That’s fair game.


When Do I Need to Use DESC in Indexes?

Indexing
4 Comments

If I take the Users table from any Stack Overflow database, put an index on Reputation, and write a query to find the top 100 users sorted by reputation, descending:
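Something like this (the index name is arbitrary):

CREATE INDEX IX_Reputation ON dbo.Users (Reputation);
GO
SELECT TOP 100 *
  FROM dbo.Users
 ORDER BY Reputation DESC;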

It doesn’t matter whether the index is sorted ascending or descending. SQL Server goes to the end of the index and starts scanning backwards:

If you right-click on the Index Scan and go into Properties, you can see that the data is ordered, and SQL Server is scanning backwards:

You don’t need a separate descending index for that.

But if you sort multiple fields ASC/DESC/ASC, it gets complicated.

Say we’re looking for the highest-reputation people who live in London, sorted by reputation descending, and in the event of a tie, we want them listed in alphabetical order. Here’s an index we might build, plus the query:
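Here’s one way that index and query could look – Location first so we can seek to London, then Reputation and DisplayName (names and the TOP value are illustrative):

CREATE INDEX IX_Location_Reputation_DisplayName
    ON dbo.Users (Location, Reputation, DisplayName);
GO
SELECT TOP 100 *
  FROM dbo.Users
 WHERE Location = N'London'
 ORDER BY Reputation DESC, DisplayName ASC;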

In true Clippy style, SQL Server is recommending an index on Location – but with the rest of the columns just included in the index, not even sorted. Good times.

Ah, Clippy, good times. Stay in school, buddy.

We’re getting the sort because our data is kinda sorted, but not sorted enough, and if you hover your mouse over the Sort and look at the bottom of the tooltip, you’ll see that SQL Server is sorting by Reputation descending, DisplayName ascending.

To understand why, think about how the data is arranged when we seek to London, go to the highest reputation, and start reading backwards. Here’s a visualization query to see what’s in the index:
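A simple way to eyeball the index contents is to select just the index’s key columns in key order, hinting the index we built above:

SELECT Location, Reputation, DisplayName
  FROM dbo.Users WITH (INDEX (IX_Location_Reputation_DisplayName))
 WHERE Location = N'London'
 ORDER BY Location, Reputation, DisplayName;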

To simulate a backwards scan, go to the end of the result set and start reading upwards. At first, it looks like the data is perfectly sorted, but as you continue to scan backwards and start hitting ties, we have a problem:

If you’re reading from the bottom up:

  • Row 6752: you read the first 1140, tsvallender
  • Row 6751: you read another 1140, and you realize that the data’s not in order
  • You could in theory now jump back down and re-read 6752, and now you have the 1140s in order, but…how do you know that 6751 was the last 1140? You’d have to look up at row 6750
  • Row 6750: you read this, and it’s the first 1138, but
  • Row 6749: he’s also 1138, so you have to keep reading upwards, and…

That’s gonna get old. It’s too much random jumping around, and it’s not a scan anymore, so rather than doing that dancing during the reading, SQL Server just says, “I’m gonna add a sort to the execution plan because the data isn’t ordered the way I need it to be ordered.”

We could fix that with a DESC index.

But it can’t just be EVERYTHING descending. The sort order has to match our query’s order, like this:
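Only Reputation flips to descending, because that’s the only key our ORDER BY reads descending (the index name is arbitrary):

CREATE INDEX IX_Location_Reputation_DESC_DisplayName
    ON dbo.Users (Location, Reputation DESC, DisplayName ASC);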

So now our execution plan doesn’t have a sort or a memory grant:

Thing is, though, I almost never need to use this trick. Most of the time, the sort in the query plan just isn’t that expensive – like in this case, if you repeatedly compare the two queries, we’re talking about very small differences in memory grants and CPU consumption. The difference grows as the volume of sorted data grows, like if we’re talking about bringing back millions of rows, or if the query frequency grows, like if we’re running the query thousands of times per second.

Want to learn more tricks like this?

If you like tricks like this, you’ll love my Mastering Index Tuning class. The next one starts December 8th (iCal), and after that, Feb 12-14 (iCal.)

Folks with a Live Class Season Pass are welcome to drop in anytime, or just watch the Instant Replays on a schedule that works for them. Just head to my current training page, check out the schedule at the bottom of the page, and grab the calendar files for the classes you’re interested in. You don’t have to register ahead of time – just drop in anytime I’m streaming.

See you in class!


How to Set & Get the Next ID Without Serializable Isolation

Say for some reason you can’t use identity columns or sequences, or maybe your application was built a long, long time ago – and you’ve got a table with a list of IDs to use next:
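Something along these lines – the table and column names are illustrative:

CREATE TABLE dbo.Ids
(
    TableName NVARCHAR(128) NOT NULL PRIMARY KEY CLUSTERED,
    CurrentId BIGINT NOT NULL
);
GO
INSERT INTO dbo.Ids (TableName, CurrentId)
VALUES (N'Orders', 1), (N'OrderLines', 1);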

Whenever your application wants to insert a row, it needs to do two things: it needs to grab an ID for that table, and it needs to increment the CurrentID by one. This is a common design pattern I see with older applications that need to get an ID from the database, but then do some processing on the application side. For example, they want to reserve an OrderID, and then in the application code, they build a list of insert statements for not just the Order, but the line item tables as well.

One way to code this would be to use serializable isolation while you work, holding a transaction so that nobody can change the table while you’re working in it:
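A sketch of the serializable approach, simplified and without the error handling mentioned below (the proc name is arbitrary):

CREATE OR ALTER PROCEDURE dbo.usp_GetNextId
    @TableName NVARCHAR(128),
    @NextId    BIGINT OUTPUT
AS
BEGIN
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    BEGIN TRAN;

        /* Grab the current value... */
        SELECT @NextId = CurrentId
          FROM dbo.Ids
         WHERE TableName = @TableName;

        /* ...then increment it while nobody else can touch the row */
        UPDATE dbo.Ids
           SET CurrentId = CurrentId + 1
         WHERE TableName = @TableName;

    COMMIT;
END
GO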

In case you haven’t seen output variables before for procs, here’s how you use them:
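Calling the sketch above with an output variable looks like this:

DECLARE @NextOrderId BIGINT;

EXEC dbo.usp_GetNextId
     @TableName = N'Orders',
     @NextId    = @NextOrderId OUTPUT;

SELECT @NextOrderId AS NextOrderId;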

This stored procedure works, but it doesn’t scale well when you get into thousands of inserts per second, especially scattered across lots of tables. (And I’m simplifying here: in a real-world scenario, this kind of stored procedure would have some error checking built into it as well, especially given the blocking scenarios you can hit with serializable isolation levels.)

Here’s a quick improvement that doesn’t require ripping out the stored procedure and switching to identity columns or sequences:
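A sketch of the single-statement version, assigning the variable and incrementing the row in the same update:

CREATE OR ALTER PROCEDURE dbo.usp_GetNextId
    @TableName NVARCHAR(128),
    @NextId    BIGINT OUTPUT
AS
BEGIN
    /* Grab the current value and increment it in one atomic statement:
       the right-hand column references see the pre-update value */
    UPDATE dbo.Ids
       SET @NextId   = CurrentId,
           CurrentId = CurrentId + 1
     WHERE TableName = @TableName;
END
GO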

This leverages the fact that you can both update data AND set variables during an update statement. With this trick, I don’t need to touch the Ids table multiple times, which means I don’t need a transaction, which means I don’t need serializable. Suddenly, this really opens the floodgates on concurrency with this table.

I still like identity columns & sequences better, though.

Want to learn more tricks like this?

If you like tricks like this, you’ll love my Mastering Query Tuning class. I’ve got two upcoming sessions: December 11-13 (Fri/Sat/Sun, iCal) and January 12-14 (Tues/Weds/Thurs- iCal.)

If you’ve got a Live Class Season Pass, you can drop in on this class or any of my live online classes at any time. Just head to my current training page, check out the schedule at the bottom of the page, and grab the calendar files for the classes you’re interested in. You don’t have to register ahead of time – just drop in anytime I’m streaming.


How to Insert Rows and Get Their Identity Values with the OUTPUT Clause

T-SQL
20 Comments

Say you’ve got a two-step process where you’re:

  1. Inserting rows into a table that has an identity column, then
  2. Querying that table to figure out what identities you got

There’s a faster way that doesn’t require hitting the table twice: the OUTPUT clause.

I’ll demonstrate it with the Badges table from the Stack Overflow database, which has an Id column that’s an identity. I’m going to find all of the Users who live in Iceland, give them a badge, and then return the list of badge IDs I just granted:
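Something along these lines – the badge name and the date-window filter are made up for the demo, which is exactly the kind of guesswork this pattern forces on you:

INSERT INTO dbo.Badges (Name, UserId, Date)
SELECT N'Iceland Resident', u.Id, GETUTCDATE()
  FROM dbo.Users u
 WHERE u.Location = N'Iceland';

/* Step 2: go back and try to guess which rows we just inserted */
SELECT b.Id
  FROM dbo.Badges b
 WHERE b.Name = N'Iceland Resident'
   AND b.Date >= DATEADD(MINUTE, -5, GETUTCDATE());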

This code pattern is kinda painful because:

  • We lock rows twice (when we read & insert, and then again when we read back what we just did)
  • We have to be careful to only fetch the rows we really just inserted – so we end up putting in all kinds of convoluted logic to work around concurrency problems

Instead, use the OUTPUT clause.

Here’s how to do it without touching the Badges & Users tables twice:
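A sketch of the same work in a single statement, with OUTPUT handing back the new identity values:

INSERT INTO dbo.Badges (Name, UserId, Date)
OUTPUT INSERTED.Id
SELECT N'Iceland Resident', u.Id, GETUTCDATE()
  FROM dbo.Users u
 WHERE u.Location = N'Iceland';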

The OUTPUT clause is kinda like the virtual INSERTED/DELETED tables: it lets you grab the output of what you’re doing and redirect it to another place. Presto, less locking, less T-SQL to manage, less guessing as to which rows were affected by your operation.

Isn’t that slick? In my own line of work, I sure don’t need to use it often, but when I do, it’s amazing.

Want to learn more tricks like this?

If you like tricks like this, you’ll love my Mastering Query Tuning class. I’ve got two upcoming sessions: December 11-13 (Fri/Sat/Sun, iCal) and January 12-14 (Tues/Weds/Thurs- iCal.)

If you’ve got a Live Class Season Pass, you can drop in on this class or any of my live online classes at any time. Just head to my current training page, check out the schedule at the bottom of the page, and grab the calendar files for the classes you’re interested in. You don’t have to register ahead of time – just drop in anytime I’m streaming.


Finding the One Query to Tune in a Multi-Query Batch

Execution Plans
5 Comments

When you have a stored procedure or application query that has multiple queries inside of it, how can you tell which query to focus on?

For starters, do NOT believe the percentages you see in execution plans.

Let’s take a classic performance tuning issue: scalar functions. I’ll start with the Stack Overflow database – if you want to play at home, best to use the smallest database you can because this query performs terribly – then create a function and a stored procedure:

Then I’ll call the stored procedure, which takes 30 seconds even on the smallest Stack Overflow database:

And then use sp_BlitzCache to ask which queries ran the longest:
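For example, sorted by duration:

EXEC sp_BlitzCache @SortOrder = 'duration';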

I’ve rearranged the sp_BlitzCache output columns in this screenshot to tell the story better:

In the result sets:

  1. The stored procedure itself shows up first with a total duration of 33,365 milliseconds
  2. One specific statement in the proc took 33,221 milliseconds of the above
  3. Another statement took just 143 milliseconds

When I click on any query plan in the “Query Plan” column, I’m going to get the query plan for the entire batch, not that specific statement. It doesn’t matter whether I click on the plan in line 1, 2, or 3 – I’m going to get the entire stored proc’s plan. That gets a little confusing because now, looking at the plan, we have to figure out which query took 33 seconds, because that’s the one we need to tune. Can you guess which one it is?

I’ll zoom in a little to help:

You might guess that it’s Query 1.

It has the biggest arrows and shows 100% of the cost.

Except that’s wrong.

That’s 100% of the estimated cost, calculated before the query executed.

Take another look at the sp_BlitzCache output, and pay particular attention to the Cost column:

The top query is the stored procedure, which took 33.3 seconds in total. Its total cost is so large that SQL Server resorted to scientific notation. The second query, which took the vast majority of the time, only cost 726 query bucks – way, way less than line 3.

Hover your mouse over the root node in each plan (in this case, the selects) and look at their costs. The one that cost 726 query bucks is actually Query 2, not Query 1:

To find the right query to tune,
listen to sp_BlitzCache.

When you’re looking at sp_BlitzCache’s output for multi-statement queries, pay particular attention to the “Cost” column.

Don’t look at the highest cost – instead, just look at the statement that ranks first in sp_BlitzCache’s output, then make a note of its cost. When you open up the query’s execution plan, look for that statement & that cost number – and that’s the real query that you need to focus on tuning. It’s often very different than the one with the highest estimated cost.

People who liked this post also liked Erik Darling’s technique for getting specific query plans from long stored procedures. He edits the proc to turn on SET STATISTICS XML ON right before the specific query runs, then turns it back off again immediately afterwards. Slick!

Want to learn more query tuning tricks?

You’ll love my 3-day Mastering Query Tuning class. I give you live, running workloads hitting the Stack Overflow database, and you have to investigate & solve the issues using tricks I taught you in class. The upcoming rotations are:

And you can join in live anytime if you have a Live Class Season Pass, or watch the Instant Replays.


What It Takes To Write Two Blog Posts

Blogging, Videos
2 Comments

This week, I published two blog posts:

Let’s talk about the process of writing ’em.

While I was putting together this week’s First Responder Kit release, I realized sp_BlitzIndex didn’t have URLs for a couple of common families of problems: bad statistics and scalar user-defined functions. I made myself a couple of Github issues (#2670 and #2671) to track the documentation work I needed to do, and I decided to live stream it this past Saturday so I could show y’all what my blogging process looks like.

In this two-hour session, I walk you through writing those two posts:

In that session, here are some of the things I talk about:

  • Your blog has two kinds of readers: your regulars and people who just found this one specific page via a Google search. These two posts were specifically crafted for the latter. Sure, my regular readers would consume the info, but they wouldn’t be raving about how awesome the posts are.
  • Scoping a post is hard: it’s hard to limit yourself to just writing about specific parts of an issue. It’s really tempting to just let the writing flow, and then next thing you know you’ve lost an entire day and you’re nowhere near what you’d consider “finished.” I try to scope my posts with the clock: how much can I actually cover in an hour? Sometimes I write for a very junior-level reader (as Nick says on the stream, someone who’s on chapter 1) and sometimes I write for an expert-level reader (and in that case, I don’t cover a lot of the prerequisites.)
  • Anytime you feel guilty for not covering more scope in the post, remember that you can finish the post with a list of recommended reading for the reader to continue their learning journey.
  • Writing takes up time. I wrote these posts on a Saturday morning, and about 75 minutes in, I get a text from my wife, making me write with a little bit more urgency. That’s a good reminder that the time you put into blogging needs to pay off somehow – whether it’s in the form of improved mental health for you, or a sense of reward for helping others, or literally making you more money. I’m a consultant and trainer, so the blog posts and videos are effectively marketing material for my “day job.” That makes it easier to put in work because I hopefully see a return on it later.

If you want to learn more about the process of writing to forward your career, check out my 2-hour conference session, 500-Level Guide to Career Internals.


When You’re Troubleshooting Blocking, Look at Query #2, Too.

When I’m troubleshooting a blocking emergency, the culprit is usually the query at the head of a blocking chain. Somebody did something ill-advised like starting a transaction and then locking a whole bunch of tables.

But sometimes, the lead blocker isn’t the real problem. It’s query #2.

Here’s a sample scenario:

  1. A long-running select with nolock starts, taking out a schema stability lock: nobody can change the structure of the table while he runs
  2. An online index operation needs to finish up, which needs a schema modification lock on that same table in order to switch in the newly built index – but he can’t do that until select #1 finishes
  3. Other selects with nolock start – but they need a schema stability lock, which they can’t get until #2 finishes

Most monitoring tools will say that #1 is the lead blocker, and they’re technically correct. However, it’s really confusing for users because they look at the monitoring tool and ask questions like:

  • “How can that #1 query possibly block someone else?”
  • “What could I even do to reduce the impact of select #1?”
  • “How are a bunch of selects with nolock being blocked by another select with nolock?”

Reproducing the problem with an “online” index rebuild

Let’s reproduce it with the Stack Overflow database. I’m going to put a tiny index on the Posts table:
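Something tiny and filtered, for example – the filter is arbitrary; it just needs to cover very few rows so the rebuild itself would normally be instant:

CREATE INDEX IX_Score_filtered
    ON dbo.Posts (Score)
    WHERE Score >= 25000;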

That index only has a few rows in it, so it’ll be quick to rebuild – and the rebuild is what I’m going to use to illustrate the problem.

Query #1: To start our blocking chain, I kick off a long-running select with nolock:
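Any long-running read against dbo.Posts will do; for example, a leading-wildcard search that has to scan the whole table:

SELECT COUNT(*)
  FROM dbo.Posts WITH (NOLOCK)
 WHERE Body LIKE N'%slow%';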

Query #2: I kick off an online index rebuild on that tiny filtered index – which would normally happen instantly, but it needs a schema modification lock to switch in the new index:
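The rebuild itself, against the filtered index sketched above:

ALTER INDEX IX_Score_filtered
    ON dbo.Posts
    REBUILD WITH (ONLINE = ON);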

That query is blocked by #1, which wouldn’t be a big deal, but now…

Queries #3: I fire off Query #1 again, but this time in SQLQueryStress so I can throw a lot of sessions at it at once:

They’re all blocked.

The problem is easy to diagnose with sp_WhoIsActive.

Here’s what the situation looks like in sp_WhoIsActive @find_block_leaders = 1, which helpfully organizes the blocking chains in a row:

The lead blocker looks like a select with nolock – but if you’re going to troubleshoot something, that’s not the query to troubleshoot. He’s fairly harmless. Instead, you have to go down a level to figure out who’s blocking the rest. I just love this approach.

It's the one between the aubergine query and the eggplant query.
See the index rebuild?

But the reason we’re gathered here today, dear congregation, is because I had to troubleshoot this exact issue for a client. They were seeing strange blocking problems that didn’t make sense: all day long, random queries that never should have been a lead blocker were suddenly blocking dozens or hundreds of other queries out of nowhere.

Their monitoring tool just listed all of the waiting statements in a pile – without distinguishing query #2, which was the real problem. The client didn’t realize there was a hidden “online” index operation in that colorful mess. That was the real culprit, but the monitoring tool only showed query #1 as the lead blocker – and since query #1 was different every time the index rebuild would sneak in, it was nearly impossible to troubleshoot.

In this case, SQL Server 2014 & newer has an easy solution.

SQL Server 2014 introduced a few new options for index rebuilds:

Now, when I rerun the same scenario, sp_WhoIsActive looks wildly different:

Only the “online” index rebuild is blocked, and he has a different wait type – he’s waiting at a low priority. He’s sitting by in the background, graciously letting other queries get in ahead of him.

If you use Ola Hallengren’s IndexOptimize script, use the parameters for WaitAtLowPriorityMaxDuration and WaitAtLowPriorityAbortAfterWait to configure these options.

Want to learn more about troubleshooting this stuff?

Check out my How I Use the First Responder Kit class. I walk you through using sp_BlitzWho, sp_BlitzFirst, sp_BlitzIndex, and many of the other scripts in the open source First Responder Kit.


How Scalar User-Defined Functions Slow Down Queries

T-SQL
6 Comments

When your query has a scalar user-defined function in it, SQL Server may not parallelize it and may hide the work that it’s doing in your execution plan.

To show it, I’ll run a simple query against the Users table in the Stack Overflow database.
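For example:

SELECT TOP 200 DisplayName, Location
  FROM dbo.Users
 ORDER BY Reputation DESC;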

I don’t have an index on Reputation, so SQL Server has to sort all of the Users by their Reputation. That’s a CPU-intensive operation, so SQL Server automatically parallelizes it across multiple CPU cores:

The racing stripes on the plan operators indicate that the operations went parallel. Another way to see that is by using SET STATISTICS TIME ON, which adds CPU time and execution time information to the Messages tab of SSMS:

See how CPU time is higher than elapsed time? That’s an indication that the query went parallel. Because work was distributed across multiple cores, SQL Server was able to get 969 milliseconds of work done in just 357 milliseconds.

But when we add a scalar user-defined function…

Say our users want to see their names and locations formatted a little more nicely. Instead of two columns that say “Brent Ozar” and “San Diego”, they want a single column that says “Brent Ozar from San Diego”. And I don’t want to put that concatenation logic all over the place in every query I write, so I encapsulate it in a function:
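A sketch of that function – the name and types are illustrative, and the COALESCE handles users with no Location:

CREATE OR ALTER FUNCTION dbo.fn_NameAndLocation
    (@DisplayName NVARCHAR(40), @Location NVARCHAR(100))
RETURNS NVARCHAR(150)
AS
BEGIN
    RETURN @DisplayName + COALESCE(N' from ' + @Location, N'');
END
GO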

The results are a little easier on the eyes:

Now, when I use that function inside the query, the query technically works fine:
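Same query as before, but calling the function sketched above:

SELECT TOP 200 dbo.fn_NameAndLocation(DisplayName, Location) AS UserDescription
  FROM dbo.Users
 ORDER BY Reputation DESC;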

But the execution plan is missing a little something:

And the statistics time output shows that it still needed a lot of CPU, but since it didn’t go parallel, it took longer on the clock:

If you dig deeply enough in the execution plan properties, SQL Server notes that it couldn’t build a valid parallel execution plan, but it doesn’t say why:

There are a few ways to fix this.

One way is to inline the contents of your function – literally copy/paste the scalar function’s contents directly into your query:
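The inlined version, with the function body from the sketch above pasted into the select list:

SELECT TOP 200 DisplayName + COALESCE(N' from ' + Location, N'') AS UserDescription
  FROM dbo.Users
 ORDER BY Reputation DESC;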

This query goes parallel, proving that the concatenation and coalescing isn’t what was stopping us from going parallel – it was the presence of a scalar user-defined function:

Statistics time output shows that we went parallel and ran faster overall:

 

Another way to fix it is to upgrade to SQL Server 2019 and set your database to 2019 compatibility level. One of 2019’s most ambitious features, code named Froid, aims to automatically inline scalar user-defined functions without you having to rewrite them.
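Switching the database into 2019 compatibility level (150) looks like this – swap in your own database name:

ALTER DATABASE StackOverflow SET COMPATIBILITY_LEVEL = 150;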

When I set my database into 2019 compat level, the query with the scalar function goes parallel again:

And statistics time output shows the performance improvement:

However, SQL Server 2019’s scalar function inlining comes with a huge number of drawbacks. Your query can actually go slower, or your scalar function may use features that SQL Server 2019 refuses to inline.

To fix this problem:

Want to learn more about troubleshooting these kinds of issues?

Check out my How I Use the First Responder Kit class. I walk you through using sp_BlitzWho, sp_BlitzFirst, sp_BlitzIndex, and many of the other scripts in the open source First Responder Kit.

If you’ve got a Live Class Season Pass, you can drop in on this class or any of my live online classes at any time. Just head to my current training page, check out the schedule at the bottom of the page, and grab the calendar files for the classes you’re interested in. You don’t have to register ahead of time – just drop in anytime I’m streaming.

Or, if you’ve got a Recorded Class Season Pass, you can hop in and watch the recordings in your account at any time.

See you in class!

