4 Things I Love About Integers #TSQL2sday

I love integers.

Furthermore, I think we’ve all been taking them for granted.

Integers are clean, elegant datatypes that do exactly what they’re supposed to. They’re the Golden Retrievers of the database world: they have a few shortcomings, but so, so few. And so for this month’s T-SQL Tuesday, I wanna share my love for integers with you.

1. Integers have clear contents right from the start.

You can declare an integer variable or column datatype, then set it to a string – and it’ll work great as long as the string can be converted into a number:
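
A quick sketch of what I mean (the variable name is just for illustration):

    DECLARE @FavoriteNumber INT = '42';  -- a string, but it converts cleanly
    SELECT @FavoriteNumber;              -- returns 42, no drama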

But if you try to stuff something into an integer that doesn’t belong, you get a clear error right away:
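
Something like this fails immediately:

    DECLARE @FavoriteNumber INT = 'Brent';
    -- Msg 245: Conversion failed when converting the varchar value 'Brent'
    -- to data type int.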

I can guarantee that the only thing in there is gonna be whole numbers, and that I won’t have to play any of the finicky games involved with floating point math or someone trying to store dashes in there to format their social security numbers. If you wanna do formatting on integers, you can – there are functions to help – but it’s usually pretty clear that any formatting should be done on the application side.

2. Integers rarely have implicit conversion issues.

If someone passes in a string, or even a unicode string, and tries to do an equality search on a number field, SQL Server is smart enough to convert the string to an integer first. SQL Server even leverages statistics to estimate how many rows the converted number will match!
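
For example, here’s a sketch against the Stack Overflow database’s Users table, whose Id column is an integer:

    SELECT DisplayName
      FROM dbo.Users
      WHERE Id = N'22656';  -- unicode string parameter, integer column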

The resulting execution plan is beautiful because SQL Server converted the parameter, not the contents of the table, and accurately estimated the number of rows:

3. Integers are small – and compress down even smaller.

Integers take up just four bytes of space, and if a few billion values isn’t enough for you, their big brother BIGINTs take up just eight bytes. Eight bytes! I see people constantly slapping NVARCHAR(100) all over the place, storing un-normalized descriptions in row after row after row, just flushing space down the drain when they could be linking to a table with the handful of descriptions that they always use.

Need to save space? Row compression is a piece of cake to implement, and even BIGINT columns are just automatically stored in the smallest possible number of bytes. Row compression makes it easy to justify using BIGINTs as identity columns on tables where you’re worried about a large number of rows in the future because they just don’t take any extra space to store.

Got a lot of repeated values, and want even more compression? Enable page compression.
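
Both are one-liners to turn on – a sketch against the Users table:

    ALTER TABLE dbo.Users REBUILD WITH (DATA_COMPRESSION = ROW);
    -- or, when you've got lots of repeated values:
    ALTER INDEX ALL ON dbo.Users REBUILD WITH (DATA_COMPRESSION = PAGE);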

And when it’s time to build fact tables for reporting systems and data warehouses, buckle up: columnstore indexes bring really bananas levels of compression, like 70% or higher with repeated integers like order line item quantities, prices, or part numbers.
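
The syntax is refreshingly small, too – dbo.FactOrderItems here is a hypothetical fact table, not something from the Stack Overflow database:

    CREATE CLUSTERED COLUMNSTORE INDEX CCI_FactOrderItems
        ON dbo.FactOrderItems;  -- made-up table name, for illustration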

4. Integers even work in URLs.

If you’ve surfed Stack Overflow, you might have noticed how the URLs have integers in them, like this:
https://stackoverflow.com/questions/7395915/does-inserting-data-into-sql-server-lock-the-whole-table/

Well, you know what’s really cool? Lop off everything after the numbers, like this:
https://stackoverflow.com/questions/7395915

And go to that URL:

Stack Overflow automatically redirects you to the full URL. Wanna know how they do it? Everything after the number is extraneous, because the number corresponds to the actual ID they’re storing in the Posts table! You can see the database version of the question you’re looking at by running a query:
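
In the Stack Overflow database export, that’s just:

    SELECT *
      FROM dbo.Posts
      WHERE Id = 7395915;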

And presto:

Try doing that with your fancypants strings, bucko, and welcome to URL encoding hell.


Download the Current Stack Overflow Database for Free (2021-02)

Stack Overflow, the place where most of your production code comes from, publicly exports their data every couple/few months. @TarynPivots (their DBA) tweets about it, and then I pull some levers and import the XML data dump into SQL Server format.

Stack Overflow’s database makes for great blog post examples because it’s real-world data: real data distributions, lots of different data types, easy to understand tables, simple joins. Some of the tables include:

  • Users – now up over 14 million rows
  • Posts – over 52 million rows’ worth of questions & answers, 143GB in just the clustered index alone
  • Votes – over 208 million rows, making for fun calculations and grouping demos

This isn’t the exact same data structure as Stack Overflow’s current database – they’ve changed their own database over the years, but they still provide the data dump in the same style as the original site’s database, so your demo queries still work over time. If you’d like to find demo queries or find inspiration on queries to write, check out Data.StackExchange.com, a public query repository.

I distribute the database over BitTorrent because it’s so large. To get it, open the torrent file or magnet URL in your preferred BitTorrent client, and the 53GB download will start. After that finishes, you can extract it with 7Zip to get the SQL Server 2016 database. It’s 4 data files and 1 log file, adding up to a ~411GB database.

Want a smaller version to play around with?

  • Small: 10GB database as of 2010: 1GB direct download, or torrent or magnet. Expands to a ~10GB database called StackOverflow2010 with data from the years 2008 to 2010. If all you need is a quick, easy, friendly database for demos, and to follow along with code samples here on the blog, this is probably all you need.
  • Medium: 50GB database as of 2013: 10GB direct download, or torrent or magnet. Expands to a ~50GB database called StackOverflow2013 with data from 2008 to 2013. I use this in my Fundamentals classes because it’s big enough that slow queries will actually be kinda slow.
  • For my training classes: specialized copy as of 2018/06: 47GB torrent (magnet). Expands to a ~180GB SQL Server 2016 database with queries and indexes specific to my training classes. Because it’s so large, I only distribute it with BitTorrent, not direct download links.

If you only have a limited amount of bandwidth, you don’t have to keep seeding the database after you get it – I’ve got it hosted on a handful of seedboxes around the world.

As with the original data dump, these are provided under cc-by-sa 4.0 license. That means you are free to share it and adapt it for any purpose, even commercially, but you must attribute it to the original authors (not me).

Happy demoing!


It’s March. We’re still quarantined. Let’s do some free classes for Europeans.

Last year, during the first few months of quarantines, I ran free training classes to help y’all pass the time and raise your skills.

It’s March 2021, and so many of us around the world are still stuck indoors, not going into work, not seeing our friends, dealing with mask breath.

So let’s do another round of free live classes – this time around, in Europe-friendly times, 8:00-16:00 UTC.

No registration required – just do your prerequisites, then show up at www.BrentOzar.com/training/live/ a few minutes before the class starts and join in on the fun.

The classes will be recorded, but the Instant Replay recordings will only be available to folks who bought the classes or my Recorded Class Season Pass.

You might be asking, “Brent, how can you keep taunting Europeans about things they can’t buy? Why bother doing classes in European times if the Europeans can’t get the recordings?”

And I’ve got some very good news about that. Stay tuned!


#TSQL2sday 136 Invitation: Blog About Your Favorite Data Type (Or Least Favorite)

T-SQL Tuesday is a monthly blogothon where we get together and write about a different topic. I’m hosting this month’s episode.

Your mission: write a blog post about your favorite data type, and schedule it for next Tuesday, March 9.

Mad about money? Feverish about float? Tell us what you use your favorite data type for, and what people need to know before they get started using it. Data types have so many interesting edge cases, like when Aaron Bertrand taught us to use lower case for our data types. (I still don’t do that, but for the record, I do feel guilty.)

The rules:

  • Your blog post has to go live on Tuesday, March 9
  • Include the T-SQL Tuesday logo, and if somebody clicks on the logo, it has to link back to this post
  • Include a link to this blog post, or after you publish it, leave a comment here so I can track you down for the roundup blog post next week
  • If you want to host a future T-SQL Tuesday episode, tweet @way0utwest or leave a comment on this post

Don’t worry about somebody else picking the same data type – this is the beauty of blogging. Readers are here for your unique take on things, in your voice. And don’t worry about repeating things that are in the documentation – nobody reads that. You could seriously copy/paste the contents of the documentation and you’d get comments saying, “Whoa, I never knew that.”


Free Webcast: Planning a SQL Server to Azure Migration in 2021

You manage database servers for your company, and they’re thinking about migrating everything to Microsoft Azure. You’ve searched the web for information and advice, but everything you’re finding is outdated.

You just want a list of things to review on your servers, an overview of what won’t work in the cloud, and how to pick alternatives or workarounds.

In this one-hour session on Thursday, March 18, we’ll cover pure technical details, zero marketing, and get you up to speed quickly with none of the fluff.

Register now for the free webcast. If you can’t be there live, no worries – links to the recordings will be sent to the registrants. And if you’ve got questions you’d like to see me touch on during the webcast, you can get a jump on the Q&A by asking ’em below in the comments.


New SQL ConstantCare® Feature: One Serious Query to Tune

Our monitoring tool, SQL ConstantCare®, is different.

Some monitoring tools send you a whole bunch of emails for every little thing that’s happening on every one of your servers – and you end up just creating an Outlook rule to dump them all into a folder that you never read. PAGE LIFE EXPECTANCY IS LOW! DISK QUEUE LENGTH IS HIGH! It all just turns into noise.

SQL ConstantCare® is minimal: just one email per day. We tell you when there’s something seriously wrong that demands attention.

So in the interest of minimalism, we’ve got a new feature: One Serious Query to Tune. When >50% of your SQL Server’s workload is caused by just one query, we tell you about it. We know you’re too busy to tune every query or index that could help performance, but in cases like this, we call out that there’s a serious issue that will probably be worth your time to fix.

You might be thinking, “Brent, how often does that really happen?”

Well, last week alone, it happened for 328 of our users!

Sign up now and get a free 14-day trial. You can cancel anytime without charge before your trial ends. What will we discover together about your servers?


Updated First Responder Kit and Consultant Toolkit for February 2021

Along with the usual bug fixes & improvements, this month’s release includes a brand new sp_BlitzAnalysis script from Adrian Buckman. It’s for folks who have an Agent job that logs sp_BlitzFirst to tables every 15 minutes, saving wait stats, Perfmon counters, file stats, etc. sp_BlitzAnalysis reads those tables and sums up activity in your chosen time period, like this:

To install it, run the sp_BlitzAnalysis.sql file included in the First Responder Kit. It’s not a complex installation or anything – I just didn’t want to include it in the main installer scripts because this is the first public release, and I haven’t tested it myself yet either. (Been a little busy with the whole Iceland move.) To learn how to use sp_BlitzAnalysis, read the documentation. For questions or support on it, hop into the #FirstResponderKit Slack channel. (If you need a free invite, hit SQLslack.com. Be patient – it’s staffed with volunteers who have day jobs.)

When you find a bug or want something changed, read the contributing.md file.

To get the new version:

Consultant Toolkit Changes

I updated it to this month’s First Responder Kit, but no changes to querymanifest.json or the spreadsheet. If you’ve customized those, no changes are necessary this month: just copy your spreadsheet and querymanifest.json into the new release’s folder.

sp_Blitz Changes

  • Fix: the check for the most recent CHECKDB dates was running even when @CheckUserDatabaseObjects = 0. This was the only instance where we were using sp_MSforeachdb with @CheckUserDatabaseObjects = 0, so it’d slow sp_Blitz down on instances with hundreds or thousands of databases. (#2779)
  • Fix: put comments around the term xp_cmdshell to fool a firewall. (#2788, thanks TLovie.)
  • Fix: changed typo of BrenOzar.com to BrentOzar.com on one check. (#2786, thanks Michael Gowan.)

sp_BlitzAnalysis Changes

  • All new! This is the first release that includes sp_BlitzAnalysis by Adrian Buckman. If you want to install this script, install sp_BlitzAnalysis.sql – it’s not included in the normal installation scripts yet just to limit the blast radius in case something goes wrong.
  • For tips on how to use it, check out the documentation.
  • For questions or problems, see the “For Support” section of this doc.

sp_BlitzCache Changes

  • Improvement: default output now includes PlanGenerationNum from sys.dm_exec_query_stats. Helps troubleshoot scenarios where the plan is frequently recompiling due to statistics changes. (#2792, thanks Tom Lovie.)
  • Fix: a recent release broke @SortOrder = ‘recent compilations’, and that’s fixed. (#2772)

sp_BlitzFirst Changes

  • Improvement: if more than 20% of the queries are runnable, waiting on CPU power to make progress, we warn you as shown above. sp_BlitzFirst makes two passes through to check – once at the beginning of the sample defined with @Seconds (defaults to 5), and again at the end. (#2776, thanks Erik Darling.)
  • Fix: if you passed in @OutputTableRetentionDays because you wanted to store more or less history in the output tables, that change wasn’t being passed to sp_BlitzWho, which was still just storing 3 days. (#2758, thanks Emanuele Mezzo.)
  • Fix: the high compilations/sec and recompilations/sec warnings had way too high of thresholds. They were looking for 1,000 batch requests per sec, but you can see high CPU usage due to compilations with even just 10 big ugly plans per second getting generated. Lowered the threshold to either 10 compilations/sec, or more compilations than batch requests per second – which happens when a proc has multiple statements with recompile hints in them. (#2770)

sp_BlitzIndex Changes

  • Improvement: duplicate indexes are now sorted by row count descending so that the big bang-for-the-buck stuff comes first. (#2762, thanks Todd Chittenden.)
  • Improvement: Aggressive Indexes warnings are now sorted by total lock wait time descending. (#2768)
  • Fix: a few releases back, I added support for SQL Server 2019’s new missing index DMV that tells you which queries are triggering the request. I had some challenges getting it to work reliably, and Microsoft just now documented it so I’m removing support for it temporarily. It’ll be back. If you want to help with coding for it, check out #2185. (#2780)

sp_AllNightLog Changes

  • Fix: the RPO and RTO now honor overrides set in the config table. (#2775, thanks Alin Selicean.)

sp_ineachdb Changes

  • Improvement: Azure SQL DB compatibility in the sense that it’ll run on Azure SQL DB, but it can’t actually change databases since Azure SQL DB doesn’t allow that. I’m referring to Azure SQL DB “Classic Flavor” here, not Managed Instances. (#2790)

For Support

When you have questions about how the tools work, talk with the community in the #FirstResponderKit Slack channel. If you need a free invite, hit SQLslack.com. Be patient – it’s staffed with volunteers who have day jobs.

When you find a bug or want something changed, read the contributing.md file.

When you have a question about what the scripts found, first make sure you read the “More Details” URL for any warning you find. We put a lot of work into documentation, and we wouldn’t want someone to yell at you to go read the fine manual. After that, when you’ve still got questions about how something works in SQL Server, post a question at DBA.StackExchange.com and the community (that includes me!) will help. Include exact errors and any applicable screenshots, your SQL Server version number (including the build #), and the version of the tool you’re working with.


How to Trace a Nested Stored Procedure Without Using Profiler

You’ve got a stored procedure that calls other stored procedures that build dynamic SQL that call other functions, and there’s so much calling going on that you would like to phone a friend.

Your query plans look like piles of spaghetti, and when you zoom in, you just see more tiny piles of tiny spaghetti.

Finding the slow parts is really easy thanks to sp_HumanEvents by Erik Darling (blog, @ErikDarlingData). Download it, install it in your master database, and you’ll be ready to go.

In one window, get ready to run the nasty stored procedure. Make a note of which session you’re in – it shows up at the bottom of your SSMS window like this:

See the “SQL2019\Brent (60)”? That number in parentheses (60) is my session ID.

In another window, fire up sp_HumanEvents:
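
The call looks something like this – I’m going from memory on the parameter names, so double-check them against Erik’s documentation:

    EXEC dbo.sp_HumanEvents
        @event_type = 'query',      -- capture completed queries
        @query_duration_ms = 1000,  -- only queries slower than 1 second
        @session_id = 60,           -- the session number from the other window
        @seconds_sample = 30;       -- watch for 30 seconds, then report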

The parameter names & values are pretty self-explanatory there. After 30 seconds, you’ll get a nice rundown of what queries ran, how long they took, their execution plans, and much more.

You could do this same thing with Profiler, but it’s more painful to set up, has more overhead, and doesn’t give you the nice, clean, filtered result set. You could also do this same thing by rolling your own Extended Events session, but seriously, who’s got time for that? Erik’s done all the hard work for you.

If you like this, you’ll love how easy it is to find queries using option recompile.


Changing Statistics Cause Longer Compilation Times

I need to be up front with you, dear reader, and tell you that you’re probably never going to need to know this.

I try to blog about stuff people need to know to get their job done – things that will be genuinely useful in your day-to-day performance tuning and management of SQL Server. This, however, is not one of those things. This is fun, obscure trivia, documenting a situation you’ll probably never run into, prompted by Forrest’s post proving that compilation time includes more than parse time.

Start with a hard-to-compile query plan.

I’ll use any Stack Overflow database and the hard-to-compile query plan from my post, Bad Idea Jeans: Building Big Query Plans, modified to join to the Users table:

Run the query, and check the Messages tab:

That complex query plan is hard to build, so it took some time:

The table variable doesn’t have any rows in it, though, so execution time is nearly instantaneous. The hard part here is clearly building the query plan.

But it gets much worse.
Try updating stats while it runs.

In one window, free the plan cache and start rebuilding the indexes on the Users table with online = on so the CTE query can execute:
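
A sketch of that window’s contents:

    DBCC FREEPROCCACHE;  -- wipe the plan cache (lab servers only, please)
    ALTER INDEX ALL ON dbo.Users REBUILD WITH (ONLINE = ON);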

Then in another window, run the CTE query again. Make a pot of coffee or hit the loo, because you’re gonna have some time. Several minutes later, both queries finish, and the output of the CTE’s statistics is pretty bizarre:

What’s happening? Every time the Users rebuild finishes, it’s also updating stats. The query compilation sees those updated stats, and…decides that the plan it just got done building is invalid, so it’d better go try to build another plan! Thus, the repeated messages about parse & compile time.

Yes, I actually hit this problem.

It sounds crazy, but to recreate a client issue, I was loading a 10 billion row table, which takes a while. As it was loading, its statistics were constantly changing, because as it grew, it was constantly hitting the automatic stats updates threshold.

As a result, simple queries with no where clause or order by could finish instantly, but add in even the simplest where clause, and the query would never finish. It was constantly recompiling as more data was loaded.

Shout out to Joe Obbish and Paul White who helped me troubleshoot this problem last year – all credit for solving the mystery goes to them.


What’s New & Undocumented in SQL Server 2019 Cumulative Update 9

Last week’s SQL Server 2019 Cumulative Update 9 snuck in some undocumented things.

We got two new DMVs for Always On Availability Groups, sys.dm_hadr_cached_replica_states and sys.dm_hadr_cached_database_replica_states. They have a subset of data from other AG DMVs:
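
They query like any other DMV:

    SELECT * FROM sys.dm_hadr_cached_replica_states;
    SELECT * FROM sys.dm_hadr_cached_database_replica_states;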

I sure wish I could point you to the documentation or KB article on those.

But we have no idea what changed in CU9 here, or why, or what to do about it.

I’m gonna be honest with you, dear reader: I’m starting to tell clients to choose SQL Server 2017 over SQL Server 2019. 2019 just doesn’t seem fully baked, and 2019’s Cumulative Updates have become too large of a risk: the release notes aren’t honest about what you’re deploying. That’s not fair to organizations that depend on SQL Server for mission-critical deployments: they need to know what changed so they can be on the lookout for new unusual behaviors.

If your own developers were changing production without telling you how or why, you’d call them into the office, close the door, and have a conversation.

In lighter news, there are also a few new messages in sys.messages:

  • 898: Buffer Pool scan took %I64d seconds: database ID %d, command ‘%ls’, operation ‘%ls’, scanned buffers %I64d, total iterated buffers %I64d, wait time %I64d ms. See ‘https://go.microsoft.com/fwlink/?linkid=2132602‘ for more information.
  • 7877: Backup on secondary replica is not supported in the connection to contained availability group. Reconnect to SQL Instance and retry the operation.
  • 8659: Cannot continue columnstore index build because it requires %I64d KB misc memory per thread, while the total memory grant for query is %I64d KB, total Columnstore bulk insert memory is limited to %I64d KB in query, and the maximum memory grant is limited to %I64d KB per query in workload group id=%ld and resource pool id=%ld. Retry after modifying columnstore index to contain fewer columns, or after increasing the maximum memory grant limit with Resource Governor.
  • 35528: unable to read page due to invalid FCB
  • 46933: PolyBase configuration is invalid or corrupt. Re-run PolyBase setup.

Query Store is the Flight Data Recorder. It’s Obvious What We Need Next.

When Microsoft brought out Query Store, they wrote:

Similar to an airplane’s flight data recorder, Query Store collects and presents detailed historic information about all queries, greatly simplifying performance forensics by reducing the time to diagnose and resolve issues.

It certainly works, making it easier for you to understand when a query’s execution plan suddenly goes downhill, taking your server with it.

But it isn’t enough.

In airplanes, it’s not enough to know that the throttles were powered down and the altitude plummeted. It’s not enough to know what happened: we need to know why it happened. What problem did the team think they were facing? Was there a misunderstanding in the cockpit about a metric? A disagreement between team members about how to solve the problem?

That’s why we need Voice Store.

Just as a cockpit voice recorder helps investigators better understand why decisions were made, SQL Server Management Studio needs a Voice Store to capture what the team was trying to do in the moments leading up to the crash.

To help speed up accident investigations, we simply can’t afford delays. That’s why it’s also important that Voice Store be connected directly to YouTube, automatically uploading recordings to help the community get to the root cause analysis.


What’s a Size-of-Data Operation?

Database professionals say, “That’s slow because it’s a size-of-data operation.”

A size-of-data operation is when the database has to write all of the affected data again. What you’re asking for might seem small, but for the database, it causes a lot of work.

For example, let’s say you live in a strict household, and your parents have told us that under this roof, everyone’s hair can only be black, blond, brown, or red. (Don’t blame me. I don’t make the rules. They’re your parents, not mine. Go talk to them. And tell your mom I said hi.)

If your parents suddenly loosen up and say, “We’re now allowing you to dye your hair pink,” that doesn’t mean we have to touch everyone’s head. It’s a metadata operation: we’re just making a new note so that when someone walks into the door and we check their hair color, we have a new acceptable option.

However, if your parents freak out and say, “Everyone’s hair must be pink,” that means we’re gonna need to dye every person’s hair. We’re gonna have to gather everybody together and do some real work.

That’s a size-of-data operation: however many people live in the house, we’re gonna need to work on all of ’em.

How that translates into databases

I’ll use the Users table in a large Stack Overflow database to explain.

If I alter a table and add a new nullable column, it happens instantly, and SQL Server doesn’t have to touch any pages:
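
A sketch, borrowing the hair example for the column name:

    ALTER TABLE dbo.Users ADD HairColor VARCHAR(20) NULL;
    -- finishes instantly: it's metadata only, no pages touched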

Adding a column like that is just a metadata change. It tells SQL Server, “When new rows walk into the house, they’re allowed to specify their HairColor now. When rows leave the house, if they don’t already have a HairColor set, just say it’s NULL.”

However, if I populate that column and dye everyone’s hair pink, that’s gonna take some time, because we’re gonna have to touch every row:
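
Again sketching with the hypothetical HairColor column:

    UPDATE dbo.Users
       SET HairColor = 'Pink';
    -- writes every row in the table, and logs every change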

That’s a size-of-data operation, so it’s slower.

How that affects development & deployment

When you’re just getting started developing an app, and all your tables are empty, you don’t really have to worry about this kind of thing – because size-of-data operations are fast when there’s, uh, no data involved.

But the larger your data becomes, and the slower your storage becomes (like if you move from nice on-premises flash storage up to the cloud), then the more you have to pay attention to this. You’ll want to run your deployment scripts on a real production-sized database, not just an empty development shell, and time the deployment scripts. The longer an operation takes, the more likely it is a size-of-data operation. On larger databases – at 100GB and higher – and at 24/7 uptime shops like online stores, you’ll likely have to tune your deployment scripts to avoid size-of-data operations.

I haven’t seen a good rundown of all the operations that are size-of-data operations for Microsoft SQL Server, but it’s changed over the years. Hey, if you’ve got a blog – there’s a potential idea for you! Not me though. The hot tub is calling.


I Sat Down for an Interview with Forrest Brazeal.

Here’s how the interview started:

Forrest Brazeal: Brent, you’re a performance specialist who’s worked with massive SQL Server deployments; how does it make you feel when you hear cloud architects saying things like “relational databases don’t scale?” What do you wish they knew?

Brent Ozar: They’re right. Nothing scales worth a damn. Even stuff that looks easy, like Amazon S3 file storage, can run into scaling problems if you design it without an analysis of how AWS partitions reads, and how you have to watch out for your object prefixes.

But the thing is, nobody reads the documentation.

Read the interview here. I had a lot of fun with it.


Partitioned Tables Cause Longer Plan Compilation Times.

Folks sometimes ask me, “When a table has more indexes, and SQL Server has more decisions to make, does that slow down execution plan generation?”

Well, maybe, but the table design choice that really screws you on compilation time is partitioning. If you choose to partition your tables, even tiny simple queries can cause dramatically higher CPU times. Even worse, as the famous philosopher once said, “Mo partitions, mo problems.”

We’ll start with any Stack Overflow database, create a numbers table, and then dynamically build a partition function that partitions our data by day:
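
I won’t reproduce the whole dynamic script here – this is a drastically simplified static sketch of the idea, and the real thing generates one boundary value per day, thousands in total:

    CREATE PARTITION FUNCTION pf_ByDay (DATETIME)
        AS RANGE RIGHT FOR VALUES ('2008-01-01', '2008-01-02', '2008-01-03');
        -- imagine thousands of these boundaries, built dynamically

    CREATE PARTITION SCHEME ps_ByDay
        AS PARTITION pf_ByDay ALL TO ([PRIMARY]);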

Then we’ll build a partitioned copy of the Users table, partitioning it by CreationDate:
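
A sketch with just a few of the Users columns shown:

    CREATE TABLE dbo.Users_partitioned
    (
        Id INT NOT NULL,
        DisplayName NVARCHAR(40) NOT NULL,
        CreationDate DATETIME NOT NULL
        -- ...plus the rest of the Users columns
    ) ON ps_ByDay (CreationDate);

    INSERT dbo.Users_partitioned (Id, DisplayName, CreationDate)
    SELECT Id, DisplayName, CreationDate
      FROM dbo.Users;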

Let’s compare partitioned and non-partitioned performance.

I’ll create an index on both the Users and Users_partitioned tables. Note that even though I don’t specify partitioning on the Users_partitioned index, it still ends up partitioned by default, which is kinda awesome:
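
Something like this – note the identical syntax for both:

    CREATE INDEX IX_CreationDate ON dbo.Users (CreationDate);
    CREATE INDEX IX_CreationDate ON dbo.Users_partitioned (CreationDate);
    -- no ON clause on the second one, so it's automatically
    -- aligned to the table's partition scheme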

Now we’ll run an exceedingly simple query that only returns one row:
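
The post’s original query isn’t shown here, so treat this as a stand-in with the same one-row shape:

    SELECT TOP 1 Id, CreationDate
      FROM dbo.Users
      ORDER BY CreationDate;

    SELECT TOP 1 Id, CreationDate
      FROM dbo.Users_partitioned
      ORDER BY CreationDate;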

At first glance, the plans look the same – but notice how the non-partitioned table is 0% of the cost, and the partitioned table is 100% of the cost:

That’s because the non-partitioned estimated cost was way less than 0.001, and the partitioned query’s estimated cost is over 15. What’s worse, the compilation time, execution time, and logical reads are totally different – the top one is the non-partitioned query, and the bottom is the partitioned one, both cleaned up for readability:

Hubba hubba. Just compiling the partitioned table’s plan took 27ms of CPU time. I know what you’re thinking: “Brent, who cares about 27ms of CPU time?” Well, remember, this is an overly simple query! In real-life examples, it’s not unusual to see 250+ milliseconds of CPU time spent just compiling the plan – which means you can only compile 4 queries per second, per CPU core. That’s when plan cache pollution due to unparameterized queries really screws you.

I’m writing this particular post because I had a client who chose to partition every table in their data warehouse, regardless of size, and used exactly the same partition granularity on all sizes of tables to keep their dynamic ETL queries simple. Unfortunately, even querying simple configuration tables was taking 250+ milliseconds of CPU time just to compile a plan.

Fine – but what about non-partitioned indexes?

I’ll drop that partitioned index and create a non-partitioned index on the Users_partitioned table – note that I have to specify ON [PRIMARY] as the filegroup for the partitioned table, or else any nonclustered index will automatically be partitioned as well – then try the queries again:
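
A sketch:

    DROP INDEX IX_CreationDate ON dbo.Users_partitioned;

    CREATE INDEX IX_CreationDate ON dbo.Users_partitioned (CreationDate)
        ON [PRIMARY];  -- explicit filegroup = genuinely non-partitioned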

Then we’ll run that same exceedingly simple one-row query again.

Now the estimated costs are neck and neck:

But that’s just an estimate – the only thing that’s truly neck and neck here is the logical reads:

Designing and executing a query plan takes longer when any of the objects involved are partitioned.

This overhead isn’t large as long as you’re comparing it to the overhead of big data warehouse queries where partitioning data can reduce reads. But if you’re comparing it to smaller objects – say, 100GB or smaller rowstore indexes – that are queried more frequently, with more varying queries that require building new execution plans – then the overhead of partitioning starts to add up. Mo partitioned objects, with mo partitions in each object, mo problems.

Partitioning is one of those features where I never hear people say, “Wow, every which way I turn, partitioning has just made performance better!” Rather, it’s a feature where people keep saying, “I had no idea partitioning was going to cause me problems over THERE, too.”

In my Mastering Index Tuning class, we spend just enough time on partitioning to explain where it’s a good fit – but much more time on indexing techniques that are more appropriate for 99.9% of workloads out there.


Unusual Parameter Sniffing: Big Problems with Small Data

Normally when we think about parameter sniffing, we think about a tiny data vs big data problem: the tiny data’s plan goes in first, and when we try to process big data, performance is terrible.

But sometimes, it’s the exact opposite.

I’ll start with the large Stack Overflow database (I’m using the 2018-06 version, but any large one will do) and write a stored procedure to find the most recent posts authored by folks in a specific location:
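
Here’s a sketch of the proc’s shape – the name and column list are my stand-ins, not the post’s exact code:

    CREATE OR ALTER PROC dbo.GetRecentPostsByLocation  -- hypothetical name
        @Location NVARCHAR(100)
    AS
    SELECT TOP 200 p.Title, p.CreationDate
      FROM dbo.Posts p
      INNER JOIN dbo.Users u ON p.OwnerUserId = u.Id
      WHERE u.Location = @Location
      ORDER BY p.CreationDate DESC;
    GO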

To help SQL Server out – I’m a giver like that – I’ll create a few indexes to give SQL Server some choices:
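
Roughly these – enough to support both of the plan shapes discussed below:

    CREATE INDEX IX_Location ON dbo.Users (Location);
    CREATE INDEX IX_OwnerUserId ON dbo.Posts (OwnerUserId, CreationDate);
    CREATE INDEX IX_CreationDate ON dbo.Posts (CreationDate);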

Call it for tiny data first…

When we call the proc for a less-populated location first, SQL Server chooses to look up the people who live in Iceland (because there aren’t many), and then go find their posts. Small data finishes in under 1 second:
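
Using the hypothetical proc name from the sketch above:

    EXEC dbo.GetRecentPostsByLocation @Location = N'Iceland';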

If we now run it for a larger location, like Germany, the query takes 24 seconds to run because we find so many people in Germany, and they’ve posted so many answers. (I’m sure they’re not asking a lot of questions. Germans are really smart.)

And the sort spills to disk because we only granted Iceland-size memory.

This is the typical parameter sniffing problem that people blog about: put the tiny data parameters in memory first, and SQL Server just isn’t equipped to deal with big data.

But if the big data plan goes in first…

Do things perform better? Let’s free the plan cache, then start with Germany:
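
Sticking with the sketch proc:

    DBCC FREEPROCCACHE;  -- wipe the plan cache so Germany compiles first
    EXEC dbo.GetRecentPostsByLocation @Location = N'Germany';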

This time around, SQL Server says, “Ah, Germany’s really common. You’re asking for these posts to be sorted by CreationDate – as a reminder, here’s your query and your indexes:”

“So since you’re asking for the top 200 by CreationDate descending, I’ll just scan the CreationDate index from newest to oldest. For each post I find, I’ll go look up the Users table to see where that user’s from. I bet I won’t have to look up too many before I’ll find 200 of ’em that were posted by Germans.”

And he’s right: he only had to look up about 20,000 of them before he found 200 that were written by Germans. Very cool.

But…when we run this for Iceland…

SQL Server has to read over a million Posts, doing over a million Posts key lookups, and over a million checks into the Users table before it’s found 200 posts that were written by Icelanders. Ten seconds isn’t so bad, but if you pass in an even rarer location, like the charming and easy-to-pronounce Hafnarfjordur, Iceland:
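
Again with the sketch proc:

    EXEC dbo.GetRecentPostsByLocation @Location = N'Hafnarfjordur, Iceland';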

SQL Server again scans the index on Posts, doing a key lookup on Posts for every single row that it finds, and I bet that sounds goofy. Oh it gets worse: for all 40,700,647 Posts, it does a clustered index seek against the Users table. We do over 124 MILLION logical reads on a table that only has 143K pages in it. We read the Users table 871 times over, and the Posts table 6 times over due to all the key lookups:

That sure is a lot of work – would be nice if SQL Server parallelized that query across multiple threads. No can do, though – the query plan was designed with Germany in mind, when SQL Server thought it’d find rows quickly. The whole time this query runs, it’s just hammering one of my CPU cores while the rest sit idle:

There is no one good plan here.

I always laugh when I see demos that talk about “the good plan” and “the bad plan.” I wish the real world was so simple, so black-and-white. In reality, queries have many possible plans, and many of ’em just don’t work for other sets of parameters.

SQL Server 2017’s “Automatic Tuning” does nothing here, either: it can try throwing a different plan in thinking there’s been a regression, but it’s not smart enough to pick different execution plans for different parameters. It picks one plan and tries to make it work for all of ’em – and that simply doesn’t work here.

It’s up to us to fix it, folks.

That’s why I teach my Fundamentals of Parameter Sniffing and Mastering Parameter Sniffing classes. Folks with a Live Class Season Pass can start watching the Instant Replays whenever they want, or drop in live. I’m teaching a bunch of classes over the next few weeks:

See you in class!


My Live Streaming Setup for 2021: Iceland Edition

Erika and I moved to Iceland for a while. For those of you who don’t follow my personal blog, here’s why we moved, and how we got here with a teleworker visa. You can spy on our adventures via my Instagram, Facebook, or Twitter.

We’re working half the time while we live here, and sightseeing the other half. It’s easy to telecommute from Iceland because it’s one of the most connected countries in the world. Most houses have symmetric gigabit fiber available – even though we’re extremely isolated, we still have great bandwidth:

This isn’t a permanent move – probably just 6-12 months – and we’re doing some traveling through the country, so I didn’t want to pack my full home office setup with my Mac Pro and big monitors. I downsized my streaming setup, and I figured some of y’all might find it interesting. It’s closer to what most folks would use for remote presenting.

Laptop: MacBook Pro 16″ – the fastest Mac laptop available today, a 2.4GHz 8-core Intel Core i9, 64GB RAM, but…it’s a long, long way from the 16-core Mac Pro I usually use. When I live stream, I really put the CPU fans to work. I was tempted to try one of the new MacBooks with the Apple Silicon M1 processor, but I’m holding out for the higher-horsepower versions that will surely follow. I brought my Keychron K8 keyboard and an Apple Magic Trackpad at the last minute at Erika’s insistence – I was trying to pack lighter, heh.

External monitor on right: 15″ generic 1080p. Normally for monitors, you want high resolution, but when you’re streaming, you actually want a plain 1920×1080 because that’s the max size you’d stream out to your audience. The monitor has inputs for USB-C, HDMI, Mini DisplayPort, etc, but the nice thing is that it can use a single USB-C cable for both power and for the display signal.

When I’m streaming, the MacBook Pro’s high resolution display has Slack (for attendee chat & questions) and OBS’s control panel and preview. The 15″ 1080p monitor is what’s showing to the audience, so I’ve either got SSMS on there, or a PowerPoint.

I usually teach while standing up, but I’m trying the dining room table to see if I can make that work. If it turns out I just absolutely gotta stand, we’ve got a high bar table picked out at a local furniture store – if necessary, we’ll grab that and leave it in the rental house when we leave.

Cameras: two Sony A6100s, each plugged into an Elgato Cam Link 4K: in my normal home studio setup, I plugged the Sonys into a Blackmagic Design HDMI recorder, but that’s a PCI Express card. The Elgato is an easier/smaller USB solution for laptops. I do prefer a two-camera setup, alternating between a head-on camera and a from-the-side camera for casual discussions. I still use my iPhone with the NDI camera app as a scenery camera during the breaks, but it’s no good as a main camera because it has unpredictable latency, meaning your speech and moving lips can drift in & out of sync, and I can’t stand that.

Microphone: DPA In-Ear Broadcast Headset plugged into a Focusrite Clarett. I’m only reusing these because they’re part of my desktop setup, so I don’t need to buy them again, but they’re overkill for most streaming setups. I have a Rode VideoMic GO ($100) on-camera microphone as a backup, but the sound on that isn’t nearly as good since it also picks up echoes from the room, keyboard noises, laptop fan, etc.

Customizable keyboard control: Elgato Stream Deck. LCD keys so you can configure what they show & do, like switching between different cameras. Could you do this by memorizing a bunch of hotkeys? Probably, but as you start to rely on more advanced OBS functionality, like playing sound effects, this will come in super handy.

USB-C dock: CalDigit TS3 Plus: if you count the above devices, that’s 5 high-speed USB ports right there, plus Ethernet. The MacBook Pro only has 4 USB-C ports, plus it needs one for electric power, so I needed a USB-C dock. The CalDigit is by far the most stable dock I’ve found – a lot of USB-C docks flake out during sustained 4K streaming with multiple cameras.

Tripod mounting: Manfrotto Magic Arm plus Super Clamp: I could theoretically clamp these directly to the dining room table that I’m using for a desk, but they’d shake when I jostle the table. Instead, I brought a couple of tripods, plus these Magic Arms so I can mount multiple devices to the tripod, like a camera plus lights.

Green screen: collapsible with stand: at home, I used a wall-mounted pull-down green screen, but obviously that’s not gonna work on the road. This folds up, and I’ll have it behind me while I’m teaching. I expect to be teaching sitting down while I’m on the road, but I’d be completely delighted if I could put together a setup where I could continue standing & teaching. The box it shipped in was exactly the max dimensions for standard airline checked luggage. It collapses, but not small enough to fit in a regular checked bag – it’s still an odd shape.

Carrying the electronics: Pelican 1620 case: on one of our Iceland flights, the airport crew unloaded the baggage during a serious rainstorm. Our bags and their contents were all completely soaked with water. We didn’t have any electronics in ’em, thank goodness, but for this trip, I wanted something waterproof that I could check as luggage. (It won’t have any lithium batteries in it – I take those in my carryon.)

If you wanna see the results and you’ve got a Live Class Season Pass, hop into my live class page during any of my classes this month:

See you in class!


Can You Get Parameter Sniffing on Updates and Deletes?

Sure you can – parameter sniffing is everywhere. Anytime that you have one piece of code that needs to handle a varying number of rows, you’re probably gonna have to worry about parameter sniffing.

I’ll start with any Stack Overflow database and run a couple of non-parameterized update statements. I’m using literals here, not parameters:
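
Sketching from the post’s description – LastAccessDate updates filtered by DisplayName:

    UPDATE dbo.Users
       SET LastAccessDate = GETDATE()
     WHERE DisplayName = N'Brent Ozar';

    UPDATE dbo.Users
       SET LastAccessDate = GETDATE()
     WHERE DisplayName = N'Alex';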

These two statements have two very different actual execution plans. The first plan, updating just Brent, is very simple. Reading from right to left, we scan the clustered index to find all the Brents, and then we update the clustered index:

Except…the simplicity is a vicious lie. We’re updating LastAccessDate, and before running this update, I’d already created half a dozen indexes:
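
I won’t guess the exact six, but picture something along these lines:

    CREATE INDEX IX_DisplayName ON dbo.Users (DisplayName) INCLUDE (LastAccessDate);
    CREATE INDEX IX_Reputation ON dbo.Users (Reputation) INCLUDE (LastAccessDate);
    CREATE INDEX IX_LastAccessDate ON dbo.Users (LastAccessDate);
    -- ...and a few more, all carrying LastAccessDate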

Every one of ’em includes LastAccessDate. SQL Server doesn’t have the concept of asynchronous indexes, so all of the indexes have to be updated whenever we change their contents. Hover your mouse over the clustered index operator to see the truth: SQL Server’s updating those, too.

The second plan is different because there are a lot of Alexes at Stack Overflow – Alex is one of the most common names in the database. Read the second plan right to left, and the first thing SQL Server does is find & update all of the Alexes in the clustered index:

But since we’re updating a lot of rows, SQL Server changes the way it approaches updating all of these nonclustered indexes. After the clustered index operator, keep reading…

For every index, SQL Server sorts all of the Alexes in that order so we can find ’em faster in the index, and then updates the index. It also does this process one index at a time, serially, in order, for Halloween protection.

These are called narrow and wide plans.

Brent’s plan, which does all of the updates in a single Clustered Index Update operator, is called a narrow plan.

Alex’s plan, which breaks out the work for each index and does them one at a time, is called a wide plan.

No, that doesn’t make any sense to me either. They should be called short and tall plans, because frankly, Alex’s plan is the wide one. I don’t make the rules, though, Paul White does. Take it up with the Kiwi. Even when his rules are wrong, they’re still right. Or maybe they just feel that way.

Put this in a stored proc, and you’ve got parameter sniffing.

Here’s our stored procedure to handle updates:
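
A sketch – the proc name is mine, not the post’s:

    CREATE OR ALTER PROC dbo.usp_UsersUpdate  -- hypothetical name
        @DisplayName NVARCHAR(40)
    AS
    UPDATE dbo.Users
       SET LastAccessDate = GETDATE()
     WHERE DisplayName = @DisplayName;
    GO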

And now depending on which parameter runs first – Brent Ozar or Alex – we’ll get a different execution plan.

When Brent runs first, we get the narrow plan, so:

  • Brent does 143,688 logical reads and has no worktable
  • Alex does 352,260 logical reads, has no worktable, gets a 72KB memory grant, and the clustered index scan is single-threaded

And when Alex runs first, we get the wide plan, so:

  • Alex does 553,530 logical reads – more than 3 times the number of pages in the entire table! – plus another 41,339 on a worktable, gets a 5MB memory grant, and the clustered index scan goes parallel (although the rest of the plan is serial because that’s how SQL Server does it)
  • Brent does 143,730 logical reads, plus 17 on the worktable

The differences in this case aren’t all that bad, but keep in mind that I’m dealing with just a 1GB table – the smallest one in the Stack Overflow database. The larger your data becomes, the more you have to watch out for this kind of thing.

That’s why I teach my Fundamentals of Parameter Sniffing and Mastering Parameter Sniffing classes. Folks with a Live Class Season Pass can start watching the Instant Replays whenever they want, or drop in live. I’m teaching a bunch of classes over the next few weeks:

See you in class!


“But Surely NOLOCK Is Okay If No One’s Changing Data, Right?”

Some of y’all, bless your hearts, are really, really, really in love with NOLOCK.

I’ve shown you how you get incorrect results when someone’s updating the rows, and I’ve shown how you get wrong-o results when someone’s updating unrelated rows. It doesn’t matter – there’s always one of you out there who believes NOLOCK is okay in their special situation.

The latest NOLOCK defender left me an angry comment:

I challenge you to show us one example of using nolock on a static database with incorrect results.

No problemo – I assume you mean tables that aren’t getting inserts, updates, or deletes. Well, folks can do index reorganize operations during business hours – it’s a completely online operation, even in Standard Edition – and that hoses NOLOCK too.

I’ll create a table and populate it with 100K rows, but do it in a way that makes it highly fragmented:

Then, in one window, I’ll reorganize the index:
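
Using the sketch table from above:

    ALTER INDEX ALL ON dbo.Fragged REORGANIZE;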

While in another window, I’ll run a COUNT(*) repeatedly:
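
Something like this endless demo loop:

    WHILE 1 = 1
        SELECT COUNT(*) FROM dbo.Fragged WITH (NOLOCK);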

And presto: the results change, going up and down as the pages are moved around:

Even if you think no one’s doing index maintenance at your shop, the answer is still clear here: if you want to build safe, reliable code that produces accurate query results at all times, NOLOCK should not be involved.

I get why you use NOLOCK. You read a blog post somewhere once that said, “NOLOCK is faster because it does allocation scans instead of navigating those big hefty index pages.” But you never actually bothered to measure whether that’s a significant problem for you – and you never bothered to ask your users if they need accurate query results.

Sometimes, you really don’t need accurate query results, and in that case, NOLOCK can be fine. For example, one of my clients runs a monitoring query every few seconds just to see if any new carts have been started in their online store. They get the latest row, check to see whether it’s higher than the previous check, and if it is, they’re happy campers. If it isn’t, they don’t mind checking again to see whether they got bad data, and they check a couple of times before they sound an alarm. They don’t need accuracy on exactly how many carts have been started – they just want a rough idea that the latest value is going up. NOLOCK works fine for their needs.

But for the rest of y’all, when you’re ready to do things the right way, I welcome you to my SQL Server training classes.

You should probably take one of them quickly, before your boss reads this post and starts asking you why you were so vigorously defending your constant use of NOLOCK.


Free Webcast: Help! My SQL Server Maintenance is Taking Too Long!

You manage growing SQL Server databases with shrinking nightly maintenance windows. You just don’t have enough time left each night to do the necessary backups, corruption checking, index maintenance, and data jobs that your users and apps want to run. Cloud storage isn’t helping the problem, either.

Stop playing Tetris with your job schedules and step back for a second: are we doing the right things, at the right times, with the right SQL Server configuration?

In this session, I will give you three queries to run on your SQL Server to make sure you’re focused on the right problems. We’ll figure out how fast your backups are really going (and why), whether nightly index maintenance actually makes sense for you or not, and whether your hardware is suffering from serious performance issues.

Register now for the free webcast on Tuesday, February 16th. 


What SQL Server Feature Do You Wish Would Go Away?

I asked a two-part question on Twitter: if you could only remove one feature from SQL Server, why would it be cursors?

The responses were great:

Justin’s referring to MERGE’s hot mess.

But out of all of ’em, this one intrigues me the most:

How about you?

