Blog

Watch My Training Classes Offline with the Latest Teachable App

Company News
1 Comment

If you’re the kind of person who likes to learn on the go, my training provider Teachable just updated their apps.

Now, both the Android and iOS apps let you download classes ahead of time and watch ’em disconnected.

I also love the built-in reminder features of the app: you can set a reminder time each day to make progress and push through more videos. That’s so helpful because every year, folks buy my courses, then let time go by without actually watching any of ’em. You can’t learn by osmosis – holding the phone up to your ear won’t cut it – so the reminders help get you back in there.

Between the offline viewing and the reminders, you stand a pretty good shot of being able to finish the Fundamentals in 90 days, and for those of you who can pull it off, there’s a sale going on!


[Video] Office Hours: Everything’s Gone Wrong Edition

Videos
2 Comments

I threw in the towel on my workday at 3PM when everything started going wrong in the Ozar home office, so I switched to answering your top-voted questions from https://pollgab.com/room/brento.

Office Hours: Everything's Going Wrong Edition

Here’s what we covered:

  • 00:00 Start
  • 03:59 MyTeaGotCold: How often do you see indexed views these days?
  • 04:14 I 3Fedora: Hi Brent, Not strictly database related but what are you thoughts on Linux vs Windows in general?
  • 04:27 Dick Bowen: With SQL Server STD, we run update statistics and reorganize indexes hourly as the supplier uses heap tables with GUID on indexes in their DB, they then then we run re-index at the weekend. Is there anything else that could be done to speed up the DB?
  • 05:01 Dopinder: Do you know of any good tools that evaluate SQL Server configuration for best practices by CPU? I.e. AMD Epyc has configuration recommendations for optimizing SQL workload performance. Intel has their own.
  • 05:24 SummerFondness: What is your favorite vs code prompt that sounds dumb, but returns great results? Bonus, what is your favorite nuclear threat to get better results. Mine starts, Your children are being held hostage and if your results do not meet kidnappers expectations … Insert problem ..
  • 06:06 Maciej: In December last year you mentioned you thought about writing a series of blog posts about Continue.dev (VS Code extension to use local LLMs). Are you still using it? If yes, what is your opinion about it?
  • 06:23 Backing up 15 TB databases: Is https://www.brentozar.com/go/FasterBa… a broken link now? I used to love that Microsoft white paper.
  • 06:49 SteveE: Hi Brent, Have you come across many or any contained AGs in the wild? They seem like they could be really useful in environments with heavy agent workloads
  • 07:11 Accidental DBA: Do you have any recommendations of books (or other ways to learn) Sql Server Architecture better? One example, running out of disk space on the log drive (does it attempt to use memory before setting the DB to readonly)? Rather than googling each time; trying to learn ahead.
  • 08:34 Waikikamukau: Hi Brent. We have been asked to setup TDE. Our storage guys are saying this will break the dedupe and the effective storage use will skyrocket! Say it isn’t so!
  • 09:28 Alen: What is the best practice for assigning a database owner in SQL Server: should it be the sa login (disabled), or a named user/service account ?
  • 09:49 Dick Bowen: When I run a Query with OPTION (RECOMPILE) from SQL, it runs faster.
  • 10:10 Dopinder: What is your experience/opinion of using AI to analyze the corruption report produced by CHECKDB and make suggestions for remediation?
  • 11:03 NeedsMoreMoose: I love the moose that shows up in your images sometimes. What is the lore behind the moose and where could I find more of said moose?
  • 11:54 Benjamin: I’ve just inherited a sql server with two volumes, C and the data drive. The sql server stores tempdb on the C drive, and I understand that comes with risks. Do you think I can plan on relying on SQL Server 2025’s new tempdb resource governor features to mitigate the risks?
  • 12:45 SteveE: Hi Brent, Standard edition licensing matters aside, is there any difference / benefit to running a 16core SQL server as 4 sockets / 4 cores, 2 sockets / 8 cores , 1 socket / 16 cores etc

Update: SQL Server 2025’s REGEX Performance Isn’t So Bad!

T-SQL
14 Comments

Back in March 2025 when Microsoft first announced that REGEX support was coming to SQL Server 2025 and Azure SQL DB, I gave it a quick test, and the performance was horrific. It was bad in 3 different ways:

  1. The CPU usage was terrible, burning 60 seconds of CPU time to check a few million rows
  2. It refused to use an index
  3. The cardinality estimation was terrible, hard-coded to 30% of the table

Prompted by a comment from Erland Sommarskog this month, I circled back and ran the tests again with the release version of SQL Server 2025. Great news! Microsoft fixed 1 of the problems, and… well, one of them is a little tricky. To demonstrate, I’m going to use the large 2024-04 Stack Overflow database to create a worst-case scenario, then start with an index on the small Users table and query it via regex like we did in the March 2025 post.

The actual execution plan:

Actual plan for TOP 100

It took about 8 seconds, all of which was spent burning CPU. That’s actually GREAT, a HUGE improvement from last time! 8 seconds of CPU time sounds bad, but it’s fantastic given the number of rows that SQL Server had to examine to find 100 matches:

Number of rows read

Because the data I was looking for was relatively rare, SQL Server had to read about 10 million rows in order to find 100 matches. That means SQL Server was able to read 1.2 million rows per second, and examine their contents with regex. That’s awesome! I love it, and I wish the story ended there.

But let’s switch over to examining the Title column of the Posts table, one of the bigger ones in the database. I’ve created an index on the Title column:

The table has about 60M rows, the clustered index is 163GB, and the index on just Title is 3GB. If SQL Server will use the index, this will give us a giant performance boost over having to scan the whole table.

Posts table size

Let’s run the same WHERE clause filter, but use a SUM(1) this time instead of TOP 100 so that SQL Server is forced to hit all of the rows, and so I can demonstrate the cardinality estimation:

The actual plan doesn’t look great at first glance, but hang in there, because this really is a worst-case scenario – there’s some great stuff in here:

Index scan on Posts.Title

First, it used the index! That’s fantastic. Obviously we can’t seek on it, but at least we’re only reading 3GB of data instead of 163GB. That’s good – that’s the one problem Microsoft completely fixed. Love it.

Second, it went parallel automatically, recognizing that it was gonna be a lot of work. It had to read 60M rows, and it took 7 minutes, so it processed about 145K rows per second. That’s… not good. That’s a huge drop from our Users table processing which was hitting about 1.2 million rows per second. While it was running, ooof, our poor server:

It's getting hot in here
It’s getting hot in here, so light up all your cores

Wait stats are a parallelism disaster:

Wait stats

So… is parallelism a problem? I’ve heard folks say CXCONSUMER is harmless, but long-term readers here will know better. Slap an OPTION (MAXDOP 1) hint on the query, and it runs in 32 seconds:

32 seconds for MAXDOP 1

Which brings us back up to 1.86 million rows per second processed by REGEX. That’s honestly fantastic. If you need to find a needle in a haystack with regex, and you’ve got an index so it can scan less data, and if CPU scheduling doesn’t get in the way, this is a dang fast way to do it. Note that I didn’t try more complex regular expressions – I don’t wanna make up synthetic stuff for testing, and instead you should test with regexes you actually intend to use in your work.

On the down side, note that the estimated number of rows is still hot garbage – SQL Server estimated that 5,383,710 rows would come back, when in actuality none did. I think that’s fair, because I don’t know how you could predict the number of rows that would match a given regex. Hell, I can’t even READ most regexes and understand what they’re trying to do. If your query has a regular expression in the filter, you probably want to load the matching rows into a temp table first so that on subsequent joins, SQL Server better understands the number of rows that’ll be involved.

So in summary, SQL Server 2025’s regex situation is better than it was in Azure SQL DB – at least it’s using indexes now, and CPU is better than it was. Just be careful with running it in production – if you’re just using it as a utility query for quick research, try hinting it with MAXDOP 1 for two reasons. It might run faster, and it’ll be less likely to dominate the server’s entire CPU stack.


How’s the Job Market? Let’s Find Out Together.

Salary
2 Comments

Salary Survey OptionsEvery year, I run a salary survey to help folks have better discussions with their managers about salaries, benefits, and career progression.

This year, the survey is for unemployed folks, too! The first question has a new option for “Unemployed (and looking).” Note that if you’re retired, thanks, but you can skip this survey. Congrats on the retirement though!

Take the survey now here.

The anonymous survey closes Sunday, January 11th. On Wednesday the 14th, I’ll publish the overall responses on the blog in Excel format so you can do slicing & dicing. The results are completely open source, and shared with the community for your analysis. You can analyze ’em now mid-flight – do not email me asking for edit permissions, buddy – but I’d wait until the final results come in. I’ll combine them into a single spreadsheet with the past results, and publish those here.

Thanks for your help in giving everybody in the community a better chance to talk honestly with their managers about salary.


[Video] Office Hours: Fireside Chat Edition

Videos
5 Comments

It’s the perfect time of year in Las Vegas when we can chill outside by the fire, enjoy a glass of wine, and go through your top-voted questions from https://pollgab.com/room/brento.

Office Hours: Fireside Chat Edition

Here’s what we covered:

  • 00:00 Start
  • 00:54 iliyan Rashev: Hello Brent, I am on SQL Server 2019 RTM Please recommend a patching path: 1. Apply CU32+latest Sec.update 2. Apply only latest Sec.update In other words using the 2nd approach will be faster,but do I get all the previous fixes from CU32 that contain all fixes from previous CUs?
  • 02:51 DBArchitectChick: Brent, I am trying to solve the struggle in my head. I admit I love stored procedures. I even wrote a robust unit testing suite for them. Unfortunately, my fellow application developers call them “code smell”. Do I need to give up on pressing for SPs? I might need therapy
  • 06:05 Elwood Blues: What are the top issues you see when running SQL Server on VM’s with multiple NUMA nodes / high core count?
  • 08:53 Adrian: Hi, I saw code, not mine, that purges old data reading WITH READPAST and deleting from tables using joins and ROWLOCK, READPAST for the table where it deletes and READPAST for the tables in the join. I want to know if it is safe, a good idea, are better options? No RCSI
  • 10:23 stittdba: In the past, there used to be a good rule of thumb to wait for SP1 before migrating to a new version of SQL Server. With SPs now a thing of the past, what general guidance would you offer for a time to migrate to a new version?
  • 10:25 stittdba: In the past, there used to be a good rule of thumb to wait for SP1 before migrating to a new version of SQL Server. With SPs now a thing of the past, what general guidance would you offer for a time to migrate to a new version?
  • 11:57 BrentFanBoy: Hi Brent, I have a server that is under heavy load and need to catch an SP that runs very fast, unfortunately it does updates and deletes so I can’t run it myself. What is the best way of catching query plans and runtime stats for that SP?
  • 14:49 Harold Bright: Hi Brent, Is there a way in MSSQL Server where i can call the procedures in Async Mode? i.e. Calling parent proc which has couple of procs to be execute. At present it goes subsequently, but is there a way i can get it started together?

Get ChatGPT’s Advice On Your Queries with sp_BlitzCache.

First off, I understand if you read the headline and you have a knee-jerk reaction. I totally understand that there are a lot of people out there who hate AI, and believe me, I hate most uses of it. Lots of its advice can be misleading at best, and absolute garbage at worst.

However, if you do wanna use AI in appropriate ways – like a developer getting a second opinion about a slow query – then I’ve got something you might like. Our free query analysis tool sp_BlitzCache has a new parameter:

It adds a new AI Prompt column in the result set:

AI Prompt column

Click on that, and you get a copy/paste prompt to put into your favorite AI tool. The prompt includes query performance metrics, the query text, and the query plan, all in a copy/paste-friendly wall of plain text:

AI Prompt sample

The query plan’s there if you scroll down – you can see the scroll bar at right.

That feature works on any currently supported version of SQL Server, Azure SQL DB, Amazon RDS, Google Cloud SQL, etc. It doesn’t require any special ChatGPT or JSON functionality – I’m just building a string that you can copy/paste into your LLM of choice.

It Can Even Call Your AI API For You.

If you’re one of the lucky folks on SQL Server 2025 or Azure SQL DB, buckle up, because this is where it gets wild. If you call it with @AI = 1, we’ll use the new sp_invoke_external_rest_endpoint to call ChatGPT or Google Gemini for you. Your sp_BlitzCache results will have a new ai_advice column:

sp_BlitzCache with AI Advice

Click on a row to see the advice for that query:

AI Advice

You can choose your AI provider and model, and right now we support OpenAI ChatGPT and Google Gemini. I’d definitely welcome pull requests to support other models.

Before You Get Started

It’s probably going to cost you some money. My documentation is going to assume that you’ve handed your credit card over to OpenAI ChatGPT or Google Gemini in order to get a paid API key. I totally understand that people are going to want to use it for free, and there are definitely ways to do it, but I’m not going to try to keep the documentation up to date on that. The free options for AI providers are constantly changing. If you want to contribute pull requests to the documentation to include notes about free plans, that’s welcome, as long as there aren’t some kind of referral codes involved.


The longer your queries are, and the larger their execution plans are, the more you’ll end up paying. I’ve had months with heavy client work where I’ve spent $300-$400 on AI – but believe me when I say during those months, AI saved me hundreds of hours of labor and enabled me to do things I wouldn’t have been able to do without hiring multiple people. I vividly remember one gig where a client had been struggling with memory starvation for months, and on the call together, we identified the query with sp_BlitzCache, then pasted the query into ChatGPT. It found the query’s problem in minutes, rewrote the query to work around the problem, and then we got started testing the changes to make sure they worked the way we needed ’em to work. (That’s the big challenge with AI right now – the hallucinations.)

If you want to point sp_BlitzCache at a free locally hosted model, like with LMStudio, that’s also left as an exercise for the reader for now. I’ll write that up at some point because I’m a big fan of that kind of thing, especially with corporate code bases. It’s just tricky to write generic beginner instructions for that because SQL Server (NOT SSMS) needs to contact the model. I know that seems counterintuitive because you think, “Oh hey, SSMS is on my laptop, and LMStudio is on my laptop, can’t they all just get along?” But remember, we’re calling sp_invoke_external_rest_endpoint on the SQL Server itself, and in many production situations, it’s unlikely that the SQL Server is going to have direct unfettered network access to phone out to LMStudio running on your laptop.

Calling an API directly requires SQL Server 2025 or Azure SQL DB because it uses the new sp_invoke_external_rest_endpoint stored procedure. Crafty people could write their own stored procedure that accepts the same parameters and outputs the same results, and name it the same thing, and then they’d be able to use it on earlier SQL Server versions. That’s left as an exercise for the reader.

Your database context will suddenly matter, because Microsoft requires you to store your AI API key in a database-scoped credential, and the master database isn’t allowed. If you want to call @AI = 1, you’ll need to be in a database where you stored your credentials. That doesn’t mean you can only analyze queries and plans from that one database – sp_BlitzCache is still server-wide. It just means you have to be in the right database when you call sp_BlitzCache.

It’s going to involve security choices and concerns in a few different ways. First, if you use a hosted model like ChatGPT or Gemini, your queries and execution plans will be going up to the cloud, and that’s how personally identifiable data gets around. Next, it’s going to involve storing your API keys as database-scoped credentials, which means if your credential password gets out, other people can run up API charges on your credit card.

Still with me? Alright, let’s set it up.

How to Set Up AI Advice in sp_BlitzCache

First, let’s do some background setup:

That master key is at the entire SQL Server level. Whenever you wanna use database-scoped credentials, you’re gonna need to type that password in (or paste it from your password manager.)

Now, switch into the user database where you’ll be running sp_BlitzCache. It doesn’t have to be the same database where you create sp_BlitzCache: I like to put sp_BlitzCache in the master database so that I can call it from anywhere. However, since I have to save my API keys in a database-scoped credential, I put that in my DBAtools database. When I wanna run sp_BlitzCache with @AI enabled, I switch into the DBAtools database, so that’s where the keys go.

Next, you’ll need to create a ChatGPT API key or a Google Gemini API key, and then save the key text as a database-scoped credential. (You can’t use Github Copilot for this – they don’t allow API access.) You’ll never need to type these in again, and indeed, you probably don’t even wanna store ’em anywhere when you create them. It’s trivially easy to create additional keys in those vendor portals. You can do either one of these, or both, whichever provider you wanna use:

The database scoped credentials have name rules. I’mma be honest with you, dear reader, these rules suck hard:

Credential rules

This sucks so bad because I really want different credentials for different teams. I don’t want everybody sharing the same API keys (and therefore, credit card) for all calls to ChatGPT. Because of that, you probably only wanna put your keys in your DBAtools database, and guard that database pretty heavily.

So that’s why your database-scoped credentials have to be named https://api.openai.com/ or https://generativelanguage.googleapis.com/, and can’t be “https://api.openai.com-DBA_Team” or something like that.

If you’d like to share these keys with your fellow team members, ideally you’ve already got a role already set up, and you can use that. If not, you can create a new role, add people or groups to it, and grant the role permissions to the credentials. You’ll also need to share the master key password with them so they can open the credentials.

Test sp_BlitzCache with Your First AI Call

Here are sample calls for ChatGPT and for Gemini. The ChatGPT one is shorter because we default to ChatGPT’s 5 Nano model, and if you wanna call Gemini, we need a little more info:

In both of these calls, I’m only asking for the top 1 query plan so that the results come back quickly. Even the lightest weight AI models can take 15 seconds or more depending on your query complexity. If you’ve been livin’ right, you should see an ai_advice column in your sp_BlitzCache output.

If the query fails and SSMS kicks you over to the Messages tab with a red error… read the error. Some of ’em are pretty descriptive, like if you don’t have your database scoped credentials set up or opened.

If the query succeeds but the ai_advice column shows an error, scroll across to the ai_raw_response column and read that. It usually means the AI service was successfully called, but there was a problem with your API key, account payment, or with the AI payload we assembled. I only wanna hear about the latter.

Once you’ve gotten these quick & easy versions over, you can remove the @Top = 1 parameter to process the default 10 rows instead of just 1, or call it for a particular stored procedure, plan handle, or query plan hash that you’ve been struggling with.

How AI Advice Works on Multi-Statement Queries

When the sp_BlitzCache result set has multiple rows, like if you get the most resource-intensive queries by CPU usage, the result set might have a mix of statements, multi-statement batches, functions, stored procedures, and the like. Here’s an example screenshot:

sp_BlitzCache with stored proc results

In this example, going from the top lines down:

  • usp_VoteInsert – the stored proc itself has no AI advice because some of its child statements show up in the result set. You’ll want to focus on the advice on each individual statement. (In the future, I wanna consolidate all of the advice into a single “big picture” advice for the parent proc.)
  • Statements (parent usp_VoteInsert) – each of these has its own unique advice. The AI has been instructed to focus its advice on just this one statement, but on lower-powered AI models, you’ll see some bleed-over advice from other statements because the query plan contains other statements in the batch.
  • usp_Report2 – line 4 – this stored procedure’s child statements do not show up in the top 10. This happens when a proc has lots of lines in it which individually aren’t a big deal, but when you add them up, the cumulative effect hoists this proc up into the top 10. In this case, AI is called on to give overall advice across the entire plan with all its statements.
  • usp_CommentsByUserDisplayName and usp_Q975 – the procs don’t have advice, because one of their statements is in the list below

And of course, non-proc queries (app queries, dynamic SQL, linked server queries, etc) will show up in the list too, but my particular workload here happens to be all stored procedures.

If you find the proc useful at this point in your environment, then you’ll want to keep going to enable more complex workflows.

Setting Up a Shortcuts Config Table

When you work with AI, you’ll get accustomed to calling different models at different times. Most of the time, I just want quick, cheap advice. Remember, if each API call takes 30 seconds, and you’re dealing with the default sp_BlitzCache result set with 10 rows, that would be a 5-minute runtime! Speed matters a lot here. However, sometimes I want the AI to take a lot more time, and feed it a lot less results – like just pass it a single stored procedure to analyze.

To make this easier, set up a config table with a list of AI providers, and put it in the same database where you put sp_BlitzCache. There’s no real credential in here, just the name of the credential, so there’s no security risk here.

Here’s the create table script, plus the most common 3 models for ChatGPT and Google:

The results:

Blitz_AI table contents

So now, you can switch back end providers a little easier, like this:

In the examples so far, I’ve just been saying @Top = 1 to get you started calling the API quickly, but in real life usage, you’ll likely be using parameters like these:

  • @StoredProcName = ‘usp_MyProc’
  • @OnlyQueryHashes = ‘…’
  • @OnlySqlHandles = ‘…’
  • @DatabaseName = ‘MySlowDB’
  • @SortOrder = cpu, reads, writes, duration, spills, etc
  • @MinutesBack = 60 – only show me queries that have run in the last 60 minutes

And so forth.

Power User Tips for the Config Table

Set DefaultModel = 1 for the row you want to use by default. sp_BlitzCache’s built-in default is gpt-5-nano, but you can override that with the config table. It’ll only pull the first row with DefaultModel = 1, sorted by Id.

Picking a default model is a balancing act between how much money you want to spend, how long you’re willing to wait for the advice to return, and the quality of advice you’re looking for. Generally speaking, the more you pay and the longer it takes, the better advice you’ll get. I wouldn’t use the AI parameters when I’m normally calling sp_BlitzCache to get an idea of the top resource-intensive queries! I would only use the AI parameters when I’m tuning a specific query, like a stored procedure or statement that I’ve narrowed down by its name, plan handle, or query plan hash. And at that point, when I’m tuning just one query, I’m fine with the analysis taking 3-4 minutes – because at the same time I’m asking AI for advice, I’m also reviewing the plan and query manually.

If there’s one thing I’ve learned about AI, it’s that the quality of the advice you get is directly proportional to the amount of work you put into writing the prompt. I’mma be honest with you, dear reader, the default prompt I’ve got in there right now isn’t my best work, but I’m using it as a starting point because it works well across a lot of AI models.

Right now, the default system prompt is:

You are a very senior database developer working with Microsoft SQL Server and Azure SQL DB. You focus on real-world, actionable advice that will make a big difference, quickly. You value everyone’s time, and while you are friendly and courteous, you do not waste time with pleasantries or emoji because you work in a fast-paced corporate environment.

You have a query that isn”t performing to end user expectations. You have been tasked with making serious improvements to it, quickly. You are not allowed to change server-level settings or make frivolous suggestions like updating statistics. Instead, you need to focus on query changes or index changes.

Do not offer followup options: the customer can only contact you once, so include all necessary information, tasks, and scripts in your initial reply. Render your output in Markdown, as it will be shown in plain text to the customer.

When I work with AI, I like to have a lot of different prompts at my fingertips, so I designed sp_BlitzCache’s config table to be able to switch personalities easily using the AI_System_Prompt_Override column. I did this on a per-provider basis because I like to have several different personalities, even for the same AI provider. I have personalities for:

  • Index tuning only – as in, here’s the query, but you’re not allowed to change it, you’re only allowed to make index recommendations
  • Query tuning only – no changes allowed to indexes, stats, database-level or server-level settings, etc.
  • Both allowed
  • Code review only – for when we’re not really concerned about performance, but we wanna know if there are any bad T-SQL smells in here
  • Reduce blocking and deadlocking – for when I’m specifically facing those problems with a query
  • Refactoring – when I’d like help rewriting a multi-step query or row-by-row query into a more efficient process

So that way, I can do stuff like this:

I know you’re gonna be excited about that and you’ll be eager to grab those prompts, but hold that thought for now. Today’s announcement is about helping you get the sp_BlitzCache-calling-AI plumbing up and running so that you can test it and give feedback on that. We’ll talk through fancier system prompts in subsequent blog posts.

If You’d Like to Read the Code

In the sp_BlitzCache script itself, do a search for “Artificial Intelligence”.

sp_BlitzCache’s interim results are stored in the global temp table ##BlitzCacheProcs, and there’s a spid column to denote which set of queries you’re working with, since multiple people can call sp_BlitzCache simultaneously.

The first thing that happens is an update statement to update all of your ##BlitzCacheProcs rows to populate the ai_prompt column with the query’s metrics, query text (from the plan cache), and query plan. (At some point in the future, I’d like to pull the full unabridged query text for stored procs & functions from their definitions, but I’m not doing that today.)

Then if @AI = 1, we set up a cursor to loop through your ##BlitzCacheProcs rows, call sp_invoke_external_rest_endpoint for each one, parse the results, and then update the global temp table’s columns for:

  • ai_advice – the parsed results from your LLM
  • ai_payload – the data we sent your LLM, for diagnostic purposes
  • ai_raw_results – the raw data we got back, again for diagnostics and debugging

If you’d like to contribute code to handle other LLM prompt & result formats (like Anthropic Claude, Z.ai, OpenRouter, etc), I’d love a pull request for it. I do wish we could support Github Copilot, but they don’t allow direct API access to their chat/response completion models. They have a REST API, but it’s only for monitoring and managing Copilot. There are hacky workarounds, but I don’t wanna deal with supporting those, especially given that Github will be actively trying to hunt down and stop those workarounds.

If you want to contribute code, I just have to set expectations that I can’t do collaborative real time discussions – I’m usually working with clients or on the road. Folks who want to bang on it, can, though, and submit pull requests.

When You Have Questions or Feedback

If you’re getting an error, post a message in the #FirstResponderKit Slack channel or start a Github issue. If it’s your first time in the community Slack, get started here. Your message or Github issue should include:

  • The exact query you’re running – copy/pasted straight out of SSMS, no changes please
  • If the query succeeds and you get an AI response, copy/paste the contents of the ai_raw_response column – just the rows having problems is fine
  • The output from the Messages tab in SSMS – copy/paste out of there, but feel free to sanitize stuff out of there like your company queries
  • Then try running it again with the @Debug = 2 parameter, and copy/paste the Messages results again, with your company’s stuff sanitized out

If you can’t include the exact query you’re running and the out put of the Messages tab, do not post a message. I need that stuff to do troubleshooting, period, full stop.

Please don’t email me errors directly. I’ve learned over time that when there’s a problem, I’ll get 100 emails saying the same thing, but if we work together through Slack, I can leave messages in there telling people what’s a known issue and what’s fixed, and that makes my life way easier.

Finally, I Just Wanna Say Thank You.

This? It's just coffee, as far as you need to know.Readers like you are what make it possible for me to spend so much time giving back to the community, building tools like this and giving ’em away for free.

When you buy my classes & apps, you’re not only paying for your own training, but you’re effectively sponsoring my work. YOU are the ones who make it possible for me to dedicate the time to publish 3 blog posts a week, maintain the First Responder Kit, give away free training classes, and record my Office Hours videos. You’re the ones who make me confident enough to say, “Hell yeah, I’ll put more time into sp_BlitzCache and share my killer AI prompts for free, because these people make my lifestyle possible. And in 2026, I’ll add AI advice into sp_BlitzIndex, too.”

Thank you, dear reader. Here’s to a great 2026, and using AI to not just keep our jobs safe, but make us even more successful in the new year.


Query Exercise Answer: Generating Big TempDB Spills

Query Exercise Answers
10 Comments

In last week’s Query Exercise, I challenged you to play some code golf to generate big spills with tiny T-SQL.

Today, I’m going to walk you through my thought process – the initial attempts I tried and failed with, and the discoveries I made along the way, because I think it makes for fun storytelling.

In the past, when I’ve seen people trying to generate a lot of data quickly, they’ve cross-joined SQL Server’s built-in system tables like sys.all_objects or sys.all_columns. The more times you cross-join them with each other, the more rows you create with a cartesian join. However, that takes up a lot of space to type, and we’re playing code golf here.

Attempt #1: Sorting Big Text Data

SQL Server 2022 brought big guns to this fight with the GENERATE_SERIES function, which accepts nice big numbers as input parameters. You can simply put in a starting number and ending number, and generate that many rows. Just one call to this can produce more rows than cross-joining two system tables in the same amount of typed characters.

I figured I’d generate a large number of rows, each with a big chunk of text attached to it:

For each row returned from GENERATE_SERIES, I’m also calling the REPLICATE command to build what the cool kids call “unstructured data”, a big long nasty string. Whatever you do, don’t actually execute that query, but the estimate is amusing:

Big data

That gets me a lot of data, but no memory is required to generate those results. Heaven forbid you execute that thing, because whatever has to absorb the firehose of data is going to need a hell of a lot of memory. But our goal here is to generate workspace spills, and the easiest way to do that is to add an ORDER BY:

That doesn’t work, because SQL Server seems to understand that the replicated strings are all the same, so there’s no sort in the plan. Same if I try to just sort by the values, either ascending or descending. But slap just a little math in there:

The plan gets a nasty little sort:

Get Sorty

The desired memory grant on the sort is fairly high, at 1,125,899,906,843,128KB:

Grant with a capital G

That’s 1,125 petabytes, or 1.1 exabytes of memory desired.

Now normally, if I’m trying to get SQL Server to spill to tempdb, I might try to get it to underestimate the number of rows coming into the sort, thereby lowballing the memory grant. But who cares? None of you have a SQL Server with 1.1 exabytes worth of memory, nor 1.1 exabytes worth of TempDB space, so I feel fairly confident that when you hit execute on that query, you’re gonna have a bad time.

Again, that’s the estimated plan because I can’t hit execute on that, because I’d have to deal with a metric poopton of results. I can’t just select TOP 1, either, because I’ll get a lowball memory grant, and I’m going for a high score here. In fact, the memory grant difference between these two versions of the query is one of the most comical differences I’ve ever seen:

  • SELECT TOP 1: 24KB memory desired
  • SELECT TOP 100: 24KB memory desired
  • SELECT TOP 101: 1.1 exabytes desired

Given the size of the data I’m replicating, I’m not even sure SSMS would react okay to getting the top 101, so we’ll dump the results into a temp table:

The TOP 101 makes the resulting execution plan quite wide:

Wide query plan

And the desired memory grant is still at 1.1 exabytes:

I too desire 1.1 exabytes of memory

Finally, we have a query I can execute without bringing SSMS down with it, hahaha. I mean, I can execute it, but I can’t get the actual plans because it’ll fill up TempDB with astonishing speed.

If I put all of that on a single line, that’s 159 characters including spaces. I could make the query shorter by removing the TOP 101 and the INTO #t parts, but I think the fun part of a query like this is being able to knock a SQL Server over without taking your workstation down with it.

I hit execute, giddily excited at the thought of trashing my lab server, only to be met with the Sad Trombone of Disappointment:

What the… I’m not using any integers here, only bigints. I started troubleshooting to see where SQL Server was implicitly converting my bigints into integers, and along the way, I stumbled upon something delightfully terrible.

Attempt #2: Exploiting GENERATE_SERIES

As part of my troubleshooting, I wanted to make sure GENERATE_SERIES really did work with bigints, so I started by getting the smallest numbers possible:

And that returned data instantly. But when I tried to get the biggest numbers, what I should have done was passed in the max bigint as my starting point, and tell it to increment by -1 with each step. That’s not what I did – I just thought I could take a shortcut:

I hit execute on that, waited a few seconds, waited a few more seconds, and realized I’d stumbled upon something wonderful. SQL Server doesn’t have any logic to reverse-engineer your ORDER BY, and push that sorting down into GENERATE_SERIES. No, you asked for a metric poopton of data, so SQL Server will gladly build all that data and sort it for you!

Simple generate series plan

And here’s the funniest part: the desired memory grant is still 1.1 exabytes, even though I’m not generating or sorting any text:

Exabyte grant

Since we’re not generating big text data, I can just plain hit execute on that query without worrying about overwhelming SSMS with the result sets (if they ever get produced.) As the query runs, it gradually fills up TempDB, as shown by sp_BlitzWho @ExpertMode = 1, which shows things like tempdb allocations and memory grant usage:

TempDB usage

The live query plan (also shown by sp_BlitzWho) dutifully shows the rows marching on through the plan:

Live query plan

And after a couple of minutes, we run TempDB out of space:

Nice! So that’s a 100-character query that runs the server out of TempDB space, maxing out possible spills:

In the spirit of code golf, we can do a little tuning to reduce characters:

  • Remove the TOP 101 since the query won’t finish anyway, and I don’t have to worry about overwhelming SSMS’s memory
  • Change the select to select *
  • Change the bigint numbers to the min/max integer sizes – but then our desired memory grant is “just” 300GB

That brings us down to 70 characters for an estimated 300GB spill:

Although I guess as long as I’m at 70 characters, why use small numbers? Let’s change them all to 9s to get the same string length:

(Note that this technique ONLY works if you pass in -9999999999 as a starting number, or some other bigint. If you just try to GENERATE_SERIES(0,9999999999) it will fail, saying both the starting and ending numbers need to be the same datatype.)

And we get a 1.5TB desired memory grant, which is pretty respectable:

1.5TB memory grant

So I end up with a few final entries:

So in less than 100 characters, you can drain a SQL Server out of drive space for TempDB. Of course, keep in mind that the estimated memory grant may not be the amount of the actual spill – but I’ll leave it to the reader to provision a petabyte SSD and then let me know what the actual spill amount was.

I thought that was pretty good, but I watched y’all’s solutions come in, and I learned stuff!

Reece Goding’s Solution

Reece Goding (Github) and Andy Mallon (BlogLinkedIn) worked on a similar theme of using GENERATE_SERIES, but they leveraged a sneaky trick: using scientific notation to shorten the long integer text. To understand Reece’s 72-character solution, you have to know that this works first:

Which produces a result of 9000000000000000000. That saves you some typing:

This is better than my solution in TWO ways. First, both Reece and Andy Mallon figured out that you could take out the spaces between SELECT*FROM and 1DESC, which blows my mind and saves “strokes” in code golf.

Second, the use of local variables surprises SQL Server because the query optimizer doesn’t realize that @H is going to be such a big number, so SQL Server lowballs the amount of memory required to run the query. Second, both Reece and Andy Mallon figured out that you could take out the spaces between SELECT*FROM and 1DESC, which blows my mind.

One downside is that it means the estimated plan for Reece’s query doesn’t show the true awfulness of the spill that would result. To approximate the spill, I’m going to revise his solution to natively use the @H values:

That would get a 1.1 petabyte memory grant, just like my 87-character solution – but Reece’s solution is just 72 characters! Excellent work. Andy Mallon used a similar approach, but without the variables.

Tuyen Nguyen’s Solution

Tuyen (Github) wrote, “I use the cardinality estimation limitation from using local variables so SQL Server misestimates the rows coming out from the function and going into the sort, guaranteeing a spill to TempDB on any server. A giant string sort key then magnifies the spill. The bigger @N is, the larger the spill, but also much slower the query is.”

I love this because it’s the spirit of what I thought for my first attempt, but she did a much more elegant job than I did, especially with the local variable. Top notch. Tuyen’s query:

In the spirit of code golf, let’s switch the numbers to 9s, to get a worse query with no additional characters – on the order by, I need to use the max int size:

Because this solution relies on mis-estimations, I have to actually execute the query to see the memory grants, so I’m going to modify it to only return TOP 101 to avoid causing problems with SSMS. The actual query plan:

Query plan with memory spills

Heh heh heh, there you go: 3,995,134 pages is a 30GB spill. Now, that may not sound like much, but you can use that exact technique and simply amp up the numbers in the query, and get yourself a much bigger spill.

How I Adapted Mine

Ripping off Mallon’s idea of ripping out spaces, here’s what I ended up with for my own answer:

So in less than 100 characters, you can drain any SQL Server out of drive space for TempDB. Reece’s solution beats me though at 72 characters for that same giant spill.

If you liked this exercise, check out the ideas from other readers, and seriously, use SQL Server 2025’s Resource Governor feature to prevent users from doing something similar.


Updated First Responder Kit and Consultant Toolkit for November 2025

First Responder Kit Updates
0

This quarter’s release includes new checks for SQL Server 2025’s new memory pressure warnings, Azure SQL DB’s operations in progress, accurate CPU usage on Docker containers, warnings about not using partitioned statistics where appropriate, and much more.

If you’d like to thank me for my work in maintaining the First Responder Kit, the best way to do it is to pick up my training while it’s on sale for Black Friday. I get your money (or hopefully your boss’s money), you get training on how to use the FRK in powerful new ways, and your boss gets somebody who can solve problems faster. Everybody wins! Not to say that you’re not already a winner. You’re reading this, and you’re using the First Responder Kit to avoid crappy amateur-hour work. You’re already a winner.

Update 2025-11-27: known issue: sp_BlitzIndex fails to install on case-sensitive collations. You can download the dev branch version of sp_BlitzIndex to work around that, and I’ll have another release out shortly.

To get the new version:

Consultant Toolkit Changes

Updated to this quarter’s First Responder Kit, but no changes to the spreadsheet template. This release adds behind-the-scenes code to export to JSON, and then import that data into a database so you can keep a centralized database with all of your clients’ diagnostic data on their servers for easier analysis. If you’re interested in testing that, email me at help@brentozar.com with a short description of your use case.

sp_Blitz Changes

  • Enhancement: warn if Resource Governor is set to cap TempDB usage by percentage, but the TempDB file config won’t allow it. (#3721 and #3729, thanks Vlad Drumea.)
  • Enhancement: exclude AWS RDS built-in objects from warnings. (#3712, thanks Yoni Sade.)
  • Enhancement: add more information about Agent jobs w/o failure emails, like total executions, failed executions, last execution time, and status. (#3732, thanks Vlad Drumea.)
  • Enhancement: warn if SQL Server 2025’s optimized locking is enabled on a database, but RCSI is not. (#3741)

sp_BlitzBackup Changes

  • Fix: no longer mangles NVARCHAR database names. (#3724, thanks OZPTR.)

sp_BlitzFirst Changes

  • Fix: accurately report CPU usage on Docker containers. (#3710, thanks Reece Goding.)
  • Fix: added a missing semicolon to fix a fuzzy underline in VS Code, bless its heart. (#3745, thanks Adrian Burla.)

sp_BlitzIndex Changes

  • Enhancement: more flexible handling of database, schema, and table name inputs and outputs. (#3739, thanks Vlad Drumea.)
  • Enhancement: sort the temporal tables in Mode 4 output. (#3727, thanks Reece Goding.)
  • Enhancement: show JSON indexes in the list of indexes, but none of the data is in place yet, like what column they’re on, or sizes, or create/drop statements. Just literally wanna warn people that the object has a JSON index on it. (#3736)
  • Fix: some columns could be null when the missing indexes report was run for a specific database. (#3723, thanks Chad Baldwin.)
  • Fix: really long combos of database, schema, and table names would throw an error because a dynamic SQL variable wasn’t large enough. (#3746, thanks MisterZeus.)

sp_BlitzLock Changes

  • Fix: could produce duplicate rows in some cases. This query was hella complex, and I punted it over to ChatGPT, who gave me a fix in less than a second. Fantastic. Is the fix perfect? Probably not. Neither was the original code, ha ha ho ho. (#3711, thanks Tisit.)

sp_BlitzWho Changes

  • Enhancement: you can now pass in @OutputTableName = ‘schema.table’ without using the separate @OutputSchemaName parameter. You can also use brackets around your object names. In Azure SQL DB, where cross-database queries aren’t supported anyway, you can now just pass in @OutputTableName without specifying @OutputDatabaseName and @OutputSchemaName. (#3734, thanks Vlad Drumea.)

For Support

When you have questions about how the tools work, talk with the community in the #FirstResponderKit Slack channel. Be patient: it’s staffed by volunteers with day jobs. If it’s your first time in the community Slack, get started here.

When you find a bug or want something changed, read the contributing.md file.

When you have a question about what the scripts found, first make sure you read the “More Details” URL for any warning you find. We put a lot of work into documentation, and we wouldn’t want someone to yell at you to go read the fine manual. After that, when you’ve still got questions about how something works in SQL Server, post a question at DBA.StackExchange.com and the community (that includes me!) will help. Include exact errors and any applicable screenshots, your SQL Server version number (including the build #), and the version of the tool you’re working with.


The First SQL ConstantCare Badges Just Went Out

SQL ConstantCare
6 Comments

Ever wonder how big your database estate is compared to others? Whether you’ve got more servers, more databases, more workloads, or more problems?

Well, if you’re one of my SQL ConstantCare® customers, check your email: our first set of badges just went out!

SQL ConstantCare Badges

With SQL ConstantCare®, your SQL Servers send us diagnostic data once per day, and our systems analyze it and email you specific, actionable recommendations to make your databases faster and more reliable. Now, we also tell you where you rank relative to the hundreds of other companies sending in their diagnostic data.

Some of the awards are things to be proud of, while others, well…

Stinkers

You can sign up for free to get basic health advice, or get on the paid plan to get performance advice, too. Both tiers get the monthly badges, so sign up now to find out if you’re doing a great job compared to everybody else – or, find out if you get one of the stinkers!


Functions in the WHERE Clause Are Bad… Right?

Statistics
11 Comments

Nah, not necessarily. SQL Server’s query optimizer behavior keeps changing with every freakin’ version. Let’s illustrate it with a simple query against the Stack Overflow Users table:

Because there are a lot of people in Ahmadabad, SQL Server gets that estimate correct, which is really important when you start joining this table to other tables, like finding their number of comments or posts:

Accurate estimates

SQL Server brought back 1856 of an estimated 1856 rows. So far, so good. But what happens when we start running functions on the parameter we’re searching for, like if it’s a parameter that we need to clean up. How will this affect our estimated number of rows:

Your knee-jerk reaction is probably to say, “FUNCTIONS BAD!” especially given that this is a case-insensitive database. But let’s test that hypothesis across time, across different database compatibility levels:

The actual execution plans tell an interesting story:

  • 110 (SQL Server 2012): 1,856 estimated rows, perfect (1856 rows actually match) – so the functions didn’t matter
  • 120 (2014): 65 estimated rows – worse
  • 130 (2016): 65 estimated rows
  • 140 (2017): 65 estimated rows
  • 150 (2019): 15 estimated rows – EVEN WORSE
  • 160 (2022): 1,856 estimated rows – whew, better – the functions don’t matter anymore
  • 170 (2025): 1,856 estimated rows – functions still don’t matter

The “new” cardinality estimator introduced in SQL Server 2014 seemed to be unable to parse its way through the functions, even though previous versions had been able to do it. Somehow SQL Server 2019 made things even worse, and then presto: SQL Server 2022 fixed it. I’m not talking about any fancy adaptive cardinality estimation stuff either – note that I’m specifically hinting the compat level that I want in the query hints.

In the last few years, with clients on SQL Server 2022, this has led to a couple of awkward situations. I’ve been on a client call where the DBA would pull up a slow query onscreen, point at the WHERE clause, and scream, “I’VE TOLD THE DEVELOPERS A MILLION TIMES, YOU CAN’T PUT FUNCTIONS IN THE WHERE CLAUSE – BRENT, YOU TELL ‘EM!” I’d remove the functions, and the query would perform absolutely identically, with the same estimates.

What about functions on the table contents?

Above, we were running functions on the incoming parameter. Now, let’s flip it around and run functions against the table contents, but leave the parameter as-is:

The actual plans for this one tell yet a different story:

  • 110 (SQL Server 2012): 163,186 estimated rows – bad, since only 1856 rows come back
  • 120 (2014): 65 estimated rows – also bad, wildly underestimated
  • 130 (2016): 891,751 estimated rows – swinging the pendulum into the other overestimated direction
  • And all subsequent versions still use the same bad estimate as 2016

Functions on table contents generally are a terrible thing, especially since they also force SQL Server to do an index scan, processing all of the rows in the table, running the functions on every single row. It’s CPU-intensive and time-intensive.

In the Mastering Index Tuning class, I explain that to work around queries like this, you can add a computed column:

And after doing that, all of the compatibility levels’ actual plans show the same estimated number of rows: 6,547 rows, pretty doggone close to the actual 1,856 rows produced. Create an index atop that column, and:

On Enterprise Edition, all of their execution plans have nice, tidy index seeks with just 1,856 rows estimated, too. So to recap, for this particular combination of functions, are they actually bad in the WHERE clause?

  • If they’re on the parameter side, and you’re on 2022 or newer compat level, probably no
  • If they’re on the parameter side, and you’re on 2014-2019, probably yes
  • If they’re on the table side, and you’ve put in computed columns (and even better, indexed them), probably no
  • If they’re on the table side, and you haven’t put in computed columns, probably yes

This blog post can’t cover every combination of functions.

Many of SQL Server’s built-in functions have different abilities to give you accurate cardinality estimates, on different versions and compatibility levels, not to mention your own user-defined functions, not to mention that queries often combine multiple functions!

Your goal as a performance tuner isn’t to make knee-jerk reactions of “FUNCTION BAD” – but rather to use a scientific approach like this to actually test. Check to see whether the cardinality estimation is correct, rule out parameter sniffing, make sure you’re getting good index usage, and minimizing the work required by the query. You don’t wanna just point at something and yell at it, because if you’re proven wrong, then managers are gonna point and yell at you.

This post is also a great example of why I’m updating and re-recording my Mastering classes for SQL Server 2025. Most of you are still on SQL Server 2019 – which is great, I love it, and it’s stable – but in the next year or two, you’re going to be embarking on upgrade projects to 2022 or 2025. Your performance tuning journey is gonna start again, and I’m gonna be right here to help you stay up to speed.

Speaking of which, my classes are on sale this month for Black Friday! Go check ’em out.


T-SQL Query Exercise: Generate Big Spills

Query Exercises
18 Comments

When I was testing SQL Server 2025’s new ability to limit TempDB usage with Resource Governor, I wrote a few fun diabolical demos. One of them was to generate giant spills to TempDB, quickly.

When you’re looking at an actual (not estimated) query plan, and you see a yellow exclamation point on a sort operator, that means:
Fundamentals of TempDB

  • SQL Server made a guess about how much memory it would need to accomplish that sort
  • The amount of data coming into the sort wildly exceeded SQL Server’s plans
  • SQL Server resorted to writing that data to TempDB: aka, the data spilled over from memory to disk

This can happen in a few interesting ways:

  • SQL Server under-estimates how much data will be coming into the sort, or
  • SQL Server accurately estimates it, but simply doesn’t have enough RAM available to handle it
  • The query goes parallel, the memory is evenly divided across threads, but the data is not evenly distributed, so some threads run out of memory while other threads’ memory sits unused

Sorts aren’t the only operators that can spill, either!

So for this week’s Query Exercise, we’re going to play a little code golf: write the shortest query you can that will produce the biggest spills. Code golf is a fun way to think out of the box, freeing yourself from traditional limitations around the tables & queries you usually work with.

Those of you who have my Fundamentals of TempDB course can check out this brand-new class module on Resource Governor, and you’ll get an idea of the approach I took. (If you don’t have that class, check out my Black Friday sale.) I couldn’t let this problem out of my head so I ended up toying around with it, iterating over that approach for a while to make it much shorter than the class demo.

Put your queries in a Github Gist and include those link in your comments. Check out the solutions from other folks, compare and contrast your work, and next week I’ll circle back with my answers & thoughts. To give you a rough idea of what to aim for, my solution uses less than 100 characters, and will run any server out of TempDB space with a ginormous spill. Have fun!


SQL Server 2025 Is Out, and Standard Goes Up to 256GB RAM, 32 Cores!

SQL Server 2025
35 Comments

Today marks the official birthday of Microsoft SQL Server 2025. Here’s where to download the evaluation version. Here are the top things you wanna consider as you talk to your managers, developers, and end users.

The feature differences between 2025 Enterprise and Standard have been revealed, and the news for Standard Edition folks is spectacular: it now supports up to 32 CPU cores and 256GB RAM! Other editions get a couple of tweaks: Express Edition now supports up to 50GB per database (although it’s still capped at a single CPU core), and Web Edition is no more.

If you build your own apps and you use Standard Edition in production, there’s a new Standard Developer edition that mimics the behavior and limitations of Standard Edition. During installation, when you hit the edition dropdown, choose Standard Developer. You can read about 2025 Standard’s limitations here. The big one people usually think about is the RAM cap, but there are also subtle performance differences, like Enterprise Edition’s advanced scanning, aka merry-go-round reads.

If your TempDB runs out of space, I love Resource Governor’s new ability to cap space usage for workload groups (or everybody altogether.) This should be a part of everybody’s standard builds for 2025 servers – especially because of a fun new bonus: Resource Governor is now available in Standard Edition too.

If you use columnstore indexes, 2025 continues upon every release’s investments, with a bunch of improvements to make management easier on ordered indexes. In my early tests with clients, we’ve seen massive improvements in easier, faster, more online maintenance that we simply couldn’t do before.

WHO IS THIS JASON AND WHY IS HE IN MY TABLESIf you’re storing and querying JSON, either in NVARCHAR(MAX) columns or 2025’s new JSON data type, there’s a new JSON_CONTAINS search function, which in many cases is quite sargable with the new JSON indexes. This should only be used when you can’t predict which columns you need to index, and index them with computed columns instead though, as I explain in this new Indexing JSON module in my Mastering Index Tuning class.

If you’re doing AI projects, the most interesting piece is the ability to call things like ChatGPT using the new sp_invoke_external_rest_endpoint. I’m genuinely excited for that and I’ve talked to a bunch of clients who are ready to start putting it through its paces for a lot of different use cases. For AI, 2025 also brings a new vector data type, vector functions, and vector search, but I’m much less enthusiastic about those. I imagine those features will see adoption rates similar to spatial data and graph data – just not that high in SQL Server.

If you’re into regular expressions, they’re now built in, so you don’t have to install third party tools for it anymore. However, I’d caution you to test your queries first: I’ve seen horrific performance. SQL Server doesn’t have any magic secret sauce to make your convoluted search conditions sargable, and string processing has never been SQL Server’s strong point. This feature is kinda like cursors, in that there will be very specific places where it makes sense, usually in one-off utility queries, not the kind of thing you wanna put in front of end users.

If you struggle with lock escalations and multiple writers, enable the new Optimized Locking feature in conjunction with Accelerated Database Recovery, and you can have millions of rows involved in write locks, and you still won’t see lock escalation. I demo it in this module of my Mastering Index Tuning class.

There are other improvements, like a few Intelligent Query Processing improvements, lots of Availability Group improvements, and a new faster (but not necessarily smaller) backup compression algorithm to test.

SQL Server versions rarely get big breaking changes, and this release is no exception. There’s a small list of removed and deprecated features, but nobody’s gonna miss that stuff. Data Quality Services? Master Data Services? Synapse Link? C’mon, who would be naive enough to use that stuff? None of my readers, just like they wouldn’t even dream of considering Big Data Clusters or ledger tables. We know better. Databases are persistence layers, not app servers or blockchain spam.

However, 2025 does have several breaking changes focused around secure connectivity between services and servers. 2025 adopts TDS 8.0 support with TLS 1.3, which breaks some linked servers and replication setups. The fixes in that link sound pretty easy, but I’m not the kind of guy who uses linked servers or replication, so that’s left up to you. I just feel like I gotta mention it first in the headlines because it’s so rare that we get known breaking changes, and you wanna head these off before you get surprised by ’em.

But best of all, SQL Server 2025 sees been massive price cuts to respond to market pressure from other databases, and to correlate with the fact that a lot of the new features drive up cloud revenue, like syncing your data up to your pricey new Fabric reporting system and calling Azure OpenAI via stored procedures. Microsoft clearly recognized that the more SQL Server 2025 is adopted, the more you’ll spend on cloud services, and the less likely you’ll be to switch to open source databases, so they… almost had you fooled, didn’t I? Sorry, bud. The price list PDF is out, and prices look the same:

  • SQL Server 2025 Enterprise Edition: $7,562 per core
  • SQL Server 2025 Standard Edition: $1,973 per core

When should you upgrade to SQL Server 2025?

If you’re a regular reader around here, you’ll know that I tend to be a pessimist. I tend to recommend that folks only upgrade to a new version when there’s a specific feature in the new version that they absolutely require for their applications. Even then, I recommend that they test that specific feature first, make sure it actually relieves the pain you’re having in production, and make sure it behaves the way you want.

How about a nice cup of Standard Edition?I think SQL Server 2025 is different. I think you should adopt it sooner rather than later.

If you build your own apps, in the old days, we’d say things like “wait until Service Pack 1 before you put it into production.” These days, Microsoft has done a pretty good job with compatibility levels: in a development environment, you can run SQL Server 2025, but put the databases in the compatibility level that you’re using in production, and be fairly confident that the majority of your query plans will look/feel the same.

I’d recommend putting 2025 into your development environments sooner rather than later.

Initially, keep the databases in the older compatibility level, and warn the developers not to use any new data types, functions, or T-SQL capabilities from 2025 until you’ve got the confidence that you can run 2025 in production. Use the development servers as testbeds for things like your management & monitoring apps, linked servers, replication, etc, and gain the confidence you need to gradually put 2025 into production. You can start gaining that confidence in low-load development environments now, without being worried about the inevitable bugs that’ll get fixed in the first few Cumulative Updates.

If you only host 3rd party apps, hold back until your vendors agree that SQL Server 2025 is one of their supported platforms. Yes, in terms of what the apps feel, 2025 will probably behave identically to prior versions as long as the compatibility level is set to a level the vendor supports. However, you don’t wanna be the one who gets blamed when the vendor says, “Sorry, you’re running a version of SQL Server that we don’t support.” The vendor might blame you for issues that are completely unrelated to 2025, and I can’t have that happening to you. You need this job.

But as soon as they say 2025 is one of their supported platforms, I think you should go for it. These days, SQL Server has more tools than ever to help you manage query performance when you can’t touch the queries. Turn on Query Store, put the database into 2025 compatibility level, turn on automatic tuning (aka automatic plan regression), and use the tips I describe in How to Go Live on a New Version of SQL Server.

To learn more about SQL Server 2025, check out the posts I’ve written on its new features.


I Wish SQL Server Had These Four Innovations.

SQL Server
13 Comments

I love SQL Server, and I’m excited for the release of 2025. I think the query processing keeps getting better, which means your apps are gonna go faster, with less code changes.

The AI stuff is a little fluffy, and I don’t think time will be kind to it, but I totally understand why Microsoft felt like they had to chase that buzzword. All the cool kids are doing it, just like they did blockchain and IOT, and Microsoft is always looking for any way they can to keep SQL Server relevant to developers.

However, I keep seeing stuff in other databases that really is innovative, and would genuinely make developers’ lives easier, and I find myself wishing that Microsoft would focus on these kinds of real-world usability improvements. Here are just a few that have come across my desk recently.

Optional From-First Syntax

Imagine being able to write your queries like this:

If you wrote the from and joins first, then IntelliSense would actually be useful, because the select/where/order would all understand where you’re pulling data from. I am so sick and tired of starting my queries by typing:

And then up-arrowing back up to the top to type SELECT, because now I get autocomplete. I have to do this so freakin’ often that the up and down arrows on my keyboard are visibly more worn out than most:

UP UP DOWN DOWN LEFT RIGHT LEFT RIGHT B A START

The only other keys that are even more worn out are the ones on the left side, and I’ll leave the reason why as an exercise for the reader:

Maybe I use a lot of Softwares as a Service

DuckDB lets you type FROM first, but best of all, it’s optional – not required – so your queries work either the old school way, or the new from-first way. It’s a great way to gradually implement a better querying experience.

Native Rest or GraphQL API

Microsoft’s Azure Data API Builder sits in between your apps and your database. The DBA sets up the DAB, defining which tables people are allowed to query and update, and how joins are formed. It’s like a reporting universe, but for your applications.

Then, developers simply hit the API with REST or GraphQL requests like:

And get easy-to-consume JSON result sets. It supports filtering, sorting, joins, etc, and even better, it has pagination by default, encouraging good, scalable development practices.

Because both the API inputs and outputs are standards, this would be usable across not just SQL Server, but other Microsoft (and other!) database products, especially if Microsoft implemented it in stuff like Azure SQL DB and CosmosDB. Instead of having to build Tabular Data Stream (TDS) support into every app development platform, any dev platform that supported the REST or GraphQL APIs and JSON output would instantly have access to Microsoft’s best relational databases. Easier app development = better chance that Microsoft databases will get picked to play.

Yes, Microsoft built this innovation, and yes, they made it open source. But because it’s buried in with the rest of their open source projects, it’ll never get the marketing attention it needs, and big companies often refuse to install any add-ons. If it was part of the boxed product itself, I think it’d get rapid adoption for new development projects. Will it find its way into the boxed product? Probably not, because it won’t increase licensing sales.

Better Query Plans with Plan Stitching

Back in 2018, Microsoft published a research paper about plan stitching: breaking query plans up into parts, compiling the parts separately, and then stitching together the various parts at runtime. It’s kinda like OPTION (RECOMPILE), but inside the query itself, while it’s running.

This would be so helpful for those 20-table-joins I see so often in reporting queries, not to mention people who like to join half a dozen nested views together. SQL Server just doesn’t stand a chance of building an accurate overall plan in situations like that. Instead, I’d love for it to start pulling data back from the first couple of tables, and then make different decisions based on how many rows come back from each table.

The research looked promising, albeit challenging (and expensive) to actually implement and test. I understand why it didn’t make the cut, but that doesn’t stop me from wishing for it. However, the next innovation is less nice-to-have, and more come-on-why-the-hell-don’t-we-have-this-yet….

Automatic Backups, Encryption, and Patching

When you provision an AWS RDS SQL Server instance (and you get similar options in Azure SQL DB, I just happened to be provisioning an RDS instance so I was reminded of this), you get a simple set of checkboxes and dropdowns:

Automatic maintenance

Company-wide settings policies like Cost Threshold, MAXDOP, trace flags, etc – just a matter of picking them from a dropdown. Automatic backups? Already checked for you, with continuous point-in-time backups managed by the robots. Encryption? Of course. Automatic patching? Sure, if you want (not on by default) and you can even pick the maintenance windows when you want them to happen.

It’s 2025. Why are we screwing around with vector data types when SQL Server still installs itself in such a vulnerable state, with no backups, no corruption checking, and no ability to patch itself, and not offer any options during setup to mitigate these glaring problems? I hate whatever product manager is letting that happen. There’s a word for them, what is it, trying to think of it…

THE WORD IS ASS


How to Query JSON Data Quickly in SQL Server, Part 2: SQL Server 2025

SQL Server 2025
11 Comments

Mastering Index TuningSQL Server 2025 and .NET 10 bring several new improvements to storing JSON natively in the database and querying it quickly.

On the SQL Server 2025 side, the two big ones are the new native JSON indexes and the new JSON_CONTAINS function. Let’s see their improvements in action. On the .NET 10 side, EF 10 not only supports the new JSON data type, but on databases of compatibility level 170 or higher, EF will automatically migrate JSON data from NVARCHAR(MAX) data types over to JSON the next time you do a migration, as explained in the What’s New in EF Core 10 doc. That makes it especially important for you to understand how the new JSON indexes work, because they may be coming at you quickly the instant you change your compatibility level.

In my opinion, the short story is that for now, you should probably stick with yesterday’s technique using indexed computed columns. The long story is in the new JSON indexing module of my Mastering Query Tuning class. The rest of this post is the medium story, I suppose.

Start with storing the data in JSON.

As we did in yesterday’s post about how to get fast JSON queries pre-2025, we’ll create a Users_JSON table, but this time we’ll use the new native JSON datatype, then create a native JSON index on it. The create table statement uses 2025’s new JSON datatype:

Note that last CREATE JSON INDEX statement because there are a few cool things about it.

Create a new JSON index.
Make a pot of tea. Buy RAM.

  • The JSON part is new as of SQL Server 2025.
  • I only specify the source column (UserAttributes) – and whatever’s in it gets indexed, period.
  • WITH (MAXDOP = 0) lets me throw all of the server’s CPU power at it, no matter what the server-level MAXDOP setting is.

That last part has always been supported with any create index statements, but… it’s especially important here. What we’re doing is like creating an index on every column, something you might imagine would take quite a while. It’s like… well, you know, while I’m waiting for the above to finish, I can demo exactly what it’s like on another table. It’s like creating a clustered columnstore index on that same amount of data, breaking the data up into columns, storing it separately, sorting it, and compressing it. That can be pretty time and CPU intensive, as demonstrated by how long it takes to create a clustered columnstore index:

Creating a columnstore index in 18 seconds

18 seconds on the clock, and 43 seconds of CPU-burning time – that’s a lot for a table that’s only 8 million rows and 1GB in size.

Although in comparison, let’s look at my fancy new JSON index now that it finally finished building, and:

JSON index metrics

I just… I don’t understand:

  • CPU time: 43 seconds for columnstore, 24 minutes for the JSON index
  • Elapsed time: 18 seconds for columnstore, 8 minutes for the JSON index

What the hell is this thing doing?!? Sure, I get that the columnstore index has a defined list of columns to start with, whereas the JSON object could have any number of columns, including nested ones, but this feels like a best case scenario – my JSON doesn’t have anything nested at all, and it’s still this bad! Here’s what the query plan shows:

Actual query plan for JSON index creation

I don’t think I’ve ever seen an index creation have two stages before. Looks like the first part took about 6 minutes, and there’s a yellow bang on the insert operator – hover your mouse over that, and there’s a warning about using more memory than the query was granted:

Memory grant warning

Sounds like we better go check out the memory grant properties to see how much memory the index creation wanted – let’s look at Desired Memory:

My kingdom for some commas

7,722,451,296.

SQL Server wanted eight terabytes of memory in order to build a JSON index on a 1GB table. And in fact, if our server had 32TB of memory, 8TB would have actually been granted to this query. How’d you do that? I’m not even mad. That’s amazing.

The resulting index is about 12GB in size – on 1GB of base tabular data! sp_BlitzIndex doesn’t show that yet because as of RC1, the data is stored in a really wonky way – with a child object, and child indexes on that object. For now, you can manually query the DMVs to see it:

Just replace the Users_JSON_Indexed table name with the name of the table you’re examining.

But at least all our JSON queries are fast… right?

We’ll use the same query syntax we used in yesterday’s post, using JSON_VALUE because that’s what our developers have standardized on over the last decade because that’s the way to get fast queries with tidy index seeks:

And it, uh… takes the same CPU time and table scan that it took without an index?!?

JSON_VALUE statistics

Because SQL Server is ignoring the JSON index. We can actually force usage of the JSON index with an index hint:

The first query is the one SQL Server picks, and the second one is my index hint:

A tale of two bad plans

The second query uses the index due to my hint, and uses way less CPU time, but… there’s something really odd about the index usage plan. Why are we doing a clustered index scan of the entire table? The top operator only brought back one row – why would we need to join that to the entire contents of the Users table? Even if I change the SELECT * to just SELECT ‘Hi Mom’, it still has the same problem, insisting on joining to the entire contents of the Users table:

Odd join

I just don’t understand this plan. It doesn’t make any sense.

I do understand the 2,819,960 row estimate, though: it’s a hard-coded 30% of the table, and that’s a serious warning sign.

2025’s JSON indexes aren’t ready yet.

Whenever a new T-SQL function comes out, I ask a few questions:

  • Does it actually work in the way users would expect? This sounds like obvious table stakes, but it’s not.
  • Can it use an index – aka, is it sargable?
  • Can it do an index seek rather than scan?
  • Does the query optimizer estimate the number of rows it’ll produce, or use a hard-coded guess?

Here, the T-SQL function isn’t new – but the index is, and it doesn’t seem to work well with the existing JSON_VALUES function. It performs poorly during build, performs poorly during execution, and unlike yesterday’s computed column strategy, JSON indexes don’t appear to have statistics:

Lack of stats

If we only had the JSON_VALUE function, I’d end the story here, but SQL Server 2025 also introduced a new function, JSON_CONTAINS. Let’s rewrite our query with that new function and see how it goes:

YAY! The query runs in milliseconds!

JSON_CONTAINS plan

But it still insists on joining to the contents of the clustered index – even when we just select ‘Hi Mom’, nothing from the table. Daniel Hutmacher noticed this in May during the early previews, too, and doesn’t seem that’s been fixed yet.

SQL Server 2025’s JSON indexes only make sense when:

  • You can’t use the easy technique of storing the data in NVARCHAR(MAX) with indexed computed columns, like we’ve been doing for the last decade
  • You don’t care how long the index creation & maintenance takes
  • You can deal with index sizes that are several times larger than the underlying data
  • You can rewrite your queries from JSON_VALUE to JSON_CONTAINS

Otherwise, stick with yesterday’s technique for now. I say “for now” because it feels like a lot of these problems could be solved with cumulative updates over time, and Microsoft does have a track record of doing this. Columnstore indexes were awful when they first came out in SQL Server 2012, but Microsoft made steady investments in that feature.

In addition, you’re going to need to have a chat with your developers before touching the compatibility level. The instant you change it to 170, and your developers do their first migration, their NVARCHAR(MAX) columns may get converted over to JSON, which would be a pretty ugly upgrade and outage. You can avoid that for now by having them set those columns’ data types to NVARCHAR(MAX). I’m not saying not to use EF Core 10 – there is an awesome improvement in there that I’ll discuss in another blog post.

If you enjoyed this post, you’ll love the new JSON indexing module in my Mastering Query Tuning class.


How to Query JSON Data Quickly in SQL Server, Part 1: Pre-2025

Indexing
12 Comments

Before SQL Server 2025, if you want to store JSON data in Microsoft SQL Server or Azure SQL DB, and you want fast queries, the easiest way is to:

  • Store the data in an NVARCHAR(MAX) column (because the native JSON datatype didn’t arrive until SQL Server 2025)
  • Add a computed column for the specific JSON keys we’ll want to query quickly
  • Index those keys
  • Query it using the JSON_VALUE function

To demo what I mean, I’m going to take the Users table from the Stack Overflow database, and I’m going to pretend that we’re only storing the Id column, and the rest of the columns are JSON, stored in a UserAttributes column. Let’s create a new Users_JSON table to simulate it:

Now we’ve got a table with just Ids and JSON data:

Users_JSON

If we want to find the users with DisplayName = ‘Brent Ozar’, and we query with the JSON_VALUES function, SQL Server has to scan through that entire table, cracking open every JSON value, and finding it. That’s a lot of reading, and a lot of CPU work:

The query is slow and CPU-intensive, going parallel across multiple cores, maxing our server out for about 4 seconds:

Oof, that CPU time

This stuff works in the sense that it compiles and runs, but as you hit real-world data sizes (there are less than 10 million rows in that table), that doesn’t scale. We’re gonna need to run more than one query every 4 seconds.

Add a computed column and index it.

We do have to decide what columns we wanna index it, but take a deep breath and relax – I’m not saying you need to do any application code work at all. We’re just going to make some of the columns really fast. Take DisplayName:

Now, run the exact same JSON_VALUE query without changing our app or our code:

The query runs instantly, reading hardly any data and doing no CPU work:

Fast index seek

SQL Server automatically recognizes what we’re trying to do in the query, realizes it’s got an indexed computed column ready to go, and uses that to deliver our query results. It’s like magic, and it’s worked this way since 2016 with JSON_VALUE. Works with other datatypes too, like numbers and dates.

It even works with LIKE queries:

Producing a nice index seek, and even pretty good row estimations. Here, SQL Server estimates 3699 rows will be produced (and 1523 are):

Sargable LIKE

If we do a non-sargable filter, like a leading % sign in our LIKE:

SQL Server still uses the index, just scanning it instead of seeking it, which is a good thing:

Non-sargable LIKE

Scan SOUNDS bad, but it’s still less CPU work and logical reads than our un-indexed JSON scan. Back at the beginning of the post, we were looking at half a million reads and 14 seconds of CPU work to scan all the JSON, but with the index, even non-sargable stuff isn’t terrible:

Non-sargable IO metrics

Once you’ve got the computed column and index in place, you could change your queries to be normal T-SQL, like this:

They’ll be super-fast (just like the JSON_VALUE queries) – but I don’t recommend doing this. If you change your queries to point to the computed column, then I always have to have the computed column in place in the database. Leaving your queries as JSON_VALUE queries means we keep the flexibility of changing the database, plus we get fast index seeks. There’s no benefit to changing our queries to point to the new computed column – only heartache when our schema needs eventually change.

This indexed-computed-column technique has worked really well.

It’s a nice compromise that works well when the developers want the flexibility of changing which attributes they store, adding more attributes over time – but some attributes need really fast searches. I’ve been teaching it for years in this module of my Mastering Index Tuning class, where I also talk about the drawbacks and other solutions. (You can join in on these classes for a special price during my Black Friday sale this month!)

Developers are happy because they can change their UsersAttributes schema whenever they want without talking to the DBA. They can even decide to remove columns that used to be part of our core fast-query design – as long as they tell the DBA, and the DBA drops the index and computed column. The developers don’t have to change their queries – they just keep using JSON_VALUE. The results are fast when we agree that it’s one of the core columns, and still doable when they’re outside the core set of columns, just slower.

One drawback is that the NVARCHAR(MAX) datatype isn’t really JSON: SQL Server doesn’t validate the data for you. Another is that if you want multiple filters, you have to write them out individually. Another is that the more complex your JSON becomes, the more you have to pay attention to stuff like lax mode and strict mode. I’mma be honest, dear reader: like the lion, I don’t concern myself with such things, and I just point the developers to the documentation. I say look, if you want JSON queries to be millisecond-fast in SQL Server and Azure SQL DB, you gotta tell me the specific columns you’re going to query, in advance, and I’m gonna tell you how to write the query.

But the biggest drawback – by far – is that we have to define the core set of columns we wanna query quickly.

Developers want real flexibility: the ability to change their JSON schema at any time, and query any values out of it, quickly, without friction from the database side. That’s what Microsoft tried to deliver in SQL Server 2025, and I’ll cover that in the next post in the series. Careful what you ask for, though….


[Video] Office Hours in Roatan, Honduras

Videos
0

My cruise ship took a detour to Honduras to avoid the hurricane. Let’s hang out on the beach and go through your top-voted questions from https://pollgab.com/room/brento.

Office Hours: Ask Me Anything in Honduras

Here’s what we covered:

  • 00:00 Start
  • 00:58 Dopinder: What is your opinion of the Github copilot support in SSMS 22? Is this more beneficial for the DBA or Data Engineer?
  • 03:45 RunningOutofArguments: My company is planning an application that processes incoming emails and wants to store the emails and possibly attachments in SQL Server. You used to argue against that and using the file system instead. Has your opinion changed?
  • 05:26 Harold Bright: When MSSQL Server Services takes up a memory say about 80% when it gets busy, why is that it does not go down (like other apps) even when the application traffic is brought down to nothing? Can something be done to bring the memory down when its not in use?
  • 06:18 Mr. SqlSeeks: What was the rough employee count of the smallest company that has hired you in the past 5 years? I am curious if you are mostly hired by large organizations or if smaller orgs, even startups, bring you in for assistance.
  • 08:47 Scooter: When passing array type data to a TSQL sp, what are your pros / cons of passing the array as csv string value, table variable, or user defined table type? This sp will be called by c#.
  • 10:09 Big Blue Couch: Hey Brent, My T-sql skills are mid-level at best. In recent months, I’ve finished reading T-sql fundamentals and t-sql window functions. Do you have recommendations of blogs to read that go into more advanced T-sql usage? I just started reading Erik Darling and Gail shaw.
  • 13:48 chris: I can physically leave my workplace; however, how do I mentally disengage from work so I can be mentally present for my loved ones? I find disengaging very difficult when I’m neck deep in a difficult problem. How do you close out of work mode when you’ve so many “open loops”?
  • 16:25 AI_is_my_DBA: Setting up SQL Servers with 6 local SSDs. Referenced your setup checklist. Need to balance RAID level, drive separation, and SQL preference for 64K block size. Leaning towards a RAID1 for OS/SQL, a RAID10 for data/logs. I am a novice trying to avoid select pain from poor_choice.
  • 19:09 MyTeaGotCold: How fearful should I be of hinting indexes? Sometimes, they’re the only easy solution to my problem but I worry about them being brittle.
  • 19:45 Q-Ent: Hello Brent, do you think Oracle invests more in its certified professionals than Microsoft does? The new certifications feel quite shallow, both in prestige and technical depth.

Who’s Hiring Microsoft Data People? November 2025 Edition

Who's Hiring
16 Comments

Is your company hiring for a database position as of November 2025? Do you wanna work with the kinds of people who read this blog? Let’s make a love connection.

Yes, you.
I think YOU should apply.

If your company is hiring, leave a comment. The rules:

  • Your comment must include the job title, and either a link to the full job description, or the text of it. It doesn’t have to be a SQL Server DBA job, but it does have to be related to databases. (We get a pretty broad readership here – it can be any database.)
  • An email address to send resumes, or a link to the application process – if I were you, I’d put an email address because you may want to know that applicants are readers here, because they might be more qualified than the applicants you regularly get.
  • Please state the location and include REMOTE and/or VISA when that sort of candidate is welcome. When remote work is not an option, include ONSITE.
  • Please only post if you personally are part of the hiring company—no recruiting firms or job boards. Only one post per company. If it isn’t a household name, please explain what your company does.
  • Commenters: please don’t reply to job posts to complain about something. It’s off topic here.
  • Readers: please only email if you are personally interested in the job.

If your comment isn’t relevant or smells fishy, I’ll delete it. If you have questions about why your comment got deleted, or how to maximize the effectiveness of your comment, contact me.

Each month, I publish a new post in the Who’s Hiring category here so y’all can get the latest opportunities.


Next Week, Start My Mastering Classes for Free!

You’re a developer, database administrator, or DBRE who’s been working with SQL Server, Azure SQL DB, and Amazon RDS SQL Server for yeeears.

Mastering WeekYou’re jaded. You’re confident. You’re pretty sure you know what you’re doing. You’ve never taken my Mastering classes because you’ve been reading my blogs and watching my live streams, and you think you’ve put the pieces together yourself. You can’t imagine there’s anything left for you to learn, any surprises left in the box.

Well, now’s your chance to find out, for free.

On November 11-14, let’s get together and run through the first 4 hours of each of these classes:

  • Tuesday: Mastering Index Tuning
  • Wednesday: Mastering Query Tuning
  • Thursday: Mastering Parameter Sniffing
  • Friday: Mastering Server Tuning

To attend, register here, then grab the calendar invites to block out your coworkers from trying to schedule you for meetings, hahaha. At the time of the class, head here for the live stream. (That URL is in the calendar invites too.)

If you can’t make the live classes, the recordings won’t be on YouTube or free – you’ll need to buy my Recorded Class Season Pass: Mastering bundle, which includes the full day-long versions of each of those classes. You can either buy one year of access, or lifetime access.

In just 4 hours, you’ll find out if you’ve really mastered Microsoft databases – or if there are still things this old dog can teach you. Let’s hang out and talk data!


[Video] Office Hours at Sea

Videos
1 Comment

This Office Hours episode comes to you from Virgin Voyages’ newest cruise ship, the Brilliant Lady, sailing through the Western Caribbean. We were originally scheduled to do the Eastern Caribbean, but Tropical Storm Melissa had other ideas, so they reworked our itinerary at the last minute.

Let’s take your top-voted questions from https://pollgab.com/room/brento:

Office Hours in the Straits of Florida

Here’s what we covered:

  • 00:00 Start
  • 02:14 Do I want to know all 🙂 ?: Hi Brent, thank you for all the community work! Since you work on two database platforms, it does the Job duties for Database Developers, Development DBAs, and Production DBAs change with the platform or do they stay the same ?
  • 03:13 DataBarbieAdministrator: What are your thoughts on pushing semi-structured data types such as JSON into a relational database such as SQL Server? A friend of mine says: It’s not SQL Server’s job—Microsoft isn’t going to revolutionize the engine just for one data type.
  • 05:06 sakaLchuluDBA: I have a server handling around 100,000 batch requests per second. The problem is TempDB contention. I have increased the number of TempDB files to 16 but in high workload is not help me Would a memory-optimized TempDB help me solve this issue? How much memory would be required?
  • 06:15 Josef: Hey Brent, have you ever used sp_CheckSecurity from Straight Path Solutions? Would you recommend that, or is there another tool you prefer for checking security vulnerabilities?
  • 06:51 Nonclustered Cellstore: I’ve just finished Fundamentals of Columnstore, as well as the big reading list at the end. Have you had any further thoughts on what you would put in a Mastering Columnstore class?
  • 07:19 Big Blue Couch: Hey Brent, Do large(fortune 500) enterprise companies use SQL Server more than other database engines? Or does company size not correlate with database platform choice?
  • 08:26 Major Bludd: How did the AWS outage affect you? Are you being compensated by Amazon? Will this happen again?
  • 09:46 MyTeaGotCold: On a busy OLTP system, how can I grab the actual execution plan for a short-running query without setting my whole server on fire? My first idea, sp_HumanEvents, has safeguards to protect from exactly this.
  • 10:16 Big Blue Couch: I’ve got a request to evaluate moving our SQL server from Ent edition to Std edition. We have 4 cores and we are cpu bottle necked. The idea is to just increase core count and not budget. We don’t need HA. What are the important considerations to see if this is viable?
  • 11:56 Pradeep: Hi Brent, hope you’re doing well. I’m focusing on query performance tuning. How secure is the future of this specialization? Do you see it as a safe and valuable career path long-term?
  • 13:35 Whiny App Developer: Have you seen any AI mishaps/been asked for advice on how to prevent them happening in your consulting work? We’ve seen a database user (shared by many apps -rolleyes-) get made read-only by a developer using postgres’ MCP server. Not in prod, but still a time sink!

SQL ConstantCare® Population Report: Fall 2025

In this quarter’s update of our SQL ConstantCare® population report, showing how quickly (or slowly) folks adopt new versions of SQL Server, the data is very similar to last quarter:

  • SQL Server 2025: exactly 1 server being monitored, heh. God bless y’all for giving it a shot this early.
  • SQL Server 2022: 25%, up from 24% last quarter
  • SQL Server 2019: 43%, was 45%
  • SQL Server 2017: 10%, was 11%
  • SQL Server 2016: 13%, was 12%
  • SQL Server 2014: 5%, same as last quarter
  • SQL Server 2012 & prior: 1%, same
  • Azure SQL DB and Managed Instances: 2%, same

SQL Server 2022’s market share has remained pretty stable since its huge jump in Q1, when people replaced a lot of 2016 servers with 2022. With SQL Server 2025’s release approaching quickly, I’m not sure SQL Server 2022 will ever approach the amazing dominant streak that 2019 experienced – that thing’s been a monster! It’s held over 40% market share for two years straight now. Here’s how adoption is trending over time, with the most recent data at the right: SQL Server Adoption Rates

With both Microsoft Ignite and the PASS Data Community Summit on the week of Nov 17-21, I think it’s fair to say that SQL Server 2025’s release is probably imminent, so let’s start thinking about its upcoming adoption rate.

Unlike 2022, I expect SQL Server 2025 to catch on fast.

SQL Server 2022 faced a perfect storm that slowed adoption. 2022 just wasn’t ready in time for users who needed to plan their migrations off of the soon-to-be-unsupported SQL Server 2008, 2008R2, and 2012. The pandemic threw a monkey wrench in a lot of IT projects, and honestly, that included SQL Server 2022 itself. 2022 wasn’t feature-complete for a year, and the cumulative updates were a mess.

I think SQL Server 2025 faces nearly the opposite conditions, and I think it’ll catch on faster.

One reason is that SQL Server 2016 goes out of support in July, yet it still holds 13% of the market. Assuming that 2025 doesn’t suffer from the same rocky CU start that 2022 did, 2025 is primed to capture 2016’s market. SQL Server 2017 follows right behind, going out of support in October 2027, and that’s another 10% of the market. Surely people won’t be replacing 2017 with 2019 or 2022, not in the year 2026.

Another reason is that even though 2025 won’t be feature-complete at release, Microsoft has gotten smarter about that this time around by gating features behind a database-scoped option called PREVIEW_FEATURES. Because change event streaming (aka, low-overhead mirroring to Fabric) is one of those preview features, and because Microsoft looooves getting that sweet sweet Azure revenue, I bet features like that will move out of preview quickly. (I mean, I say that, but at the same time, on-prem AGs with Azure Managed Instance secondaries was a similar situation, and that one took ’em over a year with 2022, so… maybe?)

Finally, I think developers want to call REST endpoints from T-SQL. I know, I know, the DBAs in the audience aren’t excited about that, and I totally understand the risk of dramatically slower transactions and increased blocking problems if external app servers (especially outside of the company network) are suddenly in the critical path of getting transactions completed. It’s like any other tool, it’s gonna have good and bad use cases, but at least I think developers will want to use it, unlike 2019’s ability to call Java with T-SQL.

That’s why I’m updating all of my training classes.

SQL Server 2019 was a big deal, with a lot of intelligent and adaptive query processing stuff. After that, though, 2022 was a yawner. I didn’t bother updating my training classes for it because there just wasn’t that much to share with you. Even if you adopted 2022 – and relatively few shops did – it didn’t make a radical difference in how you tuned indexes, query plans, or wait stats.

SQL Server 2025 is another story. These days, between SQL Server 2025, Azure SQL DB, Managed Instances, and AWS RDS SQL Server, the majority of servers you’re tuning all have adaptive and intelligent query plans, monitored and tuned with Query Store, tracked with Extended Events, indexed with columnstore, and tuned with newer query hints. You’re gonna need new skills – Profiler and Perfmon counters ain’t gonna cut it – and I’m here to help.