Blog

[Video] Office Hours in My Vegas Backyard

Videos
7 Comments

I took 5 of your top-voted questions from https://pollgab.com/room/brento while hanging out in the backyard on my second gin & tonic. I, uh, might have lost my temper on the last one.

Here’s what we covered:

  • 00:00 Start
  • 02:38 Frozt: are you familiar with SQL Server assemblies? and if they can cause assertion or IAM corruptions? Thank you Happy New year Brent
  • 04:41 SQL_Linux: Hey Brent, My company is wanting to use AI for data analytics. What common misconceptions have you seen companies hold concerning using AI to find “insights”?
  • 07:12 Culloden: my company is small and we don’t have devs for reporting. Finance team use prod for report development bc data is near real-time. I want to move them to a dev instance, but data can only be a few hours old. What are options for this? we only use simple recovery model in prod.
  • 09:38 App Developer At Crossroads: When did you decide to stick with SQL Server? After 3 jobs over 8 years all with quite different tech stacks I’m feeling like its time to become an expert in something (tech and business domain)/have a long stay at a company so I can become more and more senior/have a big impact.
  • 11:24 MyTeaGotCold: Did resumable online index rebuilds change any of your opinions on or approaches to index maintenance? I love them too much.

Can AI Rewrite Bad Queries in Seconds? Kinda.

Development
12 Comments

When I see horrific code, stuff that would take a lot of manual labor to fix, I like to lob it over to an LLM like ChatGPT just to see how it does.

For example, on a call with a client, I opened up one of their slowest pieces of code to find that it was running a cursor, processing data row by row rather than working in sets. To simulate it, I’ve written up a few stored procedures against the Stack Overflow database:

  • Posts_Update_AnswerCount takes a single PostId, and updates the AnswerCount of that row
  • Posts_Update_CommentCount does the same thing, but for a different column
  • Post_Update_AllCounts calls both of the above for a list of PostIds, looping through them one post at a time
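The procedures themselves appear as screenshots in the original post, so here's a hypothetical sketch of the row-by-row pattern they describe – procedure names come from the list above, but the bodies and column logic are my assumptions based on the Stack Overflow schema:

```sql
-- Hypothetical sketch of the row-by-row pattern: the real code is
-- shown as screenshots in the original post.
CREATE OR ALTER PROC dbo.Posts_Update_AnswerCount @PostId INT AS
BEGIN
    UPDATE dbo.Posts
       SET AnswerCount = (SELECT COUNT(*)
                            FROM dbo.Posts a
                           WHERE a.ParentId = @PostId
                             AND a.PostTypeId = 2)  -- answers
     WHERE Id = @PostId;
END
GO

CREATE OR ALTER PROC dbo.Post_Update_AllCounts @PostIdList NVARCHAR(MAX) AS
BEGIN
    DECLARE @PostId INT;
    DECLARE c CURSOR FOR
        SELECT CAST(value AS INT) FROM STRING_SPLIT(@PostIdList, N',');
    OPEN c;
    FETCH NEXT FROM c INTO @PostId;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        EXEC dbo.Posts_Update_AnswerCount  @PostId;  -- one post at a time
        EXEC dbo.Posts_Update_CommentCount @PostId;
        FETCH NEXT FROM c INTO @PostId;
    END
    CLOSE c; DEALLOCATE c;
END
GO
```

The cursor touches the Posts table twice per post in the list – exactly the kind of stupid manual labor a set-based rewrite eliminates.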

When I look at that code, sure, conceptually I know what work has to be done – but it’s stupid manual labor. So I copied that code into a text file and asked ChatGPT, “Please rewrite this Microsoft SQL Server code into a more performant version.” That’s about as simple of a prompt as you can get! 

Here’s the code ChatGPT 4o came up with in the first pass, and you can read its full response to see its justifications:
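ChatGPT's actual output is in a screenshot in the original post. The general shape it describes – set-based, but one UPDATE per column – would look roughly like this (all names and details assumed):

```sql
-- Assumed sketch of a set-based first pass: one UPDATE per column,
-- so the same Posts rows get touched twice.
DECLARE @Ids TABLE (PostId INT PRIMARY KEY);
INSERT @Ids (PostId)
SELECT CAST(value AS INT) FROM STRING_SPLIT(@PostIdList, N',');

UPDATE p SET p.AnswerCount = a.Cnt
FROM dbo.Posts p
JOIN (SELECT ParentId, COUNT(*) AS Cnt
        FROM dbo.Posts WHERE PostTypeId = 2 GROUP BY ParentId) a
    ON a.ParentId = p.Id
WHERE p.Id IN (SELECT PostId FROM @Ids);

UPDATE p SET p.CommentCount = c.Cnt
FROM dbo.Posts p
JOIN (SELECT PostId, COUNT(*) AS Cnt
        FROM dbo.Comments GROUP BY PostId) c
    ON c.PostId = p.Id
WHERE p.Id IN (SELECT PostId FROM @Ids);
```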

Is it better than what we started with? Yes. Is it good enough that I wanna put it into production? No, because it’s still making two separate passes over the exact same rows in the Posts table, updating them one column at a time.

I replied to ChatGPT with, “Can you make it even faster and more efficient?” The result:

Now we’re getting somewhere! I love that it used CTEs with null protection, too, to handle the case where a post might have no answers or comments. I gave it one last shot by asking, “Could you make it faster, simpler, and easier to read?” but it reverted back to the two-update solution.
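Again, the real output is in the screenshot, but a sketch of that single-pass, CTE-based shape might look like this (names assumed):

```sql
-- Assumed sketch of the single-pass rewrite: CTEs aggregate each
-- count, and ISNULL protects posts with no answers or comments.
DECLARE @Ids TABLE (PostId INT PRIMARY KEY);
INSERT @Ids (PostId)
SELECT CAST(value AS INT) FROM STRING_SPLIT(@PostIdList, N',');

WITH AnswerCounts AS (
    SELECT ParentId, COUNT(*) AS Cnt
    FROM dbo.Posts
    WHERE PostTypeId = 2          -- answers
    GROUP BY ParentId
), CommentCounts AS (
    SELECT PostId, COUNT(*) AS Cnt
    FROM dbo.Comments
    GROUP BY PostId
)
UPDATE p
   SET p.AnswerCount  = ISNULL(a.Cnt, 0),
       p.CommentCount = ISNULL(c.Cnt, 0)
FROM dbo.Posts p
JOIN @Ids i ON i.PostId = p.Id
LEFT JOIN AnswerCounts  a ON a.ParentId = p.Id
LEFT JOIN CommentCounts c ON c.PostId   = p.Id;
```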

I then passed the same code and same prompts over to Google’s Gemini 2.0 Flash Thinking Experimental model. Its first result was similar to ChatGPT’s in that it had two separate update statements, but after prompting again to make it faster and more efficient, it came to this conclusion:

Instead of using a pair of CTEs like ChatGPT 4o, Gemini 2.0 Flash Thinking decided to use a pair of left join subqueries. I’m amused that both platforms chose to dump the contents of string_split into a table variable rather than using a temp table or using string_split directly in the update’s where clause.
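Gemini's output is also shown as a screenshot; a sketch of the left-join-subquery shape it describes (details assumed) would be along these lines:

```sql
-- Assumed sketch of Gemini's shape: string_split dumped into a table
-- variable, then one UPDATE joining to two left join subqueries.
DECLARE @Ids TABLE (PostId INT PRIMARY KEY);
INSERT @Ids (PostId)
SELECT CAST(value AS INT) FROM STRING_SPLIT(@PostIdList, N',');

UPDATE p
   SET p.AnswerCount  = ISNULL(a.Cnt, 0),
       p.CommentCount = ISNULL(c.Cnt, 0)
FROM dbo.Posts p
JOIN @Ids i ON i.PostId = p.Id
LEFT JOIN (SELECT ParentId, COUNT(*) AS Cnt
             FROM dbo.Posts WHERE PostTypeId = 2 GROUP BY ParentId) a
    ON a.ParentId = p.Id
LEFT JOIN (SELECT PostId, COUNT(*) AS Cnt
             FROM dbo.Comments GROUP BY PostId) c
    ON c.PostId = p.Id;
```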

In both cases, the LLM (ChatGPT or Gemini) is able to change the query, but it may take multiple prompts to get to the best result – and the results don’t necessarily get better with every prompt. Today, these tools work best in the hands of someone who actually knows the results that they’re looking for, like you, dear reader.

That’s not to say you shouldn’t use them! This example here represents a really good starting point use case. In the client’s example (and this has happened several times), the LLM was able to give me a good-enough starting point in seconds, saving me a lot of time with manual labor on the rewrite. I still needed to do manual labor – I just had to do less, and that’s a win, because, uh, I’m lazy.


Pop Quiz, 2025 Edition: What Do These Things Cost Today?

Cloud Computing, Hardware
8 Comments

Price check on Aisle 2025.

My main consulting job is a 2-day SQL Critical Care® where I help turn around slow SQL Servers. Companies come to me because they’re not sure if it’s a hardware problem, a database configuration issue, bad app code, or what, and they’ve been getting conflicting answers from different teams.

Before you spend money on consulting, though, stop to review what things cost.

Often, I find a whole team of people who have been struggling with slow performance – only to find they’ve got less hardware than my laptop, and it would only cost a few thousand bucks to make a really dramatic difference instantly.

So let’s do a price check. Without searching the web for prices, guess how much the things below cost – you don’t even have to guess the exact price, just put ’em into buckets for perspective. If you don’t see a form below, like if you’re reading the blog via RSS or email newsletter, you’ll need to visit my blog to take the quiz.

"*" indicates required fields

SQL Server 2022 Standard, 4 CPU Cores*
SQL Server 2022 Enterprise, 1 CPU Core*
512GB of memory, bought outright*
1TB of memory, bought outright*
A mirrored pair of 4TB NVMe SSDs*
One year of an 8-core, 64GB RAM, 1.9TB SSD VM in EC2 (i4i.2xlarge)*
One month of an 8-core, 40GB RAM Azure Managed Instance (Gen5 Business Critical)*
One week of IT employee time (say, developer, sysadmin, DBA)*
One week of a team's time (5 people)*
2 days of Brent's time (SQL Critical Care® consulting)*


I’m Coming to Croatia!

Conferences and Classes
0 Comments

A few things lined up perfectly for a European trip this summer. I’m on a European cruise in early June, then Data Saturday Croatia on June 14, and then SQL Bits on June 21.

For Croatia, I’m teaching a one-day preconference workshop: Faster, Cheaper Cloud Databases.

You’re managing databases that live in AWS, Azure, and Google, and you’re getting a lot of questions about slow performance and crazy costs. Your users want everything faster and cheaper, and you’re not finding good answers.

Join me, Brent Ozar, for a day of real-time demos, honest war stories, and practical fixes. I’ll tackle an Azure SQL DB Managed Instance in the morning, and then an Amazon RDS SQL Server in the afternoon. I’ll show how I use the First Responder Kit to assess the bottleneck, mitigate as much as I can with index and query tuning, and then write up a business case for management for better future budgeting.

Throughout the day, I’ll include real-life stories from my cloud clients, with names changed to protect the innocent. I’ll explain what worked, what didn’t, and what went up in expensive flames.

You’ll learn:

  • How Brent uses the First Responder Kit to assess a cloud SQL Server’s wait types
  • Why each cloud vendor’s VM quirks make performance tuning tricky, and how to adapt
  • How to fix blocking, logging, and CPU issues when the hardware isn’t yours
  • How to write a persuasive performance report for management to fairly lay out where the problem is, and what steps to take next

By the end of the day, you’ll be armed with real-world techniques for keeping your cloud databases faster, cheaper, and more future-proof.

Registration is open now for the in-person class on June 13th at Hotel International, Miramarska 24, 10000 Zagreb. See you there!


Does Separating Data and Log Files Make Your Server Faster?

Storage
21 Comments

I’ve already explained that no, it doesn’t make your database server more reliable – and in fact, it’s the exact opposite. But what about performance?

The answer is going to depend on your hardware and workload, but let’s work through an example. I’ll take the first lab workload from the Mastering Server Tuning class and set it up on an AWS i3en.2xlarge VM, which has 8 cores, 64GB RAM, and two 2.5TB NVMe SSDs. (This was one of the cheapest SQL-friendly VM types with two SSDs, but of course there are any number of ways you could run a test like this, including EBS volumes.)

I’ll run the workload two ways:

  1. With the data files on one SSD and the log file on the other SSD, then
  2. With the two SSDs striped together into one big volume, holding both the data files and the log file

Here’s how long the test took for scenario 1, the data & logs separated:

Test 1: 46 minutes

And then with everything on one big striped volume:

Faster

In this test, less time equals a better result – the single big striped volume wins.

This particular workload has a storage bottleneck because in the class scenario, the database didn’t have good indexes to support the queries, so SQL Server had to keep hitting the storage to read uncached data into memory. When we only have one SSD volume to service our data file requests, our top wait by a long shot is PAGEIOLATCH, with about 339 seconds of that wait accumulating in a 75-second sample:

PAGEIOLATCH Waits

However, when we stripe the data across two SSD volumes, that gives us more read throughput, so the test runs faster.

“But my workload is different!”

Oh yes, of course, your app’s workload is quite, uh, “special.” You should pin it on the fridge. Mom & Dad will be very proud. However, you’re still just assuming that your special workload will perform better with its storage IOPS and throughput cut in half, and its latencies doubled.

And that’s a pretty odd assumption to make.

Disclaimer: this post is specifically about separating the underlying storage, not the abstractions that you use between the disks and SQL Server itself. We’re not talking about using different Windows volumes, partitions, physical storage adapters, virtual storage adapters, or SQL Server data files. Those are all great discussions to have too, though!


Announcing the 2025 Data Professional Salary Survey Results.

Salary
8 Comments

We’ve been running our annual Data Professional Salary Survey for almost a decade, and I was really curious to see what the results would hold this year. How would inflation and layoffs impact the database world? Download the raw data here and slice & dice it to see what’s important to you. Here’s what I found.

First, without filtering the data at all, worldwide salaries are down this year for the first time ever, but hold that thought:

Salaries overall

If we filter for just United States folks whose primary database is SQL Server or Azure SQL DB, the salaries stayed about even. Still, that’s not a great sign:

Microsoft databases

I mean, it’s weird to say “not a great sign” when we’re making six-figure salaries – a lot of folks would love to have that kind of problem.

Because this blog’s primary readership is SQL Server folks, I wouldn’t use the survey to draw conclusions about any other platform. The number of responses for other platforms is really low:

Salaries for other databases

The rest of this post has no filters – we’re looking at all countries, all database platforms. What are y’all’s job plans for this year?

Job plans for 2025

Most respondents intend to stay with the same employer, in the same role. This lines up with prior years, too:

Job plans over time

For the last several years, about 2/3 of y’all have intended to stick it out.

Folks who are planning to make a change also happen to be getting paid less – and that’s probably not a coincidence, heh. If you’re thinking about changing roles, you’re probably interested in who’s bringing home the cheddar:

Salaries by job title

Architect is generally a title that involves seniority, too, so time may play into that. Note that manager is high, too – if you manage staff, you tend to get paid more:

Manage staff

This is the second year of the survey where female pay is actually higher than male! The response rate is pretty skewed, but it always has been:

Salary by gender

People who live in large cities tend to make more:

Nearest town size

And people who work in non-profits and local governments seem to be doing pretty well:

Employment sector

Download the raw data here – I hope it’s useful when you have salary and career planning discussions with your manager. Here’s to you getting a raise this year!


[Video] Office Hours: Long Answers Edition

Videos
1 Comment

Today’s questions from https://pollgab.com/room/brento require a little bit longer answers:

Here’s what we covered:

  • 00:00 Start
  • 03:20 Josef: What is table partitioning, and can it improve performance? Does its effectiveness depend on the underlying storage? In what scenarios is it most useful?
  • 05:10 DBAMatt: If there was a new Mt. Rushmore with the heads of SQL Server pros who’ve greatly impacted the DBA world — besides Brent Ozar, obviously—who else would you wanna see up there? Maybe Itzik Ben-Gan, Ola Hallengren, Adam Machanic, Paul Randal, or the Kim Tripp?
  • 07:53 DBA Magician Girl: What’s the sales pitch for SQL Server editions other than Dev, Standard, and Enterprise? I thought the others were just toys, but they’re signed up for your Constant Care.
  • 12:27 Juan Pablo Gallardo: An ERP package has an internal benchmark tool for SQL “speed” that runs within the ERP, a loop read of 100k records from few tables and a few copy write, results reported in millisec time for read,write,erase. My gut tells me this test is useless, am I wrong?
  • 14:32 CharoCPC: I have an 8tb DB with 2tb of free space and is part of an AG. I was thinking of truncating the largest tables, shrinking the DB, move the DB to a newly added disk, restore a copy of the DB, copy the records back into the truncated tables and rebuild indexes. Should I break the AG
  • 20:34 DirtyReadsDoneDirtCheap: What common mistakes do DBAs make in their love lives? I think I’m ready to start mine.
  • 24:43 MartinDBA: How many people behind Brent Ozar Unlimited kingdom?
  • 26:54 DonKiddick: Hey Brent one of our SQL servers (SQL 2016) uses AlwaysOn to provide a read only secondary – but I’m being asked if queries being run on the secondary could force a bottleneck on the primary. Thank you for your time 🙂

I Feel Sorry for Untrained Developers Using Entity Framework.

Development
23 Comments

Most of the time, I love Entity Framework, and ORMs in general. These tools make it easier for companies to ship applications. Are the apps perfect? Of course not – but they’re good enough to get to market, bring in revenue to pay salaries, and move a company forwards.

However, just like any tool, if you don’t know how to use it, you’re gonna get hurt.

One classic example popped up again last month with a client who’d used EF Core to design their database for them. The developers just had to say which columns were numbers, dates, or strings, and EF Core handled the rest.

But if you create a string without specifying its length, EF defaults to using NVARCHAR(MAX). That is not a bug. That is by design, and it’s explained in the documentation:

The wording on that really pisses me off because NO, IT IS NOT SQL SERVER DOING THIS MAPPING. There is absolutely nothing in the database engine that’s saying strings are nvarchar(max). This is an Entity Framework problem, and stop pointing the blame at the innocent database engine. (Update 2025/01/10 – the documentation has been updated by Erik Ejlskov Jensen to point out that no, this isn’t a SQL Server problem.)

The documentation goes on to explain how you can manually set the right column lengths, and I’ve sat through several development conference sessions that emphasize how important it is for developers to do that. The problem here is that most folks don’t read the documentation, let alone attend conferences to learn how to use their tools. (I don’t blame the folks – I blame the companies who are pressuring developers to ship quickly without training.)

Demoing One of the Problems It Causes

Let’s create a table with two string columns – one NVARCHAR(100) and one NVARCHAR(MAX). Then, let’s load a million rows into it, putting the same contents in both the short and long columns.
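The setup appears as a screenshot in the original post; a minimal recreation of the demo, with all table and column names assumed, might look like:

```sql
-- Hypothetical recreation of the demo: identical contents in a
-- short column and a MAX-typed column.
CREATE TABLE dbo.StringDemo (
    Id          INT IDENTITY(1,1) PRIMARY KEY,
    ShortString NVARCHAR(100),
    LongString  NVARCHAR(MAX)
);

-- Load a million rows, same value in both columns.
INSERT dbo.StringDemo (ShortString, LongString)
SELECT TOP (1000000) v.s, v.s
FROM (SELECT CONCAT(N'Row ', ROW_NUMBER() OVER (ORDER BY (SELECT NULL))) AS s
      FROM sys.all_objects a
      CROSS JOIN sys.all_objects b) v;
```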

Then, we’ll run identical queries against the short & long string version:
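The queries in the screenshot would be along these lines – identical except for the column they sort, which is what drives the memory grant difference discussed below (names assumed):

```sql
-- Same query shape against each column; only the datatype differs.
SELECT TOP 100 ShortString
FROM dbo.StringDemo
ORDER BY ShortString;

SELECT TOP 100 LongString
FROM dbo.StringDemo
ORDER BY LongString;
```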

And review their actual execution plans:

The bottom query is the one that hits the NVARCHAR(MAX) column. Your first signs of danger are the yellow bang on the SELECT, and the 99% query cost estimate on the second query, indicating that SQL Server thinks the NVARCHAR(MAX) one is going to be quite a bit more expensive. However, as is often the case with SQL Server, the really big danger isn’t even shown visually.

Hover your mouse over each SELECT operator, and you’ll get a popup tooltip. One of the lines on that tooltip will say Memory Grant. Here’s the one for the NVARCHAR(100) query:

210MB memory grant

When the datatype is NVARCHAR(100), SQL Server allocates 210MB of memory to run the query because it believes it won’t need too much memory to sort that small of a column. However, check the same metric on the NVARCHAR(MAX) query:

5GB memory grant

Because SQL Server thinks that the contents of NVARCHAR(MAX) columns are larger, this query gets a 5GB memory grant, 24x larger. Depending on your server size and workloads, this can be a real problem because the more queries you have running simultaneously, the quicker your database server will run out of memory.

There’s a Fast Fix. It Doesn’t Work Here.

If you frequently search or order by a column, all you have to do is index it, right? Let’s try it:
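The statements behind that screenshot would be something like this (index and table names assumed):

```sql
-- Indexing both columns: the first succeeds, the second fails,
-- because NVARCHAR(MAX) can't be used as an index key column.
CREATE INDEX IX_ShortString ON dbo.StringDemo (ShortString);

CREATE INDEX IX_LongString  ON dbo.StringDemo (LongString);
-- Msg 1919: Column 'LongString' in table 'dbo.StringDemo' is of a
-- type that is invalid for use as a key column in an index.
```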

Index creation error

The ShortString index gets created – but SQL Server can’t create an index on LongString because we can’t use NVARCHAR(MAX) as a key column in an index.

That’s a bummer, and you could argue that it’s a SQL Server limitation that could be fixed. For example, you can create a computed column on a shorter version of the column, and index that:
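One way that workaround might look – the table and column names here are illustrative, not from the post's screenshots:

```sql
-- Add a shortened computed copy of the MAX column, then index that.
ALTER TABLE dbo.StringDemo
    ADD LongString_Abridged AS CAST(LongString AS NVARCHAR(100));

CREATE INDEX IX_LongString_Abridged
    ON dbo.StringDemo (LongString_Abridged);
```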

There’s nothing to say that SQL Server couldn’t do similar work itself, indexing abridged versions of NVARCHAR(MAX) columns and checking whether the column holds any truly long values. It just doesn’t, though, and I understand that it would require Microsoft to do some work.

It’s much easier for Microsoft to say, “Yo, EF developers, read the docs and you’ll never have problems like this.” To some extent, that’s fair, because I can see how someone would expect people to be well-trained on the tools they use every day to do their jobs. This is only one tiny example of the many, many problems you can hit with Entity Framework if you don’t get trained on how to use it first.

If you’re looking for training resources, start with anything by Julie Lerman in whatever way you enjoy consuming training material – videos, books, etc.


Who’s Hiring in the Microsoft Data Platform Community? January 2025 Edition

Who’s Hiring
5 Comments

Is your company hiring for a database position as of January 2025? Do you wanna work with the kinds of people who read this blog? Let’s set up some rapid networking here.

If your company is hiring, leave a comment. The rules:

  • Your comment must include the job title, and either a link to the full job description, or the text of it. It doesn’t have to be a SQL Server DBA job, but it does have to be related to databases. (We get a pretty broad readership here – it can be any database.)
  • An email address to send resumes to, or a link to the application process – if I were you, I’d use an email address, because you may want to know which applicants are readers here: they might be more qualified than the applicants you regularly get.
  • Please state the location and include REMOTE and/or VISA when that sort of candidate is welcome. When remote work is not an option, include ONSITE.
  • Please only post if you personally are part of the hiring company—no recruiting firms or job boards. Only one post per company. If it isn’t a household name, please explain what your company does.
  • Commenters: please don’t reply to job posts to complain about something. It’s off topic here.
  • Readers: please only email if you are personally interested in the job.

If your comment isn’t relevant or smells fishy, I’ll delete it. If you have questions about why your comment got deleted, or how to maximize the effectiveness of your comment, contact me.

Each month, I publish a new post in the Who’s Hiring category here so y’all can get the latest opportunities.


[Video] Office Hours in Tokyo, Japan

Videos
0 Comments

For the last Office Hours of the year, I’m coming to you from Tokyo, the last stop on my 3-week tour of China and Japan. This was a big Epic Life Quest task for me, for sure, something I’ve wanted to do for decades. Let’s bring you along and go through your top-voted questions from https://pollgab.com/room/brento:

Here’s what we covered:

  • 00:00 Start
  • 00:43 VeryLost: Do I need to do something special to get your 6-Month DBA Training Plan? I sign up for it every time that I download the FRK at a new job, but I have never got one of the emails.
  • 01:19 MyTeaGotCold: Is it sensible to use query hints to stagger or test a migration to a newer compatibility level? It seems like a good idea but I don’t know anyone else who has had it.
  • 02:09 CuriousDBA: For a table with a 2-column PK and a query filtering IN lists on both columns, SQL is doing a CI seek on only the 1st column (reading 50k+ rows) and then a FILTER operation on the 2nd column afterwards. Is SQL capable of determining when UNION would be better for a query instead?
  • 04:07 AnonymousDBA: We have a heap table with more than 15 non-clustered indexes. No UPDATEs. Only SELECT and INSERT. It shows up at the top of the list with highest number of indexes fragmented. Will a clustered index reduce the fragmentation problem with the non-clustered indexes on this table?
  • 04:51 summer-fondness: Hi Brent, if I had a friend they might have asked, do you think this is the right time to take the risk and learn a db outside the norm, like Iceberg? They might think something like Iceberg has benefits in industries that make Big$ and they might like to stay close to Big$
  • 06:19 Tom: Do you have a process or ideas for creating your own hands-on projects when learning new tech like Postgresql beyond what’s in classes? It’s always a catch-22 at work when wanting to take on new projects but you need the hands-on experience before taking on new projects.
  • 07:02 midDBA: hi brent i have a non clustered index on a table that has 1 billion reads and 1.3 million writes do you think that is a problem?
  • 07:45 Stefan: After 11 years as an underpayed and overworked DBA, I’m leaving the company to do something completly different. I wanna thank you very much for all the things you do for this cummunity, and a big heart to this very lovely und helpful community. I’ll miss ya all.
  • 08:22 Dopinder: Trying to convince the developers to not send customer facing emails from SQL Server sp’s. What are your top supporting reasons for this? Any horror stories?
  • 08:59 Mark Simon: Hey Brent, Does your training cover how to use query store?
  • 10:14 CrookedDBA: Every now and then my CPU elevates and I see queries locking on TempDB 2:1:128. Once it starts the CPU stays elevated until I trigger the plan cache to clear. Can a bad plan flag as a TempDB 2:1:128 wait? SQL Server 2019.
  • 11:14 Pro: I’ve come across some tricky behavior: SELECT IIF(0x10 > 0x1, 1, 0); — 1 SELECT IIF(0x100 > 0x10, 1, 0); — 0 SELECT IIF(0x1000 > 0x100, 1, 0); — 1 and so on. The reason for this is clear (I won’t spoil it), but what would be the best way to compare rowversions?

Your Favorite Blog Posts & Videos from 2024

Company News
1 Comment

The year is coming to a close, so it’s time to look back on your favorite stuff from this year.

2024 Blog Posts with the Most Comments

  1. What’s Your Biggest Database Regret?
  2. Query Exercise: Fix This Computed Column
  3. Query Exercise: Find the Best Time for Database Maintenance
  4. Query Exercise: Find Recent Superstars
  5. Query Exercise: Find Long Values Faster

I started the new Query Exercises series this year and I was really tickled by the reception of it. I figured that if it caught on, I’d make an e-book out of it, so look for that in 2025 along with more Query Exercises posts.

Most-Watched 2024 YouTube Videos

  1. Using the Apple Vision Pro in My Office
  2. Debugging a Complex T-SQL Issue
  3. Availability Groups 101
  4. Automatic Tuning in Inaction
  5. Office Hours Aug 28, 2024

I’m not surprised about #1, and I’m not surprised that the topic-focused videos on debugging, AGs, and automatic tuning do better than Office Hours. They take a lot more work to put together though, so I tend to not do those as often.

On my recent China/Japan trip, I decided to sell the Apple Vision Pro. I was only using it when I traveled, and even then, it takes up a lot of space in my carryon. When I saw the Huawei Mate XT dual-fold, tri-pane phone/tablet thing, I thought, that would be a much better use of the same money. I’d actually carry that around instead when I travel, and it’d be much more practical. (I wouldn’t daily-drive it as a phone though – I’m too tied into Apple’s ecosystem.)

Most Popular 2024 TikTok Comment Responses

  1. You are not Amazon.
  2. Why relational databases don’t work well for large e-commerce shopping carts
  3. How to corrupt databases
  4. “Pretty sure nobody puts business logic in the db layer”
  5. “my applications don’t need time zones”

On TikTok, I have a lot of fun with the commenters – some of whom have brilliant insight, and some of whom just make for good comedic fun. I really hope TikTok doesn’t die off in 2025, because it’s been the source of a lot of laughs for me so far.

The 2024 Photos Y’all Liked the Most

Over on my Instagram feed, here are the top 9 photos:

Here’s to another fun year of solving database problems!


Updated First Responder Kit and Consultant Toolkit for December 2024

First Responder Kit Updates
0 Comments

If there was ever a release you could skip, this is probably the one, because you could count this release’s changes on just one hand! Well, assuming you were an alien with six fingers. I won’t even recap the most important changes – you can just scroll down and read ’em.

How I Use the First Responder Kit
Wanna watch me use it? Take the class.

To get the new version:

Consultant Toolkit Changes

Updated to this month’s First Responder Kit, but no changes to the spreadsheet template.

sp_Blitz Changes

  • Enhancement: when @CheckServerInfo = 1, we now return target & total memory counter outputs for easier troubleshooting of memory pressure. (#3586)
  • Fix: when @OutputType = ‘markdown’, a recently added check could return an invalid XML character, breaking the output. (#3593, thanks Igor Galiney.)
  • Fix: if you ran sp_Blitz with VIEW SERVER STATE permissions, and you were the db_owner of master, but you didn’t have ALTER TRACE permissions, you would have gotten an error. (#3581, thanks Tisit.)
  • Fix: if you had really, really, really big data files, you could get an arithmetic overflow. (#3599, thanks songomajk.)

sp_BlitzIndex Changes

  • Enhancement: new first parameter @ObjectName, so you can now call it even faster by passing an object’s name as the only parameter, like ‘schema.tablename’, for the current database. No need for separate @SchemaName and @TableName parameters, although we kept those to avoid breaking changes. (#3587, thanks Goran Schwarz.)
  • Fix: when @Debug = 1, we were returning the #BlitzIndexResult set twice instead of once. (#3590, thanks Vincenzo Marchese.)
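Based on the changelog entry above, the new sp_BlitzIndex calling convention would look something like this (the table name here is just an example):

```sql
-- New shorthand: pass the object name as the only parameter,
-- and it runs against that object in the current database.
EXEC sp_BlitzIndex 'dbo.Users';

-- Equivalent to the older, longer form, which still works:
EXEC sp_BlitzIndex @SchemaName = 'dbo', @TableName = 'Users';
```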

For Support

When you have questions about how the tools work, talk with the community in the #FirstResponderKit Slack channel. Be patient: it’s staffed by volunteers with day jobs. If it’s your first time in the community Slack, get started here.

When you find a bug or want something changed, read the contributing.md file.

When you have a question about what the scripts found, first make sure you read the “More Details” URL for any warning you find. We put a lot of work into documentation, and we wouldn’t want someone to yell at you to go read the fine manual. After that, when you’ve still got questions about how something works in SQL Server, post a question at DBA.StackExchange.com and the community (that includes me!) will help. Include exact errors and any applicable screenshots, your SQL Server version number (including the build #), and the version of the tool you’re working with.


Working Today? Take a Break to Fill Out the Annual Salary Survey.

Salary
2 Comments

Working today? You’re not alone. As a small business person, I tend to find myself working on emails, catching up with stuff on quiet days like today.

Let’s take a second for your professional development though. Every year, I run a salary survey to help folks have better discussions with their managers about salaries, benefits, and career progression.

Take the survey now here.

The anonymous survey closes Sunday, January 12th. On Tuesday the 14th, I’ll publish the overall responses on the blog in Excel format so you can do slicing & dicing. The results are completely open source, and shared with the community for your analysis. You can analyze ’em now mid-flight – do not email me asking for edit permissions, buddy – but I’d wait until the final results come in. I’ll combine them into a single spreadsheet with the past results, and publish those here.

Thanks for your help in giving everybody in the community a better chance to talk honestly with their managers about salary.


[Video] Office Hours in Kyoto, Japan

Videos
0 Comments

My Asian vacation continues in Kyoto, the City of a Thousand Temples. They’re not playing around – the town is chock full of gorgeous historic temples, and it was tough to pick a scene for today’s Office Hours. Let’s go through your top-voted questions from https://pollgab.com/room/brento:

Here’s what we covered:

  • 00:00 Start
  • 01:35 SQL_Linux: Brent! I have a server that has many tables using nvarchar(MAX) columns. Most of the nvarchar(MAX) cols contain a max datalength of 80 or less. Will i see performance boost by changing size of nvarchar cols to 100? E.G. will this prevent data being written to off-row pages?
  • 02:58 Josef: Using sp_DatabaseRestore, is there a way to restore all databases from a backup location at once, instead of handling them one by one?
  • 03:34 Vineet: Have you considered changing office hours YouTube into a paid membership channel?
  • 04:41 Vishnu: We have developers that continue to run SSMS directly on the production SQL server despite being asked not to. Do you see any downsides to uninstalling SSMS from that VM?
  • 05:30 Bandhu: What’s your opinion of the new features in SQL 2025? Anything to get excited about?
  • 06:46 JRL: What advice do you give to clients who ask about running graph queries? For example, do you suggest the use of recursive SQL, specialized graph search features, or a specialized graph database?
  • 08:29 Rondel: What’s your opinion of Azure SQL DB’s ability to auto create / drop indexes? Does it make mistakes similar to SQL VM suggested indexes?
  • 09:32 Rennie: What’s the best time series database?
  • 10:00 Double Dragon: What’s your opinion of the pending TikTok divestiture / US ban?
  • 10:37 chris: I queried dm_db_index_physical_stats on the readable replica of a CloudSQL SQL Server instance. It took over an half an hour to run at which time the primary started performing poorly. Since the replica is in asynchronous mode is it possible the query could impact the primary?
  • 11:30 JRL: I recently attended AWS re:Invent and was disappointed that a majority of the talks seemed superficial and marketing-driven, in contrast to my experiences with the SQL community. Which tech conferences have you found rewarding in terms of technical depth and networking?

The Ebb and Flow of AI

Processes and Practices
10 Comments

It’s the end of the year. Everybody’s coasting and looking back. Let’s kick our heels up, get comfortable, and talk big picture industry stuff.

"Sir, this is a Wendy's"

You and me, our weirdo jobs involve:

  • Databases – which is already unusual, and most people you run into on the street have never heard of them, even though they touch every aspect of their lives
  • One specific kind of database in particular – Microsoft SQL Server (and its various flavors, like Azure SQL DB and Amazon RDS), which isn’t even the biggest database out there, not by a long shot, so we’re a niche in a niche
  • A few particular database tasks – writing new code, performance tuning existing code, and server maintenance – we’re not even the biggest job in our industry, which would probably be database developer or report author

When you think about the entire population of Earth, our specific job is extremely small. There just aren’t that many people who do what we do.

Because we’re so niche, companies don’t really make all that much money building products to help us do our jobs, or eliminate our jobs altogether. Sure, there are definitely companies out there that are spending tens of millions of dollars a year to automate our job out of existence – but are they going to be successful? Should we worry about that? To find out, let’s zoom out and look at the bigger picture.

A More Lucrative AI Target: Self-Driving Taxis

If you zoom out to the population of Earth, an example of a larger industry would be transportation – getting people and goods from one place to another. If companies can make that easier, or eliminate the human labor involved, they'd make muuuuuch more money.

So when new golden hammers like artificial intelligence and machine learning come around, companies wielding those hammers look for the most money they can make, and they apply the hammers there. Go where the money is.

Google, Apple, Tesla, and General Motors have all thrown billions and billions of dollars at trying to make self-driving taxis happen. Whoever figures that out first, and successfully scales it, could not only be the next Uber, but blow Uber out of the water altogether. When Uber’s worth over $100b, and General Motors is worth half of that, GM’s gotta look over at Uber and go, “I wanna take you down, buddy, and if I spend $50b doing it, I’ll still make money.”

And boy, would they make money. Not only would they cut expenses, but they’d offer a better product. I travel a lot, and I abhor today’s taxi experience. Uber is a step up from what taxis used to be, but I can’t Uber everywhere. For example, when I touch down at a busy urban airport, it’s usually much more convenient to just hop into a waiting taxi as opposed to hailing an Uber, then waiting 15-20 minutes in a manic ride share area. Taxis are a throwback to a different era – disgusting back seats, no air conditioning, and drivers with terrible driving skills and even worse conversation topics. I would love to get into a well-maintained driverless taxi, especially at the end of a long exhausting day of travel.

However, as surefire as the business model sounds, competitors are dropping out of the AI-driven taxi business. Very smart, very wealthy companies have learned (the very expensive way) that you can’t just rub some AI on taxis and take humans out of the equation.

Self-Driving Taxis Aren’t Impossible.
They’re Just Expensive and Hard.

It’d be easy to look at the list of company failures above, then write a blog post titled something like “The Rise and Fall of AI.” Thing is, I just don’t think that’s correct. I don’t think AI is “falling” in the same sense that, say, the Roman empire fell. It’s just ebbing and flowing like the tide.

Companies invested tens of billions of dollars in self-driving taxis, and while we haven’t crossed the finish line yet, we (consumers) have seen a return on that investment. It just wasn’t the return companies had been hoping for – it was much smaller. Rather than eliminating the humans altogether, AI and ML are turning out to be tools that help humans be better drivers.

GM’s SuperCruise gets rave reviews, Mercedes’ new Drive Pilot system even lets you stop paying attention to the car altogether, and Tesla’s “full self driving” – well, it provides for plenty of YouTube comedy.

So How Will AI Reshape Databases? Slowly.

Knowing what happens when companies spend a decade and billions of dollars trying to improve complex tasks, let’s turn our attention to our own tiny niche industry – just as companies are. They’re now spending time and money here, too, and some talking heads are saying humans will no longer be needed to build, tune, and troubleshoot databases.

One of the first companies to try it, OtterTune, got about $14M in funding across the span of 4 years to make MySQL and Postgres self-tuning. Richie and I tried it with SQL ConstantCare®’s database, and we had pretty abysmal results, but we chalked that up to it being an early product iteration. I worked with Andy Pavlo & his team on feedback, and I didn’t blog about our experience because I didn’t want to curse the product with bad early publicity. I guessed that given enough time & money, they’d solve the problem, and the world would be a better place for it.

Other companies will try the same thing. Most of them will burn through their investments without getting enough market traction, just as most self-driving taxi companies are setting billions of dollars on fire to be the next Uber. The self-driving database dream will be smaller, involving less research spending, because the end reward isn’t as big as self-driving taxis.

Along the way, over the next several years, those investments will result in better tooling for us humans to use, and I’m genuinely excited for that. We’re already seeing the earliest hints of it with things like SSMS Copilot and AWS DMS, but don’t let the sketchy quality of the early stuff scare you away from it permanently. The early versions of these tools will be just like the early versions of Tesla’s “full self driving” – only for the, uh, “brave.”

But before you panic about your job going away altogether, go take a ride in a taxi.


[Video] Office Hours in Osaka, Japan

Videos
9 Comments

Kon’nichiwa! My vacation tour continues in Osaka, Japan, a city famous for its street food, nightlife, and host and hostess clubs. I’m usually the kinda guy who does classy dinners at 5PM and then hits the sack by 9PM at the latest, so you might wonder how I got here. The answer is that Yves planned this trip out, hahaha, so here we are. As long as we’re here, let’s go through your top-voted questions from https://pollgab.com/room/brento.

Here’s what we covered:

  • 00:00 Start
  • 00:58 Craig: Should you start your procs with BEGIN, so all the proc code is inside a BEGIN…END block?
  • 02:18 MyTeaGotCold: 75% of my skills seem worthless after moving from being the boss of a 1 TB server to the boss of many 10-20 TB servers. Is such a big loss normal?
  • 03:19 Josef: Restoring a large database from a native SQL backup can take a long time. Would you recommend using log shipping to another server and switching over in case of failure instead of a full restore from backups?
  • 04:07 Craig: I see that you use temp tables to pass sets of data from one proc to another. What are your thoughts on serializing data as JSON and passing that into a proc via a parameter?
  • 05:05 BobTheDeveloper: I have a customer that claims his SQL Server is slow. He doesn’t want to pay for a bunch of my time to dig deep with the First Responder Kit. None of CPU, Memory, Disk and Network are being hammered. Even simple queries are taking too long. Is locking the likely problem?
  • 06:04 DogsBarkingAngrily: hi Brent! What is your learning process, how do you learn new things? I started watching your PostgreSQL series and am amazed how you know so much, so I’m curious where do you learn from? (emphasis on you … because I learnt so much from you, but where do you learn from?)
  • 07:10 Dopinder: How do you like to troubleshoot Connection Timeout errors at the client? We originally thought ours was a server side issue but turns out the VM / Client had a CPU pressure issue.
  • 07:36 Craig: If you want to find out what procs get called when a user does something on an application, what tools do you use? Profiler, Extended Events, or something else?
  • 08:26 ayed: Hello Brent, thanks for providing free content! Throughout your career, what is the maximum number of CPU cores and the largest amount of memory you’ve encountered in an operational SQL Server VM?
  • 09:11 Mary Puppins: What’s your opinion of the new copilot support in azure SQL DB for generating queries from English prompts? Seems like this could generate some inefficient queries.
  • 10:29 SQL_Linux: Hey Brent, I recently enabled the setting Optimize for Ad Hoc Workloads to resolve an issue with many single-use plans. Anyway, do you disable this setting when you do consulting if you find it enabled?

How to Explain What We Do

SQL Server
9 Comments

One of my favorite uses for AI is to ask it to explain a concept using wildly unusual points of view.

Well, we’re going to be seeing various family members and friends for the holidays, and we’ll get endless questions of, “So, what is it that you do?” I figured ChatGPT would be the perfect source for answers.

In limerick form, explain what database administrators do.

A database chief, quite astute,
Keeps information in route.
They fix every glitch,
Without any hitch,
Making data’s pursuit absolute.

While accurate and an interesting use of the limerick format, it’s not at all funny. I suppose I should clarify my prompt by adding, “Can you try examples with humor?”

In servers where data would roam,
A DBA made every byte home.
With indexes tight,
And storage just right,
He kept hackers far far from their dome.

That’s… pretty good actually! I don’t think I could explain what we do any better in limerick format. It’s hard to concisely explain this kind of thing – although bytes probably don’t make sense to folks who don’t use computers. Let’s refine our prompt: “Using a limerick format including humor, explain what database administrators do to people who have never heard of databases.”

A guardian of info’s grand hall,
Ensured nothing would stumble or fall.
When chaos would near,
They’d bring in good cheer,
And tidy up records for all.

That’s good, but it feels like the limerick format isn’t going to cut it. Let’s try another famous form of communication…

In haiku form, explain what DBAs do.

Let’s try this: Explain what a database administrator does, in haiku form, to someone who’s never heard of a database.

Keeper of records,
Organizing endless files,
Data safe and clear.

That’s… actually pretty accurate, but it doesn’t include the typical haiku references to nature and the seasons. Let’s ask for that by adding, “Can you include the traditional haiku style of referencing nature and seasons?”

Summer sun shines bright,
Maintaining databases,
Fields of data thrive.

Hmm. That’s actually the best one I got out of the batch, and even that one isn’t great. On the other hand, it’s better than anything I’d come up with in haiku format. Let’s try the longer form of haikus: “Using the humorous tanka form, explain what a database administrator does to someone who’s never heard of databases.”

Digital tidy,
Nerd with endless spreadsheets jokes,
Backups never fail,
Fixing errors with a grin,
Keeping secrets safe with flair.

Alright, I give up. Let’s try something much simpler.

Cutting right to the chase

Prompt: “I’m a professional database administrator. I’m about to meet up with my friends and family for the holidays, and I need to quickly explain what I do. Most of them have never heard of a database before. When someone asks what I do, what do I say?”


Absolutely, here are a few simple and relatable ways you can explain your role as a database administrator to friends and family who might not be familiar with databases:

1. The Digital Librarian – “I manage and organize the company’s digital information, kind of like how a librarian handles books in a library. I make sure all the data is stored safely and can be easily found when needed.”

2. Information Organizer – “Think of me as an information organizer. I keep all of our company’s data neat and accessible, ensuring everything runs smoothly and securely.”

3. Data Gardener – “I’m like a gardener for our company’s data. I plant, nurture, and protect information, making sure it grows well and stays healthy for everyone to use.”

4. Technology Troubleshooter – “I ensure that all our digital information systems are working properly. If something goes wrong with the data, I fix it so everything stays up and running.”

5. Digital Storage Manager – “I oversee how our company stores its information digitally. It’s my job to keep everything organized, backed up, and secure, so we can always access what we need.”

6. Information Security Guard – “I protect our company’s important data from being lost or accessed by the wrong people. It’s like being a security guard, but for digital information.”

7. Efficiency Expert – “I make sure our data systems are efficient and fast. This helps the company run smoothly by ensuring everyone can get the information they need quickly and reliably.”

8. Data Problem Solver – “When there are issues with our information systems, I troubleshoot and solve them. I ensure that our data is always available and functioning correctly.”


Wow! No matter what your background is, I bet you’ll find an answer in there that works for you. Happy holidays!


Are You Underpaid? Let’s Find Out! My Annual Salary Survey is Open.

Salary
1 Comment

Every year, I run a salary survey to help folks have better discussions with their managers about salaries, benefits, and career progression.

Take the survey now here.

The anonymous survey closes Sunday, January 12th. On Tuesday the 14th, I’ll publish the overall responses on the blog in Excel format so you can do slicing & dicing. The results are completely open source, and shared with the community for your analysis. You can analyze ’em now mid-flight – do not email me asking for edit permissions, buddy – but I’d wait until the final results come in. I’ll combine them into a single spreadsheet with the past results, and publish those here.

Thanks for your help in giving everybody in the community a better chance to talk honestly with their managers about salary.


How to Use Github in SSMS v21

SQL Server Management Studio v21 added native Git support, making it easier to use source control right inside SSMS.

This feature is for developers who are already used to working with Github to manage specific object changes as they work, and who are already accustomed to Git terminologies like branches, commits, pushes, and pulls. This feature is not for DBAs who think it’s going to automatically keep your SQL Server’s object definitions inside source control for you. I totally understand why you’d want that, for sure, and I do as well, but our princess is in another castle.

So back to developers – let’s set it up. To use Github in SSMS v21, we’re going to:

  1. Sign into SSMS using our Github account with permissions on the repo
  2. Clone the repo locally
  3. Add a branch to make our changes
  4. Commit the branch, push it, and open a pull request
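
If you’re more comfortable at a command line, here’s a rough sketch of those same four steps in plain git – the branch name, commit message, and file are just examples borrowed from later in this walkthrough, not gospel:

```shell
# 1. Clone the repo locally
git clone https://github.com/BrentOzarULTD/SQL-Server-First-Responder-Kit.git
cd SQL-Server-First-Responder-Kit

# 2. Make a branch for our changes, based on the remote dev branch
git fetch origin
git checkout -b 3586_sp_Blitz_memory_counters origin/dev

# 3. Commit our changes locally
git add sp_Blitz.sql
git commit -m "#3586: report memory counters"

# 4. Push the branch up to Github, then open the pull request on github.com
git push -u origin 3586_sp_Blitz_memory_counters
```

SSMS is doing these same operations behind the scenes – the menus are just a friendlier face on them.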

Let’s walk through the specifics with screenshots. When you first open SSMS v21, you’re prompted to sign in:

Logging into SSMS

For the sake of simplicity, I’m going to sign in with my Github account.

Cloning a Repo Locally

After SSMS starts, click File, Clone Repository:

Clone Repository

SSMS will ask which Github repo you want to clone, and what path you want to clone it to:

Picking a repository and folder

The repository location is a little picky – it has to be the exact root folder of the repo, without subfolders like /tree or /tree/main. I’ll be working with the First Responder Kit, so the main URL for that is: https://github.com/BrentOzarULTD/SQL-Server-First-Responder-Kit/

Cloning the FRK

Click Clone at the bottom of the window, and SSMS will clone the repository locally. After it finishes, you’ll have lots of options in the Git menu:

Git menu

If you click Git, Manage Branches, you get an overview of what’s happening in the repo at the moment:

Managing Git branches

Click View, Solution Explorer, and you can go into the folder where the Github repo has been cloned (downloaded) locally – it’s on the right hand side of my window:

Solution Explorer

This is a little bit tricky. What SSMS did was create a completely isolated, standalone local repository. I’m not actually working with the origin repository directly, even though:

  • I’m signed into SSMS with my Github account
  • I’m the maintainer for the First Responder Kit repo

Those two things are unrelated. When SSMS cloned the repo, it acted as an anonymous member of the public. You kinda get a hint of this when you look at the settings & changes – it says things like you need to configure a username & email address for this repo:

Configure your repo

To set a global default for all of your repos, go into Git, Settings, Git Global Settings, and the first two entries are user name and email:

Git global settings
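
Under the hood, those SSMS settings are just writing plain git config values. If you’d rather set them at the command line, it’s the same two entries – the names and email addresses below are placeholders, obviously:

```shell
# Set your default identity for every repo on this machine
git config --global user.name "Your Name"
git config --global user.email "you@example.com"

# Or override it for a single repo, from inside that repo's folder
git config user.name "Work Account"
git config user.email "work@example.com"
```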

Just be aware that when you’re working with SSMS, you’re dealing with a completely standalone local set of repositories, even if you’re a project maintainer. (This is different than how Github Desktop works for project maintainers.)

Making a New Branch for Changes

I’ve got an issue I need to work on, so let’s create a new local branch to hold my changes. In the list of branches on the left, right-click on the default development branch for your remote repo – in this case, remotes/origin/dev – and choose New Local Branch From:

New Local Branch from remote

This is important. If you create a branch any other way, like clicking Git in the File menu, then your new branch is going to be based off whatever local branch happens to be highlighted in the background. You probably don’t want your code based off your local code – you wanna stick with the current remote repo that you’re working with. I’ll name my branch with my Github issue number and a brief description:

New local branch name

Then in the list of branches on the left, I’ll click my newly created branch, and click View, Solution Explorer to get the list of files on the right hand side:

Local project

I’ll make some code changes to fix my particular issue, and after saving my changes and closing the files, I’ll click Git, Commit or Stash:

Git Commit or Stash

Whenever I make changes, I like to double-click on each of the files that have been changed, then take a look at the diff to make sure that I’m only committing changes I really want. Sounds crazy, but I’ve just left so many accidental things inside my commits, like testing queries. In the example here, I’ll double-click on the sp_Blitz_Checks_by_Priority.md file (Markdown), and the diff viewer pops up, highlighting my changes:

Diff viewer
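
The command-line equivalent of that pre-commit double-check is git diff – the file name below is just the one from this example:

```shell
# Review every change you haven't staged yet
git diff

# Review just one file
git diff sp_Blitz_Checks_by_Priority.md

# After staging with git add, review what's about to be committed
git diff --staged
```

Same idea either way: eyeball every line before it goes into the commit, so your testing queries don’t end up in the repo.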

After poking around in my changes, satisfied that I’m only committing good stuff, I’ll write my change description:

Change description

And click “Commit All”. This commits the change locally in my C drive, but not at the origin (Github), as shown in the confirmation message at the top after I clicked Commit All:

Committed locally

To push it to the origin, I’ll click Git, Push, and I’ll get a confirmation that my changes were pushed up to Github:

Push confirmation

Now I need to do a pull request to get the maintainers (me) to accept my code. There’s a “Pull” option in the Git file menu, but that’s just for your local repo. You have to click Git, Github, New Pull Request.

New pull request

Note that at the top of the screen, it shows “Merge into” and “From”. We’re requesting that our pull get merged into origin/dev, from origin/3586_sp_Blitz_memory_counters. If you were working from a purely local repo (because you weren’t the project maintainer), then the “from” would show your local repo instead.

I’ll make sure my commit notes and my changes look right, then click the Create button at the top left, and I get a confirmation:

Pull request confirmation

Click on the pull request link to go to Github.com, set milestones, assign people, check the results of your continuous integration tests, and merge the PR.

If, like me, you have Github set up to automatically delete branches after they’re successfully merged, you’ll need to update SSMS’s list of branches. Click Git, Manage Branches, then Git, Sync, and you’ll get a warning saying that your branch was not found in the remote repo, as shown at the top here:

Remote branch deleted
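
At the command line, the equivalent cleanup is a pruning fetch, then switching back to the development branch – branch names here match this example:

```shell
# Refresh remote branches, and drop any that were deleted on Github
git fetch --prune

# Switch back to the development branch and pull the merged changes
git checkout dev
git pull
```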

That’s totally okay! It’s just your sign that you wanna switch to another branch, or make a new branch to work on your next issue, and the process starts all over again.


[Video] Office Hours in Zhengzhou, China

Videos
0 Comments

For the rest of December, I’m out vacationing in China and Japan. For this episode, I’m coming to you from the People’s Park in Zhengzhou, China, which is kinda like the Chicago of China: it’s in the middle of the country, bigger than people think (13 million people), and has beautiful parks.

Here’s what we covered:

  • 00:00 Start
  • 01:50 Uno: What’s the easiest way to set up a secondary server for failover? My friend runs an app on a single virtual server. We take nightly backups, but would like a better option to avoid losing an entire day’s worth of data. Any suggestions?
  • 03:55 GoingAnonymousThisTime :D: I’ve set up an XE using sp_HumanEvents in order to track queries over 5s in a db. After a few minutes, I see high LATCH_UP on tempdb GAM and queries pile up, blocking the entire server. Everything goes back to normal as soon as I stop the XE session. Any idea why this could happen?
  • 05:22 Josef: What are some good books for beginners learning SQL? Is ‘SQL in a Nutshell’ by Kevin Kline a good choice? Any other recommendations?
  • 06:09 Juan Pablo Gallardo: With the current hardware capabilities, SSDs, and huge RAM availability, do you think SQL would be replaced by solutions that do not require relational dbs, like Cosmos DB? Why would you need a relational database datamodel if you can store all in one big table?
  • 06:53 MyTeaGotCold: Has the lack of columnstore on Postgres bothered you?
  • 07:21 Dogpool: What is your opinion of the new MSSQL extension for VSCode? Is the query plan display as good as SSMS?
  • 08:17 Josef: Should data, logs, tempdb, and backups be placed on separate partitions when installing SQL on a local machine with local storage?
  • 09:20 Mary Puppins: Found a SQL VM query that is suffering from parameter sniffing but can’t refactor it because it’s embedded in C#. Any tips?
  • 10:05 Josef: Do you have any plans to visit India in the future? Just curious since I saw you’re visiting China and Japan this December!
  • 11:07 Ricardo: Hi Brent, on a new customer’s prod server I can see everything coming from the VB sql-generator has isolation level ReadUncommitted. Can this slow performance?
  • 12:22 Vishnu: What are your pros / cons of having your SQL Data warehouse VM and SQL OLTP VM on a single VM vs two separate VMs? Which do you see more of?
  • 13:21 Ricardo: How should I handle my manager, who insists on changing production data himself to fix immediate application issues, when all my advice and coaching is ignored?
  • 20:02 KyleDevDBA: How do you go about trying new skills/concepts that you are learning? When trying to learn something, I often get stuck at the “I have a hammer, where can I find a meaningful nail to hit” stage.