[Video] Office Hours: Hong Kong Harbor

Videos
2 Comments

I recently went to Shanghai & Hong Kong, and stopped to take your questions from https://pollgab.com/room/brento while sitting next to the Hong Kong harbor.

Here’s what we covered:

  • 00:00 Start
  • 01:18 Jason G – RN & Accidental DBA: Would you elaborate on DB Owner implications? sp_Blitz help recommends using the SA account, but the articles referenced by Andreas Wolter advocate for using low privileged accounts which are DB specific. Which would you recommend and why?
  • 02:44 MyTeaGotCold: Can you name any good relational databases that aren’t built around SQL? It’s strange that a system so old is still the best.
  • 03:58 John: Hello Brent. Is SQL Server 2022 ready for production/prime time and more more huge bugs/issues? Asking due to last blog post was in 2023 on that particular topic. Thanks.
  • 04:57 Chicago Joe: Is there a trend to move database access to API only? I am asking because we are moving to next version of ERP and our CIO has told database developers that only access to new database will be through a Web API. Database is still on prem on next version, too.
  • 06:23 J. Fisher: Hey Brent, Are you able to comment/explain SQL Server “Native” Geography/Geometry datatypes, other CLR stuff, and how they can use and exhaust “App Domain” memory… leading to “Unloading due to memory pressure”… Can’t “afford” to keep adding memory.
  • 08:19 Steve E: Hi Brent, Is there a way to assess overall reads per table across a workload in an attempt to see which tables we might want to focus our index tuning efforts to? Eg if the Posts table has 90% of the overall workload reads, we would probably want to start our index tuning there.
  • 09:29 neil: dev thinks “azure” will solve all their problems. (they dont understand we’re already sql on azure vm). they’re committing all the same mistakes that created disasters on-prem. what surprises are they in for ?
  • 11:12 Dream catcher: What time do you like to go-to bed and wake up? Do you nap after lunch?
  • 11:43 ChompingBits: What do you think is nominally the difference between ADF and SSMS? ADF has query plans and access to many of the admin tools and reports in SSMS. How long do you think Microsoft will continue to offer both tools.
  • 12:45 gringomalbec: Hi Brent, we realize you recommend not using Linked Servers to connect to other MS SQL Servers. But my friend asks if you find ok using Linked Servers to download data from sources other than MS SQL Server that cannot be connected directly in SSMS using Database Engine ?

How to Have a Conversation About Software Licensing

Licensing
4 Comments

As a consultant, I sometimes see SQL Servers with high CPU core counts relative to their workload. For example, the situation that inspired this post was seeing half a dozen SQL Servers with ~50 cores each, hundreds of cores total – yet they consistently had just 1-5% CPU usage.

Here’s what I said to management:

Right now, you’re running SQL Server on 280 cores. SQL Server Enterprise Edition licensing is about $7,000 per core, so you’ve installed $1,960,000 worth of licensing. If an employee or former employee goes to the BSA and reports you, they may get a pretty sizable cash reward. I’m not the licensing police, and I’m not going to ask if you actually paid for that licensing. I’m going to phrase my question very carefully: would you like help reducing your licensing footprint?

Every now and then, the managers on the call say the company actually is licensed for that many cores, and it usually involves a funny story about someone getting a really good deal. For example, one of my clients got an amazing licensing agreement with Microsoft where they paid for very few cores initially, and they only had to “true up” their licensing every 5 years. Immediately after signing the agreement, they installed dozens of large Enterprise Edition servers. They knew they wouldn’t have to pay for those servers for another 5 years, and by then, they expected their company to be acquired by a much larger one. Good times.

But most of the time, the managers get really wide eyes, and they start stammering and making excuses. On this particular call, the manager immediately started telling the employees, “Don’t get any ideas about reporting us,” and laughed nervously.

I’m completely chill about it – after all, it’s not like their company owes ME for the licensing, and I’m not about to be the one who calls the BSA. Besides, the terms & conditions for BSA rewards are hilariously bad, making it pretty damn unlikely that anybody ever gets a reward. I do like mentioning that, though, because it gives managers an incentive to fix the problem quickly.

The reason I care, as a consultant, is that often:

  • Managers didn’t realize how much money is at stake.
  • Once they know, their performance problems become a much bigger issue (because now they’re concerned about cutting horsepower.)
  • Once they know, the project often changes scope – because now we’re probably going to build new SQL Servers with dramatically lower core counts, and migrate over to those.

When we build those new SQL Servers, we can make other corrective changes that will help their particular bottlenecks. For example, if the workload is memory-starved, we can throw much more memory in the new boxes – because management now understands that the cost of a terabyte of memory is nothing compared to a $2M licensing bill.

This usually starts a whole new discussion with management, and the next step usually involves discussing my 7 rules for SQL Server licensing.

So far, there have only been a couple of times in my career when the client’s management decides that they’re going to keep pirating SQL Server. As a consultant, I don’t wanna go anywhere near companies like that – because when the licensing police inevitably come calling, the company will say, “Well, Brent Ozar worked on these systems, and he told us that it was completely fine.” That’s why I’m out as soon as I find out that the company is purposely pirating the software, and on my way out the door, I leave a very clear paper trail behind me.


Announcing Offline Access to Your BrentOzar.com Courses

Company News
2 Comments

So you’ve got an iPhone or iPad or Vision Pro, and you bought one of my training classes or the Fundamentals or Mastering passes, and you wanna watch the classes while you’re offline. Perhaps you wanna learn on the train ride in to work, or perhaps on the weekends, you like to completely disconnect from the world and retreat into a bunker lined with a Faraday cage. I won’t judge.

Go into the app store and get the Teachable online courses app:

Teachable app

Log in, and you’ll see your list of courses:

Your courses

Click on a course, click on a lesson, and at the top right of each lesson, there’s a download button:

Download button

Click the download button, and … well, you’re an IT professional, and you can figure out what happens next. When you want to clear out your downloads, go into the app’s settings.

I’ve asked Teachable to enable a download button at the course level so you can download entire courses rather than individual modules. I can’t promise a delivery date on that, obviously, but they’re aware of the request.

Enjoy!


[Video] Office Hours: Career & Data Q&A

Videos
2 Comments

A few career and security questions found their way into the queue at https://pollgab.com/room/brento for this episode:

Here’s what we covered:

  • 00:00 Start
  • 02:10 Ethan: What percent of your constant care shops require encryption to connect with SQL server? What are your thoughts on mandatory encryption?
  • 03:24 Miles: Hi Brent, an app that spawns multiple spid’s,multiple txns.While troubleshooting BLOCKING,initially thought to trace one spid and all sql stmts for that spid.But it turned out, multiple spids r getting involved? how to troubleshoot issues when multiple spids & db’s are involved?
  • 05:13 Miles: Hi Brent, Does MSDTC txns gets translated to serializable isolation level? but when I check the additionalinfo column output in sp_whoisactive, it shows ReadCommited. Other thing, even when I am using RCSI, it shows isolation level as ReadCommitted? am i missing anything?
  • 05:58 Ignacio: Do you have a recommended formula for required number of DBAs to number of SQL Server VMs managed?
  • 06:45 Karthik: What’s your opinion of ChatGpt ability to analyze a pasted query execution plan?
  • 07:53 DBA_JR: Hello Brent, why does SQL Server allocate space in tempdb for running a query? And when does this allocation occur? Upon enabling Read Committed Snapshot Isolation (RCSI), why does the tempdb size continue to increase? My database contains 3TB of data.
  • 08:54 Ben Grimm: What are your thoughts on making the domain user for the SQL account a local admin on the VM?
  • 09:09 Pedro Gonzalez: Hi Brent, I’m one of the junior DBA admins in a mid size company, and I want to be like you. Where do I start? I like to read documentation, so they can hire me for that.
  • 10:58 MyTeaGotCold: I find temp tables so valuable to performance tuning that my go-to solution for tuning functions is to convert them to stored procedures. Is this a common problem?
  • 12:17 Steve E: Hi Brent, You often say 5 Non clustered indexes per table is a good rule of thumb, does this advice change when working with very wide data warehouse tables which have more than five columns users often search on?
  • 13:10 Ignacio: Does sp_Blitz check for instant file initialization enablement?
  • 14:10 Sean C: Hi Brent, do you have any experience and/or resources you could point my friend to regarding unit testing for SQLServer procs, etc? My friend has some idea of what unit tests are, but is having trouble visualizing how this is accomplished in TSQL; he’s pretty dumb.
  • 15:16 Atherol: Why does the four day work week work well for the U.K. but not the U.S.A.?
  • 16:01 Ethan: Will strict TLS 1.3 encryption in SSMS20 cause users to think twice about upgrading from SSMS19?
  • 16:30 Karthik: How do you keep your mind from racing at bedtime?
  • 17:55 Ignacio: How does Aurora PostgreSQL failover speed compare with Azure SQL DB / M.I. / V.M. failover speeds?

Your Teammates Need the Fundamentals.

Conferences and Classes
0

You’ve been reading my site for a while, watching videos, keeping up with the best SQL Server techniques. However, you’ve got a lot of other folks on your team who don’t have as much free time as you, or they’re just getting started on their SQL Server performance tuning journey.

Forward this post to them, because they need to know about Fundamentals Week, May 6-10.


Hey you! You love live classes. You’ve tried watching recordings or stepping through demos yourself, but…you just can’t block out the time, and you can’t stay motivated while you’re watching a recording. You like the fast-paced energy of seeing me live, asking me questions, and seeing me respond to your chats.

You hate free conferences that feel like Zoom meetings. You’ve tried attending a few online events, but they felt just like sitting through one meeting after another. They didn’t dive deeply into any one subject – they just hopped around from one topic to another, and many of those topics just weren’t relevant to you. They were good for free stuff, but…

You’re ready for Fundamentals Week! You’ll spend the week with me learning about indexes, query tuning, columnstore, how I use the First Responder Kit, and more.

The conference dates are May 6-10, 2024, and classes are 10AM-6PM Eastern, 7AM-3PM Pacific. Here’s what we’ll cover:

Register now for $1,595. To keep prices low, this price does not include the class recordings. If you want those, during checkout, you’ll see an offer to add lifetime access to my Recorded Class Season Pass Fundamentals for just $395. See you in class!


Using ChatGPT for T-SQL Code Reviews

T-SQL
23 Comments

For this month’s T-SQL Tuesday, Pinal Dave asked us if AI has helped us with our SQL Server jobs. For me, there’s been one instant, clear win: code reviews.

I usually keep a browser tab open with ChatGPT 4, and I paste this in as a starting point:

You are an experienced, diligent database developer who specializes in Microsoft SQL Server. Review the below stored procedure and decide whether it has any security or performance risks.

Then I paste in the code right below that. Here’s an example output using bad dynamic T-SQL. ChatGPT 4 cut straight to the point, identified the risk of SQL injection, and even rewrote the query for me in a less-vulnerable way, plus pointed out the performance risks of using LIKE for string searches.

In milliseconds.

This particular example is a really short piece of code, and even here, the advice isn’t perfect. ChatGPT renamed my @StringToExecute variable, and I’ve experienced tons of similar issues with client code. It’ll introduce subtle changes that are unrelated to the problem it’s trying to solve, but… for a blazing-fast code review, I’ll take it.

The more complex your code gets, the more vague ChatGPT’s advice becomes. For example, I took a query from my Mastering Query Tuning class’s deadlock module and asked ChatGPT why this query is getting deadlock errors. ChatGPT’s answer was really long, but mostly devoid of value. There’s one tiny good nugget in there, but I feel really bad for someone tasked with reading ChatGPT’s wall of text and extracting the tiny nugget that will actually solve the problem. I even tried a very directed, suggestive prompt for ChatGPT to lead the horse to water, and it still didn’t drink. It’s not bad advice, per se – it’s just really wordy and misdirected.

Still, though, I love using this as a very, very, very fast sanity check on code. I also like doing it together while sharing screens with client staff too because:

  • I can get code recommendations faster than they can explain what the code is doing
  • The client’s tech staff learns a new technique to get peer review on their own code
  • The client’s management staff sees firsthand that 2024-level AI is a helpful tool, but not an end-all solution, because a lot of the advice requires human interpretation (and a lot of it is garbage)

Is it a perfect tool? No, but it’s very productive. Like Mitch Ratcliffe said, “A computer lets you make more mistakes faster than any other invention with the possible exceptions of handguns and tequila.”


[Video] Office Hours: Just Three Questions

Videos
2 Comments

Sometimes y’all post questions at https://pollgab.com/room/brento that require extended answers. Today, I’m hitting just 3 questions:

Here’s what we covered:

  • 00:00 Start
  • 00:44 MyTeaGotCold: If storage is no issue, do I need to worry about the size of Query Store? It’s going to hit 10% of the size of my largest database. I have no interest in changing QUERY_CAPTURE_MODE away from ALL or my retention period.
  • 04:12 DB-Ay?: For a high activity table, is it worth updating stats at a higher percentage just for auto-update to resample at a lower rate within an hour? I’m not able to persist the value and auto-update must remain enabled. I think my only option is to exclude it from the custom stats job.
  • 08:39 David E: I took over for an idiot who setup every database to have 4 transaction Log files per database. I have tried everything I know to remove the files, but it errors out everytime. Never had to deal with before in my career. I have tried some things with no success.

Who’s Hiring in the Microsoft Data Platform Community? April 2024 Edition

Who's Hiring
6 Comments

Is your company hiring for a database position as of April 2024? Do you wanna work with the kinds of people who read this blog? Let’s set up some rapid networking here.

If your company is hiring, leave a comment. The rules:

  • Your comment must include the job title, and either a link to the full job description, or the text of it. It doesn’t have to be a SQL Server DBA job, but it does have to be related to databases. (We get a pretty broad readership here – it can be any database.)
  • An email address to send resumes, or a link to the application process – if I were you, I’d put an email address because you may want to know that applicants are readers here, because they might be more qualified than the applicants you regularly get.
  • Please state the location and include REMOTE and/or VISA when that sort of candidate is welcome. When remote work is not an option, include ONSITE.
  • Please only post if you personally are part of the hiring company—no recruiting firms or job boards. Only one post per company. If it isn’t a household name, please explain what your company does.
  • Commenters: please don’t reply to job posts to complain about something. It’s off topic here.
  • Readers: please only email if you are personally interested in the job.

If your comment isn’t relevant or smells fishy, I’ll delete it. If you have questions about why your comment got deleted, or how to maximize the effectiveness of your comment, contact me.

Each month, I publish a new post in the Who’s Hiring category here so y’all can get the latest opportunities.


Announcing Fundamentals Week, May 6-10

Conferences and Classes
3 Comments

You’ve attended some of my free online streams, read my blog posts, and you use the First Responder Kit. You enjoy my laid-back, humorous approach to sharing what I’ve learned over the years.

You love live classes. You’ve tried watching recordings or stepping through demos yourself, but…you just can’t block out the time, and you can’t stay motivated while you’re watching a recording. You like the fast-paced energy of seeing me live, asking me questions, and seeing me respond to your chats.

You hate free conferences that feel like Zoom meetings. You’ve tried attending a few online events, but they felt just like sitting through one meeting after another. They didn’t dive deeply into any one subject – they just hopped around from one topic to another, and many of those topics just weren’t relevant to you. They were good for free stuff, but…

You’re ready for Fundamentals Week! You’ll spend the week with me learning about indexes, query tuning, columnstore, how I use the First Responder Kit, and more.

The conference dates are May 6-10, 2024, and classes are 10AM-6PM Eastern, 7AM-3PM Pacific. Here’s what we’ll cover:

Register now for $1,595. To keep prices low, this price does not include the class recordings. If you want those, during checkout, you’ll see an offer to add lifetime access to my Recorded Class Season Pass Fundamentals for just $395. See you in class!


Tomorrow’s the Big Day: Free Live Webcasts!

Conferences and Classes
0

No fooling here. Tomorrow, I’m teaching my very favorite class live online, for free, two different ways – one with all slides, and one with all demos. Whatever way you like to learn, I’ve got you covered. So, what’s the subject?

You’re comfortable writing queries to get the data you need.

But you’re uncomfortable if someone asks you how it works, how SQL Server and Azure SQL DB turn your queries into results. You know there are execution plans, but … you’re uncomfortable reading them. You know queries need indexes, but you’re not sure which columns to put in order, or how SQL Server chooses between them.

How to Think Like the SQL Server EngineI’d like to teach you How to Think Like the SQL Server Engine for free.

In a 2-hour live session, you’ll learn the differences between clustered & nonclustered indexes, why seeks aren’t necessarily good and scans aren’t necessarily bad, why queries run so much faster in development than production, and much more. I wanted to design the most important concepts you could learn in your first 2-hour training session – whether you’re a developer, report writer, sysadmin, or DBA.

I’m teaching the class two different ways: one with slides, and one completely in Management Studio. Both classes cover the same material, but the all-demo version lets me go off on interesting tangents based on your questions. I’d encourage you to attend the slide one first, then circle back for the demo one so that you can really soak in the material and get your questions answered. If you’re busy, and you can only do one, I totally understand!

Slide version, all April 2nd, Tuesday, links are to calendar invites:

Demo version, all April 3rd, Wednesday:

No registration required – just add the event to your calendar, and at the time of the event, hop into https://www.brentozar.com/training/live/ or my YouTube channel to watch the live stream. See you in class!


[Video] Office Hours: 25 Answers in 10 Minutes

Videos
1 Comment

It’s time for a speed round, a rapid-fire tour of your top-voted questions from https://pollgab.com/room/brento:

Here’s what we discussed:

  • 00:00 Start
  • 00:17 SteveTV: Hi Brent! Are there solutions for managing SQL Agent jobs that obviate the need for manual updates and write access to Job definitions in SSMS? We need to remove write access to agent jobs, and create a deployment strategy that includes approval, change history and testing.
  • 00:36 MyTeaGotCold: I have Query Store enabled on every database and it is overwhelming. Any advice for AUTOMATICALLY monitoring Query Store data across multiple databases? Both sp_BlitzQueryStore and sp_QuickieStore are designed for single databases.
  • 00:53 Miles: Hi Brent, Good Morning. Having too many VLFs will cause blocking? if so, why? Also, how many VLF count is a good VLF count to maintain so that database recovery doesn’t take long time. Thank you.
  • 01:16 Drew: We send out SQL backups direct to Azure and recently the verify backup is taking hours. Is this an Azure bandwidth issue or a SQL server issue?
  • 01:37 22DBA: for how long do you think the DBA full-time position will exist?
  • 01:56 NotCloseEnough2RetirementToStopLearning: Hi Brent, are you seeing any signs that Microsoft Fabric is catching on enough to merit career investment or is just another failed repackaging?
  • 02:11 Alexey: one of SQL server 2016 enterprise has 3500 opened sleeping sessions when I am running sp_whoisactive. Is it reflect on a performance overall?
  • 02:36 Ignacio: Have you read James Serra’s new book on Data fabric, data lakehouse, and data mesh architectures?
  • 02:56 Peter: Hi Brent, as and when you update sections of your recorded classes do you plan to email subscribers? I would rather not have to re-watch everything to find useful updates.
  • 03:04 Barry Allen: Will we ever see a flash office hours (30 seconds) YouTube short?
  • 03:21 Karthik: Have you ever run into the issue where query plan shows warning of a missing statistic for a column but that statistic / index actually already exists? We had to fix this by manually rebuilding stats. Not sure what causes this.
  • 03:37 Carrie Jeffries: Does anyone have a work around for applying both CUs and GDRs with sccm? We are told to we can only patch CUs or GDRs with sccm now and we have to pick one branch? https://learn.microsoft.com/en-US/tro…
  • 03:54 WhatsNext : Hello Brent. I have been a production DBA on SQL Server for 15 years. Lately, have felt like I want to learn something new. BI and Analytics being popular, do I go with Snowflake or Databrick? Look at AI and ML? I have never been much of an analytics or data visualization guy.
  • 04:25 MyTeaGotCold: When should I replace a temp table with a memory-optimised table variable?
  • 04:44 Miles: Hi Brent, query is running slow. spid’s waitype is CXPACKET. After digging further one of the thread wait on SOS_SCHEDULER_YIELD. Server has 32 cpu’s,CPU util is under 15% nothing else is running. Why it is had to wait for cpu when it is not a high cpu condition on the server?
  • 05:29 Greef Karga: Is there much demand for a PostgreSQL version of constant care? How hard would a port be?
  • 05:54 Vittorio: What contingencies do you have in place in the event that Amazon raises prices on Aurora PostgreSQL by 5x? Would you move to another cloud?
  • 06:25 Captain Lou: In SQL 2019, Is it ok to leave LIGHTWEIGHT_QUERY_PROFILING turned on or should it only be enabled situationally?
  • 06:42 Finch: Brent, what’s your recommendation on when to create a new SQL instance vs. when to just add additional databases in an existing instance.
  • 07:14 Miles: Hi Brent, seeing INSERT blocking SELECTs? hoping that SQL inserts data at the end of the table and those r new rows so it shouldn’t block any spid. We have application which INSERTs based on SELECT. eg) INSERT INTO TNAME SELECT.. WHERE …; Does it make a difference? any fix?
  • 07:46 GP Geek: What’s best way to analyze a nested query with multiple UNION ALL that runs forever apparently on only one or 2 of the UNION query..
  • 08:06 Oswald: What are your thoughts on the newer cloud DB’s that are managed / accessed via API’s? How might this affect the DBA role and required skills?
  • 08:25 Does bout nothing: If you were a Microsoft SQL PM, what top things would you do to retake market share from PostgreSQL?
  • 08:50 Jr Wannabe DBA: Hi Brent, 3 years later from your blog post, what is your opinion about PolyBase? Would you recommend it as an option to replace Linked Servers?
  • 09:30 Ólafur : What do the default transaction isolation level metrics look like for your constant care shops?

[Video] Office Hours: The Long One

Videos
2 Comments

You posted a lot of great questions at https://pollgab.com/room/brento and I spent almost an hour covering these:

Here’s what we covered:

  • 00:00 Start
  • 02:05 Kulstad: I’m using Ola Hallengren’s maintenance scripts for my db maintenance on Saturday evenings, and I’ve noticed my memory consumption go from approx 35% usage during the business week to 85% on Monday morning. How can I troubleshoot the scripts that are causing this increase?
  • 03:28 mailbox: Hey Brent, What is the difference between vertically partitioning a table into 2 tables, and creating a non-clustered index on a subset of the columns that would have partitioned on?
  • 04:46 Hankthedba: Hi Brent, I manage about 90+ SQL Servers and struggle to keep up with Patching. What is the best method to ensure that each env are keep up to date. I have various version and edition and trying to create some kind of PowerShell script that can assist. What are your thoughts?
  • 07:27 Jorriss: Are there any other tools to unit test SQL other than tSQLt? Asking for a friend.
  • 08:29 Nazan: Is there a good way to send out notifications any time someone creates a new DB in boxed SQL Server?
  • 10:51 Raffi Musiker: Will there soon be a day when AI can rewrite all code / sp’s for a one DB vendor to another DB vendor? How might this affect the DB landscape?
  • 11:50 Elnor: What is your opinion of the OPTION (FAST N) query hint? Has it ever got you across the finish line?
  • 14:25 Ozan: Hi Brent, on a VMware infrastructure, would you recommend a vSAN storage config and then two different vSAN’s for data and log files? Would you enable HA for the ESXi host or is AlwaysOn HA on the SQL Server VM enough? Thanks!
  • 20:18 Agnes Jurati: How would you implement query charge back by connecting application? Which flavor of SQL is best suited for this?
  • 22:21 Rom: Do you think we will ever see support for cross DB queries / SQL CLR in Azure SQL DB or is it better off without these features?
  • 24:03 MyTeaGotCold: If index maintenance is now an outdated idea, why are Ola’s scripts still commonly recommended?
  • 24:46 Thanos Tokakis: What are your pros/cons of putting relational data into a non-relational DB (i.e. Cosmos) and non relational data into relational DB (SQL Server)?
  • 26:35 sandimschuh: Temp table (~100 rows) inner join big table (~ 40GB and 800M rows). Each row from
  • #t matches 5-25 rows in BT. BT is clust. on join key. Had efficient NL until recently when deleted some 100k rows from BT I get clust. index scan and a Column with no stats warning on
  • #t. Any Idea?
  • 27:32 Stone Breaker: Does PostgreSQL full text search compare favorably with SQL Server full text search?
  • 28:24 Drinking Violet: Should Microsoft use AI support to convince peeps to migrate from SSMS to Azure Data Studio?
  • 29:31 Youlika S: Do you have a recommended way to parameterize the order by column when using dynamic sql that is performant, safe, and doesn’t trash the plan cache?
  • 30:18 db_ape: I am trying to automate weekly database restores on test environment. Using sp_databaserestore, how can I close existing connections on existing test database ?
  • 35:37 iliyan Rashev: Hey Brent, my question is how total_worker_time is correlated with CPU utilization? I thought that higher total_worker_time means more CPU utilization, but it is not the case obviously for me. I have master and tempdb as highest consumers on AwalysON Secondary replica
  • 36:32 Geordi: How is Azure SQL DB plan cache better/worse than SQL VM plan cache?
  • 37:27 Lwaxana: Does Microsoft notice / care that Office Hours reduces the need for Microsoft SQL Server support tickets?
  • 39:50 DBA in VA: Hi Brent – looking at index usage stats, what’s the difference between a singleton lookup and a read? I know that a singleton lookup is a traversal through the b-tree to return a single record — but then what is a read and why are those numbers so different in my usage stats?!
  • 40:17 Dimez: We have queries that regress in performance due to a changes in execution plan (query store). This sometimes happens after a statistics update or when the customer comes in Monday morning after the weekend. How can we consistently keep good known plans on Standard Edition of SQL?
  • 41:23 Kore Soong: What is your favorite tool for looking at session waits for a given query?
  • 41:58 Go sports!: Does Vegas living change much when the super bowl is in town?
  • 45:09 RoJo: You mention using realworld data to Load test the database – especially for scale. Is there a mechanism to replay data into the DB and to throttle it up for Load testing ?
  • 45:35 Smaragda K: What’s your opinion of DB sharding for relational DBs? Which product does it best?
  • 48:04 Venkat: What was your old saying about DB performance? You can have 2 out of 3 of A, B, C but not all.

Join Me for Free Live SQL Server Training Next Week.

Conferences and Classes
3 Comments

Next week, I’m teaching my very favorite class live online, for free, two different ways – one with all slides, and one with all demos. Whatever way you like to learn, I’ve got you covered. So, what’s the subject?

You’re comfortable writing queries to get the data you need.

But you’re uncomfortable if someone asks you how it works, how SQL Server and Azure SQL DB turn your queries into results. You know there are execution plans, but … you’re uncomfortable reading them. You know queries need indexes, but you’re not sure which columns to put in order, or how SQL Server chooses between them.

How to Think Like the SQL Server EngineI’d like to teach you How to Think Like the SQL Server Engine for free.

In a 2-hour live session, you’ll learn the differences between clustered & nonclustered indexes, why seeks aren’t necessarily good and scans aren’t necessarily bad, why queries run so much faster in development than production, and much more. I wanted to design the most important concepts you could learn in your first 2-hour training session – whether you’re a developer, report writer, sysadmin, or DBA.

I’m teaching the class two different ways: one with slides, and one completely in Management Studio. Both classes cover the same material, but the all-demo version lets me go off on interesting tangents based on your questions. I’d encourage you to attend the slide one first, then circle back for the demo one so that you can really soak in the material and get your questions answered. If you’re busy, and you can only do one, I totally understand!

Slide version, all April 2nd, Tuesday, links are to calendar invites:

Demo version, all April 3rd, Wednesday:

No registration required – just add the event to your calendar, and at the time of the event, hop into https://www.brentozar.com/training/live/ or my YouTube channel to watch the live stream. See you in class!


[Video] Office Hours in Snowy Telluride, Colorado

Videos
0

On a beautiful sunny afternoon in the mountains, I went through your top-voted questions from https://pollgab.com/room/brento. Audio is a little rough on this one because I used a new wireless microphone and forgot to put the wind shield on it.

If you’d like to watch the spatial video version on an Apple Vision Pro or Meta Quest, download this 2GB video directly – YouTube doesn’t seem to support MV-HEVC video just yet. If you’re on an AVP, go to that URL, then when the video starts to play, click Stop. (Safari doesn’t play spatial videos yet.) Click the Share button up by the URL, then choose Save to Files, and save it locally on the AVP’s file system. (Don’t save it to your iCloud Drive – you don’t wanna bother taking up space.) After it saves, then go into Files, find the video file, and play it from there. Sound convoluted? It absolutely is! I look forward to YouTube (or Vimeo or someone) supporting MV-HEVC uploads, and Safari supporting MV-HEVC playback. Sharing & playing spatial video in Feb 2024 blows.

Here’s what we discussed:

  • 00:00 Start
  • 00:41 Artyom: Hi, Brent! We currently have a huge corporate dwh (100+ db, ~50TB in size) with mainly olap workload (reports, analytical ad-hoc). Queries are suffering from latches (pagelatch_up/sh) in tempdb. Any advice on what to try? Can it be tuned or do we need better hardware?
  • 01:53 MyTeaGotCold: How can I test something under the same load as Production? My test server is identical, but I don’t know how to give it identical burdens.
  • 04:31 Q-Ent: Hello Brent. I see that most of advises around SQL server performance tuning are related to SELECT and rarely on DML queries. Is this because they follow the same logic or we don’t have a chance for better performance?
  • 05:57 tominyorks: Have you ever seen Query Store and Optimize for ad-hoc cause an extreme increase in log growth? Using Erin Stellato’s suggested defaults for QS settings, I saw log sizes grow up to 2000% and backups take 200 times longer so now I’m trying to figure out out to use fn_dump_dblog
  • 07:24 ChompingBits: My friend has a 6TB Data Warehouse running OLA scripts on Indexes and Statistics weekly Saturday. They run from a CAS for multithreading, but run until Wed/Thu. Any advice to speed up processing? The server is almost a whole VMWare host worth of resources, so that’s out.
  • 09:31 RollbackIsSingleThreaded : Hi Brent! Does disabling lock escalation on a table reduce blocking during concurrent bulk inserts?
  • 10:42 Sophie M: What is your opinion of Microsoft Fabric certification? Worth it?
  • 11:18 Frozt: Have you ever conduct a Daily Standup for Production DBA? I feel like my new manager wants me to micro my team but we are already functioning automatically and not sure what is the purpose of this. He/she wants to build camaraderie but is this micromanagement?
  • 12:20 Rom: What is your opinion of optimized locking in Azure SQL DB? Will we see this feature flow down to boxed SQL?

Does Your GROUP BY Order Matter?

Execution Plans
5 Comments

Sometimes when you do GROUP BY, the order of the columns does matter. For example, these two SELECT queries produce different results:

Their actual execution plans are wildly different:

They both use the index to retrieve their data, sure, but:

  • They both use index scans on the same index, but the first one only has to read data for the first 100 Location, DisplayName combos. The second one has to read all of the rows because the first 100 DisplayNames could be anywhere in the Location_DisplayName index.
  • The first one can dump the results out as-is, whereas the second one has to sort all of the rows by DisplayName, Location. The second one takes a hell of a lot longer to do that.
  • The first one is so lightweight that it’s single-threaded, whereas the second one goes parallel.

Okay, sure, makes sense. But what about when the GROUP BY column order doesn’t matter, like this:

Note that I’m doing a GROUP BY on the same columns in the index, but I’m doing an ORDER BY on something else altogether. This is really common in grouping reports where you wanna see things by a new calculated number.

In theory, the order of the GROUP BY columns doesn’t matter here. It doesn’t matter if I group by Location then DisplayName, or if I group by DisplayName then Location. In either case, I’m gonna have to sort the data by COUNT(*) DESC anyway, so the order of stuff in the meantime doesn’t matter.

To find out if our query text matters, let’s run both queries and check out their actual execution plans in SQL Server 2016 compat level:

They’re identical! SQL Server is doing witchcraft in the second plan, where:

  • The index is on Location, DisplayName, but
  • SQL Server seems to be doing a stream aggregate on DisplayName, Location

Or is it? Check out the properties of the stream aggregate operator:

SQL Server says, “I’mma output DisplayName, Location, but I’mma group by Location, then DisplayName.” I LOVE THIS. SQL Server is smart enough to understand that the order of columns in the GROUP BY doesn’t matter.

SQL Server 2022 compat level has a different set of execution plans that run much more quickly, but the answer is still the same: both queries have the same operators in the plan, and both perform identically:

Nice work, SQL Server. Extra time spent at plan compilation, coming up with these kinds of optimizations, results in faster query execution. This is a good example of the kinds of things I refer to as the choose-your-own-adventure-book in my Fundamentals of Query Tuning class.

Side note – I wrote this post after reading how PostgreSQL is getting this optimization in the future. Work on it began back in 2018!


Free Live Webcasts: Slide & Demo Versions of How to Think Like the Engine

Conferences and Classes
6 Comments

You’re comfortable writing queries to get the data you need.

But you’re uncomfortable if someone asks you how it works, how SQL Server and Azure SQL DB turn your queries into results. You know there are execution plans, but … you’re uncomfortable reading them. You know queries need indexes, but you’re not sure which columns to put in order, or how SQL Server chooses between them.

How to Think Like the SQL Server EngineI’d like to teach you How to Think Like the SQL Server Engine for free.

In a 2-hour live session, you’ll learn the differences between clustered & nonclustered indexes, why seeks aren’t necessarily good and scans aren’t necessarily bad, why queries run so much faster in development than production, and much more. I wanted to design the most important concepts you could learn in your first 2-hour training session – whether you’re a developer, report writer, sysadmin, or DBA.

I’m teaching the class two different ways: one with slides, and one completely in Management Studio. Both classes cover the same material, but the all-demo version lets me go off on interesting tangents based on your questions. I’d encourage you to attend the slide one first, then circle back for the demo one so that you can really soak in the material and get your questions answered. If you’re busy, and you can only do one, I totally understand!

Slide version, all April 2nd, Tuesday, links are to calendar invites:

Demo version, all April 3rd, Wednesday:

No registration required – just add the event to your calendar, and at the time of the event, hop into https://www.brentozar.com/training/live/ or my YouTube channel to watch the live stream. See you in class!


[Video] Office Hours: Speed Round

Videos
1 Comment

Not all of the questions y’all post at https://pollgab.com/room/brento require long-winded answers. Let’s conquer 22 questions in 15 minutes!

Here’s what we discussed:

  • 00:00 Start
  • 01:05 JoseDBA: We just started monitoring our AlwaysOn with Datadog. Any recommendations of what to monitor that is valuable and not filling ourselves with meaningless alerts? Thanks!
  • 01:43 Boutaga: Hey Brent ! Good morning from Switzerland ! As a DBA I have a lot of struggle explaining Devops guys to implement CI/CD all the way to the database… What do think would be a limitation technically for Devops ? What could Microsoft do about it ?
  • 02:18 Neil: Reports are searching a varchar(max) auditlog column. Is full-text search a viable option for this? Is it difficult to support or is it just like maintaining an index? My spider sense is tingling.
  • 02:52 Ozan: Hi Brent, what is your experience so far with sql server on Linux based systems like rhel (HA, performance, stability)? Thanks
  • 03:31 MyTeaGotCold: Do you ever bother with dbachecks? It’s relatively new, but I reckon your First Responder Kit makes it redundant.
  • 04:19 Claus: Which relational cloud DB’s have the best separation of compute and storage?
  • 04:50 Laris: In what scenario would you want to create a view without schema binding option?
  • 05:27 Doug Fresh: What is your opinion of tSQLt for unit testing? Is this actual unit testing with mocks and fakes or is this integration testing?
  • 06:12 the crazy frog: Aren’t you doing the Live Master Classes anymore and so update the course material this year?
  • 06:40 DB-Ay?: Hi Brent, it is normal to see 10k logical reads per execution (6k per execs p/hour) on an update using the primary key? E.g. UPDATE tab1 SET time = getdate() where ID = @p0 (there’s around 60 columns being updated in the query but I can’t fit them all into the q!). Thanks.
  • 07:15 The Swedish Chef: Do you have an easy way to identify an SSMS query plan operator as batch mode or row mode?
  • 07:35 Mr. Burns: Have you ever been burned where an end user ask you to delete a user table and said it was ok but after you dropped the table things started to break?
  • 08:29 Adam Soong: What is the best way to track down which apps are querying a given view? (Need to make a breaking change to the view).
  • 08:50 Cristóbal: It’s simple to see the performance impact of a new non-clustered index on a slow select query in SSMS, but how do you view the new overhead of the new index on inserts / updates / deletes to that table?
  • 09:38 Biscuits: How well does the Azure PostgreSQL offering compare to the Aurora PostgreSQL offering?
  • 10:27 chandwich: Hey Brent. What’s one part of your online presence you don’t like? For me, it’s thinking of blog posts.
  • 11:26 Moshiko: hello brent, is there a reason why microsoft won’t release a native version to mssql server to apple’s new chips ? (M’s)
  • 11:54 Firat: Hello from Netherlands 🙂 SQL Standard should be limited to 128GB. But when i give it 140 GB i see SQL use up to 140GB. I believe i even saw it on 160GB when i allow the max. Does this help in my queries like for sorting? Or is it just ignoring all the GB’s above 128GB?
  • 12:34 ToffeeC: In any SDLC where does looking at how performant any new/changed database code start? In ours, using Agile, its usually after deployment when the production DBAs deal with any negative the impact lol
  • 13:12 Power Paul: What is your opinion of using Copilot to write power BI reports? Has Microsoft finally perfected natural language queries?
  • 14:00 Dahj Asha: What is your top 3 Microsoft abandonware for SQL Server?
  • 14:55 Accidental DBA: Sometimes when running a database restore it takes over 30 minutes before seeing any percentage complete to start (using spWhoIsActive; bkup 120GB) and other times less than a minute. Can’t find any wait stats causing the difference or docs on the delay. How should I diagnose?

Option Recompile is a Magic Turbo Button That Actually Works.

T-SQL
22 Comments

I didn’t say that – Guy Glantser did.

Guy Glantser is an Israeli SQL Server guru with a ton of great presentations on YouTube. I’ve had the privilege of hanging out with him in person a bunch of times over the year, and I’ll always get excited to do it again. He’s not just smart, but he’s friendly and funny as hell.

Matan Yungman, Guy Glantser, and I

In the most recent SQL Server Radio podcast, Guy explained what OPTION RECOMPILE does, and basically suggested that everyone should be comfortable using it outside of OLTP or high-CPU environments. In fact, if you’re working on data warehouses, Guy said, “Your default should be to add OPTION (RECOMPILE) on every query.”

Right here is the part where you’re expecting me to say, “But I disagree with Guy on this one.”

But I’m on Team Guy here.

For years, it’s been a meme that people constantly ask Office Hours questions about, “Why is this same query sometimes fast and sometimes slow?” And as we dig into their question, I inevitably have to talk about Erland Sommarskog’s excellent Slow in the App, Fast in SSMS, and explain the concept of parameter sniffing. It’s the year 2024, and most of you are sick and tired of hearing about it, and yet there are new people coming into our industry every year who have mind-blowing moments when they learn about the problems of plan reuse.

In the past, I’ve even heard Guy proposing a server-level switch to simply turn off plan caching, and compile every query that comes in.

I think that’s overkill, but I would really like a Cost Threshold for Recompile (CTfR) setting. In data warehouses, for example, I might set CTfR to something low like 50 query bucks. I asked Microsoft for it a while back, but no official response so far.


What Happens When Multiple Queries Compile at Once?

Execution Plans
5 Comments

An interesting question came in on PollGab. DBAmusing asked:

If a query takes 5-7s to calculate the execution plan (then executes <500ms) if multiple SPIDS all submit that query (different param values) when there’s no plan at start, does each SPID calc the execution plan, one after the other after waiting for the prior SPID to finish?

Well, it’s easy enough to demonstrate! Let’s take a query from my Fundamentals of Query Tuning class that takes tens of seconds (or longer) to generate a plan on most versions and compat levels:

And then let’s run it simultaneously across 10 threads with SQLQueryStress:

And run sp_BlitzWho to see what’s happening:

We see that most of the sessions are piled up waiting on RESOURCE_SEMAPHORE_QUERY_COMPILE. That means queries are waiting on memory in order to build execution plans. It’s not that the SQL Server is out of memory altogether – it’s just that it has gateways to prevent too many queries from compiling at once.

Most DBAs can go their entire career without seeing that bottleneck, but given my weirdo job – I’m like an emergency room trauma surgeon for SQL Servers – I see it at least a couple times a year when:

  • A SQL Server is under serious memory pressure (typically due to queries getting oversized memory grants), and
  • There are seriously complex queries in the workload (typically reporting queries), and
  • Those queries aren’t parameterized (because if they were properly parameterized, they’d stick around in the plan cache, avoiding the compilation problem)

In that situation, my first line of defense is to improve plan caching like we discuss in this module of my Mastering Server Tuning class. The last line of defense would be trace flag 6498, which allows more large queries to compile simultaneously. I’ve never needed that in my life, and I hope you don’t either!


Finding Sister Locations to Help Each Other: Answers & Discussion

This week’s query exercise asked you to find two kinds of locations in the Stack Overflow database:

  • Locations populated with users who seem to be really helpful, meaning, they write really good answers
  • Locations where people seem to need the most help, meaning, they ask a lot of questions, but they do not seem to be answering those of their neighbors

In the challenge post, I gave you the same kinds of requirements that our end users would typically give, but not the kinds of requirements that a business analyst might specify. This challenge was actually inspired by a friend of mine who transitioned into an analyst role, and I was reminded of just how hard that work really is. Translating vague requests into detailed specs is a lot harder than it looks, and the wide variety of this week’s answers reflects that.

Normally when I write these challenges, as soon as I’m done writing the challenge post, I immediately start writing my own solution. I didn’t do that here because I wanted to show your different interpretations of the request. As the answers started coming in, I copy/pasted ’em into SSMS just like everything else I find on the internet, and I used 3 sniff tests:

  1. Did your top locations have a decent number of people in them? I didn’t wanna see answers that said, “This podunk town is a location we should focus on.”
  2. Did your two sets of locations have overlaps? I didn’t wanna see answers that said, “This place has a lot of people who write questions, and a lot of people who write answers.” That’s just a list of big places, which violates the “they do not seem to be answering those of their neighbors” part of the request.
  3. Do the result sets show how they were sorted? Anytime I see a ranked list, I want to know how the data was ranked. (In this particular challenge, I’m really flexible on sorting methods – I just wanna know what it was.)

As an example, I’m showing a picture of one answer’s results here. The person shouldn’t be ashamed by any means – this was a hard question, for sure – but it serves to illustrate an answer that doesn’t pass the sniff test. The top result set are the helpful locations, and the bottom result set are the locations that need help. A single location shouldn’t be in both – and yet the same locations show up in both list. This is just a list of the biggest locations, basically.

This would happen if your queries were just:

  • The users who posted the most answers (without posting questions), and
  • The users who posted the most questions (without posting answers themselves)

In any big location, you’ll get a lot of those, so that isn’t going to work. Instead, we’re going to need to get more creative with our filtering & sorting.

I’ll discuss two approaches, but to see more, check out the comments on the challenge post, or my own approach on the Postgres version of the answers.

I am a potato?’s Answer

I had to laugh out loud at their name on their comments, and they most certainly were not a potato. Here’s their answer in a Github Gist, and my edited version is a comment on theirs. The results will take a little explaining:

The first result is the helpful people, and to find it, they sorted by average answer score descending. This does favor places that seem to favor quality over quantity: note how #6 (Sherbrooke) produced only 326 answers, and #7 (Sunnyvale) produced over 64x more questions – but they’re ranked lower.

That sort order is completely fine – we gotta pick something to sort by, since the users didn’t tell us a specific sort order. This answer is just as valid as anybody else’s, and I wanted to pick a variety of sort orders to show how the request could be interpreted.

However, there’s another problem with line 6: population is only 31 people. If you look at both result sets, they’re both riddled with places with less than 100 people. So, how’d that happen? They started by building a list of locations with >100 people, but didn’t consider whether those people had produced over 100 questions or answers. To fix it, we just need to filter the result sets for >100 population, and then we’ll get better quality.

The second result set is the people who need help, and to find that, they sorted by a high ratio of questions to answers. The ratio isn’t shown in the results, and I would have preferred that, but that’s okay. The general idea makes sense. Again, we have small populations showing up here, and row 9 isn’t really better than row 10, but once we get a filter for at least 100 people in the location, it’ll be better.

Nick’s Answer

NotPotato’s answer averaged out Posts.Score across the entire population of the location. Nick’s answer took a different approach: he got the average answer score for each user in the location (because people may have posted multiple answers with varying scores), and then averaged out all of the users’ averages. As a result, this top 10 looks different than NotPotato’s:

But also, his second result set looks wildly different than NotPotato’s. Part of that is because Nick’s answer correctly filters for large populations, and another reason is because he used a different metric for what helpful means. He used the formula of (AnswerCount – QuestionCount), calling that AnswerDelta. This is interesting because places with higher volumes of answers will be favored, not necessarily tiny locations with tiny metrics (like 8 answers vs 1 question).

Both of those are valid approaches!

Sure, we had some bumps along the way, but I like this kind of challenge, especially for testing potential new hires. I don’t really care if their sorting choices match my own ideas – I’m much more interested in seeing if they can spot obvious bugs in their result sets, and how their T-SQL looks.

I hope you enjoyed this week’s challenge! The Query Exercises will be taking a break for the next couple of weeks because I’m traveling to Hong Kong and Shanghai. I’ve still got plenty of blog posts queued up, but the interactive exercises will return in April when I’m back. If you’re starving for homework, check out the prior Query Exercises posts and catch up on the ones you missed so far.