Your Turn: I’m Not Answering These 13 Office Hours Questions

Normally, after I do a round of Office Hours of going through the questions that got posted at https://pollgab.com/room/brento, I answer the highly upvoted ones and then clear the queue. If y’all didn’t upvote questions, then I don’t answer ’em.

However, today I’m trying something different: I’ll post the non-upvoted questions here for y’all to see if there’s something you want to answer in the comments. I’ll number ’em so y’all can refer to ’em easier in the comments.

  1. GI Joe DBA: Thoughts on contract to hire jobs? I don’t like them, I want to be hired FT immediately. I think agencies use them to squeeze $$ from the placement but, there’s no guarantee I will be hired and it’s a hassle switching benefits. I’ll only consider it if pay is higher than average.
  2. lockDown: Hi brent, we work on a multitenancy model sql DB and we need to split the data to 2 Dbs, since there multiple apps writing to the DB & the Devs are asking for a schema lock on the (2k) schemas that will be moved. I feel like stoping writes should be done on the App am i right?
  3. Jarkko: Should there be any concerns when a query runs in 2 seconds but consumes 16 seconds worth of CPU time (SQL 2019 Enterprise, 64 cores)?
  4. RonS: Hello Brent, Moving to 2019 SS Standard/HA. 7 databases that need to talk with each other. Standard edition requires each DB to have its own IP address. Current environment all the DB’s live in one server and can see each other (db.schema.table). Linked servers right direction?
  5. CPL: Hi Brent .. What is the process that actually updates indexes? e.g. table with 3 NC indexes & all three indexes contain Col1. If one of these indexes is used where we then update Col1 how are the other 2 indexes subsequently updated? Is it still via the transaction log?
  6. Wenenu: What are the top SQL index naming schemes you see in the field? How did the various schemes originate?
  7. Anatoli: CONTAINSTABLE and FREETEXTTABLE seem to hide the logical reads when statistics io is enabled. Is there an alternate way to see the logical reads incurred by these TVF’s? Using SQL 2019 – 2014 compat mode
  8. Sebastian: How would you change SQL Sentry Plan Explorer to make it more usable / friendly?
  9. DBA_preparing_for_jobsearch: Could you suggest a good strategy on improving left anti semi joins?
  10. Least Significant Who In Whoville: Hola mi amigo. I am the sole DBA at a recently acquired adult beverage manufacturer and our new controlling company. Do you have any suggestion on how to break the news that their adhoc architecture is one mistake away from disaster when their shadetree DBAs know more than me?
  11. Itching to Emigrate: Is it true that Americans in Europe working remotely for American companies fall under GDPR? Can you point us to resources for this question? Everything I see only looks at the consumer side of GDPR.
  12. reps_for_bulk_deletes: For “Full” recovery model, how much bigger than a row of data is the log entries associated with the deletion of that record? Trying to come up with a heuristic for when to approach alternate deletion methods by looking at remaining space in log and on log disk.
  13. BlitzFan: Hi Brent, At our shop we make all kind of backups, SQL backups, Avamar backups of SQL backup shares and of entire databaseservers. But we never test those Avamar backups, so no disaster recovery tests and no disaster recovery plan. What would you say to management?

[Video] 300-Level Guide to Career Internals at SQLBits

You’re overwhelmed with choices: so many things you could learn, so many ways you could specialize in your career. Which one should you choose? What are the safe bets, and what are the risky bets? Should you be a contractor? A consultant? A freelancer? Specialize in the cloud, and if so, what products? There’s no guide for data professionals here, and it’s kinda scary.

About ten years ago, I sat down with a pen and paper to analyze the ways that I could make a living with data. I devoured a lot of business books, thought about the kinds of things companies pay money for, and laid out a simple grid to help decide what kind of career I wanted to build for myself.

Today, in 2022, you’re facing the same problems. In one session, I’ll explain what companies pay for, what individuals pay for, and how you could build different careers with that knowledge. I won’t tell you what’s right for you – I just want to give you a map of different choices, like a guidance counselor would. I’ll finish the session by explaining the choice I made, and why.


[Video] Office Hours at the End of the World

Videos
3 Comments

I hopped down to Cabo San Lucas and sat at the End of the World to answer the questions you posted & upvoted at https://pollgab.com/room/brento.

Here’s what we covered:

  • 00:00 Intros
  • 00:44 Kirk Saunders: Hey Brent! When did you determine your knowledge/skillset/etc.. was sufficient to teach your various classes? I think I’m doing well at my job, but I’m concerned I’m overestimating my skillset (big fish, small pond). Any insight on a more objective measure is greatly appreciated!
  • 02:42 Monkey: Howdy, Brent! When I specify RETAINDAYS or EXPIREDATE when doing backups and set either to 30 days, does it mean after 30 days some job checks whether my backup expired, and deletes it ? if no, how can I make so older than 30 days backups be deleted automatically ?
  • 03:31 Yakov: What was office hours like prior to the introduction of PollGab?
  • 04:54 Rizzo: How should you deal with a SQL power user that forgets to commit their transaction on a busy server before leaving for the day?
  • 06:43 Gary Kendall: Hey Brent! We have a commercial app that creates many many temp tables (not in TempDB) for ad-hoc reporting and other things but doesn’t clean up after itself. Is there a tipping point where too many tables in one DB might cause overall performance problems with SQL Server? Thx!
  • 09:30 Select_Star_not_Asterisk: What’s your though about Contained Availability Group? For me, the new features, contained master and MSDB is a big check for me.
  • 11:16 Uncle Kenny G: We have 1000 databases on Azure SQL Server. We are going to adopt some third party monitoring software (Red-Gate, SolarWinds, checkmk, etc…). Does using these monitoring software will put pressure on the DBs and increase the overall costs at the end of the month?
  • 12:05 SQL Helper: In the consulting world, how do you handle an engagement that you cannot solve or is technically not possible as outlined in the SOW? Do you get compensated for your time working on the issue?
  • 16:34 Haddaway: What is the best tool for monitoring Always On replication progress? Do we need third party monitoring for good Always On Monitoring?
  • 17:24 Eduardo: What are the keys to becoming a good public speaker?

Download the Current Stack Overflow Database for Free (2022-06)

Stack Overflow
3 Comments

Stack Overflow, the place where most of your production code comes from, shares a version of their data in XML format from time to time, and then I import it into SQL Server format.

Stack Overflow’s database makes for great blog post examples because it’s real-world data: real data distributions, lots of different data types, easy to understand tables, simple joins. Some of the tables include:

  • Comments: 85M rows, 16GB data
  • PostHistory: 150M rows, 250GB (most of which is text, though)
  • Posts: 56M rows; 150GB
  • Users: 18M rows, 2GB
  • Votes: 230M rows; 4.5GB, making for fun calculations and grouping demos

This isn’t the exact same data structure as Stack Overflow’s current database – they’ve changed their own database over the years, but they still provide the data dump in the same style as the original site’s database, so your demo queries still work over time. If you’d like to find demo queries or find inspiration on queries to write, check out Data.StackExchange.com, a public query repository.

I distribute the database over BitTorrent because it’s so large. To get it, open the torrent file or magnet URL in your preferred BitTorrent client, and the 54GB download will start. After that finishes, you can extract it with 7Zip to get the SQL Server 2016 database. It’s 4 data files and 1 log file, adding up to a ~430GB database.

Want a smaller version to play around with?

  • Small: 10GB database as of 2010: 1GB direct download, or torrent or magnet. Expands to a ~10GB database called StackOverflow2010 with data from the years 2008 to 2010. If all you need is a quick, easy, friendly database for demos, and to follow along with code samples here on the blog, this is all you probably need.
  • Medium: 50GB database as of 2013: 10GB direct download, or torrent or magnet. Expands to a ~50GB database called StackOverflow2013 with data from 2008 to 2013 data. I use this in my Fundamentals classes because it’s big enough that slow queries will actually be kinda slow.
  • For my training classes: specialized copy as of 2018/06: 47GB torrent (magnet.) Expands to a ~180GB SQL Server 2016 database with queries and indexes specific to my training classes. Because it’s so large, I only distribute it with BitTorrent, not direct download links.

As with the original data dump, these are provided under cc-by-sa 4.0 license. That means you are free to share it and adapt it for any purpose, even commercially, but you must attribute it to the original authors (not me):

so-logo

Happy demoing!


[Video] Office Hours: Speed Round Edition

Videos
0

Not all of the questions you post at https://pollgab.com/room/brento require long answers. Here’s a quick speed round:

Here’s what we discussed:

  • 00:00 Introductions
  • 00:24 EngineHorror: Hey Brent! What’s your opinion on page- and row-level compression in general? Is it true it increases locking besides burning extra CPU?
  • 01:08 Pony: Is Microsoft Assessment and Planning tool (MSAP) still best tool for discovering SQL Servers within a company? are there better or alternative tools for this ? We need this to build inventory list of SQL servers we have
  • 01:21 DBA Champion: What monitor would you recommend for a DBA ? Inches, resolution, number monitors (1,2,3…) ?
  • 02:00 Monkey: Hi Brent! Have you ever considered working with Microsoft on SQL Server improvement?
  • 03:20 Monkey: Do you recommend performing transaction log backups into 1 single TRN file (append them), or each t-log backup should be performed into separate file (1 backup file = 1 trn) ?
  • 03:43 Mike: Have you ever implemented Dynamic Data Masking, and what is your opinion on it ? Do you think in-house developers who has read access to Production, should or should NOT be able to read Personally Identifiable Information such as Customer Name, Email and Address ?
  • 04:22 James Adams: Will the Senior DBA class ever come back in stock?
  • 05:07 B-treehouse: I must optimize the database of a multitenant app. The first col of most NC indexes is the ‘TenantId’. This causes param sniffing problems due to the mix of very small and very large tenants. Would it be an acceptable use case for ‘OPTIMIZE FOR(@TenantId UNKNOWN)’ ?
  • 06:02 Tefnut: What types of perf issues do you like to use sp_HumanEvents for?
  • 06:25: Lilandra Neramani: Who is the Brent Ozar for all things Power BI?
  • 06:45 Can I join you: Hi, I use azure sql db and running a query against a view with a WHERE clause with a variable (Id =@id). Query runs slow and performs a full table scan. When I hard code the value of @id it uses a seek and runs fast. I thought Sp_executesql would fix it but it runs slow also. Why
  • 07:27 Dom: How would you describe your driving style ? Are you slowly cruising around looking at the scenic view or do you drive a bit more “sporty” ? Asking cause some of your car have been ingeneered to be driven pretty fast 😮
  • 08:43 sol: Howdy, sir! How come you haven’t gone bald still? Based on recent driving videos your hair is gorgeous!

Office Hours Speed Round, Text Edition

Got questions for me? Post ’em at https://pollgab.com/room/brento and upvote the ones you’d like to see me cover. I filter out the ones that are too short for video answers, and here’s the latest batch:

Corrupted: Should DBCC CheckDB be run on secondary replicas in AG as well ? Is it recommended to attach corrupted database to Prod server, to check how your integrity check job will react ?

You should run it anywhere you want to fail over to. Read this.

Khaled Budajaja: Brent Ozar Office Hours is more informative then many hours spent on reading blogs and technical articles. Did you create this? I don’t recall any body else doing office hours. Many thanks Brent

Aww, thanks. It’s definitely not my idea though – it’s common for college professors to have open office hours.

David: Where can I find information about the master and msdb databases similar to what you present in Fundamentals of TempDB?

What are the pain points you’re having with those databases that you need to solve? Leave ’em in the comments.

Bill Hicks still lives: God created the Earth in 7 days. Do you think it’s possible to migrate from Azure SQL Database to CosmosDB or Table Storage in 7 days? Apart from the joke have you ever done that and what are the pros and cons? The goal is to have reports go faster.

CosmosDB or Table Storage for fast reports?!? That is not even wrong.

Unut: Is indexing bit fields like “IsActive” on large tables a futile effort when the large majority of records are active?

Generally, single-column indexes are less effective than multi-column indexes, regardless of the data type.

Khepri: Does FORCE ORDER query hint ever have any practical uses?

Yes, if you’re low on time, and it gets you across the finish line quickly, and you’re willing to live with the technical debt.

Not_a_DBA_but_I_play_one_on_tv: I restore a 600gb database backup to a test server every morning. Typically the restore takes 45 minutes to run. One day last week, it only took 15 minutes. Everything appeared to have run correctly, size didn’t change. What could cause that one restore to run quicker?

Less competition for shared hardware resources.

Unut: What types of issues do you like to use the SQL default trace for?

Check out the sp_Blitz source code.

James Adams: What are the powershell’ish things a senior SQL DBA should know?

I stopped teaching production database administration work years ago because I focus on development DBA work now. I’d start with DBAtools.io. I will note, though, that it’s interesting that a lot of these questions are still about production database administration. I just don’t wanna do that work anymore – I don’t find it enjoyable because it’s basically on-call break/fix stuff. Like, I could go the rest of my life without getting woken up because somebody ran a server out of drive space. It’s 2022. Quit treating drive space like it’s unobtainium.

sqldeo: Hi Brent Any suggestion on tempdb growth,not clearing the space even though developer dropping the temp table after doing ETL stuff SQl 2019, initials size same for all 8 files, auto growth enabled. db are aprox 5 TB in total size, snapshot iso. enable for 1 db. any suggestion?

Attend my Fundamentals of TempDB class.

T: Brent, currently we have SQL Server Failover Cluster with 2 nodes. We are planning to migrate it to Availability Group solution. Do we need to reinstall instances on both nodes or can we reuse it ?

Before going live on any HA/DR platform, you should test it repeatedly. That means you should build a new environment from scratch and test the bejeezus out of it before you go live.

Murad: What is the widest varchar column you would ever consider as a key for a new non-clustered index?

I wouldn’t set limits on that without knowing more about the business problem we’re facing. I can totally imagine scenarios where I’d index a couple of NVARCHAR(2000) columns, for example.

Milind: Hi Brent, how are you? How to do 2.5M+ records update without table lock? It is about account balance limit reset overnight daily. During midnight schedule, other processes are getting hampered due to table lock. How to do such operation without impact and possibly quick? Thxs

Read this series from Michael J. Swart.

Mekhit: Is there a recommended way to identify the hot (or not) columns for a given NC index?

There’s no instrumentation in SQL Server for that. You’ll be forced to use common sense. I know, terrifying, right?

Sothis: What are the signs that a given operator’s estimated number of rows was a hard coded estimate provided by the engine?

Take the estimated number of rows, divide that by the number of rows in the table, and look for suspiciously round numbers that don’t change regardless of the predicates you’re searching for.

Sothis: Is there a good way to determine the appx tipping point (row count) for index seek + key lookup vs index scan operation?

No.

Livnat: Any tips for tuning for “PARTITION BY” performance in TSQL?

Read these.

Waldemar: Do you have any query tuning / index tuning tips for linked server queries?

Don’t run them. Connect to the server that has the data you want. If you want to get something done quickly, you don’t turn to someone else and say, “Hey, can you ask that person over there to do this for me?” It’s just idiotic.

Nephthys: When passing multiple args to a stored procedure via a single delimited string and using STRING_SPLIT, is there a max number of args you recommend to not exceed for performance reasons? 

There are even gotchas with just one row. Read this.

Right Said Fred: What are the top 3 DB communities in order?

That’s a weird question – can you rephrase it for the kind of thing you’re looking for out of a community? I’m not sure if you mean places where general database people hang out, or where SQL Server people hang out, or where questions get answered, or where presentations happen, or what.

SQL Crooner: Can we safely ignore the page life expectancy alerts from Idera SQL DM?

I cover that in detail in my Mastering Server Tuning class. I wish I could do justice to it quickly, but that’s why I have training classes – some topics require deeper discussions and details.

Jose: What is your opinion of the various commercial apps that re-write your queries a hundred different ways to find the optimal query syntax?

Really cool, but usually really expensive.

Nephthys: Does clustered index fragmentation matter for INSERT performance?

Run an experiment with your particular table’s structure, your nonclustered index structures, and your typical insert statement to find out.

CakeAndEatItToo: When running a query through the Actual Execution Plan, is it more important to be looking at time taken or cost%?

Neither. Time to head to my free How to Think Like the Engine class.

Alberto: Hi Brent, I read that default isolation level for SQL Azure DB is Read Committed Snapshot. In your experience, what should we pay attention to when migrating from an on-prem with Read Committed isolation ?

Start here.

DBGeek: Like you, I use a Macbook Pro as my primary work computer. I use a virtual machine where I install VPN clients etc. But SSMS sometimes behaves really strangely on the Macbook’s screen resolution. What is your solution/workaround to handle high DPI settings?

In your hypervisor, change the Retina settings. Whatever you’ve got it set to, use the opposite one. Then, in Windows, don’t use the HighDPI settings at all.

Reshep: What are the top conventions used by your clients when creating different availability groups (business unit, region, etc)?

Business units and RPO/RTO.

Heryshaf: How do you know when insert contention is happening and what are the common ways of dealing with it?

Review your server’s top wait types. For common ways of dealing with it, check out my training classes.

Pietro: What naming convention do you recommend for clustered and non-clustered indexes? We currently use IX_TableName_Co1_Col2_Inc everywhere.

When you’re looking at an execution plan, why waste valuable pixels on IX? You know it’s an index. Why waste space on the table name? The table name’s already shown to the left of the index name. Just use the columns.

Ptahhotep: Do you have any kind of tab organization strategy in SSMS when many tabs are open?

Yes, save files and close them. The kinds of psychopaths who leave multiple SSMS tabs open are the same kinds of nut jobs who leave hundreds of browser tabs open instead of using bookmarks. They should be hunted down and eliminated before they harm others.

Tenenit: What data profiling queries do you like to run against client tables when identifying new index key candidates?

I don’t. The contents of the data don’t matter – the contents of the query do. For the mechanics of how that works, and why you should never select a table’s data to determine selectivity, check out my Fundamentals of Index Tuning class.

OMC: Is it possible to optimize a SQL table both for high INSERTS and SELECTS or must you pick one or the other?

That’s a great question, and it’s a sign that you’re ready for my Mastering Index Tuning class.

Chris May: How does the optimiser know which set of statistics to use for a given column if there are multiple statistics available? e.g. default system statistics, statistics as part of an index, manual created/partitioned statistics etc

I’ve heard someone say it’s the most recently updated one, but I have no idea if that’s true. I was this close to recording a video of me testing that hypothesis, but I ran out of time before heading down to Cabo for vacation.

Len: What are the pros / con’s of creating a clustered index on an identity int field vs a datetime field based upon GETDATE() ?

The datetime column will take up more space due to its data type size and the fact that SQL Server will have to add a tie-breaking uniqueifier behind the scenes. The clustering keys get added to all of the nonclustered indexes, too, so more space there as well.

Waldemar: What are the top areas of specialized consulting for SQL Server?

Watch this. I explain the different kinds of specialized consulting, what they do, and how to get into those careers. I know it’s kinda long, like 43 minutes, but if you don’t have 43 minutes to learn, well, we don’t have to worry about you getting a career as a specialist, do we? 😉

Horus: What is the killer feature Microsoft should add to SQL table partitioning that would lead to more wide scale adoption?

Why does it need to be adopted more widely? That feature is a pain in the butt. Instead, I wish they could build automatic sharding, automatically splitting data across multiple database servers for faster queries. That’s kind of a pipe dream for relational databases though.


Estimated and Actual Plans Can Have Different Shapes.

Execution Plans
15 Comments

A reader posted a question for Office Hours:

Hi Brent, What is your take on Hugo Kornelis’s explanation of execution plan naming. As her his explanation, estimated exec plan is simply an execution plan whereas actual execution plan = execution plan+run-time stats. Do you agree that the naming is flawed and confusing? – Yourbiggestfan

I like Hugo a lot – brilliant fella, and he knows way more than I do about execution plans – but he’s wrong on this one. Estimated and actual plans don’t always have the same shape.

I’ll use the training version of the Stack Overflow database, but any size will work here. Start with a few indexes:

Then create a stored procedure – yes, doing this with a temp table is silly, but I need a short, simple example to show the problem:

Then get the estimated execution plan for India:

The estimated plans for procs are a little weird in that they look like they’re a single statement:

But let’s zoom in on the second query in the proc. I’m not concerned about the plan for the insert into the temp table – that one’s pretty straightforward since we have an index on Location. Pay particular attention to the second query’s plan, though:

Read right to left, top to bottom to see that SQL Server estimates it will use a single CPU core (no parallelism) to:

  1. Scan the temp table
  2. Do a series of clustered index seeks on Users, then
  3. Do a series of UserId index seeks on Comments, then
  4. Sort the comments by Score, descending

But when you actually run the query and include the actual plan, the shape is totally different:

The query went parallel, and it chose a completely different query plan. Because there were so many rows in the temp table, SQL Server decided to scan the Score index on the Comments table, from highest-ranking Comments to lowest. It figured it wouldn’t have to read too many Comments before it stumbled across 100 that were written by people in the temp table. There were tons of differences, and here are just a few:

  • The estimated plan was serial, but it actually went parallel
  • The estimated plan used the Comments.UserId index, the actual one used Comments.Score
  • The estimated plan started with the temp table, the actual one started with Comments

The root cause on this one was the fact that when the estimated plan was generated, SQL Server hadn’t created the temp table yet, and it didn’t have any statistics on its contents. At runtime, the new creation of statistics caused SQL Server to go back and recalculate the plan while the proc ran, so the actual plan had a wildly different shape.

SQL Server 2022 is even worse.

I’ll do a classic parameter sniffing demo that I do all the time:

Execute it for Reputation = 2 – no need to get the actual plan, just run it:

And then try to get the estimated plan for Reputation = 1:

SQL Server 2022 is all, “Estimated plan? I could tell you – but then I’d have to kill you.”

That’s why I keep using the terms estimated plan & actual plan. For another example, check out Erik Darling’s recent post about estimated plan parallelism. (I’d already written & scheduled mine when his published, and I had to link to it here because the timing’s great.)


[Video] Office Hours: Long Answers Edition

Videos
0

This time on Office Hours, I let a few questions piled up at https://pollgab.com/room/brento that required in-depth answers to really do ’em justice. In particular, there was a statistics question that needed demos.

Here’s what we covered:

  • 00:00 Introductions
  • 00:35 Grogu: Did DBA Brent ever have to support a version of SQL Server that Microsoft had end of lifed? Any tips on discussing this issue with management?
  • 04:19 Ahsoka Tano: What are your favorite PASS Summit presentations from prior years?
  • 07:11 Can: I have 3000 DB per SQL Server. When I need to restart SQL Services, the service is waiting on Stopping mode.
  • 09:22 i_use_uppercase_for_SELECT: When doing filtered statistics do you basically have to use full scan? It appears the where clause for filtered statistics is applied after the tablesample system is applied so you end up sampling the same data as the main statistics which wasn’t seeing this data originally.
  • 29:02 Shaheen: Is there info in SQL Server that will tell us why it picked one index over another for a given query?
  • 31:58 muppet#1: What’s your favourite SSMS plugin (assuming you use any)?
  • 33:02 DBAInProgress: What are some of the major gotchas of using TDE?
  • 34:22 Dmitriy: How would I begin to optimize a reporting query that has no WHERE clause, but has a bunch of LEFT JOINs to a bunch of other tables via foreign keys?
  • 35:55 Columnstore newbie: I have a clustered columnstore index. The segments look good but selecting TOP 1000 order by UtcDate is still very slow. Why?

Let’s Make September Our Free Community Tools Awareness Month.

Last week, I was reading a brand new article from a Microsoft employee about how you should directly query sys.dm_exec_requests in order to find out what’s running on your system.

Brent Reading Book
“Step 1: get a stone that looks round.”

I lost my mind.

There was a lot of yelling at the monitor.

In the year 2022, nobody should be reinventing the wheel. There are plenty of free wheels available for you to choose from. You’re literally wasting your time if you start from scratch with a boulder and chisel, and then try to turn it into a wheel.

The Microsoft data platform community is amazing, and has been that way for years. There are so many free resources to help you do your job faster, easier, and more accurately.

And sure, I’ve been around for quite a while, and I take for granted that everybody in the database business knows about all this cool free stuff. I’m not talking about the First Responder Kit, either – I’m talking about a stunning list of resources so large that it’s intimidating just to get started.

That’s where you come in.
What do you rely on every week?

In September, I want you to improve community knowledge about one free tool that you rely on every week in order to get your job done.

Your first reaction is gonna be that everybody already knows it, but trust me, they do not. Just by reading this blog post, you’re already ahead of many folks out there who don’t have the time to keep up with the industry. Imagine that you’re talking to a brand new hire at your organization who needs to get up to speed on how you’re able to do your job so effectively.

Pick one of these things to share:

  • Introduce the tool to readers for the first time
  • Tell a story about how it saved your bacon
  • Share a non-default configuration option that you use, and why
  • Write a review – explain what you like about a tool and what you wish was different
  • Compare several free tools that do the same thing – explain the pros & cons of each one
  • Put together a list of learning resources for a free tool – maybe you like the tool, but it isn’t easy to use, and you want to put together a set of links to show a new user where to begin

And there are any number of ways you can share it:

  • Write a blog post (if you don’t have a blog, write on LinkedIn, SQLServerCentral, MSSQLTips)
  • Record a short video
  • Improve the tool’s documentation

You can schedule it anytime you want during September. When it goes live, leave a comment here with a link to it. I’ll post a roundup post, and I’ll set up social media re-sharing so that I can keep driving new folks to your work over time. I’ll be working on it too – most of my September blog posts will be focused on free community tools.

Let’s make sure that nobody in our industry has to reinvent the wheel again!


Who’s Hiring in the Database Community? August 2022 Edition

Who's Hiring
15 Comments

Is your company hiring for a database position as of August 2022? Do you wanna work with the kinds of people who read this blog? Let’s set up some rapid networking here. If your company is hiring, leave a comment.

The rules:

  • Your comment must include the job title, and either a link to the full job description, or the text of it. It doesn’t have to be a SQL Server DBA job, but it does have to be related to databases. (We get a pretty broad readership here – it can be any database.)
  • An email address to send resumes, or a link to the application process – if I were you, I’d put an email address because you may want to know that applicants are readers here, because they might be more qualified than the applicants you regularly get.
  • Please state the location and include REMOTE and/or VISA when that sort of candidate is welcome. When remote work is not an option, include ONSITE.
  • Please only post if you personally are part of the hiring company—no recruiting firms or job boards. Only one post per company. If it isn’t a household name, please explain what your company does.
  • Commenters: please don’t reply to job posts to complain about something. It’s off topic here.
  • Readers: please only email if you are personally interested in the job.

If your comment isn’t relevant or smells fishy, I’ll delete it. If you have questions about why your comment got deleted, or how to maximize the effectiveness of your comment, contact me.

Each month, I publish a new post in the Who’s Hiring category here so y’all can get the latest opportunities.


Office Hours Speed Round: Text Edition

Got questions for me? Post ’em at https://pollgab.com/room/brento and upvote the ones you’d like to see me cover. I filter out the ones that are too short for video answers, and here’s the latest batch:

Thomas Franz: When I have a very large table with no statistics (or not on a specific column in the WHERE) and want to show the estimated execution plan it takes minutes because it builds the statistics first. How can I prevent / speed up it (want the create stats asynchron as stats updates)?

There’s no such thing as asynchronous stats creation, only updates. SQL Server needs the stats first to create the plan, or else the plan would literally be based on random guesses.

Kinneret: Is table ordering (Table B appears before Table A) within a view definition ever important for query performance?

Yes, read this all the way through.

SQLSteve: I have inherited a system that has an unusual index setup. Most tables have a clustered index but also have a PK non-clustered index which is the same as the clustered. IndexUsageStats show 0 for some NCI’s but can also show usage up to 50% of the Clustered.

Cool. As long as we’re just sharing stuff we got from others, I got a metal pineapple with an unusual inside compartment:

A metal pineapple - shout out to the fine folks at é

I don’t have any questions about what I have either.

Luca: Are Azure DR nodes more prone to “replica not healthy” errors? Our nodes are in a different region & every ~2 wks we see: Always On Availability Groups connection…terminated / The local replica of availability group … is preparing to transition to the resolving role

My gut feeling is that sysadmins who use the cloud are just more reckless when it comes to network config. They’re more likely to just up and change firewalls, subnet configs, routing, etc and wing it as they go.

Midwest DBA: Hi Brent, Does a Senior Development DBA need to be better at writing T-SQL than developers? Do you need to be some sort of T-SQL Guru, or just know the most common anti-patterns. Also, any good book suggestions on T-sql Anti-patterns?

I think they need to better understand the execution plan ramifications of the T-SQL they write. For T-SQL books, get Itzik Ben-Gan’s.

WhyIsItAlwaysOnFire: Any recommendations to prevent Non Yielding Scheduler errors? Seeing occurrences of this and subsequent thread pool starvation occurring during index maintenance.

Patch SQL Server, Windows, and your hardware drivers/firmware/BIOS.

Film_Buff: Hello! I’ve heard you say that doing sorting in SQL Server is expensive since one pays for licensing, and that it should be offloaded to the app. Is this a difficult task for developers to workout? What are the things to consider in regards to design and implementation? Thanks!

Your developers should be able to sort a list or an array fairly easily. That isn’t something that requires a lot of design planning. You’re overthinking it.

Midwest DBA: Will my SAN admin try and get me fired if I make all Identity columns Big Int going forward? Or is there a case when you can trust the business and set Identity to Tiny INT or INT?

Read how row compression works.

Martin: Our company has 3rd part DBA management. They do one size fits all index maintenance( reorg >5% and rebuild >30%) for all user DBs once a week. Is it better to allow them to do this, or not? They also run a nightly update stats job against all user DBs. Is this helpful or harmful

If you don’t trust your remote DBA firm, I can help you find a better one.

Garry Kasparov: I have a query that calculates Instance wide signal wait time as a % of clock time per period. if (instance wide signal wait time / Clock time )*100 = 100, then I’m thinking the instance would benefit from an additional CPU core. Is this a easy way to determine more cpu?

Without looking at the queries using the most CPU to see if you can tune them? That seems like a pretty expensive script.

DBA douglas: Hello Brent, I’m currently in the USA. I’m wondering if I can move to Europe or New Zealand, and provide American Companies off hours DBA support? Do you think companies would find it valuable to have an American Citizen Working in a different time zone?

Why would their DBA need to be an American citizen?

I_wish_I’d_done_better_in_school: What are the 3 most common issues that clients hire you to fix?

Slow performance (#1 by far), unreliable SQL Servers, and planning for the future.

I_wish_I’d_done_better_in_school: Hey Brent, Do you see many productions DBA’s skilling up in powershell still? Or are there new tools that make this unnecessary? I’m trying to get my team to adopt DBAtools, but my manager says that all the DBAs would have to learn powershell and that is too much to ask.

Be the change you want to see in the world. Consistently solve problems with PowerShell faster than your teammates, and they’ll want to be more like you. (Assuming, that is, that you can – if you can’t, then there’s your answer too, but you don’t need them to adopt it first in order to improve your own skills.)

Greef Karga: What are your pros / cons for generating / executing dynamic SQL in the C# client app vs doing so in a stored proc?

Developers are best at the language they use the most. What language do your developers use the most?

Qaiser: Rolling upgrade or Backup Restore for database migration in AG from sql 2012 SP4 to sql 2019 Enterprise. Some experts say that Rolling upgrade is risky as no rollback option.

Read this.

sqldeo: Hi Brent,I got unusual issue for my tempdb,i have 8 tempfile for 8 core cpu but not growing all equally (initial size and growing size equal across all files)using sql 2019 so traceflag 1117 also not recommended by MS,any solution to look the problem !thanks

Grow ’em out manually.

Seriously, just do it once and be done. Grow ’em out to 25% of the total size of the databases on the server for starters, and just be done with it. Why waste a minute of your life trying to solve this problem? If they continue to grow beyond that size, take my Fundamentals of TempDB class to understand what’s using the space and what you can do about it.

Valdemar: Should stored procs that create local temp tables also drop the local temp tables upon completion?

You can if you want, but it doesn’t really matter, as Paul White discusses in these posts. SQL Server keeps the structure & statistics around for other sessions. The drop only affects your ability to access the temp table after your proc finishes.

WBurgh: Good old MDS, any word on the street on when it’s being killed off? Asking since I’m not seeing anything in the Azure pipeline from MS.

No, but I certainly wouldn’t do any new implementations of it.

LegalEagle: Good Morning Brent! I have an availability group running on physical hardware and wish to add an additional node but add it as a VM to offload reporting. Can I mix a physical SQL with a VM SQL in the same AG? Google doesn’t seem to have information on that. See you at PASS!

Yes, you can, period. See you in Seattle!

Scrootch: My friend is struggling with PAGELATCH on the tempDB, leading to GAM contention. Server spec: 2019 Enterprise, 24 CPU/tempDB files, 320GB memory, 760 batch req/s. Would adding more tempDB files than CPUs have a negative effect to address this issue? Eg adding 32 file vs 24 CPUs.

Check out my Fundamentals of TempDB class.

Maksim: Has moving lob data off row for a given table ever moved you across the performance finish line?

No.

CakeAndEatItToo: The ‘missing index’ dmv is limited to 600 results, but for a SaaSy server with 1000+ db’s there’s a lot of cpu cycles spent cleaning up every couple of minutes. Is it possible to increase the limit or scale back the cleanup or do ANYthing without turning the whole thing off?

I’d be curious as to how you’re measuring that. Post a question on DBA.StackExchange.com with your evidence showing the cleanup of that DMV is causing overhead on your system.

Bill: I have an OLTP system with cache instability, but plenty of cpu to cover it. I had a frequently run proc that runs in <10ms., been stable for years, start blocking, and queries to timeout after 30s., seemingly due to it being Recompiled. indexstats are current. Where to start?

Run sp_BlitzCache, look at the Priority 1 warnings, and read the URLs in each of those.

Kinneret: Have you ever written an epic SQL query only to see SSMS freeze up or crash?

Click File, Save. Save your work every few minutes. No different than the advice I’d give to anyone using any computer application, ever.

Mollusk: What are your thoughts on running SQL HA technology like AG’s or a failover cluster instance in a public cloud. Issues are that VMs could be vmotioned (a lot). How will this affect auto-failovers or cluster health? Can I just set healthcheck and timeout settings higher?

I haven’t done hands-on AG work in years, sorry. I got out of production DBA work because complexity kept rising, fragility kept getting worse, and I kept hitting weird undocumented problems. I got tired of being on-call to fix that kind of thing.

J Katz: How to get a job starting out with perf tuning? Im a Senior infrastructure DBA, and I want to get experience perf tuning –I love your courses BTW, best investment ever!. Does Microsoft have positions like this? Should I expect to take a pay cut?

Use the Buddy System. Call people in your network who know you know your stuff. They’ll get you past the stack of resumes. Generally speaking, Microsoft doesn’t tune the T-SQL or indexes that runs on their customers’ servers.

DataPayload: Hello Brent, My friend understand that 1=1 help to comment out and 1=2 in create table without data in where clause. He is not sure if these helps the query logic or plan any other way. Can you please help him. Thank you.

Read this.

SQLServile: Our server is to be replaced with a newer one, which is fine. The staff doing this though, say it will be “too hard” to retain the original servername in this exercise. A new name will break countless connections (but hey, not their problem). Is there a less disruptive way?

They’re right. Instead, immediately after migrating to the new server, rename the old server as something else, like PROD_OLD. Create a DNS CNAME (alias) for the old name, and have it point to the new server’s name.

Yousef: Is it ever OK to specify an additional / redundant WHERE condition value to get a better index selection?

I would want to see an example.

Dan Griswold: Should Managed Service accounts like NT SERVICE\…. be disabled for security purposes? Why or why not? I have searched all over and can’t seem to find a clear answer.

I don’t do security work.

Raguel: Do you know of anyone that has ported the Stackoverflow or AdventureWorks DBs to other platforms (PosgreSQL, etc) for bakeoff performance testing purposes against SQL Server?

Read this.

Galina: What are your pros/cons of streaming on Youtube vs Twitch?

In my experience, Twitch pays better. When I was streaming regularly, I was getting $500-$750/month from Twitch, where as I don’t think I ever cracked $200/mo from YouTube.

Stacey Q: What is the largest DB you have seen that had enough server memory to cache the entire DB in RAM?

About 1.5TB. I’ve seen much larger databases, but they just weren’t cached in RAM.

Shaheen: How does index tuning for data warehouse queries differ from index tuning for OLTP queries?

These days, data warehouse tuning tends to start with clustered columnstore and go from there. OLTP tuning starts with multiple rowstore indexes.

Don’t Blame Anthony: How does SQL server determine row order for “Select Top” or “Delete Top” when no order by clause is specified? I see this pattern frequently.

Whatever rows the query plan pulls first, that’s what comes out. Order isn’t guaranteed without an order by. It can change based on parallelism, available indexes, merry-go-round reads, all kinds of stuff.

Stockburn: Hi Brent, company has 600K+ databases in Azure SQL DB. How would you change your perf tuning methods, investigation when faced with so many DBs. They all use a similar schema but may be used more / less heavily. We have been using query store, interested in your advice.

Read this.

Marc Spector: Sometimes when we run sp_whoisactive we see long duration sessions from our .NET apps with a sql_command of “sys.sp_reset_connection;1” and status of “dormant”. What causes this? Is this anything to be concerned with?

Read this.

TheCaptain: Hi Brent, Some developers at my organisation have written a module to encapsulate some common client parameters, making them optional and default: timeout=0, connection_timeout=0. What’s your opinion and do you know of any articles with guidance on best practices?

No.

MikeNM: I was watching your talk about debugging stored procedures, and your laments about the death of debugging in SSMS. Have you worked with SQL debugging in VS Code? If not, what do you recommend?

I use the technique I describe in the video.

Cats_Everywhere: Brent, Do you know any DBAs who have taken advantage of the WFH movement to move to a low cost of living city/area? Any suggestions? Must have great restaurants!

Yes, but it’s typically been to move closer to family.

Tefnut: When SQL connections are re-used via .NET connection pooling, does the sproc author need to take care to reset connection level settings that were enabled in the executed stored procedures? Ex: At the end of the the sproc do SET NOCOUNT OFF, SET XACT ABORT OFF, etc.

No, but remember that the procs might be called by other things, not just .NET.

Yourbiggestfan: Hi Brent, Can you point me to any website/blogsite which has performance tuning challenges/exercises for SQL Server where we can test our knowledge? Your invisible indexes challenge is the kind I am looking for.

Yes, click Training at the top of the site.

Emelio: What is the best book on Data Warehouse Design?

I have no idea.

NotCloseEnought2RetirementToStopLearning: Hi Brent My shop has been mainly SQL 2012 and 2016 (nagging my manager to upgrade). Can you recommend key\cool features released in SQL 2017, 2019 and 2022 that I should focus on so I Iook super smart when we do upgrade? One feature per version please.

Read this. 

Taweret: What are the top rules / guidelines from older SQL Servers that you see used on new SQL Server versions but shouldn’t be?

To monitor Page Life Expectancy and to lower Fill Factor in order to reduce fragmentation.

Khonsu: What are the pros / cons of creating NC temp table indexes inline vs a separate step after table creation?

Read this and/or attend my Fundamentals of TempDB class.

Steven Grant: Do you have a recommended sample DBA calendar that shows admin functions to perform daily, weekly, monthly, yearly for SQL shops with little previous DBA structure?

No. Neat question, but no.

James: Hi Brent – With the non stop surge towards cloud, companies desiring and hiring more towards BI/Data Teams/Engineers etc, and being adamant that we are not needed, is it time to accept the inevitable that all respect and understanding of a DBA is gone and we need to move on?

Yes, as you can tell by the complete lack of database questions above, there’s no work left to be done. You should move on. You go on ahead without me. Best wishes, hugs, thoughts, and prayers.

Midwest DBA: Who is the Brent Ozar of Data Warehousing? Do you have any recommendations on training resources?

Data warehousing is being reinvented, and there are tons of approaches. For example, check out James Serra’s video on what a modern data warehouse entails:


Office Hours: Dodging Work Edition

Videos
8 Comments

Post your questions at https://pollgab.com/room/brento and upvote the ones you’d like to see me cover. Today, I’m dodging work, so I went through your questions while I waited for the coffee shop to open:

Here’s what we covered:

  • 00:00 Introductions
  • 03:28 Ivan: What are the top signs that a table has a poor clustered index?
  • 05:39 NotCloseEnough2RetirementToStopLearning: Hi Brent, it seems the job market wants and is paying more for data generalists (wider but less deep skillset) than data specialists (DBA, deep but less wide skillset). This seems to be limiting salaries and the number of Senior positions. Any thoughts or comments?
  • 07:52 Gennady: When should you use a table per year of data (and union them together in a view) vs using formal sql table partitioning?
  • 12:32 Mr. Ed: How would I determine if my IT team needs a DBA?
  • 14:28 Tom: Can you recommend a live T-SQL class?
  • 15:23 Jorriss: If you were a woodchuck, would you prefer brick and mortar structures?
  • 17:18 J.P.: Why should I learn noSQL if I know SQL? Is it only because of the speed?
  • 19:06 Rob: We have a 4.8 tb SharePoint database that is causing a lot of issues like backups due to the file size. Has anyone run across this issue?

[Video] Fundamentals of Stored Procedures at SQLBits

T-SQL, Videos
11 Comments

Anybody can write a stored procedure with a little help from Google. This session is about how to write stored procedures that have a high likelihood of performing well and are easy to troubleshoot.

This fast-paced, all-demo session from SQLBits will NOT cover how to write a query, syntax, or performance tuning. This is about good best practices after you’ve written the first one – things like how to catch errors, how to pass in multiple values, how to debug without the debugger, and more.

If you enjoyed this session, check out SQLBits 2022’s free video library with the other sessions from this year, and for all years.


Why Adding Some Memory Doesn’t Fix All Caching Problems

Wait Stats
16 Comments

It seems obvious: add some memory, and SQL Server should be able to cache data, thereby making queries run faster … right?

Well, if you can’t cache everything you need for a query, you might be surprised to find that SQL Server may still read the entire table up from disk, regardless of what parts of the table are currently cached in memory.

Let’s start with a little setup. I’m going to set the server’s max memory at about 8GB, which isn’t enough to cache the entire Comments table from the Stack Overflow database. (I’m using the 2018-06 version of the database, in which the Comments table is about 21GB. I’m also going to drop all of my nonclustered indexes to force SQL Server to do a table scan.

Then in order to demo the problem as quickly and easily as possible, I’m going to write a query that you probably wouldn’t normally write. I’ll take the Comments table of the Stack Overflow database, which has a clustered primary key on the Id column, and query it to find all the comments Jon Skeet has ever made:

A couple of things to recognize:

  • There’s no index on UserId, so SQL Server will need to scan the entire clustered index
  • The ORDER BY Id actually pays off a little for the clustered index scan – because we asked for the data sorted by Id anyway

The first time the query runs on 8GB RAM…

If we look at the actual execution plan, we spent 48 seconds waiting on PAGEIOLATCH (storage reads) in a 22-second query:

During that 22-second query, we read the entirety of the Comments table up from storage. We can prove that by checking sys.dm_io_virtual_file_stats before & after the query runs. That’s where SQL Server tracks how much it’s read & written from the various data & log files:

SQL Server read 21GB of data from the Stack Overflow database, which is spread across 4 data files. That’s the size of the Comments table. Okay, so far so good.

The second time the query runs on 8GB RAM…

We’ve already got at least some of the table cached in memory, right? If we rerun the table again, surely our PAGEIOLATCH waits will drop a little because the buffer pool is warmed up, right? Right?

And we read the same 21GB of data up from disk again:

The problem is that this query needs the data sorted in the order of the clustered index’s Id, so it’s going to do a scan from the beginning of the clustered index to the end. When the query completes, sure, some of the Comments table is cached in memory – but only the end of it, the highest Ids. So when our query starts again, it’s like Groundhog Day – we begin reading the table from the beginning again.

Adding just some memory doesn’t fix this.

Let’s amp our server’s memory up to 16GB and try the query again:

I’m also running the query a couple of times just to make doggone sure the buffer pool is warmed up and that SQL Server’s total memory has risen up to match its target. Then, run it a third time, and check wait stats and storage reads:

The query still takes 22 seconds, and we still wait about a minute on storage because we’re reading it all back up from disk again:

But cache the whole table, and magic happens.

Raise the server’s memory up to 24GB, enough to cache the 21GB Comments table, and run the query a couple times to prime the pump:

Suddenly the query finishes in 3 seconds instead of 22, and we don’t spend any time waiting on storage whatsoever:

Because we did no physical reads at all – the only activity during this span was a little writing to TempDB by sp_BlitzFirst itself:

Keep in mind that this is a simplified demo to illustrate a complex problem:

  • Real-life queries are more complex than this, especially in how we have to dig into the plan to understand why they’re scanning an index in order.
  • Adding an index to get a seek is a great solution – but only if you’re reading a relatively small portion of the table. If you seek in and read a range of rows, especially on real-world-sized data sets, you can still hit this problem.
  • Real-life servers serve multiple databases at a time, each running multiple queries. It’s hard as hell to cache everything you need.
  • SQL Server uses memory for lots of things, not just caching data, and they interact in tricky ways.
  • This is especially tricky in cloud VMs where you might want to make a giant leap up in memory, but you’re required to buy way more CPUs than you need in order to get more memory.

The point of the story here was to explain why adding memory – even doubling or tripling it – might not be enough to put a dent in PAGEIOLATCH waits. If SQL Server wants to read pages in a specific order, and the pages it wants aren’t in memory, it will throw out cached pages on the very same object – only to read those pages back up from disk again later while executing the query.


How to Add Invisible Indexes in SQL Server Enterprise Edition

Indexing
19 Comments

Every now and then, a client says, “We want to add indexes to this third party application, but if we do, we’ll lose support.”

No problem – enter indexed views.

Say our vendor app is the Stack Overflow database, and the app constantly queries Users by display name:

That query plan scans the Users table because there’s no index on DisplayName:

And our mean ol’ nasty vendor won’t let us add any indexes. No problem – let’s add a view:

A regular view like that is just syntactic sugar that makes it easier to write queries. You hide the complexity in the view, and then people can easily select stuff from the view without understanding everything that the view is pulling together. A view is just a T-SQL shortcut, but it doesn’t change the database structure.

However, you can turn a regular view into an indexed view (or a materialized view, different terms, same thing) by creating a unique clustered index atop the view.

Normally, when we think about clustered indexes, we would assume that the Id column is the right one to use here, since it’s unique. However, the entire purpose of what we’re about to do is just to give ourselves an index on DisplayName – so let’s cluster on that:

Then run our query again and check out its new execution plan:

SQL Server Enterprise Edition says, “Hey, you’re looking for Users with a specific DisplayName – it just so happens I’ve got a copy of that over here, sorted in order!” (Standard Edition doesn’t know this trick.) It produces a nifty execution plan with a key lookup – without actually having an index operation on the Users table itself.

The real beauty, though, is that when our mean ol’ nasty vendor looks at the indexes on the Users table, they have no idea that we’ve done anything. If we need to open a support case with the vendor or upgrade the software, we can temporarily drop our indexed views, and then recreate ’em after the support call is over. (That’s actually important to do, too – if the vendor tries to alter the table, their scripts can fail due to the presence of the indexed view.)

Indexed views have tons of gotchas, and explaining ’em all is way outside of the scope of this blog post. I demo ’em in the live Mastering Index Tuning class next month, or if you’ve got the recordings, head over here now.

Speaking of which, you know who else has the recordings? Paul White and Michael J. Swart, both of whom had good answers to the challenge in Monday’s blog post. Paul used the indexed view, whereas Michael had a more, shall we say, “creative” solution.


Updated First Responder Kit and Consultant Toolkit for July 2022

Summer has turned the corner. Sure, right now the First Responder Kit is wearing its swimsuit, basking in the sun, but already as we speak, the days are getting shorter. It’s only going to be a matter of time before the Pumpkin Spice release of the First Responder Kit. Go outside and take a walk, work on your tan while you still can.

How I Use the First Responder Kit
Wanna watch me use it? Take the class.

To get the new version:

Consultant Toolkit Changes

I updated it to this month’s First Responder Kit, but no changes to querymanifest.json or the spreadsheet. If you’ve customized those, no changes are necessary this month: just copy your spreadsheet and querymanifest.json into the new release’s folder.

sp_Blitz Changes

  • Enhancement: ignore SQL Server 2022’s default In-Memory OLTP usage for TempDB if it’s low. (#3110)
  • Enhancement: ignore SQL Server 2022’s idle POPULATE_LOCK_ORDINALS wait. (#3105)
  • Enhancement: ignore queries in system databases that have recompile hints. (#3119, thanks Erik Darling.)
  • Fix: typo on a check. (#3116, thanks Andreas Jordan.)

sp_BlitzCache Changes

  • Fix: should no longer get arithmetic overflow errors on servers with huge numbers of reads and writes that overflowed bigints. (#2980, thanks sunsickteck and RihoA.)
  • Fix: the Average Max Memory Grant column contents formula was wrong. (#3120, thanks MrTCS.)
  • Fix: make warning capitalization more consistent. (#3096)

sp_BlitzFirst Changes

  • Enhancement: the file stats output section now shows the database name at the far right. (#3118)
  • Fix: only repopulate the ##WaitCategories table if its contents are older, not newer. (#3092)

sp_BlitzIndex Changes

  • Fix: lower memory usage on SQL Server 2019 instances that are facing a bug in an out-of-control number of entries in sys.dm_db_missing_index_group_stats_query. This DMV is only supposed to have 600 rows, but it looks like in some older 2019 CUs, the number of recommended index plans wasn’t capped the way BOL says it should be. (#3085, thanks Paul Neering.)

sp_BlitzLock Changes

  • Enhancement: add spid and wait_resource columns to output. (#3101, thanks David Hooey.)

sp_DatabaseRestore Changes

  • Enhancement: look for Ola’s CommandExecute proc in the current database, so it should be supported in more non-system-database scenarios. (#3094 and #3095, thanks Ben Wiggins.)
  • Fix: case sensitivity typo involving BackUpFile. (#3100, thanks Maarten Clardij.)

Bonus changes: Anthony Green kept the SQL Server versions file up to date.

For Support

When you have questions about how the tools work, talk with the community in the #FirstResponderKit Slack channel. Be patient: it’s staffed by volunteers with day jobs. If it’s your first time in the community Slack, get started here.

When you find a bug or want something changed, read the contributing.md file.

When you have a question about what the scripts found, first make sure you read the “More Details” URL for any warning you find. We put a lot of work into documentation, and we wouldn’t want someone to yell at you to go read the fine manual. After that, when you’ve still got questions about how something works in SQL Server, post a question at DBA.StackExchange.com and the community (that includes me!) will help. Include exact errors and any applicable screenshots, your SQL Server version number (including the build #), and the version of the tool you’re working with.


SQL Server Pop Quiz: A Key Lookup Without the Index

Execution Plans
14 Comments

How do we get this to show up in an execution plan:

Without having an index seek or index scan operation on the same table?

Take any copy of the Stack Overflow database and write a query that will produce a key lookup on the Users, table, but it’s not allowed to have an index seek or index scan operation on the Users table. You’re allowed to change the database structure – for example, if you need to change the clustered index, drop it, add nonclustered indexes, whatever – and of course include those changes in your comment as well.

Post your query in the comments here by end of day Wednesday, and I’ll pick my favorites and put ’em in the blog on Friday. My favorite creative answers will get 1 year access to the Recorded Class Season Pass: Masters Classes. My own answer blog post is written & scheduled for Thursday.


[Video] Office Hours in the Valley of Fire State Park

Videos
0

I drove the Ferrari out through the Valley of Fire State Park and stopped out in the middle of nowhere to take the questions you’d upvoted at https://pollgab.com/room/brento.

Here’s what we covered:

  • 00:00 Introductions
  • 00:39 Saket Mittal: Hi Brent, I am working with a developer on a query, this query has lots of join , cross apply subqueries etc. When running this query, it gives result within 2 mins, but when this query runs as select … into #table it keeps running for ever. Could you suggest the reason?
  • 02:43 Anthony DiNozzo: Did DBA Brent ever have to deal with boredom on the job? How did they deal with it?
  • 04:40 Lennie Briscoe: Do you read the SSMS release notes? Anything exciting of late?
  • 06:04 Rashad: Has SQL trace flag 1800 ever boosted you over a HADR / Log shipping speed bump?
  • 06:54 Martin Guth: Hi Brent. Regarding HA I soon have to switch from FCI (we ran for years) to another option (currently running Standard Edition). What’s your take on Mirroring in 2022? Is this still an option? Can you name 3 perks moving to AGs on Enterprise Edition could bring regarding HA?
  • 09:50 Too Much Spare Time: Hi Prof. Party Person, my friend is an app developer (but long time SQL Server enthusiast) working at a big company with hundreds of sql servers & team of DBAs. He feels like only one DBA is friendly/easy to work with – any suggestions to keep on the rest of the team’s good side?
  • 11:42 The Ferrari

Office Hours Speed Round: Text Edition

Not all of the questions you post at https://pollgab.com/room/brento are hard. Some of ’em can be answered in one line:

Q: accidentalDBA: Hi Brent, My friend is moving from two physical servers (production + failover) to VMs on the same host and SAN. His manager wants to keep log shipping as a DR solution. I advised against this as most disasters would affect both VMs equally. Am I missing something?

If both your servers are in the same physical location, you don’t have disaster recovery. You just have disaster.

Q: Efraim: When refactoring a clustered index for a multi-terabyte table, is it recommended to take the db out of full recovery and put it into simple recovery mode first? Concerned about maxing out the transaction log during the refactoring.

Will the business allow you to lose point-in-time recovery while this operation is happening? If no, then no.

Q: Doug E: Is Azure Data Studio ready for SSMS users to make the switch? If not, what functionality needs to be added to Azure Data Studio?

Why do you need to make the switch? Your answer to that question will determine whether Azure Data Studio has what you need in order to switch.

Q: Pramod A: On one of the servers, BPE is enabled and the SQL max memory is set to 4x the physical RAM. Does this make sense ? Are there any best practices for SQL max memory configuration when BPE is enabled specifically ?

I haven’t seen a good use case for BPE. Read this. I’m sure there *is* one somewhere, I just haven’t seen it.

Q: SaveTonight: You always say that once we move to the cloud we cannot move back to on-prem. Azure doesn’t let you manage your backups, but there must be a way to go back to on-prem from Azure SQL Database. Azure/AWS cannot steal your data, that would be a law infringement, right?

They didn’t steal it. You gave it to them.

Q: AlwaysLearningDBA: Hello Brent, a friend of mine : ) has been asked to implement TDE on two node AlwaysOn with Replication. He is starting with Failover Cluster with Replication for now.He knows that you don’t work with Replication.Google didn’t help much.Do you recommend any resources about this?

You’re asking me for good learning resources for things that you know I don’t use? Okay, uh, sure, try this. That video is everything I know about 2-node AGs with TDE and replication.

Q: Jack McCoy: Which job provides more satisfaction / income, development DBA vs production DBA?

For me, development DBA because I’m not on call, and I can provide a high value in a short period of time. Production DBA work is more grueling, involves a lot of on-call and weekend time, and isn’t as financially rewarding because you’re seen as the cost of doing business to keep the plumbing flowing.

Q: Anthony DiNozzo: What percentage of work time should DBA’s commit to continuing technical education?

Learning the right stuff pays off in higher future earnings. How much more would you like to earn in the future, as opposed to today? Figure out what you want to earn, and then learn what it would take to earn that. If your goals are minor and incremental, your current employer might sponsor it. If your goals are major and involve leaving your current employer, well, you’re gonna have to figure it the %$&* out.

Q: Joe Friday: Is clustered index fragmentation any more / less detrimental to query performance than non clustered index fragmentation?

Watch this.

Q: Pramod A: A table has 80 statistics ( Index + system ) on table row count of 160 million. To improve the performance of the query, if I have to pick few stats specifically and update those stats with full scan, how would I know which stats to pick in order to expect a performance gain ?

The ones the query is using for its cardinality estimation decision. We cover how to do that in my Fundamentals of Query Tuning and Mastering Query Tuning classes.

Q: Omer: Did DBA Brent have a outlook rule / folder strategy for classifying / triaging the various SQL errors encountered on a daily basis during their career? If so, what strategy was used?

As soon as you start putting alert emails into a folder, you’re screwed. You’re not looking at those folders. Go change the alert thresholds so you only get actionable emails you’d want to have in your inbox.

Q: Ned B: Is there any harm / benefit in using SQL page compression if the SAN is also compressing the data?

Compressed pages can help you can store more in memory and get faster throughput to storage since less 8KB pages go across the wire.

Q: DBAInHiding: If you’ve been employed as a staff employee for various companies over the years but feel your next gig as a DBA or developer will require you to be a 1099 contractor, what’s the essential checklist to prepare for such a move and what gotchas should you be on the lookout for?

I don’t have an easy answer for that one. I’d pick up a book on how to start freelancing.

Q: Anul: What is the ideal cluster and sector size for new SAN drives hosting SQL data files and transaction log files?

Ask your storage vendor.

Q: Kurt Wagner: What are the monitoring hurdles / challenges when moving bare metal SQL from onprem to Azure SQL? Do we throw away our current SQL monitoring software?

Not all of the same monitoring data is available, and when it’s available, it has to be captured in different ways.

Q: Hello Brent, my friend need some help to understand apply, cross apply and outer apply operator. He understand joining table and function using those, but not joining 2 table using cross or outer apply. Probably he is missing some basic understanding.

Pick up this book.

Q: Andrew: Hi Brent, On daily basis we keep our data on a SAN storage, so does the SQL servers, not so long ago we made a SQL RAS, and i asked the Microsoft Premier Field Enginer about the virtual disk separation for the SQL data files. What do you think about it?

What’s the problem you’re trying to solve?

Q: Raphra: Hi Brent, I thought I knew how a recursive CTE works until I saw the execution plan of a recursive CTE (that was to return the last 100 calendar years) and I saw a Nested Loops (inner join). Can you explain what a Nested Loop operator is doing there?

Not without seeing the plan, but I’m not sure why you think nested loops are bad. They can make perfect sense for some operations.

Q: Hany: Hello Brent, what is the best Azure video learning resources out there?

Read this.

Q: Bob: Why is an Int primary key on a date dimension recommended instead of a date? All tables that link to the dim keep their native type and can use built in date functions. This can also avoid joining the Date dim for simple queries.

Ask whoever recommended that to you.

Q: Quartz: Why does CAST(‘2022-06-13’ AS DATETIME) fail for SET LANGUAGE british but work for SET LANGUAGE us_english?

I didn’t know British was a language.

Q: DBABA (database administrator by accident): What are good alternatives to dacpac deployments? (Not having to write manual migration scripts but defining the target state of the database)

Check out Redgate Schema Compare.

Q: I Inherited It: I have a server that has a 14TB Data Warehouse database on it. For a couple of days each month, it get’s hammered, often by very poorly written queries and performance suffers. It never recovers, even when the traffic backs back down, performance remains bad until reboot. Why?

Analyze its wait stats like we talk about in Mastering Server Tuning.

Q: YourbiggestFan: Hi Brent, If you are asked to trbshoot perf issues on az paas database, how would you go abt finding the main bottleneck. Would you use sp_blitzfirst(log to table) or use query store to get reliable wait stats info or via any other way? (Account for DB scale down during offhours)

I’d log wait stats to table.

Q: SteveB: In a recent office hours you said the best way to upgrade an AG was with a Distributed AG. Is there a reason you would not add the new servers to the current AG and just failover to them? We done this and it worked great but I might have just gotten lucky.

Yeah, whenever I have new servers in an AG, I prefer testing their failover processes, doing cluster validation, etc before go-live, and that’s really dangerous when you have an existing AG.

Q: Yabba DB Doo: We have a web app in Azure that has individual databases for each of our customers. What’s the best way to manage DB changes during development and then deploy them out to multiple production databases?

See above.

Q: MikeO: My vendor friend is using sp_prepexec. When passing the query with the recompile option the plan shows using an index seek. When passing the same query without the recompile option the plans shows using an index scan. Is this a parameter sniffing issue or a quirk with sp_prepexec

I can’t tell without seeing the queries and plans.

Q: mr_Arturo: hi Brent, a friend of mine (: recently changed the job and “he” (: noticed that the new company use lots of views, but even stranger is that some of views are request data from other views! What about performance and should it be avoided request the view data of the view data?

Read this.

Q: Ozymandias: Is there a DB size where native SQL backup / restore becomes a concern?

Generally by the time you hit a terabyte, full restore times are tough.

Q: mukesh chaurasia: Dear i am new in sql administrator so i want to know about which book is best to guide about sql administrator real time problem.

Read this.

Q: Daniyyel: Is it ok to run perfmon directly on the SQL server?

No, try this DMV instead.

Q: Konstantin: How does Jorriss combat oxidative stress?

He’s made of carbon fiber, not metal.

Q: Odafin: Any noteworthy columnstore improvements in SQL2022?

Read this.

Q: Jethro: Is the OPTION Label hint ever helpful?

I’ve never used it.

Q: Experto no en la materia: What is the largest number of active databases you have seen hosted by a single SQL instance?

10-12,000.

Q: Arslan: When troubleshooting a SQL agent job step failure, is it better to “Include step output in history” or to output job step results to an Output file?

Which one is easier for you to parse when it’s a big long output? There’s your answer.

Q: David: Who is the “Brent Ozar” of the SSIS world?

Andy Leonard.

Q: SqlPadawan: Hi Brent! A part from grabbing parameters and good/bad plans (as you teach in MPS), how have you used Query Store during your 2-day emergency rescue? Any other practical uses in terms of performance tuning?

Not during my 2-day SQL Critical Care, no, but if I had a full time DBA job again, I’d use it.

Q: Dev on the Dark Side: What are the top 10 (or even 5) things that you can “teach” a development team to do / not do in regards to a mix of stored procs, entity framework, & CF ad hoc queries. So many problems… so little time…

If your developers can only learn 5-10 things, they’re doomed.

Q: Kunoichi: Hi Brent! My question is related to CDC : Why does MS SQL Server allows ONLY TWO capture instances on any given table at a given time?

For “why”, ask Microsoft.

Q: Steve E: Hi Brent, All of our SQL Servers are currently on premise but I feel I should gain some awareness of cloud services such as Azure Managed Instances or Azure SQL Databases. Is there a way I can do this as a personal user? Most of the plans seem to be 00’s of (British) pounds p/m

Read articles or watch videos rather than running your own server.

Q: Eric Beaumont: My friend read a blog that write once file shares are good at stopping potential ransomware from infecting sql backups. Have you ever implemented at write once file share for sql backups? What all is involved?

The DBA shouldn’t be the one implementing it. The sysadmin team should be. This kind of thing is usable across all servers, not just the database servers.

Q: DBA by addiction: What kind of malware / virus protection can or should be installed on a SQL-Server? Can you give some advice if things around this topic are pretty much senseless?

Whatever your security team requires.

Q: Have you used SQL Server with docker? If so, please elaborate.

Yes, in an early attempt to do SQL Server development on my Mac. It was a huge pain in the rear. I hated it. I can see why people would use it for continuous integration, but it’s not for me, especially with the new Apple Silicon processors.

Q: Hi Brent! A CLR function uses a temporary table. I don’t want to use forced parameterization at the database level at this time. In this case, how to deal with the bloated plan cache?

I’m not a fan of using CLR in SQL Server for a whole bunch of reasons, but if you insist on using it, then I’ll assume you can solve the problems you run into.

Q: Emelio: Does page level compression affect backup performance or size?

Read this.

Q: Fyodor: Do you have a File Group strategy you recommend when creating new SQL databases?

If it’s going to be 1TB or larger within 30 days, create a new filegroup with 4 files spread across 4 volumes, and make that the default filegroup.

Q: Does Basically Anything: Hi Brent, I’m having issues with the T-log not freeing space after log backups and growing a lot. There are no runaway queries, but there are 200+ sleeping SPIDs with is_implicit_transaction = 1. Are they the likely culprit? What troubleshooting path do you recommend? It’s ISV.

Read this.

Q: Kinneret: What is your opinion of SQL 2022 Buffer Pool Parallel Scan feature? Who will this benefit the most?

I haven’t looked into it at all, but the Microsoft post on it implies that it focuses on people with a lot of memory, say 512GB and above.

Q: Jayden: What are top gotcha’s you see when using implicit transactions in TSQL and C#?

Read this.

Q: Biz: Is there a good way to view cache hit / cache miss ratios for when SQL has to load data from slow disk vs fast memory?

By going back in a time machine to 1999 when that was how we did performance tuning. Today, use wait stats instead.

Q: Bart: Have you consulted on Microsoft Dynamics AX (2012) database query/index optimizations?

Yes.

Q: Lincoln: What bad things can happen when data warehouse and OLTP mixed workloads are run against the same SQL 2019 enterprise instance?

Slow performance.

Q: RoJo: We have Dev, Stage, Prod environments. Sometimes the config (for example) points to the Dev from Stage or worse. To isolate them from each other, I’m thinking white listing inside Windows Firewall. Is this too slow, or any best practice to keep collateral damage from near servers

I don’t think I even understand this question. It’s probably too big of a scope to ask for Office Hours – try dba.stackexchange.com.

Q: Henrique Almeida: Hi Brent, how are you, I hope so. Which free SSMS add-ins do you recommend and if not, why?

You hope so what? What the hell does that even mean?

I don’t use any because I can’t usually install apps on the client machines I work with. I’m sure there are good ones out there, but I just don’t have experience on what they’d be.

Q: Raphra: I like notebooks in Azure Data Studios and I can see how we can add comments to stored proc, by replacing the stored proc’s text with a notebook file (with markdowns for comments). However, do you know how I can call this notebook from SQL Agent instead of the stored proc?

I adore notebooks too. I wouldn’t use them as scheduled jobs though.

Q: Maani: Hi Brent! how come select from a view executed by ‘sp_executesql’ is much slower than executing it as a query (it took 4 mins to be executed)?I wrote option(recompile) at the end of the select clause from view, and the request was executed in less than a second.(was no parameter)

I would need to see the query and the execution plans to answer that.

Q: Neil: i have a job that uses OLE Automation to pull some json stuff. can i enable and disable OLE automation at the beginning and end of the job to prevent anyone else from using it ?

No, it’s completely impossible to enable and disable OLE automation, as shown by this link.


This Is Your Last Chance to Attend My Mastering Classes Live This Year.

Conferences and Classes
0

You’ve conquered my Fundamentals classes, and you’re ready to take it up a notch.

You know you can get the recordings of my Mastering classes, but you prefer the interaction of live classes – the ability to ask questions live, have me go off-topic for demos, and block out your calendar to make sure you conquer the material.

You have one last chance for 2022:

Or attend the whole set for $1,995.

Once this round of live online classes finishes, I won’t be teaching classes until at least mid-2023. I totally love doing it – but after the pandemic marathon of teaching weeks of online classes every month for years on end, I’m ready to take a break for a while! So if you wanna hang out live, come on down. See you in class!