Here Are The Results of the 2022 Data Professional Salary Survey.


Every year, I run a salary survey for folks in the database industry. This year, I was especially curious to see the results to find out whether salaries went up. Anecdotally, I’ve seen a lot of people jumping ship to new companies due to the Great Resignation – but what does the data actually show? Let’s find out.

Download the results here and slice & dice ’em to see what’s important to you.

I’m a database administrator, so I filtered for DBAs in the United States and then looked at average salary, and yes, it took a pretty good jump this year: about a 5% raise, much higher than in previous years.

That’s a good sign, but I’ll leave it to y’all to dig into the numbers that mean the most to you, and in the comments, tell me what you see.

Update: Mattia Nocerino has a free Tableau report with this year’s data.


The SQL Server Posts You Read the Most in 2021


I think of my blog posts in a few different categories:

  • Posts that will only be read at the moment they’re published (like upcoming webcasts, humorous stuff, and career advice)
  • Posts that will show up in search results over time and stay evergreen (like how a feature works)
  • Posts that provoke discussion and interactivity (like asking you a question or running a contest)

So with that in mind, let’s take a look back at the stuff you seemed to enjoy the most this year. I’m gonna sort them in a few ways:

  • 2021 posts you read the most (could be flashes in the pan, or might end up being evergreen over time)
  • Older posts you read the most this year (evergreen)
  • Posts you commented on the most

Evergreen Posts You Kept Reading

Let’s start with these because I think they help paint a picture of the different kinds of readers.

If you’re going to write a post that stands the test of time, you want to solve a timeless problem that readers have faced for years, and will continue to face as time goes on. These tutorial posts aren’t often the favorites of readers when the post first goes live, but they’re the kinds of posts that bring in new readers over time.

Not only is it hard to write posts like this initially, but it takes work to keep refining the content over time, adding the kinds of keywords and content that people are searching for. I actively prune and update some of ’em; others were right on target the day they were published.

If I *purely* wrote tutorial posts, then it’d be a different kind of blog. Pinal Dave’s SQLAuthority is a great example of that – I guarantee every one of you finds his stuff in your search results CONSTANTLY, and as a result, he has way, way higher page views than I have. I appreciate how much work he puts into that. I’m too lazy for that, as you’ll see in the next list.

2021’s Most-Read New Posts

Armed with that list of evergreen posts above, you can look at 2021’s top new posts and guess which ones will turn evergreen, and which ones won’t generate any hits in 2022:

2021’s Most-Commented Posts

And with that, I’ll let you have the rest of the year off – no new blog posts will get published here for the rest of 2021. I’ll see you next year!


[Video] Office Hours: SQL Server at Sunset


Let’s kick back with a tasty beverage and go through your highest-upvoted questions from https://pollgab.com/room/brento on the beach:

Here’s what we covered:

  • 00:00 Introductions
  • 00:49 Alexander Speshilov: How to determine amount of memory (RAM) used by query store?
  • 02:23 Aslan: Merhaba Brent, in your opinion, what are the top 4 databases to administer by DBA pay scale?
  • 04:40 Murdock: In a 2 node sync AG, should my friend let the reports read from Primary or Secondary? What if the DBs from which the reports are reading have RCSI enabled?
  • 07:10 Tobias Fünke: Howdy Brent, we see a lot (3 thousand plans for 1 query hash) of duped query plans due to queries using literal values instead of parameterized values in the where condition. Are there any potential gotchas associated with enabling Forced Parameterization in this scenario?
  • 08:16 Annyong Bluth: Hello Brent, We have ISV app for which we made our own non-clustered index to help with speed for our own custom report. Unfortunately, this broke the next major upgrade script for that app / db. What are your thoughts on applying NC indexes to ISV app db’s?
  • 10:10 Buster Bluth: Hello Brent, from a hiring perspective, what are the traits that distinguish a junior SQL DBA from a Senior SQL DBA?
  • 11:30 Adam West: Hi Brent, is it possible/safe/wise to offload backups and CHECKDB to an AG secondary server?
  • 14:00 Wrap-up and discussion of life in Cabo

Check Your SQL Server Backup Performance & Safety with sp_BlitzBackups

Out of all of the scripts in our free First Responder Kit, sp_BlitzBackups is probably the one you’ve used the least. Let’s talk about what comes out of it, and why you should use it more often.

First, let’s define two terms:

  • Recovery Point Objective (RPO) – measured in time, it’s how much data you would lose if you restored a backup. For example, if your last backup finished at 1AM this morning, and it’s currently 10AM, you’d lose 9 hours of data if the server went down right now. That’s a 9 hour RPO. (There’s a sketch of how to measure this right after this list.)
  • Recovery Time Objective (RTO) – also measured in time, it’s how much time it would take you to perform a recovery. For example, if the server went down right now, and it took you a total of 1 hour to decide to do a restore, start the restore process, finish the restore, and then let people back into the app again, that’s a 1 hour RTO.
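If you want to eyeball your actual worst-case RPO yourself, here’s a minimal sketch – assuming your backups land in msdb.dbo.backupset the way native backups normally do – that finds the longest gap between successful backups per database over the last 7 days:

    SELECT database_name,
           MAX(gap_minutes) AS worst_case_rpo_minutes
    FROM (
        SELECT database_name,
               DATEDIFF(MINUTE,
                        LAG(backup_finish_date) OVER
                            (PARTITION BY database_name ORDER BY backup_finish_date),
                        backup_finish_date) AS gap_minutes
        FROM msdb.dbo.backupset
        WHERE backup_finish_date >= DATEADD(DAY, -7, GETDATE())
    ) AS gaps
    GROUP BY database_name
    ORDER BY worst_case_rpo_minutes DESC;
    -- Note: this ignores the gap between your newest backup and right now,
    -- which is your current live exposure.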

Businesses have a really hard time agreeing on what their desired RPO and RTO are because they wanna lose as little data as possible, and they wanna be down for as short a time as possible. That’s where sp_BlitzBackups comes in. It can’t tell you what your goals are, obviously. Instead, it estimates what you’re actually delivering today.
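Running it is dead simple. In the First Responder Kit versions I’ve used, @HoursBack defaults to 168 (7 days), but double-check the documentation for your version:

    -- Analyze the default window:
    EXEC dbo.sp_BlitzBackups;

    -- Or narrow it down, say to the last 48 hours:
    EXEC dbo.sp_BlitzBackups @HoursBack = 48;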

When you run sp_BlitzBackups, it analyzes the last 7 days of backups. Here’s what it returns:

I know, tiny screenshot – you can click on it if you wanna see the results, but I’ll describe the non-obvious columns for you:

  • RPOWorstCaseMinutes – the longest length of time between two successful backups. Say you regularly do log backups every 15 minutes, but between 9:10AM and 9:40AM, the backup target was offline, and no backups could be done. You had successful log backups at 9:00AM and 9:45AM. Your worst case for RPO would be if the server went down at 9:44AM, just before the 9:45AM backup ran, so you’d have a 44-minute worst case RPO.
  • RTOWorstCaseMinutes – the longest length of time it would take you to do a restore and get back online. Say you do full backups every night, and log backups every 15 minutes. Your worst case RTO scenario would be if the server went down right before the nightly full backup because you’d have to restore the prior day’s full backup, plus all day long’s transaction log backups. In order to calculate this number, we take the total backup time spent during all of those backups. Technically, this isn’t accurate because your restores could take longer than the backups, especially due to the lack of instant file initialization on the transaction log. This just gives you a rough starting point idea, though.
  • Supporting info – when people see these “worst case” numbers, their next question is, “Holy smokes, when was the time where we might have lost 44 minutes worth of data?” The following columns give you information about what the dates/times were and the backups involved, plus more-info queries so you can examine the history stored in MSDB, like this:
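The real more-info queries come straight out of the proc’s result set, but the general shape is just a date-bounded query against msdb – something like this sketch, where the database name and dates are hypothetical:

    SELECT database_name, backup_start_date, backup_finish_date,
           type  -- D = full, I = differential, L = log
    FROM msdb.dbo.backupset
    WHERE database_name = 'YourDatabase'  -- hypothetical name
      AND backup_finish_date >= '2021-12-01 09:00'
      AND backup_finish_date <  '2021-12-01 10:00'
    ORDER BY backup_finish_date;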

Wanna performance tune your backups too?

Continue scrolling to the right in the result set, and you’ll find:

  • FullMBpsAvg, Min, Max – the throughput you’re getting in megabytes per second.
  • FullSizeMBAvg, Min, Max – how large your backups are, before compression.
  • FullCompressedSizeMBAvg, Min, Max – how large your backups are, after compression.
  • Similar columns for Diffs, Logs – these are useful if you need to estimate change rates.

I love using backup throughput as an early warning system, like a canary in the coal mine. If backup throughput suddenly drops, it’s a sign that something went wrong with the storage or the networking. We can’t tell you whether it’s a problem with slower read speeds, slower write speeds, or more traffic on the storage network, but it’s just your clue that it’s time to start troubleshooting – because user queries are likely running more slowly too. And when backup throughput drops simultaneously across multiple SQL Servers, that’s an even bigger clue that something went seriously awry with the company’s overall storage.
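If you wanted to automate that canary, one hedged approach – a sketch, not a monitoring system – is to compare the last day’s full backup throughput against a 30-day baseline, straight from msdb:

    WITH speeds AS (
        SELECT database_name,
               backup_finish_date,
               backup_size / 1048576.0  -- bytes to MB
                 / NULLIF(DATEDIFF(SECOND, backup_start_date, backup_finish_date), 0) AS mbps
        FROM msdb.dbo.backupset
        WHERE type = 'D'  -- full backups only
          AND backup_finish_date >= DATEADD(DAY, -30, GETDATE())
    )
    SELECT database_name,
           AVG(CASE WHEN backup_finish_date >= DATEADD(DAY, -1, GETDATE())
                    THEN mbps END) AS last_24_hours_mbps,
           AVG(CASE WHEN backup_finish_date <  DATEADD(DAY, -1, GETDATE())
                    THEN mbps END) AS thirty_day_mbps
    FROM speeds
    GROUP BY database_name;

If last_24_hours_mbps drops well below thirty_day_mbps, start asking the storage and networking folks questions.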

When you sit down at your desk each morning, backups aren’t usually the first thing on your mind. Just take a few minutes today to double-check that you’re actually backing this stuff up as frequently as you think, and that downtime will be as short as you think. I wish I had a dollar for every time a client’s DBA got caught with their pants down because backups weren’t scheduled correctly or were regularly failing at specific days/times.

Oh wait – I actually do have a dollar for each of those times. Several dollars, in fact. Look, let’s not get sidetracked.


[Video] Office Hours: SQL Server Q&A in Cabo at Sunrise


Let’s watch the sun rise at Land’s End and cover your top-voted questions from https://pollgab.com/room/brento/.

Here’s what we covered:

  • 00:00 Introductions
  • 01:04 Jr Wannabe DBA: Hi Brent, a colleague wants to upgrade from Standard to Enterprise purely for performance gains on several regular servers (up to 8 CPU, 128GB of RAM, SQL 2016 or 2019). Is the change worth paying for? I cannot find any benchmarks to support or contradict such a change.
  • 04:12 OhMyKingdomForANetworkEngineer: Hi Brent! You mentioned using DNS CNAMEs to redirect during log shipping failovers. My friend wants to redirect end users to the new production server after an upgrade by changing the CNAME, but knows nothing re: DNS, CNAMEs, or Active Directory. My LMGTFY didn’t help. Is this some kind of AD magic?
  • 06:38 Loud Howard: Hi Brent, what are your top MS hyped features of SQL Server that never caught on?
  • 08:54 SQL Steward: Hey Brent, a friend of mine is trying to explain to his boss why using a function to create a temp table which you then use to re-create your main OLAP fact table every 5 minutes is a crazy bad idea. How would you explain this in plain English? Really appreciate all you do.
  • 10:30 Francesco Mantovani: Hi dad, does linked server handle parallelism on the remote server? Is the OPTION (MAXDOP x) hint worth using in a linked server query? ….you rock, make those turtles go faster!!
  • 12:36 DBAGreg14: Is there any reason to stagger start times for DBCC CheckDB’s on AG replicas? We just added a 3rd node, and have each replica’s CheckDB running on the same schedule. Is there any impact on log replays between nodes when DBCC runs? Would the internal snapshot affect this?
  • 13:48 David: Hi Brent, is it OK to defrag the data files/disk? VM hosted on the VMware platform, databases sized 2-4TB.
  • 14:16 GreatWorkBrent: Were you aware that your videos don’t play in ‘restricted mode’? (my work enforces this mode) It may be due to your graphically violent slaying of common SQL problems, or your profane & incendiary comments about stupid people doing stupid things. Or you have a drinking problem.
  • 15:30 Kumaran: Hi Brent, we have an app with many dynamic SQL SPs. MS suggests parameterizing the execsql calls, and the app team prefers not to rewrite the SPs. But my friend says dynamic SQL should be avoided for better performance and for query stats (agree on stats) – is he right only about dynamic SQL at the app layer?
  • 17:15 Sunrise
  • 17:30 Kumaran: Could you suggest any solutions for sql server patching which is SQL Server feature aware?
  • 18:52 SQL100: When trying to design a Data Warehouse, there will be data coming from different sources- do you recommend to convert the data e.g. into a .csv file so that it can then be imported easily (after all the required filters have been done) or directly from the database? Thank you
  • 20:18 Brave Sir Robin: Hi Brent, Should we do any SQL maintenance tasks non traditionally if we have crazy fast TempDB (local SSD) and crazy fast SAN (NVME SSD)?
  • 21:25 Gob Bluth: Hello Brent, how do you measure write latency to an always on synchronous secondary replica?
  • 22:24 A DBA DBAing: Sometimes I see multiple values for the same parameter in the ‘cached parameters’ section of sp_blitzcache, what is the reason for that?
  • 23:23 Raphra: How do you select the picturesque places, where you stay and stream your office hours from?

[Video] Office Hours: First Post From Cabo


Join me on my balcony in Cabo as I review your top-rated questions from https://pollgab.com/room/brento.

Here’s what we covered:

  • 00:00 Introductions
  • 00:48 md: Hi Brent, as SQL Server works with 8K pages why is the recommended NTFS Unit Allocation Size for SQL volumes 64K? Wouldn’t 8K be a more efficient fit? Sneaking in a second question – I’m guessing with modern storage systems this setting doesn’t have a big impact on performance?
  • 01:47 Ratbert: Hi Brent, is there a good way to be notified when a new table is created on production? Would like to know when a new table is created by anyone on production.
  • 03:08 Dogbert: Hi Brent, do you foresee much commercial opportunity for someone creating online Postgres training classes similar to what you do with SQL Server training?
  • 04:20 Skunkbert: Hi Brent, when will you stream your “awful” playing skills in Dead By Daylight?
  • 05:16 icklemouse: Hi Brent. Pretty new to Azure as Evil Corp is now moving everything up from on prem. Would appreciate a knowledge check. I can find no SQL Db solution that allows a cross region auto-failover to take place. Everything I see involves a manual step. Did I miss something? Cheers
  • 08:00 San: Hi Brent, my DBA has set the autogrowth to 1MB and 10%, and this has now created hundreds of fragments. What is the best approach to reduce these fragments – do you advise defragmentation using contig.exe, or shrinking the databases? I am lost here.
  • 10:06 Jim Ignatowski: Hi Brent, trying to investigate a stored proc that uses cursors. Unfortunately, execution of this sproc (with actual plans enabled) crashes SQL management studio. Too many plans generated I guess. Is there a better tool to capture the query plans for sprocs that use cursors?
  • 11:52 i_use_lowercase_for_select: Hi Brent, hope you have decent Wi-Fi coverage there in Mexico. My customer is using mixed mode authentication for some of its apps. When googling for best practices on that I get inconsistent results. What is your stand on mixed mode vs Windows-only authentication, and why?
  • 13:43 DefaultDBA: Hey Brent, Really enjoying your office hours and recorded season pass training. My question is about INT vs GUID for key fields. We have systems that use both, but wondering if there are reasons to use one over the other.
  • 14:49 Wrap-up

Don’t Use SQL Server to Send Emails.

I would like to unsubscribe from this idea.

I’m not talking about Agent alerts when jobs fail. That’s fine when you can’t get the business to spring for a monitoring system.

I’m talking about using SQL Server to send emails to confirm a customer’s order, alert that a product is low on stock, or other business processes.

SQL Server’s email delivery isn’t very robust. Whoever started the documentation by writing “Database mail is an enterprise solution” had clearly never seen an enterprise solution in real life. Sure, it seemed robust when it came out a couple decades ago, but these days, anything involving the word “enterprise” needs to survive a data center failover. To protect your user databases, you’re likely using log shipping or Always On Availability Groups, and in the future, it’s only getting more complex with SQL Server 2022’s ability to fail over to Azure SQL DB Managed Instances. Database Mail lives in the msdb database and depends on Service Broker. Your messages ain’t coming along for the ride.

When email delivery stops, you won’t notice. People think, “Hey, things must be going great these days. It’s been forever since I’ve gotten a failure email.” In reality, there are hundreds or thousands of emails piled up in some SQL Server somewhere with a broken email delivery subsystem, like an old SMTP server address. You won’t figure it out until it’s too late.
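If you’re stuck with Database Mail anyway, at least check on it periodically. A quick sanity check against the msdb mail tables shows whether anything is piling up unsent:

    -- How many mail items are in each state, and when the newest was requested:
    SELECT sent_status, COUNT(*) AS items,
           MAX(send_request_date) AS newest_request
    FROM msdb.dbo.sysmail_allitems
    GROUP BY sent_status;

    -- Recent Database Mail errors, if any:
    SELECT TOP 20 log_date, description
    FROM msdb.dbo.sysmail_event_log
    ORDER BY log_date DESC;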

SQL Server’s email troubleshooting is almost nonexistent. When things do go wrong, the logging isn’t detailed, and it’s not the system that your sysadmins and email admins are used to working with. They’re going to be shrugging and pointing the finger at you, going, “Well, the rest of our systems are doing email just fine.”

These things aren’t as big of a deal for DBA-focused emails, like job failure alerts. However, if your business users are relying on the email getting there, hand the task over to your developers, and have them use a truly modern enterprise solution like SendGrid or Simple Email Service.


[Video] Office Hours: SQL Server 2008 R2 Edition, I Guess


I went through your highest-voted questions from Pollgab and talked through ’em, and strangely, there was a bit of a 2008R2 theme!

https://youtu.be/3-B5L3Komq4

Here’s what we covered:

  • 00:00 Introductions
  • 02:00 Pete Donnelly: Hi Brent, I have a 7 TB DB and the largest table is 2 TB, has 6 indexes and 6 related statistics. It has 42 “_WA_Sys” statistics. We truncate this table every Jan 1st to start the new year with an empty table. Should I drop the “_WA_Sys” statistics when I truncate this table?
  • 03:58 LifelessDBA: Hi Brent, suppose SQL 2022 GA were available now – is there any kind of feature or enhancement convincing you to think about migrating to SQL 2022?
  • 06:26 2008R2people: Hi Brent, my company is still using SQL Server 2008R2, which will be end of support soon. Is it better to upgrade to a newer version, or stay at 2008R2? Currently our apps are running smoothly on 2008R2. Which version do you suggest we upgrade to if we must?
  • 12:14 Mehdi: Hi Brent! How does the query optimizer calculate the estimated number of rows when there are clustered index seek and residual predicate in the actual execution plan?
  • 13:42 BiggieDB: Hello Brent and all the party people! I have a CTE that is finding all “parent” elements…think finding a manager’s manager, etc. The number of levels is unknown. Noticed that it gets wildly incorrect estimations as a CTE. Is there any better way to write a query like this?
  • 16:01 Trushit: This question is more regarding the consulting career. If you were to choose a technology/tool today to specialize in which one would that be so that you can grow in your career for next decade or so? Does Power BI seem like a good tool to specialize in?
  • 19:14 Redgate State of the Database Monitoring Report: https://www.brentozar.com/go/report
  • 21:37 Paul Mugs: What would you suggest as a broad approach to version control for tables/views? Everyone just points me towards GitHub, but I struggle to see how this translates to what my team and I do in SSMS.
  • 23:15 Stefano: Hi Brent, about “blobs out of the database”: I have a 5TB database with blobs inside, and AlwaysOn (async) replication between the primary and DR site. If we move documents outside, how do we guarantee consistency between the metadata and the file system? And consistency of replication (data + files) on the DR site?
  • 25:28 HD: There are so many valuable T-SQL suggestions & feedback items from T-SQL users on the UserVoice forum, but MS doesn’t act on them. Does MS really read & respect the users’ voice, or is it there just for the name’s sake? SQL Server is still missing a good number of features in the T-SQL area compared to other DBs.
  • 28:06 Dr. Dolittle: Dear Dr. Ozar, can my friend install SQL Server Developer Edition for his dev and test servers, or are there any gotchas? At the moment he uses Enterprise Edition on both servers.
  • 32:33 Chris Weaver: I have been asked to look at a DB in SQL Server 2008 R2. No diagrams or documentation. Is it possible at all to work out what the relationships could be between the tables? There are roughly 500 of the *******
  • 35:52 Would y’all want to play co-op with me, or watch me stream games?

How Do You Test IOPs, Latency, and Throughput?


You’ve heard these terms thrown around about storage, but you’re not sure what they mean or how they relate to each other. Let’s make it simpler by using an analogy.

Say you want to measure a shipping company.

You’re trying to decide whether to use DHL, FedEx, UPS, or your local postal service.

You could measure them by sending me a gift – after all, it is the holidays, and I do a lot of stuff for you year round, and it’s probably the least you could do.

  • You place one box outside
  • You call the shipping company to come pick it up, and start the clock
  • When I get it, I’ll call you to confirm receipt, and you stop the clock

The length of time between when you called, versus when I acknowledged it – that would be how long it takes the shipping company to move exactly one package. Let’s say – just to say – that you sent the package overnight. You called the company on Wednesday at 10AM, and I received it Friday at 10AM.

Our performance tests are:

  • Input/output Operations per Second (IOPs): 1 package every 2 days
  • Latency, how long an operation takes: 2 days
  • Throughput, how many operations are coming through at once: 1 package at a time

That test only makes sense if you’re sending one package at a time.

Sure, in rare cases, you really do passionately care about how long it takes to send me a Christmas gift. (But seriously, why did you pick a 55-gallon drum of Froggy’s Fog? I don’t even own a bubble machine.)

But most of the time, when you’re benchmarking shipping, you’re doing it because you want to ship a lot of things, to a lot of people, at the same time. And indeed, if you try to ship more things at a time, you’re going to get dramatically different test results.

If you set out 10 shipments of Froggy’s Fog, and call the shipping company for a pickup, your metrics magically become:

  • Input/output Operations per Second (IOPs): 10 packages every 2 days (goes up)
  • Latency, how long an operation takes: 2 days (this stays the same)
  • Throughput, how many operations are coming through at once: 10 packages at a time (goes up)

A good load test involves lots of simultaneous operations – ideally, a number of operations similar to what you’d actually need in your daily work, but not too many. Because imagine what happens if you try to put 100 drums of Froggy’s Fog out for pickup, but the shipper’s default van can only hold 10 drums. The van driver will arrive, find too many drums out, and just maybe tell you, “Sorry, we’re going to have to come back tomorrow with a bigger truck to handle this much.”

Or, maybe on the other side of the test there’s a problem. Maybe my front door just doesn’t have that much space, and after dropping off 10 drums, the shipper says, “There’s no space left here – we’re gonna have to drop the rest off tomorrow.”

This isn’t necessarily a shipping company problem – instead, it’s a problem with our planning and our infrastructure.

Storage tests have the same problems.

If you only test with one storage request at a time, you’ll see artificially low IOPs and throughput numbers. Sure, the latency is roughly accurate for one request at a time – but that is rarely a useful number by itself.

If you test with too many requests at a time, you’ll surface different bottlenecks. You may not have enough storage adapters, paths, or even volumes to really take advantage of the storage device.

Even your package size selection influences the test. What if, instead of sending out 55-gallon drums of Froggy’s Fog, you chose smaller gifts, like a bunch of yodeling pickles? You might get an artificially high number of packages shipped per second, but if your workload really does involve 55-gallon drums, then the pickle test isn’t accurate. Similarly, some storage tests purposely aim for artificially high IOPs numbers by using absurdly tiny I/O request sizes.
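Benchmark tools like Microsoft’s diskspd exist to let you control those knobs – block size, outstanding requests, threads. On the SQL Server side, you can sanity-check what latency and I/O sizes your real workload actually sees with a quick sketch against the virtual file stats DMV:

    SELECT DB_NAME(vfs.database_id) AS database_name,
           mf.physical_name,
           1.0 * vfs.io_stall_read_ms  / NULLIF(vfs.num_of_reads, 0)  AS avg_read_latency_ms,
           1.0 * vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_latency_ms,
           vfs.num_of_bytes_read / NULLIF(vfs.num_of_reads, 0)        AS avg_read_size_bytes
    FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
    INNER JOIN sys.master_files AS mf
        ON vfs.database_id = mf.database_id
       AND vfs.file_id = mf.file_id
    ORDER BY avg_read_latency_ms DESC;

Keep in mind these numbers are cumulative since startup – for a before/after comparison, capture two samples and diff them.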

It’s hard to plan storage for SQL Server. On Thursday, I’m doing a free webcast with Pure Storage to talk more about how I approach the problem when building and consolidating servers. See you there!


Going to SQLBits? Join me for my Mastering Parameter Sniffing workshop.


You’re a database developer or DBA with at least 3 years experience performance tuning queries and indexes.

You already know you have parameter sniffing issues, and now you need to figure out how to fix them. In this one-day class, you’ll learn how to reduce the blast radius with index changes, query tuning, and database-level settings. You’ll also see firsthand how SQL Server 2017, 2019, and 2022 try to reduce it with adaptive joins, adaptive memory grants, automatic tuning, and caching multiple plans.

Join me at SQLBits on Tuesday, 8 March, at ExCeL in London for a one-day workshop. I’ll be there in person – my flights are already booked, and I’m excited to see everybody in person again. The conference is taking great precautions: masks are required at the event, tables will be spaced apart, and proof of double vaccination, Covid-19 antibodies, or a negative lateral flow test (within the last 48 hours) will be required.

This course is 100% demos: the only slides are the introduction and recap. The rest of the time, I’ll be working live in SQL Server Management Studio and SQL Server 2022 (assuming the preview bits are ready). You can even follow along on your laptop if you bring SQL Server 2017 or newer and the 50GB Stack Overflow 2013 database.

I’m taking the best modules of my 3-day Mastering Parameter Sniffing class and doing it as a training day session. We’re going to be moving quickly and covering a lot of ground, and I know it can be tough to keep up. That’s why attendees will also get one year’s access to the full 3-day Mastering Parameter Sniffing recordings, too! That’s a £695 value, and it’s free with your workshop admission. (You’ll get the access in class.)

My SQLBits workshops always sell out, and you’ll wanna move fast to save your seat. Talk to the finance folks now to get the credit card primed up and ready so that when registration opens – it’ll open by the end of this month – you’ll be able to grab your seat quickly.

See you in London!


Who’s Hiring in the Database Community? December 2021 Edition


Is your company hiring for a database position as of December 2021? Do you wanna work with the kinds of people who read this blog? Let’s set up some rapid networking here. If your company is hiring, leave a comment.

The rules:

  • Your comment must include the job title, and either a link to the full job description, or the text of it. It doesn’t have to be a SQL Server DBA job, but it does have to be related to databases. (We get a pretty broad readership here – it can be any database.)
  • An email address to send resumes to, or a link to the application process – if I were you, I’d use an email address: you may want to know which applicants are readers here, since they might be more qualified than the applicants you regularly get.
  • Please state the location and include REMOTE and/or VISA when that sort of candidate is welcome. When remote work is not an option, include ONSITE.
  • Please only post if you personally are part of the hiring company—no recruiting firms or job boards. Only one post per company. If it isn’t a household name, please explain what your company does.
  • Commenters: please don’t reply to job posts to complain about something. It’s off topic here.
  • Readers: please only email if you are personally interested in the job.

If your comment isn’t relevant or smells fishy, I’ll delete it. If you have questions about why your comment got deleted, or how to maximize the effectiveness of your comment, contact me.

I close the comments here after 1 week, and publish another post each month in the Who’s Hiring category so y’all can get the latest opportunities.

If you want to be notified as people leave new job comments here, there’s a “subscribe without commenting” option where you leave a comment.


Are You Underpaid? Let’s Find Out: The Data Professional Salary Survey is Open.


We’re data people, you and I. We make better decisions when we work off data instead of feelings.

It’s time for our annual salary survey to find out what data professionals make. You fill out the data, we open source the whole thing, and you can analyze the data to spot trends and do a better job of negotiating your own salary:


Take the Data Professional Salary Survey now.

The anonymous survey closes Sunday, Jan 2. The results are completely open source, and shared with the community for your analysis. (You can analyze ’em now mid-flight, but I’d wait until the final results come in. I’ll combine them into a single spreadsheet with the past results.)

Thanks for your help in giving everybody in the community a better chance to talk honestly with their managers about salary.


Today is the last day to buy a Live Class Season Pass, EVER.


Today is literally it. This is the end of an era.

One of the most popular things I sell is my Live Class Season Pass, which gives you two cool things:

  • The ability to drop in on any live online class that I teach for a year straight, plus
  • The ability to stream the recordings, too

For one low price, students can attend classes whenever works for them, as often as they like, continuously raising their skills. Plus, when an emergency pops up at work, they can watch the recordings after hours to prep for the next day’s class. When they’re getting ready to tackle a problem at work, they can jump back to a specific module, watch the recordings, step through the demos, and even try their hand at the lab assignment to see if they really got it.

I love the Live Class Season Pass, but starting in 2023, I’m going to be teaching far fewer live online classes. I can’t sell an unlimited pass if there’s only a limited number of live classes you can attend. So today is literally the last day you can buy a pass and attend a huge number of classes over the next 12 months. In 2023, I might teach an online class or two, but I’m aiming to scale way back, and folks will have to pay for each class they want to attend.

Today, the Live Class Season Pass is $1,595.

If you want an even bigger/better deal, the Level 3 Bundle is just $100 more, and it adds SQL ConstantCare and the Consultant Toolkit.

But move quick. When the calendar flips over to December, the Black Friday deals shut off, and the Live Class Season Pass is gone forever. See you in class!

Update 2021/12/01 – the Black Friday Sale is over.


[Video] Office Hours: Black Friday Edition, Ask Me Anything About SQL Server


Yesterday, I finished setting up my new Apple MacBook Pro with the new M1 Max processor and ran my first live stream on it on my Twitch channel to see how the hardware held up. I was tickled pink – it worked great!

Here’s what we covered:

  • 00:00 Introductions
  • 03:48 TheGreatFode: Hi Brent, have you ever seen cases where too low an HBA queue depth setting was a problem, and increasing it improved I/O performance?
  • 09:33 NewPadawan: Hey Brent, is splitting tempdb still needed on SSD? At the company I work at, we have about 40 databases, each about 20-40GB, on 1 SSD. We only use 1 tempdb (on another SSD) so far, on SQL Server 2008 R2.
  • 10:39 BusyIsGood: Hi Brent, I’m trying to use sp_BlitzLock to get recent deadlocks but the procedure has started taking over 30 minutes to produce records. What could be causing this delay? Or is this just my life now…
  • 12:15 LookingAtProcsAllDay: Hi Brent I’ve seen many different upsert patterns (some of them considered anti) in TSQL. Everything from: If (not) exists / merge / @@rowcount / updlock / serializable / list goes on… What is your “go to UPSERT pattern” in TSQL and why?
  • 15:20 sol: Howdy, sir, tasty be your beverage! Earlier this century twas like “MaxDOP at 50? – Brent wrote this… – Who’s Brent?!” and now it’s more like “Why..? – Thus Brent Hast Written – Ooh, so wise!”. By giving more and more to the community, are you afraid of becoming “the default”?
  • 18:04 i_use_lowercase_for_select: Hi Brent, my friend who is also my customer has a 100GB database and a terrible application which uses ‘select * from [table]’ for every query. Performance is terrible and now he wants to use ephemeral disks for the data and log files. Can you help me express how bad of an idea that is?
  • 20:25 Lostindb: Hi Brent, I used to put the mdf and ldf in separate physical disk locations. Is that still needed when we use SSDs? Or is it safe to put the mdf and ldf on the same SSD? Thanks!
  • 21:26 RelationalDBA: Hey Brent, love your office hours videos. If you were a Production DBA, what would your approach be to monitor AlwaysON AG? My friend’s office decided to have HA on the DBs and he wants to be prepared.
  • 24:22 Accidental DBA: Hi Brent, sometimes when we run sp_blitzfirst with no arguments, it runs for several minutes (15 minutes). Is this something to be concerned about? Looking at messages tab, it says “Beginning investigatory queries”. SQL2014
  • 25:28 Stuart: Hi, on SQL2016 I have a query using outer-join to empty columnstore index table to benefit from batch mode; on 2019 (compat150) if I remove the outer join to columnstore table it gets a very different plan and no batch mode so seems this is still useful on 2019 – any thoughts?
  • 27:50 Stuart: Hi Brent, I recently created a high-value missing index on a single (computed) column, according to sp_BlitzIndex. When I check the index usage stats DMV days later though it shows it’s not been used at all, any thoughts on why that might be?
  • 28:54 Ivan: Is it necessary to rebuild an index with a page count above 500 and fragmentation above 50% daily, or is once weekly fine?
  • 31:09 Wrap-up

Wanna get alerted whenever I go live? Subscribe to my Twitch channel. See you there!


[Video] Office Hours Speed Round: 17 Answers in 11 Minutes


Not all of your questions need detailed answers. In this speed round, I race through 17 answers in 11 minutes:

Here’s what we covered:

  1. 00:11 UnfortunatelyPostgresIsCheaper: I know you’ve answered that, as far as u know, there isn’t a ‘Brent Ozar’ of Postgres, but as you spend a lot of time with it u must know quite a bit, so what were/are your learning sources?
  2. 00:46 dadepretto: Hi Brent, first thanks for your amazing work! Regarding SQL 2022 Parameter Sensitive Plans filling up Query Store, Bob Ward replied to a comment on the YT Microsoft Mechanics video “we have logic here not to cause too many plans to be cached”. Any thoughts? Is there any hope?
  3. 01:14 Jacob H: Hi Brent, what are your thoughts on the new Azure SQL Database Ledger feature? Do you think there is any future for “blockchains in the database”?
  4. 01:45 suku2022 dba: Hi Brent, performance tuning is something that comes with experience, so I wonder: is there any site with queries or mock scenarios that we can practice tuning to make them run faster?
  5. 02:40 LookingAtProcsAllDay: Hi Brent, waiting on mgr app for Bundle 2. What could explain a slow exponential deterioration in CPU and duration of certain procs that is reset every time my friend does a failover? Visible for the past year on 2017; not enough data points to see if the problem exists on 2019 with 3x more memory.
  6. 03:20 Shy: Hi Brent, For DR Env – When to use Log Shipping verses Availability Groups? What is the point that I have to switch from Log Shipping to Availability Groups with respect to RPO and RTO?
  7. 03:59 Richard: If I duplicate a linked-server connection, except for the friendly name (and with two sys64 connectors), have I doubled the capacity?
  8. 04:15 HybridDBA: Could you mention any equivalent community contributor in Oracle like Brent currently is for SQL Server? Brent, your community mentorship for SQL Server is amazing.
  9. 04:48 Fyodor: Hi Brent, what is your opinion on crypto currencies ? I know this is not related to SQL server but I don’t see where I can ask you anything non-related to sql server.
  10. 05:56 FullTextIndex: Full-text indexing: what is your opinion and experience? My friend has put a full-text index on an XEvent session capturing all SQL statements on an instance.
  11. 06:38 CarlosDBA: just renewed the annual subscription to your classes (3rd year and still loving it!!). Any new courses you are working on (or planning to post in the next year) that we can look forward to?
  12. 07:29 Yawnder: Hi sir. Do you think frameworks like EF should integrate with partition (partition functions, scheme, etc., etc.) or should that be left to the DBAs? Would it be too risky to give that kind of power / responsibility to less “DB-Aware” Devs?
  13. 08:08 Trushit: Is it a good idea to connect tools like Power BI to SQL server via Direct Query mode? Direct Query mode queries data from sql server every time someone interacts with report. I am concerned that there would be too many query hits to live database.
  14. 08:48 Kumaran: What would be the best method/ practices for a query that spans 2 versions of SQL using linked server?
  15. 09:24 PerformanceIsMyLife: If I have an SSIS package that is essentially an ELT and then logic to process – is there a performance difference vs if I did the same process scripts after initial data import via stored procedure instead of part of the SSIS package?
  16. 09:47 Tancredi: Hi Brent, do you have any thoughts on using a pre-existing production SQL Server (that admittedly sees low usage) for DPM 2019? My friend’s boss is pushing for it to save money on licensing.
  17. 10:11 KondzioSSJ4: Hi Brent I have many-to-many relations between the 2 tables Should I create a connection table with separated auto-increment ID or maybe put PK for both columns? or should I do something else to get the best performance? (connection table have unique values)

[Video] Office Hours on the Balcony Part 2


Picking up right where we left off yesterday, we continue down through the not-so-highly-upvoted questions from https://pollgab.com/room/brento.

Here’s what we covered:

  • 00:00 Introductions
  • 00:22 Andrew from Canada: To Null or not to Null? That is the question. On tables with 100M to 1B rows and with several columns that contain data 20% of the time, should we still allow nulls or set them to an empty/zero value instead? If it’s just returned but not queried, does that matter?
  • 02:02 FARAN: Hi Brent, is separating MDF and LDF on different physical drive still needed when we use SSD?
  • 04:34 Maciej: Hi Brent, Recently Amazon made available Babelfish for Aurora PostgreSQL. Could it seriously threaten Microsoft business model and force them to ship more improvements to SQL Server engine (e.g. like Parameter-sensitive plan optimization)?
  • 06:59 That’s All Folks – already answered
  • 07:17 Michael Devor: Brent. My friend found a plan w/missing index recommendation. For testing he created it but the plan does not show it used. Could the engine be using some part of the index without reporting it? Could there be a reason to keep the index? (nothing else would seem to need it)
  • 09:10 MancDBA: Hi Brent, what are your thoughts on DBA job postings that ask for an “Expert” on various topics, e.g. “Expert at AlwaysOn” or “Expert at TSQL”? I have 10 years’ experience – should I be an “Expert” by now?
  • 11:16 Faruk Ami: Hi Brent, I want to create a report of the top CPU-consuming queries on our SQL Servers, but some of them only run Entity Framework queries 🙁 How would you approach this? I tried using Query Store, but we are running multi-tenancy (schema level), which makes it a bit hard to investigate.
  • 12:20 Maciej: Hi Brent, What was the highest value for missing index (returned by sp_BlitzIndex) that you have ever seen?
  • 13:13 DrDrakken: Hi Brent! Learned a lot from your fundamentals training and started my own experiments on cardinality estimation. What I don’t understand (and neither does my friend): even simple queries with a sort request way too much memory, even when the returned row estimates are accurate (SQL 2019).
  • 14:45 Steve E: I know it’s maybe too early to say but do you think SQL Server 2022 will finally fix parameter sniffing for good?
  • 16:30 MikeO: I have heard you mention the cost of SQL Server – at what point should processing be done within the application instead of SQL Server?
  • 18:30 FrustratedAnalyst: Hi Brent, many thanks for hosting these office hours sessions, I get a lot out of them! As an analyst who wants to move over to data engineering or administration, how would I impress potential employers when I don’t have much working experience in those areas?
  • 20:19 Mike: Brent, can you change the default SQL Server port 1433 years after the server has been configured? What steps or links can I read before doing this task? Need to do this for IRS Audit purposes.
  • 21:33 Jim: Hi Brent. Thanks for doing these! We have a vendor app with a ~3 TB DB, mostly one table (docs in DB…grrr). The DB is stored on a Netapp LUN (flash storage). Would you have any recommendations here, as to whether it would be good to add a database file?
  • 23:53 Todd Chittenden: You always state that there are enough poorly configured SQL Servers to keep us all employed. I’m having a hard time convincing bosses, clients, etc. that SQL performance tuning is needed. How do I get this off the ground? Any suggestions?
  • 26:22 SayItAintSo: Do I understand from your black Friday sale “The final time I’m selling it!” that you might be retiring??? (: Asking for many friends.
  • 28:41 Oleg: Hi Brent, in your trainings you often mention things you wish developers would do. What would be your ultimate checklist or wish list that you would like all database developers to keep in mind when developing a new application?
  • 30:48 Torben Iisager: Hi. During rollback and database recovery we check the status, but the percent complete fluctuates up and down and we don’t know why (and don’t know where to look for documentation on this issue). The recovery/rollback completes despite that, but I would like to know why the number fluctuates.
  • 32:03 LookingAtProcsAllDay: Hi Brent, love your office hours. A friend just upgraded to 2019 at the same time as the hardware. He is seeing reduction in CPU and Duration of some of the most used procs. But query store metric on memory grant has gone up 8x. New hardware has 3x the memory. Is this OK
  • 35:04 DataWriter: My friend’s table has 25 indexes on it. dm_db_index_usage_Stats shows many are unused/barely used: 12 indexes are .002% of the total index usage, but 53% of the total user_updates. His manager isn’t a SQL guy. Is there a way to quantify the performance gain from removing them?
  • 36:33 Wrap-up

NOLOCK Is Bad And You Probably Shouldn’t Use It.


When you put NOLOCK in your query, SQL Server will:

  • Read rows twice
  • Skip rows altogether
  • Show you changes that never actually got committed
  • Let your query fail with an error

This is not a bug. This is by design. To understand what happens, read these blog posts in order:

  1. Using NOLOCK? Here’s how you’ll get the wrong query results.
  2. “But NOLOCK is okay when my data isn’t changing, right?”
  3. “But surely NOLOCK is okay if no one’s changing data, right?”

After reading those, you’re going to ask, “So what am I supposed to do if I need to avoid blocking problems?” There are 3 solutions:

  1. Have enough indexes that your queries go fast, but not so many that your deletes/updates/inserts take forever. I cover this in the locking module of my Mastering Index Tuning class.
  2. Keep your transactions short and sweet, and use batching to avoid lock escalation. I cover this in the deadlocking module and the batching module of my Mastering Query Tuning class.
  3. Use read committed snapshot isolation (RCSI) or snapshot isolation (SI) so that readers and writers can coexist. I cover this in the isolation levels module of my Mastering Server Tuning class.
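For that third option, the database-level switch itself is a one-liner – but test it in a non-production environment first, and be aware that ROLLBACK IMMEDIATE kicks out in-flight transactions. A minimal sketch with a hypothetical database name:

    ALTER DATABASE YourDatabase  -- hypothetical name
        SET READ_COMMITTED_SNAPSHOT ON
        WITH ROLLBACK IMMEDIATE;  -- rolls back in-flight transactions

There’s more to RCSI than flipping the switch – version store growth in TempDB, for starters – which is why it gets a whole module in class.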

I know: NOLOCK sounds so much easier because you can just slap it into a query and it feels like it’s going faster. Just remember the demos in those first 3 blog posts up above: sooner or later, your users are going to see inaccurate query results.

If it’s so bad, why is NOLOCK an option?

Because in some applications, accuracy really doesn’t matter. I know it sounds like a joke, but I’m being serious.

For example, one of my clients is a high-volume online shopping site. Their admins want to see that data is moving through the shopping cart process – that users have been adding new shopping carts in the database. They use monitoring queries to check the last 10 carts that were created and the last 10 that checked out. They don’t really care whether the results are transactionally accurate – they just wanna see that new orders are moving through the system. NOLOCK works fine there.
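That kind of monitoring query is exactly where NOLOCK earns its keep – a sketch, with hypothetical table and column names:

    -- Rough "is anything moving?" check – accuracy doesn’t matter here.
    SELECT TOP 10 CartId, CreatedDate  -- hypothetical columns
    FROM dbo.ShoppingCarts WITH (NOLOCK)  -- hypothetical table
    ORDER BY CreatedDate DESC;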

Another example is the open source First Responder Kit scripts like sp_BlitzIndex. I don’t need transactional accuracy when I’m checking the usage statistics on an index – we’re just getting rough numbers. If the same index happened to be shown twice, you would simply go, “Well that’s odd,” and you’d rerun it to check.

But if your application involves money, health care, or anything else vitally important, you better think twice before using NOLOCK.


Free Live Class This Week: Fundamentals of TempDB


You’ve been working with Microsoft SQL Server for a couple of years, and you know a little bit about TempDB. You’ve heard that temp tables and table variables have different performance characteristics than regular user database tables, and you’ve heard that you’re supposed to have multiple data files.

You’re wondering what exactly goes on behind the scenes.

This class is for curious folks who want to learn:

  • What uses TempDB, and how that affects performance: not just temp tables and table variables, but also triggers, cursors, sorting indexes, workspace spills, the version store, and more
  • How to host TempDB depending on how you use it: understanding what goes in the log file, what goes in the data files, why the number of data files matters, and whether your particular workload should be on SAN, local storage, or ephemeral storage, and why (there’s a quick layout check right after this list)
  • How to monitor it: after you understand what uses it and how to host it, you need to keep tabs on how its performance changes over time using DMVs, Perfmon counters, and the First Responder Kit scripts
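Curious where you stand today? You can check your current TempDB layout – file count, sizes, and where the files live – with a quick query:

    SELECT name, type_desc, physical_name,
           size * 8 / 1024 AS size_mb  -- size is stored in 8KB pages
    FROM tempdb.sys.database_files;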

This course is 90% demos: the only slides are the introductions at the start of the day, illustrations to support a few topics, and the recap at the end of the day. The rest of the time, we’ll be working in SQL Server Management Studio. Roll up your sleeves and join me!

Join me live, for free, on Tuesday 8:00-16:00 UTC in Europe-friendly times, and again Wednesday in Americas-friendly times, 9AM-5PM Eastern. Register here.


[Video] PASS Data Community Summit Keynote: 5 Ways the Cloud Impacts Your Career

Succeeding as a data professional is all about placing the right bets at the right times. The more you know about what’s happening around you, the better bets you can place.

I’ve been working with databases for over two decades. In this session, I’ll invite you into my home to have a frank, honest discussion about where our industry has been recently, and where it’s heading next. We’ll look at how that will impact the available jobs in our industry, and how it impacts where you focus your spare training time. We’ll even look at job ads and resumes in an effort to help you stand out.

In one hour, I can’t make you a pro at everything – but I can help you understand where you need to be a pro, and how to make your resume reflect that. Join me for this recording of my PASS Data Community Summit keynote:

If you liked this session, there are plenty more, all free at PassDataCommunitySummit.com.

Big thanks to Redgate for organizing this year’s PASS Data Community Summit, and I’m honored to have been involved! I’ve already got my travel booked for the in-person event in Seattle in 2022, too. See you there!


[Video] Office Hours on the Balcony Part 1


I hang out on my balcony in San Diego with a gin & tonic and a bunch of your highly upvoted questions from https://pollgab.com/room/brento.

Here’s what we covered:

  • 00:00 Introductions
  • 01:33 FishCakesForTea: I have designed a filtered index for a specific query but SQL Server doesn’t use it unless I move the columns in the WHERE clause (in the index and the query) into the INCLUDEd columns as well. Why do I have to do that? These columns are not in the SELECT statement or a join.
  • 04:25 HeavenKnowsIAmMiserableNow: Sometimes I tune a query and the logical reads are lower but the duration is consistently higher even after a few executions. What are the likely causes and is using reads as a performance yardstick misguided?
  • 06:13 NotMySelf: Hi Brent, what are your thoughts on the new multiple plans per query feature in SQL Server 2022?
  • 09:06 Alexandre Jobin: How can you know if your SQL Server is over-provisioned (or under-provisioned)? We are planning to go to the cloud and we are questioning the server specs.
  • 11:02 bobthebuilder: Hi Brent, my friend has a table of 18M rows with no primary key or clustered index, but there are 6 non-clustered indexes, a few on nullable and nvarchar columns. Is it advisable to change one of the indexes to a clustered index for better performance, even though it would have duplicate values?
  • 13:13 DrDrakken: Hi Brent! My friend’s DB has tables with a lot of columns. For reasons, this is not easy to change. Is the rule of max 5 indexes per table still valid if you have about 50 columns, where 4 of them are foreign keys and therefore already have an index?
  • 15:36 UseLessFellow: I am excited to join the class, but at the same time, one of your suggestions is that getting a certification is useless because it doesn’t provide any job opportunity. If that is the case, after your course is done, what are the opportunities you will provide? And please provide a path.
  • 18:29 Peter Sawatzki: Hi Brent, would you put tempdb files on unmirrored storage volumes or unmirrored SSD volumes?
  • 19:26 Mordac: Hi Brent, do you have any best practices for optimizing linked server query performance (i.e. putting servers on same switch, etc)?
  • 21:14 i’m_a_null_set: Hi Brent. Love your content and classes. Asking for a friend: Is there a hierarchy for which pages SQL server will flush out first when it needs memory for queries? Are some pages more expendable than others, such as indexes that won’t get used by the query?
  • 23:26 The SQL Guy: Hey Brent. Let’s say you have a query that INNER JOINS the orders table twice, using different fields, so different indexes are used. When you look at the execution plan, there is a troublesome Key Lookup. How do I know which index I should modify to get rid of the Key Lookup?
  • 25:50 Thunderous roars
  • 26:29 Doug: On average, after becoming a DBA, how long does a person experience “imposter syndrome?”
  • 29:07 Stranger in the night: Hi Brent! Long time listener, first time caller. Who does the drawings of you that we see when you post?
  • 30:29 Gio: Who is the “Brent Ozar” of PostgreSQL?
  • 31:47 Wrap-up
