Where do I file bugs and feature requests for Microsoft Data Platform products?

It’s not just you: it’s hard for all of us to figure out where to go when we find a bug or want to add a feature. Here’s a quick rundown:

  • Azure Data Studio – open an issue in the Github repo. As you open an issue, Github searches the existing issues while you type, so you’ll find out if a similar issue already exists.
  • Azure SQL DB – file feedback requests here. However, the search functionality is pretty terrible: before you open a request, try searching Google with the site:https://feedback.azure.com/forums/217321-sql-database parameter, like this search. That restricts your search results to just that portion of Azure Feedback.
  • Azure SQL DB Managed Instances – file feedback requests here.
  • Power BI – file ideas here.
  • SQL Server – file feedback requests here.
  • SQL Server Management Studio – right now, this one is kinda in flux. If you click Help, Technical Support in SSMS, it takes you here, but that’s the same feedback forum as SQL Server. There’s also a SQL Server Tools forum at MSDN monitored by Microsoft staff, but it’s not kept up to date – for example, the top pinned note talks about where to submit defect reports, and it still refers you to Connect, which has been dead for over a year.
  • SQL Server Data Tools – like SSMS, this one’s in flux, and your best bet is the SSDT forum on MSDN.

There’s a bottleneck in Azure SQL DB storage throughput.

As you pay more for Business Critical Azure SQL DB servers, they’re supposed to get more storage throughput. The documentation on this is kinda hard to read, but boiling it down, for every core you add to a Gen5 server, you’re supposed to get 2,500 IOPs. That should scale linearly: insert speed should go up evenly with each added core.

The bottom line of this screenshot shows the IOPs going up linearly from 16 cores at 40,000, up to 200,000 IOPs for 80 cores:

The bottom line

But can an 80-core server write any faster than a 16-core server? Well…no.

Sounds like a trick question, doesn’t it?

Let’s start with the Stack Overflow database, which has a 104GB Posts table – 18GB of which is off-row text data because Lord knows y’all like to write some long questions and answers. Then we’ll run this simple query to copy that Posts table to a new one:
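Something along these lines – the destination name dbo.Posts_Copy is illustrative, not necessarily the exact query from the test:

SELECT *
INTO   dbo.Posts_Copy
FROM   dbo.Posts;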

The query will go parallel, using more cores for the reading & writing – so will more cores help? Plus, when you move from an 8-core Azure SQL DB to an 80-core one, you get more memory, and theoretically, more storage throughput.

Drum roll please:

  • 8 cores, $1,991 per month: 64 minutes
  • 16 cores, $3,555 per month: 32 minutes (and interestingly, it’s the same speed with zone redundancy enabled)
  • 80 cores, $18,299 per month: 32 minutes
  • Just for reference: 8-core AWS EC2 i3.2xl VM, $1,424 per month with SQL Server Standard Edition licensing: 2 minutes (and I don’t put that in to tout AWS, I just happen to have most of my lab VMs there, so it was a quick comparison)

You can spend five times more, but you hit a wall at 16 cores.

So why don’t more cores equal dramatically more speed? Your first thought is probably, “Well, Brent, that query isn’t CPU-bottlenecked” – and you’re only partially right, but also pretty wrong. Here’s the Azure Portal showing performance metrics on the 8-core setup while the insert is running:

The very light-colored line banging up against the top at 100% is the transaction log IO percentage. Our workload is simply limited by how much Azure is willing to write to disk at any one time. The CPU percentage is 6%, and the read IO is 7%. Even with only 8 cores, we’re paying for CPUs that are sitting around bored.

Upgrade the server from 8 cores to 16 cores, and the load speeds double – which is great! Now, what’s our bottleneck? Here are the load metrics on the 16-core box:

We’re still transaction-log-file-bottlenecked at 100%, and the only way to get more transaction log throughput is to buy more cores. So throw 80 cores at it, and:

The only thing happening here is that the CPU load is going down, but storage throughput isn’t getting faster. Our workload completes in the same 32 minutes.

This also leads to some interesting MAXDOP behavior. On all of the servers, restricting the query to MAXDOP 1, 2, 4, or any other number didn’t seem to affect the query’s runtime at all. The storage limit is the storage limit, full stop.
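For example, here’s the same copy restricted to four cores (again using the illustrative dbo.Posts_Copy name) – it ran in the same time as the unhinted version:

SELECT *
INTO   dbo.Posts_Copy
FROM   dbo.Posts
OPTION (MAXDOP 4);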

The takeaway: bottlenecked on transaction log IO? Right now, over 16 cores is a waste of money.

Just because sys.dm_db_resource_stats shows that you’re hitting a resource limit, and the documentation says that the limit will go up linearly as you add more cores, don’t trust that it will. You’re going to have to experiment with your workload and different Azure SQL DB sizes in order to find the sweet spot for how much you pay for how fast your query will go. Trust, but verify.
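A quick way to see which limit you’re actually hitting: sys.dm_db_resource_stats keeps roughly an hour of history in 15-second samples, so something like this shows the recent picture:

SELECT TOP (20)
       end_time,
       avg_cpu_percent,
       avg_data_io_percent,
       avg_log_write_percent
FROM   sys.dm_db_resource_stats
ORDER  BY end_time DESC;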

You might be spending more money, but you’re hitting the same limits a 16-core server hits!

The old-school DBAs out there will say, “Yes, but if the transaction log file is your limit, you shouldn’t add more CPUs – you should be adding storage.” That’s the problem: in Azure SQL DB, you simply can’t. The documentation says you can by adding more CPUs, but as of this writing, it just isn’t true. I don’t have any inside information here, but there seems to be a problem with Azure SQL DB’s storage scalability. Either there’s a bug in the storage limits that were set, or the documentation isn’t right, because the throughput simply isn’t rising once you get past 16 cores.

Update 2019/02/26 – clarified that we’re talking about the transaction log file, not the error log file, per the discussion on Reddit.

Building SQL ConstantCare®: Europeans, what do you want to see in a cloud-based monitoring product?

About a year ago, with GDPR enforcement looming, we decided to hit the brakes on EU sales because we weren’t confident enough in the third party app ecosystem’s compliance capabilities, or the EU’s ability to police the regulation effectively.

In the time since, it’s been interesting to see that:

  • Few folks requested their data or the right to be deleted
  • Among others, Google got hit with a GDPR fine
  • The compliance ecosystem has gotten marginally better, but still not great
  • Brexit (or maybe not, who knows)
  • The EU’s working on a new copyright filter law (Julia Reda’s writing on that has been phenomenal)

Looking at that list, it’s not that I’m suddenly filled with an overwhelming confidence that it’s easy and cost-effective to build a product with GDPR compliance, AND to defend a company against an EU country’s GDPR inquiry. However, I do think that it’s time to start looking at what it would take to sell SQL ConstantCare® in the EU. I specifically mean to sell it though – not just what it would take to be compliant, but have a cloud product that Europeans would open their wallets for. (And I’m not even sure that’s a possibility yet, thus the post.)

Brent Ozar at the Porsche Museum

I’d like to house the EU’s most prized possessions in a United States garage, specifically mine

As we go down that road, I’m curious about Europeans’ opinions to a few questions:

  • Would your company even be open to a solution like that?
  • What kinds of objections have your managers raised about cloud-based products?
  • Have you tried to implement a cloud-based product, and failed? Or succeeded?
  • Between Amazon’s regions (Frankfurt, Ireland, London, Paris, and Stockholm), is there one that your company absolutely insists that its data must be stored in? (Strictly speaking, EU data residency isn’t a technical requirement, but companies often have political requirements.)

Let me know what you think in the comments. I’m not making promises that we’ll sell to the EU by the end of the year – but it’s something that we’re starting to consider.

What should you do about memory dumps?

A SQL Server monitoring tool is warning you that you’re getting memory dumps, and you’re wondering what that means.

This isn’t normal: you’re encountering a situation where SQL Server is so freaked out that it’s unable to proceed normally with a query. In order to help you (and really, Microsoft) debug the problem, it’s calling a time-out, writing the contents of some of its memory to disk, and then proceeding.

Think of it like going to a gas station, getting a questionable hot dog, and then realizing that you need to make an urgent trip to the bathroom to … undo that decision. So reading a dump file is kinda like…actually, this analogy is more apt than I’d originally thought!

Start by querying sys.dm_server_memory_dumps. Run this query:
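A minimal version of that query looks like this (the size math is just to make the numbers readable):

SELECT [filename],
       creation_time,
       size_in_bytes / 1024 / 1024 AS size_mb
FROM   sys.dm_server_memory_dumps
ORDER  BY creation_time DESC;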

That’ll tell you how many times it’s happened, and where the dump files live:

sys.dm_server_memory_dumps

If you’re comfortable with a debugger, read the post Intro to Debugging a Memory Dump by Microsoft’s Adam Saxton. Adam writes:

Having some development experience is also helpful. While you may not need to look at Code directly in a dump, you are looking at the results of code. The concept of Pointers is sometimes hard for someone to grasp that doesn’t necessarily have a programming background, but when dealing with a memory dump, or debugging, the understanding of how Pointers work can be very helpful. Even when we are debugging a Managed (.NET) application instead of a Native (C++ or non-managed) Application. Another good book is Windows via C/C++ by Jeffrey Richter and Christophe Nasarre.

Don’t feel bad if that sounds terrifying – it’s terrifying to me too. I don’t read dump files either.

If you’re not comfortable with that, patch your SQL Server, then call Microsoft. I dunno about you, but I’ve never been particularly good with a debugger, and I certainly don’t wanna learn when I’m facing a production SQL Server that’s freaking out so badly that it’s dumping memory contents to disk.

Hit up SQLServerUpdates.com to grab the latest patch for your SQL Server and apply that. You’d be pleasantly surprised at how often Microsoft fixes bugs that cause memory dumps, and the bug you’re hitting has probably already been fixed. (I’d hate to have you waste money on a support call only to hear that the bug is already fixed.)

After applying the patch, if the memory dumps continue, open a support case with Microsoft. I know: you may have had rough support experiences in the past where you were told to defrag your turn signal fluid and reboot the firewall, but repeated memory dumps are different. Memory dumps are much easier for them to analyze because they see ’em more often, and they have more specialized tools at their disposal. (For a brief moment in time, you could upload your memory dumps for free in SSMS, but Microsoft took that down due to GDPR security concerns since the dumps can contain personally identifiable information like queries and data.)

Don’t dilly-dally: memory dumps are not a small deal. You wouldn’t wanna be making repeated trips to the bathroom after multiple gas station hot dogs – that’s no way to live your life. Your SQL Server deserves better.

You know what your deadlock graphs need? Animation.

In SQL Server Management Studio, plain ol’ deadlock graphs look like this:

Two thumbs down, would not lock the dead again

BOOOOO-RING. That’s why I prefer opening deadlock graphs in SentryOne Plan Explorer, which presents a much better visual in the form of a circle…a circle, let’s just stop there:

Deadlock graph in SentryOne Plan Explorer

And Plan Explorer even has several different ways it can visualize deadlocks – hit the Layout Type dropdown at the bottom to choose:

Collect the whole set

However, during the last Mastering Index Tuning class, @Dan_Clemens pointed out the Play button at the bottom – somehow I’d totally missed that over the years! Check this out:

Playing the circular deadlock dance

And all of the layouts support playback, too! Here’s playback in the Force Directed layout:

And in the Layered Digraph:

Somehow, that makes troubleshooting deadlocks suck just a little less. Thanks, Dan!

Updated First Responder Kit and Consultant Toolkit for February 2019

I hereby christen this the Rich Benner Memorial Release. He’s still alive, it’s just that we’ll always remember him for the work he put into this month’s version. (I’m kidding, of course. We won’t remember him. (I’m kidding. Rich will appreciate the humor in that.))

You can download the updated FirstResponderKit.zip here, and Consultant Toolkit customers can grab their updated version in their My Accounts page. EU customers – check your email for the updated version.

Across-the-Board Changes

  • Improvement: you can now call the procs with @VersionCheckMode = 1, and they’ll just set their version number & date output variables, then immediately return without doing any work. (#1949, thanks Jeff Chulg for the great idea and implementation.)

Here it is in action:

And the code for your copy/pasta delight:
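Something along these lines – sp_Blitz shown here, and the same @VersionCheckMode pattern works with the other procs:

DECLARE @Version VARCHAR(30), @VersionDate DATETIME;

EXEC dbo.sp_Blitz
     @VersionCheckMode = 1,
     @Version          = @Version OUTPUT,
     @VersionDate      = @VersionDate OUTPUT;

SELECT @Version AS Version, @VersionDate AS VersionDate;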

sp_Blitz Changes

  • Improvement: new check for SSAS, SSIS, SSRS services running. (#1943, thanks Rich Benner.)
  • Improvement: added Azure SQL DB’s POOL_LOG_RATE_GOVERNOR as a poison wait. (#1971.)
  • Fix: updated out-of-support version list to mark that SQL 2012 pre-SP4 is now an unsupported build. (#1967, thanks Rich.)
  • Fix: typo in check IDs where check 222 was reporting that it was 114. (#1964, thanks Rich.)
  • Fix: typo in documentation where checks 203 and 224 were swapped. (#1966, thanks Rich.)

sp_BlitzCache Changes

  • Improvement: autocorrect for sort orders. If you pass in something that’s kinda-sorta-near a real sort order, we correct it. (#1945, thanks Rich Benner.)
  • Improvement: Azure SQL DB and Hyperscale compatibility. (#1935)
  • Fix: faster performance when you have a lot of missing indexes. We were missing a field in a join between two temp tables. (#1956, thanks Ron MacNeil for the eagle eye.)
  • Note: in this release, #1935 also renamed all of the ##bou_ temp tables to drop the bou prefix – so, for example, ##BlitzCacheResults instead of ##bou_BlitzCacheResults. This was done to reduce the BOU branding in the Consultant Toolkit in case your customers start reading the scripts.

sp_BlitzFirst Changes

  • Improvement: in the headline-news result set, batch requests/sec and wait time per core per second are now shown as decimals instead of integers. (#1940, thanks Rich Benner for being the perfect class attendee who actually improves the tools during the class.)
  • Improvement: added Azure SQL DB’s POOL_LOG_RATE_GOVERNOR as a poison wait. (#1971.)

sp_BlitzIndex Changes

  • Improvement: better index naming – removed the table name from the index to tighten up execution plan visualization. (#1938, thanks Rich Benner for doin’ it again.)
  • Fix: to work around a bug in sys.dm_db_stats_histogram and sys.dm_db_stats_properties, we now only show statistics for tables when your database context is the same as the database you’re analyzing. This means if you wanna examine stats, you’ve gotta be in the same database as the object you want to analyze. (#1947, thanks Morten Abrahamsen.)
  • Fix: works again with Azure SQL DB. (#1933, thanks Jacob Golden.)
  • Fix: output to table works again. (#1952, thanks Aram Pendley for the bug report and Matt Monroe for the bug fix.)
  • Fix: compression information truncation error for over 900 partitions. (#1973.)

Power BI Dashboard: Deprecated

You can totally keep using this, but you’re on your own for making improvements. Sadly, Power BI + source control = terrible nightmare. It’s a binary file, so it’s damn near impossible to deal with code contributions from outsiders.

Putting Power BI files in Github is bad for customers because when we update the Power BI file, you have to take it whole-hog, as-is. You can’t make your own tweaks, and then keep those tweaks with each release. You have to redo all your tweaks in their new version.

Putting Power BI files in Github is bad for open source maintainers because when someone wants to contribute changes, either they have to give you step-by-step instructions as to how to make the changes inside the PBIX file, or else you have to take their changed PBIX file as the new source of truth for your project. It’s like trying to source control Excel files when multiple people around the world are making changes.

The PBIX isn’t going away – it’s just moving into the Deprecated folder in Github. It’s still open source under the MIT License, so folks are absolutely welcome to take it and go start their own open source projects.

For Support

When you have questions about how the tools work, talk with the community in the #FirstResponderKit Slack channel. If you need a free invite, hit SQLslack.com. Be patient – it’s staffed with volunteers who have day jobs.

When you find a bug or want something changed, read the contributing.md file.

When you have a question about what the scripts found, first make sure you read the “More Details” URL for any warning you find. We put a lot of work into documentation, and we wouldn’t want someone to yell at you to go read the fine manual. After that, when you’ve still got questions about how something works in SQL Server, post a question at DBA.StackExchange.com and the community (that includes us!) will help. Include exact errors and any applicable screenshots, your SQL Server version number (including the build #), and the version of the tool you’re working with.

You can download the updated FirstResponderKit.zip here, and Consultant Toolkit customers can grab their updated version in their My Accounts page. EU customers – check your email for the updated version.

What does Azure SQL DB Automatic Index Tuning actually do, and when?

Azure SQL DB’s Automatic Tuning will create and drop indexes based on your workloads. It’s easy to enable – just go into your database in the Azure portal, Automatic Tuning, and then turn “on” for create and drop index:

ENGAGE

Let’s track what it does, and when. I set up Kendra Little‘s DDL trigger to log index changes, which produces a nice table showing who changed what indexes, when, and how:

Contents of Kendra’s logging table (click to zoom)
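Kendra’s post has the full version; a bare-bones sketch of the same idea, using a hypothetical dbo.IndexChangeLog table, looks like this:

CREATE TABLE dbo.IndexChangeLog
(
    LogId       INT IDENTITY(1,1) PRIMARY KEY,
    EventTime   DATETIME2(0)  NOT NULL DEFAULT SYSDATETIME(),
    LoginName   SYSNAME       NOT NULL,
    EventType   NVARCHAR(100) NULL,
    ObjectName  NVARCHAR(256) NULL,
    CommandText NVARCHAR(MAX) NULL
);
GO

CREATE TRIGGER trg_LogIndexChanges
ON DATABASE
FOR CREATE_INDEX, ALTER_INDEX, DROP_INDEX
AS
BEGIN
    DECLARE @e XML = EVENTDATA();

    INSERT dbo.IndexChangeLog (LoginName, EventType, ObjectName, CommandText)
    VALUES (ORIGINAL_LOGIN(),
            @e.value('(/EVENT_INSTANCE/EventType)[1]',  'NVARCHAR(100)'),
            @e.value('(/EVENT_INSTANCE/ObjectName)[1]', 'NVARCHAR(256)'),
            @e.value('(/EVENT_INSTANCE/TSQLCommand/CommandText)[1]', 'NVARCHAR(MAX)'));
END;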

I wanted to do that because the documentation is a little hand-wavy about when these changes actually take effect:

You can either manually apply tuning recommendations using the portal or you can let Automatic tuning autonomously apply tuning recommendations for you. The benefits of letting the system autonomously apply tuning recommendations for you is that it automatically validates there exists a positive gain to the workload performance, and if there is no significant performance improvement detected, it will automatically revert the tuning recommendation. Please note that in case of queries affected by tuning recommendations that are not executed frequently, the validation phase can take up to 72 hrs by design.

To give my Azure SQL DB some horsepower to crank through these truly terrible queries, I gave it 8 cores, then fired up SQLQueryStress to run 20 simultaneous queries against it, and let it run for a day.

Setting up my workload

I took the queries from lab 2 in my Mastering Index Tuning class. We start with the Stack Overflow database with every table having a clustered index, but no nonclustered indexes at all. Students run a workload with SQLQueryStress, which populates the missing index DMVs and the plan cache, and then you have to put the pieces together to figure out the right indexes. Needless to say, it’s a Mastering class, and it’s purposely designed with pitfalls: if you rely on Clippy’s missing index recommendations, you’re gonna have a bad time.

I didn’t expect Clippy to be a master by any means – this automatic tuning feature is included with your Azure SQL DB bill at no extra charge. It’s free index tuning. I’m pretty much happy with ANY index improvements when they’re free! It’s a great feature for businesses that can’t afford to have a professional performance tuner hand-craft the finest artisanal indexes. (Also, those folks make mistakes.)

After letting the workload run overnight, here’s what Azure SQL DB’s performance metrics looked like:

100% CPU and 100% IO – for those of you who are new to performance tuning, the idea isn’t to score 100 points out of 100. (At the same time, it’s not to score 0 either, because then you’re overspending on resources.) We definitely need some index help – and good news, sp_BlitzIndex reports lots of opportunities for tuning:

Zooming in a little:

Plenty of opportunities on the Posts, Votes, Comments, Badges, and Users tables.

What did Automatic Tuning PLAN to do?

Over in the Azure portal, you can see what recommendations Automatic Tuning came up with:

I love it! All three of those look like good choices. I’m not disappointed that it only picked indexes on 3 of the 5 tables, nor am I disappointed that it picked so few indexes – I would always much rather have it err on the conservative side. That’s great!

That page in the portal doesn’t show the index includes, but if you drill down a level on each index, the includes are shown:

You can even click “View script” to get the creation T-SQL. I do wish they’d name ’em with the fields they’re on, but I understand that might be challenging given that folks may use really long field names. I can live with these names.

The Badges and Users indexes exactly matched the missing index DMVs’ recommendations. However, the Votes index was different, and it’s different in a way that gets me all tingly inside. Here were the missing index recommendations on the Votes table:

And here was the recommendation from Azure:

Azure managed to merge the first two recommendations together in a way that works successfully! Neither of the DMVs had included all 4 of those fields in a single index, but Azure did. Oh, Clippy, I love you, you’ve gone off to boarding school and passed high school chemistry. (Or maybe this is Clippy’s big sister, and I’m falling in love. Look, let’s not get bogged down in this metaphor.)

What did it ACTUALLY do, and when?

The above screenshots imply that this create-index statement is executing right now. The documentation says “Executing” means “The recommendation is being applied.” but that definitely isn’t true: the Users index shows as “Executing” as of 10:03PM, but hours later, the index still wasn’t there, and no create-index scripts were running.

I guessed maybe Azure was waiting for a lull in the load (love it!) so I stopped the load test and waited. About half an hour later, the index on Users magically appeared. I say “magically” because the DDL trace on index creations didn’t capture what Azure did. That is a bit of a bummer because it’ll make tracking what happened just a little bit harder. We’re going to have to use one mechanism (DDL triggers) to track user-modified indexes, and another mechanism to track Azure-modified indexes.

Azure exposes its tuning actions in sys.dm_db_tuning_recommendations:
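You can query it directly – a minimal example:

SELECT name, type, reason, state,
       valid_since, last_refresh, details
FROM   sys.dm_db_tuning_recommendations;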

Buckle up: the state and details columns are…JSON. Here are the example details for an index. <sigh> Developers love JSON because they can change the contents without having to change the database structure – they can just name a column “details” and then stuff it full of hot garbage without any forethought. I know, it’s not normalized stuff like you’re used to with the DMVs, but stay with me – this will come in handy a few paragraphs from now.

I happened to catch one of the indexes as it was being created. Here’s the live query plan in PasteThePlan:

It’s building the index with the Power of Dynamic SQL™. Here’s the query it’s running – it’s parameterized, but I switched the parameters into a declare, and ran it through format-sql.com to make it easier to read:

Interesting. That means the script you get in the portal isn’t exactly the same index creation script that Microsoft is actually running – note that this one is terminated with a semicolon, for example, but the portal’s version wasn’t.

How did that affect the server while it ran?

Regular readers may recall our recent adventure, How fast can a $21,468/mo Azure SQL DB load data? In that, I kept bumping up against transaction log throughput limits that seemed strangely low, like USB thumb drive slow.

Well, this is only an 8-core (not 80-core) server, so I did expect index creation (which is a write-heavy activity) to bump up against the log limits hard, and it did. Here’s a screenshot of a 60-second sample of sp_BlitzFirst while the Votes index creation was happening:

A few notes:

  • Log file growing – this strikes me as really odd because I’d loaded this entire database from scratch recently, going from 0 to 300+ GB. The log file shouldn’t have been small, and there hadn’t been write activity in the last half-hour before the index creations. There should have been plenty of space in the log file – which leads me to think that Microsoft is actually – and I hope you’re sitting down for this – shrinking the log file. For shame. That’s a real bummer, causing customer slowdowns for something that we all know is a terrible anti-pattern.
  • CXCONSUMER waits with an average of 467 seconds each? Each?!?
  • INSTANCE_LOG_RATE_GOVERNOR – ah, our throttle friend.
  • Physical write volumes (bottom of the screenshot) – the 8-core instance was able to write about 2GB of data in 60 seconds. Again, terrible USB thumb drive territory, but…not that much worse than the 80-core instance, which really makes me question what’s happening with the scalability of the IO in Azure SQL DB. (I’ve got a blog post coming up on that, and the short story is that while the documentation says IO scales linearly per-core, it…doesn’t. Not even close.)

After the index creations finished, I fired up my workload again.

How Automatic Tuning validates its changes

After the workload started again, the Azure portal showed the measurements it was taking for each of the indexes:

Badges index savings

This Badges index seems to have paid off well with huge DTU savings, although one of my queries regressed. (I’m curious how they’re measuring that given that many of the queries in my workload are joins across multiple tables, 3 of which got new indexes in this last round of index tuning.)

The Users table didn’t fare quite as well, only seeing minor improvements:

DTU savings on dbo.Users

In a perfect world, I’d love to be able to click on “Queries with regressed performance,” see the queries involved, and figure out how to hand craft better indexes for them. Unfortunately, while you can go to Query Insights, you’re left to start over with the filtering, trying to figure out which queries were involved.

The portal doesn’t do that, but…we might be able to, someday. Remember how I said that the JSON data in sys.dm_db_tuning_recommendations would come in handy? Well, for the automatic-plan-forcing part of Azure’s Automatic Tuning, the improved & regressed query plan IDs are stored in the JSON, and Grant Fritchey wrote a query to fetch ’em. There’s no reason Microsoft couldn’t do that same thing with the improved & regressed queries for the automatic index tuning – after all, they made the Details column a generic JSON dumping ground. They could add the query data in there, and here’s a feedback request to make that happen.
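For reference, the plan-forcing side already exposes the query and plan IDs in that JSON, and you can pull them out along these lines (this follows the documented planForceDetails shape; the index-tuning rows don’t carry query IDs yet):

SELECT reason, score,
       JSON_VALUE(details, '$.planForceDetails.queryId')           AS query_id,
       JSON_VALUE(details, '$.planForceDetails.regressedPlanId')   AS regressed_plan_id,
       JSON_VALUE(details, '$.planForceDetails.recommendedPlanId') AS recommended_plan_id
FROM   sys.dm_db_tuning_recommendations
WHERE  type = N'FORCE_LAST_GOOD_PLAN';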

About six hours later, 2 of the 3 indexes showed as validated. Here’s the validation for the Votes index:

I feel so validated

Note that 2 queries got better, and 1 query got worse – but I can’t tell what that one query is. Again, no drilldown on “Queries with regressed performance,” and the Query Insights tab just takes you to the database’s overall insights, leaving you to start searching from scratch. Sure would be nice to have this here feature.

Summary: I love it.

Things I love:

  • It’s conservative
  • It’s doing deduplication, merging recommendations together
  • It’s not trying to apply the indexes during peak loads
  • It does a way better job of validating its changes than most DBAs

Things I don’t mind:

  • The execution scheduling is opaque
  • The index names aren’t great
  • The indexes take a long time to apply (but that’s an Azure SQL DB storage limitation, apparently)

Things I don’t like:

  • The validation report doesn’t show me which queries got worse or better (and I know Azure knows)

This feature is really appealing for the classic use case of a database server that doesn’t get any tender loving care from a database administrator. I’m sure the deeper I look at it, the more edge cases I’ll find, but…who cares? I have that same feeling about Clippy’s missing index recommendations, but I still rely on those every day as a starting point when I parachute into a new server. Used wisely, they can help you do a better job of tuning and let you focus more on the hard edge cases.

Building SQL ConstantCare® Version 2: Easy-to-Afford Monitoring

tl;dr – SQL ConstantCare® is now just $59 per month, or $495 per year.

As we were building SQL ConstantCare®, I wanted it to be mentoring, not monitoring. I wanted to take daily data samples, trend them over time, and give you big-picture advice on what tasks you should focus on to make your SQL Servers faster and more reliable.

I envisioned it having two parts:

  • Our robots would give you daily tactical advice
  • I’d check in weekly to give you bigger-picture strategic advice

Over the last year, I learned a lot!

Our robots are keeping you pretty busy. The basic care and feeding of a production SQL Server is more challenging than most folks expect. (For example, over the last week, 391 servers have databases with no recent corruption checks, and 168 servers have databases without a full backup.)

Our robots are getting smarter, too. Over the last couple of months, we’ve been gradually rolling out query plan advice. If you’re sending in query plans, then each Monday, we tell you which queries to focus on tuning, and what changes you need to make.

Implicit conversion warning

For example, if we detect that your most resource-intensive queries are suffering from implicit conversion, we tell you, and we include the query plans attached to the email so you can see exactly which ones to fix.

That’s still in early access because we still have work to do on that front. For example, early access user feedback noted that:

  • We need to add the database name in the query plan file name to make it easier to know which queries can be ignored (like from 3rd party apps, or known in-house tools)
  • We need to filter out commonly known queries & databases (for example, you can’t really tune Ola Hallengren’s maintenance scripts, or the queries run by popular monitoring tools)
  • We need to filter out really low-impact queries (because some servers are just flat out bored, and there’s no return-on-investment in tuning them)
  • We need to let you quickly mute specific queries (as in, yes, I know that query is terrible, and we have no plans to fix it because the person who wrote it left, and we’re replacing that part of the app)

You don’t necessarily need the video training bundled in. A lot of folks know what a particular warning means, and for warnings they haven’t seen before, they have plenty of resources available in the more-info links in each email anyway. We could save them money by not bundling the video training if folks don’t need it.

You don’t necessarily need the active mentoring, either. In a lot of shops, even the most basic robot alerts are keeping your hands so full that you don’t need to hear me check in. In many others, the servers just aren’t having bad performance problems, so there’s no sense in spending time diagnosing them – users just bought SQL ConstantCare® to keep an eye on the server’s overall health.

So we’re relaunching it: simpler, with an even lower price.

SQL ConstantCare® is now just $59 per month, or $495 per year.

That’s one easy-to-digest price that covers daily monitoring of all of your SQL Servers, regardless of their location. As always, it’s priced by email address that we send the alerts to – and we’ve found that most shops are just sending the alerts to a single distribution list email, and that keeps their costs down.

Then, if you need to step up to mentoring, you’ve got plenty of options: the Recorded Class Season Pass (and a team version), live training classes, or consulting with me.

As existing subscribers approach their renewal date, their subscription renewal pricing will automatically be updated to the new lower price in the my-account page. (Europeans, fret not: we’re working on a GDPR-friendly release, and I’ll write about that in next week’s post.)

[Video] How to Capture Baselines with sp_BlitzFirst

Everybody tells you to capture baselines – but what exactly are you supposed to capture? Which Perfmon counters matter? How do you track which data files are growing, and which ones are slow? How can you track which queries are using the most resources?

All you have to do is install the free, open source sp_BlitzFirst and sp_BlitzCache from our First Responder Kit, create a database to hold your results (I like to call mine DBAtools), and then set up an Agent job to run it every 15 minutes:
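The Agent job step ends up looking roughly like this – the database and table names are just the ones I tend to use, so adjust to taste:

EXEC dbo.sp_BlitzFirst
     @OutputDatabaseName          = 'DBAtools',
     @OutputSchemaName            = 'dbo',
     @OutputTableName             = 'BlitzFirst',
     @OutputTableNameFileStats    = 'BlitzFirst_FileStats',
     @OutputTableNamePerfmonStats = 'BlitzFirst_PerfmonStats',
     @OutputTableNameWaitStats    = 'BlitzFirst_WaitStats',
     @OutputTableNameBlitzCache   = 'BlitzCache';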

I show you how to do it and query the baseline data in today’s webcast:

Should you run SSAS/SSIS/SSRS on the SQL Server?

When you’re building a new SQL Server, you’re going to see a few intriguing checkboxes during setup.

SQL Server installation wizard

It’s all free, right? Check everything!

The services are all free, right? You can just check the boxes for Machine Learning Services, R, Python, Data Quality Services, PolyBase, Integration Services, Analysis Services, and Reporting Services – it’s not like you have to put more coins in the front of the server. So what’s the drawback? Why not collect ’em all? What are the pros and cons?

The more you install, the harder performance troubleshooting becomes. They all drink from the same pool of available CPU, memory, storage, and network throughput. When one of them is performing slowly, you’ll be constantly asking questions like:

  • “Is this slow because one of the other services is using all the power?”
  • “Is this service’s power consumption affecting other critical apps?”
  • “How can I monitor which one is using CPU/memory/etc at any given time?”

Patching and upgrades are harder, too. If your team actively uses these services, then at some point, they’re going to want to patch or upgrade just a portion of the services. Classic example: the BI team wants to move to a newer version of Analysis Services to support a cool new reporting tool, but the main app developers can’t certify their app on a newer version of the relational engine yet. If everything’s on the same box, this gets a lot harder. (Coordinating downtime is especially tricky.)

Uptime management is harder. If you want to make any of these services highly available, then you need to get clusters, load balancers, and things like Always On involved. Each of the features has its own unique ways of building high availability, and when you pile them into the same box, you’re not making your job easier.

However, Standard Edition licensing is easier. If you just bought enough licenses to cover this one VM or physical box, then you probably want to consolidate these services all onto that same box. You probably can’t afford to split them out across multiple boxes. (On the other hand, if you’re using Enterprise Edition and licensing per VM host, then you’re much better off splitting these into different VMs for the reasons we mentioned above – it’s not like it’ll cost you more for SQL Server licensing.)

Piling lots of services into the SQL Server does indeed have its costs – even when licensing is “free.”

How fast can a $21,468/mo Azure SQL DB load data? (Updated)

Update March 19: Microsoft has since acknowledged a hidden limit, then documented it, then raised it – but it’s still disappointingly slow.

In my last post, I explored how fast a $5,436/mo Azure SQL DB Hyperscale could load data. I’d had a client who was curious about spinning their dev environment up there to see how query plans might look different. Well, as long as I was running this test, I thought – “How does this compare with Azure SQL DB?”

There are 3 different products with similar names: Azure SQL DB, Managed Instances, and Hyperscale. In this post, I’m specifically talking about Azure SQL DB – not Managed Instances or Hyperscale. Managed Instances have a way easier method to ingest data – simply restore a database backup – so that one isn’t really worth exploring. There are a bunch of similar restore-a-backup feature requests to make restores easier in Azure SQL DB, but they don’t exist today, so we’re stuck with the same options we had with Hyperscale:

To do this quickly and easily, I went with the first option (as I did in the last post with Hyperscale.)

Attempt #1: 80 cores, $20,088 per month.

The Data Migration Assistant and its documentation strongly suggest that you shouldn’t start small: you should over-provision your Azure SQL DB during the migration process:

“Microsoft strongly recommends that you temporarily change your Azure SQL Database to performance level P15 during the migration process for the optimal migration experience.”

That seems fair – a P15 is 4,000 DTUs and up to 4TB storage. I was using the new Business Critical vCore model, though, so I maxed it out at 80 cores, 408GB RAM, $20,088 per month:

Gen 5, 80 cores, 408GB RAM, $20K USD per month

Note that it says up to 200,000 IOPs, 1-2ms latency – that sounds like a pretty good target for data ingestion. Nothing in the wizard (or the documentation that I could find) seemed to tie data size to storage throughput – which is surprising, because I’ve seen that kind of relationship all over the cloud – but since it wasn’t mentioned, I just left it at 400GB (my database is 340GB.)

I fired up the Data Migration Assistant and let ‘er rip:

Migration in progress, one table done

You can’t read anything into the time numbers on this screenshot compared to the last post because the Data Migration Assistant doesn’t load tables in the same predictable order every time, and the times represent the point-in-time in the total load at which the table completed.

Why couldn’t it go faster? sp_BlitzFirst makes it easy to find out:

Maxed out

Note the “Database is Maxed Out” lines at the bottom, and in the details at the far right, we were bumping up against the max allowed log write percent. Ouch. The Azure portal showed writes maxing out too:

Log writes maxing out at 100%

Well, that was disappointing. (Note that in these screenshots, you’ll see server & database names switch around – that’s because I tried these tests with several different servers and instances in the hopes of getting past the log throughput limits.)

Attempt #2: 80 cores, 4TB storage, $21,468 per month

In the Azure portal provisioning wizard, I only had one remaining slider, and I cranked it as far to the right as it would go, upsizing my storage to 4TB (even though I only needed 340GB for the database) – hoping it would upsize my log rate throughput:

This one goes up to 11, and by 11, I mean 4

One of the awesome things about the cloud is that you can make these kinds of changes on the fly, and sure enough, it took less than 3 minutes for Azure to report that my storage upsize was complete.

Except I was still banging up against the log rate governor:

Maxed out my wallet

There are several interesting things to note in that screenshot:

  1. At the time of this screenshot, queries are waiting on writelog waits (but you can’t rely on point-in-time checks of individual queries)
  2. I’m hitting my DTU limits, and I’m getting poison LOG_RATE_GOVERNOR waits
  3. I’m averaging 96.85% of my allowed log throughput – not peaking at, but averaging
  4. Log backups are happening at the same time, which is probably also affecting my log rate throughput, but I can’t blame Microsoft for taking log backups
  5. In a 60-second sample, 4GB of data was read from the logs (most likely the log backups)
  6. In that same sample, 1.6GB was written to the data file, and 1.7GB to the log file, which averaged a nice and tidy zero milliseconds for the log file writes (very nice) – but in 60 seconds, that’s only about 28 megabytes of data per second to the log file. That’s USB thumb drive territory.

Eventually, it finished in 3 hours, 52 minutes – remarkably similar to the Hyperscale speeds:

And at first you’re probably saying, “Well, Brent, that means you’re bottlenecked by the source. Both Hyperscale and Azure SQL DB had the same ingestion speeds, and they have something in common: your data source.” That’s what I thought too, so…

Attempt #3: synthetic local loads

Just in case there was a problem with something with my Azure SQL DB instance, I blew it away and started again from scratch. I also took the Data Migration Assistant and networking completely out of the equation. I created a new 80-core Azure SQL DB, maxed out on storage, and loaded a table with junk contents from sys.messages – this way, I didn’t have to worry about any cross-server communication at all:
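The load itself was just junk data cross-joined out of sys.messages into a new table – a stand-in version (the row count and table name here are arbitrary):

SELECT TOP (50000000)
       m1.message_id, m1.language_id, m1.severity, m1.[text]
INTO   dbo.LoadTest1
FROM   sys.messages AS m1
CROSS JOIN sys.messages AS m2;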

Waits while that happens:

Synthetic load test

We’re not hitting the log rate governor, but we might as well be. In any given 60-second sample, we waited for hundreds of log file growths, and we managed to write less than 2GB to the log file. 30MB/sec of write throughput is simply terrible – even if we’re growing out a 2GB log file, that’s just terrible for $21,468 per month. You really could do better with a USB thumb drive.

To really stress things out, I started 8 sessions all running that same insert query (with different table names.) I started hitting the log rate governor again, while still only writing about 1.5GB per minute:

Wait stat during 8 simultaneous loads

And then all of a sudden, my wait stats dropped to nothing! Things were flying, because…uh…wait a minute…

Well, that’s not good

Huh. All my queries had crashed at once. That’s not good, and I’d dig deeper to do a repro query, but…the beach is calling.

They left the database in a bit of an odd state, too – the tables were there, with pages allocated, but no rows, indicating they’d all rolled back:

Empty heaps

That’s…not ideal.

Summary: Azure SQL DB’s log throughput seems kinda slow for $21,468 per month.

This is faster.

You measure storage with 3 numbers:

  • How many operations you can perform (often abbreviated as IOPs)
  • The size of each operation (operations * size = total throughput, usually measured in MB/sec or GB/sec)
  • Latency of each operation (how long individual operations take to complete)

The latency was great – fast and predictable – but the log rate governor is capping throughput. It means you can’t do too many things at once – but boy, those few things you can do, they sure are fast. Having reliably low-latency operations is great for day-to-day transactional workloads that only do a few writes per user transaction, but…for data ingestion, it makes for slow going.

For imports, I really wish Azure SQL DB would have a mode to let you temporarily optimize for ingestion throughput rather than low latency. I don’t need sub-millisecond latency when I’m dealing with insert commands coming from another machine (especially less-than-optimized commands like the ones from Data Migration Assistant, which even ran into blocking problems during my loads, blocking themselves across threads.)

Have any of you migrated a sizable database (at least 100GB) into Azure SQL DB? If so, did you measure how long it took, like how many GB per hour you were able to import? Any tricks you found to work around the log rate governor?

Update March 9: Microsoft acknowledged & documented the limit.

Microsoft contacted me, agreed that there’s a limit, and that it wasn’t documented for the public. They’ve now updated a few pages:

  • Single database limits – now shows the transaction log is limited to 20 MBps for General Purpose, and 48 MBps for Business Critical.
  • Those limits are reached at 8 cores for GP and BC Gen4, and 16 cores for BC Gen 5.
  • Log limit workarounds – like loading data into TempDB or using a clustered columnstore index for compression (a sketch of the columnstore approach is below).
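A minimal sketch of the columnstore workaround – compressed rowgroups mean fewer bytes have to squeeze through the throttled log (the table and column names here are illustrative):

CREATE TABLE dbo.VotesStaging
(
    Id           INT      NOT NULL,
    PostId       INT      NOT NULL,
    VoteTypeId   INT      NOT NULL,
    CreationDate DATETIME NOT NULL,
    INDEX cci_VotesStaging CLUSTERED COLUMNSTORE
);

INSERT dbo.VotesStaging WITH (TABLOCK) (Id, PostId, VoteTypeId, CreationDate)
SELECT Id, PostId, VoteTypeId, CreationDate
FROM   dbo.Votes; -- hypothetical source table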

Ouch. 48 MBps really is bad, especially at $21,468 per month. This is such a great example of understanding the limitations of a product before you make that leap.

Update March 19: Microsoft raised the limit.

Within a few days of documenting the limit, Microsoft must have realized that it was just set entirely too low, and quietly raised it:

It’s twice as high now – 96 MB/sec – but…still uncompetitive with modern USB 3.0 thumb drives. Good to see that it’s moving in the right direction.

How fast can a $5,436/mo Azure SQL DB Hyperscale load data?

A client asked, “How quickly could we spin up a full copy of our database in the new Azure SQL DB Hyperscale?” Their database size wasn’t too far off from the 340GB Stack Overflow database, so I decided to migrate that to Hyperscale to see how the experience went.

Hyperscale is Microsoft’s intriguing competitor to Amazon Aurora. Hyperscale gives you most of the SQL Server compatibility that you’re used to, but a wild new architecture under the hood. Let’s see how that new design affects ingestion rates.

Setting up Azure SQL DB Hyperscale

Hyperscale’s setup in the Azure portal is a little easier since it’s a new product: you just pick how many cores you want and how many replicas, and that’s it. You don’t have to define the database size ahead of time because, uh, this isn’t 2005. (Seriously, can we just talk for a minute about how archaic it is that Azure SQL DB makes you pick a maximum size, yet it doesn’t affect the pricing? They’re just handing you an uncooked chicken breast with their bare hands. You just know that’s going to come back to haunt you later.)

Azure SQL DB Hyperscale in the portal

I maxed out my core count for the database migration – I can always downsize it later after the ingestion finishes. About pricing – Hyperscale is in preview right now, and I wanna say preview prices are half of general availability prices, but don’t quote me on that.

Attempt #1: firing up the Data Migration Assistant (but in an AWS VM)

You can’t just hand Microsoft a database backup and restore it into Hyperscale. (Azure SQL DB suffers from this same limitation, but Managed Instances do not.) If you wanna move a SQL Server database into Hyperscale, you have the same options that you have with Azure SQL DB:

I didn’t want to hassle with transactional replication just to get a development database up to the cloud, and I didn’t care about downtime, so I went with the first method, and followed the instructions to maximize performance during my move (except for one thing that I’ll mention at the end, and why it didn’t seem to matter.) If this was a production migration scenario, I’d probably spend a lot more time digging into the last option, but my eyes started glazing over in the documentation once I learned I had to set up custom VNets, and I had limited time to pull this project off.

I ran the Data Migration Assistant, and dear reader, let’s pause for a moment to give Microsoft a round of applause for that app. It’s delightfully easy to use, full stop. Connect to the source server, connect to the target, pick the stuff you wanna move, it warns you about problems, and we’re off to the races. It’s just powerful enough to get the job done, but easy enough that I can see a lot of just-barely-technical admins stepping through it. Nicely done.

And off we went.

However, the load speeds were…not what I was hoping. 20 minutes into it, I realized the database was going to take several hours, and I didn’t really wanna leave a $5,436 per month server running for a day. Plus, if anything broke about the migration, and I had to start over, I would be one angry fella. The DMA doesn’t have a resumable way to keep going if something goes wrong midway through loading a big table, although at least I can pick which tables I want to migrate. (That’s where the Azure Database Migration Service makes so much more sense for production workloads, and I’m gonna be honest: after the ease of use of the Data Migration Assistant, I’m really tempted to spend some time with the Database Migration Service just to see if it’s as elegantly simple.)

Good news: no artificial log write throttling!

I monitored performance with sp_BlitzFirst on both the SQL Server source and the Hyperscale target. The SQL Server source was all local solid state NVMe storage, so it wasn’t a read performance bottleneck. The Hyperscale wait stats were interesting though:

Before you read too much into this – note that wait stats changed dramatically during the loads. Having said that, annotated notes:

  1. At the time of this screenshot, queries are waiting on PAGEIOLATCH waits, reading data pages from data files, but you can’t rely on point-in-time checks of individual queries. Those 200-300 millisecond delays are a bit of a bad foreboding of what’s to come, though.
  2. A log file is growing – that’s something I saw repeatedly during my tests, and that makes perfect sense given that I’m ingesting ~340GB of data.
  3. Top wait type is ASYNC_NETWORK_IO, which makes sense given that we’re moving data across networks. This is a case where I wish the Data Migration Assistant pushed data across even more threads because I don’t think we’re bottlenecked on the target write side. However, the storage waits – PAGEIOLATCH – are kinda concerning, especially with the THREE TO FOUR HUNDRED MILLISECOND delays. Those are not small numbers.
  4. In a 60-second sample, we only managed to write 1GB to the data file and 1.5GB to the log. Even combined, that’s only 41.5 MB/sec – that’s USB thumb drive territory. Log file latency at 4ms is great given Hyperscale’s storage architecture, though.

Here’s another screenshot from a different point in the loads – note that this time around, we have similar storage throughput (2.7GB written in total in 60 seconds) yet near zero storage waits:

Hyperscale waits

Note what’s not in either screenshot: any governor waits. From the wait stats side at least, it didn’t look like we’re being throttled from writing more data. To drill deeper and examine sys.dm_db_resource_stats directly to see if we’re hitting the avg_log_write_percent limits:

sys.dm_db_resource_stats

Not even close to being throttled. That’s fantastic!

So was attempt #1 faster? No, but…

The Hyperscale FAQ notes that:

The transaction log with Hyperscale is practically infinite. You do not need to worry about running out of log space on a system that has a high log throughput. However, the log generation rate might be throttled for continuous aggressive workloads. The peak and average log generation rate is not yet known (still in preview).

I love that wording – it’s like they discovered a new species, not like they wrote the code themselves. My guess, and this is just a guess, is that the throttle limits are set artificially high right now while they figure out the best place to set the limits.

  • The good news is that loads weren’t being artificially throttled
  • The bad news is that they might implement throttling as we approach GA (but hey, live for today)
  • The great news is that if I put work into performance tuning this migration, I’d likely be able to ingest data way faster

Well, I’m curious, and my client was curious, so let’s do this.

Attempt #2: nearby Azure L8 VM as source

My original source SQL Server was in Amazon because that’s where the client keeps their stuff today too, but given the lack of log write throttling in Azure SQL DB Hyperscale, I decided to try with an Azure VM source.

My Hyperscale target database was in West US 2, so I spun up an Azure VM in that same region. To give Hyperscale a faster/closer source, I used an L8s v2 VM – that’s 8 cores, 64GB RAM, and over 1TB of local (ephemeral) NVMe storage. Most production database workloads aren’t run on ephemeral storage since of course you can lose the database when the VM goes down, but you can mitigate that with a lot of defensive scripting and Availability Groups. That is left as an exercise for the reader. For today, YOLO.

I logged into the newly created VM, downloaded the 340GB Stack Overflow database via BitTorrent, deleted the former tables in Hyperscale, installed the Azure Data Migration Assistant, and started the migration again. Full disclosure: because the import had gotten maybe 10% finished on attempt #1, some of the log file growths were already out of the way. (I deleted the Hyperscale tables rather than deleting the entire database and starting over.)

And…Hyperscale still maxed out at about 1.6GB in any given 60-second sample:

Still about 1.6GB per minute

And sys.dm_db_resource_stats still showed nowhere near the log_write_percent limits, whatever they’re set to:

sys.dm_db_resource_stats

Hmm. Disappointing. Well, as long as we’re here…

Attempt #3: synthetic local loads

Just in case there was a problem with something with my Hyperscale instance, I blew it away and started again from scratch. I also took the Data Migration Assistant and networking completely out of the equation. I created a new 80-core Hyperscale database, maxed out on storage, and loaded a table with junk contents from sys.messages – this way, I didn’t have to worry about any cross-server communication at all:

I ran that same query across 8 sessions, but with different table names on each so they wouldn’t conflict. Log write volumes did about double, doing about 3.6GB of log file writes in 60 seconds:

And no, we weren’t anywhere near the limits:

But now I’m pretty sure the “limits” are set to an imaginary number just to let the hardware do as many writes as it can. I like that. I also like the Autobahn.

Attempt #4: letting it run overnight

I spun up another 80-core monster – this time in East US – and again, can I just take a second to high-five Microsoft for completing an 80-core Hyperscale deployment in 3 minutes, 19 seconds? This is probably the biggest reason I get excited about the cloud: operational flexibility.

Then, I fired up the Data Migration Assistant and just let it run overnight, rackin’ up charges, moving the ~340GB Stack Overflow database (with no indexes) from an AWS East VM to my Hyperscale database. (Granted – not the best case scenario for Hyperscale – but in reality, most folks are probably looking at moving from on-premises, not Azure to Azure.) The migration finished after 4 hours and 47 seconds:

Hyperscale import complete: 4 hours, 47 seconds

~340GB in 4 hours is about 85GB per hour, or 1.4GB per minute. After the load finished, the Azure portal showed that my peak log consumption was 39.43% of the limits. Based on that, we can extrapolate that the limits are somewhere around 3-4GB per minute (1.4GB per minute ÷ 0.3943 ≈ 3.6GB per minute). (Actually achieving those limits is another matter, though, as I noted during the synthetic testing.)

Summary: not bad for this test scenario.

Hyperscale is a brand new product, and I wasn’t exactly giving it the best case scenario for ingesting data:

  • I was importing across clouds
  • I was using a non-performance-tuned app (Data Migration Assistant is easy, but it’s no SSIS)

But I think this represents a pretty good sampling of what customers are likely to try as they dip their toes into Hyperscale. In my next post, I’ll look at how a $21,468/mo Azure SQL DB fares under this same test.

[Video] An Introduction to GitHub for DBAs

Distributed source control is really intimidating: branches, pull requests, merges – will somebody just take my code, for crying out loud? Why does it have to be so complicated and involved?

I’m with you: I hated GitHub. For years, I struggled with it, but I’ve come to a gradual truce. I’m not a GitHub pro by any means, but in about an hour, I can explain the most important terms to you in a way that’ll make sense for non-developers. I’ll show you how to contribute to someone else’s open source project, and how to get started putting your own scripts under source control.

The resource links:

What Queries Does Microsoft’s Telemetry Service Run On Your SQL Server?

I was working in my lab and I was having some odd blocking waits that I wasn’t expecting. I started firing up sp_WhoIsActive to see what queries were running, and I got a rather odd surprise:

sp_WhoIsActivelyRunningTelemetryQueries

Ah-ha, the telemetry service, also known as SQLCEIP! Starting with SQL Server 2016, your database server phones home to Microsoft by default. I clicked on the sql_text to see what query was running:

Brent Ozar in a tin foil hat

The trick is blocking the mind control waves while still letting the headphones get a signal

Well, whaddya know: that’s where my lock wait times were coming from. The SQL Server telemetry service was trying to query system objects using the default isolation level, which means they would get blocked. (We avoid that problem in sp_BlitzIndex with SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED.)
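
If you’re writing your own diagnostic queries, the fix is a one-liner at the top of the batch – here’s a minimal example:

-- Dirty reads are fine for diagnostic metadata queries; this keeps them from
-- getting blocked by user transactions that are holding locks.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

SELECT COUNT(*) AS index_count
FROM sys.indexes;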

That got me to thinking – why not run a trace on SQLCEIP?

I captured a trace for a few hours after SQL Server started up. Here’s the 1MB trace file (trc) output readable with Profiler, and here’s a 1MB SQL Server 2017 database backup with the trace output stored in table. If you use the table version, this will help you analyze the queries involved:
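
The table name depends on what you call it when you load that backup – dbo.TelemetryTrace below is just a placeholder – but the analysis boils down to grouping the trace’s TextData, something like this:

-- dbo.TelemetryTrace is a placeholder name for wherever the trace rows landed.
-- TextData comes back as ntext, so cast it to something groupable first.
SELECT CAST(TextData AS NVARCHAR(4000)) AS query_text,
       COUNT(*) AS executions
FROM dbo.TelemetryTrace
WHERE TextData IS NOT NULL
GROUP BY CAST(TextData AS NVARCHAR(4000))
ORDER BY COUNT(*) DESC;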

Some of the interesting queries include…

Microsoft is getting your top 30 user database names.

They don’t return the size, but they do pull the database names with this query:
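
The literal query text is in the trace download above – what follows is my paraphrase of its shape, not the exact SQLCEIP text:

-- Paraphrased reconstruction, not the literal telemetry query:
-- the top 30 user databases, ranked by size, returning just the names.
SELECT TOP 30 d.name
FROM sys.databases AS d
JOIN sys.master_files AS mf ON mf.database_id = d.database_id
WHERE d.database_id > 4
GROUP BY d.name
ORDER BY SUM(mf.size) DESC;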

That query really surprised me because I hadn’t expected Microsoft to bring back database names – the rest of the queries seemed to take extra steps to only get database IDs or group data together by databases, but not this one:

Databases by size

Huh. I gotta think that’s an oversight. I have clients who consider database names to be confidential data since they have their client names or legal matters as part of the database name. (Not that that was a great design, but that ship has sailed.)

A few – a very few – of the other queries also return database names, like this one:

Most of them work more like this, just returning database ids:

I wouldn’t be surprised if, after this gets published, somebody goes through the telemetry queries looking for database names, and changes those queries to use something like the database-class approach used in other queries below. (Not to mention fixing the default read-committed isolation level bug that started me on this whole hunt – some of the telemetry queries use read uncommitted, and some don’t.)

They’re looking for SharePoint, Dynamics, and…AdventureWorks?

They’re categorizing databases by name:
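
The real query is longer, but the pattern is one big CASE expression over database name patterns – here’s an abbreviated paraphrase, with LIKE patterns and class labels that are my shorthand rather than the literal query text:

-- Abbreviated paraphrase of the pattern, not the literal telemetry query:
SELECT database_id,
       CASE
           WHEN name LIKE '%AdventureWorks%' THEN 'sample'
           WHEN name LIKE '%SharePoint%'     THEN 'sharepoint'
           WHEN name LIKE '%MSCRM%'          THEN 'dynamics'
           ELSE 'other'
       END AS database_class
FROM sys.databases;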

The output looks like this:

I was kinda hoping AdventureWorks would be shown as “economy class”

I love this – if you’re going to analyze which customers are using SQL Server features, you want to avoid false positives. The sample databases like AdventureWorks use all kinds of features, so you wouldn’t want to count those as customers actually leveraging, say, spatial data. These database names are also specific enough that they’re going to avoid most false positives.

Most of the queries are uncommented,
but the security ones seem to have comments.

The queries that return data about encryption, certificates, and the like usually seem to have comments. I’m guessing that was done on purpose, trying to assuage folks’ fears. If I told a manager, “Microsoft’s telemetry is sending back what kind of encryption you’re using, how many columns are encrypted, whether your backup is encrypted, etc.,” they would probably sit up a little straighter in their chairs. I’m guessing the comments were left in to make people feel a little better about what encryption config data is leaving the building.

They’re checking your storage speed.

Storage speed is one of the classic database admin complaints, especially around 15-second IO warnings. They’re aggregating it by TempDB (interesting that they used spaces and called it “Temp DB”), user databases, and system databases.
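
The literal query is in the trace; the general shape is an aggregation of sys.dm_io_virtual_file_stats into those three buckets, roughly like this paraphrase:

-- Rough paraphrase of the pattern, not the literal telemetry query:
SELECT CASE
           WHEN vfs.database_id = 2  THEN 'Temp DB'
           WHEN vfs.database_id <= 4 THEN 'system databases'
           ELSE 'user databases'
       END AS database_class,
       SUM(vfs.num_of_reads)      AS total_reads,
       SUM(vfs.num_of_writes)     AS total_writes,
       SUM(vfs.io_stall_read_ms)  AS total_read_stall_ms,
       SUM(vfs.io_stall_write_ms) AS total_write_stall_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
GROUP BY CASE
             WHEN vfs.database_id = 2  THEN 'Temp DB'
             WHEN vfs.database_id <= 4 THEN 'system databases'
             ELSE 'user databases'
         END;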

Are you using encryption? They know.

In each database, they’re checking for encrypted columns:
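
Paraphrasing the gist – and assuming they’re reading the Always Encrypted metadata, since the literal query is in the trace download – it’s a count along these lines:

-- Rough paraphrase: count columns protected with Always Encrypted
-- in the current database.
SELECT COUNT(*) AS encrypted_column_count
FROM sys.columns
WHERE encryption_type IS NOT NULL;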

And if you’re using SSMS’s Data Classification feature to classify data as personally identifiable, they’re inventorying that:
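
Under the covers, SSMS stores those classifications as extended properties, so the inventory amounts to counting those – a rough paraphrase, and the property names here are my assumption about what the feature writes:

-- Rough paraphrase; the property names are an assumption about how SSMS's
-- Data Classification feature stores its labels as extended properties.
SELECT COUNT(*) AS classified_columns
FROM sys.extended_properties
WHERE name IN (N'sys_information_type_name', N'sys_sensitivity_label_name');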

They’re also analyzing the backups you took in the last 24 hours, how long they’re taking, how big they are, and what kind of encryption type you’re using:
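
The gist – again, not the literal query – is a sweep of msdb’s backup history for the last day:

-- Rough paraphrase of the pattern: backup history for the last 24 hours,
-- including duration, size, and encryption details.
SELECT type AS backup_type,
       DATEDIFF(SECOND, backup_start_date, backup_finish_date) AS duration_seconds,
       backup_size,
       key_algorithm,
       encryptor_type
FROM msdb.dbo.backupset
WHERE backup_finish_date >= DATEADD(HOUR, -24, GETDATE());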

Summary

I was disappointed when Microsoft first announced that you can’t turn off telemetry in Developer and Express Editions. I understood the value of having telemetry on by default, but not allowing developers to turn it off struck me as ridiculous because that’s probably the least valuable telemetry you could gather.

Stopped the trace, back to rocking out

Now that I see these queries, I wonder even more.

Is Microsoft really making decisions based on how fast a developer’s laptop performs with encrypted backups, or how columnstore deltas are keeping up on her laptop? Or do they segment out the Developer Edition data?

And if they segment it out, why not let people turn off telemetry? Why gather so much ill will from SQL Server’s mandatory phoning-home of that near-useless data?

Normally I’d end this post with a link to the feedback item requesting it, but Microsoft has closed it and erased all votes on it, saying:

Microsoft has no current plans to change the model for how usage data is emitted.

Update 2019/02/08 from Conor Cunningham

Microsoft’s Conor Cunningham pointed out a few things in the comments:

  • These are indeed queries that SQLCEIP is running
  • However, that data may not actually be going back to Microsoft
  • In order to figure out if it’s going back to Microsoft, customers are expected to follow these steps
  • Even when customers do follow those steps, they have to repeat that process continuously because “Microsoft can and does adjust the queries we evaluate over time” – meaning, you might think they’re not gathering database names today, and they can turn right around and gather them in the next Cumulative Update.

I’m stunned that Microsoft still won’t just publish a list of data that they’re gathering from customers’ SQL Servers. What are they so afraid of? Surely they’ve got a written list of data they’re gathering, right?

The Next Mastering Series is About to Start. Wanna Save $2,000?

Ever wanted to take my Mastering series of classes? I recommend that you take ’em in a specific order because each class builds on the one before it. If you try to parachute in out of order, you’re gonna have a bad time.

The next rotation is about to start:

  1. Fundamentals of Index Tuning – Feb 13
  2. Mastering Index Tuning – March 4-6
  3. Mastering Query Tuning – April 1-3
  4. Mastering Server Tuning – May 1-3

It’s not a cheap investment, I know: the Mastering classes are $2,995 each. That’s why a lot of folks choose the Live Class Season Pass which lets you attend all of ’em for a year straight – so you can re-take classes when work gets in the way, or when you wanna take your game up a notch.

There’s a new less expensive option:
save $2,000 by skipping the lab VMs.

During the Mastering classes, you get your own private lab VM in the cloud to follow along during the labs. You’re tasked with a series of tough assignments, and they help you reinforce the concepts that you’ve been seeing during the lectures. You get a lunchtime lab, and an afternoon/overnight lab as homework.

However, some students don’t use their lab VMs – they have a pesky day job that interferes. During the lunch & afternoon breaks, they work on their day job stuff instead, answering emails and doing help desk tickets.

Since the lab VMs are pretty expensive, I’d rather pass the savings on to you. So starting today, you can pick up a Live Class Season Pass for just $3,995. Enjoy, and see you in class!

Registration Open for My Upcoming Pre-Cons: Belgium, Dallas, Iceland, Israel, Miami, Orlando!

I’m hitting the road this spring. Here’s where to find me in person:

SQLSaturday Iceland – Your First SQL Server Performance Tuning Class – March 16 – You’re stuck with a database server that’s not going fast enough. You have a hunch that the biggest bottleneck is inside the database server somewhere, but where? In this one-day class, you’ll learn how to identify which queries are killing your server, what parts of the database server are holding you back, how to tackle indexing improvements, and how to identify query anti-patterns. Learn more and register. – Update: sold out! I pre-announce these to our mailing list subscribers, and they snapped up all the seats over the weekend.

SQLSaturday Israel – Performance Tuning in 21 Demos – April 10 – You’re a developer or DBA who wants to tune indexes and queries to make your application go faster. You don’t like learning by watching slides or reading white papers – you want to jump right into the code and see what happens. You’re the kind of person who loves watching YouTube videos that start with, “Hold my beer and watch this.” Learn more and register. (24 seats left as of this writing.)

Techorama Belgium – SQL Server Performance Tuning in a Day – May 20 – You’re stuck with a database server that’s not going fast enough. You’ve got a hunch that the biggest bottleneck is inside the database server somewhere, but where? In this one-day class, you’ll learn how to identify which queries are killing your server, what parts of the database server are holding you back, how to tackle indexing improvements, and how to identify query anti-patterns. You’ll get your developers, sysadmins, and DBAs all on the same page, talking about the same metrics, and understanding whether the right fix is code, indexes, or hardware. Learn more and register.

SQLSaturday Dallas – Performance Tuning in 21 Demos – May 31 – You’re a developer or DBA who wants to tune indexes and queries to make your application go faster. You don’t like learning by watching slides or reading white papers – you want to jump right into the code and see what happens. You’re the kind of person who loves watching YouTube videos that start with, “Hold my beer and watch this.” Attendees will receive a year’s access to their choice: the Recorded Class Season Pass or the Consultant Toolkit. Learn more and register.

Techorama – now that’s a projector

SQLSaturday South Florida – Performance Tuning in 21 Demos – June 7 – You’re a developer or DBA who wants to tune indexes and queries to make your application go faster. You don’t like learning by watching slides or reading white papers – you want to jump right into the code and see what happens. You’re the kind of person who loves watching YouTube videos that start with, “Hold my beer and watch this.” Attendees will receive a year’s access to their choice: the Recorded Class Season Pass or the Consultant Toolkit. Learn more and register.

SQL Intersection Orlando – Developer’s SQL Server Recipe Book – June 9 – You and your team have to build a lot of queries: transactional insert/update/deletes, batch jobs, and reports. You’re comfortable writing queries to get data in and out, but they’re just not as fast as you’d like. You’re looking for a set of examples that you can follow – plus a list of anti-patterns that seem good until they just don’t scale. I’ll share my favorite T-SQL starting points for different business requirements. We’ll implement them on the Stack Overflow database, and you can follow along on your laptop. Attendees will receive a year’s access to their choice: the Recorded Class Season Pass or the Consultant Toolkit. Learn more and register.

If you’re organizing an event this fall/winter, and you’d like me to present a pre-con, I’d love to help. Email me at Help@BrentOzar.com – the earlier, the better. I tend to book several months in advance. See you out there!

The “Guess the Next Release Date” contest continues…

Back in December 2017, when I posted the Guess the Next SQL Server Release Date Contest, you could leave your guess in the comments for a chance to win a Live Class Season Pass and a Recorded Class Season Pass.

Here’s how the guesses so far have gone – and for this quick analysis, I’m only including people who followed the rules, not people who put a bunch of text in their comment. When it comes time to pick a winner, I’ll go through the text ones too.

Guess the Release Date

Looks like most folks expected it to be around the end of last year, 2018. I talked to a few folks about where their guesses came from, and they’d taken hints from the 2017 post from the Release Services team about the new Modern Servicing Model. At that time, Microsoft wrote:

CUs will be delivered more often at first and then less frequently. Every month for the first 12 months, and every quarter for the remainder 4 years of the full 5-year mainstream lifecycle.

Between that, and the quick cadence between SQL Server 2016 and 2017, they suspected that Microsoft would be shipping a new version of SQL Server every year. Made sense at the time. (Later, Microsoft revisited that support policy.)

Now, looking back:

  • SQL Server 2012: 2012-04-01
  • SQL Server 2014: 2014-04-21 (~25 months later)
  • SQL Server 2016: 2016-06-01 (~25 months later)
  • SQL Server 2017: 2017-10-02 (~16 months later)
  • Today, 2019-02-04: 16 months later

The SQL Server 2019 release cadence is actually slower than the 2017 pace. Interesting. The 2019 release is definitely more ambitious – Kubernetes, Big Data Clusters, batch mode for rowstore queries – there’s a lot of huuuuge stuff in here.

Something to think about as you place your guess. Good luck!

Update 2019/03/01 – closed comments here since folks got confused as to where the contest is happening.

[Video] Getting Started with the Consultant Toolkit

You’re a consultant who needs to figure out why your clients’ SQL Servers are slow and unreliable.

You’ve collected a bunch of diagnostic queries over the years, but they have a lot of holes, and you don’t really have the time to maintain them – let alone make ’em better. You waste time connecting to the server, running one query at a time, copy/pasting the results into Excel or taking screenshots. You’re burning the client’s money doing manual labor, and neither of you are happy with the results.

That’s where our Consultant Toolkit comes in, and today I did a live webcast talking about how it works:

Get a free trial and a big discount during our launch sale – ends Saturday.

10 Database Tasks That Sound Easy, But Aren’t

All of these are doable with scripts, tools, and/or elbow grease, mind you – they’re just not as easy as they sound at first glance until you know about the existence of those scripts, tools, and/or free people waiting around with spare elbow grease.

  1. “Go through the log and find out who ran this query.”
  2. “Apps sometimes time out when connecting to database server. Why?”
  3. “Build a new server with the same configuration as the old server.”
  4. “I just ran an UPDATE without a WHERE clause. Undo just that one query.”
  5. “Show me everyone who can read from this table, including groups and app users.”
  6. “Run a load test on these two servers and compare them.”
  7. “The reports server needs a synced copy of the accounting server’s customer table.”
  8. “What were the slowest running queries yesterday?”
  9. “Why is this query slow in production, but fast in development?”
  10. “Apply the latest updates for the cluster.”

As you read through that list and say to yourself, “Hey, I know how to do some of those really easily,” stop and congratulate yourself. You’ve learned things over the years, and you’re better at databases than when you started.

Announcing the Consultant Toolkit: Now Available in the European Union, too

Yesterday, I got a barrage of emails and LinkedIn messages that went like this:

Hey Brent, about your new consultant toolkit: I really love that idea and would want to get my hands on it. Unfortunately i originate in the EU. I know about your reasoning around not selling to EU members and I don’t want to start a discussion around that. Do you see ANY way how I can purchase it? I’ve written plenty of wrappers in the past, but I’d rather spend my time making servers faster than maintaining custom code bases whenever procedures change.

Alright, alright, alright.

You can now buy the Consultant Toolkit through an EU-friendly distributor. (That launch sale pricing is good for this week only.)

And I know what your next question’s going to be: “So, when can we buy the training classes and SQL ConstantCare® through them?” Those are quite a bit harder since the training classes involve a lot of streaming videos, and SQL ConstantCare® involves a lot of data. It’s on my radar, and it’s something that we’re working on for later this year, but I don’t have anything to announce yet.
